Latest news on the DEEP-EST prototype system
This is a summary of the latest news concerning the system. For a list of known problems related to the system, please refer to this page.
Last update: 2021-01-12
System hardware
ESB nodes
- in CW51, the upgrade of the first ESB rack to the Extoll Fabri3 network (instead of IB) started
- work is still ongoing
DAM nodes
- Persistent memory for nodes dp-dam[03-16] has been extended to 3 TB during the CW51 maintenance
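Node ranges such as dp-dam[03-16] use the bracketed hostlist notation. The following is a minimal sketch of how such an expression expands into individual node names; the helper name expand_hostlist is hypothetical and it only handles a single "prefix[lo-hi]" range. On the system itself, SLURM's scontrol show hostnames does the same job.

```python
import re


def expand_hostlist(expr: str) -> list[str]:
    """Expand a simple 'prefix[lo-hi]' expression into individual host names.

    Hypothetical helper for illustration; comma lists and multiple
    bracket groups are deliberately not supported.
    """
    match = re.fullmatch(r"(?P<prefix>[^\[\]]+)\[(?P<lo>\d+)-(?P<hi>\d+)\]", expr)
    if match is None:
        # No bracket range: treat the expression as a single host name.
        return [expr]
    lo, hi = match["lo"], match["hi"]
    width = len(lo)  # preserve zero padding, e.g. 3 -> "03"
    prefix = match["prefix"]
    return [f"{prefix}{n:0{width}d}" for n in range(int(lo), int(hi) + 1)]


nodes = expand_hostlist("dp-dam[03-16]")
print(len(nodes), nodes[0], nodes[-1])  # 14 dp-dam03 dp-dam16
```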
Network Federation Gateways
- Gateway nodes have been completed in CW51 to the final layout:
  - 2x NF-GW EDR/Fabri3
  - 2x NF-GW 40GbE/Fabri3
  - 2x NF-GW EDR/40GbE
- due to the ongoing Fabri3 installation, two of the NF-GWs (the ones equipped with Fabri3 PCIe cards) are not in operation yet
- for an example of how to use the gateway nodes and for further information, please refer to the batchsystem wiki page.
Global resources
NAM
- a NAM SW implementation has been completed, and a test environment has been set up on the DAM nodes dp-dam[09-16].
File Systems
- a new All Flash Storage Module (AFSM) is going to be added to the system in January 2020
- DEEP-EST storage has been rebuilt for performance reasons
- BeeGFS servers and clients have been updated
- the SDV is now de-coupled, meaning that the SDV nodes do not mount /work anymore and the DEEP-EST (CM, DAM, ESB) nodes only mount /work (not /sdv-work)
- BeeGFS (/work) user quotas are in place now (see section "User management")
- It is possible to access the $ARCHIVE file system from the deepv login node under /arch. For more information about $ARCHIVE, please refer to the Filesystems page and see also the hint in the MOTD for efficient usage of the archive filesystem
- DCPMM usage within BeeGFS is currently being tested on dp-dam03
  - more tests (with different kernel versions) are being done
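The mount layout described above (DEEP-EST nodes mount /work but not /sdv-work) can be checked from a job or login shell. The following is a minimal sketch that parses text in the /proc/mounts format; the function names and the sample line are illustrative assumptions, not output taken from the system.

```python
def mounted_paths(mounts_text: str) -> set[str]:
    """Return the mount points found in text formatted like /proc/mounts."""
    paths = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            paths.add(fields[1])  # the second field is the mount point
    return paths


def check_layout(mounts_text, expected=("/work",), unexpected=("/sdv-work",)):
    """Return (missing, stray): expected mounts that are absent and
    unexpected mounts that are present."""
    found = mounted_paths(mounts_text)
    missing = [p for p in expected if p not in found]
    stray = [p for p in unexpected if p in found]
    return missing, stray


# Illustrative sample line; real entries will carry different options.
sample = "beegfs_nodev /work beegfs rw,relatime 0 0\n"
print(check_layout(sample))  # ([], [])
```

On a node, the same check would read the live table with open("/proc/mounts").read() instead of the sample string.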
System software
SW updates
- new SLURM features are being integrated and will be rolled out soon:
  - extended logging and improved resource management for jobs within a workflow
  - burst buffer plugin
- the 2020 EasyBuild stage is being set up
- the Intel oneAPI Beta 10 version has been installed in /usr/local/intel/oneapi
User management
BeeGFS Quotas
- a quota for the BeeGFS file system (mounted at /work) has been implemented
- the thresholds are still to be defined
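Once thresholds are defined, users may want to look up their own usage. The following is a minimal sketch that wraps the BeeGFS command-line tool; it assumes that beegfs-ctl is installed on the node and that regular users are permitted to query quotas, neither of which is confirmed on this page. The function names are hypothetical.

```python
import os
import shutil
import subprocess


def quota_command(uid: int) -> list[str]:
    """Build the beegfs-ctl call that reports quota usage for one user ID."""
    return ["beegfs-ctl", "--getquota", "--uid", str(uid)]


def show_quota(uid=None):
    """Run the quota query and return its text output.

    Returns None if beegfs-ctl is not found on this machine
    (assumption: the tool is on PATH where BeeGFS is mounted).
    """
    uid = os.getuid() if uid is None else uid
    cmd = quota_command(uid)
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    print(show_quota())
```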