[[TOC]]

= Latest news on the DEEP-EST prototype system =

This is a summary of the latest news concerning the system. For a list of known problems related to the system, please refer to [wiki:Public/User_Guide/PaS this page].

''Last update: 2020-12-03''

{{{#!comment
[[span(style=color: #FF0000, System will be in maintenance in CW37 (Monday, 2020-09-07 to Friday 2020-09-11))]]
}}}

== System hardware ==

{{{#!comment
=== CM nodes ===
}}}

=== ESB nodes ===

- in CW51 the first ESB rack will be updated to use the Extoll network (instead of IB)

=== DAM nodes ===

- Persistent memory for nodes dp-dam[03-16] will be extended to 3 TB during the next maintenance (CW51)

=== Network Federation Gateways ===

- Gateway nodes will be completed in CW51 to the final layout:
  * 2x NF-GW EDR/Fabri3
  * 2x NF-GW 40GbE/Fabri3
  * 2x NF-GW EDR/40GbE
- for an example of how to use the gateway nodes and for further information, please refer to the [wiki:/Public/User_Guide/Batch_system#HeterogeneousjobswithMPIcommunicationacrossmodules batch system] wiki page.

=== Global resources ===

==== NAM ====

- a NAM SW implementation has been completed; a test environment on the DAM is being set up (using dp-dam[03,04])

=== File Systems ===

- **a new All Flash Storage Module (AFSM) is going to be added to the system in January 2020**
- DEEP-EST storage has been rebuilt for performance reasons
- BeeGFS servers and clients have been updated
- the SDV is now decoupled: the SDV nodes no longer mount `/work`, and the DEEP-EST (CM, DAM, ESB) nodes mount only `/work` (not `/sdv-work`)
- BeeGFS (`/work`) user quotas are in place now (see section "User management")
- It is possible to access the `$ARCHIVE` file system from the `deepv` login node under `/arch`.
  For more information about `$ARCHIVE`, please refer to the [wiki:Filesystems Filesystems page] and see also the hint in the MOTD for efficient usage of the archive filesystem
- DCPMM usage within BeeGFS is currently being tested on `dp-dam03`
  - more tests (with different kernel versions) are being done

== System software ==

=== SW updates ===

- new SLURM features are being integrated and will be rolled out in the near future:
  - extended logging and improved resource management for jobs within a workflow
  - burst buffer plugin
- 2020 Easybuild stage is currently being set up
- Intel oneAPI Beta 10 version has been installed to `/usr/local/intel/oneapi`

=== User management ===

==== BeeGFS Quotas ====

- a quota for the BeeGFS file system (mounted at `/work`) has been implemented
- thresholds still to be defined
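
While the thresholds are still undefined, it may be useful to keep an eye on how much data you hold under `/work`. The following is only a minimal sketch and not an official tool of this system: it sums the sizes of regular files below a directory you pass in and does not query the BeeGFS quota accounting itself. The function names `directory_usage_bytes` and `format_size` are made up for this illustration, and the directory argument is assumed to be whatever directory you own under `/work`.

{{{#!python
import os
import sys


def directory_usage_bytes(root):
    """Return the total size in bytes of all files found below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat counts a symlink itself instead of following it
                total += os.lstat(path).st_size
            except OSError:
                # file vanished or is not readable; skip it
                continue
    return total


def format_size(num_bytes):
    """Format a byte count with binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1024.0 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024.0


if __name__ == "__main__":
    # Directory to inspect; defaults to the current directory (assumption)
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print(root, format_size(directory_usage_bytes(root)))
}}}

Walking a large directory tree on a parallel file system generates many metadata requests, so a script like this is better run occasionally on a specific subdirectory than routinely on everything you own.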