[[TOC]]

= Latest news on the DEEP-EST prototype system =

This is a summary of the latest news concerning the system. For a list of known problems related to the system, please refer to [wiki:Public/User_Guide/PaS this page].

''Last update: 2024-01-16''

== System software ==

- !ParaStation (psmgmt) has been updated to version 5.1.53-1

=== OS ===

- the compute nodes, BXI nodes and the login node have been updated to Rocky 8.6
- the file servers and master nodes will follow

=== !EasyBuild ===

- [[span(style=color: #FF0000, Stage 2023 is now the default)]]
- Stage 2023 was relocated to `/p/software/deep/stages/2023`; if you run into trouble, please check whether you still have the old path hardcoded somewhere.
- Please don't use `module use $OTHERSTAGES` when loading Stage 2023. This won't work anymore (see the usage examples at the end of this page).

== System hardware ==

=== CM nodes ===

- the cluster nodes now have direct EDR IB access to the SSSM storage nodes (without using the IB <-> 40 GbE gateway)

=== ESB nodes ===

- all ESB nodes (`dp-esb[01-75]`) use the EDR InfiniBand interconnect (Extoll is no longer used)
- the SSSM and AFSM file servers can be accessed directly through IB

=== DAM nodes ===

- the DAM nodes now use EDR InfiniBand (instead of 40 GbE and Extoll)
- the SSSM and AFSM file servers can be accessed directly through IB
- current accelerator layout:
  - `dp-dam[01-08]`: 1 x Nvidia V100 GPU
  - `dp-dam02`: additionally 1 x Intel PAC D5005 FPGA (for testing)
  - `dp-dam[09-12]`: 2 x Nvidia V100 GPU
  - `dp-dam[13-16]`: 2 x Intel PAC D5005 FPGA

=== BXI nodes, Network Federation Gateways ===

- the former network federation gateways are now used for BXI testing: `dp-nfgw[02,03,05,06]`
- they can be accessed via Slurm using the `dp-bxi` partition (see the usage examples at the end of this page)

=== SDV ===

- **two Intel test nodes have been added and are available to users via the `dp-intelmax` partition**

== File Systems ==

**Please also refer to the** [wiki:Filesystems Filesystems] **overview.**

- a quota has been added to `/tmp` on `deepv` to avoid congestion
- the All Flash Storage Module (AFSM) provides a fast work file system mounted at `/afsm` (symbolic link to `/work`) on all compute nodes (CM, DAM, ESB) and the login node (`deepv`)
  - it is managed via project subfolders: after activating a project environment with the `jutil` command, `$WORK` is set accordingly (see the usage examples at the end of this page)
- the older System Services and Storage Module (SSSM) work file system is obsolete, but is still available at `/work_old` for data migration (see the usage examples at the end of this page)
- SSSM still serves the `/usr/local/software` file system:
  - starting with the Rocky 8 image, `/usr/local` is a local file system on the compute nodes
  - `/usr/local/software` is still shared and provided by the SSSM storage
  - in addition to the !EasyBuild software stack, the shared `/usr/local/software` file system contains some manually installed software in a `legacy` subfolder
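== Usage examples ==

Since Stage 2023 is the default, software can be loaded directly without `module use $OTHERSTAGES`. A minimal sketch of the module workflow; the module names below are placeholders for illustration, not a list of what is installed:

{{{#!sh
# List what is available in the default (2023) stage
module avail

# Load modules from the default stage; GCC and ParaStationMPI are
# example names only
module load GCC ParaStationMPI
}}}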
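The BXI test nodes and the Intel test nodes are reached through their Slurm partitions. A minimal sketch using standard Slurm options; the node and task counts are placeholders:

{{{#!sh
# Run a short test command on the BXI partition
srun --partition=dp-bxi --nodes=2 --ntasks-per-node=1 hostname

# The same pattern works for the Intel test nodes
srun --partition=dp-intelmax --nodes=1 hostname
}}}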
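`$WORK` is set per project after activating a project environment. A minimal sketch, assuming the standard `jutil` invocation; `myproject` is a placeholder for your project name:

{{{#!sh
# Activate the project environment; "myproject" is a placeholder
jutil env activate -p myproject

# $WORK now points to the project subfolder on the AFSM work file system
echo $WORK
}}}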
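Data still residing on the old SSSM work file system should be migrated to AFSM while `/work_old` remains available. A minimal sketch using plain `rsync`; the source subfolder is a placeholder, adjust it to your own data:

{{{#!sh
# Copy a project folder from the obsolete SSSM file system to AFSM;
# -a preserves permissions/timestamps, -v is verbose, -P shows progress
# and allows resuming interrupted transfers
rsync -avP /work_old/myproject/ $WORK/
}}}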