
Latest news on the DEEP-EST prototype system

This is a summary of the latest news concerning the system. For a list of known problems related to the system, please refer to this page.

Last update: 2024-01-16

System software

  • the ParaStation management software (psmgmt) has been updated to version 5.1.53-1

OS

  • compute nodes, BXI nodes, and the login node have been updated to Rocky Linux 8.6
  • file servers and master nodes will follow

EasyBuild

  • Stage 2023 is now the default
  • Stage 2023 has been relocated to /p/software/deep/stages/2023; if you run into trouble, please check whether you have the old path hardcoded somewhere.
  • Please don't use module use $OTHERSTAGES when loading Stage 2023; this no longer works (see the sketch below).
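
Since Stage 2023 is the default, software can be loaded directly; if you previously selected stages by hand, point module use at the new location instead of $OTHERSTAGES. A minimal sketch (the module names and the subdirectory layout below the stage path are assumptions):

  # Default stage: load software directly (module names are illustrative)
  module load GCC ParaStationMPI

  # Manual stage selection: use the new path, not $OTHERSTAGES
  module use /p/software/deep/stages/2023/modules/all   # subdirectory layout is an assumption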

System hardware

CM nodes

  • the cluster nodes now have direct EDR InfiniBand access to the SSSM storage nodes (no longer via the IB ↔ 40 GbE gateway)

ESB nodes

  • all ESB nodes (dp-esb[01-75]) now use the EDR InfiniBand interconnect (Extoll is no longer used)
  • the SSSM and AFSM file servers can be accessed directly through IB

DAM nodes

  • DAM nodes now use EDR InfiniBand (instead of 40 GbE and Extoll)
  • the SSSM and AFSM file servers can be accessed directly through IB
  • current accelerator layout (see the hypothetical job sketch below):
    • dp-dam[01-08]: 1 x Nvidia V100 GPU
    • dp-dam02: 1 x Intel PAC D5005 FPGA (for testing)
    • dp-dam[09-12]: 2 x Nvidia V100 GPU
    • dp-dam[13-16]: 2 x Intel PAC D5005 FPGA
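
As a purely hypothetical illustration of targeting a specific accelerator configuration via Slurm, assuming a dp-dam partition and a standard GPU GRES setup (neither is confirmed on this page):

  # Partition name and GRES syntax are assumptions, not confirmed configuration
  srun --partition=dp-dam --nodelist=dp-dam09 --gres=gpu:2 nvidia-smi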

BXI nodes, Network Federation Gateways

  • the former network federation gateways are now used for BXI testing: dp-nfgw[02,03,05,06]
  • they can be accessed via Slurm using the dp-bxi partition
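
For example, a short interactive test on a BXI node (a minimal sketch; node count and time limit are illustrative):

  # Run a quick test job via the dp-bxi partition
  srun --partition=dp-bxi --nodes=1 --time=00:10:00 hostname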

SDV

  • two Intel test nodes have been added; they are available to users via the dp-intelmax partition
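
For example (a minimal sketch):

  # Check the state of the Intel test nodes, then run a quick test on one
  sinfo --partition=dp-intelmax
  srun --partition=dp-intelmax --nodes=1 hostname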

File Systems

Please also refer to the Filesystems overview.

  • a quota has been added to /tmp on deepv to avoid congestion
  • the All Flash Storage Module (AFSM) provides a fast work file system mounted to /afsm (symbolic link to /work) on all compute nodes (CM, DAM, ESB) and the login node (deepv)
    • it is managed via project subfolders: after activating a project environment with the jutil command, $WORK is set accordingly (see the sketch after this list)
  • the older System Services and Storage Module (SSSM) work file system is obsolete but still available at /work_old for data migration
  • SSSM still serves the /usr/local/software file system, but
    • starting with the Rocky 8 image, /usr/local is a local file system on the compute nodes
    • /usr/local/software is still shared and provided by the SSSM storage
    • in addition to the EasyBuild software stack, the shared /usr/local/software file system contains some manually installed software in a legacy subfolder
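
A minimal sketch of the project-based workflow on the AFSM work file system, assuming the usual jutil invocation; <project> is a placeholder for your project ID:

  # Activate a project environment; <project> is a placeholder
  jutil env activate -p <project>
  echo $WORK   # now points to the project's subfolder on the AFSM work file system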
