Version 45 (modified by Jochen Kreutz, 2 years ago)

update knl01 info

This page gives a short overview of known issues and provides potential solutions and workarounds for them.

Last update: 2022-01-21

Rocky 8.5 is being rolled out to the compute nodes. Expect limited access to some of the nodes!

To stay informed, please refer to the News page. Also, please pay attention to the "Message of the day" displayed when logging on to the system.

Detected HW and node issues

CM nodes

  • dp-cn06: MCE Errors found (#2819)
  • dp-cn25: SEL problems, FW issues (#2769)
  • several cluster nodes marked as down as part of the Rocky 8.5 rollout

DAM nodes

  • dp-dam02: reserved for FPGA tests
  • dp-dam[05-08]: reservation "maint-dam-rocky85" in place for Rocky 8.5 tests
  • dp-dam[09-16]: OS update ongoing

ESB nodes

  • dp-esb[01,02]: pshealthcheck failed for BeeGFS
  • dp-esb[07,13,16,22]: problems with energy meter

SDV nodes

  • deeper-sdv cluster nodes (Haswell) have been taken offline: deeper-sdv[01-16]
    • not included in SLURM anymore
    • deeper-sdv[09-10] used for testing (please contact j.kreutz(at)fz-juelich.de if you would like to get access)
  • knl01: serves as golden client for imaging only
  • dp-sdv-esb[01,02]: Slurm update required
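
The node lists above use Slurm's bracketed hostlist notation, for example dp-dam[05-08] or dp-esb[07,13,16,22]. The following is a minimal sketch of how such an expression expands into individual node names. The function name expand_hostlist is hypothetical and only handles a single bracket group as used on this page. On the system itself, `scontrol show hostnames` should give the same expansion.

```python
import re


def expand_hostlist(expr: str) -> list[str]:
    """Expand a simple hostlist such as 'dp-dam[05-08]' into node names.

    Hypothetical helper: supports one bracket group containing
    comma-separated numbers and ranges, as used on this page.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", expr)
    if match is None:
        # Plain node name without brackets, e.g. 'knl01'
        return [expr]
    prefix, body = match.groups()
    nodes = []
    for part in body.split(","):
        start, _, end = part.partition("-")
        end = end or start          # single number: range of length one
        width = len(start)          # preserve zero padding ('05' -> 2 digits)
        for i in range(int(start), int(end) + 1):
            nodes.append(f"{prefix}{i:0{width}d}")
    return nodes


if __name__ == "__main__":
    for expr in ("dp-dam[05-08]", "dp-esb[07,13,16,22]", "knl01"):
        print(expr, "->", expand_hostlist(expr))
```

This can help when checking whether a specific node, such as dp-dam07, falls under one of the entries listed above.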

Software issues

  • Moving to Rocky 8.5 and the new EasyBuild stage 2022 (in February) might cause unexpected behavior and problems with the installed software components.

Please use the support mailing list sup(at)deep-sea-project.eu to report any issues.