[[TOC]]

This page gives a short overview of known issues and provides potential solutions and workarounds.

''Last update: 2022-07-01''

**Please use the support mailing list `sup(at)deep-sea-project.eu` to report any issues**

{{{#!comment highlighted red text
[[span(style=color: #FF0000, System maintenance from Monday, 2020-09-07 to Friday, 2020-09-11, no user access !)]]
}}}

To stay informed, please refer to the [wiki:Public/User_Guide/News News page]. Also, please pay attention to the information contained in the "Message of the day" displayed when logging onto the system.

== Detected HW and node issues ==
=== CM nodes ===
 * dp-cn25: SEL problems, FW issues (#2769)
 * dp-cn27: MCE errors found (#2919)

=== DAM nodes ===
 * dp-dam02: reserved for FPGA tests
 * dp-dam03: PCI link speed degraded (#2931)
 * dp-dam10: PMEM module issue (#2875)
 * dp-dam16: testbed

=== ESB nodes ===
 * dp-esb[07]: used for Rocky 8.6 tests
 * dp-esb[11]: memory issues

=== SDV nodes ===
 * deeper-sdv cluster nodes (Haswell) have been taken offline: deeper-sdv[01-16]
   * not included in SLURM anymore
   * deeper-sdv[09-10] used for testing (please contact j.kreutz(at)fz-juelich.de if you would like to get access)
 * knl01: serves as golden client for imaging only
 * dp-sdv-esb[01,02]: Slurm update required

== Software issues ==
=== nvidia driver mismatch ===
 * loading the CUDA module and trying to run `nvidia-smi` (or any application trying to use the GPU) leads to
{{{
Failed to initialize NVML: Driver/library version mismatch
}}}
 * workaround is to unload the driver module: `ml -nvidia-driver/.default`
 * for further information, please also see [https://gitlab.jsc.fz-juelich.de/hps-public/easybuild-repository/-/wikis/Failed-to-initialize-NVML-Driver-library-version-mismatch-message here][[BR]]

=== Easybuild ===
 * Moving to the new Easybuild stage 2022 (in February) might cause unexpected behavior and problems with the installed software components:

{{{#!comment JK:
invalid
=== GPU direct usage with Extoll on DAM ===
 * new Extoll driver for GPU direct over Extoll still shows low performance on the DAM nodes
 * available via Developer stage, for testing load:
{{{
ml --force purge
ml use $OTHERSTAGES
ml load Stages/Devel-2020
ml load Intel
ml load ParaStationMPI
}}}
 * expect performance (and maybe also stability) issues
}}}