Version 4 (modified by Jochen Kreutz, 4 years ago)

filled in currently known issues discussed in last admin telco

This page gives a short overview of known issues and, where available, suggests solutions and workarounds.

To stay informed, please also read the information presented in the "Message of the day" when logging onto the system.

Software issues

GPU direct usage with Extoll

  • new Extoll driver for GPU direct over Extoll currently being tested on the DAM nodes
  • only available via the Developer stage; for testing, load:
module --force purge
module use $OTHERSTAGES
module load Stages/Devel-2019a
module load GCC/8.3.0
module load ParaStationMPI
  • expect performance and stability issues

Detected HW and node issues

CM nodes

  • dp-cn49 and dp-cn50: nodes currently reserved for special use case

DAM nodes

  • dp-dam03: being investigated after unexpected reboot (#2323)
  • dp-dam07: showing problems with its FPGA (#2353)
  • dp-dam08: issues seen with the CPU in the second socket (#2304)

ESB nodes

  • dp-esb08: GPU shows PCIe x8 connection only (#2370)
  • dp-esb11: no GPU device detected, under repair (#2358)
  • dp-esb23: MCE problems (#2350)
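
As a possible workaround until the tickets above are closed, the affected DAM and ESB nodes can be kept out of a job's allocation. The following is only a sketch and rests on the assumption that jobs are submitted through Slurm, whose `sbatch` accepts an `--exclude` node list. The script name `job.sh` is a hypothetical placeholder, and the node list simply mirrors the entries on this page, so it needs to be updated by hand as issues are resolved. The command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/bash
# Nodes listed on this page with open hardware tickets (copied by hand;
# update the list when the page changes).
BAD_NODES="dp-dam03,dp-dam07,dp-dam08,dp-esb08,dp-esb11,dp-esb23"

# Hypothetical job script name; replace it with your own.
JOB_SCRIPT="job.sh"

# Assumption: the batch system is Slurm. --exclude is a standard sbatch option
# that keeps the named nodes out of the allocation.
# Print the command instead of running it.
echo sbatch --exclude="${BAD_NODES}" "${JOB_SCRIPT}"
```

The reserved nodes dp-cn49 and dp-cn50 are left out of the list on purpose, since they are not reported as faulty.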