Changes between Version 45 and Version 46 of Public/User_Guide/PaS


Timestamp:
Jul 1, 2022, 12:26:37 PM
Author:
Jochen Kreutz
Comment:

update node info and add hint for GPU usage

  • Public/User_Guide/PaS

    v45 v46  
    33This page gives a short overview of known issues and provides potential solutions and workarounds.
    44
    5 ''Last update: 2022-01-21''
     5''Last update: 2022-07-01''
    66
    7 [[span(style=color: #FF0000, Rocky 8.5 being rolled out to the compute nodes, expect limited access to some of the nodes !)]]
    8 
     7**Please use the support mailing list `sup(at)deep-sea-project.eu` to report any issues**
    98
    109{{{#!comment highlighted red text
     
    1918=== CM nodes ===
    2019
    21 * dp-cn06: MCE Errors found (#2819)
    22 
    2320* dp-cn25: SEL Problems, FW issues (#2769)
    2421
    25 * several cluster nodes marked as down in the scope of the Rocky 8.5 roll out
     22* dp-cn27: MCE Errors found (#2919)
    2623
    2724       
     
    2926
    3027* dp-dam02: reserved for FPGA tests
    31 * dp-dam[05-08]: reservation "maint-dam-rocky85" in place for Rocky 8.5 tests
    32 * dp-dam[09-16]: OS update ongoing
     28* dp-dam03: PCI link speed degraded (#2931)
     29* dp-dam10: PMEM module issue (#2875)
     30* dp-dam16: testbed
    3331
    3432
    3533=== ESB nodes ===
    3634
    37 * dp-esb[01,02]: pshealthcheck failed for BeeGFS
    38 * dp-esb[07,13,16,22]: problems with energy meter
     35* dp-esb[07]: used for Rocky 8.6 tests
     36* dp-esb[11]: memory issues
    3937
    4038
     
    5250== Software issues ==
    5351
    54 - Moving to Rocky 8.5 and the new Easybuild stage 2022 (in February) might cause unexpected behavior and problems with the installed software components:
     52=== nvidia driver mismatch ===
     53
     54- loading the CUDA module and then running `nvidia-smi` (or any application that uses the GPU) fails with
     55
     56{{{
     57Failed to initialize NVML: Driver/library version mismatch
     58}}}
     59
     60- workaround: unload the driver module with `ml -nvidia-driver/.default`
     61
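The workaround can be wrapped in a small shell function. This is a minimal sketch, not a tested site procedure: `nvml_workaround` is a hypothetical helper name, and it assumes that `nvidia-smi` and the `ml` command are available in the current shell on a GPU node.

```shell
# Hypothetical helper (not provided by the system): run nvidia-smi and, if the
# NVML mismatch error appears, unload the driver module and try once more.
nvml_workaround() {
    out=$(nvidia-smi 2>&1)
    case "$out" in
        *"Driver/library version mismatch"*)
            # Workaround from this page: unload the driver module.
            ml -nvidia-driver/.default
            out=$(nvidia-smi 2>&1)
            ;;
    esac
    # Print whatever nvidia-smi reported last, error or not.
    printf '%s\n' "$out"
}
```

A typical use would be to load the CUDA module first (the exact module name depends on the installed software stage) and then call `nvml_workaround` in the same shell, so that the module change also applies to applications started afterwards.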
     62=== Easybuild ===
     63
     64- Moving to the new Easybuild stage 2022 (in February) might cause unexpected behavior and problems with the installed software components:
    5565
    5666
    57 **Please use the support mailing list `sup(at)deep-sea-project.eu` to report any issues**
     67
    5868
    5969