Changes between Version 25 and Version 26 of Public/User_Guide/PaS


Timestamp:
Sep 7, 2020, 10:50:00 AM
Author:
Jochen Kreutz
  • Public/User_Guide/PaS

    v25 v26  
    33This page is intended to give a short overview of known issues and to provide potential solutions and workarounds for them.
    44
    5 ''Last update: 2020-09-02''
     5''Last update: 2020-09-07''
    66{{{#!comment highlighted red text
    77[[span(style=color: #FF0000, System maintenance from Monday, 2020-09-07 to Friday, 2020-09-11, no user access!)]]
     
    3131=== ESB nodes ===
    3232
     33{{{#!comment JK: EM client has been fixed
    3334[[span(style=color: #FF0000, Currently facing issues in reading the ESB Energy Meter leading to nodes going offline. A fix is ready for roll-out)]]
     35}}}
    3436
    3537* dp-esb02: energy meter reading issues
     
    5456
    5557
    56 {{{#!comment JK: status to be clarified on Thursday, 2020-09-03
     58
    5759== Software issues ==
    5860
     61=== Modular jobs failing ===
     62
     63- users reported failures of jobs that run MPI across more than one module via the gateways
     64- the problem is being investigated
     65
     66
     67{{{#!comment JK: status to be clarified on Thursday, 2020-09-03
    5968=== LDAP error message during login ===
    6069
     
    7685- this might lead to (temporary) failing job starts for certain users
    7786- if you cannot start jobs via SLURM, please write an email to the support list: `sup(at)deep-est.eu`
    78 
    79 === /sdv-work corrupted ===
    80 
    81 - due to failing disks the SDV work filesystem mounted at `/sdv-work` got corrupted and has to be rebuilt
    82 - metadata still seems to be ok, so directories and files can be seen, but no file access is possible
    83 - it is unclear whether any data can be recovered, since work filesystems are not backed up
    8487
    8588=== GPU direct usage with IB on ESB ===
     
    119122
    120123
    121 === slurmtop ===
    122 
    123 The `slurmtop` tool is not working properly; a workaround is to call it via
    124 
    125 {{{
    126 slurmtop 2> /dev/null
    127124}}}
    128 
    129 }}}