Changes between Version 43 and Version 44 of Public/User_Guide/PaS
- Timestamp: Jan 21, 2022, 12:12:14 PM
Public/User_Guide/PaS
v43 v44
3   3     This page is intended to give a short overview on known issues and to provide potential solutions and workarounds to the issues seen.
4   4
5       - ''Last update: 2021-12-10''
    5   + ''Last update: 2022-01-21''
6   6
7       - [[span(style=color: #FF0000, System maintenance from Tuesday, 2021-12-14 to Thursday, 2021-12-16, limited user access !)]]
    7   + [[span(style=color: #FF0000, Rocky 8.5 being rolled out to the compute nodes, expect limited access to some of the nodes !)]]
8   8
9   9
…   …
19  19    === CM nodes ===
20  20
21      - * dp-cn05: memory issue - node at Megware for repair (#2682)
    21  + * dp-cn06: MCE Errors found (#2819)
22  22
23      - * dp-cn25: FW issues (#2495)
    23  + * dp-cn25: SEL ProblemsFW issues (#2769)
24  24
25      - * dp-cn42: memory issue (#2675)
26      -
27      - * dp-cn[47-50]: rocky linux testbed
    25  + * several cluster nodes marked as down in the scope of the Rocky 8.5 roll out
28  26
29  27
30  28    === DAM nodes ===
31  29
32      - * dp-dam08: memory issues (#2722)
    30  + * dp-dam02: reserved for FPGA tests
    31  + * dp-dam[05-08]: reservation "maint-dam-rocky85" in place for Rocky 8.5 tests
    32  + * dp-dam[09-16]: OS update ongoing
33  33
34  34
35  35    === ESB nodes ===
36  36
37      - {{{#!comment JK: EM client has been fixed
38      - [[span(style=color: #FF0000, Currently facing issues in reading the ESB Energy Meter leading to nodes going offline. A fix is ready for roll-out)]]
39      - }}}
40      -
41      - * dp-esb[01-25]: currently being prepared as rocky linux testbed
42      -
43      - * dp-esb75: node currently reserved for special use case (#2568)
    37  + * dp-esb[01,02]: pshealthcheck failed for BeeGFS
    38  + * dp-esb[07,13,16,22]: problems with energy meter
44  39
45  40
…   …
48  43    * deeper-sdv cluster nodes (Haswell) have been taken offline: deeper-sdv[01-16]
49  44    - not included in SLURM anymore
50      - - deeper-sdv[01-10] will be used for testing
    45  + - deeper-sdv[09-10] used for testing (please contact j.kreutz(at)fz-juelich.de if you would like to get access
51  46
52  47    * knl01: NVMe issues (#2011)
    48  +
    49  + * dp-sdv-esb[01,02]: Slurm update required
53  50
54  51
55  52    == Software issues ==
56  53
57      - === SLURM jobs ===
    54  + - Moving to Rocky 8.5 and the new Easybuild stage 2022 (in February) might cause unexpected behavior and problems with the installed software components:
58  55
59      - - due to introduction of accounting there is some re-configuration
60      - of user accounts needed within SLURM to assign the correct QOS levels and priorities for the jobs
61      - * this might lead to (temporary) failing job starts for certain users
62      - * if you cannot start jobs via SLURM, please write an email to the support list: `sup(at)deep-sea.eu`
    56  +
    57  + **Please, use the support mailing list `sup(at)deep-sea-project.eu` to report any issues**
    58  +
    59  +
63  60
64  61    {{{#!comment JK: invalid