System usage
The DEEP-EST Data Analytics Module (DAM) can be used through the SLURM-based batch system that is also used for (most of) the Software Development Vehicles (SDV). You can request a DAM node (dp-dam[01-16]) with an interactive session like this:
srun -A deepsea -N 1 --tasks-per-node 4 -p dp-dam --time=1:0:0 --pty --interactive /bin/bash
[kreutz1@dp-dam01 ~]$ srun -n 8 hostname
dp-dam01
dp-dam01
dp-dam01
dp-dam01
When using a batch script, you have to adapt the partition option within your script: --partition=dp-dam (or short form: -p dp-dam).
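A minimal sketch of such a batch script, assuming the same `deepsea` account, node count, and time limit as in the interactive example above. The file name `dam_job.sh`, the job name, and the output file pattern are placeholders chosen for illustration, not names prescribed by the system.

```shell
# Write a minimal job script for the DAM partition.
# dam_job.sh, the job name and the output pattern are placeholders.
cat > dam_job.sh <<'EOF'
#!/bin/bash
# Account and partition as used in the interactive example
#SBATCH --account=deepsea
#SBATCH --partition=dp-dam
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=01:00:00
# Placeholder job name and output file (%j expands to the job ID)
#SBATCH --job-name=dam-test
#SBATCH --output=dam-test-%j.out

# Each of the 4 tasks prints the name of the node it runs on
srun hostname
EOF

# Submit the script from a login node (requires SLURM):
# sbatch dam_job.sh
```

The submission line is left commented out so that the snippet only creates the file; uncomment it on a system where `sbatch` is available.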
Persistent Memory
Each of the DAM nodes is equipped with Intel's Optane DC Persistent Memory Modules (DCPMM).
All DAM nodes (dp-dam[01-16]) expose 3 TB of persistent memory.
The DCPMMs can be operated in different modes. For further information on the operation modes and how to use them, please refer to the following information.
Currently all nodes are running in "App Direct Mode".
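In App Direct Mode, persistent memory is generally exposed to Linux through namespaces that show up as device nodes, typically `/dev/pmemN` (fsdax) or `/dev/daxN.M` (devdax). The following is a minimal sketch for checking what is visible on a node; how the namespaces are actually configured on the DAM nodes is an assumption here, and the function name `list_pmem_devices` is a hypothetical helper.

```shell
# List persistent memory device nodes, if any are visible on this machine.
# Assumption: namespaces appear under the usual /dev/pmem* or /dev/dax* names.
list_pmem_devices() {
    local dev found=0
    for dev in /dev/pmem* /dev/dax*; do
        # Skip unexpanded glob patterns when nothing matches
        [ -e "$dev" ] || continue
        echo "$dev"
        found=1
    done
    [ "$found" -eq 1 ] || echo "no persistent memory devices found"
}

list_pmem_devices
```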
Using CUDA
The first 12 DAM nodes are equipped with GPUs:
dp-dam[01-08]: 1 x Nvidia V100
dp-dam[09-12]: 2 x Nvidia V100
Please use the gres option with srun if you would like to use GPUs on DAM nodes, e.g. in an interactive session:
srun -A deepsea -p dp-dam --gres=gpu:1 -t 1:0:0 --interactive --pty /bin/bash # to start an interactive session on a DAM node exposing at least 1 GPU
srun -A deepsea -p dp-dam --gres=gpu:2 -t 1:0:0 --interactive --pty /bin/bash # to start an interactive session on a DAM node exposing 2 GPUs
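Once the session has started, it can be useful to confirm which GPUs were actually granted. The sketch below assumes that SLURM exports `CUDA_VISIBLE_DEVICES` for jobs requesting `--gres=gpu` (common, but site-dependent) and that `nvidia-smi` is installed on the node; `show_gpus` is a hypothetical helper name.

```shell
# Show the GPUs visible to the current session.
show_gpus() {
    # Usually set by SLURM for --gres=gpu jobs (assumption: site-dependent)
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"

    if command -v nvidia-smi >/dev/null 2>&1; then
        # -L prints one line per GPU
        nvidia-smi -L
    else
        echo "nvidia-smi not found on this machine"
    fi
}

show_gpus
```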
To compile and run CUDA applications on the Nvidia V100 cards in the DAM nodes, you need to load the CUDA module. It is advised to use the 2022 Stage to avoid Nvidia driver mismatch issues.
module --force purge
ml use $OTHERSTAGES
ml Stages/2022
ml CUDA

[kreutz1@deepv ~]$ ml

Currently Loaded Modules:
  1) Stages/2022 (S)   2) nvidia-driver/.default (H,g,u)   3) CUDA/11.5 (g,u)

  Where:
   S: Module is Sticky, requires --force to unload or purge
   g: built for GPU
   u: Built by user
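With the CUDA module loaded, a quick way to test the toolchain is to build a tiny kernel. The sketch below writes a hypothetical `hello.cu` and compiles it for the V100 (compute capability 7.0, hence `-arch=sm_70`); the file and binary names are placeholders, and the compile step is skipped when `nvcc` is not on the PATH.

```shell
# Create a minimal CUDA test program (hello.cu is a placeholder name).
cat > hello.cu <<'EOF'
#include <cstdio>

// Each GPU thread prints its own index
__global__ void hello() {
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();        // launch 1 block with 4 threads
    cudaDeviceSynchronize();  // wait for the kernel and flush device printf
    return 0;
}
EOF

# Compile for the V100 (sm_70) if the CUDA module is loaded.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -arch=sm_70 -o hello hello.cu
else
    echo "nvcc not found: load the CUDA module first"
fi

# Run on a DAM node with a GPU allocated, e.g. inside the interactive session:
# ./hello
```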
Using FPGAs
Nodes dp-dam[13-16] are equipped with 2 x Stratix 10 FPGAs each (Intel PAC D5005).
It is recommended to do the first steps in an interactive session on a DAM node.
Since there is (currently) no FPGA resource defined in SLURM for these nodes, please use the --nodelist= option with srun to open a session on a DAM node equipped with FPGAs, for example:
srun -A deepsea -p dp-dam --nodelist=dp-dam13 -t 1:0:0 --interactive --pty /bin/bash
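Because the node has to be named explicitly, it can help to see which of the FPGA nodes are currently idle before picking one. A minimal sketch using `sinfo`, restricted to the dp-dam[13-16] range given above; `list_idle_fpga_nodes` is a hypothetical helper name, and the function only reports an error when run outside a SLURM system.

```shell
# Print the FPGA-equipped DAM nodes that are currently idle, one per line.
list_idle_fpga_nodes() {
    if command -v sinfo >/dev/null 2>&1; then
        # -N: node-oriented output, -h: no header,
        # -t idle: idle nodes only, %N: node name
        sinfo -p dp-dam -n 'dp-dam[13-16]' -t idle -N -h -o '%N'
    else
        echo "sinfo not found (not on a SLURM system)" >&2
        return 1
    fi
}

list_idle_fpga_nodes || true
```

One of the printed node names can then be passed to `--nodelist=` in the srun command shown above.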
For getting started using OpenCL with the FPGAs you can find some hints as well as the slides and exercises from the Intel FPGA workshop held at JSC in:
/usr/local/software/legacy/fpga/
More details to follow.