Changes between Version 7 and Version 8 of Public/User_Guide/DEEP-EST_DAM
- Timestamp: Oct 16, 2019, 10:08:54 PM
Public/User_Guide/DEEP-EST_DAM
v7   v8
83   83
84   84   == Multi-node Jobs ==
85        Currently, multi-node MPI jobs are possible on the DAM only by modifying the environment in the following way:
     85   Multi-node jobs can be launched on the `dp-dam` partition with ParaStationMPI by loading the `pscom` module (currently `pscom/5.3.1-1`) and the `extoll` module. Please beware that the `extoll` module can be loaded only on nodes with an EXTOLL device, therefore it cannot be loaded on the login node: please load it in a batch script for `sbatch` or directly on the compute nodes within an interactive session (see [wiki:Batch_system#Fromashellonanode here] for more information on the interactive sessions).
86   86
87        {{{
88        $ ml Intel ParaStationMPI
89        $ env LD_LIBRARY_PATH=/opt/parastation/lib64:/opt/extoll/x86_64/lib/:${LD_LIBRARY_PATH} PSP_TCP=0 PSP_OPENIB=0 PSP_EXTOLL=1 srun -p dp-dam -N 2 -n 2 ./MPI_HelloWorld
90        Hello World from processor dp-dam02, rank 1 out of 2
91        Hello World from processor dp-dam01, rank 0 out of 2
92        }}}
93        **Attention:** This is a temporary workaround.
     87   A release-candidate version of ParaStationMPI with CUDA awareness is also available on the system. It is installed under the GCC stack (run `ml spider ParaStationMPI` to find the relevant installation for CUDA). This version also automatically loads a CUDA-aware installation of `pscom`.
94   88
95        {{{#!comment
96        **Attention:** Since the Extoll network is not in place yet multi-node MPI Jobs are currently disabled.
97        }}}
     89   **Attention:** As of 16.10.2019, there is no support for GPUDirect over EXTOLL. As a temporary workaround, this version of ParaStationMPI automatically handles the device-to-host, host-to-host and host-to-device copies transparently to the user and can be used to test the functionality of applications requiring a CUDA-aware MPI implementation. Support for GPUDirect will be provided by EXTOLL in the near future.
98   90
99        {{{#!comment
100
101       Loading the most recent ParaStation module will be enough to run multi-node MPI jobs over Extoll
102
103       {{{
104       module load ParaStationMPI
105       }}}
106       }}}
     91   **Attention:** As of 16.10.2019, a minor bug in the `pscom/.5.4.0-1-CUDA` module requires the user to explicitly export `PSP_CUDA=1` in order to use the CUDA-aware ParaStationMPI. This bug will be fixed in the next days.
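
The v8 text asks users to load the `extoll` module inside a batch script rather than on the login node. A minimal sketch of what such a script might look like follows. It is an illustration under stated assumptions, not the documented procedure: the file name `job_dam.sh` is hypothetical, the `Intel` toolchain, the node and task counts, and the `./MPI_HelloWorld` binary are borrowed from the removed v7 example, and the module load order is a guess. The snippet only writes the job script to disk; submitting it is left to the user.

```shell
#!/bin/bash
# Sketch only: writes a hypothetical Slurm job script for the dp-dam partition.
# Assumptions (not stated in the v8 text): the file name job_dam.sh, the Intel
# toolchain, 2 nodes / 2 tasks, and the ./MPI_HelloWorld binary, all reused
# from the removed v7 example.

cat > job_dam.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=dp-dam
#SBATCH --nodes=2
#SBATCH --ntasks=2

# MPI stack; the toolchain choice is an assumption carried over from v7.
ml Intel ParaStationMPI

# Modules named in the v8 text. extoll loads only on nodes with an EXTOLL
# device, which is why it is loaded here and not on the login node.
ml pscom/5.3.1-1
ml extoll

# Only for the CUDA-aware release candidate with pscom/.5.4.0-1-CUDA,
# per the v8 note dated 16.10.2019:
# export PSP_CUDA=1

srun ./MPI_HelloWorld
EOF

echo "wrote job_dam.sh"
```

On the system the generated file would then be submitted with `sbatch job_dam.sh`. Keeping the module loads inside the job script means they run on a compute node, which is the condition the v8 text places on `extoll`.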