Changes between Version 12 and Version 13 of Public/ParaStationMPI


Timestamp: Apr 6, 2020, 12:54:57 PM
Author: Carsten Clauß


== Heterogeneous Jobs using inter-module MPI communication ==
!ParaStation MPI provides support for inter-module communication in federated high-speed networks. For this purpose, so-called gateway (GW) daemons bridge the MPI traffic between the modules. This mechanism is transparent to the MPI application, i.e., the MPI ranks see a common `MPI_COMM_WORLD` across all modules within the job. However, the user has to account for these additional GW resources during job submission. An example SLURM batch script illustrating the submission of heterogeneous pack jobs including the allocation of GW resources can be found [wiki:User_Guide/Batch_system#HeterogeneousjobswithMPIcommunicationacrossmodules here].
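
As an illustration, the following is a minimal sketch of such a heterogeneous (pack) job script, assuming SLURM's `packjob` separator syntax; the partition names (`dp-cn`, `dp-esb`), node counts, and binary names are placeholders, and the gateway-specific allocation options are omitted here (see the linked batch-system page for the actual options):
{{{
#!/bin/bash
#SBATCH --job-name=msa-hetjob
#SBATCH --partition=dp-cn      # first job component (e.g. Cluster)
#SBATCH --nodes=2
#SBATCH packjob                # separator between the two job components
#SBATCH --partition=dp-esb     # second job component (e.g. ESB)
#SBATCH --nodes=4

# A single srun call launches both components; ParaStation MPI connects them
# via the gateway daemons so that all ranks share one MPI_COMM_WORLD.
srun ./prog_cluster : ./prog_esb
}}}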

== Modularity-aware Collectives ==

=== Feature Description ===
In the context of DEEP-EST and MSA, !ParaStation MPI has been extended with modularity awareness also for collective MPI operations.
In doing so, an MSA-aware collective operation is conducted hierarchically, with the intra- and inter-module phases strictly separated:
 1. First, perform all module-internal gathering and/or reduction operations, if required.
 2. Then, perform the inter-module operation with only one process per module involved.
 3. Finally, distribute the data within each module in a strictly module-local manner.
This approach is illustrated by the following figure, showing a Broadcast operation with nine processes and three modules:

[[Image(ParaStationMPI_MSA_Bcast.jpg)]]

Besides Broadcast, the following collective operations currently feature this modularity awareness:
 * `MPI_Bcast` / `MPI_Ibcast`
 * `MPI_Reduce` / `MPI_Ireduce`
 * `MPI_Allreduce` / `MPI_Iallreduce`
 * `MPI_Scan` / `MPI_Iscan`
 * `MPI_Barrier`

=== Feature Usage ===
To use this feature, the following environment variables must be set and/or considered:
{{{
- PSP_MSA_AWARENESS=1      # NOT enabled by default
- PSP_MSA_AWARE_COLLOPS=1  # Enabled by default if PSP_MSA_AWARENESS=1 is set
- PSP_MSA_MODULE_ID=xyz    # Pass the respective module ID (integer) explicitly to the processes
}}}

'''Attention:''' Please note that the environment variable for the respective module ID (`PSP_MSA_MODULE_ID`) is currently ''not'' set automatically!
This means that the user has to set and pass this variable explicitly, for example, via a bash script:
{{{
#!/bin/bash
# Script (script0.sh) for Module 0 (e.g. Cluster)
APP="./IMB-MPI1 Bcast"
MODULE_ID=0  # <- set an arbitrary ID for this module!
export PSP_MSA_AWARENESS=1
export PSP_MSA_MODULE_ID="${MODULE_ID}"
${APP}
}}}
{{{
#!/bin/bash
# Script (script1.sh) for Module 1 (e.g. ESB)
APP="./IMB-MPI1 Bcast"
MODULE_ID=1  # <- set a different ID for this module!
export PSP_MSA_AWARENESS=1
export PSP_MSA_MODULE_ID="${MODULE_ID}"
${APP}
}}}
{{{
> srun ./script0.sh : ./script1.sh
}}}
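
To assess the impact of the MSA-aware collectives, the same benchmark can, for example, be launched once with and once without the feature. This sketch assumes that `PSP_MSA_AWARE_COLLOPS=0` disables the feature and that `srun` exports the caller's environment to the application processes (its default behavior):
{{{
# Broadcast benchmark with MSA-aware collectives (default once PSP_MSA_AWARENESS=1 is set):
PSP_MSA_AWARE_COLLOPS=1 srun ./script0.sh : ./script1.sh

# The same benchmark with the MSA-aware collectives disabled, for comparison:
PSP_MSA_AWARE_COLLOPS=0 srun ./script0.sh : ./script1.sh
}}}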

== CUDA Support by !ParaStation MPI ==