Changes between Version 28 and Version 29 of Public/User_Guide/Batch_system


Timestamp:
Apr 6, 2020, 8:37:10 AM
Author:
Simon Pickartz
Comment:

Update example script for "Heterogeneous jobs with MPI communication across modules"

Legend:

  (unmarked)  unchanged line
  +           added in v29
  -           removed in v29
  …           unchanged lines omitted
  • Public/User_Guide/Batch_system

v28 → v29

    
  In order to establish MPI communication across modules using different interconnect technologies, some special Gateway nodes must be used. On the DEEP-EST system, MPI communication across gateways is needed only between Infiniband and Extoll interconnects.
+ 
  **Attention:** Only !ParaStation MPI supports MPI communication across gateway nodes.
  
…
  # Use the packjob feature to launch separately CM and DAM executable
  
- #SBATCH --job-name=imb
+ 
+ # General configuration of the job
+ #SBATCH --job-name=modular-imb
  #SBATCH --account=deep
- #SBATCH --output=IMB-%j.out
- #SBATCH --error=IMB-%j.err
- #SBATCH --time=00:05:00
+ #SBATCH --time=00:10:00
+ #SBATCH --output=modular-imb-%j.out
+ #SBATCH --error=modular-imb-%j.err
+ 
+ # Configure the gateway daemon
  #SBATCH --gw_num=1
- #SBATCH --gw_binary=/opt/parastation/bin/psgwd.extoll
  #SBATCH --gw_psgwd_per_node=1
  
+ # Configure node and process count on the CM
  #SBATCH --partition=dp-cn
  #SBATCH --nodes=1
- #SBATCH --ntasks=1
+ #SBATCH --ntasks-per-node=1
  
  #SBATCH packjob
  
+ # Configure node and process count on the DAM
  #SBATCH --partition=dp-dam-ext
  #SBATCH --nodes=1
- #SBATCH --ntasks=1
- 
+ #SBATCH --ntasks-per-node=1
+ 
+ # Echo job configuration
  echo "DEBUG: SLURM_JOB_NODELIST=$SLURM_JOB_NODELIST"
  echo "DEBUG: SLURM_NNODES=$SLURM_NNODES"
  echo "DEBUG: SLURM_TASKS_PER_NODE=$SLURM_TASKS_PER_NODE"
  
- # Execute
- srun hostname : hostname
- srun module_dp-cn.sh : module_dp-dam-ext.sh
- }}}
- 
- It uses two execution scripts for loading the correct environment and starting the IMB on the CM and the DAM node (this approach can also be used to start different programs, e.g. one could think of a master and worker use case). The execution scripts could look like:
- 
- {{{
- #!/bin/bash
- # Script for the CM using InfiniBand
- 
+ 
+ # Set the environment to use PS-MPI
  module --force purge
  module use $OTHERSTAGES
…
  module load ParaStationMPI
  
- # Execution
- EXEC=$PWD/mpi-benchmarks/IMB-MPI1
- ${EXEC} PingPong
- }}}
- 
- {{{
- #!/bin/bash
- # Script for the DAM using Extoll
- 
- module --force purge
- module use $OTHERSTAGES
- module load Stages/Devel-2019a
- module load Intel
- module load ParaStationMPI
- 
- # Execution
- EXEC=$PWD/mpi-benchmarks/IMB-MPI1
- ${EXEC} PingPong
- }}}
+ # Show the hosts we are running on
+ srun hostname : hostname
+ 
+ # Execute
+ APP="/p/project/cfa_partec/pickartz/mpi-benchmarks/src_c/IMB-MPI1 Uniband"
+ srun ${APP}  : ${APP}
+ }}}
+ 
+ 
  
  **Attention:** During the first part of 2020, only the DAM nodes will have Extoll interconnect, while the CM and the ESB nodes will be connected via Infiniband. This will change later during the course of the project (expected Summer 2020), when the ESB will be equipped with Extoll connectivity (Infiniband will be removed from the ESB and left only for the CM).
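
A minimal submission sketch for the updated heterogeneous job, assuming the script from v29 is saved as `modular-imb.sbatch` (the file name and `<jobid>` placeholder are illustrative, not taken from the page):

{{{
# Submit the heterogeneous job to Slurm; prints "Submitted batch job <jobid>"
sbatch modular-imb.sbatch

# Check the job while it is queued or running
squeue -u $USER

# After completion, inspect the output file produced by the
# "#SBATCH --output=modular-imb-%j.out" directive
cat modular-imb-<jobid>.out
}}}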