Changes between Version 20 and Version 21 of Public/User_Guide/Batch_system


Timestamp:
Jan 9, 2020, 2:56:03 PM (4 years ago)
Author:
Jacopo de Amicis
Comment:

Updated information about heterogeneous jobs.

  • Public/User_Guide/Batch_system

    v20 v21  
    146146}}}
    147147
    148 In order to submit a heterogeneous job, the user needs to set the batch script similarly to the following:
     148Please note the `:` separating the definitions for each sub-job of the heterogeneous job. Also, please be aware that a heterogeneous job can contain more than two sub-jobs.
     149
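For instance, a heterogeneous job made of three sub-jobs could be launched with a single `srun` command along the following lines (the partitions, node counts and executables here are only placeholders):
{{{
srun --partition=dp-cn -N 1 ./prog_cn : --partition=dp-dam -N 2 ./prog_dam : --partition=dp-esb -N 4 ./prog_esb
}}}
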
     150The user can also request several sets of nodes in a heterogeneous allocation using `salloc`. For example:
     151{{{
     152salloc --partition=dp-cn -N 2 : --partition=dp-dam -N 4
     153}}}
     154
     155In order to submit a heterogeneous job via `sbatch`, the user needs to set up the batch script similarly to the following one:
    149156
    150157{{{#!sh
     
    177184}}}
    178185
    179 Here the `packjob` keyword allows to define Slurm parameter for each sub-job of the heterogeneous job.
     186Here the `packjob` keyword allows the user to define Slurm parameters for each sub-job of the heterogeneous job. Some Slurm options can be defined once at the beginning of the script and are automatically propagated to all sub-jobs of the heterogeneous job, while others (e.g. `--nodes` or `--ntasks`) must be defined for each sub-job. You can find a list of the propagated options in the [https://slurm.schedmd.com/heterogeneous_jobs.html#submitting Slurm documentation].
    180187
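As a minimal sketch of how this propagation works (the account name, partitions and executables are placeholders, and the exact set of propagated options should be checked against the Slurm documentation linked above), such a script could look like:
{{{#!sh
#!/bin/bash
#SBATCH --account=deep        # placeholder account; assumed to be propagated to all sub-jobs
#SBATCH --partition=dp-cn
#SBATCH --nodes=1             # must be defined for each sub-job
#SBATCH packjob
#SBATCH --partition=dp-dam
#SBATCH --nodes=2             # must be defined for each sub-job

srun ./prog_cn : ./prog_dam
}}}
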
    181188When submitting a heterogeneous job with this colon notation using ParaStationMPI, a unique `MPI_COMM_WORLD` is created, spanning across the two partitions. If this is not desired, one can use the `--pack-group` key to submit independent job steps to the different node-groups of a heterogeneous allocation:
     
    187194Using this configuration implies that inter-communication must be established manually by the applications during run time, if needed.
    188195
    189 More information about heterogeneous and cross-module jobs (including how to used gateway nodes) can be found on [https://apps.fz-juelich.de/jsc/hps/jureca/modular-jobs.html this page] of the JURECA documentation. All information available there applies for the DEEP system as well. Please be aware that the DEEP system currently includes 2 gateway nodes between the Infiniband and EXTOLL fabrics.
    190 
    191 Also, for more information about heterogeneous jobs please refer to the [https://slurm.schedmd.com/heterogeneous_jobs.html relevant page] of the Slurm documentation.
     196For more information about heterogeneous jobs please refer to the [https://slurm.schedmd.com/heterogeneous_jobs.html relevant page] of the Slurm documentation.
     197
     198=== Heterogeneous jobs with MPI communication across modules ===
     199
     200In order to establish MPI communication across modules using different interconnect technologies, some special gateway nodes must be used. A general description of how the user can request and use gateway nodes is provided in [https://apps.fz-juelich.de/jsc/hps/jureca/modular-jobs.html#mpi-traffic-across-modules this section] of the JURECA documentation.
     201
     202**Attention:** some of the information provided in the JURECA documentation does not apply to the DEEP system. In particular:
     203* As of 09/01/2020, the DEEP system has 1 gateway node. In the coming weeks, at least one additional gateway node will be installed.
     204
     205* As of 09/01/2020, the gateway nodes are allocated exclusively to the job requesting them. Given the limited number of gateway nodes available on the system, this may change in the future.
     206
     207* The `xenv` utility (necessary on JURECA to load modules for different architectures - Haswell and KNL) is needed on DEEP only to load the `extoll` module on the DAM and ESB nodes (the `extoll` module is not available on the CM; trying to load it there will produce an error and cause the job to fail). All the other modules can be loaded via the usual `module load` or `ml` command in the batch script before the `srun` command. If desired, `xenv` can still be used to load different sets of modules for different sub-jobs of a heterogeneous job (see the sketch below).
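As a sketch of where `xenv` fits in such a command line (the executables are placeholders, and it is assumed here that `xenv -L extoll` loads the `extoll` module as on JURECA; only the DAM/ESB part of the job needs it):
{{{
srun ./prog_cn : xenv -L extoll ./prog_dam
}}}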
    192208
    193209{{{#!comment
     
    275291
    276292entry.
     293
     294Also, jobs can be chained after they have been submitted using the `scontrol` command by updating their `Dependency` field.
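For instance (the job IDs below are placeholders), a dependency can be added to an already submitted job with something like:
{{{
scontrol update JobId=<jobid> Dependency=afterok:<other_jobid>
}}}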
    277295
    278296=== How can I check the status of partitions and nodes? ===
     
    387405=== How do I use SMT on the DEEP CPUs? ===
    388406
    389 On DEEP, SMT is enabled by default on all nodes. Please be aware that on all JSC systems (including DEEP), each hardware thread is exposed by the OS as a physical core. For a ''n''-core node, with ''m'' hardware threads per core, the OS cores from ''0'' to ''n-1'' will correspond to the first hardware thread of all hardware cores (from all sockets), the OS cores from ''n'' to ''2n-1'' to the second hardware thread of the hardware cores, and so on.
     407On DEEP, SMT is enabled by default on all nodes. Please be aware that on all JSC systems (including DEEP), each hardware thread is exposed by the OS as a separate CPU. For an ''n''-core node with ''m'' hardware threads per core, the OS CPUs from ''0'' to ''n-1'' will correspond to the first hardware thread of all hardware cores (from all sockets), the OS CPUs from ''n'' to ''2n-1'' to the second hardware thread of the hardware cores, and so on.
    390408
    391409For instance, on a Cluster node (with two sockets with 12 cores each, with 2 hardware threads per core):