Changes between Version 20 and Version 21 of Public/User_Guide/Batch_system


Timestamp:
Jan 9, 2020, 2:56:03 PM (4 years ago)
Author:
Jacopo de Amicis
Comment:

Updated information about heterogeneous jobs.

  • Public/User_Guide/Batch_system

    v20 v21  
    146146}}}
    147147
    148 In order to submit a heterogeneous job, the user needs to set the batch script similarly to the following:
     148Please note the `:` separating the definitions for each sub-job of the heterogeneous job. Also, please be aware that a heterogeneous job can contain more than two sub-jobs.
     149
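For instance, a heterogeneous job made of three sub-jobs could be launched with a single `srun` command along the following lines (the partitions, node counts and executables here are only placeholders):
{{{
srun --partition=dp-cn -N 1 ./prog_cn : --partition=dp-dam -N 2 ./prog_dam : --partition=dp-esb -N 4 ./prog_esb
}}}
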
     150The user can also request several sets of nodes in a heterogeneous allocation using `salloc`. For example:
     151{{{
     152salloc --partition=dp-cn -N 2 : --partition=dp-dam -N 4
     153}}}
     154
     155In order to submit a heterogeneous job via `sbatch`, the user needs to set up the batch script similarly to the following one:
    149156
    150157{{{#!sh
     
    177184}}}
    178185
    179 Here the `packjob` keyword allows to define Slurm parameter for each sub-job of the heterogeneous job.
     186Here the `packjob` keyword allows the user to define Slurm parameters for each sub-job of the heterogeneous job. Some Slurm options can be defined once at the beginning of the script and are automatically propagated to all sub-jobs of the heterogeneous job, while others (e.g. `--nodes` or `--ntasks`) must be defined for each sub-job. You can find a list of the propagated options in the [https://slurm.schedmd.com/heterogeneous_jobs.html#submitting Slurm documentation].
    180187
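As a minimal sketch of how this propagation works (the account name, partitions and executables are placeholders, and the exact set of propagated options should be checked against the Slurm documentation linked above), such a script could look like:
{{{#!sh
#!/bin/bash
#SBATCH --account=deep        # placeholder account; assumed to be propagated to all sub-jobs
#SBATCH --partition=dp-cn
#SBATCH --nodes=1             # must be defined for each sub-job
#SBATCH packjob
#SBATCH --partition=dp-dam
#SBATCH --nodes=2             # must be defined for each sub-job

srun ./prog_cn : ./prog_dam
}}}
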
    181188When submitting a heterogeneous job with this colon notation using ParaStationMPI, a unique `MPI_COMM_WORLD` is created, spanning across the two partitions. If this is not desired, one can use the `--pack-group` key to submit independent job steps to the different node-groups of a heterogeneous allocation:
     
    187194Using this configuration implies that inter-communication must be established manually by the applications during run time, if needed.
    188195
    189 More information about heterogeneous and cross-module jobs (including how to used gateway nodes) can be found on [https://apps.fz-juelich.de/jsc/hps/jureca/modular-jobs.html this page] of the JURECA documentation. All information available there applies for the DEEP system as well. Please be aware that the DEEP system currently includes 2 gateway nodes between the Infiniband and EXTOLL fabrics.
    190 
    191 Also, for more information about heterogeneous jobs please refer to the [https://slurm.schedmd.com/heterogeneous_jobs.html relevant page] of the Slurm documentation.
     196For more information about heterogeneous jobs please refer to the [https://slurm.schedmd.com/heterogeneous_jobs.html relevant page] of the Slurm documentation.
     197
     198=== Heterogeneous jobs with MPI communication across modules ===
     199
     200In order to establish MPI communication across modules using different interconnect technologies, some special gateway nodes must be used. A general description of how the user can request and use gateway nodes is provided in [https://apps.fz-juelich.de/jsc/hps/jureca/modular-jobs.html#mpi-traffic-across-modules this section] of the JURECA documentation.
     201
     202**Attention:** some of the information provided in the JURECA documentation does not apply to the DEEP system. In particular:
     203* As of 09/01/2020, the DEEP system has 1 gateway node. In the coming weeks, at least one additional gateway node will be installed.
     204
     205* As of 09/01/2020, the gateway nodes are allocated exclusively to the job requesting them. Given the limited number of gateway nodes available on the system, this may change in the future.
     206
     207* The `xenv` utility (necessary on JURECA to load modules for different architectures - Haswell and KNL) is needed on DEEP only to load the `extoll` module on the DAM and ESB nodes (the `extoll` module is not available on the CM; trying to load it there will produce an error and cause the job to fail). All the other modules can be loaded via the usual `module load` or `ml` command in the batch script before the `srun` command. If desired, `xenv` can still be used to load different sets of modules for different sub-jobs of a heterogeneous job (see the sketch below).
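As a sketch of where `xenv` fits in such a command line (the executables are placeholders, and it is assumed here that `xenv -L extoll` loads the `extoll` module as on JURECA; only the DAM/ESB part of the job needs it):
{{{
srun ./prog_cn : xenv -L extoll ./prog_dam
}}}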
    192208
    193209{{{#!comment
     
    275291
    276292entry.
     293
     294Also, jobs can be chained after they have been submitted using the `scontrol` command by updating their `Dependency` field.
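For instance (the job IDs below are placeholders), a dependency can be added to an already submitted job with something like:
{{{
scontrol update JobId=<jobid> Dependency=afterok:<other_jobid>
}}}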
    277295
    278296=== How can I check the status of partitions and nodes? ===
     
    387405=== How do I use SMT on the DEEP CPUs? ===
    388406
    389 On DEEP, SMT is enabled by default on all nodes. Please be aware that on all JSC systems (including DEEP), each hardware thread is exposed by the OS as a physical core. For a ''n''-core node, with ''m'' hardware threads per core, the OS cores from ''0'' to ''n-1'' will correspond to the first hardware thread of all hardware cores (from all sockets), the OS cores from ''n'' to ''2n-1'' to the second hardware thread of the hardware cores, and so on.
     407On DEEP, SMT is enabled by default on all nodes. Please be aware that on all JSC systems (including DEEP), each hardware thread is exposed by the OS as a separate CPU. For an ''n''-core node with ''m'' hardware threads per core, the OS CPUs from ''0'' to ''n-1'' will correspond to the first hardware thread of all hardware cores (from all sockets), the OS CPUs from ''n'' to ''2n-1'' to the second hardware thread of the hardware cores, and so on.
    390408
    391409For instance, on a Cluster node (with two sockets with 12 cores each, with 2 hardware threads per core):