Changes between Version 18 and Version 19 of Public/User_Guide/OmpSs-2


Timestamp:
Jun 11, 2019, 3:24:41 PM (5 years ago)
Author:
Pedro Martinez-Ferror
Comment:

  • Public/User_Guide/OmpSs-2

    v18 v19  
    3838== Quick Setup on DEEP System ==
    3939
    40 OmpSs-2 has already been installed on DEEP and can be utilised by loading the following modules:
     40We highly recommend logging in to a **cluster module (CM)** node to begin using OmpSs-2.  To request an entire CM node interactively, please execute the following command:
     41 `srun --partition=dp-cn --nodes=1 --ntasks=48 --ntasks-per-socket=24  --ntasks-per-node=48 --pty /bin/bash -i`   
     42
     43The command above is consistent with the actual hardware configuration of the cluster module with **hyper-threading enabled**.  In this particular case, the command `srun --partition=dp-cn --nodes=1 --pty /bin/bash -i` would have yielded a similar request.
     44
     45OmpSs-2 has already been installed on DEEP and can be used by simply loading the following modules:
    4146* `modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Core:$modulepath"`
    4247* `modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Compiler/mpi/intel/2019.0.117-GCC-7.3.0:$modulepath"`
     
    4550* `module load OmpSs-2`
    4651
     52Remember that OmpSs-2 uses a **thread-pool** execution model, which means that it permanently **uses all the threads** present on the system.  The reader can check the **system affinity** by running the **NUMA command** `numactl --show`:
     53{{{
     54$ numactl --show
     55policy: bind
     56preferred node: 0
     57physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 24 25 26 27 28 29 30 31 32 33 34 35
     58cpubind: 0
     59nodebind: 0
     60membind: 0
     61}}}
     62as well as the **Nanos6 command** `nanos6-info --runtime-details | grep List`:
     63{{{
     64$ nanos6-info --runtime-details | grep List
     65Initial CPU List 0-11,24-35
     66NUMA Node 0 CPU List 0-35
     67NUMA Node 1 CPU List
     68}}}
     69
     70Notice that both commands return consistent outputs and, even though an entire node with two sockets has been requested, only the first NUMA node (i.e. socket) has been correctly bound.  As a result, only 24 threads of the first socket (0-11, 24-35), of which 12 are physical and 12 logical (hyper-threading enabled), are going to be utilised whilst the other 24 threads available on the second socket will remain idle. Therefore, **the system affinity shown above is not correct.**
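
The number of hardware threads in a CPU list such as the one reported by `nanos6-info` can be checked with a few lines of shell. The following is only a minimal sketch: the function name `count_cpus` is hypothetical and not part of Nanos6 or SLURM, and it assumes the list uses the comma-separated `first-last` format shown above.

```shell
#!/usr/bin/env bash
# count_cpus is a hypothetical helper, not a Nanos6 or SLURM command.
# It counts the CPUs in a list such as "0-11,24-35".
count_cpus() {
    local list="$1" total=0 range first last
    local IFS=','                      # split the list on commas
    for range in $list; do
        first="${range%-*}"            # text before the dash (or the whole entry)
        last="${range#*-}"             # text after the dash (or the whole entry)
        total=$(( total + last - first + 1 ))
    done
    echo "$total"
}

# CPU list taken from the "Initial CPU List" line above.
count_cpus "0-11,24-35"
```

For the list shown above, the function prints `24`, i.e. the threads bound on the first socket only.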
     71
     72System affinity can be used to specify, for example, the ratio of MPI and OmpSs-2 processes for a hybrid application and can be modified by user request in different ways:
     73* Via SLURM: if the affinity does not correspond to the resources requested, as in the example above, then contact the system admin.
     74* Via the command `numactl`.
     75* Via the command `taskset`.
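
As an illustration of the last two options, the sketch below prints (but does not run) the commands that would pin an application to a given CPU list. The binary name `./my_ompss2_app` and the function `affinity_commands` are hypothetical; `numactl --physcpubind` and `taskset -c` are the standard options for selecting CPUs. Whether a process may use CPUs outside the set granted by SLURM depends on how the binding is enforced on the system, so this example only restricts execution to CPUs that are already available.

```shell
#!/usr/bin/env bash
# Hypothetical application name; replace with your own OmpSs-2 binary.
app="./my_ompss2_app"

# affinity_commands is a hypothetical helper that prints, without executing,
# the numactl and taskset invocations for a given CPU list.
affinity_commands() {
    local cpus="$1"   # e.g. "0-11": first 12 CPU ids of the list shown above
    echo "numactl --physcpubind=${cpus} ${app}"
    echo "taskset -c ${cpus} ${app}"
}

affinity_commands "0-11"
```

Either printed command can then be run directly on the node; `numactl --show` or `nanos6-info --runtime-details | grep List` can be used as above to confirm the resulting affinity.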
    4776
    4877