Changes between Version 35 and Version 36 of Public/User_Guide/OmpSs-2


Timestamp: Jun 14, 2019, 3:39:05 PM
Author: Pedro Martinez-Ferror

• Public/User_Guide/OmpSs-2 (v35 → v36)
  Table of contents:
  * [#QuickOverview Quick Overview]
- * [#QuickSetuponDEEPSystem Quick Setup on DEEP System]
- * [#RepositorywithExamples Repository with Examples]
+ * [#QuickSetuponDEEPSystemforaPureOmpSs-2Application Quick Setup on DEEP System for a Pure OmpSs-2 Application]
+ * [#UsingtheRepositories Using the Repositories]
  * [#multisaxpybenchmarkOmpSs-2 multisaxpy benchmark (OmpSs-2)]
  * [#dot-productbenchmarkOmpSs-2 dot-product benchmark (OmpSs-2)]
  ----
  
- = Quick Setup on DEEP System =
- 
- We highly recommend logging in to a **cluster module (CM) node** before using !OmpSs-2. To request an entire CM node for an interactive session, please execute the following command:
-  `srun --partition=dp-cn --nodes=1 --ntasks=48 --ntasks-per-socket=24 --ntasks-per-node=48 --pty /bin/bash -i`
+ = Quick Setup on DEEP System for a Pure !OmpSs-2 Application =
+ 
+ We highly recommend logging in to a **cluster module (CM) node** interactively before using !OmpSs-2. To request an entire CM node for an interactive session, using all 48 available threads, please execute the following command:
+ 
+ `srun -p dp-cn -N 1 -n 1 -c 48 --pty /bin/bash -i`
  
  Note that the command above is consistent with the actual hardware configuration of the cluster module with **hyper-threading enabled**.
  
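The hardware configuration mentioned above can be verified directly on the node. A minimal sketch using the standard util-linux `lscpu` tool (the field selection is ours, not part of the original guide; on a CM node one would expect 48 logical CPUs and 2 threads per core):

```shell
# Print the topology fields relevant to the note above: total logical
# CPUs, threads per core (2 when hyper-threading is enabled), cores
# per socket, and socket count.
lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)):'
```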
  !OmpSs-2 has already been installed on DEEP and can be used by simply executing the following commands:
- * `modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Core:$modulepath"`
- * `modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Compiler/mpi/intel/2019.0.117-GCC-7.3.0:$modulepath"`
- * `modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/MPI/intel/2019.0.117-GCC-7.3.0/psmpi/5.2.1-1-mt:$modulepath"`
- * `export MODULEPATH="$modulepath:$MODULEPATH"`
- * `module load OmpSs-2`
- 
- Remember that !OmpSs-2 uses a **thread-pool** execution model which means that it **permanently uses all the threads** present on the system. Users are strongly encouraged to always check the **system affinity** by running the **NUMA command** `numactl --show`:
  {{{
- $ numactl --show
- policy: bind
- preferred node: 0
- physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 24 25 26 27 28 29 30 31 32 33 34 35
- cpubind: 0
- nodebind: 0
- membind: 0
+ modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Core:$modulepath"
+ modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Compiler/mpi/intel/2019.0.117-GCC-7.3.0:$modulepath"
+ modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/MPI/intel/2019.0.117-GCC-7.3.0/psmpi/5.2.1-1-mt:$modulepath"
+ export MODULEPATH="$modulepath:$MODULEPATH"
+ module load OmpSs-2
  }}}
- as well as the **Nanos6 command** `nanos6-info --runtime-details | grep List`:
+ 
+ Remember that !OmpSs-2 uses a **thread-pool** execution model, which means that it **permanently uses all the threads** present on the system. Users are strongly encouraged to always check the **system affinity** by running the **NUMA command** `srun numactl --show`:
  {{{
- $ nanos6-info --runtime-details | grep List
- Initial CPU List 0-11,24-35
+ $ srun numactl --show
+ policy: default
+ preferred node: current
+ physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
+ cpubind: 0 1
+ nodebind: 0 1
+ membind: 0 1
+ }}}
+ as well as the **Nanos6 command** `srun nanos6-info --runtime-details | grep List`:
+ {{{
+ $ srun nanos6-info --runtime-details | grep List
+ Initial CPU List 0-47
  NUMA Node 0 CPU List 0-35
- NUMA Node 1 CPU List
+ NUMA Node 1 CPU List 12-47
  }}}
  
- Notice that both commands return consistent outputs and, even though an entire node with two sockets has been requested, only the first NUMA node (i.e. socket) has been correctly bound. As a result, only the 24 threads of the first socket (0-11, 24-35), of which 12 are physical and 12 logical (hyper-threading enabled), are going to be utilised, whilst the other 24 threads available on the second socket will remain idle. Therefore, **the system affinity shown above is not valid since it does not represent the resources requested via SLURM.**
- 
  System affinity can be used to specify, for example, the ratio of MPI and !OmpSs-2 processes for a hybrid application, and can be modified at the user's request in several ways:
- * Via SLURM. However, if the affinity does not correspond to the resources requested, as in the previous example, it should be reported to the system administrators.
+ * Via the commands `srun` or `salloc`. However, if the affinity given by SLURM does not correspond to the resources requested, it should be reported to the system administrators.
  * Via the command `numactl`.
  * Via the command `taskset`.
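As an illustration of the last two options, the sketch below pins a placeholder workload to an explicit CPU list. On a CM node one would typically pass the first socket's thread IDs (e.g. `0-11,24-35`); here only CPU 0 is used so the sketch runs on any Linux machine, and `sleep 1` and `./app` are hypothetical stand-ins for a real !OmpSs-2 binary:

```shell
# Restrict the CPU affinity of a command with taskset; `sleep 1`
# stands in for the real application binary.
taskset -c 0 sleep 1

# numactl can additionally bind memory allocation to a NUMA node,
# e.g. binding both CPUs and memory to node 0 (shown as a comment
# since numactl may not be installed everywhere):
#   numactl --cpunodebind=0 --membind=0 ./app
```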
     
  
  
- = Repository with Examples =
+ = Using the Repositories =
  
  All the examples shown here are publicly available at [https://pm.bsc.es/gitlab/ompss-2/examples]. Users must clone/download each example's repository and then transfer it to a DEEP working directory.
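A typical workflow is sketched below, assuming each example lives in its own repository under the GitLab group linked above; the `multisaxpy` repository name, the `deep` SSH host alias, and the `~/work/` destination are illustrative placeholders, not paths from the guide:

```shell
# Clone one example repository locally (placeholder repository name)...
git clone https://pm.bsc.es/gitlab/ompss-2/examples/multisaxpy.git
# ...then copy it to a working directory on DEEP
# (placeholder host alias and destination path).
scp -r multisaxpy deep:~/work/
```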
     
  == System configuration ==
  
- Please refer to section [#QuickSetuponDEEPSystem Quick Setup on DEEP System] to get a functional version of !OmpSs-2 on DEEP. It is also recommended to run !OmpSs-2 on a cluster module (CM) node.
+ Please refer to section [#QuickSetuponDEEPSystem Quick Setup on DEEP System] to get a functional version of !OmpSs-2 on DEEP. It is also recommended to run !OmpSs-2 via an interactive session on a cluster module (CM) node.
  
  == Building and running the examples ==