Changes between Version 20 and Version 21 of Public/User_Guide/OmpSs-2


Timestamp:
Jun 11, 2019, 5:00:11 PM (5 years ago)
Author:
Pedro Martinez-Ferror
Comment:

  • Public/User_Guide/OmpSs-2

    v20 v21  
    66* [#QuickOverview Quick Overview]
    77* [#QuickSetuponDEEPSystem Quick Setup on DEEP System]
    8 * [#Examples Examples]
     8* [#RepositorywithExamples Repository with Examples]
    99
    1010
    11 == Quick Overview ==
     11= Quick Overview =
     12
    1213OmpSs-2 is a programming model composed of a set of directives and library routines that can be used in conjunction with a high-level programming language (such as C, C++ or Fortran) in order to develop concurrent applications. Its name originally comes from two other programming models: **OpenMP** and **StarSs**. The design principles of these two programming models constitute the fundamental ideas used to conceive the OmpSs philosophy.
    1314
     
    3233* OmpSs-2 specification. [https://pm.bsc.es/ftp/ompss-2/doc/spec]
    3334* OmpSs-2 user guide. [https://pm.bsc.es/ftp/ompss-2/doc/user-guide]
    34 * OmpSs-2 examples and exercises. [https://pm.bsc.es/ftp/ompss-2/doc/examples/index.html]
     35* OmpSs-2 examples repository. [https://pm.bsc.es/gitlab/ompss-2/examples]
     36* OmpSs-2 manual with examples and exercises. [https://pm.bsc.es/ftp/ompss-2/doc/examples/index.html]
    3537* Mercurium official website. [https://www.bsc.es/research-and-development/software-and-apps/software-list/mercurium-ccfortran-source-source-compiler Link 1], [https://pm.bsc.es/mcxx Link 2]
    3638* Nanos official website. [https://www.bsc.es/research-and-development/software-and-apps/software-list/nanos-rtl Link 1], [https://pm.bsc.es/nanox Link 2]
    3739
    3840
    39 == Quick Setup on DEEP System ==
     41= Quick Setup on DEEP System =
    4042
    4143We highly recommend logging in to a **cluster module (CM) node** to begin using OmpSs-2.  To request an entire CM node for an interactive session, please execute the following command:
     
    6971}}}
    7072
    71 Notice that both commands return consistent outputs and, even though an entire node with two sockets has been requested, only the first NUMA node (i.e. socket) has been correctly bound.  As a result, only 48 threads of the first socket (0-11, 24-35), of which 24 are physical and 24 logical (hyper-threading enabled), are going to be utilised whilst the other 48 threads available on the second socket will remain idle. Therefore, **the system affinity shown above does not represent the resources requested via SLURM.**
     73Notice that both commands return consistent outputs and, even though an entire node with two sockets has been requested, only the first NUMA node (i.e. socket) has been correctly bound.  As a result, only 48 threads of the first socket (0-11, 24-35), of which 24 are physical and 24 logical (hyper-threading enabled), are going to be utilised whilst the other 48 threads available on the second socket will remain idle. Therefore, **the system affinity shown above is not valid since it does not represent the resources requested via SLURM.**
    7274
    7375System affinity can be used to specify, for example, the ratio of MPI and OmpSs-2 processes for a hybrid application and can be modified by user request in different ways:
     
    7779
    7880
    79 == Examples ==
     81= Repository with examples =
     82
     83All the examples shown here are publicly available at [https://pm.bsc.es/gitlab/ompss-2/examples].  Users must clone/download each example's repository and then transfer it to a DEEP working directory.
     84
     85== System configuration ==
     86
     87Please refer to section [#QuickSetuponDEEPSystem Quick Setup on DEEP System] to get a functional version of OmpSs-2 on DEEP. It is also recommended to run OmpSs-2 on a cluster module (CM) node.
     88
     89== Building and running the examples ==
     90
     91All the examples come with a Makefile already configured to build (e.g. `make`) and run (e.g. `make run`) them.
     92
     93== Controlling available threads ==
     94
     95To limit or constrain the threads available to an application, the Unix ''taskset'' tool can be used to launch it with a given thread affinity.  Simply precede the application's binary with ''taskset'', the `-c` option and a list of CPU IDs specifying the desired affinity:
     96
     97`taskset -c 0,2-4 ./application`
     98
     99The example above will run ''application'' with 4 cores: 0, 2, 3, 4.
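
To make the CPU list syntax concrete, the following sketch expands a list such as `0,2-4` into the individual CPU IDs it denotes. The `expand_cpu_list` helper is hypothetical and written only for illustration; it is not part of ''taskset'' and it handles only comma-separated IDs and simple `first-last` ranges.

```shell
# Hypothetical helper (not part of taskset): expand a CPU list such as
# "0,2-4" into the individual CPU IDs it denotes, separated by spaces.
expand_cpu_list() {
    out=""
    old_ifs=$IFS
    IFS=','
    for part in $1; do
        first=${part%-*}   # text before the dash (or the whole item)
        last=${part#*-}    # text after the dash (or the whole item)
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            out="${out}${out:+ }${cpu}"
            cpu=$((cpu + 1))
        done
    done
    IFS=$old_ifs
    echo "$out"
}

expand_cpu_list 0,2-4   # prints: 0 2 3 4
```

The printed IDs are the four CPUs on which `taskset -c 0,2-4 ./application` allows ''application'' to run.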
     100
     101== Dependency graphs ==
     102
     103Nanos6 allows for a graphical representation of data dependencies to be extracted. In order to generate said graph, run the application with the ''NANOS6'' environment variable set to ''graph'':
     104
     105`NANOS6=graph ./application`
     106
     107By default, graph nodes include the full path of the source file. To shorten these paths, set the following environment variable:
     108
     109`NANOS6_GRAPH_SHORTEN_FILENAMES=1`
     110
     111The result will be a PDF file with several pages, each representing the graph at a certain point in time. For best results, we suggest displaying the PDF in ''single page'' view, showing a full page at a time, and advancing page by page.
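
Both variables can be set for a single run on the same command line. The sketch below wraps this in a small shell function; the name `run_with_graph` is hypothetical and only the two environment variables come from the text above.

```shell
# Hypothetical wrapper: run a command with Nanos6 graph generation enabled
# and with shortened file names in the graph nodes. The variables are set
# only for the launched command, not for the rest of the shell session.
run_with_graph() {
    NANOS6=graph NANOS6_GRAPH_SHORTEN_FILENAMES=1 "$@"
}

# Usage (./application is the placeholder binary used in this guide):
# run_with_graph ./application
```

Setting the variables per command, rather than exporting them, avoids accidentally generating graphs in later runs from the same shell.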
     112
     113== Obtaining statistics ==
     114
     115Nanos6 can also collect execution statistics. To do so, simply run the application as:
     116
     117`NANOS6=stats ./application` or `NANOS6=stats-papi ./application`
     118
     119The first collects timing statistics while the second also records hardware counters (compilation with PAPI is needed for the second). By default, the statistics are emitted to standard error when the program ends.
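
Since the statistics go to standard error, they can be saved with an ordinary shell redirection. The following is a minimal sketch; the function name `run_with_stats` and the output file name are hypothetical.

```shell
# Hypothetical wrapper: run a command with NANOS6=stats and save everything
# written to standard error (the statistics, plus any other messages the
# program itself prints there) into the file given as the first argument.
run_with_stats() {
    stats_file="$1"
    shift
    NANOS6=stats "$@" 2> "$stats_file"
}

# Usage (./application is the placeholder binary used in this guide):
# run_with_stats stats.txt ./application
```

Note that the redirection captures the whole standard error stream, so the file may contain program messages in addition to the statistics.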
     120
     121== Tracing with Extrae ==
     122
     123A ''trace.sh'' file can be used to include all the environment variables needed to get an instrumentation trace of the execution. The content of this file is as follows:
     124
     125{{{
     126#!/bin/bash
     127export EXTRAE_CONFIG_FILE=extrae.xml
     128export NANOS6="extrae"
     129"$@"
     130}}}
     131
     132Additionally, you will need to change your running script so that it invokes the program through this ''trace.sh'' script. Although you could instead add all the instrumentation-related environment variables to your running script, this extra script makes it easier to switch between instrumented and non-instrumented executions. When you need to instrument an execution, simply put ''trace.sh'' before the program invocation. Note that the ''extrae.xml'' file, which configures the Extrae library to produce a Paraver trace, is also needed.
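
One possible way to switch between instrumented and non-instrumented executions from a running script is sketched below. The `run_app` function and the `TRACE` and `TRACE_WRAPPER` variables are hypothetical names introduced for this illustration; only ''trace.sh'' comes from the text above.

```shell
# Hypothetical helper for a running script: launch the program through the
# tracing wrapper when TRACE=1, and directly otherwise.
# TRACE_WRAPPER (also hypothetical) defaults to ./trace.sh.
run_app() {
    if [ "${TRACE:-0}" = "1" ]; then
        "${TRACE_WRAPPER:-./trace.sh}" "$@"
    else
        "$@"
    fi
}

# Usage (./application is the placeholder binary used in this guide):
# run_app ./application            # non-instrumented execution
# TRACE=1 run_app ./application    # instrumented execution via ./trace.sh
```

With this layout the running script stays the same for both kinds of execution and only the `TRACE` setting changes.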
     133
     134= Example: Multisaxpy =
     135
     136The examples shown here are publicly available at [https://pm.bsc.es/gitlab/ompss-2/examples].
     137
     138Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/multisaxpy] and transfer it to a DEEP working directory.
    80139
    81140
     141
     142
     143
     144