Changes between Version 42 and Version 43 of Public/User_Guide/OmpSs-2


Timestamp:
Jun 14, 2019, 5:31:09 PM (5 years ago)
Author:
Pedro Martinez-Ferror
  • Public/User_Guide/OmpSs-2

    v42 v43  
    262262== Override the Number of Threads Used ==
    263263
    264 Let's have a closer look at the third implementation ''03.multisaxpy_task'' which took 0.648536 seconds to finish using 48 threads.
    265 
    266 A full CM node features 48 threads (0-47) divided in two sockets: 0-11,24-35 for the first socket and 12-23,36-47 for the second socket.  **Notice that they are indeed not consecutive! **
     264Let's have a closer look at the third implementation, i.e. ''03.multisaxpy_task'', which took 0.648536 seconds to finish using 48 threads. Remember that a full CM node features 48 threads (0-47) divided into two sockets: 0-11,24-35 for the first socket and 12-23,36-47 for the second socket.  **Notice that the thread IDs of each socket are not consecutive!**
    267265
    268266We can change the threads used by !OmpSs-2 with the Linux command `taskset`.  For example, the command to run this binary with 24 threads interleaved between the two sockets would be:
     
    282280`taskset -c 0-5,12-17 ./03.multisaxpy_task 16777216 8192 100`
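Before launching a run, it can help to double-check how many threads a given CPU list actually selects. The following is only a minimal sketch: `count_cpus` is a hypothetical helper (it is not part of `taskset` or !OmpSs-2) and it only understands comma-separated IDs and `a-b` ranges, not the stride syntax that `taskset` also accepts.

```shell
# Count the CPUs named by a taskset-style list such as "0-11,24-35".
# count_cpus is a hypothetical helper written for this guide only.
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,                      # split the list on commas
    for part in $list; do
        case $part in
            *-*)               # a range "first-last" contributes last-first+1 CPUs
                first=${part%-*}
                last=${part#*-}
                total=$((total + last - first + 1))
                ;;
            *)                 # a single CPU ID contributes one CPU
                total=$((total + 1))
                ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

count_cpus "0-11,24-35"   # first socket of a CM node: prints 24
count_cpus "12-23"        # half of the second socket: prints 12
```

On a Linux system with GNU coreutils, running `taskset -c 12-23 nproc` should report the same count, since `nproc` takes the CPU affinity of the process into account.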
    283281
    284 Changing the number of threads assigned to !OmpSs-2 affects the performance of the application:
     282Changing the number of threads assigned to !OmpSs-2 affects the performance of the application, and not necessarily for the worse, as the following runs show:
    285283{{{
    286284$ ./03.multisaxpy_task 16777216 8192 100
     
    296294}}}
    297295
     296== Creating a Dependency Graph ==
     297
     298Let's continue with the same example used above and create a dependency graph using only 12 threads of one socket (e.g. the second), which proved to be the affinity giving the best results. Furthermore, we are no longer interested in benchmarking this example, so there is no need to run 100 iterations or to use a large number of elements: a single iteration suffices to generate a complete graph of this application. Run the following command:
     299
     300`NANOS6=graph taskset -c 12-23 ./03.multisaxpy_task 196608 8192 1`
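As a quick sanity check on the problem size, the number of tasks per iteration can be estimated from the command-line arguments. This sketch assumes that the first argument is the total number of elements and that the second argument is the number of elements handled by each task; the variable names `elements` and `block` are illustrative only.

```shell
elements=196608   # first argument of the command above (assumed: total elements)
block=8192        # second argument (assumed: elements processed per task)

# Integer division gives the expected number of tasks per iteration.
echo $((elements / block))   # prints 24
```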
     301
     302This command can take some time.  Once it has finished, it should have created a script named ''graph-XXXXX-YYYYYYYYY-script.sh'' and a directory named ''graph-XXXXX-YYYYYYYYY-components''.  Execute the script by typing:
     303
     304`bash graph-XXXXX-YYYYYYYYY-script.sh`
     305
     306to merge the intermediate results into a single PDF file which should look like this:
     307
     308[[Image(SAXPYgraph.png, 30%)]]
     309
     310which illustrates 24 tasks executed in parallel using 12 threads.
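Since the XXXXX and YYYYYYYYY parts of the script name change from run to run, the two steps above can be combined in a small wrapper. This is only a sketch: `newest_graph_script` is a hypothetical helper that relies solely on the ''graph-...-script.sh'' naming pattern described above, and the `-nt` comparison it uses is available in bash and most other shells but is not strictly POSIX.

```shell
# newest_graph_script is a hypothetical helper: it prints the most recently
# modified graph-*-script.sh found in the given directory (default: current
# directory), or nothing if no such script exists.
newest_graph_script() {
    dir=${1:-.}
    newest=
    for f in "$dir"/graph-*-script.sh; do
        [ -e "$f" ] || continue                  # pattern did not match any file
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    if [ -n "$newest" ]; then
        echo "$newest"
    fi
}

script=$(newest_graph_script)
if [ -n "$script" ]; then
    bash "$script"                               # merges the components into the PDF
else
    echo "no graph script found; was the program run with NANOS6=graph?" >&2
fi
```

Picking the newest script avoids accidentally re-running a script left over from an earlier experiment in the same directory.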
    298311
    299312