Changes between Version 39 and Version 40 of Public/User_Guide/OmpSs-2


Timestamp:
Jun 14, 2019, 4:19:33 PM
Author:
Pedro Martinez-Ferror
Comment:

  • Public/User_Guide/OmpSs-2

    v39 v40  
    141141A **trace.sh** file can be used to include all the environment variables needed to get an instrumentation trace of the execution. The content of this file is as follows:
    142142
    143 {{{
     143{{{#!bash
    144144#!/bin/bash
    145145export EXTRAE_CONFIG_FILE=extrae.xml
     
    177177`git clone https://pm.bsc.es/gitlab/ompss-2/examples/multisaxpy`
    178178
    179 and upload it to the ''/work/cdeep/'' directory of the DEEP cluster:
    180 
    181 `scp -r multisaxpy/ USERNAME@deep.fz-juelich.de:~/work/cdeep/`
     179and upload it to the ''/work/cdeep/USERNAME/'' directory (which might not exist yet) of the DEEP cluster:
     180
     181`scp -r multisaxpy/ USERNAME@deep.fz-juelich.de:~/work/cdeep/USERNAME/`
    182182
    183183Now connect to the DEEP login node:
     
    187187and from there go to the ''multisaxpy'' folder
    188188
    189 `cd /work/cdeep/multisaxpy`
     189`cd /work/cdeep/USERNAME/multisaxpy`
    190190
    191191to request an interactive cluster module (CM) node in order to use all the available 48 threads to run a pure !OmpSs-2 application:
     
    205205`module load OmpSs-2`
    206206
    207 and check the affinity via the command `srun numactl --show` which should report:
     207and check the affinity via the command `srun numactl --show` which should report the following:
    208208{{{
     209$ srun numactl --show
    209210policy: default
    210211preferred node: current
     
    215216}}}
    216217
    217 
    218 
    219 
    220 `hola
    221 
    222 holo`
    223 
    224 
    225 
    226 
     218Now you should be able to clean, build and execute this benchmark consisting of 7 implementations via the command `make`:
    227219{{{
    228220$ make clean
     
    253245$ make run
    254246./01.multisaxpy_seq 16777216 8192 100
    255 size: 16777216, bs: 8192, iterations: 100, time: 3.2982, performance: 0.508678
     247size: 16777216, bs: 8192, iterations: 100, time: 3.30132, performance: 0.508197
    256248NANOS6_SCHEDULER=fifo ./02.multisaxpy_task_loop 16777216 8192 100
    257 size: 16777216, bs: 8192, iterations: 100, time: 0.40835, performance: 4.10854
     249size: 16777216, bs: 8192, iterations: 100, time: 0.411888, performance: 4.07325
    258250./03.multisaxpy_task 16777216 8192 100
    259 size: 16777216, bs: 8192, iterations: 100, time: 0.646697, performance: 2.59429
     251size: 16777216, bs: 8192, iterations: 100, time: 0.648536, performance: 2.58694
    260252./04.multisaxpy_task+dep 16777216 8192 100
    261 size: 16777216, bs: 8192, iterations: 100, time: 1.00903, performance: 1.6627
     253size: 16777216, bs: 8192, iterations: 100, time: 1.04207, performance: 1.60998
    262254./05.multisaxpy_task+weakdep 16777216 8192 100
    263 size: 16777216, bs: 8192, iterations: 100, time: 1.17464, performance: 1.42829
     255size: 16777216, bs: 8192, iterations: 100, time: 1.09049, performance: 1.5385
    264256NANOS6_SCHEDULER=fifo ./06.multisaxpy_task_loop+weakdep 16777216 8192 100
    265 size: 16777216, bs: 8192, iterations: 100, time: 3.81836, performance: 0.439382
     257size: 16777216, bs: 8192, iterations: 100, time: 8.91, performance: 0.188296
    266258./07.multisaxpy_task+reduction 16777216 8192  100
    267 size: 16777216, bs: 8192, iterations: 100, time: 4.26565, performance: 0.39331
     259size: 16777216, bs: 8192, iterations: 100, time: 7.03558, performance: 0.238462
    268260}}}
    269261
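The `make run` output above alternates between a command line and a result line, which makes the seven implementations hard to compare at a glance. The following is a minimal sketch, not part of the guide or of the multisaxpy repository, of one way to condense that output. The function name `summarize_runs` is hypothetical, and the sketch assumes the output format shown above: a command line containing a `./binary` token, followed by a result line starting with `size:` and containing `time: <seconds>,`. The speedup is computed relative to the first run in the output, which here is the sequential version.

```shell
#!/bin/bash
# Hypothetical helper: condense `make run` output into one line per binary,
# printing "<binary> <time> <speedup relative to the first run>".
summarize_runs() {
    awk '
        # Remember the most recent binary name: the token starting with "./".
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^\.\//) { name = substr($i, 3); break }
        }
        # Result lines start with "size:" and carry "time: <seconds>,".
        /^size:/ {
            for (i = 1; i < NF; i++)
                if ($i == "time:") { t = $(i + 1); sub(/,$/, "", t) }
            if (base == "") base = t   # the first run is the baseline
            printf "%s %s %.2f\n", name, t, base / t
        }
    '
}
```

Assuming the function is defined in the current shell, it could be used as `make run | summarize_runs`, or applied to a saved log with `summarize_runs < run.log` (the file name is illustrative). Lines that match neither pattern, such as the `$ make run` prompt, are ignored.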