Changes between Version 39 and Version 40 of Public/User_Guide/OmpSs-2
- Timestamp:
- Jun 14, 2019, 4:19:33 PM (6 years ago)
Public/User_Guide/OmpSs-2
v39  v40
141  141    A **trace.sh** file can be used to include all the environment variables needed to get an instrumentation trace of the execution. The content of this file is as follows:
142  142
143       - {{{
     143  + {{{#!bash
144  144    #!/bin/bash
145  145    export EXTRAE_CONFIG_FILE=extrae.xml
 …    …
177  177    `git clone https://pm.bsc.es/gitlab/ompss-2/examples/multisaxpy`
178  178
179       - and upload it to the ''/work/cdeep/'' directory of the DEEP cluster:
180       -
181       - `scp -r multisaxpy/ USERNAME@deep.fz-juelich.de:~/work/cdeep/`
     179  + and upload it to the ''/work/cdeep/USERNAME/'' directory (which might not exist yet) of the DEEP cluster:
     180  +
     181  + `scp -r multisaxpy/ USERNAME@deep.fz-juelich.de:~/work/cdeep/USERNAME/`
182  182
183  183    Now connect to the DEEP login node:
 …    …
187  187    and from there go to the ''multisaxpy'' folder
188  188
189       - `cd /work/cdeep/multisaxpy`
     189  + `cd /work/cdeep/USERNAME/multisaxpy`
190  190
191  191    to request an interactive cluster module (CM) node in order to use all the available 48 threads to run a pure !OmpSs-2 application:
 …    …
205  205    `module load OmpSs-2`
206  206
207       - and check the affinity via the command `srun numactly --show` which should report:
     207  + and check the affinity via the command `srun numactly --show` which should report the following:
208  208    {{{
     209  + $ srun numactly --show
209  210    policy: default
210  211    preferred node: current
 …    …
215  216    }}}
216  217
217       -
218       -
219       -
220       - `hola
221       -
222       - holo`
223       -
224       -
225       -
226  218    Now you should be able to clean, build and execute this benchmark consisting of 7 implementations via the command `make`:
227  219    {{{
228  220    $ make clean
 …    …
253  245    $ make run
254  246    ./01.multisaxpy_seq 16777216 8192 100
255       - size: 16777216, bs: 8192, iterations: 100, time: 3.2982, performance: 0.508678
     247  + size: 16777216, bs: 8192, iterations: 100, time: 3.30132, performance: 0.508197
256  248    NANOS6_SCHEDULER=fifo ./02.multisaxpy_task_loop 16777216 8192 100
257       - size: 16777216, bs: 8192, iterations: 100, time: 0.40835, performance: 4.10854
     249  + size: 16777216, bs: 8192, iterations: 100, time: 0.411888, performance: 4.07325
258  250    ./03.multisaxpy_task 16777216 8192 100
259       - size: 16777216, bs: 8192, iterations: 100, time: 0.646697, performance: 2.59429
     251  + size: 16777216, bs: 8192, iterations: 100, time: 0.648536, performance: 2.58694
260  252    ./04.multisaxpy_task+dep 16777216 8192 100
261       - size: 16777216, bs: 8192, iterations: 100, time: 1.00903, performance: 1.6627
     253  + size: 16777216, bs: 8192, iterations: 100, time: 1.04207, performance: 1.60998
262  254    ./05.multisaxpy_task+weakdep 16777216 8192 100
263       - size: 16777216, bs: 8192, iterations: 100, time: 1.17464, performance: 1.42829
     255  + size: 16777216, bs: 8192, iterations: 100, time: 1.09049, performance: 1.5385
264  256    NANOS6_SCHEDULER=fifo ./06.multisaxpy_task_loop+weakdep 16777216 8192 100
265       - size: 16777216, bs: 8192, iterations: 100, time: 3.81836, performance: 0.439382
     257  + size: 16777216, bs: 8192, iterations: 100, time: 8.91, performance: 0.188296
266  258    ./07.multisaxpy_task+reduction 16777216 8192 100
267       - size: 16777216, bs: 8192, iterations: 100, time: 4.26565, performance: 0.39331
     259  + size: 16777216, bs: 8192, iterations: 100, time: 7.03558, performance: 0.238462
268  260    }}}
269  261
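The `make run` figures in both versions appear consistent with `performance = size * iterations / time / 10^9` (for example, 16777216 * 100 / 3.30132 / 10^9 is roughly 0.508). That relation is inferred from the numbers shown here and is not stated in the guide. Under that assumption, a result line could be sanity-checked with a small sketch like the one below; the function name `recompute_perf` is hypothetical and not part of the multisaxpy example.

```shell
#!/bin/bash
# Hypothetical helper (not part of the multisaxpy sources): recompute the
# "performance" figure from a single result line, ASSUMING
#   performance = size * iterations / time / 1e9
# which is inferred from the reported numbers, not documented.
recompute_perf() {
    # $1: one result line, e.g.
    # "size: 16777216, bs: 8192, iterations: 100, time: 3.30132, performance: 0.508197"
    # Fields are split on ":" or "," followed by optional spaces, so that
    # $2 = size, $6 = iterations, $8 = time.
    echo "$1" | LC_ALL=C awk -F'[:,] *' '{ printf "%.3f\n", $2 * $6 / $8 / 1e9 }'
}

# Example call with the v40 line of 01.multisaxpy_seq
recompute_perf "size: 16777216, bs: 8192, iterations: 100, time: 3.30132, performance: 0.508197"
```

If the recomputed value matches the reported one to three decimals, the line is at least internally consistent; comparing the `time` fields of the removed and added lines then shows how much each variant changed between the two recorded runs.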