Changes between Version 42 and Version 43 of Public/User_Guide/OmpSs-2
- Timestamp: Jun 14, 2019, 5:31:09 PM
Public/User_Guide/OmpSs-2
v42 v43   (- removed in v42, + added in v43, unmarked rows are unmodified)
262 262   == Override the Number of Threads Used ==
263 263
264     - Let's have a closer look at the third implementation ''03.multisaxpy_task'' which took 0.648536 seconds to finish using 48 threads.
265     -
266     - A full CM node features 48 threads (0-47) divided in two sockets: 0-11,24-35 for the first socket and 12-23,36-47 for the second socket. **Notice that they are indeed not consecutive! **
    264 + Let's have a closer look at the third implementation, i.e. ''03.multisaxpy_task'', which took 0.648536 seconds to finish using 48 threads. Remember that a full CM node features 48 threads (0-47) divided in two sockets: 0-11,24-35 for the first socket and 12-23,36-47 for the second socket. **Notice that they are indeed not consecutive! **
267 265
268 266   We can change the threads used by !OmpSs-2 with the Linux command `taskset`. For example, the command to run this binary with 24 threads interleaved between the two sockets would be:
…   …
282 280   `taskset -c 0-5,12-17 ./03.multisaxpy_task 16777216 8192 100`
283 281
284     - Changing the number of threads assigned to !OmpSs-2 affects the performance of the application :
    282 + Changing the number of threads assigned to !OmpSs-2 affects the performance of the application and not necessarily in a negative way, e.g. see below:
285 283   {{{
286 284   $ ./03.multisaxpy_task 16777216 8192 100
…   …
296 294   }}}
297 295
    296 + == Creating a Dependency Graph ==
    297 +
    298 + Let's continue with the same example used above and create a dependency graph using only the 12 threads of one socket (e.g. the second), which proved to be the affinity giving the best results. Furthermore, we are no longer interested in running 100 iterations (nor in using a large number of elements) to benchmark this example, so a single iteration suffices to generate a complete graph of this application. Run the following command:
    299 +
    300 + `NANOS6=graph taskset -c 12-23 ./03.multisaxpy_task 196608 8192 1`
    301 +
    302 + This command can take some time. Once it has finished, it should have created a script with the name ''graph-XXXXX-YYYYYYYYY-script.sh'' and a directory ''graph-XXXXX-YYYYYYYYY-components''. Execute the script by typing:
    303 +
    304 + `bash graph-XXXXX-YYYYYYYYY-script.sh`
    305 +
    306 + to merge the intermediate results into a single PDF file which should look like this:
    307 +
    308 + [[Image(SAXPYgraph.png, 30%)]]
    309 +
    310 + which illustrates 24 tasks executed in parallel using 12 threads.
298 311
299 312
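The non-consecutive thread numbering described under ''Override the Number of Threads Used'' is easy to get wrong when typing `taskset` lists by hand. The following is a minimal sketch of a helper that prints the CPU list of one socket. The function name `socket_cpus` is hypothetical and not part of !OmpSs-2 or `taskset`, and the layout (two ranges of 12 ids per socket, the second range offset by 24) is assumed from the text above, so it would need adjusting on a different node type.

```shell
#!/bin/sh
# Hypothetical helper (not part of OmpSs-2 or taskset).
# Assumed layout, taken from the text above:
#   socket 0 -> threads 0-11 and 24-35
#   socket 1 -> threads 12-23 and 36-47
socket_cpus() {
    socket=$1                      # socket index: 0 or 1
    per_range=12                   # ids in each consecutive range
    offset=24                      # distance between the two ranges of a socket
    first=$((socket * per_range))
    last=$((first + per_range - 1))
    echo "${first}-${last},$((first + offset))-$((last + offset))"
}

socket_cpus 0   # prints 0-11,24-35
socket_cpus 1   # prints 12-23,36-47

# Possible usage, pinning the run to every thread of the second socket:
#   taskset -c "$(socket_cpus 1)" ./03.multisaxpy_task 16777216 8192 100
```

The usage line is left as a comment because it needs the ''03.multisaxpy_task'' binary; only the list computation runs as written.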