Changes between Version 26 and Version 27 of Public/User_Guide/OmpSs-2


Timestamp:
Jun 12, 2019, 10:23:33 AM (5 years ago)
Author:
Pedro Martinez-Ferror
  • Public/User_Guide/OmpSs-2

    v26 v27  
    77* [#QuickSetuponDEEPSystem Quick Setup on DEEP System]
    88* [#RepositorywithExamples Repository with Examples]
    9 * [#Example:Multisaxpy Example: Multisaxpy]
     9* [#multisaxpybenchmark(OmpSs-2) multisaxpy benchmark (OmpSs-2)]
     10* [#dot-productbenchmark(OmpSs-2) dot-product benchmark (OmpSs-2)]
     11* [#mergesortbenchmark(OmpSs-2) mergesort benchmark (OmpSs-2)]
     12* [#nqueensbenchmark(OmpSs-2) nqueens benchmark (OmpSs-2)]
     13* [#Choleskybenchmark(OmpSs-2) Cholesky benchmark (OmpSs-2)]
     14* [#matmulbenchmark(OmpSs-2) matmul benchmark (OmpSs-2)]
    1015
    1116
     
    133138Additionally, you will need to change your running script so that it invokes the program through this trace.sh script. Although you could instead add all the instrumentation-related environment variables to your running script, it is preferable to use this extra script so that you can easily switch between instrumented and non-instrumented executions. When you need to instrument an execution, simply include trace.sh before the program invocation. Note that the **extrae.xml** file, which configures the Extrae library to produce a Paraver trace, is also needed.
    134139
    135 = Example: Multisaxpy =
     140= multisaxpy benchmark (OmpSs-2) =
    136141
    137142Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/multisaxpy] and transfer it to a DEEP working directory.
     
    148153
    149154where:
    150 * SIZE is the number of elements of the vectors used on the SAXPY operation.
    151 * The SAXPY operation will be applied to the vector in blocks that contains BLOCK_SIZE elements.
    152 * ITERATIONS is the number of times the SAXPY operation is executed.
     155* `SIZE` is the number of elements of the vectors used in the SAXPY operation.
     156* The SAXPY operation is applied to the vectors in blocks of `BLOCK_SIZE` elements.
     157* `ITERATIONS` is the number of times the SAXPY operation is executed.
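
The role of the three parameters can be illustrated with a minimal sequential sketch. It is written in Python for brevity and is not the code from the repository; the function names `saxpy_block` and `multisaxpy` are hypothetical. Each block is the unit of work that a task-based version could execute in parallel.

```python
def saxpy_block(a, x, y, start, block_size):
    """Apply y[i] += a * x[i] to one block of the vectors."""
    end = min(start + block_size, len(x))
    for i in range(start, end):
        y[i] += a * x[i]


def multisaxpy(a, x, y, block_size, iterations):
    """Run the blocked SAXPY `iterations` times over the whole vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for _ in range(iterations):
        # Blocks within one iteration do not depend on each other.
        for start in range(0, len(x), block_size):
            saxpy_block(a, x, y, start, block_size)
    return y


# SIZE = 8, BLOCK_SIZE = 3, ITERATIONS = 4 (illustrative values only)
x = [1.0] * 8
y = [0.0] * 8
multisaxpy(2.0, x, y, block_size=3, iterations=4)
```

Note that `SIZE` does not need to be a multiple of `BLOCK_SIZE` in this sketch: the last block is simply shorter.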
    153158
    154159== Example output ==
     
    201206* [https://pm.bsc.es/gitlab/ompss-2/examples/multisaxpy]
    202207* [https://pm.bsc.es/ftp/ompss-2/doc/examples/local/sphinx/03-fundamentals.html]
    203 * [http://en.wikipedia.org/wiki/AXPY]
    204 
     208* [https://en.wikipedia.org/wiki/AXPY]
     209
     210
     211= dot-product benchmark (OmpSs-2) =
     212
     213Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/dot-product] and transfer it to a DEEP working directory.
     214
     215== Description ==
     216
     217This benchmark runs a dot-product operation. The dot-product (also known as scalar product) is an algebraic operation that takes two equal-length sequences of numbers and returns a single number.
     218
     219There are **3 implementations** of this benchmark.
     220
     221== Execution instructions ==
     222
     223`./dot_product SIZE CHUNK_SIZE`
     224
     225where:
     226* `SIZE` is the number of elements of the vectors used in the dot-product operation.
     227* The dot-product operation is applied to the vectors in blocks of `CHUNK_SIZE` elements.
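
As an illustration of how `CHUNK_SIZE` splits the work, the following sequential Python sketch computes one partial result per chunk and then accumulates the partial results. It is an assumption about the general blocking scheme, not the repository's implementation, and the name `dot_product` is hypothetical.

```python
def dot_product(x, y, chunk_size):
    """Chunked dot product: one partial sum per chunk, then a final reduction."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    result = 0.0
    for start in range(0, len(x), chunk_size):
        end = min(start + chunk_size, len(x))
        # Each partial sum only reads its own chunk, so chunks are independent.
        partial = sum(x[i] * y[i] for i in range(start, end))
        # Accumulating into `result` is the reduction step.
        result += partial
    return result


# SIZE = 5, CHUNK_SIZE = 2 (illustrative values only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0] * 5
total = dot_product(x, y, chunk_size=2)
```

In a parallel version, the accumulation into the shared result is the only step that needs synchronization or a reduction.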
     228
     229== References ==
     230
     231* [https://pm.bsc.es/gitlab/ompss-2/examples/dot-product]
     232* [https://en.wikipedia.org/wiki/Dot_product]
     233
     234= mergesort benchmark (OmpSs-2) =
     235
     236Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/mergesort] and transfer it to a DEEP working directory.
     237
     238== Description ==
     239
     240This benchmark is a recursive sorting algorithm based on comparisons.
     241
     242There are **6 implementations** of this benchmark.
     243
     244== Execution instructions ==
     245
     246`./mergesort N BLOCK_SIZE`
     247
     248where:
     249* `N` is the number of elements to be sorted. Mandatory for all versions of this benchmark.
     250* `BLOCK_SIZE` is the threshold at which a task becomes ''final''. If the array size is less than or equal to `BLOCK_SIZE`, the task becomes final, so no more tasks are created inside it. Mandatory for all versions of this benchmark.
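
The effect of the `BLOCK_SIZE` threshold can be sketched sequentially in Python. This is not the repository's code: the names `merge` and `mergesort` are hypothetical, and the built-in `sorted` stands in for the serial path taken once a task would become final.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def mergesort(values, block_size):
    """Recursive merge sort that stops splitting at block_size elements."""
    # At or below the threshold no further "tasks" are created:
    # the block is sorted serially.
    if len(values) <= max(block_size, 1):
        return sorted(values)
    mid = len(values) // 2
    # Above the threshold, each half is where a task could be spawned.
    left = mergesort(values[:mid], block_size)
    right = mergesort(values[mid:], block_size)
    return merge(left, right)


# N = 8, BLOCK_SIZE = 2 (illustrative values only)
data = [5, 2, 9, 1, 7, 3, 8, 6]
result = mergesort(data, block_size=2)
```

A larger `BLOCK_SIZE` produces fewer, coarser units of work; a smaller one produces more, finer units.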
     251
     252== References ==
     253
     254* [https://pm.bsc.es/gitlab/ompss-2/examples/mergesort]
     255* [https://en.wikipedia.org/wiki/Merge_sort]
     256
     257
     258= nqueens benchmark (OmpSs-2) =
     259
     260Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/nqueens] and transfer it to a DEEP working directory.
     261
     262== Description ==
     263
     264This benchmark computes, for an N x N chessboard, the number of ways
     265to place N chess queens on the chessboard such that none of them is able to
     266attack any other. It is implemented using a branch-and-bound algorithm.
     267
     268There are **7 implementations** of this benchmark.
     269
     270== Execution instructions ==
     271
     272`./n-queens N [threshold]`
     273
     274where:
     275* `N` is the chessboard's size. Mandatory for all versions of this benchmark.
     276* `threshold` is the number of rows of the chessboard that will generate tasks.
     277The remaining rows (N - threshold) will not generate tasks and will be executed
     278in serial mode. Mandatory for all versions of this benchmark except 01 (sequential version) and 02 (fully parallel version).
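
The meaning of `threshold` can be illustrated with a sequential Python sketch. It is a plain backtracking counter written under the assumption that rows are filled one at a time; it is not the repository's code, and the names `is_safe`, `count_serial` and `count_with_threshold` are hypothetical.

```python
def is_safe(placed, col):
    """Check whether a queen in the next row at `col` is attacked."""
    row = len(placed)
    for r, c in enumerate(placed):
        if c == col or abs(c - col) == row - r:
            return False
    return True


def count_serial(n, placed):
    """Count the solutions below a partial placement without creating tasks."""
    if len(placed) == n:
        return 1
    return sum(count_serial(n, placed + [col])
               for col in range(n) if is_safe(placed, col))


def count_with_threshold(n, threshold, placed=None):
    """Count solutions; only the first `threshold` rows would generate tasks."""
    placed = [] if placed is None else placed
    row = len(placed)
    if row == n:
        return 1
    if row >= threshold:
        # Remaining N - threshold rows: serial mode.
        return count_serial(n, placed)
    total = 0
    for col in range(n):
        if is_safe(placed, col):
            # In a task-based version, this call is where a task would be created.
            total += count_with_threshold(n, threshold, placed + [col])
    return total


# N = 6, threshold = 2 (illustrative values only)
solutions = count_with_threshold(6, 2)
```

The threshold changes only how the work is divided, not the result: any value between 0 and N yields the same count.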
     279
     280== References ==
     281
     282* [https://pm.bsc.es/gitlab/ompss-2/examples/nqueens]
     283* [https://en.wikipedia.org/wiki/Eight_queens_puzzle]
     284
     285
     286= Cholesky benchmark (OmpSs-2+CBLAS|LAPACKE) =
     287
     288Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/cholesky] and transfer it to a DEEP working directory.
     289
     290== Description ==
     291
     292This benchmark shows a Cholesky decomposition with OmpSs-2 using tasks with priorities.
     293
     294There are **3 implementations** of this benchmark.
     295
     296The code uses the CBLAS and LAPACKE interfaces to both BLAS and LAPACK.
     297By default we try to find MKL, ATLAS and LAPACKE from the `MKLROOT`, `LIBRARY_PATH` and `C_INCLUDE_PATH` environment variables. If you are using an implementation with other linking requirements, please edit the `LIBS` entry in the Makefile accordingly.
     298
     299The Makefile has three additional rules:
     300* **run:** runs each version one after the other.
     301* **run-graph:** runs the OmpSs-2 versions with the graph instrumentation.
     302* **run-extrae:** runs the OmpSs-2 versions with the Extrae instrumentation.
     303
     304For the graph instrumentation, it is recommended to view the resulting PDF in single page mode and to advance through the pages. This will show the actual instantiation and execution of the code. For the Extrae instrumentation, Extrae must be loaded and available at least through the `LD_LIBRARY_PATH` environment variable.
     305
     306== Execution instructions ==
     307
     308`./cholesky SIZE BLOCK_SIZE`
     309
     310where:
     311* `SIZE` is the number of elements per side of the matrix.
     312* The decomposition is made by blocks of `BLOCK_SIZE` by `BLOCK_SIZE` elements.
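
The block structure implied by `SIZE` and `BLOCK_SIZE` can be sketched with a sequential, pure-Python tiled factorization. This is an illustration under stated assumptions, not the repository's code: the benchmark itself relies on CBLAS and LAPACKE, task priorities are not modeled, and the helper names `potrf`, `trsm` and `update` are hypothetical names chosen after the corresponding LAPACK/BLAS operations.

```python
import math


def potrf(a, k0, k1):
    """Factor the diagonal block a[k0:k1][k0:k1] in place (lower triangle)."""
    for j in range(k0, k1):
        for p in range(k0, j):
            a[j][j] -= a[j][p] * a[j][p]
        a[j][j] = math.sqrt(a[j][j])
        for i in range(j + 1, k1):
            for p in range(k0, j):
                a[i][j] -= a[i][p] * a[j][p]
            a[i][j] /= a[j][j]


def trsm(a, k0, k1, i0, i1):
    """Solve X * L_kk^T = A_ik for the rows i0:i1 below the diagonal block."""
    for i in range(i0, i1):
        for j in range(k0, k1):
            for p in range(k0, j):
                a[i][j] -= a[i][p] * a[j][p]
            a[i][j] /= a[j][j]


def update(a, k0, k1, i0, i1, j0, j1):
    """Subtract L_ik * L_jk^T from a trailing block (lower triangle only)."""
    for i in range(i0, i1):
        for j in range(j0, min(j1, i + 1)):
            for p in range(k0, k1):
                a[i][j] -= a[i][p] * a[j][p]


def cholesky_blocked(a, block_size):
    """Overwrite the symmetric positive-definite matrix `a` with its lower factor."""
    n = len(a)
    for k0 in range(0, n, block_size):
        k1 = min(k0 + block_size, n)
        potrf(a, k0, k1)                      # diagonal block
        for i0 in range(k1, n, block_size):   # panel below the diagonal block
            trsm(a, k0, k1, i0, min(i0 + block_size, n))
        for i0 in range(k1, n, block_size):   # trailing submatrix
            i1 = min(i0 + block_size, n)
            for j0 in range(k1, i0 + 1, block_size):
                update(a, k0, k1, i0, i1, j0, min(j0 + block_size, n))
    for i in range(n):                        # clear the unused upper triangle
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a


# SIZE = 4, BLOCK_SIZE = 2 (illustrative values only)
matrix = [[4.0, 2.0, 0.0, 2.0],
          [2.0, 10.0, 3.0, 1.0],
          [0.0, 3.0, 5.0, 2.0],
          [2.0, 1.0, 2.0, 3.0]]
factor = cholesky_blocked(matrix, block_size=2)
```

Each call to `potrf`, `trsm` or `update` touches a single block, which is the granularity at which a task-based version can express its dependencies.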
     313
     314== References ==
     315
     316* [https://pm.bsc.es/gitlab/ompss-2/examples/cholesky]
     317* [https://en.wikipedia.org/wiki/Cholesky_decomposition]