402 | | * [https://pm.bsc.es/ftp/ompss-2/doc/examples/local/sphinx/02-examples.html] |
403 | | * [https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm] |
404 | | |
405 | | |
406 | | ---- |
| 404 | * [https://en.wikipedia.org/wiki/N-body_simulation] |
| 405 | |
| 406 | |
| 407 | ---- |
| 408 | |
| 409 | |
| 410 | = heat benchmark (MPI+OmpSs-2+TAMPI) = |
| 411 | |
| 412 | Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/heat] and transfer it to a DEEP working directory. |
| 413 | |
| 414 | == Description == |
| 415 | |
| 416 | This benchmark uses an iterative Gauss-Seidel method to solve the heat equation, |
| 417 | which is a parabolic partial differential equation that describes the distribution of heat (or variation in temperature) in a given region over time. The heat equation is of fundamental importance in a wide range of scientific fields. In |
| 418 | mathematics, it is the parabolic partial differential equation par excellence. In statistics, it is related to the study of Brownian motion. The diffusion equation is a more general version of the heat equation and is used to study chemical diffusion processes. |
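
To make the iterative method concrete, the following is a minimal sketch of a Gauss-Seidel sweep on a small 2D grid. It is not taken from the benchmark's sources: the function names, the four-neighbour averaging stencil, and the fixed boundary values are assumptions chosen for illustration, and the real benchmark additionally splits the matrix into blocks and distributes it across MPI processes.

```python
def make_grid(n, boundary=1.0, interior=0.0):
    """Build an n x n grid with a fixed boundary value (illustrative setup)."""
    grid = [[interior] * n for _ in range(n)]
    for k in range(n):
        grid[0][k] = grid[n - 1][k] = boundary
        grid[k][0] = grid[k][n - 1] = boundary
    return grid


def gauss_seidel_sweep(grid):
    """Update every interior cell in place; return the largest change seen."""
    max_change = 0.0
    for i in range(1, len(grid) - 1):
        for j in range(1, len(grid[i]) - 1):
            # Average of the four neighbours. The upper and left values were
            # already updated during this sweep, which is what distinguishes
            # Gauss-Seidel from a Jacobi iteration.
            new = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                          + grid[i][j - 1] + grid[i][j + 1])
            max_change = max(max_change, abs(new - grid[i][j]))
            grid[i][j] = new
    return max_change


def solve(grid, timesteps):
    """Run a fixed number of sweeps, analogous to the -t option."""
    for _ in range(timesteps):
        gauss_seidel_sweep(grid)
    return grid
```

With a constant boundary, repeated sweeps drive the interior towards the boundary value, which gives a quick sanity check for the sketch.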
| 419 | |
| 420 | There are **9 implementations** of this benchmark, which are compiled into separate |
| 421 | binaries by executing the command `make`. |
| 422 | |
| 423 | The interoperability versions (MPI+OmpSs-2+TAMPI) are compiled only if the environment variable `TAMPI_HOME` is set to the Task-Aware MPI (TAMPI) library's installation directory. |
| 424 | |
| 425 | == Execution instructions == |
| 426 | |
| 427 | The binaries accept several options. The most relevant options are the size |
| 428 | of the matrix in each dimension (`-s`) and the number of timesteps (`-t`). More options can be seen with the `-h` option. An example of execution |
| 429 | could be: |
| 430 | |
| 431 | `mpiexec -n 4 -bind-to hwthread:16 ./heat -t 150 -s 8192` |
| 432 | |
| 433 | in which the application will perform 150 timesteps using 4 MPI processes with 16 |
| 434 | hardware threads in each process (used by the OmpSs-2 runtime). The size of the |
| 435 | matrix in each dimension will be 8192 (8192^2^ elements in total), which means |
| 436 | that each process will have 2048x8192 elements (16 blocks per process). |
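
The arithmetic behind these figures can be sketched as follows. The helper names are hypothetical, and the block size of 1024 elements per dimension is an assumption: it is not stated above, but it is consistent with the 16 blocks per process quoted for this run. The row-wise split across processes is inferred from the 2048x8192 figure.

```python
def rows_per_process(size, processes):
    """Rows owned by each MPI process when the matrix is split row-wise."""
    if size % processes != 0:
        raise ValueError("matrix size must be divisible by the process count")
    return size // processes


def blocks_per_process(size, processes, block_size=1024):
    """Number of square blocks per process (block_size is an assumed value)."""
    rows = rows_per_process(size, processes)
    if rows % block_size != 0 or size % block_size != 0:
        raise ValueError("block size must divide the local matrix dimensions")
    return (rows // block_size) * (size // block_size)


# Example from the text: -s 8192 with 4 MPI processes.
local_rows = rows_per_process(8192, 4)      # 2048 rows of 8192 elements each
local_blocks = blocks_per_process(8192, 4)  # 2 x 8 = 16 blocks
```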
| 437 | |
| 438 | == References == |
| 439 | |
| 440 | * [https://pm.bsc.es/gitlab/ompss-2/examples/heat] |
| 441 | * [https://pm.bsc.es/ftp/ompss-2/doc/examples/local/sphinx/04-mpi+ompss-2.html] |
| 442 | * [https://en.wikipedia.org/wiki/Heat_equation] |
| 443 | |
| 444 | ---- |
| 445 | |
| 446 | = krist benchmark (OmpSs-2+CUDA) = |
| 447 | |
| 448 | Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/krist] and transfer it to a DEEP working directory. |
| 449 | |
| 450 | == Description == |
| 451 | |
| 452 | This benchmark represents the krist kernel, which is used in crystallography to find the exact shape of a molecule using Röntgen diffraction on single crystals or powders. |
| 453 | |
| 454 | There are **2 implementations** of this benchmark, ''krist'' and ''krist-unified'', using regular and unified CUDA memory, respectively. |
| 455 | |
| 456 | == Execution instructions == |
| 457 | |
| 458 | `./krist N_A N_R` |
| 459 | |
| 460 | where: |
| 461 | * `N_A` is the number of atoms (1000 by default). |
| 462 | * `N_R` is the number of reflections (10000 by default). |
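
For scripted runs, the invocation above can be assembled programmatically. This is a small sketch, not part of the benchmark: `krist_command` is a hypothetical helper, and it assumes both arguments are positional and given in the order shown, with the defaults listed above.

```python
def krist_command(n_atoms=1000, n_reflections=10000):
    """Return the argument list for `./krist N_A N_R` (hypothetical helper)."""
    if n_atoms <= 0 or n_reflections <= 0:
        raise ValueError("both counts must be positive")
    # Arguments are passed as strings, in the order N_A then N_R.
    return ["./krist", str(n_atoms), str(n_reflections)]
```

The resulting list can be handed to `subprocess.run(krist_command(2000, 50000), check=True)` from the directory that contains the binary.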
| 463 | |
| 464 | == References == |
| 465 | |
| 466 | * [https://pm.bsc.es/gitlab/ompss-2/examples/krist] |