Changes between Version 3 and Version 4 of Public/User_Guide/TAMPI


Timestamp: Jun 17, 2019, 3:15:06 PM
Author: Pedro Martinez-Ferror
}}}

----


= Nbody Benchmark (MPI+!OmpSs-2+TAMPI) =

Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/nbody] and transfer it to a DEEP working directory.

== Description ==

This benchmark represents an N-body simulation that numerically approximates the evolution of a system of bodies in which each body continuously interacts with every other body. A familiar example is an astrophysical simulation in which each body represents a galaxy or an individual star, and the bodies attract each other through the gravitational force.

There are **7 implementations** of this benchmark, which are compiled into separate binaries by running the command `make`. These versions can be blocking, in which the particle space is divided into smaller blocks, or non-blocking, in which it is not.

The interoperability versions (MPI+!OmpSs-2+TAMPI) are compiled only if the environment variable `TAMPI_HOME` is set to the installation directory of the Task-Aware MPI (TAMPI) library.
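The environment test can be reproduced in the shell before building; this is a hedged sketch of the guard (the messages are illustrative, not the build system's actual output):

```shell
# Illustrative check mirroring the TAMPI_HOME condition; the echo
# messages below are examples, not the Makefile's real output.
if [ -n "$TAMPI_HOME" ]; then
    echo "TAMPI_HOME is set: interoperability binaries will be built"
else
    echo "TAMPI_HOME not set: interoperability binaries will be skipped"
fi
```

Exporting `TAMPI_HOME` (pointing at your TAMPI installation) before running `make` is what enables the interoperability variants.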

== Execution Instructions ==

The binaries accept several options. The most relevant ones are the total number of particles (`-p`) and the number of timesteps (`-t`). The full list of options can be seen with `-h`. An example execution could be:

`mpiexec -n 4 -bind-to hwthread:16 ./nbody -t 100 -p 8192`

in which the application performs 100 timesteps across 4 MPI processes, each with 16 hardware threads (used by the !OmpSs-2 runtime). The total number of particles is 8192, so each process handles 2048 particles (2 blocks per process).
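The per-process particle count quoted above follows from simple integer division; a quick shell sketch of the arithmetic:

```shell
# Particles per MPI rank for the example command above
# (8192 total particles split across 4 ranks).
TOTAL_PARTICLES=8192
NUM_RANKS=4
echo "$(( TOTAL_PARTICLES / NUM_RANKS )) particles per rank"
```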

== References ==

* [https://pm.bsc.es/gitlab/ompss-2/examples/nbody]
* [https://en.wikipedia.org/wiki/N-body_simulation]


----


= Heat Benchmark (MPI+!OmpSs-2+TAMPI) =

Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/heat] and transfer it to a DEEP working directory.

== Description ==

This benchmark uses an iterative Gauss-Seidel method to solve the heat equation, a parabolic partial differential equation that describes the distribution of heat (or the variation in temperature) in a given region over time. The heat equation is of fundamental importance in a wide range of scientific fields. In mathematics, it is the parabolic partial differential equation par excellence. In statistics, it is related to the study of Brownian motion. Moreover, the diffusion equation, a generic version of the heat equation, is related to the study of chemical diffusion processes.

There are **9 implementations** of this benchmark, which are compiled into separate binaries by running the command `make`.

The interoperability versions (MPI+!OmpSs-2+TAMPI) are compiled only if the environment variable `TAMPI_HOME` is set to the installation directory of the Task-Aware MPI (TAMPI) library.

== Execution Instructions ==

The binaries accept several options. The most relevant ones are the size of the matrix in each dimension (`-s`) and the number of timesteps (`-t`). The full list of options can be seen with `-h`. An example execution could be:

`mpiexec -n 4 -bind-to hwthread:16 ./heat -t 150 -s 8192`

in which the application performs 150 timesteps across 4 MPI processes, each with 16 hardware threads (used by the !OmpSs-2 runtime). The size of the matrix in each dimension is 8192 (8192^2^ elements in total), which means each process handles 2048x8192 elements (16 blocks per process).
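The per-process share quoted above follows from dividing the matrix rows among the ranks; a shell sketch of the split:

```shell
# Rows of the 8192x8192 matrix assigned to each of the 4 MPI ranks.
SIZE=8192
NUM_RANKS=4
echo "$(( SIZE / NUM_RANKS ))x${SIZE} elements per rank"
```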

== References ==

* [https://pm.bsc.es/gitlab/ompss-2/examples/heat]
* [https://pm.bsc.es/ftp/ompss-2/doc/examples/local/sphinx/04-mpi+ompss-2.html]
* [https://en.wikipedia.org/wiki/Heat_equation]

----

= Krist Benchmark (!OmpSs-2+CUDA) =

Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/krist] and transfer it to a DEEP working directory.

== Description ==

This benchmark represents the krist kernel, which is used in crystallography to find the exact shape of a molecule using Röntgen (X-ray) diffraction on single crystals or powders.

There are **2 implementations** of this benchmark, ''krist'' and ''krist-unified'', using regular and unified CUDA memory, respectively.

== Execution Instructions ==

`./krist N_A N_R`

where:
* `N_A` is the number of atoms (1000 by default).
* `N_R` is the number of reflections (10000 by default).
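The documented defaults can be mimicked with standard shell parameter substitution; this is an illustrative sketch of the defaulting behavior, not the benchmark's actual argument parsing (which happens inside the binary):

```shell
#!/bin/sh
# Illustrative wrapper showing the documented defaults; the real
# parsing is done by the krist binary itself.
N_A=${1:-1000}     # number of atoms, defaults to 1000
N_R=${2:-10000}    # number of reflections, defaults to 10000
echo "atoms=$N_A reflections=$N_R"
```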

== References ==

* [https://pm.bsc.es/gitlab/ompss-2/examples/krist]