Offloading_hybrid_apps

Timestamp:: Sep 17, 2019, 12:59:05 PM (6 years ago)
Author:: Kevin Sala
Comment:: —

Public/User_Guide/Offloading_hybrid_apps

-                      v4
+                      v5
   * '''Task-Aware MPI (TAMPI)''': The Task-Aware MPI library provides the interoperability mechanism
     for MPI and OpenMP/!OmpSs-2. Downloads and more information at https://github.com/bsc-pm/tampi.
+=== Versions ===
+The NBody application comes in several versions, each compiled into a separate binary
+by running the `make` command. All of them divide the particle space into smaller
+blocks. MPI processes are divided into two groups: GPU processes and CPU processes.
+GPU processes compute the forces between each pair of particle
+blocks and then send these forces to the CPU processes, where each process
+updates its particle blocks using the received forces. The particle and force blocks
+are distributed equally among the MPI processes of each group. Thus, each MPI process
+is in charge of computing the forces or updating the particles of a consecutive chunk
+of blocks.
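The consecutive-chunk distribution described above can be sketched as follows. This is a minimal illustration, not the application's actual code: the `block_chunk_t` type and the `chunk_for_rank` helper are hypothetical names, and the handling of a block count that does not divide evenly (the first ranks take one extra block) is an assumption, since the text only states that blocks are distributed equally.

```c
#include <assert.h>

/* Hypothetical type: the consecutive range of blocks owned by one process. */
typedef struct {
    int first;  /* index of the first block owned by the process */
    int count;  /* number of consecutive blocks owned by the process */
} block_chunk_t;

/*
 * Hypothetical helper: compute the chunk of blocks assigned to the process
 * with rank `group_rank` inside a group (GPU or CPU) of `group_size` processes.
 * Assumption: when total_blocks is not a multiple of group_size, the first
 * (total_blocks % group_size) ranks receive one extra block each.
 */
static block_chunk_t chunk_for_rank(int total_blocks, int group_size, int group_rank)
{
    assert(group_size > 0 && group_rank >= 0 && group_rank < group_size);

    int base = total_blocks / group_size;   /* blocks every rank gets */
    int extra = total_blocks % group_size;  /* leftover blocks to spread */

    block_chunk_t chunk;
    chunk.count = base + (group_rank < extra ? 1 : 0);
    /* Ranks before this one own `base` blocks each, plus one extra block
     * for each of them that is among the first `extra` ranks. */
    chunk.first = group_rank * base + (group_rank < extra ? group_rank : extra);
    return chunk;
}
```

With 16 blocks and a group of 4 processes, for example, the process with group rank 2 would own blocks 8 through 11 under this scheme.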
+The available versions are:
+  * '''nbody.mpi.${BS}bs.bin''': Parallel version using MPI.
+  * '''nbody.mpi.ompss2.${BS}bs.bin''': Parallel version using MPI + !OmpSs-2 tasks. Both computation and
+    communication phases are taskified; however, communication tasks are serialized by declaring an
+    artificial dependency on a sentinel variable. This prevents deadlocks between processes,
+    since communication tasks perform blocking MPI calls.
+  * '''nbody.mpi.ompss2.cuda.${BS}bs.bin''': The same as the previous version but using CUDA tasks to
+    execute the most compute-intensive parts of the application on the available GPUs.
+  * '''nbody.tampi.ompss2.${BS}bs.bin''': Parallel version using MPI + !OmpSs-2 tasks + TAMPI library. This
+    version disables the artificial dependencies on the sentinel variable, so communication tasks can
+    run in parallel. The TAMPI library manages the blocking MPI calls so that they do not block the
+    underlying execution resources.
+  * '''nbody.tampi.ompss2.cuda.${BS}bs.bin''': The same as the previous version but using CUDA tasks to
+    execute the most compute-intensive parts of the application on the available GPUs.
+  * '''nbody.mpi.omp.${BS}bs.bin''':
+  * '''nbody.mpi.omptarget.${BS}bs.bin''':
+  * '''nbody.tampi.omp.${BS}bs.bin''':
+  * '''nbody.tampi.omptarget.${BS}bs.bin''':
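The sentinel dependency used by the MPI + OmpSs-2 versions can be illustrated with the sketch below. It is not taken from the application sources: `blocking_send_stub` is a hypothetical stand-in for a blocking MPI call such as `MPI_Send`, and `NUM_BLOCKS`, `BLOCK_SIZE`, and `send_all_blocks` are illustrative names. Each communication task declares its real input dependency on the block plus an `inout` dependency on the `sentinel` variable, which chains the tasks one after another. Built with a compiler that does not support OmpSs-2, the pragmas are ignored and the code simply runs sequentially.

```c
#include <assert.h>

#define NUM_BLOCKS 4
#define BLOCK_SIZE 8

/* Records the order in which blocks are "sent" (for illustration only). */
static int send_order[NUM_BLOCKS];
static int sends_done = 0;

/* Hypothetical stand-in for a blocking MPI call such as MPI_Send. */
static void blocking_send_stub(const double *block, int block_id)
{
    (void) block;  /* a real version would transmit the block contents */
    assert(sends_done < NUM_BLOCKS);
    send_order[sends_done++] = block_id;
}

/* Create one communication task per block, serialized through a sentinel. */
static void send_all_blocks(double blocks[NUM_BLOCKS][BLOCK_SIZE])
{
    int sentinel = 0;  /* artificial dependency target; its value is never used */

    for (int b = 0; b < NUM_BLOCKS; ++b) {
        const double *block = blocks[b];

        /* in(...): the real data dependency on the block being sent.
         * inout(sentinel): the artificial dependency that forces every
         * communication task to wait for the previous one. */
        #pragma oss task in(block[0;BLOCK_SIZE]) inout(sentinel) label("send block")
        blocking_send_stub(block, b);
    }

    /* Wait for all communication tasks before sentinel goes out of scope. */
    #pragma oss taskwait
}
```

Dropping the `inout(sentinel)` clause removes the chain, which corresponds to what the TAMPI versions are described as doing; without a library such as TAMPI to manage the blocking calls, doing so would reintroduce the deadlock risk mentioned above.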