Changes between Version 13 and Version 14 of Public/User_Guide/Offloading_hybrid_apps
Timestamp: Sep 17, 2019, 4:46:11 PM
--- Public/User_Guide/Offloading_hybrid_apps (v13)
+++ Public/User_Guide/Offloading_hybrid_apps (v14)
@@ -70,9 +70,9 @@
 The available versions are:
 
-  * `nbody.mpi.bin`: Simple parallel version using '''blocking MPI''' primitives for sending and
+  * `nbody.mpi.bin`: Simple MPI parallel version using '''blocking MPI''' primitives for sending and
    receiving each block of particles/forces.
 
-  * `nbody.mpi.ompss2.bin`: Parallel version using MPI + !OmpSs-2 tasks. Both '''computation''' and
-    '''communication''' phases are '''taskified''', however, communication tasks (each one sending
+  * `nbody.mpi.ompss2.bin`: Parallel version using '''MPI + !OmpSs-2 tasks'''. Both '''computation'''
+    and '''communication''' phases are '''taskified''', however, communication tasks (each one sending
    or receiving a block) are serialized by an artificial dependency on a sentinel variable. This
    is to prevent deadlocks between processes, since communication tasks perform '''blocking MPI'''
@@ -86,5 +86,5 @@
    not need to move the data to/from the GPU''' device.
 
-  * `nbody.tampi.ompss2.bin`: Parallel version using MPI + !OmpSs-2 tasks + '''TAMPI''' library. This
+  * `nbody.tampi.ompss2.bin`: Parallel version using '''MPI + !OmpSs-2 tasks + TAMPI''' library. This
    version disables the artificial dependencies on the sentinel variable, so communication tasks can
    run in parallel and overlap computations. The TAMPI library is in charge of managing the '''blocking
@@ -95,5 +95,5 @@
    compute-intensive tasks to the GPUs.
 
-  * `nbody.mpi.omp.bin`: Parallel version using MPI + OpenMP tasks. Both '''computation''' and
+  * `nbody.mpi.omp.bin`: Parallel version using '''MPI + OpenMP tasks'''. Both '''computation''' and
    '''communication''' phases are '''taskified''', however, communication tasks (each one sending
    or receiving a block) are serialized by an artificial dependency on a sentinel variable. This
@@ -112,7 +112,15 @@
    Progress'' state.
 
-  * `nbody.tampi.omp.bin`:
-
-  * `nbody.tampi.omptarget.bin`:
+  * `nbody.tampi.omp.bin`: Parallel version using '''MPI + OpenMP tasks + TAMPI''' library. This
+    version disables the artificial dependencies on the sentinel variable, so communication tasks can
+    run in parallel and overlap computations. Since OpenMP only supports the non-blocking mechanism of
+    TAMPI, this version leverages non-blocking primitive calls. In this way, TAMPI library is in charge
+    of managing the '''non-blocking MPI''' operations to efficiently overlap communication and computation
+    tasks.
+
+  * `nbody.tampi.omptarget.bin`: A mix of the previous two variants where '''TAMPI''' is leveraged for
+    allowing the concurrent execution of communication tasks, and GPU processes '''offload''' the
+    compute-intensive tasks to the GPUs. '''Note:''' This version is not compiled by default since it is
+    still in a ''Work in Progress'' state.
 
 === Building & Executing on DEEP ===
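The `nbody.mpi.ompss2.bin` and `nbody.mpi.omp.bin` entries in this diff describe communication tasks that are serialized by an artificial dependency on a sentinel variable. The following is a minimal sketch of that idea using OpenMP task dependencies, not the actual n-body source: the names `fake_blocking_comm` and `run_serialized_comm_tasks` are hypothetical, and the blocking MPI send/receive is replaced by a stub that only records the order in which tasks ran, so the sketch can be built without an MPI installation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a blocking MPI send/receive of one block.
 * It only records the order in which the "communication" tasks ran. */
static void fake_blocking_comm(int block, int *order, size_t *next)
{
    order[(*next)++] = block;
}

/* Creates one task per block. Every task declares an inout dependency on
 * the same sentinel variable, so the runtime has to execute the tasks one
 * at a time, in creation order. Returns the number of tasks executed. */
size_t run_serialized_comm_tasks(int *order, size_t nblocks)
{
    int sentinel = 0;  /* never read or written: it only carries the dependency */
    size_t next = 0;   /* safe to update without atomics: tasks never overlap */

    assert(order != NULL);
    (void)sentinel;    /* silences "unused variable" when built without OpenMP */

    #pragma omp parallel
    #pragma omp single
    {
        for (size_t b = 0; b < nblocks; ++b) {
            #pragma omp task depend(inout: sentinel) firstprivate(b) shared(order, next)
            fake_blocking_comm((int)b, order, &next);
        }
        #pragma omp taskwait
    }
    return next;
}
```

Calling `run_serialized_comm_tasks(order, 4)` on a four-element array should fill it in creation order, whether or not the file is compiled with OpenMP enabled (for example `-fopenmp` with GCC or Clang). Removing the `depend` clause would let the tasks run concurrently, which is the behaviour the TAMPI variants in the diff aim for; in this sketch that would also make the unsynchronized update of `next` unsafe.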