Changes between Version 13 and Version 14 of Public/User_Guide/Offloading_hybrid_apps


Timestamp:
Sep 17, 2019, 4:46:11 PM
Author:
Kevin Sala

  • Public/User_Guide/Offloading_hybrid_apps

    v13 v14  
    7070The available versions are:
    7171
    72   * `nbody.mpi.bin`: Simple parallel version using '''blocking MPI''' primitives for sending and
     72  * `nbody.mpi.bin`: Simple MPI parallel version using '''blocking MPI''' primitives for sending and
    7373    receiving each block of particles/forces.
    7474
    75   * `nbody.mpi.ompss2.bin`: Parallel version using MPI + !OmpSs-2 tasks. Both '''computation''' and
    76     '''communication''' phases are '''taskified'''; however, communication tasks (each one sending
     75  * `nbody.mpi.ompss2.bin`: Parallel version using '''MPI + !OmpSs-2 tasks'''. Both '''computation'''
     76    and '''communication''' phases are '''taskified'''; however, communication tasks (each one sending
    7777    or receiving a block) are serialized by an artificial dependency on a sentinel variable. This
    7878    is to prevent deadlocks between processes, since communication tasks perform '''blocking MPI'''
     
    8686    not need to move the data to/from the GPU''' device.
    8787
    88   * `nbody.tampi.ompss2.bin`: Parallel version using MPI + !OmpSs-2 tasks + '''TAMPI''' library. This
     88  * `nbody.tampi.ompss2.bin`: Parallel version using '''MPI + !OmpSs-2 tasks + TAMPI''' library. This
    8989    version disables the artificial dependencies on the sentinel variable, so communication tasks can
    9090    run in parallel and overlap computations. The TAMPI library is in charge of managing the '''blocking
     
    9595    compute-intensive tasks to the GPUs.
    9696
    97   * `nbody.mpi.omp.bin`: Parallel version using MPI + OpenMP tasks. Both '''computation''' and
     97  * `nbody.mpi.omp.bin`: Parallel version using '''MPI + OpenMP tasks'''. Both '''computation''' and
    9898    '''communication''' phases are '''taskified'''; however, communication tasks (each one sending
    9999    or receiving a block) are serialized by an artificial dependency on a sentinel variable. This
     
    112112    Progress'' state.
    113113
    114   * `nbody.tampi.omp.bin`:
    115 
    116   * `nbody.tampi.omptarget.bin`:
     114  * `nbody.tampi.omp.bin`: Parallel version using '''MPI + OpenMP tasks + TAMPI''' library. This
     115    version disables the artificial dependencies on the sentinel variable, so communication tasks can
     116    run in parallel and overlap computations. Since OpenMP only supports the non-blocking mechanism of
     117    TAMPI, this version leverages non-blocking primitive calls. In this way, the TAMPI library is in charge
     118    of managing the '''non-blocking MPI''' operations to efficiently overlap communication and computation
     119    tasks.
     120
     121  * `nbody.tampi.omptarget.bin`: A mix of the previous two variants where '''TAMPI''' is leveraged for
     122    allowing the concurrent execution of communication tasks, and GPU processes '''offload''' the
     123    compute-intensive tasks to the GPUs. '''Note:''' This version is not compiled by default since it is
     124    still in a ''Work in Progress'' state.
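
The sentinel-based serialization described for the `nbody.mpi.omp.bin` variant can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual N-Body source: `fake_send_block`, `fake_recv_block`, `exchange_blocks` and `log_is_in_issue_order` are hypothetical names, and the two stand-in functions only append to a log where the real code would call blocking MPI primitives. Because every communication task declares an `inout` dependency on the same `sentinel` variable, the OpenMP runtime has to run the tasks one after another, in creation order.

```c
#include <assert.h>

#define MAX_EVENTS 64

/* Log of communication events, in the order they actually executed. */
static int event_log[MAX_EVENTS];
static int event_count = 0;

/* Hypothetical stand-ins for a blocking send/receive of one block.
 * They only record an event ID so the sketch runs without MPI. */
static void fake_send_block(int block) { event_log[event_count++] = 2 * block; }
static void fake_recv_block(int block) { event_log[event_count++] = 2 * block + 1; }

/* Creates one send task and one receive task per block. All tasks share
 * an inout dependency on `sentinel`, so they execute strictly in creation
 * order. Returns the number of events that were logged. */
int exchange_blocks(int nblocks)
{
    int sentinel = 0; /* never read for its value; it only carries dependencies */

    assert(nblocks >= 0 && 2 * nblocks <= MAX_EVENTS);
    event_count = 0;

    #pragma omp parallel
    #pragma omp single
    {
        for (int b = 0; b < nblocks; ++b) {
            #pragma omp task depend(inout: sentinel) firstprivate(b)
            fake_send_block(b);

            #pragma omp task depend(inout: sentinel) firstprivate(b)
            fake_recv_block(b);
        }
        #pragma omp taskwait
    }
    return event_count;
}

/* Returns 1 if the events were executed in the order they were created. */
int log_is_in_issue_order(void)
{
    for (int i = 0; i < event_count; ++i) {
        if (event_log[i] != i)
            return 0;
    }
    return 1;
}
```

Calling `exchange_blocks(4)` logs eight events, and `log_is_in_issue_order()` then reports whether they ran in creation order. In the TAMPI variants the artificial dependency on the sentinel is disabled, which in this sketch would correspond to dropping the `depend(inout: sentinel)` clauses; the shared log would then need its own synchronization, since the tasks could run concurrently.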
    117125
    118126=== Building & Executing on DEEP ===