Changes between Version 8 and Version 9 of Public/ParaStationMPI


Timestamp: Oct 24, 2019, 12:10:42 PM
Author: Carsten Clauß
Comment: (none)

Legend: unmarked lines are unchanged; "-" marks lines removed in v9, "+" marks lines added in v9.
  • Public/ParaStationMPI

    v8 → v9
    11
    12     === What is CUDA-awareness for MPI? ===
    13  -  In brief, ''CUDA-awareness'' in an MPI library means that a mixed CUDA + MPI application may pass pointers to CUDA buffers (memory regions located on the GPU, the so-called ''Device'' memory) directly to MPI functions such as `MPI_Send` or `MPI_Recv`. A non-CUDA-aware MPI library would fail in such a case because CUDA memory cannot be accessed directly, e.g. via load/store or `memcpy()`, but first has to be transferred to Host memory via special routines such as `cudaMemcpy()`. In contrast, a CUDA-aware MPI library recognizes that a pointer refers to a buffer in Device memory and can then temporarily copy this buffer into Host memory before the communication -- the so-called ''Staging'' of the buffer. In addition, a CUDA-aware MPI library may apply further optimizations, for example by exploiting so-called ''GPUdirect'' capabilities that allow for direct RDMA transfers from and to Device memory.
    13  +  In brief, ''CUDA-awareness'' in an MPI library means that a mixed CUDA + MPI application may pass pointers to CUDA buffers (memory regions located on the GPU, the so-called ''Device'' memory) directly to MPI functions such as `MPI_Send` or `MPI_Recv`. A non-CUDA-aware MPI library would fail in such a case because CUDA memory cannot be accessed directly, e.g. via load/store or `memcpy()`, but first has to be transferred to Host memory via special routines such as `cudaMemcpy()`. In contrast, a CUDA-aware MPI library recognizes that a pointer refers to a buffer in Device memory and can then temporarily copy this buffer into Host memory before the communication -- the so-called ''Staging'' of the buffer. In addition, a CUDA-aware MPI library may apply further optimizations, for example by exploiting so-called ''GPUDirect'' capabilities that allow for direct RDMA transfers from and to Device memory.
    14
    15     === Some external Resources ===
    …
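For illustration, the following minimal sketch (not part of the wiki page itself; the rank roles, the buffer size, and the name `dev_buf` are free choices for this example) shows what the paragraph at line 13 describes: a Device pointer obtained via `cudaMalloc()` is handed directly to `MPI_Send`/`MPI_Recv`, which only works if the MPI library is CUDA-aware.

{{{#!c
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int count = 1024;
    double *dev_buf;                                   /* Device memory pointer */
    cudaMalloc((void **)&dev_buf, count * sizeof(double));
    cudaMemset(dev_buf, 0, count * sizeof(double));    /* give it some payload  */

    if (rank == 0) {
        /* With a CUDA-aware MPI, the Device pointer is passed directly; the
         * library handles it internally (e.g. by Staging or GPUDirect RDMA). */
        MPI_Send(dev_buf, count, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(dev_buf, count, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    cudaFree(dev_buf);
    MPI_Finalize();
    return 0;
}
}}}

Run with at least two ranks, e.g. `mpiexec -n 2 ./a.out`. With a non-CUDA-aware MPI library the same calls would typically fail, because the library would treat the Device pointer as an ordinary Host address.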
    20     === Current status on the DEEP system ===
    21     Currently (as of October 2019), !ParaStation MPI supports CUDA-awareness for Extoll only from a semantic point of view: Device pointers may be used as send and receive buffer arguments in MPI calls, but when Extoll is used this is realized by explicit ''Staging''.
    22  -  This is because the Extoll runtime does not yet support GPUdirect, but EXTOLL is currently working on this in the context of DEEP-EST.
    23  -  As soon as GPUdirect is supported by Extoll, it will also be integrated and enabled in !ParaStation MPI.
    24  -  (BTW: For !InfiniBand communication, !ParaStation MPI is already GPUdirect-enabled.)
    22  +  This is because the Extoll runtime does not yet support GPUDirect, but EXTOLL is currently working on this in the context of DEEP-EST.
    23  +  As soon as GPUDirect is supported by Extoll, it will also be integrated and enabled in !ParaStation MPI.
    24  +  (BTW: For !InfiniBand communication, !ParaStation MPI is already GPUDirect-enabled.)
    25
    26     === Usage on the DEEP system ===