Changes between Version 9 and Version 10 of Public/ParaStationMPI


Timestamp: Oct 24, 2019, 12:18:03 PM
Author: Simon Pickartz

Legend:

  (unmarked)  unmodified
  -           removed (only in v9)
  +           added (only in v10)
  • Public/ParaStationMPI

  10   = CUDA Support by !ParaStation MPI =
  11   
- 12   === What is CUDA-awareness for MPI? ===
- 13   In brief, ''CUDA-awareness'' in an MPI library means that a mixed CUDA + MPI application is allowed to pass pointers to CUDA buffers (these are memory regions located on the GPU, the so-called ''Device'' memory) directly to MPI functions like `MPI_Send` or `MPI_Recv`. A non CUDA-aware MPI library would fail in such a case because the CUDA-memory cannot be accessed directly e.g. via load/store or `memcpy()` but has previously to be transferred via special routines like `cudaMemcpy()` to the Host memory. In contrast to this, a CUDA-aware MPI library recognizes that a pointer is associated with a buffer within the Device memory and can then copy this buffer before communication temporarily into the Host memory -- what is called ''Staging'' of this buffer. In addition, a CUDA-aware MPI library may also apply some kind of optimizations, for example, by means of exploiting so-called ''GPUDirect'' capabilities that allow for direct RDMA transfers from and to Device memory.
+ 12   === What is CUDA awareness for MPI? ===
+ 13   In brief, ''CUDA awareness'' in an MPI library means that a mixed CUDA + MPI application is allowed to pass pointers to CUDA buffers (these are memory regions located on the GPU, the so-called ''device'' memory) directly to MPI functions such as `MPI_Send()` or `MPI_Recv()`. A non-CUDA-aware MPI library would fail in such a case because the CUDA memory cannot be accessed directly, e.g., via load/store or `memcpy()`, but has to be transferred in advance to the host memory via special routines such as `cudaMemcpy()`. In contrast, a CUDA-aware MPI library recognizes that a pointer is associated with a buffer within the device memory and can then copy this buffer prior to the communication into a temporary host buffer -- a step called ''staging'' of this buffer. Additionally, a CUDA-aware MPI library may also apply further optimizations, e.g., by exploiting so-called ''GPUDirect'' capabilities that allow for direct RDMA transfers from and to the device memory.
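
For illustration, here is a minimal sketch of what CUDA awareness permits, assuming a CUDA-aware MPI build and at least two ranks; the buffer size, datatype, and message tag are arbitrary choices made for this example:

{{{
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    int rank;
    const int count = 1024;
    float *d_buf;                   /* device (GPU) buffer */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cudaMalloc((void **)&d_buf, count * sizeof(float));

    if (rank == 0) {
        /* CUDA-aware MPI: the device pointer is passed directly. */
        MPI_Send(d_buf, count, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(d_buf, count, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    /* A non-CUDA-aware MPI library would instead require explicit
     * staging into host memory before the send (h_buf being a host
     * buffer allocated beforehand), e.g.:
     *     cudaMemcpy(h_buf, d_buf, count * sizeof(float),
     *                cudaMemcpyDeviceToHost);
     *     MPI_Send(h_buf, count, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
     */

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
}}}

With a CUDA-aware library, the `MPI_Send()`/`MPI_Recv()` calls above operate on the device pointer as-is; whether the library stages the buffer internally or uses GPUDirect transfers is up to its implementation.
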
  14   
  15   === Some external Resources ===
  …
  28   **Warning:** ''This manual section is currently under development. Therefore, the following usage guidelines may not be flawless and are likely to change in some respects in the near future!''
  29   
- 30   On the DEEP system, the CUDA-awareness can be enabled by loading a dedicated module that links to a dedicated !ParaStation MPI library that has been compiled with CUDA support:
+ 30   On the DEEP system, CUDA awareness can be enabled by loading a module that links to a dedicated !ParaStation MPI library providing CUDA support:
  31   {{{
  32   module load GCC