
 * [#CUDASupportbyParaStationMPI CUDA Support by ParaStation MPI]
 * [#NAMIntegrationforParaStationMPI NAM Integration for ParaStation MPI]

----

= CUDA Support by !ParaStation MPI =

=== What is CUDA-awareness for MPI? ===
In brief, ''CUDA-awareness'' in an MPI library means that a mixed CUDA + MPI application is allowed to pass pointers to CUDA buffers (these are memory regions located on the GPU, the so-called ''Device'' memory) directly to MPI functions such as `MPI_Send` or `MPI_Recv`. A non-CUDA-aware MPI library would fail in such a case because Device memory cannot be accessed directly by the host, e.g. via load/store or `memcpy()`, but first has to be transferred to Host memory via special routines such as `cudaMemcpy()`. In contrast, a CUDA-aware MPI library recognizes that a pointer refers to a buffer within the Device memory and can then temporarily copy this buffer into Host memory before the communication -- the so-called ''Staging'' of this buffer. In addition, a CUDA-aware MPI library may apply further optimizations, for example by exploiting so-called ''GPUDirect'' capabilities that allow for direct RDMA transfers from and to Device memory.
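
The following minimal sketch illustrates this from the application's point of view (buffer size and message tag are arbitrary): with a CUDA-aware MPI library, the Device pointer obtained from `cudaMalloc()` can be handed directly to `MPI_Send`/`MPI_Recv`, without any explicit `cudaMemcpy()` to a Host buffer in the application code.
{{{
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    int rank;
    double *d_buf;          /* pointer into Device memory */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Allocate the communication buffer on the GPU. */
    cudaMalloc((void **)&d_buf, 1024 * sizeof(double));

    /* With a CUDA-aware MPI library, the Device pointer can be
     * passed directly -- no explicit staging to Host memory is
     * required in the application code. */
    if (rank == 0)
        MPI_Send(d_buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
}}}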

=== Resources ===
 * [http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#axzz44ZswsbEt Getting started with CUDA] (by NVIDIA)
 * [https://developer.nvidia.com/gpudirect NVIDIA GPUDirect Overview] (by NVIDIA)
 * [https://devblogs.nvidia.com/parallelforall/introduction-cuda-aware-mpi/ Introduction to CUDA-Aware MPI] (by NVIDIA)

=== Current status on the DEEP system ===
Currently (effective October 2019), !ParaStation MPI supports CUDA-awareness for Extoll only from a semantic point of view: the usage of Device pointers as arguments for send and receive buffers in MPI calls is supported, but for Extoll this is realized by an explicit ''Staging'' of the buffers.
This is because the Extoll runtime does not yet support GPUDirect; however, EXTOLL is currently working on this in the context of DEEP-EST.
As soon as GPUDirect is supported by Extoll, it will also be integrated and enabled in !ParaStation MPI.
(BTW: For !InfiniBand communication, !ParaStation MPI is already GPUDirect-enabled.)

=== Usage of CUDA-awareness on the DEEP system ===

'''Warning:''' ''This manual section is currently under development. Therefore, the following usage guidelines may not be flawless and are likely to change in some respects in the near future!''

On the DEEP system, CUDA-awareness can be enabled by loading a dedicated module that links to a !ParaStation MPI library that has been compiled with CUDA support:
{{{
module load GCC
module load ParaStationMPI/5.4.0-1-CUDA
}}}
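With these modules loaded, a mixed CUDA + MPI application such as the sketch above can be built with the MPI compiler wrapper. A minimal example (the source and binary names are placeholders, and it is assumed that the loaded modules provide the CUDA include and library paths):
{{{
mpicc -o cuda_mpi_app cuda_mpi_app.c -lcudart
}}}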
Please note that CUDA-awareness might impact the MPI performance on system parts where CUDA is not used.
Therefore, it might be useful to disable the CUDA-awareness (and, conversely, it may be necessary to enable it) by setting the following environment variable:
{{{
PSP_CUDA=0|1
}}}
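For example, to explicitly enable the CUDA-awareness for a single run (assuming a Slurm-based launch via `srun`; the application name is a placeholder):
{{{
export PSP_CUDA=1
srun -n 2 ./cuda_mpi_app
}}}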