
 * [#CUDASupportbyParaStationMPI CUDA Support by ParaStation MPI]
 * [#NAMIntegrationforParaStationMPI NAM Integration for ParaStation MPI]

----

= CUDA Support by !ParaStation MPI =

=== What is CUDA-awareness for MPI? ===
In brief, ''CUDA-awareness'' in an MPI library means that a mixed CUDA + MPI application is allowed to pass pointers to CUDA buffers (these are memory regions located on the GPU, the so-called ''Device'' memory) directly to MPI functions such as `MPI_Send` or `MPI_Recv`. A non-CUDA-aware MPI library would fail in such a case because Device memory cannot be accessed directly by the host, e.g. via load/store or `memcpy()`, but first has to be transferred to Host memory via special routines such as `cudaMemcpy()`. In contrast, a CUDA-aware MPI library recognizes that a pointer refers to a buffer within the Device memory and can then temporarily copy this buffer into Host memory before the communication -- the so-called ''Staging'' of this buffer. In addition, a CUDA-aware MPI library may apply further optimizations, for example by exploiting so-called ''GPUDirect'' capabilities that allow for direct RDMA transfers from and to Device memory.
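
The following minimal sketch illustrates this from the application's point of view (buffer size and message tag are arbitrary): with a CUDA-aware MPI library, the Device pointer obtained from `cudaMalloc()` can be handed directly to `MPI_Send`/`MPI_Recv`, without any explicit `cudaMemcpy()` to a Host buffer in the application code.
{{{
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    int rank;
    double *d_buf;          /* pointer into Device memory */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Allocate the communication buffer on the GPU. */
    cudaMalloc((void **)&d_buf, 1024 * sizeof(double));

    /* With a CUDA-aware MPI library, the Device pointer can be
     * passed directly -- no explicit staging to Host memory is
     * required in the application code. */
    if (rank == 0)
        MPI_Send(d_buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
}}}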

=== Resources ===
 * [http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#axzz44ZswsbEt Getting started with CUDA] (by NVIDIA)
 * [https://developer.nvidia.com/gpudirect NVIDIA GPUDirect Overview] (by NVIDIA)
 * [https://devblogs.nvidia.com/parallelforall/introduction-cuda-aware-mpi/ Introduction to CUDA-Aware MPI] (by NVIDIA)

=== Current status on the DEEP system ===
Currently (effective October 2019), !ParaStation MPI supports CUDA-awareness for Extoll only from a semantic point of view: the usage of Device pointers as arguments for send and receive buffers in MPI calls is supported, but for Extoll this is realized by an explicit ''Staging'' of the buffers.
This is because the Extoll runtime does not yet support GPUDirect; however, EXTOLL is currently working on this in the context of DEEP-EST.
As soon as GPUDirect is supported by Extoll, it will also be integrated and enabled in !ParaStation MPI.
(BTW: For !InfiniBand communication, !ParaStation MPI is already GPUDirect-enabled.)

=== Usage of CUDA-awareness on the DEEP system ===

'''Warning:''' ''This manual section is currently under development. Therefore, the following usage guidelines may not be flawless and are likely to change in some respects in the near future!''

On the DEEP system, CUDA-awareness can be enabled by loading a dedicated module that links to a !ParaStation MPI library that has been compiled with CUDA support:
{{{
module load GCC
module load ParaStationMPI/5.4.0-1-CUDA
}}}
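With these modules loaded, a mixed CUDA + MPI application such as the sketch above can be built with the MPI compiler wrapper. A minimal example (the source and binary names are placeholders, and it is assumed that the loaded modules provide the CUDA include and library paths):
{{{
mpicc -o cuda_mpi_app cuda_mpi_app.c -lcudart
}}}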
Please note that CUDA-awareness might impact the MPI performance on system parts where CUDA is not used.
Therefore, it might be useful to disable the CUDA-awareness (and, conversely, it may be necessary to enable it) by setting the following environment variable:
{{{
PSP_CUDA=0|1
}}}
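For example, to explicitly enable the CUDA-awareness for a single run (assuming a Slurm-based launch via `srun`; the application name is a placeholder):
{{{
export PSP_CUDA=1
srun -n 2 ./cuda_mpi_app
}}}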