In brief, ''CUDA-awareness'' in an MPI library means that a mixed CUDA + MPI application is allowed to pass pointers to CUDA buffers (memory regions located on the GPU, the so-called ''Device'' memory) directly to MPI functions like `MPI_Send` or `MPI_Recv`. A non CUDA-aware MPI library would fail in such a case because CUDA memory cannot be accessed directly, e.g. via load/store or `memcpy()`, but must first be transferred via special routines like `cudaMemcpy()` to the Host memory. In contrast, a CUDA-aware MPI library recognizes that a pointer refers to a buffer in Device memory and can then copy this buffer temporarily into Host memory before communication -- the so-called ''Staging'' of this buffer. In addition, a CUDA-aware MPI library may also apply further optimizations, for example by exploiting so-called ''GPUDirect'' capabilities that allow for direct RDMA transfers from and to Device memory.
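To illustrate the difference, here is a minimal sketch of a point-to-point transfer between two ranks. The `MPI_IS_CUDA_AWARE` macro is only an illustrative compile-time switch for this example (not a standard MPI symbol): with a CUDA-aware MPI library, the Device pointer can be passed directly, whereas otherwise the application itself has to stage the buffer through Host memory.

{{{#!c
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1024;
    double *d_buf;                          /* buffer in Device memory */
    cudaMalloc((void **)&d_buf, n * sizeof(double));

#ifdef MPI_IS_CUDA_AWARE
    /* CUDA-aware MPI: the Device pointer can be passed directly. */
    if (rank == 0)
        MPI_Send(d_buf, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
#else
    /* Non CUDA-aware MPI: the buffer must first be staged into Host memory. */
    double *h_buf = malloc(n * sizeof(double));
    if (rank == 0) {
        cudaMemcpy(h_buf, d_buf, n * sizeof(double), cudaMemcpyDeviceToHost);
        MPI_Send(h_buf, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(h_buf, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        cudaMemcpy(d_buf, h_buf, n * sizeof(double), cudaMemcpyHostToDevice);
    }
    free(h_buf);
#endif

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
}}}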
This is because the Extoll runtime does not yet support GPUDirect, but EXTOLL is currently working on this in the context of DEEP-EST.
As soon as GPUDirect is supported by Extoll, it will also be integrated and enabled in !ParaStation MPI.
(BTW: For !InfiniBand communication, !ParaStation MPI is already GPUDirect-enabled.)