Changes between Version 18 and Version 19 of Public/User_Guide/TAMPI_NAM


Timestamp: Apr 21, 2021, 5:29:43 PM
Author: Kevin Sala

  • Public/User_Guide/TAMPI_NAM (v18 → v19)

In the following sections, we describe the TAMPI library and how hybrid task-based applications should access NAM memory regions.
Finally, we show the Heat equation benchmark as an example of using this support to save periodic snapshots of the computed matrix in a NAM allocation.

=== Task-Aware MPI (TAMPI) ===

[…]

=== Accessing NAM through !ParaStation MPI and TAMPI ===

The main idea is to allow tasks to access data stored in NAM regions efficiently and potentially in parallel, e.g., using several tasks to put/get data to/from the NAM.
The mechanism to access NAM regions is provided by !ParaStation MPI and is based on the MPI RMA model.
!ParaStation MPI allows allocating NAM regions as MPI RMA windows, so that they can be accessed remotely by the ranks that participate in those windows using the standard `MPI_Put` and `MPI_Get` RMA operations.
This support is based on the fence synchronization mode of the MPI RMA model.
This mode works as follows: (1) all ranks participating in a given window open an access epoch on that window by calling `MPI_Win_fence`, (2) they perform the desired RMA operations on the NAM window, and (3) they close the epoch with another call to `MPI_Win_fence`.
This mode of operation can be integrated into a hybrid task-based application by instantiating a task that opens the epoch on the NAM window, followed by multiple concurrent tasks that write or read data to/from the NAM window, and finally, another task that closes the access epoch.
Notice that tasks can define the corresponding dependencies on the window to ensure this order of execution, as in the following example:

{{{#!c
// Open RMA access epoch to write the NAM window
#pragma oss task inout(namWindow)
{
    MPI_Request request;
    MPI_Win_ifence(0, namWindow, &request);
    TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
}

// Write to NAM region concurrently
for (...) {
    #pragma oss task concurrent(namWindow)
    MPI_Put(..., namWindow);
}

// Close RMA access epoch to write the NAM window
#pragma oss task inout(namWindow)
{
    MPI_Request request;
    MPI_Win_ifence(0, namWindow, &request);
    TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
}
}}}
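
For reference, the sketch below shows the same access epoch written with the standard blocking fence calls and no tasks; it assumes `namWindow` has already been allocated as a NAM window through !ParaStation MPI, and the variables `localData`, `count`, `targetRank`, and `targetDisp` are illustrative, not part of the original example:

{{{#!c
// Minimal non-task sketch of the fence synchronization mode; the variables
// localData, count, targetRank, and targetDisp are illustrative only.
MPI_Win_fence(0, namWindow);            // (1) open the access epoch on all ranks

MPI_Put(localData, count, MPI_DOUBLE,   // (2) write local data into the NAM
        targetRank, targetDisp, count,  //     subregion exposed by targetRank
        MPI_DOUBLE, namWindow);

MPI_Win_fence(0, namWindow);            // (3) close the epoch; puts are complete
}}}

In the task-based version, each blocking `MPI_Win_fence` is replaced by the non-blocking `MPI_Win_ifence` extension of !ParaStation MPI, and `TAMPI_Iwait` ties the release of the task's dependencies to the completion of the resulting request, so no core is blocked while the fence completes. The `inout(namWindow)` annotations order the two fence tasks with respect to the puts, while `concurrent(namWindow)` lets all put tasks run in parallel with each other.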

== Heat Benchmark ==

[…]

Within the snapshot code of the Heat benchmark, the `MPI_Win_ifence` argument order is corrected and the per-block put tasks now use a `concurrent` dependency on the NAM window instead of `in`:

{{{#!diff
     {
         MPI_Request request;
-        MPI_Win_ifence(namWindow, 0, &request);
+        MPI_Win_ifence(0, namWindow, &request);
         TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
     }

     // Write all blocks from the current rank to NAM subregions concurrently
     for (B : all blocks in current rank) {
-        #pragma oss task in(..block B..) in(namWindow)
+        #pragma oss task in(..block B..) concurrent(namWindow)
         {
             MPI_Put(/* source data */   ..block B..,

     {
         MPI_Request request;
-        MPI_Win_ifence(namWindow, 0, &request);
+        MPI_Win_ifence(0, namWindow, &request);
         TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
     }
}}}
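
Putting these fragments together, one snapshot step follows the same open/put/close pattern as the generic example above. The following condensed sketch assumes illustrative identifiers (`numBlocks`, `blocks`, `blockSize`, `targetRank`) rather than the benchmark's actual ones:

{{{#!c
// Condensed sketch of one snapshot step after the fix. The identifiers
// numBlocks, blocks, blockSize, and targetRank are illustrative only;
// blocks is assumed to be a 2D array where blocks[b] is one block row.

// Open the RMA access epoch on the NAM window
#pragma oss task inout(namWindow)
{
    MPI_Request request;
    MPI_Win_ifence(0, namWindow, &request);
    TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
}

// Put each block of this rank into its NAM subregion; the concurrent
// dependency lets the puts of different blocks run in parallel
for (int b = 0; b < numBlocks; ++b) {
    #pragma oss task in(blocks[b]) concurrent(namWindow)
    MPI_Put(blocks[b], blockSize, MPI_DOUBLE,
            targetRank, (MPI_Aint) b * blockSize, blockSize,
            MPI_DOUBLE, namWindow);
}

// Close the epoch; successor tasks run only after all puts complete
#pragma oss task inout(namWindow)
{
    MPI_Request request;
    MPI_Win_ifence(0, namWindow, &request);
    TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
}
}}}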