Changes between Version 22 and Version 23 of Public/User_Guide/TAMPI_NAM


Timestamp: Apr 26, 2021, 4:08:26 PM
Author: Kevin Sala

=== Accessing NAM through !ParaStation MPI and TAMPI ===

Our main objective is to allow tasks to access data stored in NAM regions efficiently and potentially in parallel, e.g. using several tasks to put/get data to/from the NAM.
The mechanism to access NAM regions is provided by !ParaStation MPI and is based on the MPI RMA model.
!ParaStation MPI allows allocating NAM regions as MPI RMA windows, so that they can be accessed remotely by the ranks participating in those windows using the standard `MPI_Put` and `MPI_Get` RMA operations.
This support is based on the fence synchronization mode of the MPI RMA model.
This mode works as follows: (1) all ranks participating in a given window open an access epoch on that window by calling `MPI_Win_fence`, (2) perform the desired RMA operations on the NAM window, reading or writing its data, and (3) close the epoch with another call to `MPI_Win_fence`.
This mode of operation can be easily integrated into a hybrid task-based application by instantiating a task that opens the epoch on the NAM window with an MPI fence, followed by multiple concurrent tasks that write or read data to/from the NAM window, and finally another task that closes the access epoch with a fence.
Notice that tasks should define the corresponding dependencies on the window to ensure this strict order of execution.

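For reference, a single fence-synchronized access epoch in plain, non-taskified MPI looks roughly like the sketch below, where `win` is assumed to be an RMA window allocated on a NAM region, and the buffer, count, target rank, and displacement are illustrative placeholders.

{{{#!c
#include <mpi.h>

/* Minimal sketch of one fence-synchronized access epoch on a NAM
 * window. All ranks participating in the window call it collectively. */
void put_to_nam(MPI_Win win, const double *buffer, int count,
                int target, MPI_Aint disp)
{
    MPI_Win_fence(0, win);                      /* (1) open the access epoch */
    MPI_Put(buffer, count, MPI_DOUBLE, target,  /* (2) write into the NAM window */
            disp, count, MPI_DOUBLE, win);
    MPI_Win_fence(0, win);                      /* (3) close the epoch */
}
}}}
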
However, since the MPI fence operations are blocking and synchronous, calling them directly from tasks is not safe.
Having multiple windows and opening/closing epochs on them concurrently from different tasks could end up producing a communication deadlock.
Thus, the taskification of the window fences and RMA operations should be managed by the TAMPI library, so that they can be executed efficiently and safely, and potentially in parallel across different windows.
To that end, we have extended !ParaStation MPI to provide a new non-blocking function called `MPI_Win_ifence`, which starts a fence operation on a specific window and generates an MPI request that can be used to check its completion later on.
This MPI request can then be handled naturally by the TAMPI library using the `TAMPI_Iwait` function, as shown below.
     
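The code below is a minimal sketch of this pattern rather than a verbatim example: it assumes OmpSs-2 task syntax, TAMPI running in non-blocking mode, a NAM window `win` that has already been allocated, and an `MPI_Win_ifence` signature mirroring `MPI_Win_fence` with an additional request parameter; the function name, `BLOCK` size, and data layout are illustrative.

{{{#!c
#include <mpi.h>
#include <TAMPI.h>

#define BLOCK 1024  /* illustrative block size */

/* Sketch: open an epoch on the NAM window, write blocks from concurrent
 * tasks, and close the epoch. The dependencies on `win` enforce the
 * open -> puts -> close order. */
void write_blocks_to_nam(MPI_Win win, double (*data)[BLOCK],
                         int nblocks, int target)
{
    /* First task: open the access epoch with a non-blocking fence */
    #pragma oss task inout(win)
    {
        MPI_Request req;
        MPI_Win_ifence(0, win, &req);          /* start the fence */
        TAMPI_Iwait(&req, MPI_STATUS_IGNORE);  /* task completes when the fence does */
    }

    /* Concurrent tasks: each puts one block on the open window;
     * displacements assume the window's disp_unit is sizeof(double) */
    for (int b = 0; b < nblocks; ++b) {
        #pragma oss task in(win) in(data[b])
        MPI_Put(data[b], BLOCK, MPI_DOUBLE, target,
                (MPI_Aint)b * BLOCK, BLOCK, MPI_DOUBLE, win);
    }

    /* Last task: close the access epoch after all puts have executed */
    #pragma oss task inout(win)
    {
        MPI_Request req;
        MPI_Win_ifence(0, win, &req);
        TAMPI_Iwait(&req, MPI_STATUS_IGNORE);
    }
    #pragma oss taskwait
}
}}}
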
In this example, the first task opens an access epoch on the NAM window using the new `MPI_Win_ifence`.
This function starts a fence operation, generates an MPI request, and returns immediately.
The request is then handled by the TAMPI library through `TAMPI_Iwait`, which binds the completion of the calling task to the finalization of the fence operation.
The task can therefore continue executing without blocking and run to the end of its body.
Once the fence operation finalizes, the task will complete automatically and its successor tasks will become ready.
The successor tasks are the ones that perform `MPI_Put` operations on the NAM window concurrently.
Notice that the dependencies of these tasks allow all the `MPI_Put` operations to execute in parallel, but always after the window epoch has been fully opened.
After all `MPI_Put` tasks have executed, the last task can run and close the access epoch on the NAM window.

The extended version of !ParaStation MPI supporting the new `MPI_Win_ifence` can be found in the `ifence-nam` branch at the [https://gitlab.version.fz-juelich.de/DEEP-EST/psmpi] repository.