Changes between Version 7 and Version 8 of Public/User_Guide/TAMPI_NAM


Timestamp:
Mar 3, 2021, 7:38:08 PM (3 years ago)
Author:
Kevin Sala
  • Public/User_Guide/TAMPI_NAM

When a timestep requires a snapshot, the application instantiates multiple tasks that save the matrix data into the corresponding NAM subregion. Each MPI rank creates one task per matrix block to save that block's data into the NAM subregion. These communication tasks have no data dependencies between them, so they can run in parallel, writing to the NAM region with regular `MPI_Put`. Each rank writes only to its own subregions, never to those of other ranks. Even so, all `MPI_Put` calls must happen inside an RMA access epoch, so each snapshot timestep needs one fence call before its `MPI_Put` calls to open the epoch and another one after them to close it. This is where we use the new function `MPI_Win_ifence` together with the TAMPI non-blocking support. In this way, we taskify both the synchronization and the writing of NAM regions, keeping the data-flow model and without having to stop the parallelism (e.g., with a `taskwait`) to perform the snapshots. Thanks to the task data dependencies and TAMPI, the snapshots fit cleanly into the application's data-flow execution, like any other regular task.

The following pseudo-code shows how the saving of snapshots works in `02.heat_itampi_ompss2_tasks.bin`:

{{{#!c
void solve() {
    int namSnapshotFreq = ...;
    int namSnapshotId = 0;

    for (int t = 1; t <= timesteps; ++t) {
        // Computation and communication tasks declaring
        // dependencies on the blocks they process
        gaussSeidelSolver(...);

        // Every namSnapshotFreq timesteps, instantiate the snapshot tasks
        if (t % namSnapshotFreq == 0) {
            namSaveMatrix(namSnapshotId, namWindow, ...);
            ++namSnapshotId;
        }
    }
    // Wait for all tasks, including any pending snapshot tasks
    #pragma oss taskwait
}
}}}
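
The snapshot schedule implied by the loop in `solve()` can be sketched as plain arithmetic. The helpers below are hypothetical (they are not part of the application) and only assume what the pseudo-code shows: timesteps start at 1, a snapshot is taken whenever `t % namSnapshotFreq == 0`, and `namSnapshotId` starts at 0 and grows by one per snapshot.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when timestep t (1-based, as in the
 * solve() loop) triggers a NAM snapshot, 0 otherwise. */
static int nam_is_snapshot_step(int t, int namSnapshotFreq)
{
    assert(namSnapshotFreq > 0);
    return t % namSnapshotFreq == 0;
}

/* Hypothetical helper: total number of snapshots taken after running
 * `timesteps` timesteps. */
static int nam_snapshot_count(int timesteps, int namSnapshotFreq)
{
    assert(timesteps >= 0 && namSnapshotFreq > 0);
    return timesteps / namSnapshotFreq;
}

/* Hypothetical helper: snapshot ID used at snapshot timestep t, assuming
 * IDs start at 0 and are incremented after each saved snapshot. */
static int nam_snapshot_id(int t, int namSnapshotFreq)
{
    assert(nam_is_snapshot_step(t, namSnapshotFreq));
    return t / namSnapshotFreq - 1;
}
```

For instance, with 12 timesteps and `namSnapshotFreq = 5`, snapshots are taken at timesteps 5 and 10, with IDs 0 and 1 respectively.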

{{{#!c
void namSaveMatrix(int namSnapshotId, MPI_Win namWindow, ...) {
    // Compute destination offset in NAM region
    int snapshotOffset = namSnapshotId*sizeof(..all blocks..);

    // Open RMA access epoch to write the NAM window for this timestep
    #pragma oss task in(..all blocks..) inout(namWindow)
    {
        MPI_Request request;
        MPI_Win_ifence(namWindow, 0, &request);
        // Bind the fence request to this task: the call returns immediately,
        // and the task releases its dependencies once the fence completes
        TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
    }

    // Write all blocks from the current rank to NAM subregions concurrently
    for (B : all blocks) {
        #pragma oss task in(..block B..) in(namWindow)
        {
            MPI_Put(/* origin */ ..block B..,
                /* target rank */ currentRank,
                /* target offset */ snapshotOffset + B,
                /* target window */ namWindow);
        }
    }

    // Close RMA access epoch to write the NAM window for this timestep
    #pragma oss task in(..all blocks..) inout(namWindow)
    {
        MPI_Request request;
        MPI_Win_ifence(namWindow, 0, &request);
        // As above: the epoch is closed when the fence request completes
        TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
    }
}
}}}
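
The target offset `snapshotOffset + B` in the pseudo-code can be made concrete under a few assumptions that the source does not state: snapshots are stored back to back inside a rank's NAM subregion, all blocks have the same size, and offsets are counted in elements rather than bytes. The function names and parameters below are hypothetical, a minimal sketch rather than the application's actual layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element offset at which snapshot `snapshotId`
 * starts inside the rank's NAM subregion, assuming each snapshot holds
 * blocksPerRank blocks of blockElems elements stored contiguously. */
static size_t nam_snapshot_offset(size_t snapshotId,
                                  size_t blocksPerRank,
                                  size_t blockElems)
{
    return snapshotId * blocksPerRank * blockElems;
}

/* Hypothetical helper: element offset of block `blockIdx` within
 * snapshot `snapshotId` (the "snapshotOffset + B" of the pseudo-code,
 * under the layout assumed above). */
static size_t nam_block_offset(size_t snapshotId, size_t blockIdx,
                               size_t blocksPerRank, size_t blockElems)
{
    assert(blockIdx < blocksPerRank);
    return nam_snapshot_offset(snapshotId, blocksPerRank, blockElems)
           + blockIdx * blockElems;
}
```

With 4 blocks of 1024 elements per rank, block 3 of snapshot 0 starts at element 3072 and block 0 of snapshot 1 starts at element 4096, so concurrent `MPI_Put` tasks for different blocks never target overlapping ranges.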


=== Requirements ===
The requirements of this application are shown in the following lists. The main requirements are: