Changes between Version 33 and Version 34 of Public/ParaStationMPI


Timestamp:
May 31, 2021, 1:02:22 PM
Author:
Carsten Clauß

}
}}}


=== Releasing PSNAM Memory ===

According to the standard, an MPI RMA window must be freed by a collective call to `MPI_Win_free()`.
In the case of a PSNAM window, the value of the `psnam_consistency` MPI info key decides whether the corresponding NAM memory regions are freed as well.
Since `MPI_Win_free()` has no info parameter, this selection has to be made beforehand: either when calling `MPI_Win_allocate()`, or later (where it can also be changed) via `MPI_Win_set_info()`.

A sound MPI application must free all MPI window objects before calling `MPI_Finalize()` -- regardless of whether the corresponding NAM region should be persistent or not.
Accordingly, MPI windows differ in their lifetime:
Common MPI windows live only as long as `MPI_Win_free()` has not been called and the related session is still alive.
In contrast, persistent NAM windows exist as long as the assigned NAM space is granted by the NAM manager.
Upon an `MPI_Win_free()` call, such windows are merely freed from the perspective of the current MPI application, not from the view of the NAM manager.


=== Attaching to Persistent Memory Regions ===

Subsequent MPI sessions therefore need a way to attach to the persistent NAM regions that previous MPI sessions have created.
The PSNAM wrapper layer enables this via a call to `MPI_Comm_connect()`, which is normally used for establishing communication between distinct MPI sessions:

{{{
MPI_Comm_connect(window_name, info, root, comm, newcomm)
IN  window_name // globally unique window name (string, used only on root)
IN  info        // implementation-dependent information (handle, used only on root)
IN  root        // rank in comm of root node (integer)
IN  comm        // intra-communicator over which call is collective (handle)
OUT newcomm     // inter-communicator with server as remote group (handle)
}}}

When passing a valid name of a persistent NAM window plus an info argument with the key `psnam_window_connect` and the value `true`, this function returns an inter-communicator that then serves for accessing the remote NAM memory regions.
However, the returned inter-communicator is just a pseudo communicator: it cannot be used for any point-to-point or collective communication, but rather acts as a handle for RMA operations on a virtual window object embodied by the remote NAM memory.
In doing so, the original structure of the NAM window is retained.
That is, the window is still divided (and thus addressable) in terms of the MPI ranks of the process group that originally created it.
Therefore, a call to `MPI_Comm_remote_size()` on the returned inter-communicator reveals the former number of processes in that group.
To actually create the local representative of the window as an `MPI_Win` handle, the `MPI_Win_create_dynamic()` function can be used with the inter-communicator as the input and the window handle as the output parameter.

==== Querying Information about a Remote Window ====

After determining the size of the former process group via `MPI_Comm_remote_size()`, the sizes of the remote regions and their unit sizes for displacements may also be needed.
For this purpose, the PSNAM wrapper hooks into the `MPI_Win_shared_query()` function that returns these values according to the passed rank:

{{{
MPI_Win_shared_query(win, rank, size, disp_unit, baseptr)
IN  win       // window object used for communication (handle)
IN  rank      // remote region rank
OUT size      // size of the region at the given rank
OUT disp_unit // local unit size for displacements
OUT baseptr   // always NULL in case of PSNAM windows
}}}
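The returned `size` and `disp_unit` values can be used to validate an RMA access before issuing it, since MPI computes the byte offset of an access as the target displacement multiplied by `disp_unit`. The following is a minimal sketch of such a check in plain C. The helper names `region_units` and `access_fits` are hypothetical and not part of PSNAM or MPI, and `region_bytes_t` is only a stand-in for `MPI_Aint` so that the snippet compiles without an MPI installation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stand-in for MPI_Aint so the sketch builds without <mpi.h>. */
typedef ptrdiff_t region_bytes_t;

/* Number of whole displacement units that fit into a region of `size` bytes.
 * Returns 0 for invalid input. (Hypothetical helper.) */
region_bytes_t region_units(region_bytes_t size, int disp_unit)
{
    if (size <= 0 || disp_unit <= 0)
        return 0;
    return size / disp_unit;
}

/* Check whether an access of `count` elements of `elem_bytes` bytes each,
 * starting at target displacement `disp`, stays inside a region described
 * by `size` and `disp_unit`. The byte offset is disp * disp_unit.
 * (Hypothetical helper.) */
bool access_fits(region_bytes_t size, int disp_unit,
                 region_bytes_t disp, region_bytes_t count,
                 region_bytes_t elem_bytes)
{
    if (size < 0 || disp_unit <= 0 || disp < 0 || count < 0 || elem_bytes <= 0)
        return false;
    if (disp > size / disp_unit)      /* start would lie past the region end */
        return false;

    region_bytes_t start = disp * disp_unit;
    /* Division instead of multiplication avoids overflow in count * elem_bytes. */
    return count <= (size - start) / elem_bytes;
}
```

For a region with `size = 1024` and `disp_unit = 8`, an access of eight 8-byte elements fits at displacement 120 but not at displacement 121.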


==== Example ====

{{{
MPI_Info_create(&win_info);
MPI_Info_set(win_info, "psnam_window_connect", "true");
MPI_Comm_connect(window_name, win_info, 0, MPI_COMM_WORLD, &inter_comm);
MPI_Info_free(&win_info);

printf("Connection to persistent memory region established!\n");
MPI_Comm_remote_size(inter_comm, &remote_group_size);
printf("Size of the former process group that created the NAM window: %d\n", remote_group_size);
MPI_Win_create_dynamic(MPI_INFO_NULL, inter_comm, &win);

/* Query size and displacement unit of each rank's region. */
for (int region_rank = 0; region_rank < remote_group_size; region_rank++) {
    MPI_Win_shared_query(win, region_rank, &region_size[region_rank], &disp_unit[region_rank], NULL);
}
}}}
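Once the `region_size` and `disp_unit` arrays are filled, an application might want to treat the rank-partitioned window as one contiguous byte range. The sketch below maps such a global byte offset to a region rank and a target displacement. This concatenated view is purely an assumption made for illustration and is not something the PSNAM layer is stated to provide. The names `region_addr` and `locate_offset` are hypothetical, and `ptrdiff_t` again stands in for `MPI_Aint`.

```c
#include <stddef.h>

/* Result of mapping a global byte offset onto the rank-partitioned window.
 * (Hypothetical type, not part of PSNAM or MPI.) */
struct region_addr {
    int rank;        /* region holding the byte, or -1 if not addressable */
    ptrdiff_t disp;  /* target displacement in units of that region's disp_unit */
};

/* Walk the regions in rank order and find the one containing `offset`.
 * Assumption: regions are viewed as concatenated in rank order.
 * Offsets that are out of range or not aligned to the region's
 * displacement unit yield rank == -1. */
struct region_addr locate_offset(const ptrdiff_t *region_size,
                                 const int *disp_unit,
                                 int nregions, ptrdiff_t offset)
{
    struct region_addr out = { -1, 0 };

    if (offset < 0)
        return out;

    for (int r = 0; r < nregions; r++) {
        if (offset < region_size[r]) {
            if (disp_unit[r] <= 0 || offset % disp_unit[r] != 0)
                return out;            /* not expressible as a displacement */
            out.rank = r;
            out.disp = offset / disp_unit[r];
            return out;
        }
        offset -= region_size[r];      /* skip this region */
    }
    return out;                        /* offset lies past the last region */
}
```

The resulting `rank` and `disp` would then be the natural candidates for the target rank and target displacement arguments of an RMA call on the window.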