Changes between Version 11 and Version 12 of Public/User_Guide/TAMPI


Timestamp: Nov 23, 2020, 9:56:17 AM
Author: Pedro Martinez-Ferror
  • Public/User_Guide/TAMPI

v11 → v12

TAMPI has already been installed on DEEP and can be used by simply executing the following commands:

Removed:

`modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Core:$modulepath"`

`modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/Compiler/mpi/intel/2019.0.117-GCC-7.3.0:$modulepath"`

`modulepath="/usr/local/software/skylake/Stages/2018b/modules/all/MPI/intel/2019.0.117-GCC-7.3.0/psmpi/5.2.1-1-mt:$modulepath"`

`export MODULEPATH="$modulepath:$MODULEPATH"`

Added:

`module load Intel/2019.5.281-GCC-8.3.0 ParaStationMPI/5.4.6-1-mt`

`module load OmpSs-2`

Unmodified:

`module load TAMPI`

Removed:

Note that loading the TAMPI module will automatically load the **!OmpSs-2** and **ParaStation MPI** modules (this MPI library has been compiled with multi-threading support enabled).
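After loading the modules listed above, a quick sanity check can confirm that the environment is set up as expected. This is only a minimal sketch using standard Environment Modules commands (`module list` and `module show`); the exact output depends on the software stage installed on DEEP:

{{{
module list          # Intel, ParaStationMPI, OmpSs-2 and TAMPI should appear as loaded
module show TAMPI    # inspect the paths and environment variables exported by the TAMPI module
}}}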

You might want to request more MPI ranks per socket depending on your particular application. See the examples below together with the corresponding system affinity report:
     
can be seen with the `-h` option. An example of execution could be:

Removed:

`mpiexec -n 4 -bind-to hwthread:16 ./nbody -t 100 -p 8192`

in which the application will perform 100 timesteps in 4 MPI processes with 16 hardware threads in each process (used by the !OmpSs-2 runtime). The total number of particles will be 8192, so that each process will have 2048 particles (2 blocks per process).

Added:

`srun -n 4 07.nbody_mpi_ompss_tasks_interop_async.N2.2048bs.bin -t 100 -p 8192`

in which the application will perform 100 timesteps using 4 MPI processes. The total number of particles will be 8192, so that each process will have 2048 particles (2 blocks per process).
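For non-interactive execution, the same command could be wrapped in a Slurm batch script and submitted with `sbatch`. The sketch below is only illustrative: the node count, walltime and partition name are placeholders rather than values taken from this guide, while the module and `srun` lines correspond to those shown above.

{{{
#!/bin/bash
#SBATCH --ntasks=4            # 4 MPI ranks, as in the srun example above
#SBATCH --nodes=2             # placeholder: adjust to the desired number of nodes
#SBATCH --time=00:30:00       # placeholder walltime
#SBATCH --partition=dp-cn     # placeholder partition name; use the appropriate DEEP partition

module load Intel/2019.5.281-GCC-8.3.0 ParaStationMPI/5.4.6-1-mt
module load OmpSs-2
module load TAMPI

# run from the directory containing the benchmark binary
srun -n 4 07.nbody_mpi_ompss_tasks_interop_async.N2.2048bs.bin -t 100 -p 8192
}}}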

== References ==
     
could be:

Removed:

`mpiexec -n 4 -bind-to hwthread:16 ./heat -t 150 -s 8192`

in which the application will perform 150 timesteps in 4 MPI processes with 16 hardware threads in each process (used by the !OmpSs-2 runtime). The size of the matrix in each dimension will be 8192 (8192^2^ elements in total), which means that each process will have 2048x8192 elements (16 blocks per process).

Added:

`srun -n 4 05.heat_mpi_ompss_tasks.1024x1024bs.bin -t 150 -s 8192`

in which the application will perform 150 timesteps using 4 MPI processes. The size of the matrix in each dimension will be 1024 (1024^2^ elements in total), which means that each process will have 256x1024 elements (4 blocks per process).
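The decomposition scales with the number of MPI ranks. As a rough illustration only, assuming (as described above) that the matrix rows are divided evenly among the processes, doubling the rank count halves each process's share:

{{{
# 4 MPI ranks: the distribution described above
srun -n 4 05.heat_mpi_ompss_tasks.1024x1024bs.bin -t 150 -s 8192

# 8 MPI ranks: each process owns half as many rows (and half as many blocks)
srun -n 8 05.heat_mpi_ompss_tasks.1024x1024bs.bin -t 150 -s 8192
}}}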

== References ==