710 | | |
711 | | |
712 | | ---- |
713 | | |
714 | | |
715 | | = Nbody Benchmark (MPI+!OmpSs-2+TAMPI) = |
716 | | |
717 | | Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/nbody] and transfer it to a DEEP working directory. |
718 | | |
719 | | == Description == |
720 | | |
721 | | This benchmark represents an N-body simulation to numerically approximate the evolution of a system of bodies in which each body continuously interacts with every other body. A familiar example is an astrophysical simulation in which each body represents a galaxy or an individual star, and the bodies attract each other through the gravitational force. |
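
The all-pairs interaction described above can be illustrated with a minimal Python sketch. It is not taken from the benchmark's sources: the function names, the `softening` term, and the semi-implicit Euler update are assumptions chosen only to show why the cost per timestep grows with the square of the number of bodies.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def accelerations(positions, masses, softening=1e-9):
    """Return the acceleration of every body due to all other bodies.

    Every body interacts with every other body, so the cost is O(N^2).
    `softening` is a hypothetical small term that avoids division by zero.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector pointing from body i to body j
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            dist2 = sum(c * c for c in d) + softening
            inv_dist3 = 1.0 / (dist2 * math.sqrt(dist2))
            for k in range(3):
                acc[i][k] += G * masses[j] * d[k] * inv_dist3
    return acc


def step(positions, velocities, masses, dt):
    """Advance the system in place by one timestep (semi-implicit Euler)."""
    acc = accelerations(positions, masses)
    for i in range(len(positions)):
        for k in range(3):
            velocities[i][k] += acc[i][k] * dt
            positions[i][k] += velocities[i][k] * dt
```

Calling `step` once per timestep corresponds, conceptually, to what the `-t` option controls in the benchmark; the real implementations additionally distribute the bodies across MPI processes and tasks.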
722 | | |
723 | | There are **7 implementations** of this benchmark, which are compiled into separate |
724 | | binaries by executing the command `make`. Each version is either blocking, |
725 | | where the particle space is divided into smaller blocks, or non-blocking, where |
726 | | it is not. |
727 | | |
728 | | The interoperability versions (MPI+!OmpSs-2+TAMPI) are compiled only if the environment variable `TAMPI_HOME` is set to the Task-Aware MPI (TAMPI) library's installation directory. |
729 | | |
730 | | == Execution Instructions == |
731 | | |
732 | | The binaries accept several options. The most relevant ones are the total number |
733 | | of particles (`-p`) and the number of timesteps (`-t`). Run a binary with the `-h` |
734 | | option to list the remaining options. An example execution is: |
735 | | |
736 | | `mpiexec -n 4 -bind-to hwthread:16 ./nbody -t 100 -p 8192` |
737 | | |
738 | | in which the application performs 100 timesteps on 4 MPI processes, each with 16 hardware threads (used by the !OmpSs-2 runtime). The total number of particles is 8192, so each process holds 2048 particles (2 blocks per process). |
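
The arithmetic behind these numbers can be checked with a short Python sketch. The function name is hypothetical, and the block size of 1024 particles is an assumption inferred from the figures above (2048 particles in 2 blocks); consult the `-h` output for the actual block-size option and default.

```python
def nbody_layout(total_particles, num_processes, block_size):
    """Return (particles per process, blocks per process).

    Assumes particles are split evenly across processes and that each
    process splits its share into equally sized blocks.
    """
    if total_particles % num_processes != 0:
        raise ValueError("particles must divide evenly among processes")
    per_process = total_particles // num_processes
    if per_process % block_size != 0:
        raise ValueError("per-process particles must be a multiple of the block size")
    return per_process, per_process // block_size


# Values from the example command: -p 8192 on 4 MPI processes.
# The block size of 1024 is an assumption, not a documented default.
particles, blocks = nbody_layout(8192, 4, 1024)
print(particles, blocks)
```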
739 | | |
740 | | == References == |
741 | | |
742 | | * [https://pm.bsc.es/gitlab/ompss-2/examples/nbody] |
743 | | * [https://en.wikipedia.org/wiki/N-body_simulation] |
744 | | |
745 | | |
746 | | ---- |
747 | | |
748 | | |
749 | | = Heat Benchmark (MPI+!OmpSs-2+TAMPI) = |
750 | | |
751 | | Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/heat] and transfer it to a DEEP working directory. |
752 | | |
753 | | == Description == |
754 | | |
755 | | This benchmark uses an iterative Gauss-Seidel method to solve the heat equation, |
756 | | a parabolic partial differential equation that describes the distribution of heat (or variation in temperature) in a given region over time. The heat equation is of fundamental importance in a wide range of scientific fields. In |
757 | | mathematics, it is the parabolic partial differential equation par excellence. In statistics, it is related to the study of Brownian motion. The diffusion equation is a more general version of the heat equation, and it is related to the study of chemical diffusion processes. |
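
As an illustration of the Gauss-Seidel method mentioned above, the following Python sketch performs one in-place sweep over a 2D grid. It assumes a standard four-neighbour averaging stencil with fixed boundary values; the function name and the data layout are hypothetical and are not taken from the benchmark's sources.

```python
def gauss_seidel_sweep(u):
    """Update every interior cell of the 2D grid `u` in place.

    Each cell becomes the average of its four neighbours. Because the
    update is in place, cells visited later in the sweep already see the
    new values of their upper and left neighbours, which is what
    distinguishes Gauss-Seidel from a Jacobi iteration.
    Returns the largest absolute change made during the sweep.
    """
    rows, cols = len(u), len(u[0])
    max_change = 0.0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
            max_change = max(max_change, abs(new - u[i][j]))
            u[i][j] = new
    return max_change


# A 4x4 grid whose top boundary is held at 1.0 and everything else at 0.0.
grid = [[1.0] * 4] + [[0.0] * 4 for _ in range(3)]
gauss_seidel_sweep(grid)
```

The dependency on already-updated neighbours is also why the parallel versions need task dependencies (and, across processes, communication) between adjacent blocks.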
758 | | |
759 | | There are **9 implementations** of this benchmark, which are compiled into separate |
760 | | binaries by executing the command `make`. |
761 | | |
762 | | The interoperability versions (MPI+!OmpSs-2+TAMPI) are compiled only if the environment variable `TAMPI_HOME` is set to the Task-Aware MPI (TAMPI) library's installation directory. |
763 | | |
764 | | == Execution Instructions == |
765 | | |
766 | | The binaries accept several options. The most relevant ones are the size |
767 | | of the matrix in each dimension (`-s`) and the number of timesteps (`-t`). Run a binary with the `-h` option to list the remaining options. An example execution |
768 | | is: |
769 | | |
770 | | `mpiexec -n 4 -bind-to hwthread:16 ./heat -t 150 -s 8192` |
771 | | |
772 | | in which the application performs 150 timesteps on 4 MPI processes, each with 16 |
773 | | hardware threads (used by the !OmpSs-2 runtime). The size of the |
774 | | matrix in each dimension is 8192 (8192^2^ elements in total), so |
775 | | each process holds 2048x8192 elements (16 blocks per process). |
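
The per-process figures can be reproduced with a short Python sketch. The function name is hypothetical. The row-wise distribution follows from the 2048x8192 figure above, while the square 1024x1024 block is an assumption inferred from the stated 16 blocks per process; check the `-h` output for the actual block-size option.

```python
def heat_layout(size, num_processes, block_size):
    """Return (rows per process, columns per process, blocks per process).

    Assumes the size x size matrix is split row-wise across processes and
    each process tiles its slab with block_size x block_size blocks.
    """
    if size % num_processes != 0:
        raise ValueError("matrix rows must divide evenly among processes")
    rows = size // num_processes
    if rows % block_size != 0 or size % block_size != 0:
        raise ValueError("slab dimensions must be multiples of the block size")
    blocks = (rows // block_size) * (size // block_size)
    return rows, size, blocks


# Values from the example command: -s 8192 on 4 MPI processes.
# The block size of 1024 is an assumption, not a documented default.
rows, cols, blocks = heat_layout(8192, 4, 1024)
print(rows, cols, blocks)
```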
776 | | |
777 | | == References == |
778 | | |
779 | | * [https://pm.bsc.es/gitlab/ompss-2/examples/heat] |
780 | | * [https://pm.bsc.es/ftp/ompss-2/doc/examples/local/sphinx/04-mpi+ompss-2.html] |
781 | | * [https://en.wikipedia.org/wiki/Heat_equation] |
782 | | |
783 | | ---- |
784 | | |
785 | | = Krist Benchmark (!OmpSs-2+CUDA) = |
786 | | |
787 | | Users must clone/download this example's repository from [https://pm.bsc.es/gitlab/ompss-2/examples/krist] and transfer it to a DEEP working directory. |
788 | | |
789 | | == Description == |
790 | | |
791 | | This benchmark represents the krist kernel, which is used in crystallography to find the exact shape of a molecule using Röntgen (X-ray) diffraction on single crystals or powders. |
792 | | |
793 | | There are **2 implementations** of this benchmark, ''krist'' and ''krist-unified'', which use regular and unified CUDA memory, respectively. |
794 | | |
795 | | == Execution Instructions == |
796 | | |
797 | | `./krist N_A N_R` |
798 | | |
799 | | where: |
800 | | * `N_A` is the number of atoms (1000 by default). |
801 | | * `N_R` is the number of reflections (10000 by default). |
802 | | |
803 | | == References == |
804 | | |
805 | | * [https://pm.bsc.es/gitlab/ompss-2/examples/krist] |