= Offloading computational tasks of hybrid MPI + OpenMP/OmpSs-2 applications to GPUs =

Table of contents:
 * [#QuickOverview Quick Overview]
 * Examples:
   * [#NBodyBenchmark NBody Benchmark]

----

= Quick Overview =

----

= NBody Benchmark =

Users can clone or download this example from the https://pm.bsc.es/gitlab/DEEP-EST/apps/NBody
repository and transfer it to a DEEP working directory.

== Description
An N-body simulation numerically approximates the evolution of a system of
bodies in which each body continuously interacts with every other body. A
familiar example is an astrophysical simulation in which each body represents a
galaxy or an individual star, and the bodies attract each other through the
gravitational force.

N-body simulation arises in many other computational science problems as well.
For example, protein folding is studied using N-body simulation to calculate
electrostatic and ''van der Waals'' forces. Turbulent fluid flow simulation and
global illumination computation in computer graphics are other examples of
problems that use N-body simulation.

== Requirements
The requirements of this application are shown in the following lists. The main requirements are:

 * '''GNU Compiler Collection'''.
 * '''!OmpSs-2''': !OmpSs-2 is the second generation of the '''!OmpSs''' programming model. It is a task-based
   programming model that originated from the ideas of the OpenMP and !StarSs programming models. The
   specification and user guide are available at https://pm.bsc.es/ompss-2-docs/spec/ and
   https://pm.bsc.es/ompss-2-docs/user-guide/, respectively. !OmpSs-2 requires both the '''Mercurium''' and
   '''Nanos6''' tools. Mercurium is a source-to-source compiler that provides the necessary support for
   transforming the high-level directives into a parallelized version of the application. The Nanos6
   runtime system library provides the services to manage all the parallelism in the application
   (e.g., task creation, synchronization, scheduling, etc.). Both are available for download at https://github.com/bsc-pm.
 * '''Clang + LLVM OpenMP''' (derived):
 * '''MPI''': This application requires an MPI library that supports the multi-threaded level of
   thread support (`MPI_THREAD_MULTIPLE`).

In addition, some optional tools enable building other versions of the application:

 * '''CUDA''' and NVIDIA '''Unified Memory''' devices: This application has CUDA variants in which some of
   the N-body kernels are executed on the available GPU devices.
 * '''Task-Aware MPI (TAMPI)''': The Task-Aware MPI library provides the interoperability mechanism
   for MPI and OpenMP/!OmpSs-2. Downloads and more information are available at https://github.com/bsc-pm/tampi.