
Version 4 (modified by Kevin Sala, 5 years ago)

Offloading computational tasks of hybrid MPI + OpenMP/OmpSs-2 applications to GPUs


Quick Overview

NBody Benchmark

Users can clone or download this example from the https://pm.bsc.es/gitlab/DEEP-EST/apps/NBody repository and transfer it to a DEEP working directory.

Description

An N-body simulation numerically approximates the evolution of a system of bodies in which each body continuously interacts with every other body. A familiar example is an astrophysical simulation in which each body represents a galaxy or an individual star, and the bodies attract each other through the gravitational force.

N-body simulations arise in many other computational science problems as well. For example, protein folding is studied using N-body simulations to calculate electrostatic and van der Waals forces. Turbulent fluid flow simulation and global illumination computation in computer graphics are other examples of problems that use N-body simulations.

Requirements

The main requirements of this application are:

  • GNU Compiler Collection.
  • OmpSs-2: OmpSs-2 is the second generation of the OmpSs programming model, a task-based programming model that originated from the ideas of the OpenMP and StarSs programming models. The specification and user guide are available at https://pm.bsc.es/ompss-2-docs/spec/ and https://pm.bsc.es/ompss-2-docs/user-guide/, respectively. OmpSs-2 requires both the Mercurium and Nanos6 tools. Mercurium is a source-to-source compiler that provides the necessary support for transforming the high-level directives into a parallelized version of the application. The Nanos6 runtime system library provides the services to manage all the parallelism in the application (e.g., task creation, synchronization, and scheduling). Both can be downloaded from https://github.com/bsc-pm.
  • Clang + LLVM OpenMP (derived):
  • MPI: This application requires an MPI library that supports the MPI_THREAD_MULTIPLE level of thread support.

In addition, there are some optional tools which enable the building of other application versions:

  • CUDA and NVIDIA Unified Memory devices: This application has CUDA variants in which some of the N-body kernels are executed on the available GPU devices.
  • Task-Aware MPI (TAMPI): The Task-Aware MPI library provides the interoperability mechanism for MPI and OpenMP/OmpSs-2. Downloads and more information at https://github.com/bsc-pm/tampi.