
The MSA story of the DEEP projects family

1) Motivation

General purpose systems

+ Highly flexible
- High energy consumption
+ Preferred by many applications

Highly scalable systems

- Few (highly parallelizable) codes can fully exploit them
+ Highly energy efficient

2) Can one combine the best of these two worlds into a single system? → Yes! Exploit heterogeneity!

Homogeneous cluster

  • General purpose CPUs attached to a high-speed network

+ Easy to use, very flexible
- Power hungry

Traditional heterogeneous cluster

  • Attach accelerators (e.g. GPUs) to each CPU

+ Energy efficient, easy management
- Static assignment of accelerators to CPUs

3) The basis for the MSA: The Cluster-Booster Concept

The MSA developed in DEEP-EST builds on the so-called Cluster-Booster architecture.

Cluster-Booster architecture

+ Energy efficient, high flexibility, dynamic resource assignment

Does this work?

The Cluster-Booster architecture was first conceptualized and proven with prototypes in the DEEP project.
It combines a standard HPC Cluster with a tightly connected HPC Booster built of many-core
processors or accelerators. The second project, DEEP-ER, evolved this architecture to address two significant
Exascale computing challenges: highly scalable and efficient parallel I/O, and system resiliency. Co-design
was the key to tackling these challenges: new hardware and software components were developed in a tightly
integrated way and fine-tuned with actual HPC applications in mind. The results of these two projects showed: Yes, it works!

4) Towards a modular supercomputing architecture - The theory

The idea of an MSA generalizes the Cluster-Booster concept to any number of specialized modules, so that diverse application needs can be addressed.
An example could be a system arranged like this:

Multiple specialized modules allow a wide range of different applications to use the system efficiently. Each
application has its own way of using the MSA. One scenario in which the workflows of several applications are distributed
over the system could be:

5) The DEEP-EST MSA prototype

The DEEP-EST project expands the Cluster-Booster architecture by adding a new module to the system: the Data Analytics Module.

This MSA creates a unique HPC system by coupling various modules together. All modules together behave as a single machine. The modules are connected through a high-speed network and, most importantly, operated with a uniform system software and programming environment. This allows an application to be distributed over several modules, with each part of its code running on the best-suited hardware.
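
The idea of running each part of an application on the best-suited module can be illustrated with a small sketch. This is not DEEP-EST software or any real scheduler API: the `Module` class, the `best_module` and `place` helpers, the strength tags, and the example application parts are all hypothetical and chosen only for illustration. Only the three module names come from the text above.

```python
# Illustrative sketch only: all classes, tags and application parts below
# are assumptions, not part of the DEEP-EST system software.
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    strengths: frozenset  # hypothetical tags describing what the module is good at


# Module names are taken from the text; the strength tags are assumed.
MODULES = [
    Module("Cluster", frozenset({"flexible", "serial"})),
    Module("Booster", frozenset({"highly-parallel", "energy-efficient"})),
    Module("Data Analytics Module", frozenset({"data-analytics", "large-memory"})),
]


def best_module(needs, modules):
    """Return the module whose strengths overlap most with the given needs.

    On a tie, the module listed first wins (behaviour of max()).
    """
    return max(modules, key=lambda m: len(m.strengths & needs))


def place(parts, modules):
    """Map each application part (name -> set of needs) to a module name."""
    return {part: best_module(needs, modules).name for part, needs in parts.items()}


# Hypothetical application split into three parts with different needs.
parts = {
    "setup": {"flexible"},
    "solver": {"highly-parallel"},
    "analysis": {"data-analytics"},
}

placement = place(parts, MODULES)
for part, module in placement.items():
    print(f"{part} -> {module}")
```

In a real MSA system this decision is made by the user or the job scheduler with far more information (node counts, memory, network topology); the sketch only shows the matching principle.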

Last modified on Feb 5, 2020, 7:28:08 AM
