[[TOC]]

= Overview of our systems =

This page gives a short overview of the available systems from a hardware point of view. All hardware can be reached through a login node via SSH: '''!deep@fz-juelich.de'''. The login node is implemented as a virtual machine hosted by the master nodes (in a failover mode). Please also see the information about [wiki:Public/User_Guide/Account getting an account] and using the [wiki:Public/User_Guide/Batch_system batch system].

== DEEP-EST Modular Supercomputer ==

[[Image(DEEP-EST_Modules.png, width=500px, align=center)]]

The DEEP-EST system is a prototype of the Modular Supercomputing Architecture (MSA) consisting of the following modules:

 * Cluster Module
 * Extreme Scale Booster
 * Data Analytics Module

In addition to these compute modules, the Scalable Storage Service Module provides the shared storage infrastructure for the DEEP-EST prototype.

The modules are connected by the Network Federation, which is composed of different types of interconnects and is briefly described below.

=== Cluster Module ===

It is composed of 50 nodes with the following hardware specifications:

{{{#!td
 * Cluster [50 nodes]: `dp-cn[01-50]`
   * 2 x Intel Xeon 'Skylake' Gold 6146 (12 cores / 24 threads per CPU, 3.2 GHz)
   * 192 GB RAM
   * 1 x 400 GB NVMe SSD
   * network: !InfiniBand EDR (100 Gb/s)
}}}
{{{#!td
[[Image(CM_node_hardware.png, width=600px, align=center)]]
}}}

=== Extreme Scale Booster ===

It is composed of 75 nodes with the following hardware specifications:

{{{#!td
 * Extreme Scale Booster [75 nodes]: `dp-esb[01-75]`
   * 1 x Intel Xeon 'Cascade Lake' Silver 4215 CPU @ 2.50 GHz
   * 1 x Nvidia Tesla V100 GPU (32 GB HBM2)
   * 48 GB RAM
   * 1 x 250 GB SSD (for boot and OS)
   * network: EXTOLL (100 Gb/s)
}}}
{{{#!td
[[Image(ESB_node_hardware.png, width=400px, align=center)]]
}}}

[[span(style=color: #FF0000, **Attention:** )]] the Extreme Scale Booster will become available in March 2020.
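
The bracketed node names used on this page, such as `dp-cn[01-50]`, are a compact range notation. As a minimal sketch, the hypothetical helper `expand_nodes` below (an illustration only, not part of any DEEP-EST tooling) expands such a pattern into individual host names, assuming the numbers are zero-padded to the width shown in the pattern:

```python
import re


def expand_nodes(pattern: str) -> list[str]:
    """Expand a bracketed range such as 'dp-cn[01-50]' into host names.

    Hypothetical helper for illustration; it assumes a single bracketed
    group containing comma-separated numbers or 'first-last' ranges.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", pattern)
    if match is None:
        return [pattern]  # plain name without a bracketed range
    prefix, spec = match.groups()
    names = []
    for part in spec.split(","):
        first, _, last = part.partition("-")
        last = last or first  # a single number is a range of length one
        width = len(first)    # keep the zero padding used in the pattern
        for index in range(int(first), int(last) + 1):
            names.append(f"{prefix}{index:0{width}d}")
    return names


cluster = expand_nodes("dp-cn[01-50]")
booster = expand_nodes("dp-esb[01-75]")
print(cluster[0], cluster[-1], len(cluster))
print(booster[0], booster[-1], len(booster))
```

With the two patterns listed above, this yields the names `dp-cn01` to `dp-cn50` (50 nodes) and `dp-esb01` to `dp-esb75` (75 nodes).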

=== Data Analytics Module ===

It is composed of 16 nodes with the following hardware specifications:

{{{#!td
 * Data Analytics Module [16 nodes]: `dp-dam[01-16]`
   * 2 x Intel Xeon 'Cascade Lake' Platinum 8260M CPU @ 2.40 GHz
   * 1 x Nvidia Tesla V100 GPU (32 GB HBM2)
   * 1 x Intel STRATIX10 FPGA (32 GB DDR4)
   * 384 GB RAM + 2 or 3 TB non-volatile memory (14 nodes with 2 TB, 2 nodes with 3 TB)
   * 2 x 1.5 TB Intel Optane SSD (1 for local scratch, 1 for BeeOND)
   * 1 x 240 GB SSD (for boot and OS)
   * network: EXTOLL (100 Gb/s) + 40 Gb Ethernet
}}}
{{{#!td
[[Image(DAM_node_hardware.png, width=620px, align=center)]]
}}}

== Network overview ==

Different types of interconnects are in use, along with the Gigabit Ethernet connectivity (used for the administration and service networks) that is available on all nodes. The following sketch gives a rough overview. The network details are of particular interest for storage access. Please also refer to the description of the [wiki:Public/User_Guide/Filesystems filesystems].

[[Image(DEEP-EST_Networks_Schematic_Overview.png, width=850px, align=center)]]

**Attention:** performance measurements for the Network Federation will be provided in the future.

**Attention:** additional information will be provided once the EXTOLL fabric for the Extreme Scale Booster becomes available (ETA: September 2020).

== Rack plan ==

The sketch below shows the available hardware. It is followed by a short description of the hardware that is relevant to system users (the nodes you can use for running your jobs and for testing).

[[Image(Prototype_plus_SSSM_and_SDV_Rackplan_47U-2019-11.png, 60%, align=center)]]

{{{#!comment
== miclogin: ==
 * knc1:
   * 2 Xeon CPUs
   * 64 GB memory
   * 4 KNCs (named knc1-mic![0-3]) with 61 cores and 16 GB each
 * knc2:
   * 2 Xeon CPUs
   * 64 GB memory
   * 2 KNCs (named knc2-mic![0-1]) with 57 cores and 6 GB each

The DEEP cluster has been removed permanently!
== DEEP: ==
 * Cluster:
   * 2 Xeon CPUs per node
   * 32 GB memory per node
   * 128 nodes
 * Booster:
   * 2 KNCs per BNC (Booster Node Card)
   * 16 GB per KNC
   * 192 BNCs
}}}

=== SSSM rack ===

This rack hosts the master nodes, file servers and storage, as well as the network components for the Gigabit Ethernet administration and service networks. Users can access the login node via '''!deep@fz-juelich.de''' (implemented as a virtual machine running on the master nodes). The rack is air-cooled.

=== CM rack ===

This rack contains the hardware of the DEEP-EST Cluster Module, including compute nodes, management nodes, network components and the liquid cooling unit.

=== DAM rack ===

This rack hosts the nodes of the Data Analytics Module of the DEEP-EST prototype. The rack is air-cooled.

=== SDV rack ===

Along with the prototype systems, several test nodes and so-called software development vehicles (SDVs) have been installed in the scope of the DEEP(-ER,EST) projects. These are located in the SDV rack (07). The following components can be accessed by users:

 * Prototype DAM [4 nodes]: `protodam[01-04]`
   * 2 x Intel Xeon 'Skylake' (26 cores per socket)
   * 192 GB RAM
   * network: Gigabit Ethernet
 * Old DEEP-ER Cluster Module SDV [16 nodes]: `deeper-sdv[01-16]`
   * 2 x Intel Xeon 'Haswell' E5-2680 v3 (2.5 GHz)
   * 128 GB RAM
   * 1 NVMe with 400 GB per node (accessible through BeeGFS on demand)
   * network: 100 Gb/s EXTOLL Tourmalet
 * KNLs [4 nodes]: `knl[01,04-06]`
   * 1 x Intel Xeon Phi (64-68 cores)
   * 1 NVMe with 400 GB per node (accessible through BeeGFS on demand)
   * 16 GB MCDRAM plus 96 GB RAM per KNL
   * network: Gigabit Ethernet
{{{#!comment
have been removed meanwhile
 * KNMs [2 nodes]: `knm[01-02]`
   * 1 x Intel Xeon Phi - Knights Mill (72 cores)
   * 16 GB MCDRAM plus 96 GB RAM per KNM
   * network: Gigabit Ethernet
}}}
 * GPU nodes for Machine Learning [3 nodes]: `ml-gpu[01-03]`
   * 2 x Intel Xeon 'Skylake' Silver 4112 (2.6 GHz)
   * 192 GB RAM
   * 4 x Nvidia Tesla V100 GPU (PCIe Gen3), 16 GB HBM2
   * network: 40 Gb Ethernet
 * Old DEEP-ER NAM SDV:
   * size: 2 GB
   * network: EXTOLL
   * details: https://www.deep-projects.eu/hardware/memory-hierarchies/49-nam

{{{#!comment
Not available anymore
=== FPGA test server ===

In addition to the seven racks hosting the SDV and prototype hardware, there is an FPGA workstation available for testing. Please contact j.kreutz@fz-juelich.de if you would like to get access.

 * FPGA [1 node]: `fpga01`
   * 2 x Intel CPU (8 cores)
   * 64 GB RAM
   * 1 x Intel Arria 10 PAC
}}}

= Further information =

 * [wiki:Public/User_Guide/Batch_system Information about the batch system]
 * [wiki:Public/User_Guide/Filesystems Filesystems]
 * [wiki:Public/User_Guide/Information_on_software Information on available software and tools]
{{{#!comment
 * [wiki:Public/User_Guide/Cluster Use the Cluster] outdated
 * [wiki:Public/User_Guide/Booster Use the Booster] outdated
}}}
 * [wiki:Public/User_Guide/DEEP-EST_CM Use the DEEP-EST Cluster Module]
 * [wiki:Public/User_Guide/DEEP-EST_DAM Use the DEEP-EST Data Analytics Module]
 * [wiki:Public/User_Guide/SDV_Cluster Use the SDV Cluster]
 * [wiki:Public/User_Guide/SDV_KNLs Use the SDV KNLs]