A-CLASS The rack-level supercomputer platform with hot-water cooling




A-CLASS: The rack-level supercomputer platform with hot-water cooling
INTRODUCTORY PRESENTATION, JUNE 2014, Rev 1, ENG

COMPUTE PRODUCT SEGMENTATION

T-MINI P (PRODUCTION): Minicluster/WS systems; 4-8 DP compute trays; 1 fixed head; INTEL/NVIDIA; integrated storage; integrated networking; 3rd-party board. Segments: Workgroup, Shared.

E-CLASS (H2 2014): 2U general server/storage server; 1 DP compute node; T-Platforms V210 system board (INTEL) and common management intrinsics; hot-swap disks; shares V-Class management.

V-CLASS (PRODUCTION): 5U air-cooled chassis; 10 DP compute modules or 5 accelerated DP modules; V2x0/V205/402 boards; INTEL/AMD/NVIDIA/ELBRUS; centralized management. Segments: Workgroup, Departmental and Divisional.

A-CLASS (2014): ~52U cabinet; hot-water cooling; integrated PSU blades; integrated blade switches; 1P INTEL + NVIDIA GPU at launch; rack-level integration. Segment: Supercomputer.

FAMILY | ABBREVIATION | SEGMENT
A-CLASS | ADVANCED | HIGH-END HPC
V-CLASS | VOLUME | MIDDLE-END HPC / HIGH-END CLOUD / SMB
E-CLASS | ENTERPRISE | STORAGE OR GENERAL SERVER
T-MINI P (P-CLASS) | PERSONAL | SCALABLE ALL-IN-ONE HPC/WS SYSTEMS

INTRODUCING A-CLASS
- Developed for multi-petaflops supercomputers
- Peak performance of 420 Tflops per system*
- Scales up to 128 systems (over 54 Pflops)
- Direct hot-water cooling technology
- Energy efficiency of 3,400 Mflops/W**
- Modular architecture supports various future compute nodes
- A new level of reliability and availability
- Thermally isolated system cabinet
- Early availability: Q2 2014
* Based on 1-way Xeon E5-2680 v2 with a single NVIDIA Tesla K40
** Peak value
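
As a rough sanity check of the 420 Tflops figure, here is a sketch using standard vendor peak numbers that are not stated on the slides: a 10-core Xeon E5-2680 v2 at 2.8 GHz peaks at about 224 Gflops (double precision) and a Tesla K40 at about 1,430 Gflops, so a 1P CPU + 1 GPU node peaks at roughly 1.65 Tflops, and 256 such nodes per rack give about 423 Tflops, consistent with the quoted value.

    # Rough per-node and per-rack peak estimate (assumed vendor figures, not from the slides)
    cpu_peak_gflops = 10 * 2.8 * 8        # E5-2680 v2: 10 cores x 2.8 GHz x 8 DP flops/cycle = 224 Gflops
    gpu_peak_gflops = 1430                # Tesla K40 double-precision peak at base clock
    node_peak_tflops = (cpu_peak_gflops + gpu_peak_gflops) / 1000
    nodes_per_rack = 256                  # implied by the scalability footnote (256 CPUs / 256 GPUs per system)
    print(node_peak_tflops, nodes_per_rack * node_peak_tflops)   # ~1.65 Tflops/node, ~423 Tflops/rack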

POTENTIAL APPLICATION SEGMENTS
- Aerospace
- Machine building
- Shipbuilding
- Transportation and automotive
- Oil, gas and utilities
- Data security
- Pharmacology and drug design
- Finance
- Semiconductor design
- Human science
- Climate research

THE A-CLASS OVERVIEW

RACK ENCLOSURE
- Custom rack-level enclosure
- Integrated high-speed signal and power backplane
- Dimensions: 1500 mm wide, 800 mm deep, 2400 mm high (~52U)
- Hybrid cooling system:
  - Direct hot water for management, compute and switch modules
  - Inlet temperature up to 45°C; outlet temperature over 50°C at 45°C inlet
  - Water flow rate up to 10.5 litres/s
  - Air-cooled power supplies
  - Integrated heat exchanger
  - Closed air circulation inside the enclosure
  - Low operational noise
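
As a back-of-the-envelope check (a sketch; the specific heat of water is assumed, not stated on the slide), a 10.5 litre/s water loop warming from 45°C to just over 50°C carries away roughly 220 kW, comfortably above the 194 kW power budget described on the next slide.

    # Heat carried by the water loop: Q = m_dot * c_p * delta_T
    flow_kg_per_s = 10.5          # ~1 kg per litre of water
    c_p = 4186                    # J/(kg*K), specific heat of water (assumed)
    delta_t = 5                   # K, 45 C inlet -> ~50 C outlet
    q_watts = flow_kg_per_s * c_p * delta_t
    print(q_watts / 1000)         # ~220 kW of heat-removal capacity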

RACK ENCLOSURE (2)
- Provides up to 194 kW of total power
- N+1 power redundancy for each compute section
- 380 VAC 3-phase input
- 48 VDC bus with metering function
- 8 groups x 12 PSUs (3 kW PSU model with up to 97.2% efficiency)
- Main frontal sections:
  - 2 bays for independent management modules (with switches)
  - 8 bays for compute sections (with switches)
  - 8 bays for PSU groups (up to 12 PSUs each)
- All frontal sections support hot swap
- The enclosure comes with semi-transparent French doors at the front and at the back for thermal isolation
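
For context, a sketch of the raw PSU capacity behind the 194 kW rating (the 194 kW platform figure is the one to trust; how it is derated from installed capacity is not broken down on the slide): 8 groups of 12 PSUs at 3 kW each amount to 288 kW installed, or 264 kW if one PSU per group is held in reserve for N+1 redundancy.

    # Installed vs. N+1-usable PSU capacity (illustrative only; 194 kW is the platform spec)
    groups, psus_per_group, psu_kw = 8, 12, 3
    installed_kw = groups * psus_per_group * psu_kw          # 288 kW raw
    n_plus_1_kw = groups * (psus_per_group - 1) * psu_kw     # 264 kW with one spare PSU per group
    print(installed_kw, n_plus_1_kw)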

3 TYPES OF SECTIONS
- Management section (x2)
- Compute section with switches (x8)
- Power section (x8)

MANAGEMENT SECTION
- 2 identical, independent management sections
- Each hot-swap module includes:
  - 1P head node with Intel Xeon E5-2600 v2
  - A network port configuration different from the compute nodes: 2 x 10GbE ports and 1 x QSFP InfiniBand port
  - 2 x 256 GB SSD
  - Top-layer Ethernet switch for the system management network
  - Top-layer FDR InfiniBand switch for the data network
- The 2 identical modules act as a failover cluster (see the diagram on the Simplified networks topology slide)
- Management modules are hot-water cooled

COMPUTE AND POWER SECTIONS
- 12 hot-swap modules per section
[Figure: compute section showing a module with four compute nodes, FDR IB module B and FDR IB + ETH module A; power section showing 3 kW PSUs]

HOT-SWAP MODULES
4 types of hot-swap modules:
- Management (server + 2 switches)
- Computational (4 compute nodes, "CM")
- Communication, type A (one IB data network switch and two Ethernet switches for the management networks)
- Communication, type B (IB system interconnect switch)
The 4 CMs are mounted in pairs on both sides of the water plate.
Single compute node features*:
- 1 x Intel Xeon E5-2600 v2, TDP 135 W+
- Up to 32 GB of DDR3-1866 Reg. ECC (4 modules)
- Optional 256 GB SSD
- 2 internal GbE ports
- 2 internal InfiniBand FDR 56 Gb/s ports
- 1 x NVIDIA Tesla K40 (SXM), TDP 235 W
* A Haswell-based node is planned for H2 2014
[Figure: CAD model of the compute module, showing extraction handles, spill-proof inlet and outlet water connectors, and the compute node system board]

SIMPLIFIED NETWORKS TOPOLOGY
2 independent InfiniBand networks:
- Two-layer data network
- MPI network with support for various topologies
2 independent management networks:
- 2 x two-layer Gigabit/10-Gigabit Ethernet networks

MANAGEMENT NETWORKS
- 2 independent two-layer Gigabit/10-Gigabit Ethernet networks
- Two bottom-layer Ethernet switches per compute section (1 switch per network)
- One top-layer Ethernet switch per management section
- Every top-layer switch is connected to the bottom-layer switches
- Every top-layer switch has two 10-Gigabit Ethernet uplink ports to connect to the external Ethernet network
- A total of 1 top-layer and 8 bottom-layer switches in each network
- A total of four 10-Gigabit Ethernet uplink ports (tallied in the sketch below)
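
A minimal tally of the management-network counts quoted above (a sketch; the per-section switch and uplink figures are taken directly from the slide):

    # Management networks: 2 independent two-layer Ethernet networks
    networks = 2
    compute_sections, mgmt_sections = 8, 2
    bottom_switches_per_network = compute_sections            # 1 bottom-layer switch per compute section per network
    top_switches_per_network = mgmt_sections // networks      # each management section hosts the top switch of one network
    uplinks_total = networks * top_switches_per_network * 2   # two 10GbE uplinks per top-layer switch
    print(bottom_switches_per_network, top_switches_per_network, uplinks_total)   # 8, 1, 4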

MANAGEMENT NETWORKS TOPOLOGY

DATA NETWORK (IB FABRIC 1)
- Two-layer topology
- Every compute section has one bottom-layer InfiniBand switch
- Every management section has one top-layer InfiniBand switch
- Every bottom-layer switch is connected to both top-layer switches
- Every top-layer switch has 18 external FDR InfiniBand ports to connect to an external network
- A total of 2 top-layer switches and 8 bottom-layer switches
- A total of 36 external FDR InfiniBand data network ports (see the port tally after the MPI fabric slide)

MPI TRAFFIC NETWORK (IB FABRIC 2)
- A flexible-topology network with n-torus support
- Every compute section has 4 MPI-network InfiniBand switches
- Every FDR InfiniBand switch has 28 external ports to connect the system to a larger external InfiniBand network or to form a self-contained MPI network within one A-Class system
- A total of 32 MPI traffic switches per system
- A total of 896 external FDR InfiniBand MPI traffic ports (tallied, together with fabric 1, in the sketch below)
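
The external port totals for the two InfiniBand fabrics follow directly from the per-switch figures above (a sketch using only numbers from the slides):

    # IB fabric 1 (data network): 2 top-layer switches, 18 external FDR ports each
    data_external_ports = 2 * 18                    # 36
    # IB fabric 2 (MPI traffic): 8 compute sections x 4 switches, 28 external FDR ports each
    mpi_switches = 8 * 4                            # 32
    mpi_external_ports = mpi_switches * 28          # 896
    print(data_external_ports, mpi_switches, mpi_external_ports)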

INTERCONNECT FABRICS TOPOLOGY

EXTERNAL IB FABRIC TOPOLOGIES
- External InfiniBand networks support various topologies, including 3D and 4D torus, Dragonfly and Flattened Butterfly
- Ports are configured inside the rack chassis in accordance with the desired topology
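
To illustrate what wiring for one of these topologies means in practice, here is a minimal, generic sketch (not T-Platforms tooling; the coordinates and dimensions are hypothetical) that lists the neighbours of a rack in a k-dimensional torus; each neighbour pair corresponds to one bundle of inter-rack links between pre-configured external ports.

    from typing import List, Tuple

    def torus_neighbors(coord: Tuple[int, ...], dims: Tuple[int, ...]) -> List[Tuple[int, ...]]:
        """Neighbours of `coord` in a kD torus with side lengths `dims` (wrap-around links)."""
        neighbors = []
        for axis, size in enumerate(dims):
            for step in (-1, +1):
                n = list(coord)
                n[axis] = (n[axis] + step) % size   # wrap around the torus in this dimension
                neighbors.append(tuple(n))
        return neighbors

    # Example: a rack at position (1, 2, 0) in a hypothetical 4x4x8 3D-torus of A-Class racks
    print(torus_neighbors((1, 2, 0), (4, 4, 8)))    # 6 neighbours -> 6 inter-rack link bundles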

SYSTEM SOFTWARE
ClustrX HPC Pack software:
- Cluster management module
- User management module
- Various resource managers and monitoring systems, depending on preference
- Hardware management tools
- ClustrX Safe: automated equipment shutdown in case of abnormalities (AESS)
Various functionality:
- Support for distributed service nodes
- Virtual machine support
- Support for multiple running operating systems within one cluster
- Local or diskless node boot via Ethernet, InfiniBand and iSCSI
- Support for various file systems and databases
- Customized GUI dashboard to present the system's operational status data

COOLING INFRASTRUCTURE

THE A-CLASS ADVANTAGE

A-CLASS ADVANTAGES: ENERGY EFFICIENCY, COMPUTE DENSITY, SCALABILITY, RELIABILITY

ENERGY EFFICIENCY
Benefits of direct hot-water cooling:
- High peak energy efficiency of 3,400 Mflops per Watt*
- Year-round free cooling at outdoor temperatures up to +35°C
- Savings from eliminating compressors and moving from air conditioners to dry air-cooled heat exchangers
- Hot water can be reused to heat buildings in winter
- Lower operational noise (TBA)
Thermally isolated rack cabinet:
- The air inside the A-Class does not mix with the ambient datacentre air
* Theoretical peak value. The real-life value is to be measured at the plug.
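
For orientation only (a sketch; the slides do not state which power figure the efficiency is normalized to): at 420 Tflops peak, 3,400 Mflops/W implies roughly 123 kW of accounted power draw, noticeably less than the 194 kW enclosure budget, which is consistent with the figure being a theoretical peak rather than a measured at-the-plug value.

    # Power implied by the quoted peak efficiency (illustrative arithmetic only)
    peak_gflops = 420_000                  # 420 Tflops per system
    efficiency_gflops_per_w = 3.4          # 3,400 Mflops/W
    implied_power_kw = peak_gflops / efficiency_gflops_per_w / 1000
    print(implied_power_kw)                # ~123.5 kW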

COMPUTATIONAL DENSITY
- The A-Class design packs 12.3 kW of power per square metre, 2.8 times the industry average
- 420 Tflops per rack, including switches*
High computational density factors:
- Custom modular rack
- High level of integration: management, compute, communication, cooling and power components in one enclosure
- Unique water radiator design able to cool up to 2,500 W
- Custom form-factor boards
* 350 Tflops/m²
[Figure: thermal simulation of an A-Class compute module]
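
The footnoted 350 Tflops/m² follows from the rack footprint given earlier (a sketch using only the slides' own numbers):

    # Compute density from the rack footprint (1500 mm x 800 mm) and 420 Tflops per rack
    footprint_m2 = 1.5 * 0.8               # 1.2 m^2
    density_tflops_per_m2 = 420 / footprint_m2
    print(density_tflops_per_m2)            # 350 Tflops/m^2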

SCALABILITY
- Up to 128 systems with 32K nodes for 54 Pflops of peak performance*
- Balanced ratio of CPU and GPU performance to available InfiniBand interface throughput: 3.3 GB/s per Tflops
- The system is quickly expanded by adding new racks
Networks:
- Full bandwidth with low diameter for 32K nodes
- Separation of data and MPI traffic
- Dragonfly and Flattened Butterfly as well as classic kD-torus topologies are supported
- Integrated switches simplify the cabling infrastructure
- Management and monitoring system with automation scripts to accelerate larger deployments
(The headline totals are tallied in the sketch below.)
* Peak performance of 128 systems, each with 256 Intel Xeon E5-2680 v2 CPUs and 256 NVIDIA Tesla K40 GPUs
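
A quick tally of the headline scalability numbers (a sketch, using the 256 single-socket nodes per system implied by the footnote):

    # Full-scale configuration: 128 systems of 256 nodes each
    systems, nodes_per_system = 128, 256
    total_nodes = systems * nodes_per_system        # 32,768 (~32K)
    total_peak_pflops = systems * 420 / 1000        # 128 x 420 Tflops = 53.76 Pflops (~54 Pflops)
    print(total_nodes, total_peak_pflops)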

RELIABILITY
Hardware features:
- Two independent hot-swap management modules with dedicated management networks
- Redundant input power and independent groups of power supplies with N+1 redundancy inside each group
- Local scratch disks
- Temperature, current, pressure and other sensors
- Monitoring at the chassis, section and node levels
- Leakage sensors automatically block the water supply
- The network subsystem is routed away from the power and cooling subsystems
Software features:
- A specialized ClustrX.HPC Pack module for the A-Class hardware features
- ClustrX.Safe software supporting the Automated Equipment Shutdown System (AESS)

SUMMARY
- A-Class is the most advanced system ever designed by T-Platforms
- Developed from scratch
- Features high-level integration of custom components
- The system will be offered to the largest international HPC datacentres from June 23, 2014
- For access to a demo system, please direct your inquiries to sales@t-platforms.ru
- Check out www.t-platforms.com/a-class for additional system information

THANK YOU! www.t-platforms.com/a-class sales@t-platforms.ru

ADDITIONAL PHOTOS
[Photos: front view; rear view (without cables)]

ADDITIONAL PHOTOS