Scaling Across the Supercomputer Performance Spectrum

Cray's XC40 system leverages the combined advantages of the next-generation Aries interconnect and Dragonfly network topology, Intel Xeon processors, integrated storage solutions and major enhancements to the Cray OS and programming environment. The Cray XC series supercomputer is a groundbreaking architecture upgradable to 100 petaflops per system, delivering:

- Sustained and scalable application performance
- Tight HPC optimization and integration
- Production supercomputing
- Investment protection
- Upgradability by design
- User productivity

The Cray XC40 supercomputer is a massively parallel processing (MPP) architecture focused on producing more capable HPC systems for a broad range of user communities. The Cray XC series is targeted at scientists, researchers, engineers, analysts and students across the technology, science, industry and academic fields.

Adaptive Supercomputing Architecture

Building on Cray's adaptive supercomputing vision, the Cray XC series integrates extreme-performance HPC interconnect capabilities with best-in-class processing technologies to produce a single, scalable architecture. Understanding that no single processor engine is ideal for every type of user application, the Cray XC series highlights the flexibility of scalar processing, coprocessing and accelerators to build hybrid systems capable of leveraging the strengths of each technology in one adaptive HPC environment. Computing applications often need a balance of scalar and parallel operations to execute optimally. Merging x86 multicore benefits with coprocessor or accelerator many-core advantages makes it possible to target the best processing engine type for each specific function. With a robust hardware and software environment empowering customer choice, users can configure their Cray XC40 systems to their own unique requirements and goals. With the Cray XC40 system, the adaptive supercomputing concept extends to hardware and network upgradeability, the comprehensive HPC software environment, optimized OS and ISV support, networking, storage and data management, as well as power and cooling.

Holistic and Integrated Platform

Cray uses holistic and integrated methods to develop the world's most complete and robust HPC systems. This R&D process is a key benefit of Cray supercomputers and ensures that each product is developed, tested and validated against the most demanding real-world HPC applications. Rather than assembling cluster-like components and commodity networks that can degrade or act unpredictably at high node counts, the Cray XC series architecture builds on Cray's history of system-centric development. With a comprehensive scope of extreme-performance interconnects, processing, packaging, cooling, power options, file systems, upgradability, supervisory systems, OS and software development environment, the Cray XC series delivers a reliable, production-quality HPC solution. Additionally, like all Cray systems, Cray XC40 supercomputers offer the ability to efficiently scale key software applications, easy and proactive future upgradability by design, and a tightly coupled interconnect and software environment. Regardless of your application requirements, the Cray XC series scales across the performance spectrum, from smaller-footprint, lower-density configurations up to the world's largest and highest-performing supercomputers.

Extreme Scalability and Sustained Performance

Cray has an established reputation for regularly running the biggest jobs on the largest numbers of nodes in the HPC industry. With the Cray XC series there is even more focus on solving extreme capability computational challenges. Designed to avoid the limitations of commodity assemblies, the Cray XC40 system scales hardware, networking and software across a broad throughput spectrum to deliver sustained, real-world production performance. Users can now model bigger datasets and simulate massive systems that previously had to be partitioned into numerous smaller modules.

Aries Interconnect and Dragonfly Topology

To provide this breakthrough performance and scalability, Cray XC series supercomputers integrate the HPC-optimized Aries interconnect. This interconnect technology, implemented with a high-bandwidth, low-diameter network topology called Dragonfly, provides substantial improvements on all the key network performance metrics for HPC: bandwidth, latency, message rate and more. Delivering unprecedented global bandwidth scalability at reasonable cost across a distributed-memory system, the network gives programmers global access to all of the memory of parallel applications and supports the most demanding global communication patterns. The Dragonfly network topology is constructed from a configurable mix of backplane, copper and optical links, providing scalable global bandwidth and avoiding expensive external switches. Adaptive supercomputing means a modular structure that lets users manage entry costs and enables easy in-place upgrades as bandwidth requirements grow.

The Aries ASIC provides the network interconnect for the compute nodes on the Cray XC40 system base blades and implements a standard PCI Express Gen3 host interface, enabling connectivity to a wide range of HPC processing compute engines. The universal nature of the Cray XC series open architecture allows the system to be configured with the best available devices today, and then augmented or upgraded in the future with the user's choice of processors or coprocessors via processor daughter cards (PDCs), each with its own independent capabilities and development schedule. The Cray XC series architecture implements two processor engines per compute node and four compute nodes per blade. Compute blades stack in eight pairs (16 to a chassis), and each cabinet can be populated with up to three chassis, for 384 sockets per cabinet. Following the Intel Xeon processor roadmap and assuming 16 cores per processor, a cabinet can hold up to 6,144 cores, enabling up to 226 teraflops per cabinet, with upgrades tracking Intel's advances in clock frequency and core count. The open architecture of the Cray XC40 system empowers users to accelerate performance and reduce power consumption by running hybrid applications leveraging GPUs or coprocessors. Adaptive supercomputing means power-efficient performance and customer choice.

The Cray XC Series and Intel Xeon Processors

The Cray XC series marks the first time Cray is using the industry-leading 64-bit Intel Xeon processor E5 family in a high-end supercomputer line. With these Intel processors, Cray XC40 systems can scale in excess of 1 million cores.
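
The packaging arithmetic above is easy to verify. Below is a minimal Python sketch that reproduces the per-cabinet socket and core counts from the figures quoted in this section; the per-core peak at the end is derived from the quoted 226 TF cabinet figure, not a published Cray specification.

```python
# Per-cabinet packaging figures quoted in this section.
sockets_per_node = 2       # two processor engines per compute node
nodes_per_blade = 4        # four compute nodes per blade
blades_per_chassis = 16    # compute blades stack 16 to a chassis
chassis_per_cabinet = 3    # up to three chassis per cabinet
cores_per_socket = 16      # the datasheet's working assumption

sockets_per_cabinet = (sockets_per_node * nodes_per_blade *
                       blades_per_chassis * chassis_per_cabinet)
cores_per_cabinet = sockets_per_cabinet * cores_per_socket

print(sockets_per_cabinet)    # 384, matching the text
print(cores_per_cabinet)      # 6144, matching the text

# Implied per-core peak if a full cabinet reaches 226 TF (derived, in GF):
print(round(226e12 / cores_per_cabinet / 1e9, 1))   # ~36.8 GF per core
```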

Custom or ISV Jobs on the Same System: Extreme Scale and Cluster Compatibility

Rather than being boxed in by a restricted system architecture, the Cray XC series provides complete workload flexibility. Based on generations of experience with both environments, Cray has leveraged a single machine to run both highly scalable custom workloads and industry-standard ISV jobs via the powerful Cray Linux Environment (CLE). CLE enables a Cluster Compatibility Mode (CCM) to run out-of-the-box Linux/x86 versions of ISV software without any porting, recompiling or relinking. Alternatively, Cray's Extreme Scalability Mode (ESM) can be set to run in a performance-optimized scenario for custom codes. These flexible, optimized operation modes are dynamic and available to the user on a per-job basis. CLE has been optimized to take full advantage of the Aries interconnect and the Dragonfly topology without requiring user tuning. Adaptive supercomputing means supporting different techniques of code execution on the fly.

ROI, Upgradability and Investment Protection

Beyond the customizable configuration of the exact machine a user requires, the Cray XC40 supercomputer architecture is engineered for easy, flexible upgrades and expansion, a benefit that prolongs its productive lifetime and protects the user's investment. As new technology advancements become available, users can adopt these next-generation progressions deep into the life cycle before ever considering replacing an HPC system. Adaptive supercomputing means longevity.

Cray XC40 System Resiliency Features

The Aries interconnect is designed to scale to massive HPC systems in which failures are to be expected, but where it is imperative that applications run to successful completion in the presence of errors. Aries uses error-correcting code (ECC) to protect major memories and data paths within the device. The ECC, combined with the Aries adaptive routing hardware (which spreads data packets over the available lanes that comprise each of the Dragonfly links), provides improved system and application resiliency. In the event of a lane failure, the adaptive routing hardware automatically masks it out. Should all connectivity between two interconnects be lost, the HSS can automatically reconfigure the network to route around the bad links. Additionally, the Cray XC40 system features NodeKARE (Node Knowledge and Reconfiguration). If a user's program terminates abnormally, NodeKARE automatically runs diagnostics on all involved compute nodes and removes any unhealthy nodes from the compute pool. Subsequent jobs are allocated only to healthy nodes and run reliably to completion.

Innovative Cooling and Green Systems

Cray continues to advance its HPC cooling efficiency advantages, integrating a combination of vertical liquid coil units per compute cabinet and transverse air flow reused through the system. Fans in blower cabinets can be hot-swapped, and the system yields room-neutral air exhaust. Improve your TCO by reducing the number of chillers and air handlers, and eliminate the need for hot/cold aisles in your datacenter.
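
The lane-masking behavior described above can be illustrated with a toy model. The following Python sketch is purely conceptual: the real adaptive routing decisions are made in Aries hardware, and the lane count and round-robin policy shown here are illustrative assumptions, not Cray's actual algorithm.

```python
# Toy model of masking a failed lane out of a multi-lane Dragonfly link.
from itertools import cycle

class Link:
    """A link made of several lanes; traffic is spread over healthy lanes."""
    def __init__(self, num_lanes=3):            # lane count is illustrative
        self.healthy = [True] * num_lanes

    def mask_lane(self, lane):
        # Mask out a failed lane, as the adaptive routing hardware would.
        self.healthy[lane] = False

    def spread(self, packets):
        # Round-robin packets over the lanes that remain available.
        lanes = [i for i, ok in enumerate(self.healthy) if ok]
        if not lanes:
            raise RuntimeError("no healthy lanes: route around this link")
        return {pkt: lane for pkt, lane in zip(packets, cycle(lanes))}

link = Link()
link.mask_lane(1)                              # simulate a lane failure
print(link.spread(["p0", "p1", "p2", "p3"]))   # traffic continues on lanes 0 and 2
```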

The Cray XC40 Supercomputer Delivers
- Sustained and scalable application performance
- Production supercomputing
- Tight HPC optimization and integration
- Investment protection
- Upgradability by design
- User productivity

Cray Provides
- A robust software environment
- Integrated storage solutions
- A broad partner ecosystem

Cray XC Series Targets
- Environmental and Earth Sciences: climate, oceanography, weather
- Energy, Oil and Gas: seismic analysis, reservoir simulation
- Manufacturing and CAE: aerospace CFD simulation; automotive safety and crash analysis; noise, vibration and harshness simulation
- Materials
- Defense and National Security
- Life Sciences: bioinformatics, proteomics, structural biology

The HPC Challenge

Higher performance or lower power? More flexibility or lower cost? Increased capabilities or improved energy efficiency? Today's demanding HPC technology requirements are often in conflict with one another. The driving metrics in supercomputing are no longer purely peak performance and faster clock rates. Users now need to balance speed and power against other critical system capabilities: sustained real-world performance, economical power use, cooling efficiency, tight packaging and a robust partner ecosystem, as well as full integration of OS, software and programming environments. Business requirements are equally important, spanning upgradeability and return on investment, operating and capital equipment costs, and reliability and resiliency.

PDCs

Processor daughter card options, which provide the adaptive supercomputing flexibility of customer choice in processing engine technology, are described in detail in separate product briefs covering CPUs, accelerators and coprocessors.

Software Environment

Cray XC series software environment details are described in a product brief covering the Cray Linux Environment and Cray Programming Environment, including discussions of programming paradigms, compilers, debuggers, science libraries and workload managers, as well as a wide variety of software, middleware and ISV partner offerings.

Storage Systems

The Cray Sonexion data storage system brings together an integrated file system, software and storage offering designed specifically for a wide range of HPC workloads, providing users with an integrated, scalable and easy-to-install/maintain Lustre solution. Alternatively, the Cray Data Virtualization Service (DVS) allows various other file systems (including NFS, GPFS, Panasas PanFS and StorNext) to be projected to the compute and login nodes. Additional product briefs may be downloaded from Cray's website at www.cray.com/xc40

Cray XC40 Series Specifications

Processor: 64-bit Intel Xeon processor E5 family; up to 384 per cabinet
Memory: 64-256 GB per node; memory bandwidth up to 137 GB/s per node
Compute Cabinet: Up to 384 sockets per cabinet, upgradeable with processors of varying core counts; peak performance initially up to 226 TF per system cabinet
Interconnect: 1 Aries routing and communications ASIC per 4 compute nodes; 48 switch ports per Aries chip (500 GB/s switching capacity per chip); Dragonfly interconnect: low-latency, high-bandwidth topology
System Administration: Cray System Management Workstation (SMW); single-system view for system administration; system software rollback capability
Reliability Features (Hardware): Integrated Cray Hardware Supervisory System (HSS); independent, out-of-band management network; full ECC protection of all packet traffic in the Aries network; redundant power supplies and voltage regulator modules; redundant paths to all system RAID; hot-swap blowers, power supplies and compute blades; integrated pressure and temperature sensors
Reliability Features (Software): HSS monitors operation of all operating system kernels; Lustre file system object storage target failover and metadata server failover; software failover for critical system services including system database, system logger and batch subsystems; NodeKARE (Node Knowledge and Reconfiguration)
Operating System: Cray Linux Environment (includes SUSE Linux SLES11, HSS and SMW software); Extreme Scalability Mode (ESM) and Cluster Compatibility Mode (CCM)
Compilers, Libraries & Tools: Cray Compiler Environment, Intel compiler, PGI compiler, GNU compiler; support for the ISO Fortran standard (2008) including parallel programming using coarrays, C/C++ and UPC; MPI 3.0, Cray SHMEM, other standard MPI libraries using CCM; Cray Apprentice and CrayPAT performance tools; Intel Parallel Studio Development Suite (option)
Job Management: PBS Professional job management system; Moab Adaptive Computing Suite; SLURM (Simple Linux Unified Resource Manager)
External I/O Interface: InfiniBand, 40 and 10 Gigabit Ethernet, Fibre Channel (FC) and Ethernet
Disk Storage: Full line of FC-, SAS- and IB-based disk arrays with support for FC and SATA disk drives; Sonexion data storage system
Parallel File System: Lustre; Data Virtualization Service (DVS) allows support for NFS, external Lustre and other file systems
Power: 90 kW per compute cabinet, maximum configuration; support for 480 VAC and 400 VAC computer rooms; 6 kW per blower cabinet, 20 A at 480 VAC or 16 A at 400 VAC (three-phase, ground)
Cooling: Water cooled with forced transverse air flow; 6,900 cfm intake
Dimensions (Cabinets): H 80.25 x W 35.56 x D 62.00 in. (compute cabinet); H 80.25 x W 18.00 x D 42.00 in. (blower cabinet)
Weight (Operational): 3,450 lbs. per compute cabinet (liquid cooled); 243 lbs. per square foot floor loading; 750 lbs. per blower cabinet
Regulatory Compliance: EMC: FCC Part 15 Subpart B, CE Mark, CISPR 22 & 24, ICES-003, C-tick, VCCI; Safety: IEC 60950-1, TUV SUD America CB Report; Acoustic: ISO 7779, ISO 9296
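
Two efficiency figures can be derived from the specification table above; the short Python sketch below computes them. Both are derivations from the quoted specifications, not Cray-published ratings, and the per-node core count assumes the 16-core processors used as the working example earlier in this brief.

```python
# Derived ratios from the specification table (not published figures).
peak_gf_per_cabinet = 226e3      # 226 TF peak per compute cabinet, in GF
power_w_per_cabinet = 90e3       # 90 kW per compute cabinet, maximum config
mem_bw_gbs_per_node = 137        # up to 137 GB/s memory bandwidth per node
cores_per_node = 2 * 16          # two sockets of 16 cores (assumed)

print(round(peak_gf_per_cabinet / power_w_per_cabinet, 2))  # ~2.51 GF per watt
print(round(mem_bw_gbs_per_node / cores_per_node, 2))       # ~4.28 GB/s per core
```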

Cray Inc. 901 Fifth Avenue, Suite 1000 Seattle, WA 98164 Tel: 206.701.2000 Fax: 206.701.2500 www.cray.com © 2014 Cray Inc. All rights reserved. Specifications are subject to change without notice. Cray is a registered trademark of Cray Inc. All other trademarks mentioned herein are the properties of their respective owners. 20151210EMS