Smarter Cluster Supercomputing from the Supercomputer Experts



Maximize Your Productivity with Flexible, High-Performance Cray CS400 Cluster Supercomputers

In science and business, as soon as one question is answered another is waiting. And with so much depending on fast, accurate answers to complex problems, you need reliable high performance computing (HPC) tools matched to your specific tasks. Understanding that time is critical and that not all HPC problems are created equal, we developed the Cray CS400 cluster supercomputer series. These systems are industry standards-based, highly customizable, easy to manage, and purposefully designed to handle the broadest range of medium- to large-scale simulations and data-intensive workloads.

All CS400 components have been carefully selected, optimized and integrated to create a powerful, reliable high-performance compute environment capable of scaling to over 11,000 compute nodes and 40 peak petaflops. Flexible node configurations featuring the latest processor and interconnect technologies mean you can get to the solution faster by tailoring a system to your specific HPC application needs. Innovations in packaging, power, cooling and density translate to superior energy efficiency and compelling price/performance. Expertly engineered system management software instantly boosts your productivity by simplifying system administration and maintenance, even for very large systems.

Cray has long been a leader in delivering tightly integrated supercomputer systems for large-scale deployments. With the CS400 system, you get that same Cray expertise and productivity in a flexible, standards-based and easy-to-manage cluster supercomputer.

CS400-AC Cluster Supercomputer: Air-Cooled and Designed for Your Workload

The CS400-AC system is our air-cooled cluster supercomputer. Designed to offer the widest possible choice of configurations, it is a modular, highly scalable platform based on the latest x86 processing, coprocessing and accelerator technologies from Intel and NVIDIA. Industry standards-based server nodes and components have been optimized for HPC workloads and paired with a comprehensive HPC software stack, creating a unified system that excels at capacity- and data-intensive workloads. Regardless of your application requirements, the CS400 series scales across the performance spectrum, from smaller configurations up to the world's largest and highest-performing supercomputers.

Choice of Flexible, Scalable Configurations

Flexibility is at the heart of the Cray CS400-AC system design. At the system level, the CS400-AC cluster is built using blades or rackmount servers. The Cray GreenBlade platform comprises server blades aggregated into chassis and is designed to provide mix-and-match building blocks for easy, flexible configuration at the node, chassis and whole-system level. Among its advantages, the GreenBlade platform offers extremely high density (up to 80 blades per 42U rack), excellent memory capacity (up to 512 GB per blade node), a choice of coprocessors and accelerators, local storage, and a built-in management module in each chassis for industry-leading reliability.

Cray rackmount servers offer maximum configurability in an industry-standard package optimized for HPC by Cray. Rackmount servers can be configured with large memories, up to eight disk drives, or coprocessors as required for your specific needs. Both GreenBlade and rackmount platforms feature the latest Intel Xeon processors with available support for Intel Xeon Phi coprocessors and NVIDIA Tesla GPU accelerators.

The CS400-AC system also offers multiple interconnect topology options, local storage, and network-attached file system options. It can integrate with Lustre-based global parallel storage systems including Cray Cluster Connect, Cray Tiered Adaptive Storage (TAS) and Cray Sonexion scale-out storage. With these blade and rackmount platforms, a Cray CS400-AC system can be tailored to multiple purposes: from an all-purpose massively parallel HPC cluster, to one suited for shared memory parallel tasks, to a cluster optimized for hybrid compute and data-intensive workloads.

At the node functionality level, the cluster has two types of server nodes: compute and service. Compute nodes run parallel MPI and/or OpenMP tasks with maximum efficiency (a minimal sketch of such a hybrid task appears after the configuration options list below). Service nodes provide I/O connectivity and can also function as login nodes. The configuration options available for GreenBlade and rackmount servers are described on their respective datasheets. With industry-standard components throughout, each configuration can be replicated over and over to create a reliable and powerful large-scale system. Additionally, you have the option of fitting a CS400-AC system with liquid-cooled rack rear door heat exchangers and chillers for more energy and cost savings.

CS400-AC Hardware Configuration Options
- Cray GreenBlade or rackmount servers
- Two- or four-socket x86 Intel Xeon processors
- Intel Xeon Phi coprocessors or NVIDIA Tesla GPU accelerators
- Large memory per node
- Multiple interconnect options: 3D torus/fat tree, single/dual rail, QDR/FDR InfiniBand
- Local hard drives in each server
- Choice of network-attached file systems and parallel file systems
- Server management options
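To make the compute-node workload model concrete, here is a minimal hybrid MPI plus OpenMP sketch in C of the kind of task a compute node runs: MPI ranks span nodes while OpenMP threads fill the cores within each rank. This is an illustrative example under our own naming, not Cray-supplied code; any MPI library in the stack described below (Open MPI, MVAPICH2, Intel MPI or IBM Platform MPI) should build it with its mpicc wrapper.

    /* hybrid_hello.c - minimal hybrid MPI + OpenMP task (illustrative sketch).
     * One MPI rank typically maps to a node or socket; OpenMP threads
     * occupy the cores within it. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank, nranks;

        /* Request MPI_THREAD_FUNNELED: only the main thread makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        #pragma omp parallel
        {
            /* A real workload replaces this print with computation. */
            printf("rank %d of %d, thread %d of %d\n",
                   rank, nranks, omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }

Compiled with, for example, mpicc -fopenmp hybrid_hello.c -o hybrid_hello and launched through whichever scheduler the site runs (SLURM, PBS Professional, LSF or Grid Engine, per the software stack below), the program prints one line per rank/thread pair, confirming the rank and thread placement the cluster configuration provides.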

Easy, Comprehensive Manageability

A flexible system is only as good as your ability to use it. The Cray CS400-AC cluster supercomputer offers two key productivity-boosting tools: a customizable HPC cluster software stack and the Cray Advanced Cluster Engine (ACE) system management software.

Cray HPC Cluster Software Stack

The HPC cluster software stack consists of a range of software tools compatible with most open source and commercial compilers, debuggers, schedulers and libraries. Also available as part of the software stack is the Cray Programming Environment, which includes the Cray Compiling Environment, Cray Scientific and Math Libraries, and Performance Measurement and Analysis Tools.

HPC Programming Tools
- Development & Performance Tools: Cray PE on CS, Intel Parallel Studio XE Cluster Edition, PGI Cluster Development Kit, GNU Toolchain, NVIDIA CUDA
- Application Libraries: Cray LibSci, LibSci_ACC, Intel MPI, IBM Platform MPI, MVAPICH2, OpenMPI
- Debuggers: Rogue Wave TotalView, Allinea DDT and MAP, Intel IDB, PGI PGDBG, GNU GDB

Schedulers, File Systems and Management
- Resource Management / Job Scheduling: SLURM, Adaptive Computing Moab, Maui and Torque, Altair PBS Professional, IBM Platform LSF, Grid Engine
- File Systems: Lustre, NFS, GPFS, Panasas PanFS, local (ext3, ext4, XFS)
- Cluster Management: Cray Advanced Cluster Engine (ACE) Management Software

Operating Systems and Drivers
- Operating Systems: Linux (Red Hat, CentOS)
- Drivers & Network Management: accelerator software stack and drivers, OFED

Cray Advanced Cluster Engine (ACE): Hierarchical, Scalable Framework for Management, Monitoring and File Access

Management
- Hierarchical management infrastructure
- Divides the cluster into multiple logical partitions, each with a unique personality
- Revision system with rollback
- Remote management and remote power control
- GUI and CLI to view, change and control the system and monitor health; plug-in capability
- Automatic server/network discovery
- Scalable, fast, diskless booting
- High availability, redundancy, failover
- Cluster event data available in real time without affecting job performance

Monitoring
- Node and InfiniBand network status
- BIOS and HCA information
- Disk, memory and PCIe errors
- Temperatures and fan speeds
- Load averages
- Memory and swap usage
- Sub-rack and node power
- I/O status

File Access
- RootFS: high-speed, cached access to the root file system, allowing for scalable booting
- High-speed network access to external storage
- ACE-managed, high-availability NFS storage

The Advanced Cluster Engine (ACE) management software simplifies cluster management for large scale-out environments with extremely scalable network, server, cluster and storage management capabilities. Command line interface (CLI) and graphical user interface (GUI) options provide flexibility for the cluster administrator. The easy-to-use ACE GUI connects directly to the ACE daemon on the management server and can be executed on a remote system. With ACE, a large system is almost as easy to understand and manage as a workstation.

ACE at a Glance
- Simplifies compute, network and storage management
- Supports multiple network topologies and diskless configurations with optional local storage
- Provides network failover with high scalability
- Integrates easily with standards-based HPC software stack components
- Manages heterogeneous nodes with different software stacks
- Monitors node and network health, power and component temperatures

Built-in Energy Efficiencies and Reliability Features Lower Your TCO

Energy efficiency features built into the CS400-AC system, combined with our long-standing expertise in meeting the reliability demands of very large, high-usage deployments, mean you get more work done for less. The CS400-AC options for additional energy and cost savings include high-efficiency load-balancing power supplies, liquid-cooled rack rear door heat exchangers and chillers, and a 480V power distribution unit with a choice of 208V or 277V three-phase power supplies. This means you can use industry-standard 208V and 230V power as well as 277V (a single phase of a 480V three-phase input) and reduce the losses caused by step-down transformers and resistance as power is delivered from the wall directly to the rack.

Reliability is built into the system design, starting with our careful selection of boards and components. Multiple levels of redundancy and fault tolerance then ensure the system meets your uptime needs. The CS400-AC cluster has redundant power, cooling, management servers and networks, all with failover capabilities.

Intel Xeon Processor E5-2600 Product Family

The Intel Xeon processor is at the heart of the agile, efficient datacenter. Built on Intel's industry-leading microarchitecture with 22nm 3D Tri-Gate transistor technology, the Intel Xeon processor supports high-speed DDR4 memory with increased bandwidth, larger density and lower voltage than previous generations. Its support for PCI Express (PCIe) 3.0 ports improves I/O bandwidth, offering extra capacity and flexibility for storage and networking connections. The processor delivers energy efficiency and performance that adapt to the most complex and demanding workloads.

Intel Xeon Phi Coprocessor for Parallel Workloads

The Intel Xeon Phi coprocessor x100 series is based on the Intel Many Integrated Core (Intel MIC) architecture and works synergistically with the Intel Xeon processor to increase developer productivity via common programming models and tools. It enables dramatic performance gains for demanding applications, delivering over 1 teraflop of peak double-precision performance. Compared with multicore processors, it offers many more cores with wider vector processing units for greater floating point performance per watt. The Intel Xeon Phi coprocessor is highly parallel and programmable, is based on open standards, and supports data, thread and process parallelism with full support from Intel Cluster Studio XE, delivering outstanding aggregate performance and higher memory bandwidth.

NVIDIA Tesla K40 GPU Computing Accelerator

GPU-accelerated computing offers unprecedented application performance by offloading the compute-intensive portions of an application to the GPU. The NVIDIA Tesla K40 GPU accelerator features 2,880 cores and the industry's highest single- and double-precision peak floating point performance: 4.29 teraflops and 1.43 teraflops, respectively. Equipped with 12 gigabytes of GPU accelerator memory, the NVIDIA Tesla K40 processes datasets twice as large as prior-generation GPUs, making it well suited for big data analytics and large-scale scientific computations. It outperforms CPUs by up to 10 times and delivers additional performance with its GPU Boost feature, converting power headroom into user-controlled performance gains. A minimal sketch of this offload model follows.
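As a sketch of the offload model described above, the following CUDA C fragment (CUDA C/C++ appears in the software stack listed in the specifications) moves a simple vector operation onto a Tesla GPU such as the K40. The saxpy kernel, array names and sizes are our own illustrative choices, not Cray or NVIDIA reference code.

    /* saxpy.cu - illustrative CUDA C sketch of offloading a loop to the GPU. */
    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  /* one element per thread */
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    int main(void)
    {
        const int n = 1 << 20;                /* ~1M elements */
        size_t bytes = n * sizeof(float);
        float *x, *y;
        float *hx = (float *)malloc(bytes), *hy = (float *)malloc(bytes);
        for (int i = 0; i < n; i++) { hx[i] = 1.0f; hy[i] = 2.0f; }

        /* Stage data in GPU accelerator memory, run the kernel, copy back. */
        cudaMalloc(&x, bytes);
        cudaMalloc(&y, bytes);
        cudaMemcpy(x, hx, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(y, hy, bytes, cudaMemcpyHostToDevice);

        /* Launch enough 256-thread blocks to cover all n elements. */
        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
        cudaMemcpy(hy, y, bytes, cudaMemcpyDeviceToHost);

        printf("y[0] = %f (expect 4.0)\n", hy[0]);
        cudaFree(x); cudaFree(y); free(hx); free(hy);
        return 0;
    }

Compiled with nvcc from the NVIDIA CUDA toolkit, the compute-intensive loop runs across thousands of GPU threads while the host CPU orchestrates data movement: the division of labor the K40 accelerator option is designed for.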

Cray CS400-AC Specifications

Architecture
- Air cooled, up to 80 nodes per rack cabinet

Processor, Coprocessor and Accelerators
- Support for up to 12-core, 64-bit Intel Xeon processor E5-2600 v3
- Optional support for Intel Xeon Phi coprocessors and NVIDIA Tesla GPU computing accelerators

Memory
- Up to 512 GB DDR4 RAM in GreenBlade systems and up to 1,536 GB in rackmount servers

Interconnect and Networks
- Available FDR InfiniBand with Connect-IB or QDR True Scale host channel adapters
- Options for single- or dual-rail fat tree or 3D torus
- 1 GbE and 10 GbE Ethernet for management
- Redundant networks (InfiniBand, GbE and 10 GbE) with failover

System Administration
- Advanced Cluster Engine (ACE)
- Complete remote management capability
- GUI and command line system administration
- System software version rollback capability
- Redundant management servers with automatic failover
- Automatic discovery and status reporting of interconnect, server and storage hardware
- Cluster partitioning into multiple logical clusters, each capable of hosting a unique software stack
- Remote server control (power on/off, cycle) and remote server initialization (reset, reboot, shutdown)
- Scalable, fast diskless booting for large node counts, with root file systems for diskless nodes
- Multiple global storage configurations

Reliability, Availability, Serviceability (RAS)
- Redundant power, cooling and management servers with failover capabilities
- All critical components easily accessible

Resource Management and Job Scheduling
- Options for SLURM, Altair PBS Professional, IBM Platform LSF, Adaptive Computing Torque, Maui and Moab, and Grid Engine

File System
- Cray Cluster Connect, Cray Sonexion, NFS, local FS (ext3, ext4, XFS), Lustre, GPFS and Panasas PanFS available as global file systems

Disk Storage
- Full line of FC-attached disk arrays with support for FC and SATA disk drives and SSDs

Operating System
- Red Hat, SUSE or CentOS available on compute nodes
- ACE management servers delivered with Red Hat Linux

Performance Monitoring Tools
- Open source packages such as HPCC, Perfctr, IOR, PAPI/IPM, netperf

Compilers, Libraries and Tools
- Options for Open MPI, MVAPICH2, Intel MPI and IBM Platform MPI libraries
- Cray Programming Environment on Cluster Systems (Cray PE on CS), PGI, Intel Cluster Toolkit, NVIDIA CUDA, CUDA C/C++, OpenCL, DirectCompute toolkits, GNU, DDT, TotalView, OFED programming tools and many others

Power
- Power supplies deliver up to 38 kW per cabinet, with actual consumption based on configuration
- Optional 480V power distribution with a choice of 208V or 277V three-phase power supplies

Cooling Features
- Air cooled; airflow up to 3,000 cfm in the densest configuration; intake: front; exhaust: back
- Optional passive or active chilled rear door heat exchangers

Cabinet Dimensions (H x W x D)
- 42U/19 in. standard rack cabinet: 78.39 in. (1,991 mm) x 23.62 in. (600 mm) x 47.24 in. (1,200 mm)

Cabinet Weight
- 42U/19 in.: up to 1,856.3 lbs; 232 lbs/sq. ft. per cabinet

Support and Services
- Turnkey installation services with worldwide support and service options

Cray Inc., 901 Fifth Avenue, Suite 1000, Seattle, WA 98164
Tel: 206.701.2000  Fax: 206.701.2500  www.cray.com

© 2014 Cray Inc. All rights reserved. Specifications are subject to change without notice. Cray is a registered trademark of Cray Inc. All other trademarks mentioned herein are the properties of their respective owners. 20150419EMS