Intel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR InfiniBand




Intel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR InfiniBand. Appro International, Inc. Steve Lyness, Vice President, HPC Solutions Engineering, slyness@appro.com

Company Overview :: Corporate Snapshot. Computer systems manufacturer founded in 1991; moved from OEM to branded products and solutions in 2001. Headquarters in Milpitas, CA; regional office in Houston, TX; global presence via international resellers and partners; manufacturing and hardware R&D in Asia; 100+ employees worldwide. Leading developer of high performance servers, clusters and supercomputers, with a balanced architecture delivering a competitive edge. Shipped thousands of clusters and servers since 2001. Solid growth from $69.3M in 2006 to $86.5M in 2007 (*Source: IDC WW Quarterly PC Tracker). Target markets: Electronic Design Automation, Financial Services, Government / Defense, Manufacturing, Oil & Gas. SLIDE 2

Company Overview :: Why Appro High Performance Computing Expertise Designing Best in Class Supercomputers Leadership in Price/Performance Energy Efficient Solution Best System Scalability and Manageability High Availability Features SLIDE 3

Company Overview :: Our Customers SLIDE 4

Company Overview :: Solid Growth. +61% compound annual revenue growth: $33.2M (2005), $69.3M (2006), $86.5M (2007). SLIDE 5
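As a quick check on the headline figure, the compound annual growth rate implied by the 2005 and 2007 revenues works out to the quoted 61%:

\[
\mathrm{CAGR} = \left(\frac{86.5}{33.2}\right)^{1/2} - 1 \approx 0.61 = 61\%
\]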

HPC Experience :: Recent Major Installations. September 2006: NOAA cluster, 1,424 processor cores, 2.8TB system memory, 15 TFlops. November 2006: LLNL Atlas cluster, 9,216 processor cores, 18.4TB system memory, 44 TFlops. June 2007: LLNL Minos cluster, 6,912 processor cores, 13.8TB system memory, 33 TFlops. February 2008: D. E. Shaw Research cluster, 4,608 processor cores, 9.2TB system memory, 49 TFlops. SLIDE 6
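A quick sanity check on these configurations, assuming the listed memory is spread evenly across the cores, shows all four systems provisioned at roughly the same ratio:

\[
\frac{2.8\,\mathrm{TB}}{1{,}424} \approx \frac{18.4\,\mathrm{TB}}{9{,}216} \approx \frac{13.8\,\mathrm{TB}}{6{,}912} \approx \frac{9.2\,\mathrm{TB}}{4{,}608} \approx 2\,\mathrm{GB\ per\ core}
\]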

HPC Experience :: Current Major Projects. April - June 2008: TLCC clusters for LLNL, LANL, and SNL, 8 clusters, 426TFlops total. June 2008: U of Tsukuba cluster, 95TFlops total, quad-rail IB. July 2008: Renault F1 CFD cluster, 38TFlops total, dual-rail IB. SLIDE 7

HPC/Supercomputing Trends :: What are the pain points? Technology is driven by customers' needs, and needs are driven by the desire to reduce pain points. Clusters are still hard to use, and as clusters get bigger the problem grows exponentially. Power, cooling and floor space are major issues. Interconnect performance is weak. RAS is a growing problem. SLIDE 8

Intel Cluster Ready and Intel Multi-Core Processors. Appro International, Inc.

Intel Cluster Ready :: Solution Deployment. Diagram: Intel Cluster Ready Component Certification (Mellanox ConnectX) and Intel Cluster Ready Hardware Certification (Appro HW and SW) feed Intel Cluster Ready Platform Certification; ISV applications go through Intel Cluster Ready Application Registration; the platform solutions provider delivers a SKU that is a certified solution, the Appro Xtreme-X1 Supercomputer, to end users as a complete, certified solution from a specific platform integrator. SLIDE 10

Intel Software Tools for Building HPC Applications. Developer considerations: multi-threading, message passing, performance, compatibility, support, productivity, cross-platform. http://www.intel.com/software SLIDE 11

Benefits of Choosing an Intel Cluster Ready Certified System: in production faster; simplified selection of compatible systems and applications; faster and higher-confidence initial deployment; less downtime; ability to detect and correct problems soon after they occur; better support from ISVs and your system vendor; lower initial and ongoing costs; higher ongoing productivity. Demand an Intel Cluster Ready certified system for your next cluster. SLIDE 12

HPC/Supercomputing Trends :: New 45nm Processors. Quad-Core Intel Xeon processor 5400 series: technology rich, with the enhanced Intel Core microarchitecture, larger caches, new SSE4 instructions, and 45 nm high-k process technology; compelling IT benefits, with greater performance at a given frequency, higher frequencies, and greater energy efficiency. This is the 2nd-generation Intel quad-core; the Dual-Core Intel Xeon processor 5200 series is also available. SLIDE 13

Nehalem-Based System Architecture. Block diagram: two Nehalem processors linked by QPI with up to 25.6 GB/sec bandwidth per link, DMI to the ICH, and an I/O hub providing PCI Express* Gen 1 and 2; a functional system was demonstrated at IDF, Sept 2007. Key technologies: new 45nm Intel microarchitecture, new Intel QuickPath Interconnect, integrated memory controller, next-generation memory (DDR3), PCI Express Gen 2. Benefits: more application performance, improved energy efficiency, end-to-end HW assist, improved Virtualization Technology, a stable IT image, software compatibility, and live-migration compatibility with today's dual- and quad-core Intel Core processors. Extending today's leadership: production Q4 08, volume ramp 1H 09. All future products, dates, and figures are preliminary and are subject to change without any notice. SLIDE 14
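The 25.6 GB/sec figure is consistent with the usual QPI arithmetic, assuming the 6.4 GT/s transfer rate and 16-bit (2-byte) data path per direction quoted for the initial parts; neither number appears on the slide:

\[
6.4\,\mathrm{GT/s} \times 2\,\mathrm{B} \times 2\ \mathrm{directions} = 25.6\,\mathrm{GB/s\ per\ link}
\]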

Appro Xtreme-X Supercomputers. Appro International, Inc.

Xtreme-X Supercomputer :: Reliability by Design. Redundancy where required: disk drives (diskless architecture); fan trays (N+1 redundancy at the blade); power supplies (hot-swappable N+1 power supplies); network components (redundant InfiniBand and Ethernet networks); servers and management nodes (RAID6 disk storage arrays, management nodes as a redundant active/standby pair, sub-management nodes as a redundant active/active pair). Compute nodes are hot-swappable in sub-racks and are easily removed without removing any cabling from other nodes. SLIDE 16

Xtreme-X Supercomputer :: The Thermal Problem. A 10°C increase in ambient temperature produces a 50% reduction in the long-term reliability of electronic equipment. A 5°C reduction in temperature can triple the life expectancy of electronic equipment. MTBF and availability depend on cooling. Source: BCC, Inc. Report CB-185R. SLIDE 17
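The 50%-per-10°C figure matches the classic rule of thumb that failure rates roughly double for every 10°C rise in operating temperature. One minimal way to write that relationship, assuming a baseline failure rate \lambda_0 at a reference temperature T_0 (this model is background, not part of the cited report):

\[
\lambda(T) \approx \lambda_0 \cdot 2^{(T - T_0)/10\,^{\circ}\mathrm{C}}
\]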

Xtreme-X Supercomputer :: Current Datacenter Cooling Methodology. Hot aisle / cold aisle (top view). As much as 40% of the cold air flow bypasses the equipment. Since the air is not pushed directly into the equipment, the cold air from the chillers is mixed with the hot air, reducing the efficiency of cooling the datacenter equipment. SLIDE 18

Xtreme-X Supercomputer :: Directed Airflow Cooling Methodology (top view). Up to 30% improvement in density with lower power per equipment rack. Delivers cold air directly to the equipment for optimum rack cooling efficiency, and returns air at a comfortable temperature to the room for return to the chillers. The back-to-back rack configuration saves floor space in the datacenter and eliminates hot aisles. FRU replacement and maintenance are done from the front side of the blade rack cabinet. SLIDE 19

Xtreme-X Supercomputer :: Power/Cooling Efficiency & Density. Diagram: side views of the Xtreme-X Supercomputer back-to-back configuration, two racks of sub-racks with cold air from the chillers delivered under the floor. 30% improvement in space by eliminating hot aisles and using a back-to-back rack configuration. SLIDE 20

Xtreme-X Supercomputer :: Based on Scalable Units. A new way to approach building supercomputing Linux clusters: multiple clusters of various sizes can be built and deployed into production very rapidly. The SU concept is very flexible, accommodating clusters from very small node counts to thousands of nodes. This new approach yields many benefits that dramatically lower total cost of ownership. Awarded Supercomputing Online 2007 Product of the Year. SLIDE 21

Appro Cluster Engine. Appro International, Inc.

Appro Cluster Engine :: ACE: Scalable SW Architecture. Issues with today's cluster environments: changing operating systems easily and quickly, running jobs on the cluster efficiently, monitoring nodes for health, and simple cluster management (e.g., IP address allocation). Cluster management software turns a collection of servers into a functional, usable, reliable, available computing system. SLIDE 23
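For illustration only, and not the ACE tooling itself: a minimal Python sketch of two of the chores listed above, allocating management-network IP addresses to a node list and pinging each node as a coarse health check. The node names, the 10.1.0.0/24 subnet, and the ping-based probe are all assumptions made for the example.

```python
#!/usr/bin/env python3
"""Illustrative sketch only: NOT Appro Cluster Engine.

Shows two of the chores the slide lists -- IP address allocation and node
health monitoring -- in the simplest possible form. The node names, the
10.1.0.0/24 management subnet, and the ping-based probe are assumptions.
"""
import ipaddress
import subprocess


def allocate_ips(nodes, subnet="10.1.0.0/24", first_host=10):
    """Map node names to consecutive addresses in the management subnet."""
    hosts = list(ipaddress.ip_network(subnet).hosts())
    return {node: str(hosts[first_host + i]) for i, node in enumerate(nodes)}


def node_is_healthy(ip, timeout_s=1):
    """Very coarse health check: a single ICMP echo request."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    nodes = [f"compute{n:03d}" for n in range(1, 5)]  # hypothetical node names
    for node, ip in allocate_ips(nodes).items():
        status = "up" if node_is_healthy(ip) else "DOWN"
        print(f"{node:12s} {ip:15s} {status}")
```

A real cluster manager does this against its own inventory database and out-of-band management network; the sketch only conveys the shape of the task.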

Appro Cluster Engine :: Total Management Package covering network, servers, host, scheduling, and RAS. Two-tier management architecture: scalable to large numbers of servers, minimal overhead, complete remote lights-out control. Redundancy provides both high availability and high performance: redundant management servers, redundant networks, active-active operation supported. SLIDE 24

Appro Cluster Engine :: ACE: Scalable SW Architecture. Network diagram: management nodes connect to the external network through a firewall/router and to a GbE management network; I/O nodes and compute server groups sit on GbE/10GbE operation networks, with 2x GbE per compute node; an InfiniBand network (4X IB) carries computing traffic; the global/parallel file system is provided by storage servers and storage controllers attached over FC or GbE via servers or a bridge. SLIDE 25

Cluster Management :: Xtreme-X Architecture Based on ACE. Scalable hierarchical architecture with uniform boot and response times. Cached root file system support for higher MTBF and higher bandwidth. Diskless operation without a light-weight OS: a full Linux OS. Redundant networks: GbE and IB are both dual-rail. Run-through fault tolerance: in both networks, switches and cables can fail without interrupting programs. SLIDE 26

Appro Cluster Engine Management SLIDE 27

Appro Cluster Engine Management SLIDE 28

Appro Cluster Engine Management SLIDE 29

Appro's First Experience with Mellanox QDR InfiniBand. Appro International, Inc.

QDR InfiniBand Demonstration :: with Intel Harpertown Processors. Appro, Mellanox, and Intel demonstration at IS08: an ANSYS Fluent aircraft structure simulation running on 16 dual-socket Intel Harpertown compute nodes over a Mellanox QDR InfiniBand fabric. See it in Booth 36 here at the show. SLIDE 31

Mellanox QDR InfiniBand with Adaptive Routing. Appro International, Inc.

Solving the Interconnect Problem :: Mellanox QDR InfiniBand Components. Mellanox ConnectX Quad Data Rate InfiniBand HCA cards: CX4 and QSFP connectors, 2m reach with 30ga copper cables, full QDR wire speed (40Gb/s adapter), GA now. Mellanox 36-port Quad Data Rate InfiniBand switches (1U, QSFP connectors): scalability of 36 4X, 12 8X, or 12 12X ports; bandwidth covering IB SDR, DDR, and QDR, with 4X up to 40Gb/s and 12X up to 120Gb/s. SLIDE 33
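The per-port numbers follow directly from the lane arithmetic; the 8b/10b encoding overhead noted below is standard InfiniBand background rather than something stated on the slide:

\[
4X\ \mathrm{QDR}: 4 \times 10\,\mathrm{Gb/s} = 40\,\mathrm{Gb/s}, \qquad 12X\ \mathrm{QDR}: 12 \times 10\,\mathrm{Gb/s} = 120\,\mathrm{Gb/s}
\]
\[
\text{usable 4X data rate (8b/10b)}: 40\,\mathrm{Gb/s} \times \tfrac{8}{10} = 32\,\mathrm{Gb/s} \approx 4\,\mathrm{GB/s\ per\ direction}
\]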

Solving the Interconnect Problem :: Mesh and Torus Topologies for Large Systems. Diagram: dual-rail QDR with 4X links to nodes and 12X rails; 8GB/sec between compute nodes, 48GB/sec between switch nodes. SLIDE 34
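One way to read these figures, assuming the 8b/10b data rates above and that the slide counts both rails (and, for the switch-to-switch links, both directions):

\[
2\ \mathrm{rails} \times 4\,\mathrm{GB/s}\ (4X\ \mathrm{QDR}) = 8\,\mathrm{GB/s\ between\ compute\ nodes}
\]
\[
2\ \mathrm{rails} \times 12\,\mathrm{GB/s}\ (12X\ \mathrm{QDR}) \times 2\ \mathrm{directions} = 48\,\mathrm{GB/s\ between\ switch\ nodes}
\]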

Appro Xtreme-X Computer :: Scalable Unit Using Mellanox QDR InfiniBand. Scalable unit with 64 compute nodes and 4 switch nodes per cabinet: 6.1TF peak performance with 3GHz Nehalem processors (128 processors, 512 cores), 3TB memory per rack. Dual-rail QDR IB interconnect: redundant interconnect, 8GB/sec bandwidth for 16-node / 128-core groups, 48GB/sec bandwidth between groups. A 197TF system requires 32 racks: 32 racks of compute nodes in 2 back-to-back rows of 16 racks, 256 36-port QDR switches, all short cables (2m), redundant cooling and power, 28.8KW per compute rack. SLIDE 35
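The peak figures are consistent, assuming the usual 4 double-precision floating-point operations per clock per Nehalem core (an assumption; the flops-per-clock figure is not on the slide):

\[
512\ \mathrm{cores} \times 3\,\mathrm{GHz} \times 4\ \mathrm{flops/clock} \approx 6.1\,\mathrm{TF\ per\ rack}, \qquad 32\ \mathrm{racks} \times 6.1\,\mathrm{TF} \approx 197\,\mathrm{TF}
\]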