Paving The Road to Exascale Computing - HPC Technology Update


Todd Wilde, Director of Technical Computing and HPC

Mellanox HPC Technology Accelerates the TOP500
- InfiniBand has become the de-facto interconnect solution for High Performance Computing
- InfiniBand is the most used interconnect on the TOP500 list, with 224 systems
- FDR InfiniBand-connected systems doubled to 45 since the June 2012 list
- Mellanox InfiniBand is the interconnect of choice for Petascale systems

Comprehensive End-to-End 10/40/56Gb/s Ethernet and 56Gb/s InfiniBand Portfolio
- ICs, adapter cards, switches/gateways, software, cables
- Scalability, reliability, power, performance

Mellanox InfiniBand Paves the Road to Exascale
- PetaFlop systems: Mellanox Connected

The Mellanox Advantage
- Host/Fabric Software Management: UFM, MLNX-OS, integration with job schedulers, inbox drivers from major distributions
- Application Accelerations: Collectives Accelerations (FCA/CORE-Direct), GPU Accelerations (RDMA for GPUDirect), MPI/SHMEM/PGAS, RDMA
- Networking Efficiency/Scalability: Quality of Service, Dynamically Connected Transport, Adaptive Routing, Congestion Management
- Server and Storage High-Speed Connectivity: latency, bandwidth, CPU utilization, message rate
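All of the acceleration layers above ultimately sit on RDMA-capable hardware exposed through the verbs interface. As a minimal sketch of touching that layer (my illustration, not part of the deck, assuming libibverbs is installed and at least one HCA is present), the program below opens the first RDMA device found and prints a few of the limits it reports.

    #include <stdio.h>
    #include <infiniband/verbs.h>

    /* Open the first RDMA device and print a few capability limits that the
     * upper layers (MPI, SHMEM, storage) build on.
     */
    int main(void) {
        int num = 0;
        struct ibv_device **devs = ibv_get_device_list(&num);
        if (!devs || num == 0) { fprintf(stderr, "no RDMA devices found\n"); return 1; }

        struct ibv_context *ctx = ibv_open_device(devs[0]);   /* first device (assumption) */
        struct ibv_device_attr attr;
        if (ctx && ibv_query_device(ctx, &attr) == 0) {
            printf("device: %s\n", ibv_get_device_name(devs[0]));
            printf("  max QPs:            %d\n", attr.max_qp);
            printf("  max memory regions: %d\n", attr.max_mr);
            printf("  max CQ entries:     %d\n", attr.max_cqe);
        }

        if (ctx) ibv_close_device(ctx);
        ibv_free_device_list(devs);
        return 0;
    }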

FDR InfiniBand New Features and Capabilities
- Performance / Scalability: >100Gb/s bandwidth, <0.7usec latency, PCI Express 3.0, InfiniBand routing and IB-Ethernet bridging
- Reliability / Efficiency: Unbreakable Link Technology (Forward Error Correction, link quality auto-sensing, link-level retransmission), 64/66 link bit encoding, lower power consumption

Virtual Protocol Interconnect (VPI) Technology
- One fabric for networking, storage, clustering and management applications, with acceleration engines, under the Unified Fabric Manager (diagram: applications and OS layer over VPI adapters and VPI switches)
- Ethernet: 10/40 Gb/s; InfiniBand: 10/20/40/56 Gb/s
- Switch configurations: 64 ports 10GbE, 36 ports 40GbE, 48 10GbE + 12 40GbE, 36 ports IB up to 56Gb/s, 8 VPI subnets
- Adapter form factors: LOM, adapter card, mezzanine card
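To see why the move from 8b/10b encoding (QDR and earlier) to 64/66 link bit encoding matters, the back-of-the-envelope sketch below computes effective 4x data rates from the per-lane signaling rates. The 10 Gb/s QDR and 14.0625 Gb/s FDR lane rates are the standard published figures; the code is only an illustrative check added here, not taken from the slides, and it ignores packet and protocol overhead.

    #include <stdio.h>

    /* Back-of-the-envelope comparison of 4x InfiniBand link data rates.
     * Only link encoding overhead is modeled; real throughput also depends
     * on packet headers, PCIe, and the application.
     */
    int main(void) {
        const double qdr_lane_gbps = 10.0;     /* QDR signaling rate per lane */
        const double fdr_lane_gbps = 14.0625;  /* FDR signaling rate per lane */
        const int lanes = 4;                   /* standard 4x port            */

        double qdr_data = qdr_lane_gbps * lanes * 8.0 / 10.0;   /* 8b/10b encoding */
        double fdr_data = fdr_lane_gbps * lanes * 64.0 / 66.0;  /* 64/66b encoding */

        printf("QDR 4x effective data rate: %.1f Gb/s\n", qdr_data);  /* ~32.0 */
        printf("FDR 4x effective data rate: %.1f Gb/s\n", fdr_data);  /* ~54.5 */
        return 0;
    }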

FDR Application Benchmarks
- Computational Fluid Dynamics, Oil and Gas Reservoir Simulation, Molecular Dynamics, Weather and Earth Sciences
- Up to 32% ROI on equipment and operating costs in FDR/QDR InfiniBand comparisons
- Linpack efficiency derived from the June 2012 TOP500 list, with the highest and lowest outlier removed from each group

Roadmap of Interconnect Innovations
- InfiniHost (2002): world's first InfiniBand HCA; 10Gb/s InfiniBand, PCI-X host interface, 1 million msg/sec
- InfiniHost III (2005): world's first PCIe InfiniBand HCA; 20Gb/s InfiniBand, PCIe 1.0, 2 million msg/sec
- ConnectX (1,2,3) (2008-11): world's first Virtual Protocol Interconnect (VPI) adapter; 40/56Gb/s InfiniBand, PCIe 2.0/3.0 x8, 33 million msg/sec
- Connect-IB: built from the ground up for the Exascale foundation; >100 Gb/s InfiniBand, PCIe 3.0 x16, >135 million msg/sec

Connect-IB Performance Highlights
- World's first 100Gb/s InfiniBand interconnect adapter
- PCIe 3.0 x16, dual FDR 56Gb/s InfiniBand ports to provide >100Gb/s
- Highest InfiniBand message rate: 130 million messages per second, 4X higher than other InfiniBand solutions
- "Enter the World of Boundless Performance"

Connect-IB Scalability Features
- Inter-node scalability (scale out): a new, innovative transport, the Dynamically Connected Transport service
- The new transport service combines the best of Unreliable Datagram (UD), which reserves no resources, and the Reliable Connected service, which provides transport reliability
- Scales out to unlimited cluster sizes of compute and storage
- Eliminates overhead and reduces memory footprint (a rough illustration follows after this section)
- "Enter the World of Unlimited Scalability"

Connect-IB Virtualization Enhancements
- Scale-out virtualization solution: full dual-port virtual HCA support for each guest VM
- SR-IOV with 256 Virtual Functions (VFs), 2X higher than previous solutions
- 1K egress QoS levels: guaranteed Quality of Service for VMs
- IB virtualization: embedded virtual IB leaf switch, LID/GID-based forwarding, up to 1K virtual ports (diagram: per-VM HCA drivers mapped to VFs on the physical HCA through the hypervisor)
- "Enter the World of Hardware Virtualization"
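Referring back to the Dynamically Connected Transport slide above, here is a minimal back-of-the-envelope sketch (my illustration, not from the deck) of the memory-footprint argument: with classic Reliable Connected transport, a fully connected job needs roughly one queue pair per remote peer on every process, while a DC-style transport keeps a small, fixed pool of connection objects per process. The job sizes and the pool size of 16 below are arbitrary illustrative assumptions.

    #include <stdio.h>

    /* Illustrative comparison of per-process connection state:
     * RC: one queue pair (QP) per remote peer in an all-to-all pattern
     * DC: a small fixed pool of DC objects, independent of job size
     * The pool size of 16 is an arbitrary assumption for illustration.
     */
    int main(void) {
        const long dc_objects_per_process = 16;          /* assumed fixed pool  */
        long job_sizes[] = {1024, 16384, 262144};        /* total processes     */

        for (int i = 0; i < 3; i++) {
            long p = job_sizes[i];
            long rc_qps_per_process = p - 1;             /* one RC QP per peer  */
            printf("%7ld processes: RC ~%7ld QPs/process, DC ~%ld objects/process\n",
                   p, rc_qps_per_process, dc_objects_per_process);
        }
        return 0;
    }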

Connect-IB Memory Management Enhancements
- Extended atomics: 4- to 256-byte arguments
- On-Demand Paging: saves pinning of HCA-registered memory
- Derived Data Types (noncontiguous data elements): allows noncontiguous data elements to be grouped together in a single message, eliminating unnecessary data copy operations and multiple I/O transactions
- Hardware-based, sophisticated memory operations

Mellanox ScalableHPC Accelerates Parallel Applications
- MXM: reliable messaging optimized for Mellanox HCAs, hybrid transport mechanism, efficient memory registration, receive-side tag matching
- FCA: topology-aware collective optimization, hardware multicast, separate virtual fabric for collectives, CORE-Direct hardware offload
- Both sit on top of the InfiniBand Verbs API
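The derived-data-type bullet is easiest to see with a small MPI example. The sketch below (my illustration, not from the deck) sends every other element of an array as a single noncontiguous message using MPI_Type_vector; this is exactly the kind of transfer that scatter/gather-capable hardware can move without an intermediate packing copy.

    #include <mpi.h>
    #include <stdio.h>

    /* Send every other double of a 16-element buffer in ONE message,
     * without first packing it into a contiguous staging buffer.
     */
    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double buf[16];
        MPI_Datatype every_other;
        /* 8 blocks of 1 double, stride of 2 elements -> noncontiguous layout */
        MPI_Type_vector(8, 1, 2, MPI_DOUBLE, &every_other);
        MPI_Type_commit(&every_other);

        if (rank == 0) {
            for (int i = 0; i < 16; i++) buf[i] = i;
            MPI_Send(buf, 1, every_other, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            for (int i = 0; i < 16; i++) buf[i] = -1.0;
            MPI_Recv(buf, 1, every_other, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 got buf[2] = %.0f (even slots filled, odd slots untouched)\n", buf[2]);
        }

        MPI_Type_free(&every_other);
        MPI_Finalize();
        return 0;
    }

Run with at least two ranks, e.g. "mpirun -np 2 ./a.out".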

FCA Collective Performance with OpenMPI
- (chart: FCA collective performance with OpenMPI)

Double Hadoop Performance with UDA
- Disk writes: 40%; disk reads: 15%; CPU utilization: 2.5X (lower is better)
- ~2X faster job completion (*TeraSort is a popular benchmark used to measure the performance of a Hadoop cluster)
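For context on what FCA accelerates: collectives such as MPI_Allreduce are the operations offloaded through hardware multicast and CORE-Direct. The toy program below (my sketch, not from the deck) issues an allreduce exactly as an application would; when FCA is enabled under Open MPI, the same call is simply serviced by the accelerated collective path, with no application code change.

    #include <mpi.h>
    #include <stdio.h>

    /* A plain MPI_Allreduce: the collective pattern that fabric-offloaded
     * collective libraries accelerate transparently underneath MPI.
     */
    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double local = (double)rank;   /* each rank contributes its rank id */
        double global_sum = 0.0;

        MPI_Allreduce(&local, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum of ranks 0..%d = %.0f\n", size - 1, global_sum);

        MPI_Finalize();
        return 0;
    }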

GPUDirect RDMA for Fastest GPU-to-GPU Communications
- GPUDirect RDMA (previously known as GPUDirect 3.0) enables peer-to-peer communication directly between the HCA and the GPU
- Dramatically reduces overall latency for GPU-to-GPU communications
- (diagram: CPUs, system memory, GPU GDDR5 memory and Mellanox HCAs connected over PCI Express 3.0, with Mellanox VPI linking the two nodes)
- Most efficient GPU computing

Optimizing GPU and Accelerator Communications
- NVIDIA GPUs: NVIDIA and Mellanox were original partners in the co-development of GPUDirect 1.0, and recently announced support for GPUDirect RDMA (peer-to-peer GPU-to-HCA data path)
- AMD GPUs: sharing of system memory (AMD DirectGMA Pinned) is supported today; AMD DirectGMA P2P, a peer-to-peer GPU-to-HCA data path, is under development
- Intel MIC: the MIC software development system enables the MIC to communicate directly over the InfiniBand verbs API to Mellanox devices
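A hedged sketch of what the peer-to-peer data path looks like from the host side: on a GPUDirect-RDMA-capable stack (with the NVIDIA peer-memory kernel module loaded), a CUDA device pointer can be registered with the HCA much like host memory, so RDMA operations can target GPU memory directly. This is my illustration, not code from the deck; device index 0 and the 1 MiB buffer size are arbitrary assumptions, and error handling is trimmed to the essentials.

    #include <stdio.h>
    #include <cuda_runtime.h>
    #include <infiniband/verbs.h>

    /* Register GPU memory with the HCA so RDMA can read/write it directly.
     * Requires a GPUDirect-RDMA-capable driver stack; otherwise ibv_reg_mr
     * on a device pointer fails.
     */
    int main(void) {
        int num = 0;
        struct ibv_device **devs = ibv_get_device_list(&num);
        if (!devs || num == 0) { fprintf(stderr, "no RDMA devices\n"); return 1; }

        struct ibv_context *ctx = ibv_open_device(devs[0]);   /* first HCA (assumption) */
        if (!ctx) { fprintf(stderr, "ibv_open_device failed\n"); return 1; }
        struct ibv_pd *pd = ibv_alloc_pd(ctx);

        void *gpu_buf = NULL;
        size_t len = 1 << 20;                                  /* 1 MiB, illustrative */
        cudaMalloc(&gpu_buf, len);

        struct ibv_mr *mr = ibv_reg_mr(pd, gpu_buf, len,
                                       IBV_ACCESS_LOCAL_WRITE |
                                       IBV_ACCESS_REMOTE_READ |
                                       IBV_ACCESS_REMOTE_WRITE);
        if (mr)
            printf("GPU buffer registered: lkey=0x%x rkey=0x%x\n", mr->lkey, mr->rkey);
        else
            perror("ibv_reg_mr on GPU memory");

        if (mr) ibv_dereg_mr(mr);
        cudaFree(gpu_buf);
        ibv_dealloc_pd(pd);
        ibv_close_device(ctx);
        ibv_free_device_list(devs);
        return 0;
    }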

MetroX - Bringing InfiniBand to Campus-Wide Networks
- Extends InfiniBand across campus/metro distances, connecting InfiniBand subnets at distant sites
- Low cost, low power
- 40Gb FDR-10 InfiniBand links; RDMA over distant sites
- Supports up to 10 km over dark fiber; QSFP to SMF LC connector modules

UFM - Comprehensive, Robust Management
- Automatic discovery, central device management
- Fabric dashboard, fabric health reports
- Service-oriented provisioning, health & performance monitoring, congestion analysis
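To give a rough sense of what RDMA over a 10 km dark-fiber span implies for latency, the sketch below estimates one-way propagation delay. The ~5 us/km figure for light in single-mode fiber is an approximation I am adding, not a number from the deck, and it ignores switch and forward-error-correction latency.

    #include <stdio.h>

    /* One-way propagation delay over dark fiber, assuming light travels at
     * roughly 2/3 of c in glass (~5 us per km). Switch/FEC latency excluded.
     */
    int main(void) {
        const double us_per_km = 5.0;            /* approximate, an assumption */
        double distances_km[] = {1.0, 5.0, 10.0};

        for (int i = 0; i < 3; i++)
            printf("%4.0f km of fiber: ~%5.1f us one-way propagation delay\n",
                   distances_km[i], distances_km[i] * us_per_km);
        return 0;
    }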

Mellanox HPC - Paving the Way to Exascale Computing
- ScalableHPC: ultimate scalability with Connect-IB
- 100Gb/s throughput to the network; over 130 million messages/second
- Dynamically Connected Transport service for unlimited inter-node scaling
- Highest-performing interconnect, accelerating GPU communications
- <0.7usec latency, 56Gb/s throughput, higher scalability, maximum reliability

Thank You
HPC@mellanox.com