SR-IOV: Performance Benefits for Virtualized Interconnects!

Size: px
Start display at page:

Download "SR-IOV: Performance Benefits for Virtualized Interconnects!"

Transcription

1 SR-IOV: Performance Benefits for Virtualized Interconnects! Glenn K. Lockwood! Mahidhar Tatineni! Rick Wagner!! July 15, XSEDE14, Atlanta!

2 Background! High Performance Computing (HPC) reaching beyond traditional application areas. Need for increased flexibility from HPC cyberinfrastructure." Increasingly jobs on XSEDE resources are originating from web-portals/ science gateways. " Science gateways and user communities can develop virtual "compute appliances" that contain tightly integrated application stacks that can be deployed in a hardware-agnostic fashion.! Several vendors also provide software packaged in appliances to enable easy deployment." Such benefits have been an important driving force behind compute clouds such as Amazon Web Services (AWS) EC2." 2

3 Background! The network bandwidth and latency performance of virtualized systems traditionally has been markedly worse than that of the native hardware." Hardware vendors have been providing an increased support for virtualization within hardware such as the I/O memory management units (IOMMU). " With the standardization and adoption of technologies such as Single- Root I/O Virtualization (SR-IOV) in network device hardware, a road has been paved towards truly high-performance virtualization for highperformance computing applications.! These technologies will be available to XSEDE users via future computing resources (Comet at SDSC: Production starting early 2015)." 3

4 Single Root I/O Virtualization in HPC! Problem: complex workflows demand increasing flexibility from HPC platforms" Virtualization = flexibility" Virtualization = IO performance loss (e.g., excessive DMA interrupts)" Solution: SR-IOV and Mellanox ConnectX-3 InfiniBand HCAs " One physical function (PF) à multiple virtual functions (VF), each with own DMA streams, memory space, interrupts" Allows DMA to bypass hypervisor to VMs!

5 High-Performance Virtualization on Comet! Mellanox FDR InfiniBand HCAs with SR-IOV" Rocks to manage high-performance virtual clusters." Flexibility to support complex science gateways and web-based workflow engines" Custom user defined application stack on virtual clusters." Leveraging FutureGrid expertise and experience." High bandwidth filesystem(s) access via virtualized InfiniBand."

6 Hardware/Software Configurations of Test Clusters! Native InfiniBand (SDSC) SR-IOV InfiniBand (SDSC) Platform Rocks 6.1 (EL6) Rocks 6.1 (EL6) kvm Hypervisor CPUs RAM Intel(R) Xeon E (2.2 GHz) 16 cores/node Native 10GbE (SDSC) Rocks 6.1 (EL6) Software- Virtualized 10Gbe (EC2) Amazon Linux (EL6) Xen HVM cc2.8xlarge Instance Intel(R) Xeon E (2.6 GHz) 16 cores/node 64 GB DDR GB DDR3 SR-IOV 10GbE (EC2) Amazon Linux (EL6) Xen HVM c3.8xlarge Instance Intel(R) Xeon E5-2680v2 (2.8 GHz) 16 cores/node Inteconnect QDR4X InfiniBand Mellanox ConnectX-3 10GbE 10GbE (Xen Driver) 10GbE (Intel VF driver)

7 Benchmarks! Fundamental performance characteristics of interconnect evaluated using OSU Micro-Benchmarks latency, unidirectional and bidirectional bandwidth tests." WRF: widely used weather modeling application that is run in both research and operational forecasting. CONUS-12km benchmark used for performance evaluation." Quantum ESPRESSO: Application that performs density functional theory (DFT) calculations for condensed matter problems. DEISA AUSURF112 benchmark." 7

8 Benchmark Build Details! OSU Micro-Benchmarks (OMB version 3.9) compiled with OpenMPI 1.5 and GCC ! Both test applications were built with Intel Composer XE 2013 and all of the options necessary to allow the compiler to generate 256- byte AVX vector instructions that were available on all of the testing platforms.! OpenMPI 1.5 for both InfiniBand and 10GbE platform tests.! Intel's Math Kernel Library (MKL) 11.0 to provide all BLAS, LAPACK, ScaLAPACK, and FFTW3 functions where necessary.! 8

9 SR-IOV with 10GbE*: Latency Results! MPI point-to-point latency as measured by the osu_latency benchmark % improvement. under virtualized environment with SR-IOV X slower than native case, even with SR-IOV. * SR-IOV provided with Amazon's C3 instances MPI point-to-point latency as measured by the osu_latency benchmark. Error bars are +/- three standard deviations from the mean.. SR-IOV provides 3 to 4 less variation in latency for small message sizes. 9

10 SR-IOV with 10GbE: Bandwidth Results! Unidirectional messaging bandwidth never exceeds 500 MB/s (~40% of line speed). Native performance is 1.5-2X faster. Similar results for bidirectional bandwidth. SR-IOV has very little benefit in both cases. SR-IOV helps slightly (13% for random ring, 17% for natural ring) in collective bandwidth tests. Native total ring bandwidth was more than 2X faster than SR-IOV based virtualized results. MPI (a) unidirectional bandwidth and (b) bidirectional bandwidth for 10GbE interconnect as measured by the osu_bw and osu_bibw benchmarks, respectively.. 10

11 SR-IOV with InfiniBand: Latency! SR-IOV! < 30% overhead for M < 128 bytes" < 10% overhead for eager send/recv" Overhead à 0% for bandwidth-limited regime" Amazon EC2! > 5000% worse latency" Time dependent (noisy)" 50x less latency than Amazon EC2! Figure 5. MPI point-to-point latency measured by osu_latency for QDR InfiniBand. Included for scale are the analogous 10GbE measurements from Amazon (AWS) and non-virtualized 10GbE.. 11 OSU Microbenchmarks (3.9, osu_latency)"

12 SR-IOV with InfiniBand: Bandwidth! SR-IOV! < 2% bandwidth loss over entire range" > 95% peak bandwidth" Amazon EC2! < 35% peak bandwidth" 900% to 2500% worse bandwidth than virtualized InfiniBand" Figure 6. MPI point-to-point bandwidth measured by osu_bw for QDR InfiniBand. Included for scale are the analogous 10GbE measurements from Amazon (AWS) and non-virtualized 10GbE. 10x more bandwidth than Amazon EC2!. 12 OSU Microbenchmarks (3.9, osu_bw)"

13 Application Benchmarks: WRF! WRF CONUS-12km benchmark. The domain is 12 KM horizontal resolution on a 425 by 300 grid with 35 vertical levels, with a time step of 72 seconds.! Run using six nodes (96 cores) over QDR4X InfiniBand virtualized with SR-IOV.! SR-IOV test cluster has 2.2 GHz Intel(R) Xeon E processors.! Amazon instances were using 2.6 GHz Intel(R) Xeon E processors.! 13

14 Weather Modeling 15% Overhead! 96-core (6-node) calculation" Nearest-neighbor communication" Scalable algorithms" SR-IOV incurs modest (15%) performance hit"...but still still 20% faster *** than Amazon" WRF hr forecast" *** 20% faster despite SR-IOV cluster having 20% slower CPUs"

15 Application Benchmarks: Quantum Espresso! DEISA AUSURF 112 benchmark with Quantum ESPRESSO.! Matrix diagonalization is done with the conjugate gradient algorithm. Communication overhead larger than WRF case.! 3D Fourier transforms stress the interconnect because they perform multiple matrix transpose operations where every MPI rank needs to send and receive all of its data to every other MPI rank.! These global collectives cause interconnect congestion, and efficient 3D FFTs are limited by the bisection bandwidth of the entire fabric connecting all of the compute nodes.! 15

16 Quantum ESPRESSO: 5x Faster than EC2! 48-core, 3 node calc" CG matrix inversion (irregular comm.)" 3D FFT matrix transposes (All-to-all communication)" 28% slower w/ SR-IOV" SR-IOV still > 500% faster *** than EC2" Quantum Espresso DEISA AUSURF112 benchmark" *** 20% faster despite SR-IOV cluster having 20% slower CPUs"

17 Conclusions! SR-IOV: huge step forward in high-performance virtualization" Shows substantial improvement in latency over Amazon EC2, and it has negligible bandwidth overhead! Benchmark application performance confirms this: significant improvement over EC2" SR-IOV: lowers performance barrier to virtualizing the interconnect and makes fully virtualized HPC clusters viable! Comet will deliver virtualized HPC to new/non-traditional communities that need flexibility without major loss of performance!

Comet - High performance virtual clusters to support the long-tail of science.

Comet - High performance virtual clusters to support the long-tail of science. Comet - High performance virtual clusters to support the long-tail of science. Philip M. Papadopoulos, Ph.D. San Diego Supercomputer Center California Institute for telecommunications and Information Technologies

More information

Toward a practical HPC Cloud : Performance tuning of a virtualized HPC cluster

Toward a practical HPC Cloud : Performance tuning of a virtualized HPC cluster Toward a practical HPC Cloud : Performance tuning of a virtualized HPC cluster Ryousei Takano Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology

More information

Application and Micro-benchmark Performance using MVAPICH2-X on SDSC Gordon Cluster

Application and Micro-benchmark Performance using MVAPICH2-X on SDSC Gordon Cluster Application and Micro-benchmark Performance using MVAPICH2-X on SDSC Gordon Cluster Mahidhar Tatineni ([email protected]) MVAPICH User Group Meeting August 27, 2014 NSF grants: OCI #0910847 Gordon: A Data

More information

Can High-Performance Interconnects Benefit Memcached and Hadoop?

Can High-Performance Interconnects Benefit Memcached and Hadoop? Can High-Performance Interconnects Benefit Memcached and Hadoop? D. K. Panda and Sayantan Sur Network-Based Computing Laboratory Department of Computer Science and Engineering The Ohio State University,

More information

Workshop on Parallel and Distributed Scientific and Engineering Computing, Shanghai, 25 May 2012

Workshop on Parallel and Distributed Scientific and Engineering Computing, Shanghai, 25 May 2012 Scientific Application Performance on HPC, Private and Public Cloud Resources: A Case Study Using Climate, Cardiac Model Codes and the NPB Benchmark Suite Peter Strazdins (Research School of Computer Science),

More information

Performance Evaluation of Amazon EC2 for NASA HPC Applications!

Performance Evaluation of Amazon EC2 for NASA HPC Applications! National Aeronautics and Space Administration Performance Evaluation of Amazon EC2 for NASA HPC Applications! Piyush Mehrotra!! J. Djomehri, S. Heistand, R. Hood, H. Jin, A. Lazanoff,! S. Saini, R. Biswas!

More information

SR-IOV In High Performance Computing

SR-IOV In High Performance Computing SR-IOV In High Performance Computing Hoot Thompson & Dan Duffy NASA Goddard Space Flight Center Greenbelt, MD 20771 [email protected] [email protected] www.nccs.nasa.gov Focus on the research side

More information

Cloud Computing through Virtualization and HPC technologies

Cloud Computing through Virtualization and HPC technologies Cloud Computing through Virtualization and HPC technologies William Lu, Ph.D. 1 Agenda Cloud Computing & HPC A Case of HPC Implementation Application Performance in VM Summary 2 Cloud Computing & HPC HPC

More information

RDMA over Ethernet - A Preliminary Study

RDMA over Ethernet - A Preliminary Study RDMA over Ethernet - A Preliminary Study Hari Subramoni, Miao Luo, Ping Lai and Dhabaleswar. K. Panda Computer Science & Engineering Department The Ohio State University Outline Introduction Problem Statement

More information

benchmarking Amazon EC2 for high-performance scientific computing

benchmarking Amazon EC2 for high-performance scientific computing Edward Walker benchmarking Amazon EC2 for high-performance scientific computing Edward Walker is a Research Scientist with the Texas Advanced Computing Center at the University of Texas at Austin. He received

More information

1 Bull, 2011 Bull Extreme Computing

1 Bull, 2011 Bull Extreme Computing 1 Bull, 2011 Bull Extreme Computing Table of Contents HPC Overview. Cluster Overview. FLOPS. 2 Bull, 2011 Bull Extreme Computing HPC Overview Ares, Gerardo, HPC Team HPC concepts HPC: High Performance

More information

Building a Top500-class Supercomputing Cluster at LNS-BUAP

Building a Top500-class Supercomputing Cluster at LNS-BUAP Building a Top500-class Supercomputing Cluster at LNS-BUAP Dr. José Luis Ricardo Chávez Dr. Humberto Salazar Ibargüen Dr. Enrique Varela Carlos Laboratorio Nacional de Supercómputo Benemérita Universidad

More information

Interoperability Testing and iwarp Performance. Whitepaper

Interoperability Testing and iwarp Performance. Whitepaper Interoperability Testing and iwarp Performance Whitepaper Interoperability Testing and iwarp Performance Introduction In tests conducted at the Chelsio facility, results demonstrate successful interoperability

More information

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance 11 th International LS-DYNA Users Conference Session # LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance Gilad Shainer 1, Tong Liu 2, Jeff Layton 3, Onur Celebioglu

More information

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud February 25, 2014 1 Agenda v Mapping clients needs to cloud technologies v Addressing your pain

More information

Kashif Iqbal - PhD [email protected]

Kashif Iqbal - PhD Kashif.iqbal@ichec.ie HPC/HTC vs. Cloud Benchmarking An empirical evalua.on of the performance and cost implica.ons Kashif Iqbal - PhD [email protected] ICHEC, NUI Galway, Ireland With acknowledgment to Michele MicheloDo

More information

Masters Project Proposal

Masters Project Proposal Masters Project Proposal Virtual Machine Storage Performance Using SR-IOV by Michael J. Kopps Committee Members and Signatures Approved By Date Advisor: Dr. Jia Rao Committee Member: Dr. Xiabo Zhou Committee

More information

NAVAL POSTGRADUATE SCHOOL THESIS

NAVAL POSTGRADUATE SCHOOL THESIS NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA THESIS FEASIBILITY OF VIRTUAL MACHINE AND CLOUD COMPUTING TECHNOLOGIES FOR HIGH PERFORMANCE COMPUTING by Richard Chad Hutchins December 2013 Thesis Co-Advisors:

More information

Intel True Scale Fabric Architecture. Enhanced HPC Architecture and Performance

Intel True Scale Fabric Architecture. Enhanced HPC Architecture and Performance Intel True Scale Fabric Architecture Enhanced HPC Architecture and Performance 1. Revision: Version 1 Date: November 2012 Table of Contents Introduction... 3 Key Findings... 3 Intel True Scale Fabric Infiniband

More information

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency WHITE PAPER Solving I/O Bottlenecks to Enable Superior Cloud Efficiency Overview...1 Mellanox I/O Virtualization Features and Benefits...2 Summary...6 Overview We already have 8 or even 16 cores on one

More information

Open Cirrus: Towards an Open Source Cloud Stack

Open Cirrus: Towards an Open Source Cloud Stack Open Cirrus: Towards an Open Source Cloud Stack Karlsruhe Institute of Technology (KIT) HPC2010, Cetraro, June 2010 Marcel Kunze KIT University of the State of Baden-Württemberg and National Laboratory

More information

Hadoop on the Gordon Data Intensive Cluster

Hadoop on the Gordon Data Intensive Cluster Hadoop on the Gordon Data Intensive Cluster Amit Majumdar, Scientific Computing Applications Mahidhar Tatineni, HPC User Services San Diego Supercomputer Center University of California San Diego Dec 18,

More information

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France

More information

StACC: St Andrews Cloud Computing Co laboratory. A Performance Comparison of Clouds. Amazon EC2 and Ubuntu Enterprise Cloud

StACC: St Andrews Cloud Computing Co laboratory. A Performance Comparison of Clouds. Amazon EC2 and Ubuntu Enterprise Cloud StACC: St Andrews Cloud Computing Co laboratory A Performance Comparison of Clouds Amazon EC2 and Ubuntu Enterprise Cloud Jonathan S Ward StACC (pronounced like 'stack') is a research collaboration launched

More information

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes Anthony Kenisky, VP of North America Sales About Appro Over 20 Years of Experience 1991 2000 OEM Server Manufacturer 2001-2007

More information

High Performance Computing in CST STUDIO SUITE

High Performance Computing in CST STUDIO SUITE High Performance Computing in CST STUDIO SUITE Felix Wolfheimer GPU Computing Performance Speedup 18 16 14 12 10 8 6 4 2 0 Promo offer for EUC participants: 25% discount for K40 cards Speedup of Solver

More information

I/O Virtualization Using Mellanox InfiniBand And Channel I/O Virtualization (CIOV) Technology

I/O Virtualization Using Mellanox InfiniBand And Channel I/O Virtualization (CIOV) Technology I/O Virtualization Using Mellanox InfiniBand And Channel I/O Virtualization (CIOV) Technology Reduce I/O cost and power by 40 50% Reduce I/O real estate needs in blade servers through consolidation Maintain

More information

State of the Art Cloud Infrastructure

State of the Art Cloud Infrastructure State of the Art Cloud Infrastructure Motti Beck, Director Enterprise Market Development WHD Global I April 2014 Next Generation Data Centers Require Fast, Smart Interconnect Software Defined Networks

More information

SMB Direct for SQL Server and Private Cloud

SMB Direct for SQL Server and Private Cloud SMB Direct for SQL Server and Private Cloud Increased Performance, Higher Scalability and Extreme Resiliency June, 2014 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server

More information

Cloud Computing. Alex Crawford Ben Johnstone

Cloud Computing. Alex Crawford Ben Johnstone Cloud Computing Alex Crawford Ben Johnstone Overview What is cloud computing? Amazon EC2 Performance Conclusions What is the Cloud? A large cluster of machines o Economies of scale [1] Customers use a

More information

Building a Private Cloud with Eucalyptus

Building a Private Cloud with Eucalyptus Building a Private Cloud with Eucalyptus 5th IEEE International Conference on e-science Oxford December 9th 2009 Christian Baun, Marcel Kunze KIT The cooperation of Forschungszentrum Karlsruhe GmbH und

More information

DPDK Summit 2014 DPDK in a Virtual World

DPDK Summit 2014 DPDK in a Virtual World DPDK Summit 2014 DPDK in a Virtual World Bhavesh Davda (Sr. Staff Engineer, CTO Office, ware) Rashmin Patel (DPDK Virtualization Engineer, Intel) Agenda Data Plane Virtualization Trends DPDK Virtualization

More information

ECLIPSE Best Practices Performance, Productivity, Efficiency. March 2009

ECLIPSE Best Practices Performance, Productivity, Efficiency. March 2009 ECLIPSE Best Practices Performance, Productivity, Efficiency March 29 ECLIPSE Performance, Productivity, Efficiency The following research was performed under the HPC Advisory Council activities HPC Advisory

More information

Accelerating From Cluster to Cloud: Overview of RDMA on Windows HPC. Wenhao Wu Program Manager Windows HPC team

Accelerating From Cluster to Cloud: Overview of RDMA on Windows HPC. Wenhao Wu Program Manager Windows HPC team Accelerating From Cluster to Cloud: Overview of RDMA on Windows HPC Wenhao Wu Program Manager Windows HPC team Agenda Microsoft s Commitments to HPC RDMA for HPC Server RDMA for Storage in Windows 8 Microsoft

More information

FLOW-3D Performance Benchmark and Profiling. September 2012

FLOW-3D Performance Benchmark and Profiling. September 2012 FLOW-3D Performance Benchmark and Profiling September 2012 Note The following research was performed under the HPC Advisory Council activities Participating vendors: FLOW-3D, Dell, Intel, Mellanox Compute

More information

Full and Para Virtualization

Full and Para Virtualization Full and Para Virtualization Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF x86 Hardware Virtualization The x86 architecture offers four levels

More information

Interconnect Analysis: 10GigE and InfiniBand in High Performance Computing

Interconnect Analysis: 10GigE and InfiniBand in High Performance Computing Interconnect Analysis: 10GigE and InfiniBand in High Performance Computing WHITE PAPER Highlights: There is a large number of HPC applications that need the lowest possible latency for best performance

More information

The Assessment of Benchmarks Executed on Bare-Metal and Using Para-Virtualisation

The Assessment of Benchmarks Executed on Bare-Metal and Using Para-Virtualisation The Assessment of Benchmarks Executed on Bare-Metal and Using Para-Virtualisation Mark Baker, Garry Smith and Ahmad Hasaan SSE, University of Reading Paravirtualization A full assessment of paravirtualization

More information

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC HPC Architecture End to End Alexandre Chauvin Agenda HPC Software Stack Visualization National Scientific Center 2 Agenda HPC Software Stack Alexandre Chauvin Typical HPC Software Stack Externes LAN Typical

More information

RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vsphere 5 R E S E A R C H N O T E

RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vsphere 5 R E S E A R C H N O T E RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vsphere 5 R E S E A R C H N O T E RDMA Performance in Virtual Machines using QDR InfiniBand on vsphere 5 Table of Contents Introduction...

More information

I/O virtualization. Jussi Hanhirova Aalto University, Helsinki, Finland [email protected]. 2015-12-10 Hanhirova CS/Aalto

I/O virtualization. Jussi Hanhirova Aalto University, Helsinki, Finland jussi.hanhirova@aalto.fi. 2015-12-10 Hanhirova CS/Aalto I/O virtualization Jussi Hanhirova Aalto University, Helsinki, Finland [email protected] Outline Introduction IIoT Data streams on the fly processing Network packet processing in the virtualized

More information

Technical Paper. Moving SAS Applications from a Physical to a Virtual VMware Environment

Technical Paper. Moving SAS Applications from a Physical to a Virtual VMware Environment Technical Paper Moving SAS Applications from a Physical to a Virtual VMware Environment Release Information Content Version: April 2015. Trademarks and Patents SAS Institute Inc., SAS Campus Drive, Cary,

More information

Sun Constellation System: The Open Petascale Computing Architecture

Sun Constellation System: The Open Petascale Computing Architecture CAS2K7 13 September, 2007 Sun Constellation System: The Open Petascale Computing Architecture John Fragalla Senior HPC Technical Specialist Global Systems Practice Sun Microsystems, Inc. 25 Years of Technical

More information

Performance Evaluation of InfiniBand with PCI Express

Performance Evaluation of InfiniBand with PCI Express Performance Evaluation of InfiniBand with PCI Express Jiuxing Liu Server Technology Group IBM T. J. Watson Research Center Yorktown Heights, NY 1598 [email protected] Amith Mamidala, Abhinav Vishnu, and Dhabaleswar

More information

Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014

Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014 Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet Anand Rangaswamy September 2014 Storage Developer Conference Mellanox Overview Ticker: MLNX Leading provider of high-throughput,

More information

Evaluation Report: Emulex OCe14102 10GbE and OCe14401 40GbE Adapter Comparison with Intel X710 10GbE and XL710 40GbE Adapters

Evaluation Report: Emulex OCe14102 10GbE and OCe14401 40GbE Adapter Comparison with Intel X710 10GbE and XL710 40GbE Adapters Evaluation Report: Emulex OCe14102 10GbE and OCe14401 40GbE Adapter Comparison with Intel X710 10GbE and XL710 40GbE Adapters Evaluation report prepared under contract with Emulex Executive Summary As

More information

A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks

A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks Xiaoyi Lu, Md. Wasi- ur- Rahman, Nusrat Islam, and Dhabaleswar K. (DK) Panda Network- Based Compu2ng Laboratory Department

More information

Achieving Performance Isolation with Lightweight Co-Kernels

Achieving Performance Isolation with Lightweight Co-Kernels Achieving Performance Isolation with Lightweight Co-Kernels Jiannan Ouyang, Brian Kocoloski, John Lange The Prognostic Lab @ University of Pittsburgh Kevin Pedretti Sandia National Laboratories HPDC 2015

More information

Accelerating Spark with RDMA for Big Data Processing: Early Experiences

Accelerating Spark with RDMA for Big Data Processing: Early Experiences Accelerating Spark with RDMA for Big Data Processing: Early Experiences Xiaoyi Lu, Md. Wasi- ur- Rahman, Nusrat Islam, Dip7 Shankar, and Dhabaleswar K. (DK) Panda Network- Based Compu2ng Laboratory Department

More information

PERFORMANCE CONSIDERATIONS FOR NETWORK SWITCH FABRICS ON LINUX CLUSTERS

PERFORMANCE CONSIDERATIONS FOR NETWORK SWITCH FABRICS ON LINUX CLUSTERS PERFORMANCE CONSIDERATIONS FOR NETWORK SWITCH FABRICS ON LINUX CLUSTERS Philip J. Sokolowski Department of Electrical and Computer Engineering Wayne State University 55 Anthony Wayne Dr. Detroit, MI 822

More information

Enabling Technologies for Distributed and Cloud Computing

Enabling Technologies for Distributed and Cloud Computing Enabling Technologies for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Multi-core CPUs and Multithreading

More information

Achieving a High-Performance Virtual Network Infrastructure with PLUMgrid IO Visor & Mellanox ConnectX -3 Pro

Achieving a High-Performance Virtual Network Infrastructure with PLUMgrid IO Visor & Mellanox ConnectX -3 Pro Achieving a High-Performance Virtual Network Infrastructure with PLUMgrid IO Visor & Mellanox ConnectX -3 Pro Whitepaper What s wrong with today s clouds? Compute and storage virtualization has enabled

More information

Michael Kagan. [email protected]

Michael Kagan. michael@mellanox.com Virtualization in Data Center The Network Perspective Michael Kagan CTO, Mellanox Technologies [email protected] Outline Data Center Transition Servers S as a Service Network as a Service IO as a Service

More information

HPC performance applications on Virtual Clusters

HPC performance applications on Virtual Clusters Panagiotis Kritikakos EPCC, School of Physics & Astronomy, University of Edinburgh, Scotland - UK [email protected] 4 th IC-SCCE, Athens 7 th July 2010 This work investigates the performance of (Java)

More information

Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 SMB Direct

Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 SMB Direct Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 Direct Increased Performance, Scaling and Resiliency July 2012 Motti Beck, Director, Enterprise Market Development [email protected]

More information

Introduction. Need for ever-increasing storage scalability. Arista and Panasas provide a unique Cloud Storage solution

Introduction. Need for ever-increasing storage scalability. Arista and Panasas provide a unique Cloud Storage solution Arista 10 Gigabit Ethernet Switch Lab-Tested with Panasas ActiveStor Parallel Storage System Delivers Best Results for High-Performance and Low Latency for Scale-Out Cloud Storage Applications Introduction

More information

ECLIPSE Performance Benchmarks and Profiling. January 2009

ECLIPSE Performance Benchmarks and Profiling. January 2009 ECLIPSE Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox, Schlumberger HPC Advisory Council Cluster

More information

Cluster Implementation and Management; Scheduling

Cluster Implementation and Management; Scheduling Cluster Implementation and Management; Scheduling CPS343 Parallel and High Performance Computing Spring 2013 CPS343 (Parallel and HPC) Cluster Implementation and Management; Scheduling Spring 2013 1 /

More information

Performance of the NAS Parallel Benchmarks on Grid Enabled Clusters

Performance of the NAS Parallel Benchmarks on Grid Enabled Clusters Performance of the NAS Parallel Benchmarks on Grid Enabled Clusters Philip J. Sokolowski Dept. of Electrical and Computer Engineering Wayne State University 55 Anthony Wayne Dr., Detroit, MI 4822 [email protected]

More information

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance

More information

Understanding the Performance and Potential of Cloud Computing for Scientific Applications

Understanding the Performance and Potential of Cloud Computing for Scientific Applications IEEE TRANSACTIONS ON CLOUD COMPUTING, MANUSCRIPT ID 1 Understanding the Performance and Potential of Cloud Computing for Scientific Applications Iman Sadooghi, Jesús Hernández Martin, Tonglin Li, Kevin

More information

Oracle Database Scalability in VMware ESX VMware ESX 3.5

Oracle Database Scalability in VMware ESX VMware ESX 3.5 Performance Study Oracle Database Scalability in VMware ESX VMware ESX 3.5 Database applications running on individual physical servers represent a large consolidation opportunity. However enterprises

More information

Introduction to Infiniband. Hussein N. Harake, Performance U! Winter School

Introduction to Infiniband. Hussein N. Harake, Performance U! Winter School Introduction to Infiniband Hussein N. Harake, Performance U! Winter School Agenda Definition of Infiniband Features Hardware Facts Layers OFED Stack OpenSM Tools and Utilities Topologies Infiniband Roadmap

More information

ACANO SOLUTION VIRTUALIZED DEPLOYMENTS. White Paper. Simon Evans, Acano Chief Scientist

ACANO SOLUTION VIRTUALIZED DEPLOYMENTS. White Paper. Simon Evans, Acano Chief Scientist ACANO SOLUTION VIRTUALIZED DEPLOYMENTS White Paper Simon Evans, Acano Chief Scientist Updated April 2015 CONTENTS Introduction... 3 Host Requirements... 5 Sizing a VM... 6 Call Bridge VM... 7 Acano Edge

More information

LS DYNA Performance Benchmarks and Profiling. January 2009

LS DYNA Performance Benchmarks and Profiling. January 2009 LS DYNA Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center The

More information

PRIMERGY server-based High Performance Computing solutions

PRIMERGY server-based High Performance Computing solutions PRIMERGY server-based High Performance Computing solutions PreSales - May 2010 - HPC Revenue OS & Processor Type Increasing standardization with shift in HPC to x86 with 70% in 2008.. HPC revenue by operating

More information

Virtualization. P. A. Wilsey. The text highlighted in green in these slides contain external hyperlinks. 1 / 16

Virtualization. P. A. Wilsey. The text highlighted in green in these slides contain external hyperlinks. 1 / 16 Virtualization P. A. Wilsey The text highlighted in green in these slides contain external hyperlinks. 1 / 16 Conventional System Viewed as Layers This illustration is a common presentation of the application/operating

More information

The CNMS Computer Cluster

The CNMS Computer Cluster The CNMS Computer Cluster This page describes the CNMS Computational Cluster, how to access it, and how to use it. Introduction (2014) The latest block of the CNMS Cluster (2010) Previous blocks of the

More information

Cluster Computing at HRI

Cluster Computing at HRI Cluster Computing at HRI J.S.Bagla Harish-Chandra Research Institute, Chhatnag Road, Jhunsi, Allahabad 211019. E-mail: [email protected] 1 Introduction and some local history High performance computing

More information

Assessing the Performance of OpenMP Programs on the Intel Xeon Phi

Assessing the Performance of OpenMP Programs on the Intel Xeon Phi Assessing the Performance of OpenMP Programs on the Intel Xeon Phi Dirk Schmidl, Tim Cramer, Sandra Wienke, Christian Terboven, and Matthias S. Müller [email protected] Rechen- und Kommunikationszentrum

More information

High Performance OpenStack Cloud. Eli Karpilovski Cloud Advisory Council Chairman

High Performance OpenStack Cloud. Eli Karpilovski Cloud Advisory Council Chairman High Performance OpenStack Cloud Eli Karpilovski Cloud Advisory Council Chairman Cloud Advisory Council Our Mission Development of next generation cloud architecture Providing open specification for cloud

More information

Broadcom Ethernet Network Controller Enhanced Virtualization Functionality

Broadcom Ethernet Network Controller Enhanced Virtualization Functionality White Paper Broadcom Ethernet Network Controller Enhanced Virtualization Functionality Advancements in VMware virtualization technology coupled with the increasing processing capability of hardware platforms

More information

High Performance Data-Transfers in Grid Environment using GridFTP over InfiniBand

High Performance Data-Transfers in Grid Environment using GridFTP over InfiniBand High Performance Data-Transfers in Grid Environment using GridFTP over InfiniBand Hari Subramoni *, Ping Lai *, Raj Kettimuthu **, Dhabaleswar. K. (DK) Panda * * Computer Science and Engineering Department

More information

Enabling Technologies for Distributed Computing

Enabling Technologies for Distributed Computing Enabling Technologies for Distributed Computing Dr. Sanjay P. Ahuja, Ph.D. Fidelity National Financial Distinguished Professor of CIS School of Computing, UNF Multi-core CPUs and Multithreading Technologies

More information

The virtualization of SAP environments to accommodate standardization and easier management is gaining momentum in data centers.

The virtualization of SAP environments to accommodate standardization and easier management is gaining momentum in data centers. White Paper Virtualized SAP: Optimize Performance with Cisco Data Center Virtual Machine Fabric Extender and Red Hat Enterprise Linux and Kernel-Based Virtual Machine What You Will Learn The virtualization

More information

Emerging Technology for the Next Decade

Emerging Technology for the Next Decade Emerging Technology for the Next Decade Cloud Computing Keynote Presented by Charles Liang, President & CEO Super Micro Computer, Inc. What is Cloud Computing? Cloud computing is Internet-based computing,

More information

Accelerating High-Speed Networking with Intel I/O Acceleration Technology

Accelerating High-Speed Networking with Intel I/O Acceleration Technology White Paper Intel I/O Acceleration Technology Accelerating High-Speed Networking with Intel I/O Acceleration Technology The emergence of multi-gigabit Ethernet allows data centers to adapt to the increasing

More information

Developing High-Performance, Scalable, cost effective storage solutions with Intel Cloud Edition Lustre* and Amazon Web Services

Developing High-Performance, Scalable, cost effective storage solutions with Intel Cloud Edition Lustre* and Amazon Web Services Reference Architecture Developing Storage Solutions with Intel Cloud Edition for Lustre* and Amazon Web Services Developing High-Performance, Scalable, cost effective storage solutions with Intel Cloud

More information

Automated Testing of Installed Software

Automated Testing of Installed Software Automated Testing of Installed Software or so far, How to validate MPI stacks of an HPC cluster? Xavier Besseron HPC and Computational Science @ FOSDEM 2014 February 1, 2014 Automated Testing of Installed

More information

JUROPA Linux Cluster An Overview. 19 May 2014 Ulrich Detert

JUROPA Linux Cluster An Overview. 19 May 2014 Ulrich Detert Mitglied der Helmholtz-Gemeinschaft JUROPA Linux Cluster An Overview 19 May 2014 Ulrich Detert JuRoPA JuRoPA Jülich Research on Petaflop Architectures Bull, Sun, ParTec, Intel, Mellanox, Novell, FZJ JUROPA

More information

Comparing Cloud Computing Resources for Model Calibration with PEST

Comparing Cloud Computing Resources for Model Calibration with PEST Comparing Cloud Computing Resources for Model Calibration with PEST CWEMF Annual Meeting March 10, 2015 Charles Brush Modeling Support Branch, Bay-Delta Office California Department of Water Resources,

More information

A Performance and Cost Analysis of the Amazon Elastic Compute Cloud (EC2) Cluster Compute Instance

A Performance and Cost Analysis of the Amazon Elastic Compute Cloud (EC2) Cluster Compute Instance A Performance and Cost Analysis of the Amazon Elastic Compute Cloud (EC2) Cluster Compute Instance Michael Fenn ([email protected]), Jason Holmes ([email protected]), Jeffrey Nucciarone ([email protected]) Research

More information

Memory Aggregation For KVM Hecatonchire Project

Memory Aggregation For KVM Hecatonchire Project Memory Aggregation For KVM Hecatonchire Project Benoit Hudzia; Sr. Researcher; SAP Research Belfast With the contribution of Aidan Shribman, Roei Tell, Steve Walsh, Peter Izsak November 2012 Agenda Memory

More information

Cluster performance, how to get the most out of Abel. Ole W. Saastad, Dr.Scient USIT / UAV / FI April 18 th 2013

Cluster performance, how to get the most out of Abel. Ole W. Saastad, Dr.Scient USIT / UAV / FI April 18 th 2013 Cluster performance, how to get the most out of Abel Ole W. Saastad, Dr.Scient USIT / UAV / FI April 18 th 2013 Introduction Architecture x86-64 and NVIDIA Compilers MPI Interconnect Storage Batch queue

More information

Cloud Computing with Red Hat Solutions. Sivaram Shunmugam Red Hat Asia Pacific Pte Ltd. [email protected]

Cloud Computing with Red Hat Solutions. Sivaram Shunmugam Red Hat Asia Pacific Pte Ltd. sivaram@redhat.com Cloud Computing with Red Hat Solutions Sivaram Shunmugam Red Hat Asia Pacific Pte Ltd [email protected] Linux Automation Details Red Hat's Linux Automation strategy for next-generation IT infrastructure

More information

Virtual Compute Appliance Frequently Asked Questions

Virtual Compute Appliance Frequently Asked Questions General Overview What is Oracle s Virtual Compute Appliance? Oracle s Virtual Compute Appliance is an integrated, wire once, software-defined infrastructure system designed for rapid deployment of both

More information

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck Sockets vs. RDMA Interface over 1-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck Pavan Balaji Hemal V. Shah D. K. Panda Network Based Computing Lab Computer Science and Engineering

More information

BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH MELLANOX SWITCHX

BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH MELLANOX SWITCHX White Paper BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH Abstract This white paper explains how to configure a Mellanox SwitchX Series switch to bridge the external network of an EMC Isilon

More information

VON/K: A Fast Virtual Overlay Network Embedded in KVM Hypervisor for High Performance Computing

VON/K: A Fast Virtual Overlay Network Embedded in KVM Hypervisor for High Performance Computing Journal of Information & Computational Science 9: 5 (2012) 1273 1280 Available at http://www.joics.com VON/K: A Fast Virtual Overlay Network Embedded in KVM Hypervisor for High Performance Computing Yuan

More information

Overview of HPC systems and software available within

Overview of HPC systems and software available within Overview of HPC systems and software available within Overview Available HPC Systems Ba Cy-Tera Available Visualization Facilities Software Environments HPC System at Bibliotheca Alexandrina SUN cluster

More information