Performance Evaluation, Scalability Analysis, and Optimization Tuning of HyperWorks Solvers on a Modern HPC Compute Cluster

1 Performance Evaluation, Scalability Analysis, and Optimization Tuning of HyperWorks Solvers on a Modern HPC Compute Cluster Pak Lui pak@hpcadvisorycouncil.com May 7, 2015

2 Agenda Introduction to HPC Advisory Council Altair RADIOSS Benchmark Configuration Performance Benchmark Testing and Results MPI Profiling Summary Altair OptiStruct Benchmark Configuration Performance Benchmark Testing and Results MPI Profiling Summary Q&A / For More Information

3 The HPC Advisory Council World-wide HPC organization (400+ members) Bridges the gap between HPC usage and its full potential Provides best practices and a support/development center Explores future technologies and future developments Working Groups HPC Cloud, HPC Scale, HPC GPU, HPC Storage Leading edge solutions and technology demonstrations

4 Copyright 2015 Altair Engineering, Inc. Proprietary and Confidential. All rights reserved. HPC Advisory Council Members

5 HPC Council Board
Chairman: Gilad Shainer - gilad@hpcadvisorycouncil.com
HPC Scale SIG Chair: Richard Graham - richard@hpcadvisorycouncil.com
Media Relations and Events Director: Brian Sparks - brian@hpcadvisorycouncil.com
HPC Cloud SIG Chair: William Lu - william@hpcadvisorycouncil.com
China Events Manager: Blade Meng - blade@hpcadvisorycouncil.com
HPC GPU SIG Chair: Sadaf Alam - sadaf@hpcadvisorycouncil.com
Director of the HPC Advisory Council, Asia: Tong Liu - tong@hpcadvisorycouncil.com
India Outreach: Goldi Misra - goldi@hpcadvisorycouncil.com
HPC Works SIG Chair and Cluster Center Manager: Pak Lui - pak@hpcadvisorycouncil.com
Director of the Switzerland Center of Excellence and HPC Storage SIG Chair: Hussein Harake - hussein@hpcadvisorycouncil.com
Director of Educational Outreach: Scot Schultz - scot@hpcadvisorycouncil.com
Workshop Program Director: Eric Lantz - eric@hpcadvisorycouncil.com
Programming Advisor: Tarick Bedeir - Tarick@hpcadvisorycouncil.com
Research Steering Committee Director: Cydney Stevens - cydney@hpcadvisorycouncil.com

6 HPC Advisory Council HPC Center
InfiniBand-based Lustre storage
Clusters: Juniper (640 cores), Heimdall (280 cores), Plutus (192 cores), Janus (456 cores), Athena (80 cores), Vesta (704 cores), Thor (896 cores), Mala (16 GPUs)

7 Special Interest Subgroups Missions
HPC Scale: To explore the usage of commodity HPC as a replacement for multi-million dollar mainframes and proprietary-based supercomputers, with networks and clusters of microcomputers acting in unison to deliver high-end computing services.
HPC Cloud: To explore the usage of HPC components as part of the creation of external/public/internal/private cloud computing environments.
HPC Works: To provide best practices for building balanced and scalable HPC systems, performance tuning and application guidelines.
HPC Storage: To demonstrate how to build high-performance storage solutions and their effect on application performance and productivity. One of the main interests of the HPC Storage subgroup is to explore Lustre-based solutions, and to expose more users to the potential of Lustre over high-speed networks.
HPC GPU: To explore usage models of GPU components as part of next generation compute environments and potential optimizations for GPU-based computing.
HPC FSI: To explore the usage of high-performance computing solutions for low latency trading, more productive simulations (such as Monte Carlo) and overall more efficient financial services.

8 HPC Advisory Council
HPC Advisory Council (HPCAC): 400+ members
Application best practices, case studies (over 150)
Benchmarking center with remote access for users
World-wide workshops
Value add for your customers to stay up to date and in tune with the HPC market
2015 Workshops: USA (Stanford University) February 2015; Switzerland March 2015; Brazil August 2015; Spain September 2015; China (HPC China) October 2015
For more information

9 2015 ISC High Performance Conference Student Cluster Competition
University-based teams compete and demonstrate the incredible capabilities of state-of-the-art HPC systems and applications on the 2015 ISC High Performance Conference show floor
The Student Cluster Competition is designed to introduce the next generation of students to the high performance computing world and community

10 RADIOSS Performance Study Research performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute resource - HPC Advisory Council Cluster Center Objectives Give overview of RADIOSS performance Compare different MPI libraries, network interconnects and others Understand RADIOSS communication patterns Provide best practices to increase RADIOSS productivity

11 About RADIOSS Compute-intensive simulation software for Manufacturing For 20+ years an established standard for automotive crash and impact Differentiated by its high scalability, quality and robustness Supports multiphysics simulation and advanced materials Used across all industries to improve safety and manufacturability Companies use RADIOSS to simulate real-world scenarios (crash tests, climate effects, etc.) to test the performance of a product

12 Test Cluster Configuration
Dell PowerEdge R730 32-node (896-core) Thor cluster
Dual-socket 14-core Intel 2.60 GHz CPUs (Turbo on, max perf set in BIOS)
OS: RHEL 6.5, MLNX_OFED_LINUX InfiniBand SW stack
Memory: 64GB per node, DDR4 2133MHz DIMMs
Hard drives: 1TB 7,200 RPM SATA 2.5-inch
Mellanox Switch-IB EDR 100Gb/s InfiniBand switch
Mellanox ConnectX-4 EDR 100Gb/s InfiniBand VPI adapters
Mellanox ConnectX-3 40/56Gb/s QDR/FDR InfiniBand VPI adapters
Mellanox SwitchX FDR 56Gb/s InfiniBand VPI switch
MPI: Intel MPI 5.0.2, Open MPI 1.8.4, Mellanox HPC-X v1.2.0
Application: Altair RADIOSS 13.0
Benchmark dataset: Neon benchmark, 1 million elements (8ms, Double Precision), unless otherwise stated

13 About Intel Cluster Ready
Intel Cluster Ready systems make it practical to use a cluster to increase your simulation and modeling productivity
Simplifies selection, deployment, and operation of a cluster
A single architecture platform supported by many OEMs, ISVs, cluster provisioning vendors, and interconnect providers
Focus on your work productivity and spend less management time on the cluster
Select Intel Cluster Ready, where the cluster is delivered ready to run: hardware and software are integrated and configured together, and applications are registered, validating execution on the Intel Cluster Ready architecture
Includes the Intel Cluster Checker tool to verify functionality and periodically check cluster health
RADIOSS is Intel Cluster Ready

14 PowerEdge R730
Massive flexibility for data-intensive operations: performance and efficiency
Intelligent hardware-driven systems management with extensive power management features
Innovative tools including automation for parts replacement and lifecycle manageability
Broad choice of networking technologies from GbE to IB
Built-in redundancy with hot-plug and swappable PSUs, HDDs and fans
Benefits: designed for performance workloads from big data analytics, distributed storage or distributed computing where local storage is key, to classic HPC and large-scale hosting environments; high-performance scale-out compute and low-cost dense storage in one package
Hardware capabilities: flexible compute platform with dense storage capacity; 2S/2U server, 6 PCIe slots; large memory footprint (up to 768GB / 24 DIMMs); high I/O performance and optional storage configurations
HDD options: 12 x 3.5-inch or 24 x 2.5-inch HDDs, plus 2 x 2.5-inch HDDs in the rear of the server; up to 26 HDDs with 2 hot-plug drives in the rear for boot or scratch

15 RADIOSS Performance Interconnect (MPP)
EDR InfiniBand provides better scalability performance than Ethernet
70 times better performance than 1GbE at 16 nodes / 448 cores
4.8x better performance than 10GbE at 16 nodes / 448 cores
Ethernet solutions do not scale beyond 4 nodes with pure MPI
70x 4.8x Higher is better Intel MPI 28 Processes/Node

16 RADIOSS Performance Interconnect (MPP) EDR InfiniBand provides better scalability performance EDR InfiniBand improves over QDR IB by 28% at 16 nodes / 448 cores Similarly, EDR InfiniBand outperforms FDR InfiniBand by 25% at 16 nodes 28% 25% Higher is better Intel MPI 28 Processes/Node

17 RADIOSS Performance CPU Cores
Running more cores per node generally improves overall performance
Seen improvement of 18% from 20 to 28 cores per node at 8 nodes
Guideline: the optimal workload distribution is ~4000 elements per MPI process
For this test case of 1 million elements, the optimal core count is ~256 cores
4000 elements per process provides sufficient workload for each process
Hybrid MPP (HMPP) provides a way to achieve additional scalability on more CPUs
18% 6% Higher is better Intel MPI
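The sizing guideline above can be turned into a quick calculation. A minimal shell sketch, using the numbers from this test case; the rounding to whole nodes at 28 cores each is our own addition:

```shell
#!/bin/sh
# Sizing rule from the slide: ~4000 elements per MPI process.
ELEMENTS=1000000          # Neon benchmark: 1 million elements
ELEMENTS_PER_PROC=4000    # recommended workload per process

# Ideal number of MPI processes for this model
PROCS=$((ELEMENTS / ELEMENTS_PER_PROC))
echo "suggested MPI processes: $PROCS"    # 250, i.e. ~256 cores

# With 28 cores per node (dual 14-core sockets), round up to whole nodes
NODES=$(( (PROCS + 27) / 28 ))
echo "suggested nodes at 28 PPN: $NODES"
```

For this model the arithmetic lands at 250 processes, which is why the slide quotes ~256 cores as the sweet spot.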

18 RADIOSS Performance Simulation Time
Increasing simulation time increases the run time at a faster rate
Extending an 8ms simulation to 80ms can result in a much longer runtime: a 10x longer simulation run can result in a 14x increase in runtime
Contacts usually become more severe in the middle of the run, which adds complexity and CPU utilization, so the cost per cycle increases
13x 14x 14x Higher is better Intel MPI

19 RADIOSS Profiling % Time Spent on MPI
RADIOSS uses point-to-point communications for most data transfers
The most time-consuming MPI calls are MPI_Recv (55%), MPI_Waitany (23%), and MPI_Allreduce (13%)
MPP Mode 28 Processes/Node

20 RADIOSS Performance Intel MPI Tuning (MPP)
Tuning the Intel MPI collective algorithms can improve performance
The MPI profile shows about 20% of runtime is spent in MPI_Allreduce communications
The default MPI_Allreduce algorithm in Intel MPI is Recursive Doubling
The default algorithm is the best among all tested for MPP
Higher is better Intel MPI 28 Processes/Node
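A comparison like the one on this slide can be scripted by sweeping Intel MPI's `I_MPI_ADJUST_ALLREDUCE` selector. A sketch, not the study's actual harness; `run_radioss.sh` is a hypothetical placeholder for the real benchmark launch, so the mpirun line is left commented out:

```shell
#!/bin/sh
# Sweep Intel MPI's MPI_Allreduce algorithm selector and label each run.
# "run_radioss.sh" is a hypothetical placeholder for the benchmark launch.
for ALG in 1 2 3 4 5 6 7 8; do
    export I_MPI_ADJUST_ALLREDUCE="$ALG"
    echo "testing I_MPI_ADJUST_ALLREDUCE=$ALG"
    # mpirun -ppn 28 -np 448 ./run_radioss.sh   # uncomment on the cluster
done
```

Timing each iteration (e.g. with `time`) and keeping the fastest value is how a per-application collective tuning like this is normally found.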

21 RADIOSS Performance MPI Libraries (MPP)
Intel MPI and Open MPI perform similarly
The MPI profile shows ~20% of MPI time is spent in MPI_Allreduce communications
MPI collective operations (such as MPI_Allreduce) can potentially be optimized
Support for Open MPI is new in RADIOSS
HPC-X is a tuned MPI distribution based on the latest Open MPI
Higher is better 28 Processes/Node

22 RADIOSS Hybrid MPP Parallelization
Highly parallel code with multi-level parallelization: domain decomposition (MPI parallelization) and multithreading (OpenMP)
Enhanced performance: best scalability in the marketplace, high efficiency on large HPC clusters
Unique, proven method for rich scalability over thousands of cores for FEA
Flexibility: easy tuning of MPI & OpenMP
Robustness: parallel arithmetic allows perfect repeatability in parallel

23 RADIOSS Performance Hybrid MPP version
Enabling Hybrid MPP mode unlocks RADIOSS scalability
At larger scale, productivity improves as more threads are involved
As more threads are involved, the amount of communication among processes is reduced
At 32 nodes / 896 cores, the best configuration is 1 process per socket spawning 14 threads each
28 threads / 1 PPN is not advised because it breaks data locality across the CPU sockets
The following environment settings and tuned flags are used for Intel MPI:
I_MPI_PIN_DOMAIN=auto, I_MPI_ADJUST_ALLREDUCE=5, I_MPI_ADJUST_BCAST=1, KMP_AFFINITY=compact, KMP_STACKSIZE=400m, ulimit -s unlimited
32% 70% 3.7x Intel MPI EDR InfiniBand Higher is better
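The settings above can be sketched as a launch script. The environment variables are the ones listed on the slide; the `OMP_NUM_THREADS=14` export and the `radioss_run` binary name are our assumptions (the slide gives the thread count but not the solver invocation), so the mpirun command is only echoed:

```shell
#!/bin/sh
# Hybrid MPP layout from the slide:
# 32 nodes x 2 ranks (1 per socket) x 14 OpenMP threads = 896 cores.
export I_MPI_PIN_DOMAIN=auto        # pin each rank to its own CPU domain
export I_MPI_ADJUST_ALLREDUCE=5     # MPI_Allreduce: binomial gather+scatter
export I_MPI_ADJUST_BCAST=1         # MPI_Bcast algorithm selection
export KMP_AFFINITY=compact         # keep OpenMP threads on adjacent cores
export KMP_STACKSIZE=400m           # per-thread stack size
export OMP_NUM_THREADS=14           # assumed: 14 threads per rank
ulimit -s unlimited 2>/dev/null || true   # unlimited process stack

# "radioss_run" is a hypothetical placeholder for the solver invocation.
echo mpirun -ppn 2 -np 64 ./radioss_run
```

With 2 ranks per node across 32 nodes, `-np 64` follows directly from the layout the slide describes.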

24 RADIOSS Profiling MPI Comm. Time
In MPP mode, most communication time is in point-to-point and non-blocking calls: MPI_Recv, MPI_Waitany, and MPI_Allreduce are used most of the time
For HMPP, communication behavior changes: a higher percentage of time is spent in MPI_Waitany, MPI_Allreduce, and MPI_Recv
MPP, 28PPN HMPP, 2PPN / 14 Threads Intel MPI At 32 Nodes

25 RADIOSS Profiling MPI Message Sizes
The most time-consuming MPI communications for MPP are:
MPI_Recv: messages concentrated at 640B, 1KB, 320B, 1280B
MPI_Waitany: messages at 48B, 8B, 384B
MPI_Allreduce: most messages appear at 80B
MPP, 28PPN HMPP, 2PPN / 14 Threads Pure MPP 28 Processes/Node

26 RADIOSS Performance Intel MPI Tuning (DP)
For Hybrid MPP DP, tuning MPI_Allreduce shows more gain than in MPP
For the DAPL provider, algorithm #5 (Binomial gather+scatter) improved performance by 27% over the default
For the OFA provider, the tuned MPI_Allreduce algorithm improves performance by 44% over the default
Both OFA and DAPL improved by tuning I_MPI_ADJUST_ALLREDUCE=5
Flags for OFA: I_MPI_OFA_USE_XRC=1. For DAPL: the ofa-v2-mlx5_0-1u provider
27% 44% Higher is better Intel MPI 2 PPN / 14 OpenMP
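The provider-specific flags above can be combined into one environment sketch. The `I_MPI_FABRICS` values are our assumption about how the OFA and DAPL providers were selected (the slide lists only the per-provider flags), and the launch line is a hypothetical placeholder:

```shell
#!/bin/sh
# Fabric/provider tuning from the slide; both providers benefit from
# the same MPI_Allreduce algorithm selection.
export I_MPI_ADJUST_ALLREDUCE=5

# Option A: OFA provider with XRC transport enabled
export I_MPI_FABRICS=shm:ofa     # assumed fabric selection
export I_MPI_OFA_USE_XRC=1

# Option B: DAPL provider (comment out A and uncomment to switch)
# export I_MPI_FABRICS=shm:dapl
# export I_MPI_DAPL_PROVIDER=ofa-v2-mlx5_0-1u

echo mpirun -ppn 2 -np 64 ./radioss_run   # hypothetical placeholder launch
```

Switching between the two options while keeping `I_MPI_ADJUST_ALLREDUCE=5` fixed reproduces the comparison the slide reports.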

27 RADIOSS Performance Interconnect (HMPP)
EDR InfiniBand provides better scalability performance than Ethernet
214% better performance than 1GbE at 16 nodes
104% better performance than 10GbE at 16 nodes
InfiniBand typically outperforms other interconnects in collective operations
214% 104% Higher is better Intel MPI 2 PPN / 14 OpenMP

28 RADIOSS Performance Interconnect (HMPP) EDR InfiniBand provides better scalability performance than FDR IB EDR IB outperforms FDR IB by 27% at 32 nodes Improvement for EDR InfiniBand occurs at high node count 27% Higher is better Intel MPI 2 PPN / 14 OpenMP

29 RADIOSS Performance Floating Point Precision
Single-precision jobs run faster than double precision: SP provides a 47% speedup over DP
Similar scalability is seen for the double-precision tests
47% Higher is better Intel MPI 2 PPN / 14 OpenMP

30 RADIOSS Performance CPU Frequency
Increasing CPU core frequency enables higher job efficiency
18% performance jump from 2.3GHz to 2.6GHz (a 13% increase in clock speed)
29% performance jump from 2.0GHz to 2.6GHz (a 30% increase in clock speed)
The performance gain can exceed the increase in CPU frequency
CPU-bound applications see a higher benefit from CPUs with higher frequencies
29% 18% Higher is better Intel MPI 2 PPN / 14 OpenMP
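The clock-speed ratios quoted on this slide are easy to verify. A small shell check comparing the raw frequency increase against the observed speedups:

```shell
#!/bin/sh
# Compare raw clock-speed increases with the observed speedups on the slide.
gain1=$(awk 'BEGIN { printf "%.0f", (2.6 / 2.3 - 1) * 100 }')
echo "2.3 -> 2.6 GHz: ${gain1}% clock increase (observed speedup: 18%)"

gain2=$(awk 'BEGIN { printf "%.0f", (2.6 / 2.0 - 1) * 100 }')
echo "2.0 -> 2.6 GHz: ${gain2}% clock increase (observed speedup: 29%)"
```

The first step (18% gain on a 13% clock increase) is where the gain exceeds the frequency ratio; the second (29% on 30%) tracks it almost exactly, consistent with a largely CPU-bound workload.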

31 RADIOSS Performance System Generations
The Intel E5-2680v3 (Haswell) cluster outperforms prior generations
Performs faster by 100% vs Jupiter, and by 238% vs Janus, at 16 nodes
System components used:
Thor: 2-socket Intel Haswell, 2133MHz DIMMs, EDR IB, v13.0
Jupiter: 2-socket Intel Sandy Bridge, 1600MHz DIMMs, FDR IB, v12.0
Janus: 2-socket Intel Westmere, 1333MHz DIMMs, QDR IB
238% 100% Single Precision

32 RADIOSS Profiling Memory Required Differences in memory consumption between MPP and HMPP MPP: Memory required to run the workload is ~5GB per node HMPP: Approximately 400MB per node is needed (as there are 2 PPN) Considered as a small workload but good enough to observe application behavior MPP, 28PPN HMPP, 2PPN / 14 Threads At 32 Nodes 28 CPU Cores/Node

33 RADIOSS Summary
RADIOSS is designed to perform at large scale in HPC environments
Shows excellent scalability over 896 cores / 32 nodes and beyond with Hybrid MPP
The Hybrid MPP version enhances RADIOSS scalability: 2 MPI processes per node (1 per socket), 14 threads each
Additional CPU cores generally accelerate time to solution
The Intel E5-2680v3 (Haswell) cluster outperforms prior generations: faster by 100% vs Sandy Bridge, by 238% vs Westmere at 16 nodes
Network and MPI tuning:
EDR InfiniBand outperforms Ethernet-based interconnects in scalability
EDR InfiniBand delivers higher scalability performance than FDR and QDR IB
Tuning environment parameters is important to maximize performance
Tuning MPI collective ops helps RADIOSS achieve even better scalability

34 OptiStruct by Altair
Altair OptiStruct is an industry-proven, modern structural analysis solver
Solves linear and non-linear structural problems under static and dynamic loadings
Market-leading solution for structural design and optimization
Helps designers and engineers analyze and optimize structures
Optimizes for strength, durability and NVH (Noise, Vibration, Harshness) characteristics
Helps rapidly develop innovative, lightweight and structurally efficient designs
Based on finite-element and multi-body dynamics technology

35 Test Cluster Configuration
Dell PowerEdge R730 32-node (896-core) Thor cluster
Dual-socket 14-core Intel 2.60 GHz CPUs (Turbo on, max perf set in BIOS)
OS: RHEL 6.5, MLNX_OFED_LINUX InfiniBand SW stack
Memory: 64GB per node, DDR4 2133MHz DIMMs
Hard drives: 1TB 7,200 RPM SATA 2.5-inch
Mellanox Switch-IB EDR 100Gb/s InfiniBand switch
Mellanox SwitchX FDR 56Gb/s InfiniBand VPI switch
Mellanox ConnectX-4 EDR 100Gb/s InfiniBand VPI adapters
Mellanox ConnectX-3 40/56Gb/s QDR/FDR InfiniBand VPI adapters
MPI: Intel MPI
Application: Altair OptiStruct 13.0
Benchmark dataset: Engine Assembly

36 Model Engine Block
Grids (Structural): Elements: Local Coordinate Systems: 32
Degrees of Freedom: 4,721,581
Non-zero Stiffness Terms: ,216
Pretension sections: 10
Nonlinear run, small deformation: 3 subcases (reduced to 1 subcase), 27 iterations in total
Memory requirement: 96 GB per node for a 1-MPI-process job (in-core); 16GB per node for an 8-MPI-process job; 8GB per node for a 24-MPI-process job

37 PowerEdge R730
Massive flexibility for data-intensive operations: performance and efficiency
Intelligent hardware-driven systems management with extensive power management features
Innovative tools including automation for parts replacement and lifecycle manageability
Broad choice of networking technologies from GbE to IB
Built-in redundancy with hot-plug and swappable PSUs, HDDs and fans
Benefits: designed for performance workloads from big data analytics, distributed storage or distributed computing where local storage is key, to classic HPC and large-scale hosting environments; high-performance scale-out compute and low-cost dense storage in one package
Hardware capabilities: flexible compute platform with dense storage capacity; 2S/2U server, 6 PCIe slots; large memory footprint (up to 768GB / 24 DIMMs); high I/O performance and optional storage configurations
HDD options: 12 x 3.5-inch or 24 x 2.5-inch HDDs, plus 2 x 2.5-inch HDDs in the rear of the server; up to 26 HDDs with 2 hot-plug drives in the rear for boot or scratch

38 OptiStruct Performance CPU Cores
Running more cores per node generally improves overall performance
The -nproc parameter specifies the number of threads spawned per MPI process
Guideline: 6 threads per MPI process yields the best performance (with either 2 or 4 PPN); this configuration performed best among all tested
Higher is better

39 OptiStruct Performance Interconnect
EDR InfiniBand provides superior scalability performance over Ethernet
11 times better performance than 1GbE at 24 nodes
90% better performance than 10GbE at 24 nodes
Ethernet solutions do not scale beyond 4 nodes
11x 90% Higher is better 2 PPN / 6 Threads

40 OptiStruct Profiling Number of MPI Calls
For 1GbE, communication time is mostly spent on point-to-point transfers; MPI_Iprobe and MPI_Test are the tests for non-blocking transfers. Overall runtime is significantly longer compared to faster interconnects
For 10GbE, communication time is consumed by data transfer, and the amount of time for non-blocking transfers is still significant. Overall runtime is reduced compared to 1GbE; while time for data transfer is reduced, collective operations take a higher share of the overall time
For InfiniBand, time consumed by MPI_Allreduce is more significant compared to data transfer, and overall runtime is reduced significantly compared to Ethernet
1GbE 10GbE EDR IB

42 OptiStruct Performance Interconnect
EDR IB delivers superior scalability performance over previous InfiniBand generations
EDR InfiniBand improves over FDR IB by 40% at 24 nodes
EDR InfiniBand outperforms FDR InfiniBand by 9% at 16 nodes
The new EDR IB architecture supersedes the previous FDR IB generation in scalability
9% 40% Higher is better 4 PPN / 6 Threads

43 OptiStruct Performance Processes Per Node
OptiStruct reduces communication by deploying a hybrid MPI mode
Each hybrid MPI process can spawn threads, which helps reduce communication on the network
Enabling more MPI processes per node helps unlock additional performance
The following environment settings and tuned flags are used:
I_MPI_PIN_DOMAIN=auto, I_MPI_ADJUST_ALLREDUCE=2, I_MPI_ADJUST_BCAST=1, I_MPI_ADJUST_REDUCE=2, ulimit -s unlimited
10% 4% Higher is better
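The settings listed here can be sketched as a launch script. The environment variables and the `-nproc` thread count come from these slides; the `optistruct_run` binary name and the 24-node rank count are our assumptions, so the launch is only echoed:

```shell
#!/bin/sh
# OptiStruct hybrid launch sketch using the settings from this slide.
export I_MPI_PIN_DOMAIN=auto
export I_MPI_ADJUST_ALLREDUCE=2   # Rabenseifner's algorithm
export I_MPI_ADJUST_BCAST=1
export I_MPI_ADJUST_REDUCE=2
ulimit -s unlimited 2>/dev/null || true

# Assumed layout: 24 nodes x 4 ranks/node, 6 threads (-nproc) per rank.
# "optistruct_run" is a hypothetical placeholder for the real invocation.
echo mpirun -ppn 4 -np 96 ./optistruct_run -nproc 6
```

The 4 PPN / 6 threads combination matches the per-node layout the benchmark charts use.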

44 OptiStruct Performance IMPI Tuning
Tuning the Intel MPI collective algorithms can improve performance
The MPI profile shows ~30% of runtime is spent in MPI_Allreduce communications
The default algorithm in Intel MPI is Recursive Doubling (I_MPI_ADJUST_ALLREDUCE=1)
Rabenseifner's algorithm for Allreduce appears to be the best on 24 nodes
Higher is better Intel MPI 4 PPN / 6 Threads

45 OptiStruct Profiling MPI Message Sizes
The most time-consuming MPI communications are:
MPI_Allreduce: messages concentrated at 8B
MPI_Iprobe and MPI_Test generate a high volume of calls testing for message completion
2 PPN / 6 Threads

46 OptiStruct Performance CPU Frequency
An increase in CPU clock speed allows higher job efficiency
Up to 11% higher productivity by increasing clock speed from 2300MHz to 2600MHz
Turbo Mode boosts job efficiency more than the increase in clock speed alone
Up to a 31% performance jump by enabling Turbo Mode at 2600MHz
The gain from Turbo Mode depends on environmental factors, e.g. temperature
11% 4% 8% 10% 17% 31% Higher is better 4 PPN / 6 Threads

47 OptiStruct Profiling Disk I/O
OptiStruct makes use of distributed I/O on the local scratch disks of compute nodes
Heavy disk I/O takes place throughout the run on each compute node
The high I/O usage also causes system memory to be utilized for I/O caching
Disk I/O is distributed across all compute nodes, providing higher I/O performance
The workload completes faster as more nodes take part in the distributed I/O
Higher is better 4 PPN / 6 Threads

48 OptiStruct Profiling MPI Message Sizes
The majority of data transfer takes place between rank 0 and the rest of the ranks
Non-blocking communication is used for these data transfers to hide network latency
The collective operations are much smaller in size
16 Nodes 32 Nodes 2 PPN / 6 Threads

49 OptiStruct Summary
OptiStruct is designed to perform structural analysis at large scale, using a hybrid MPI mode to perform at scale
EDR InfiniBand outperforms Ethernet in scalability: 11 times better performance than 1GbE and 90% better than 10GbE at 24 nodes
EDR InfiniBand improves over FDR IB by 40% at 24 nodes
Each hybrid MPI process can spawn threads, which helps reduce communication on the network; enabling more MPI processes per node helps unlock additional performance
The Hybrid MPP version enhances OptiStruct scalability
Profiling and tuning: CPU, I/O, network
MPI_Allreduce accounts for ~30% of runtime at scale; tuning MPI_Allreduce should allow better performance at high core counts
Guideline: 6 threads per MPI process yields the best performance
Turbo Mode boosts job efficiency more than the increase in clock speed alone
OptiStruct makes use of distributed I/O on the local scratch disks of compute nodes; heavy disk I/O takes place throughout the run on each compute node

50 Thank You Questions? Pak Lui
All trademarks are property of their respective owners. All information is provided "As-Is" without any kind of warranty. The HPC Advisory Council makes no representation to the accuracy and completeness of the information contained herein. HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein.


More information

SGI HPC Systems Help Fuel Manufacturing Rebirth

SGI HPC Systems Help Fuel Manufacturing Rebirth SGI HPC Systems Help Fuel Manufacturing Rebirth Created by T A B L E O F C O N T E N T S 1.0 Introduction 1 2.0 Ongoing Challenges 1 3.0 Meeting the Challenge 2 4.0 SGI Solution Environment and CAE Applications

More information

Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014

Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014 Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet Anand Rangaswamy September 2014 Storage Developer Conference Mellanox Overview Ticker: MLNX Leading provider of high-throughput,

More information

Data Sheet FUJITSU Server PRIMERGY CX420 S1 Out-of-the-box Dual Node Cluster Server

Data Sheet FUJITSU Server PRIMERGY CX420 S1 Out-of-the-box Dual Node Cluster Server Data Sheet FUJITSU Server PRIMERGY CX420 S1 Out-of-the-box Dual Node Cluster Server Data Sheet FUJITSU Server PRIMERGY CX420 S1 Out-of-the-box Dual Node Cluster Server High availability for lower expertise

More information

Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing

Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing Innovation Intelligence Devin Jensen August 2012 Altair Knows HPC Altair is the only company that: makes HPC tools

More information

Can High-Performance Interconnects Benefit Memcached and Hadoop?

Can High-Performance Interconnects Benefit Memcached and Hadoop? Can High-Performance Interconnects Benefit Memcached and Hadoop? D. K. Panda and Sayantan Sur Network-Based Computing Laboratory Department of Computer Science and Engineering The Ohio State University,

More information

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA WHITE PAPER April 2014 Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA Executive Summary...1 Background...2 File Systems Architecture...2 Network Architecture...3 IBM BigInsights...5

More information

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC HPC Architecture End to End Alexandre Chauvin Agenda HPC Software Stack Visualization National Scientific Center 2 Agenda HPC Software Stack Alexandre Chauvin Typical HPC Software Stack Externes LAN Typical

More information

Pedraforca: ARM + GPU prototype

Pedraforca: ARM + GPU prototype www.bsc.es Pedraforca: ARM + GPU prototype Filippo Mantovani Workshop on exascale and PRACE prototypes Barcelona, 20 May 2014 Overview Goals: Test the performance, scalability, and energy efficiency of

More information

How To Build A Supermicro Computer With A 32 Core Power Core (Powerpc) And A 32-Core (Powerpc) (Powerpowerpter) (I386) (Amd) (Microcore) (Supermicro) (

How To Build A Supermicro Computer With A 32 Core Power Core (Powerpc) And A 32-Core (Powerpc) (Powerpowerpter) (I386) (Amd) (Microcore) (Supermicro) ( TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 7 th CALL (Tier-0) Contributing sites and the corresponding computer systems for this call are: GCS@Jülich, Germany IBM Blue Gene/Q GENCI@CEA, France Bull Bullx

More information

SMB Direct for SQL Server and Private Cloud

SMB Direct for SQL Server and Private Cloud SMB Direct for SQL Server and Private Cloud Increased Performance, Higher Scalability and Extreme Resiliency June, 2014 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server

More information

ALPS Supercomputing System A Scalable Supercomputer with Flexible Services

ALPS Supercomputing System A Scalable Supercomputer with Flexible Services ALPS Supercomputing System A Scalable Supercomputer with Flexible Services 1 Abstract Supercomputing is moving from the realm of abstract to mainstream with more and more applications and research being

More information

Introduction. Need for ever-increasing storage scalability. Arista and Panasas provide a unique Cloud Storage solution

Introduction. Need for ever-increasing storage scalability. Arista and Panasas provide a unique Cloud Storage solution Arista 10 Gigabit Ethernet Switch Lab-Tested with Panasas ActiveStor Parallel Storage System Delivers Best Results for High-Performance and Low Latency for Scale-Out Cloud Storage Applications Introduction

More information

SAS Business Analytics. Base SAS for SAS 9.2

SAS Business Analytics. Base SAS for SAS 9.2 Performance & Scalability of SAS Business Analytics on an NEC Express5800/A1080a (Intel Xeon 7500 series-based Platform) using Red Hat Enterprise Linux 5 SAS Business Analytics Base SAS for SAS 9.2 Red

More information

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure White Paper Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure White Paper March 2014 2014 Cisco and/or its affiliates. All rights reserved. This

More information

White Paper Solarflare High-Performance Computing (HPC) Applications

White Paper Solarflare High-Performance Computing (HPC) Applications Solarflare High-Performance Computing (HPC) Applications 10G Ethernet: Now Ready for Low-Latency HPC Applications Solarflare extends the benefits of its low-latency, high-bandwidth 10GbE server adapters

More information

Building a Top500-class Supercomputing Cluster at LNS-BUAP

Building a Top500-class Supercomputing Cluster at LNS-BUAP Building a Top500-class Supercomputing Cluster at LNS-BUAP Dr. José Luis Ricardo Chávez Dr. Humberto Salazar Ibargüen Dr. Enrique Varela Carlos Laboratorio Nacional de Supercómputo Benemérita Universidad

More information

HP reference configuration for entry-level SAS Grid Manager solutions

HP reference configuration for entry-level SAS Grid Manager solutions HP reference configuration for entry-level SAS Grid Manager solutions Up to 864 simultaneous SAS jobs and more than 3 GB/s I/O throughput Technical white paper Table of contents Executive summary... 2

More information

A Study on the Scalability of Hybrid LS-DYNA on Multicore Architectures

A Study on the Scalability of Hybrid LS-DYNA on Multicore Architectures 11 th International LS-DYNA Users Conference Computing Technology A Study on the Scalability of Hybrid LS-DYNA on Multicore Architectures Yih-Yih Lin Hewlett-Packard Company Abstract In this paper, the

More information

Purchase of High Performance Computing (HPC) Central Compute Resources by Northwestern Researchers

Purchase of High Performance Computing (HPC) Central Compute Resources by Northwestern Researchers Information Technology Purchase of High Performance Computing (HPC) Central Compute Resources by Northwestern Researchers Effective for FY2016 Purpose This document summarizes High Performance Computing

More information

HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK

HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK Steve Oberlin CTO, Accelerated Computing US to Build Two Flagship Supercomputers SUMMIT SIERRA Partnership for Science 100-300 PFLOPS Peak Performance

More information

Data Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket server node for PRIMERGY CX420 cluster server

Data Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket server node for PRIMERGY CX420 cluster server Data Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket node for PRIMERGY CX420 cluster Data Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket node for PRIMERGY CX420 cluster Strong Performance and Cluster

More information

IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads

IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads 89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report

More information

SUN ORACLE EXADATA STORAGE SERVER

SUN ORACLE EXADATA STORAGE SERVER SUN ORACLE EXADATA STORAGE SERVER KEY FEATURES AND BENEFITS FEATURES 12 x 3.5 inch SAS or SATA disks 384 GB of Exadata Smart Flash Cache 2 Intel 2.53 Ghz quad-core processors 24 GB memory Dual InfiniBand

More information

Microsoft Private Cloud Fast Track Reference Architecture

Microsoft Private Cloud Fast Track Reference Architecture Microsoft Private Cloud Fast Track Reference Architecture Microsoft Private Cloud Fast Track is a reference architecture designed to help build private clouds by combining Microsoft software with NEC s

More information

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes Anthony Kenisky, VP of North America Sales About Appro Over 20 Years of Experience 1991 2000 OEM Server Manufacturer 2001-2007

More information

Crossing the Performance Chasm with OpenPOWER

Crossing the Performance Chasm with OpenPOWER Crossing the Performance Chasm with OpenPOWER Dr. Srini Chari Cabot Partners/IBM chari@cabotpartners.com #OpenPOWERSummit Join the conversation at #OpenPOWERSummit 1 Disclosure Copyright 215. Cabot Partners

More information

Hyperscale. The new frontier for HPC. Philippe Trautmann. HPC/POD Sales Manager EMEA March 13th, 2011

Hyperscale. The new frontier for HPC. Philippe Trautmann. HPC/POD Sales Manager EMEA March 13th, 2011 Hyperscale The new frontier for HPC Philippe Trautmann HPC/POD Sales Manager EMEA March 13th, 2011 Hyperscale the new frontier for HPC New HPC customer requirements demand a shift in technology and market

More information

Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820

Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820 Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820 This white paper discusses the SQL server workload consolidation capabilities of Dell PowerEdge R820 using Virtualization.

More information

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage Cisco for SAP HANA Scale-Out Solution Solution Brief December 2014 With Intelligent Intel Xeon Processors Highlights Scale SAP HANA on Demand Scale-out capabilities, combined with high-performance NetApp

More information

Data Center Storage Solutions

Data Center Storage Solutions Data Center Storage Solutions Enterprise software, appliance and hardware solutions you can trust When it comes to storage, most enterprises seek the same things: predictable performance, trusted reliability

More information

Hadoop on the Gordon Data Intensive Cluster

Hadoop on the Gordon Data Intensive Cluster Hadoop on the Gordon Data Intensive Cluster Amit Majumdar, Scientific Computing Applications Mahidhar Tatineni, HPC User Services San Diego Supercomputer Center University of California San Diego Dec 18,

More information

Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers

Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers White Paper rev. 2015-11-27 2015 FlashGrid Inc. 1 www.flashgrid.io Abstract Oracle Real Application Clusters (RAC)

More information

HPC Update: Engagement Model

HPC Update: Engagement Model HPC Update: Engagement Model MIKE VILDIBILL Director, Strategic Engagements Sun Microsystems mikev@sun.com Our Strategy Building a Comprehensive HPC Portfolio that Delivers Differentiated Customer Value

More information

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM

More information

Achieving a High Performance OLTP Database using SQL Server and Dell PowerEdge R720 with Internal PCIe SSD Storage

Achieving a High Performance OLTP Database using SQL Server and Dell PowerEdge R720 with Internal PCIe SSD Storage Achieving a High Performance OLTP Database using SQL Server and Dell PowerEdge R720 with This Dell Technical White Paper discusses the OLTP performance benefit achieved on a SQL Server database using a

More information

FUJITSU Enterprise Product & Solution Facts

FUJITSU Enterprise Product & Solution Facts FUJITSU Enterprise Product & Solution Facts shaping tomorrow with you Business-Centric Data Center The way ICT delivers value is fundamentally changing. Mobile, Big Data, cloud and social media are driving

More information

CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER

CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER Tender Notice No. 3/2014-15 dated 29.12.2014 (IIT/CE/ENQ/COM/HPC/2014-15/569) Tender Submission Deadline Last date for submission of sealed bids is extended

More information

PRIMERGY server-based High Performance Computing solutions

PRIMERGY server-based High Performance Computing solutions PRIMERGY server-based High Performance Computing solutions PreSales - May 2010 - HPC Revenue OS & Processor Type Increasing standardization with shift in HPC to x86 with 70% in 2008.. HPC revenue by operating

More information

Intel Solid-State Drives Increase Productivity of Product Design and Simulation

Intel Solid-State Drives Increase Productivity of Product Design and Simulation WHITE PAPER Intel Solid-State Drives Increase Productivity of Product Design and Simulation Intel Solid-State Drives Increase Productivity of Product Design and Simulation A study of how Intel Solid-State

More information

High Throughput File Servers with SMB Direct, Using the 3 Flavors of RDMA network adapters

High Throughput File Servers with SMB Direct, Using the 3 Flavors of RDMA network adapters High Throughput File Servers with SMB Direct, Using the 3 Flavors of network adapters Jose Barreto Principal Program Manager Microsoft Corporation Abstract In Windows Server 2012, we introduce the SMB

More information

HP ProLiant DL580 Gen8 and HP LE PCIe Workload WHITE PAPER Accelerator 90TB Microsoft SQL Server Data Warehouse Fast Track Reference Architecture

HP ProLiant DL580 Gen8 and HP LE PCIe Workload WHITE PAPER Accelerator 90TB Microsoft SQL Server Data Warehouse Fast Track Reference Architecture WHITE PAPER HP ProLiant DL580 Gen8 and HP LE PCIe Workload WHITE PAPER Accelerator 90TB Microsoft SQL Server Data Warehouse Fast Track Reference Architecture Based on Microsoft SQL Server 2014 Data Warehouse

More information

HP recommended configuration for Microsoft Exchange Server 2010: HP LeftHand P4000 SAN

HP recommended configuration for Microsoft Exchange Server 2010: HP LeftHand P4000 SAN HP recommended configuration for Microsoft Exchange Server 2010: HP LeftHand P4000 SAN Table of contents Executive summary... 2 Introduction... 2 Solution criteria... 3 Hyper-V guest machine configurations...

More information

The Evolution of Microsoft SQL Server: The right time for Violin flash Memory Arrays

The Evolution of Microsoft SQL Server: The right time for Violin flash Memory Arrays The Evolution of Microsoft SQL Server: The right time for Violin flash Memory Arrays Executive Summary Microsoft SQL has evolved beyond serving simple workgroups to a platform delivering sophisticated

More information

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis White Paper Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis White Paper March 2014 2014 Cisco and/or its affiliates. All rights reserved. This document

More information

4 th Workshop on Big Data Benchmarking

4 th Workshop on Big Data Benchmarking 4 th Workshop on Big Data Benchmarking MPP SQL Engines: architectural choices and their implications on benchmarking 09 Oct 2013 Agenda: Big Data Landscape Market Requirements Benchmark Parameters Benchmark

More information

BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH MELLANOX SWITCHX

BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH MELLANOX SWITCHX White Paper BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH Abstract This white paper explains how to configure a Mellanox SwitchX Series switch to bridge the external network of an EMC Isilon

More information

Advancing Applications Performance With InfiniBand

Advancing Applications Performance With InfiniBand Advancing Applications Performance With InfiniBand Pak Lui, Application Performance Manager September 12, 2013 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server and

More information

Cloud Computing through Virtualization and HPC technologies

Cloud Computing through Virtualization and HPC technologies Cloud Computing through Virtualization and HPC technologies William Lu, Ph.D. 1 Agenda Cloud Computing & HPC A Case of HPC Implementation Application Performance in VM Summary 2 Cloud Computing & HPC HPC

More information

præsentation oktober 2011

præsentation oktober 2011 Johnny Olesen System X presale præsentation oktober 2011 2010 IBM Corporation 2 Hvem er jeg Dagens agenda Server overview System Director 3 4 Portfolio-wide Innovation with IBM System x and BladeCenter

More information

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France

More information

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

GPU System Architecture. Alan Gray EPCC The University of Edinburgh GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems

More information

HPC Growing Pains. Lessons learned from building a Top500 supercomputer

HPC Growing Pains. Lessons learned from building a Top500 supercomputer HPC Growing Pains Lessons learned from building a Top500 supercomputer John L. Wofford Center for Computational Biology & Bioinformatics Columbia University I. What is C2B2? Outline Lessons learned from

More information

Self service for software development tools

Self service for software development tools Self service for software development tools Michal Husejko, behalf of colleagues in CERN IT/PES CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Self service for software development tools

More information

Intel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR Infiniband

Intel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR Infiniband Intel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR Infiniband A P P R O I N T E R N A T I O N A L I N C Steve Lyness Vice President, HPC Solutions Engineering slyness@appro.com Company Overview

More information

InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures. Brian Sparks IBTA Marketing Working Group Co-Chair

InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures. Brian Sparks IBTA Marketing Working Group Co-Chair InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures Brian Sparks IBTA Marketing Working Group Co-Chair Page 1 IBTA & OFA Update IBTA today has over 50 members; OFA

More information

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING ACCELERATING COMMERCIAL LINEAR DYNAMIC AND Vladimir Belsky Director of Solver Development* Luis Crivelli Director of Solver Development* Matt Dunbar Chief Architect* Mikhail Belyi Development Group Manager*

More information

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays Database Solutions Engineering By Murali Krishnan.K Dell Product Group October 2009

More information

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud February 25, 2014 1 Agenda v Mapping clients needs to cloud technologies v Addressing your pain

More information

Sun Constellation System: The Open Petascale Computing Architecture

Sun Constellation System: The Open Petascale Computing Architecture CAS2K7 13 September, 2007 Sun Constellation System: The Open Petascale Computing Architecture John Fragalla Senior HPC Technical Specialist Global Systems Practice Sun Microsystems, Inc. 25 Years of Technical

More information

Benchmarking Hadoop & HBase on Violin

Benchmarking Hadoop & HBase on Violin Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages

More information

Clusters: Mainstream Technology for CAE

Clusters: Mainstream Technology for CAE Clusters: Mainstream Technology for CAE Alanna Dwyer HPC Division, HP Linux and Clusters Sparked a Revolution in High Performance Computing! Supercomputing performance now affordable and accessible Linux

More information

RoCE vs. iwarp Competitive Analysis

RoCE vs. iwarp Competitive Analysis WHITE PAPER August 21 RoCE vs. iwarp Competitive Analysis Executive Summary...1 RoCE s Advantages over iwarp...1 Performance and Benchmark Examples...3 Best Performance for Virtualization...4 Summary...

More information

Choosing the Best Network Interface Card Mellanox ConnectX -3 Pro EN vs. Intel X520

Choosing the Best Network Interface Card Mellanox ConnectX -3 Pro EN vs. Intel X520 COMPETITIVE BRIEF August 2014 Choosing the Best Network Interface Card Mellanox ConnectX -3 Pro EN vs. Intel X520 Introduction: How to Choose a Network Interface Card...1 Comparison: Mellanox ConnectX

More information

Deep Learning GPU-Based Hardware Platform

Deep Learning GPU-Based Hardware Platform Deep Learning GPU-Based Hardware Platform Hardware and Software Criteria and Selection Mourad Bouache Yahoo! Performance Engineering Group Sunnyvale, CA +1.408.784.1446 bouache@yahoo-inc.com John Glover

More information

Commoditisation of the High-End Research Storage Market with the Dell MD3460 & Intel Enterprise Edition Lustre

Commoditisation of the High-End Research Storage Market with the Dell MD3460 & Intel Enterprise Edition Lustre Commoditisation of the High-End Research Storage Market with the Dell MD3460 & Intel Enterprise Edition Lustre University of Cambridge, UIS, HPC Service Authors: Wojciech Turek, Paul Calleja, John Taylor

More information

APACHE HADOOP PLATFORM HARDWARE INFRASTRUCTURE SOLUTIONS

APACHE HADOOP PLATFORM HARDWARE INFRASTRUCTURE SOLUTIONS APACHE HADOOP PLATFORM BIG DATA HARDWARE INFRASTRUCTURE SOLUTIONS 1 BIG DATA. BIG CHALLENGES. BIG OPPORTUNITY. How do you manage the VOLUME, VELOCITY & VARIABILITY of complex data streams in order to find

More information

Kriterien für ein PetaFlop System

Kriterien für ein PetaFlop System Kriterien für ein PetaFlop System Rainer Keller, HLRS :: :: :: Context: Organizational HLRS is one of the three national supercomputing centers in Germany. The national supercomputing centers are working

More information

PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation

PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation 1. Overview of NEC PCIe SSD Appliance for Microsoft SQL Server Page 2 NEC Corporation

More information

Dell Reference Configuration for Hortonworks Data Platform

Dell Reference Configuration for Hortonworks Data Platform Dell Reference Configuration for Hortonworks Data Platform A Quick Reference Configuration Guide Armando Acosta Hadoop Product Manager Dell Revolutionary Cloud and Big Data Group Kris Applegate Solution

More information

Building All-Flash Software Defined Storages for Datacenters. Ji Hyuck Yun (dr.jhyun@sk.com) Storage Tech. Lab SK Telecom

Building All-Flash Software Defined Storages for Datacenters. Ji Hyuck Yun (dr.jhyun@sk.com) Storage Tech. Lab SK Telecom Building All-Flash Software Defined Storages for Datacenters Ji Hyuck Yun (dr.jhyun@sk.com) Storage Tech. Lab SK Telecom Introduction R&D Motivation Synergy between SK Telecom and SK Hynix Service & Solution

More information

PowerEdge Servers and Solutions for Business Applications

PowerEdge Servers and Solutions for Business Applications PowerEdge s and Solutions for Business s PowerEdge servers and solutions for business At Dell, we listen to you every day, and what you have been telling us is that your infrastructures are being pushed

More information

The MAX5 Advantage: Clients Benefit running Microsoft SQL Server Data Warehouse (Workloads) on IBM BladeCenter HX5 with IBM MAX5.

The MAX5 Advantage: Clients Benefit running Microsoft SQL Server Data Warehouse (Workloads) on IBM BladeCenter HX5 with IBM MAX5. Performance benefit of MAX5 for databases The MAX5 Advantage: Clients Benefit running Microsoft SQL Server Data Warehouse (Workloads) on IBM BladeCenter HX5 with IBM MAX5 Vinay Kulkarni Kent Swalin IBM

More information

Current Status of FEFS for the K computer

Current Status of FEFS for the K computer Current Status of FEFS for the K computer Shinji Sumimoto Fujitsu Limited Apr.24 2012 LUG2012@Austin Outline RIKEN and Fujitsu are jointly developing the K computer * Development continues with system

More information

Dell s SAP HANA Appliance

Dell s SAP HANA Appliance Dell s SAP HANA Appliance SAP HANA is the next generation of SAP in-memory computing technology. Dell and SAP have partnered to deliver an SAP HANA appliance that provides multipurpose, data source-agnostic,

More information

Oracle Database Reliability, Performance and scalability on Intel Xeon platforms Mitch Shults, Intel Corporation October 2011

Oracle Database Reliability, Performance and scalability on Intel Xeon platforms Mitch Shults, Intel Corporation October 2011 Oracle Database Reliability, Performance and scalability on Intel platforms Mitch Shults, Intel Corporation October 2011 1 Intel Processor E7-8800/4800/2800 Product Families Up to 10 s and 20 Threads 30MB

More information