Performance Evaluation, Scalability Analysis, and Optimization Tuning of HyperWorks Solvers on a Modern HPC Compute Cluster
1 Performance Evaluation, Scalability Analysis, and Optimization Tuning of HyperWorks Solvers on a Modern HPC Compute Cluster Pak Lui pak@hpcadvisorycouncil.com May 7, 2015
2 Agenda Introduction to HPC Advisory Council Altair RADIOSS Benchmark Configuration Performance Benchmark Testing and Results MPI Profiling Summary Altair OptiStruct Benchmark Configuration Performance Benchmark Testing and Results MPI Profiling Summary Q&A / For More Information
3 The HPC Advisory Council World-wide HPC organization (400+ members) Bridges the gap between HPC usage and its full potential Provides best practices and a support/development center Explores future technologies and future developments Working Groups HPC Cloud, HPC Scale, HPC GPU, HPC Storage Leading edge solutions and technology demonstrations
4 Copyright 2015 Altair Engineering, Inc. Proprietary and Confidential. All rights reserved. HPC Advisory Council Members
5 HPC Council Board HPC Advisory Council Chairman Gilad Shainer - gilad@hpcadvisorycouncil.com HPC Advisory Council HPC Scale SIG Chair Richard Graham richard@hpcadvisorycouncil.com HPC Advisory Council Media Relations and Events Director HPC Advisory Council HPC Cloud SIG Chair Brian Sparks - brian@hpcadvisorycouncil.com William Lu william@hpcadvisorycouncil.com HPC Advisory Council China Events Manager Blade Meng - blade@hpcadvisorycouncil.com HPC Advisory Council HPC GPU SIG Chair Sadaf Alam sadaf@hpcadvisorycouncil.com Director of the HPC Advisory Council, Asia Tong Liu - tong@hpcadvisorycouncil.com HPC Advisory Council India Outreach Goldi Misra goldi@hpcadvisorycouncil.com HPC Advisory Council HPC Works SIG Chair and Cluster Center Manager Pak Lui - pak@hpcadvisorycouncil.com Director of the HPC Advisory Council Switzerland Center of Excellence and HPC Storage SIG Chair Hussein Harake hussein@hpcadvisorycouncil.com HPC Advisory Council Director of Educational Outreach Scot Schultz scot@hpcadvisorycouncil.com HPC Advisory Council Workshop Program Director Eric Lantz eric@hpcadvisorycouncil.com HPC Advisory Council Programming Advisor Tarick Bedeir - Tarick@hpcadvisorycouncil.com HPC Advisory Council Research Steering Committee Director Cydney Stevens - cydney@hpcadvisorycouncil.com
6 HPC Advisory Council HPC Center InfiniBand-based Storage (Lustre) Clusters: Juniper (640 cores), Heimdall (Lustre FS, 280 cores), Plutus (192 cores), Janus (456 cores), Athena (80 cores), Vesta (704 cores), Thor (896 cores), Mala (16 GPUs)
7 Special Interest Subgroups Missions HPC Scale To explore usage of commodity HPC as a replacement for multi-million dollar mainframes and proprietary based supercomputers with networks and clusters of microcomputers acting in unison to deliver high-end computing services. HPC Cloud To explore usage of HPC components as part of the creation of external/public/internal/private cloud computing environments. HPC Works To provide best practices for building balanced and scalable HPC systems, performance tuning and application guidelines. HPC Storage To demonstrate how to build high-performance storage solutions and their effect on application performance and productivity. One of the main interests of the HPC Storage subgroup is to explore Lustre-based solutions, and to expose more users to the potential of Lustre over high-speed networks. HPC GPU To explore usage models of GPU components as part of next generation compute environments and potential optimizations for GPU based computing. HPC FSI To explore the usage of high-performance computing solutions for low latency trading, more productive simulations (such as Monte Carlo) and overall more efficient financial services.
8 HPC Advisory Council HPC Advisory Council (HPCAC) 400+ members Application best practices, case studies (Over 150) Benchmarking center with remote access for users World-wide workshops Value add for your customers to stay up to date and in tune to HPC market 2015 Workshops USA (Stanford University) February 2015 Switzerland March 2015 Brazil August 2015 Spain September 2015 China (HPC China) Oct 2015 For more information
9 2015 ISC High Performance Conference Student Cluster Competition University-based teams to compete and demonstrate the incredible capabilities of state-of- the-art HPC systems and applications on the 2015 ISC High Performance Conference show-floor The Student Cluster Competition is designed to introduce the next generation of students to the high performance computing world and community
10 RADIOSS Performance Study Research performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute resource - HPC Advisory Council Cluster Center Objectives Give overview of RADIOSS performance Compare different MPI libraries, network interconnects and others Understand RADIOSS communication patterns Provide best practices to increase RADIOSS productivity
11 About RADIOSS Compute-intensive simulation software for Manufacturing For 20+ years an established standard for automotive crash and impact Differentiated by its high scalability, quality and robustness Supports multiphysics simulation and advanced materials Used across all industries to improve safety and manufacturability Companies use RADIOSS to simulate real-world scenarios (crash tests, climate effects, etc.) to test the performance of a product
12 Test Cluster Configuration Dell PowerEdge R730 32-node (896-core) Thor cluster Dual-Socket 14-core Intel E5-2680 v3 2.60 GHz CPUs (Turbo on, max perf set in BIOS) OS: RHEL 6.5, MLNX_OFED_LINUX InfiniBand SW stack Memory: 64GB per node, DDR4 2133 MHz Hard Drives: 1TB 7.2K RPM SATA 2.5" Mellanox Switch-IB 100Gb/s EDR InfiniBand switch Mellanox ConnectX-4 EDR 100Gb/s InfiniBand VPI adapters Mellanox ConnectX-3 40/56Gb/s QDR/FDR InfiniBand VPI adapters Mellanox SwitchX 56Gb/s FDR InfiniBand VPI switch MPI: Intel MPI 5.0.2, Open MPI 1.8.4, Mellanox HPC-X v1.2.0 Application: Altair RADIOSS 13.0 Benchmark dataset: Neon, 1 million elements (8ms, Double Precision), unless otherwise stated
13 About Intel Cluster Ready Intel Cluster Ready systems make it practical to use a cluster to increase your simulation and modeling productivity Simplifies selection, deployment, and operation of a cluster A single architecture platform supported by many OEMs, ISVs, cluster provisioning vendors, and interconnect providers Focus on your work productivity, spend less management time on the cluster Select Intel Cluster Ready Where the cluster is delivered ready to run Hardware and software are integrated and configured together Applications are registered, validating execution on the Intel Cluster Ready architecture Includes Intel Cluster Checker tool, to verify functionality and periodically check cluster health RADIOSS is Intel Cluster Ready
14 PowerEdge R730 Massive flexibility for data intensive operations Performance and efficiency Intelligent hardware-driven systems management with extensive power management features Innovative tools including automation for parts replacement and lifecycle manageability Broad choice of networking technologies from GbE to IB Built-in redundancy with hot-plug and swappable PSUs, HDDs and fans Benefits Designed for performance workloads from big data analytics, distributed storage or distributed computing where local storage is key to classic HPC and large-scale hosting environments High performance scale-out compute and low cost dense storage in one package Hardware Capabilities Flexible compute platform with dense storage capacity 2S/2U server, 6 PCIe slots Large memory footprint (Up to 768GB / 24 DIMMs) High I/O performance and optional storage configurations HDD options: 12 x 3.5" or 24 x 2.5" HDDs; up to 26 HDDs with 2 hot-plug drives in rear of server for boot or scratch
15 RADIOSS Performance Interconnect (MPP) EDR InfiniBand provides better scalability performance than Ethernet 70 times better performance than 1GbE at 16 nodes / 448 cores 4.8x better performance than 10GbE at 16 nodes / 448 cores Ethernet solutions do not scale beyond 4 nodes with pure MPI 70x 4.8x Higher is better Intel MPI 28 Processes/Node
16 RADIOSS Performance Interconnect (MPP) EDR InfiniBand provides better scalability performance EDR InfiniBand improves over QDR IB by 28% at 16 nodes / 448 cores Similarly, EDR InfiniBand outperforms FDR InfiniBand by 25% at 16 nodes 28% 25% Higher is better Intel MPI 28 Processes/Node
17 RADIOSS Performance CPU Cores Running more cores per node generally improves overall performance Saw an improvement of 18% from 20 to 28 cores per node at 8 nodes Guideline: the most optimal workload distribution is ~4000 elements per process For the test case of 1 million elements, the optimal job size is ~256 cores 4000 elements per process provides sufficient workload for each process Hybrid MPP (HMPP) provides a way to achieve additional scalability on more CPUs 18% 6% Higher is better Intel MPI
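The sizing guideline above reduces to a quick calculation; a minimal sketch using the study's numbers (1 million elements for the Neon model, ~4000 elements per MPI process):

```shell
# Estimate the MPI process count from the RADIOSS guideline of
# ~4000 elements per MPI process (figures from this study).
ELEMENTS=1000000          # Neon benchmark: 1 million elements
ELEMENTS_PER_PROC=4000    # guideline: optimal workload per process
NPROCS=$(( ELEMENTS / ELEMENTS_PER_PROC ))
echo "suggested MPI processes: ${NPROCS}"
```

This yields 250 processes, consistent with the ~256-core sweet spot quoted above.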
18 RADIOSS Performance Simulation Time Increasing simulation time increases the run time at a faster rate Extending an 8ms simulation to 80ms can result in a much longer runtime A 10x longer simulation can result in a 14x increase in runtime Contacts usually become more severe in the middle of the run, adding complexity and CPU utilization, so the cost per cycle increases 13x 14x 14x Higher is better Intel MPI
19 RADIOSS Profiling % Time Spent on MPI RADIOSS utilizes point-to-point communications in most data transfers The most time-consuming MPI calls are MPI_Recv (55%), MPI_Waitany (23%), and MPI_Allreduce (13%) MPP Mode 28 Processes/Node
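A per-call MPI profile like this can be gathered with Intel MPI's built-in statistics facility; a minimal sketch, assuming Intel MPI 5.x (the launch line, binary name, and input deck are placeholders):

```shell
# Enable Intel MPI's native statistics before launching the solver;
# with I_MPI_STATS=ipm an IPM-style summary (time per MPI call,
# message-size bins) is written after the run completes.
export I_MPI_STATS=ipm
# Illustrative launch line only (placeholder binary and deck names):
echo "mpirun -np 448 -ppn 28 radioss_engine NEON1M_0001.rad"
```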
20 RADIOSS Performance Intel MPI Tuning (MPP) Tuning Intel MPI collective algorithm can improve performance MPI profile shows about 20% of runtime spent on MPI_Allreduce communications Default algorithm in Intel MPI is Recursive Doubling The default algorithm is the best among all tested for MPP Higher is better Intel MPI 28 Processes/Node
21 RADIOSS Performance MPI Libraries (MPP) Both Intel MPI and Open MPI perform similarly The MPI profile shows ~20% of MPI time spent on MPI_Allreduce communications MPI collective operations (such as MPI_Allreduce) can potentially be optimized Support for Open MPI is new in RADIOSS HPC-X is a tuned MPI distribution based on the latest Open MPI Higher is better 28 Processes/Node
22 RADIOSS Hybrid MPP Parallelization Highly parallel code Multi-level parallelization Domain decomposition MPI parallelization Multithreading OpenMP Enhanced performance Best scalability in the marketplace High efficiency on large HPC clusters Unique, proven method for rich scalability over thousands of cores for FEA Flexibility -- easy tuning of MPI & OpenMP Robustness -- parallel arithmetic allows perfect repeatability in parallel
23 RADIOSS Performance Hybrid MPP version Enabling Hybrid MPP mode unlocks RADIOSS scalability At larger scale, productivity improves as more threads are involved As more threads are involved, the amount of communication by processes is reduced At 32 nodes/896 cores, the best configuration is 1 process per socket spawning 14 threads each 28 threads/1 PPN is not advised as it breaks data locality across CPU sockets The following environment settings and tuned flags are used for Intel MPI: I_MPI_PIN_DOMAIN auto, I_MPI_ADJUST_ALLREDUCE 5, I_MPI_ADJUST_BCAST 1, KMP_AFFINITY compact, KMP_STACKSIZE 400m, ulimit -s unlimited 32% 70% 3.7x Intel MPI EDR InfiniBand Higher is better
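Put together, a Hybrid MPP launch with these settings might look like the following sketch; the solver binary and input deck names are placeholders, the environment variables are the ones listed above:

```shell
# Environment settings from this study for Hybrid MPP RADIOSS with Intel MPI.
export I_MPI_PIN_DOMAIN=auto
export I_MPI_ADJUST_ALLREDUCE=5    # tuned Allreduce algorithm
export I_MPI_ADJUST_BCAST=1
export KMP_AFFINITY=compact
export KMP_STACKSIZE=400m
# plus: ulimit -s unlimited (raise the stack limit before the run)

NODES=32; PPN=2; THREADS=14        # 1 MPI rank per socket, 14 threads each
CMD="mpirun -np $(( NODES * PPN )) -ppn ${PPN} \
  -genv OMP_NUM_THREADS ${THREADS} radioss_engine input_0001.rad"
echo "${CMD}"
```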
24 RADIOSS Profiling MPI Comm. Time MPP mode utilizes mostly non-blocking calls for communications MPI_Recv, MPI_Waitany, and MPI_Allreduce are used most of the time For HMPP, the communication behavior has changed A higher time percentage in MPI_Waitany, MPI_Allreduce, and MPI_Recv MPP, 28PPN HMPP, 2PPN / 14 Threads Intel MPI At 32 Nodes
25 RADIOSS Profiling MPI Message Sizes The most time-consuming MPI communications for MPP are: MPI_Recv: messages concentrated at 640B, 1KB, 320B, 1280B MPI_Waitany: messages at 48B, 8B, 384B MPI_Allreduce: most message sizes appear at 80B MPP, 28PPN HMPP, 2PPN / 14 Threads Pure MPP 28 Processes/Node
26 RADIOSS Performance Intel MPI Tuning (DP) For Hybrid MPP DP, tuning MPI_Allreduce shows more gain than in MPP For the DAPL provider, Binomial gather+scatter (#5) improved performance by 27% over default For the OFA provider, the tuned MPI_Allreduce algorithm improves by 44% over default Both OFA and DAPL improved by tuning I_MPI_ADJUST_ALLREDUCE=5 Flags for OFA: I_MPI_OFA_USE_XRC 1. For DAPL: ofa-v2-mlx5_0-1u provider 27% 44% Higher is better Intel MPI 2 PPN / 14 OpenMP
27 RADIOSS Performance Interconnect (HMPP) EDR InfiniBand provides better scalability performance than Ethernet 214% better performance than 1GbE at 16 nodes 104% better performance than 10GbE at 16 nodes InfiniBand typically outperforms other interconnects in collective operations 214% 104% Higher is better Intel MPI 2 PPN / 14 OpenMP
28 RADIOSS Performance Interconnect (HMPP) EDR InfiniBand provides better scalability performance than FDR IB EDR IB outperforms FDR IB by 27% at 32 nodes Improvement for EDR InfiniBand occurs at high node count 27% Higher is better Intel MPI 2 PPN / 14 OpenMP
29 RADIOSS Performance Floating Point Precision Single precision jobs run faster than double precision SP provides a 47% speedup over DP Similar scalability is seen for double precision tests 47% Higher is better Intel MPI 2 PPN / 14 OpenMP
30 RADIOSS Performance CPU Frequency Increasing CPU core frequency enables higher job efficiency 18% performance jump from 2.3GHz to 2.6GHz (a 13% increase in clock speed) 29% performance jump from 2.0GHz to 2.6GHz (a 30% increase in clock speed) The performance gain can match or exceed the increase in CPU frequency CPU-bound applications see a higher benefit from CPUs with higher frequencies 29% 18% Higher is better Intel MPI 2 PPN / 14 OpenMP
31 RADIOSS Performance System Generations Intel E5-2680v3 (Haswell) cluster outperforms prior generations Performs faster by 100% vs Jupiter, by 238% vs Janus at 16 nodes System components used: Thor: 2-socket Intel, 2133MHz DIMMs, EDR IB, v13.0 Jupiter: 2-socket Intel, 1600MHz DIMMs, FDR IB, v12.0 Janus: 2-socket Intel, 1333MHz DIMMs, QDR IB 238% 100% Single Precision
32 RADIOSS Profiling Memory Required There are differences in memory consumption between MPP and HMPP MPP: memory required to run the workload is ~5GB per node HMPP: approximately 400MB per node is needed (as there are 2 PPN) Considered a small workload, but good enough to observe application behavior MPP, 28PPN HMPP, 2PPN / 14 Threads At 32 Nodes 28 CPU Cores/Node
33 RADIOSS Summary RADIOSS is designed to perform in large-scale HPC environments Shows excellent scalability over 896 cores/32 nodes and beyond with Hybrid MPP The Hybrid MPP version enhanced RADIOSS scalability 2 MPI processes per node (or 1 MPI process per socket), 14 threads each Additional CPU cores generally accelerate time to solution Intel E5-2680v3 (Haswell) cluster outperforms prior generations Performs faster by 100% vs Sandy Bridge, by 238% vs Westmere at 16 nodes Network and MPI Tuning EDR InfiniBand outperforms Ethernet-based interconnects in scalability EDR InfiniBand delivers higher scalability performance than FDR and QDR IB Tuning environment parameters is important to maximize performance Tuning MPI collective ops helps RADIOSS achieve even better scalability
34 OptiStruct by Altair Altair OptiStruct OptiStruct is an industry-proven, modern structural analysis solver Solves linear and non-linear structural problems under static and dynamic loadings Market-leading solution for structural design and optimization Helps designers and engineers to analyze and optimize structures Optimizes for strength, durability and NVH (Noise, Vibration, Harshness) characteristics Helps to rapidly develop innovative, lightweight and structurally efficient designs Based on finite-element and multi-body dynamics technology
35 Test Cluster Configuration Dell PowerEdge R730 32-node (896-core) Thor cluster Dual-Socket 14-core Intel E5-2680 v3 2.60 GHz CPUs (Turbo on, max perf set in BIOS) OS: RHEL 6.5, MLNX_OFED_LINUX InfiniBand SW stack Memory: 64GB per node, DDR4 2133 MHz Hard Drives: 1TB 7.2K RPM SATA 2.5" Mellanox Switch-IB 100Gb/s EDR InfiniBand switch Mellanox SwitchX 56Gb/s FDR InfiniBand VPI switch Mellanox ConnectX-4 EDR 100Gb/s InfiniBand VPI adapters Mellanox ConnectX-3 40/56Gb/s QDR/FDR InfiniBand VPI adapters MPI: Intel MPI Application: Altair OptiStruct 13.0 Benchmark dataset: Engine Assembly
36 Model Engine Block Grids (Structural): Elements: Local Coordinate Systems: 32 Degrees of Freedom: 4,721,581 Non-zero Stiffness Terms: ,216 Pretension sections: 10 Nonlinear run, small deformation: 3 subcases (reduced to 1 subcase), 27 iterations in total Memory requirement: 96 GB per node for a 1 MPI process job (in-core) 16GB per node for an 8 MPI process job 8GB per node for a 24 MPI process job
37 PowerEdge R730 Massive flexibility for data intensive operations Performance and efficiency Intelligent hardware-driven systems management with extensive power management features Innovative tools including automation for parts replacement and lifecycle manageability Broad choice of networking technologies from GbE to IB Built-in redundancy with hot-plug and swappable PSUs, HDDs and fans Benefits Designed for performance workloads from big data analytics, distributed storage or distributed computing where local storage is key to classic HPC and large-scale hosting environments High performance scale-out compute and low cost dense storage in one package Hardware Capabilities Flexible compute platform with dense storage capacity 2S/2U server, 6 PCIe slots Large memory footprint (Up to 768GB / 24 DIMMs) High I/O performance and optional storage configurations HDD options: 12 x 3.5" or 24 x 2.5" HDDs; up to 26 HDDs with 2 hot-plug drives in rear of server for boot or scratch
38 OptiStruct Performance CPU Cores Running more cores per node generally improves overall performance The -nproc parameter specifies the number of threads spawned per MPI process Guideline: 6 threads per MPI process yields the best performance Six threads per MPI process performed best among all configurations tested (with either 2 or 4 PPN) Higher is better
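As a sketch, an OptiStruct run following this guideline (4 MPI processes per node on 24 nodes, 6 threads each) might be sized like this; the solver invocation and the `-mpi`/`-np` flag spellings are assumptions, though `-nproc` is the thread flag cited above, and the input deck name is a placeholder:

```shell
# Hybrid OptiStruct sizing: 4 MPI processes per node, 6 threads per process.
NODES=24; PPN=4; THREADS=6
NP=$(( NODES * PPN ))
# Illustrative launch line (placeholder input deck; flag names assumed):
CMD="optistruct engine_assembly.fem -mpi -np ${NP} -nproc ${THREADS}"
echo "${CMD}"
```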
39 OptiStruct Performance Interconnect EDR InfiniBand provides superior scalability performance over Ethernet 11 times better performance than 1GbE at 24 nodes 90% better performance than 10GbE at 24 nodes Ethernet solutions do not scale beyond 4 nodes 11x 90% Higher is better 2 PPN / 6 Threads
40 OptiStruct Profiling Number of MPI Calls For 1GbE, communication time is mostly spent on point-to-point transfers MPI_Iprobe and MPI_Test are the tests for non-blocking transfers Overall runtime is significantly longer compared to faster interconnects For 10GbE, communication time is consumed by data transfer The amount of time for non-blocking transfers is still significant Overall runtime is reduced compared to 1GbE While time for data transfer is reduced, collective operations take a higher share of the overall time For InfiniBand, overall runtime is reduced Time consumed by MPI_Allreduce is more significant compared to data transfer Overall runtime is reduced significantly compared to Ethernet 1GbE 10GbE EDR IB
42 OptiStruct Performance Interconnect EDR IB delivers superior scalability performance over previous InfiniBand generations EDR InfiniBand improves over FDR IB by 40% at 24 nodes EDR InfiniBand outperforms FDR InfiniBand by 9% at 16 nodes The new EDR IB architecture supersedes the previous FDR IB generation in scalability 9% 40% Higher is better 4 PPN / 6 Threads
43 OptiStruct Performance Processes Per Node OptiStruct reduces communication by deploying a hybrid MPI mode Hybrid MPI processes can spawn threads, which helps reduce communication on the network Enabling more MPI processes per node helps to unlock additional performance The following environment settings and tuned flags are used: I_MPI_PIN_DOMAIN auto, I_MPI_ADJUST_ALLREDUCE 2, I_MPI_ADJUST_BCAST 1, I_MPI_ADJUST_REDUCE 2, ulimit -s unlimited 10% 4% Higher is better
44 OptiStruct Performance IMPI Tuning Tuning the Intel MPI collective algorithm can improve performance The MPI profile shows ~30% of runtime spent on MPI_Allreduce IB communications The default algorithm in Intel MPI is Recursive Doubling (I_MPI_ADJUST_ALLREDUCE=1) Rabenseifner's algorithm for Allreduce appears to be the best on 24 nodes Higher is better Intel MPI 4 PPN / 6 Threads
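One way to find the best collective algorithm for a given node count is a simple sweep over `I_MPI_ADJUST_ALLREDUCE` values, a sketch of which follows; in this study algorithm 1 is Recursive Doubling (the default) and algorithm 2 (Rabenseifner's) won at 24 nodes:

```shell
# Sweep Intel MPI Allreduce algorithm choices; in practice each iteration
# would time a short benchmark run instead of echoing a placeholder line.
for ALG in 1 2 3 4 5; do
  export I_MPI_ADJUST_ALLREDUCE=${ALG}
  echo "timing run with I_MPI_ADJUST_ALLREDUCE=${ALG}"
done
```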
45 OptiStruct Profiling MPI Message Sizes The most time-consuming MPI communications are: MPI_Allreduce: messages concentrated at 8B MPI_Iprobe and MPI_Test have a high volume of calls that test for completion of messages 2 PPN / 6 Threads
46 OptiStruct Performance CPU Frequency An increase in CPU clock speed allows higher job efficiency Up to 11% higher productivity by increasing clock speed from 2300MHz to 2600MHz Turbo Mode boosts job efficiency more than the increase in clock speed alone Up to 31% performance jump by enabling Turbo Mode at 2600MHz The gain from Turbo Mode depends on environmental factors, e.g. temperature 11% 4% 8% 10% 17% 31% Higher is better 4 PPN / 6 Threads
47 OptiStruct Profiling Disk I/O OptiStruct makes use of distributed I/O on local scratch of the compute nodes Heavy disk I/O takes place throughout the run on each compute node The high I/O usage causes system memory to also be utilized for I/O caching Disk I/O is distributed across all compute nodes, which provides higher I/O performance The workload completes faster as more nodes take part in the distributed I/O Higher is better 4 PPN / 6 Threads
48 OptiStruct Profiling MPI Message Sizes The majority of data transfer takes place between rank 0 and the rest of the ranks Non-blocking data transfers appear to be used to hide network latency The collective operations are much smaller in size 16 Nodes 32 Nodes 2 PPN / 6 Threads
49 OptiStruct Summary OptiStruct is designed to perform structural analysis at large scale OptiStruct's hybrid MPI mode is designed to perform at scale EDR InfiniBand outperforms Ethernet in scalability performance ~11 times better performance than 1GbE at 24 nodes 90% better performance than 10GbE at 24 nodes EDR InfiniBand improves over FDR IB by 40% at 24 nodes Hybrid MPI processes can spawn threads, which helps reduce communication on the network Enabling more MPI processes per node helps to unlock additional performance The Hybrid MPP version enhanced OptiStruct scalability Profiling and Tuning: CPU, I/O, Network MPI_Allreduce accounts for ~30% of runtime at scale Tuning for MPI_Allreduce should allow better performance at high core counts Guideline: 6 threads per MPI process yields the best performance Turbo Mode boosts job efficiency more than the increase in clock speed alone OptiStruct makes use of distributed I/O on local scratch of the compute nodes Heavy disk I/O takes place throughout the run on each compute node
50 Thank You Questions? Pak Lui All trademarks are property of their respective owners. All information is provided As-Is without any kind of warranty. The HPC Advisory Council makes no representation to the accuracy and completeness of the information contained herein. HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein.
More informationHow To Build A Supermicro Computer With A 32 Core Power Core (Powerpc) And A 32-Core (Powerpc) (Powerpowerpter) (I386) (Amd) (Microcore) (Supermicro) (
TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 7 th CALL (Tier-0) Contributing sites and the corresponding computer systems for this call are: GCS@Jülich, Germany IBM Blue Gene/Q GENCI@CEA, France Bull Bullx
More informationSMB Direct for SQL Server and Private Cloud
SMB Direct for SQL Server and Private Cloud Increased Performance, Higher Scalability and Extreme Resiliency June, 2014 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server
More informationALPS Supercomputing System A Scalable Supercomputer with Flexible Services
ALPS Supercomputing System A Scalable Supercomputer with Flexible Services 1 Abstract Supercomputing is moving from the realm of abstract to mainstream with more and more applications and research being
More informationIntroduction. Need for ever-increasing storage scalability. Arista and Panasas provide a unique Cloud Storage solution
Arista 10 Gigabit Ethernet Switch Lab-Tested with Panasas ActiveStor Parallel Storage System Delivers Best Results for High-Performance and Low Latency for Scale-Out Cloud Storage Applications Introduction
More informationSAS Business Analytics. Base SAS for SAS 9.2
Performance & Scalability of SAS Business Analytics on an NEC Express5800/A1080a (Intel Xeon 7500 series-based Platform) using Red Hat Enterprise Linux 5 SAS Business Analytics Base SAS for SAS 9.2 Red
More informationPower Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure
White Paper Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure White Paper March 2014 2014 Cisco and/or its affiliates. All rights reserved. This
More informationWhite Paper Solarflare High-Performance Computing (HPC) Applications
Solarflare High-Performance Computing (HPC) Applications 10G Ethernet: Now Ready for Low-Latency HPC Applications Solarflare extends the benefits of its low-latency, high-bandwidth 10GbE server adapters
More informationBuilding a Top500-class Supercomputing Cluster at LNS-BUAP
Building a Top500-class Supercomputing Cluster at LNS-BUAP Dr. José Luis Ricardo Chávez Dr. Humberto Salazar Ibargüen Dr. Enrique Varela Carlos Laboratorio Nacional de Supercómputo Benemérita Universidad
More informationHP reference configuration for entry-level SAS Grid Manager solutions
HP reference configuration for entry-level SAS Grid Manager solutions Up to 864 simultaneous SAS jobs and more than 3 GB/s I/O throughput Technical white paper Table of contents Executive summary... 2
More informationA Study on the Scalability of Hybrid LS-DYNA on Multicore Architectures
11 th International LS-DYNA Users Conference Computing Technology A Study on the Scalability of Hybrid LS-DYNA on Multicore Architectures Yih-Yih Lin Hewlett-Packard Company Abstract In this paper, the
More informationPurchase of High Performance Computing (HPC) Central Compute Resources by Northwestern Researchers
Information Technology Purchase of High Performance Computing (HPC) Central Compute Resources by Northwestern Researchers Effective for FY2016 Purpose This document summarizes High Performance Computing
More informationHETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK
HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK Steve Oberlin CTO, Accelerated Computing US to Build Two Flagship Supercomputers SUMMIT SIERRA Partnership for Science 100-300 PFLOPS Peak Performance
More informationData Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket server node for PRIMERGY CX420 cluster server
Data Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket node for PRIMERGY CX420 cluster Data Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket node for PRIMERGY CX420 cluster Strong Performance and Cluster
More informationIBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads
89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report
More informationSUN ORACLE EXADATA STORAGE SERVER
SUN ORACLE EXADATA STORAGE SERVER KEY FEATURES AND BENEFITS FEATURES 12 x 3.5 inch SAS or SATA disks 384 GB of Exadata Smart Flash Cache 2 Intel 2.53 Ghz quad-core processors 24 GB memory Dual InfiniBand
More informationMicrosoft Private Cloud Fast Track Reference Architecture
Microsoft Private Cloud Fast Track Reference Architecture Microsoft Private Cloud Fast Track is a reference architecture designed to help build private clouds by combining Microsoft software with NEC s
More informationAppro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales
Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes Anthony Kenisky, VP of North America Sales About Appro Over 20 Years of Experience 1991 2000 OEM Server Manufacturer 2001-2007
More informationCrossing the Performance Chasm with OpenPOWER
Crossing the Performance Chasm with OpenPOWER Dr. Srini Chari Cabot Partners/IBM chari@cabotpartners.com #OpenPOWERSummit Join the conversation at #OpenPOWERSummit 1 Disclosure Copyright 215. Cabot Partners
More informationHyperscale. The new frontier for HPC. Philippe Trautmann. HPC/POD Sales Manager EMEA March 13th, 2011
Hyperscale The new frontier for HPC Philippe Trautmann HPC/POD Sales Manager EMEA March 13th, 2011 Hyperscale the new frontier for HPC New HPC customer requirements demand a shift in technology and market
More informationDell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820
Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820 This white paper discusses the SQL server workload consolidation capabilities of Dell PowerEdge R820 using Virtualization.
More informationCisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage
Cisco for SAP HANA Scale-Out Solution Solution Brief December 2014 With Intelligent Intel Xeon Processors Highlights Scale SAP HANA on Demand Scale-out capabilities, combined with high-performance NetApp
More informationData Center Storage Solutions
Data Center Storage Solutions Enterprise software, appliance and hardware solutions you can trust When it comes to storage, most enterprises seek the same things: predictable performance, trusted reliability
More informationHadoop on the Gordon Data Intensive Cluster
Hadoop on the Gordon Data Intensive Cluster Amit Majumdar, Scientific Computing Applications Mahidhar Tatineni, HPC User Services San Diego Supercomputer Center University of California San Diego Dec 18,
More informationConverged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers
Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers White Paper rev. 2015-11-27 2015 FlashGrid Inc. 1 www.flashgrid.io Abstract Oracle Real Application Clusters (RAC)
More informationHPC Update: Engagement Model
HPC Update: Engagement Model MIKE VILDIBILL Director, Strategic Engagements Sun Microsystems mikev@sun.com Our Strategy Building a Comprehensive HPC Portfolio that Delivers Differentiated Customer Value
More informationDavid Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems
David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM
More informationAchieving a High Performance OLTP Database using SQL Server and Dell PowerEdge R720 with Internal PCIe SSD Storage
Achieving a High Performance OLTP Database using SQL Server and Dell PowerEdge R720 with This Dell Technical White Paper discusses the OLTP performance benefit achieved on a SQL Server database using a
More informationFUJITSU Enterprise Product & Solution Facts
FUJITSU Enterprise Product & Solution Facts shaping tomorrow with you Business-Centric Data Center The way ICT delivers value is fundamentally changing. Mobile, Big Data, cloud and social media are driving
More informationCORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER
CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER Tender Notice No. 3/2014-15 dated 29.12.2014 (IIT/CE/ENQ/COM/HPC/2014-15/569) Tender Submission Deadline Last date for submission of sealed bids is extended
More informationPRIMERGY server-based High Performance Computing solutions
PRIMERGY server-based High Performance Computing solutions PreSales - May 2010 - HPC Revenue OS & Processor Type Increasing standardization with shift in HPC to x86 with 70% in 2008.. HPC revenue by operating
More informationIntel Solid-State Drives Increase Productivity of Product Design and Simulation
WHITE PAPER Intel Solid-State Drives Increase Productivity of Product Design and Simulation Intel Solid-State Drives Increase Productivity of Product Design and Simulation A study of how Intel Solid-State
More informationHigh Throughput File Servers with SMB Direct, Using the 3 Flavors of RDMA network adapters
High Throughput File Servers with SMB Direct, Using the 3 Flavors of network adapters Jose Barreto Principal Program Manager Microsoft Corporation Abstract In Windows Server 2012, we introduce the SMB
More informationHP ProLiant DL580 Gen8 and HP LE PCIe Workload WHITE PAPER Accelerator 90TB Microsoft SQL Server Data Warehouse Fast Track Reference Architecture
WHITE PAPER HP ProLiant DL580 Gen8 and HP LE PCIe Workload WHITE PAPER Accelerator 90TB Microsoft SQL Server Data Warehouse Fast Track Reference Architecture Based on Microsoft SQL Server 2014 Data Warehouse
More informationHP recommended configuration for Microsoft Exchange Server 2010: HP LeftHand P4000 SAN
HP recommended configuration for Microsoft Exchange Server 2010: HP LeftHand P4000 SAN Table of contents Executive summary... 2 Introduction... 2 Solution criteria... 3 Hyper-V guest machine configurations...
More informationThe Evolution of Microsoft SQL Server: The right time for Violin flash Memory Arrays
The Evolution of Microsoft SQL Server: The right time for Violin flash Memory Arrays Executive Summary Microsoft SQL has evolved beyond serving simple workgroups to a platform delivering sophisticated
More informationPower Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis
White Paper Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis White Paper March 2014 2014 Cisco and/or its affiliates. All rights reserved. This document
More information4 th Workshop on Big Data Benchmarking
4 th Workshop on Big Data Benchmarking MPP SQL Engines: architectural choices and their implications on benchmarking 09 Oct 2013 Agenda: Big Data Landscape Market Requirements Benchmark Parameters Benchmark
More informationBRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH MELLANOX SWITCHX
White Paper BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH Abstract This white paper explains how to configure a Mellanox SwitchX Series switch to bridge the external network of an EMC Isilon
More informationAdvancing Applications Performance With InfiniBand
Advancing Applications Performance With InfiniBand Pak Lui, Application Performance Manager September 12, 2013 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server and
More informationCloud Computing through Virtualization and HPC technologies
Cloud Computing through Virtualization and HPC technologies William Lu, Ph.D. 1 Agenda Cloud Computing & HPC A Case of HPC Implementation Application Performance in VM Summary 2 Cloud Computing & HPC HPC
More informationpræsentation oktober 2011
Johnny Olesen System X presale præsentation oktober 2011 2010 IBM Corporation 2 Hvem er jeg Dagens agenda Server overview System Director 3 4 Portfolio-wide Innovation with IBM System x and BladeCenter
More informationPerformance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi
Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France
More informationGPU System Architecture. Alan Gray EPCC The University of Edinburgh
GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems
More informationHPC Growing Pains. Lessons learned from building a Top500 supercomputer
HPC Growing Pains Lessons learned from building a Top500 supercomputer John L. Wofford Center for Computational Biology & Bioinformatics Columbia University I. What is C2B2? Outline Lessons learned from
More informationSelf service for software development tools
Self service for software development tools Michal Husejko, behalf of colleagues in CERN IT/PES CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Self service for software development tools
More informationIntel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR Infiniband
Intel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR Infiniband A P P R O I N T E R N A T I O N A L I N C Steve Lyness Vice President, HPC Solutions Engineering slyness@appro.com Company Overview
More informationInfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures. Brian Sparks IBTA Marketing Working Group Co-Chair
InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures Brian Sparks IBTA Marketing Working Group Co-Chair Page 1 IBTA & OFA Update IBTA today has over 50 members; OFA
More informationACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING
ACCELERATING COMMERCIAL LINEAR DYNAMIC AND Vladimir Belsky Director of Solver Development* Luis Crivelli Director of Solver Development* Matt Dunbar Chief Architect* Mikhail Belyi Development Group Manager*
More informationBest Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays
Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays Database Solutions Engineering By Murali Krishnan.K Dell Product Group October 2009
More informationIBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud
IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud February 25, 2014 1 Agenda v Mapping clients needs to cloud technologies v Addressing your pain
More informationSun Constellation System: The Open Petascale Computing Architecture
CAS2K7 13 September, 2007 Sun Constellation System: The Open Petascale Computing Architecture John Fragalla Senior HPC Technical Specialist Global Systems Practice Sun Microsystems, Inc. 25 Years of Technical
More informationBenchmarking Hadoop & HBase on Violin
Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages
More informationClusters: Mainstream Technology for CAE
Clusters: Mainstream Technology for CAE Alanna Dwyer HPC Division, HP Linux and Clusters Sparked a Revolution in High Performance Computing! Supercomputing performance now affordable and accessible Linux
More informationRoCE vs. iwarp Competitive Analysis
WHITE PAPER August 21 RoCE vs. iwarp Competitive Analysis Executive Summary...1 RoCE s Advantages over iwarp...1 Performance and Benchmark Examples...3 Best Performance for Virtualization...4 Summary...
More informationChoosing the Best Network Interface Card Mellanox ConnectX -3 Pro EN vs. Intel X520
COMPETITIVE BRIEF August 2014 Choosing the Best Network Interface Card Mellanox ConnectX -3 Pro EN vs. Intel X520 Introduction: How to Choose a Network Interface Card...1 Comparison: Mellanox ConnectX
More informationDeep Learning GPU-Based Hardware Platform
Deep Learning GPU-Based Hardware Platform Hardware and Software Criteria and Selection Mourad Bouache Yahoo! Performance Engineering Group Sunnyvale, CA +1.408.784.1446 bouache@yahoo-inc.com John Glover
More informationCommoditisation of the High-End Research Storage Market with the Dell MD3460 & Intel Enterprise Edition Lustre
Commoditisation of the High-End Research Storage Market with the Dell MD3460 & Intel Enterprise Edition Lustre University of Cambridge, UIS, HPC Service Authors: Wojciech Turek, Paul Calleja, John Taylor
More informationAPACHE HADOOP PLATFORM HARDWARE INFRASTRUCTURE SOLUTIONS
APACHE HADOOP PLATFORM BIG DATA HARDWARE INFRASTRUCTURE SOLUTIONS 1 BIG DATA. BIG CHALLENGES. BIG OPPORTUNITY. How do you manage the VOLUME, VELOCITY & VARIABILITY of complex data streams in order to find
More informationKriterien für ein PetaFlop System
Kriterien für ein PetaFlop System Rainer Keller, HLRS :: :: :: Context: Organizational HLRS is one of the three national supercomputing centers in Germany. The national supercomputing centers are working
More informationPSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation
PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation 1. Overview of NEC PCIe SSD Appliance for Microsoft SQL Server Page 2 NEC Corporation
More informationDell Reference Configuration for Hortonworks Data Platform
Dell Reference Configuration for Hortonworks Data Platform A Quick Reference Configuration Guide Armando Acosta Hadoop Product Manager Dell Revolutionary Cloud and Big Data Group Kris Applegate Solution
More informationBuilding All-Flash Software Defined Storages for Datacenters. Ji Hyuck Yun (dr.jhyun@sk.com) Storage Tech. Lab SK Telecom
Building All-Flash Software Defined Storages for Datacenters Ji Hyuck Yun (dr.jhyun@sk.com) Storage Tech. Lab SK Telecom Introduction R&D Motivation Synergy between SK Telecom and SK Hynix Service & Solution
More informationPowerEdge Servers and Solutions for Business Applications
PowerEdge s and Solutions for Business s PowerEdge servers and solutions for business At Dell, we listen to you every day, and what you have been telling us is that your infrastructures are being pushed
More informationThe MAX5 Advantage: Clients Benefit running Microsoft SQL Server Data Warehouse (Workloads) on IBM BladeCenter HX5 with IBM MAX5.
Performance benefit of MAX5 for databases The MAX5 Advantage: Clients Benefit running Microsoft SQL Server Data Warehouse (Workloads) on IBM BladeCenter HX5 with IBM MAX5 Vinay Kulkarni Kent Swalin IBM
More informationCurrent Status of FEFS for the K computer
Current Status of FEFS for the K computer Shinji Sumimoto Fujitsu Limited Apr.24 2012 LUG2012@Austin Outline RIKEN and Fujitsu are jointly developing the K computer * Development continues with system
More informationDell s SAP HANA Appliance
Dell s SAP HANA Appliance SAP HANA is the next generation of SAP in-memory computing technology. Dell and SAP have partnered to deliver an SAP HANA appliance that provides multipurpose, data source-agnostic,
More informationOracle Database Reliability, Performance and scalability on Intel Xeon platforms Mitch Shults, Intel Corporation October 2011
Oracle Database Reliability, Performance and scalability on Intel platforms Mitch Shults, Intel Corporation October 2011 1 Intel Processor E7-8800/4800/2800 Product Families Up to 10 s and 20 Threads 30MB
More information