MSC Nastran Performance Benchmark and Profiling. August 2012

Note
- The following research was performed under the HPC Advisory Council activities
  - Special thanks to: HP, Mellanox
- For more information on the supporting vendors' solutions, please refer to: www.mellanox.com, http://www.hp.com/go/hpc
- For more information on the application: http://www.mscsoftware.com/

MSC Nastran Application
- MSC Nastran is a widely used Finite Element Analysis (FEA) solver
- Used for simulating stress, dynamics, or vibration of real-world, complex systems
- Nearly every spacecraft, aircraft, and vehicle designed in the last 40 years has been analyzed using MSC Nastran

Objectives
- The presented research was done to provide best practices
  - MSC Nastran performance benchmarking
  - Interconnect performance comparisons
  - MPI performance comparison
  - Understanding MSC Nastran communication patterns
- The presented results will demonstrate
  - The scalability of the compute environment to provide nearly linear application scalability

Test Cluster Configuration
- HP ProLiant SL230s Gen8 4-node Athena cluster
  - Processors: Dual Eight-Core Intel Xeon E5-2680 @ 2.7 GHz
  - Memory: 32GB per node, 1600MHz DDR3 DIMMs
  - OS: RHEL 6 Update 2, OFED 1.5.3 InfiniBand SW stack
- Mellanox ConnectX-3 VPI InfiniBand adapters
- Mellanox SwitchX SX6036 56Gb/s InfiniBand and 40Gb/s Ethernet Switch
- MPI: Platform MPI 8.1, Intel MPI 4.0.3
- Application: MSC Nastran 2012.1
- Benchmark Workload: Input datasets
  - xl0tdf1: Power Train (Ndof: 529,257, SOL 108, Direct Freq)
  - md0mdf1: Half Sphere (Ndof: 42,066, SOL 108, Direct Freq w/ Umfpack)

About HP ProLiant SL230s Gen8

Item: SL230 Gen8
Processor: Two Intel Xeon E5-2600 Series, 4/6/8 Cores
Chipset: Intel Sandy Bridge EP Socket-R
Memory: (512 GB), 16 sockets, DDR3 up to 1600MHz, ECC
Max Memory: 512 GB
Internal Storage: Two LFF non-hot plug SAS, SATA bays or Four SFF non-hot plug SAS, SATA, SSD bays; Two Hot Plug SFF Drives (Option)
Max Internal Storage: 8TB
Networking: Dual port 1GbE NIC / Single 10G NIC
I/O Slots: One PCIe Gen3 x16 LP slot; 1Gb and 10Gb Ethernet, IB, and FlexFabric options
Ports: Front: (1) Management, (2) 1GbE, (1) Serial, (1) S.U.V port, (2) PCIe, and Internal Micro SD card & Active Health
Power Supplies: 750, 1200W (92% or 94%), high power chassis
Integrated Management: iLO4; hardware-based power capping via SL Advanced Power Manager
Additional Features: Shared Power & Cooling and up to 8 nodes per 4U chassis, single GPU support, Fusion I/O support
Form Factor: 16P/8GPUs/4U chassis

MSC Nastran Result (xl0tdf1)
- Input dataset: xl0tdf1
  - Power Train (Ndof 529,257, SOL 108, Direct Frequency Response)
  - Memory: 520MB, SCR Disk: 5GB, Total I/O: 190GB
- Run time decreases as more nodes are used for computation
  - Up to 74% of the run time is saved by running 4 DMPs versus 1 DMP
- (Chart: run time, lower is better)

MSC Nastran Result (md0mdf1)
- Input dataset: md0mdf1
  - Half Sphere (Ndof 42,066, SOL 108, Direct Freq w/ Umfpack)
  - Memory: 1GB, Total I/O: 0.1GB
- Run time decreases as more nodes are used for computation
  - Up to 91% of the run time is saved by running 16 DMPs versus 1 DMP
- (Chart: run time, lower is better; SMP=1)
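The "time saved" percentages on the two result slides can be restated as speedup and parallel efficiency. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and it treats the quoted "up to" figures (74% at 4 DMPs, 91% at 16 DMPs) as exact values relative to the 1-DMP run.

```python
def speedup_from_time_saved(time_saved_fraction):
    """Convert a fraction of run time saved into a speedup factor.

    If a run takes (1 - saved) of the baseline time, the speedup is the
    baseline time divided by the new time.
    """
    return 1.0 / (1.0 - time_saved_fraction)


def parallel_efficiency(speedup, workers):
    """Speedup divided by the number of workers (DMPs here)."""
    return speedup / workers


# Figures quoted on the result slides (treated as exact for illustration).
results = [("xl0tdf1", 0.74, 4), ("md0mdf1", 0.91, 16)]

for dataset, saved, dmps in results:
    s = speedup_from_time_saved(saved)
    e = parallel_efficiency(s, dmps)
    print(f"{dataset}: {s:.2f}x speedup on {dmps} DMPs, efficiency {e:.0%}")
```

Under these assumptions, the 4-DMP run of xl0tdf1 corresponds to roughly a 3.8x speedup, and the 16-DMP run of md0mdf1 to roughly an 11x speedup.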

MSC Nastran Result Comparison (md0mdf1)
- Intel E5-2600 Series cluster delivers higher performance
  - Up to 46% higher performance than the best published result* at 8 DMPs
- Reference hardware configuration: Dual Intel Westmere X5682 @ 3.47GHz, Linux RHEL 6.1, 96GB 1333MHz DDR3 DIMMs
- (Chart: performance, higher is better; SMP=1)
* http://www.mscsoftware.com/support/prod_support/nastran/performance/msc20121_par.cfm

MSC Nastran Benchmark Comparison (xl0tdf1)
- Intel E5-2600 Series cluster delivers higher performance
  - Up to 39% higher performance than the best published result* at 4 DMPs
- Reference hardware configuration: Dual Intel Westmere X5682 @ 3.47GHz, Linux RHEL 6.1, 96GB 1333MHz DDR3 DIMMs
- (Chart: performance, higher is better; SMP=1)
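The comparison slides report "higher performance", while the result slides report time saved. The sketch below shows one way to relate the two, assuming (this is not stated in the slides) that performance is inversely proportional to elapsed time, as with a jobs-per-day metric. The function name is hypothetical.

```python
def time_reduction_from_perf_gain(perf_gain_fraction):
    """Fraction of run time saved for a given relative performance gain.

    Assumes performance is proportional to 1 / elapsed time, so a gain g
    means the new run takes 1 / (1 + g) of the reference time.
    """
    return 1.0 - 1.0 / (1.0 + perf_gain_fraction)


# Gains quoted on the comparison slides (treated as exact for illustration).
for dataset, gain in [("md0mdf1", 0.46), ("xl0tdf1", 0.39)]:
    reduction = time_reduction_from_perf_gain(gain)
    print(f"{dataset}: {gain:.0%} higher performance ~ {reduction:.1%} less time")
```

Under that assumption, a 46% performance gain corresponds to about 31.5% less elapsed time, and a 39% gain to about 28.1%.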

MSC Nastran Benchmark - MPI
- Platform MPI and Intel MPI perform about the same
  - Intel MPI delivers slightly better performance
- (Chart: performance, higher is better; InfiniBand FDR)

MSC Nastran Profiling - MPI Functions
- Most frequently used MPI functions
  - MPI_Recv (50%) and MPI_Ssend (50%)
- Most time-consuming MPI functions
  - MPI_Recv (61%), followed by MPI_Ssend (15%)

MSC Nastran Profiling - Message Size
- The majority of MPI messages are small
  - In the range of 0-64 bytes
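A simple latency-plus-bandwidth cost model gives a rough sense of why a profile dominated by 0-64 byte messages is sensitive to interconnect latency rather than bandwidth. This is a sketch under stated assumptions: the 56 Gb/s figure is the nominal switch rate from the test configuration, while the 1 microsecond latency is a hypothetical placeholder, not a measurement from this study.

```python
# Hypothetical one-way latency, chosen only for illustration.
ASSUMED_LATENCY_S = 1e-6
# Nominal link rate from the test cluster configuration (56 Gb/s).
NOMINAL_BANDWIDTH_BPS = 56e9


def transfer_time_s(message_bytes, latency_s=ASSUMED_LATENCY_S,
                    bandwidth_bps=NOMINAL_BANDWIDTH_BPS):
    """Estimated message time: fixed latency plus serialization time."""
    return latency_s + (message_bytes * 8) / bandwidth_bps


def latency_share(message_bytes, latency_s=ASSUMED_LATENCY_S,
                  bandwidth_bps=NOMINAL_BANDWIDTH_BPS):
    """Fraction of the estimated transfer time spent on fixed latency."""
    return latency_s / transfer_time_s(message_bytes, latency_s, bandwidth_bps)


for size in (64, 4096, 1024 * 1024):
    print(f"{size:>8} bytes: latency share {latency_share(size):.1%}")
```

With these assumed parameters, a 64-byte message spends almost all of its estimated transfer time in fixed latency, while a 1 MiB message is dominated by bandwidth.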

MSC Nastran Profiling - Disk I/O
- Heavy disk write access is seen throughout the test run
  - Comparatively little disk read access
- Tests show that Nastran could benefit from better disk I/O

MSC Nastran Summary
- MSC Nastran performance
  - HP ProLiant SL230s Gen8 servers provide up to 46% higher performance than the best published results
  - Tests illustrate that using more DMPs allows Nastran to scale
    - Run time decreases as more nodes are used for computation
  - Platform MPI and Intel MPI perform about the same
- MSC Nastran profiling
  - Frequently used message sizes
    - Small messages are used most frequently
  - Frequently used MPI functions
    - MPI_Recv and MPI_Ssend
  - Heavy disk write access is seen throughout the test run
    - Tests show that Nastran could benefit from better disk I/O

Thank You
HPC Advisory Council

All trademarks are property of their respective owners. All information is provided "As-Is" without any kind of warranty. The HPC Advisory Council makes no representation as to the accuracy and completeness of the information contained herein. The HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein.