ANSYS & optiSLang on an HPC Cluster




ANSYS & optiSLang on an HPC Cluster
Dipl.-Ing. (FH) Holger Mai
Engineering GmbH, Holunderweg 8, 89182 Bernstadt
www.microconsult-engineering.de
[Figure: Optimization result after 50 iterations (Evolutionary Algorithm): measured curve and best fit; PSD acceleration in (m/s²)²/Hz over frequency in Hz (0 to 1000 Hz), logarithmic vertical axis]

Overview
- The company & optiSLang
- The simple way: optimization on a workstation
- One step further: optimization on an HPC cluster using RSM
- Pushing the limits: optimization on an HPC cluster in a Linux environment

About the company
- Founded in 2000
- Business focused on FE simulation and metrology engineering
- 5 employees, 3 working on simulation
- Customers: automotive, automotive electronics
- Typical problems: fatigue, thermo-mechanics, fluid dynamics
- Since 2009: ANSYS Enhanced Solutions Partner

The company & optiSLang
- Successfully built up an HPC cluster to deal with customers' extreme problems
- Looking for new applications to make use of the enormous computing power
- Being able to solve one huge problem on 100+ cores, we could also solve several problems simultaneously: optimization is the way to go
- Introduction of optiSLang in June 2009; since then trying to push the limits
HPC cluster
- 8 Intel Harpertown systems, total of 64 cores, 488 GB RAM
- 16 Intel Nehalem systems, total of 128 cores, 1140 GB RAM
- OS: SUSE Linux Enterprise Server
- Max. power consumption: 18 kW

Typical test cases
1. Adaptation of material parameters to fit a PSD analysis to experimental data
   - 2,700,000 DOFs
   - 150 designs calculated with ARSM (5 optimization parameters)
2. Variation of the CTEs of a fibre-reinforced plastic structure to fit a thermomechanical simulation to measured values
   - 2,600,000 DOFs
   - 720 EA designs calculated (15 optimization parameters)
   - 4 different ambient temperatures to optimize => 2880 designs!
- Huge number of designs to calculate for each optimization task
- Efficient optimization needs optimization of computing performance
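The design count quoted for test case 2 is a plain multiplication. A minimal Python sketch of that arithmetic, using only the numbers from the slide (the variable names are illustrative and do not come from the presentation):

```python
# Test case 2: numbers taken from the slide; variable names are illustrative.
ea_designs_per_temperature = 720  # EA designs calculated per optimization run
ambient_temperatures = 4          # each ambient temperature is optimized separately

total_designs = ea_designs_per_temperature * ambient_temperatures
print(total_designs)  # 2880
```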

Optimization on a workstation
Workstation: HP Z800, Win XP Pro x64, 2x Intel Nehalem quad-core, 48 GB RAM
- Workbench (WB): the WB problem is ported to optiSLang via optiPlug; WB gets its input from optiSLang and works in the background. Maximum of 1 job in parallel, 8 cores/job
- Workbench with RSM: the WB problem is ported to optiSLang via optiPlug and queued via RSM to solve multiple problems at the same time. Maximum of 4 jobs in parallel, 2 cores/job
- ANSYS Classic: a .db file and an APDL script are generated; optiSLang changes the APDL script to generate new designs. Maximum of 4 jobs in parallel, 2 cores/job

Optimization on a workstation
Test case 1: PSD analysis

Optimization on a multicore Opteron
Multicore Opteron: Tyan S8812 quad-socket board, 4x AMD Opteron Magny-Cours 12-core processors, 192 GB RAM
- Main advantage: configuration as simple as a workstation
- Compute power almost like a cluster
- ANSYS Classic: a .db file and an APDL script are generated; optiSLang changes the APDL script to generate new designs
- Maximum of 24 jobs in parallel, 2 cores/job

Optimization on a multicore Opteron
Test case 1 (reference: workstation)
- The size of the ARSM generations (in this case 9) dominates the speedup effects

Optimization on a multicore Opteron
Test case 2 (reference: workstation)
- 24 designs per EA generation, which doesn't affect performance
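Why the generation size matters can be illustrated with an idealized model: all designs of a generation must finish before the next generation is created, so a generation of 9 designs cannot keep 24 job slots busy. The Python sketch below is not the author's measurement. It assumes that every job takes the same time on both machines and ignores I/O and design generation; the function names are hypothetical.

```python
import math

def rounds_per_generation(generation_size: int, parallel_jobs: int) -> int:
    """Sequential solver rounds needed to finish one generation."""
    return math.ceil(generation_size / parallel_jobs)

def ideal_speedup(generation_size: int, jobs_reference: int, jobs_machine: int) -> float:
    """Speedup over the reference under the equal-job-time assumption."""
    return (rounds_per_generation(generation_size, jobs_reference)
            / rounds_per_generation(generation_size, jobs_machine))

# Job slots from the slides: workstation 4 jobs, multicore Opteron 24 jobs.
print(ideal_speedup(9, 4, 24))   # test case 1 (ARSM, 9 designs):  3.0
print(ideal_speedup(24, 4, 24))  # test case 2 (EA, 24 designs):   6.0
```

In this model the 9-design ARSM generation leaves 15 of the 24 job slots idle, while the 24-design EA generation fills them all.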

Optimization on a multicore Opteron
- Simple setup, just like a workstation
- Power consumption like a standard 8-core workstation
- Hardware costs about twice those of a workstation
- Number of cores like in a cluster
- 6 times faster than an up-to-date 8-core workstation
- Main expense: licensing (as you will see later)
- Extremely efficient way to speed up your optimizations

Optimization on an HPC cluster using RSM
- RSM host: generation of optimization designs, pre-/post-processing
- I/O between host and compute nodes
- Compute nodes: solution

Optimization on an HPC cluster using RSM
Test case 1 (reference: workstation)
- The size of the ARSM generations (in this case 9) dominates the speedup effects

Optimization on an HPC cluster using RSM
Problems and disadvantages
- Performance is killed by the huge amount of time spent on I/O (in the case of a PSD analysis, the results of the modal analysis are transferred to the head node and from there back to the host to perform the PSD analysis)
- Each process needs its own license: to calculate 5 jobs in parallel you need 5 prep/post licenses, 5 batch licenses and 5 HPC Packs
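The licensing rule on this slide scales linearly with the number of parallel jobs. A small Python sketch of the count, assuming exactly one license of each type per job as stated above (the function name and dictionary keys are illustrative):

```python
def licenses_for_parallel_jobs(parallel_jobs: int) -> dict:
    """One license of each type per parallel job, as stated on the slide."""
    return {
        "prep/post": parallel_jobs,
        "batch": parallel_jobs,
        "HPC Pack": parallel_jobs,
    }

needed = licenses_for_parallel_jobs(5)
print(needed)                # {'prep/post': 5, 'batch': 5, 'HPC Pack': 5}
print(sum(needed.values()))  # 15 licenses in total
```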

Optimization on an HPC cluster in a Linux environment
- optiSLang runs in the Linux environment
- The head node only generates the optimization designs
- The remote machines do the entire calculation (solution and pre-/post-processing)
- Small amount of I/O: only e.g. text files or pictures of the evaluated results must be transferred

Optimization on an HPC cluster in a Linux environment
- optiSLang runs in the Linux environment
- The head node only generates the optimization designs
- An additional optiSLang variable allows remote calculation on more than one remote machine
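The slides do not show how the additional variable selects a remote machine. The following Python sketch only illustrates one possible scheme, a round-robin assignment of the designs of one generation to the remote machines. It uses no optiSLang or ANSYS API; the host names and the function name are hypothetical.

```python
from itertools import cycle

def assign_designs(design_ids, hosts):
    """Round-robin plan: map each remote host to the designs it should calculate."""
    plan = {host: [] for host in hosts}
    for design_id, host in zip(design_ids, cycle(hosts)):
        plan[host].append(design_id)
    return plan

hosts = ["node01", "node02", "node03", "node04"]  # hypothetical host names
plan = assign_designs(range(1, 10), hosts)         # one ARSM generation of 9 designs

for host, designs in plan.items():
    print(host, designs)
```

With 9 designs and 4 machines, one machine receives three designs and the others two each, so the generation finishes only when the busiest machine is done.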

Optimization on an HPC cluster in a Linux environment
Test case 1 (reference: workstation)
- The size of the ARSM generations (in this case 9) dominates the speedup effects

Optimization on an HPC cluster in a Linux environment
Test case 2
- 24 designs per EA generation, which doesn't affect performance
- 42 times faster

Optimization on an HPC cluster in a Linux environment
[Figure: performance comparison; labels: current hardware, parallel computing, previous-generation hardware]

Optimization on an HPC cluster in a Linux environment
- Comparison of standard licensing and RDO-Pack licensing
- RDO-Pack multiplies the number of available licenses by 8, for optimization purposes only
- Configurations shown: 1*8 cores (workstation), 12*4 cores (Opteron), 8*8 cores, 8*16 cores, 12*8 cores
- 42 times faster at 8 times the cost
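The two figures on this slide can be combined into a speedup-per-cost ratio. A minimal Python sketch, using only the numbers from the slide; the variable and function names are illustrative, and the base license count of 1 in the usage line is a hypothetical input:

```python
RDO_PACK_FACTOR = 8  # RDO-Pack multiplies the number of available licenses by 8

def rdo_pack_licenses(base_licenses: int) -> int:
    """Licenses usable for optimization runs with an RDO-Pack."""
    return base_licenses * RDO_PACK_FACTOR

speedup = 42.0       # relative to the 1*8-core workstation
relative_cost = 8.0  # cost relative to the workstation setup

speedup_per_cost = speedup / relative_cost
print(speedup_per_cost)      # 5.25
print(rdo_pack_licenses(1))  # 8
```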

Conclusions
- ANSYS Classic is always faster than Workbench
- A multicore machine is the most convenient way to go for parallel optimization
- RSM generates a huge amount of I/O, which makes it inefficient for accelerating optimization
- Important factor for speedup possibilities: generation sizes
- Using up-to-date hardware doubles performance while causing very little extra cost compared to licensing
- Due to the new licensing model (RDO-Pack), optimization on an HPC cluster can be very cost-efficient (42x faster, 8x cost)

Questions?

Comparison of Windows RSM & Linux RSM
- Data transfer works three times faster with Linux RSM!