Wind-Tunnel Simulation using TAU on a PC-Cluster: Resources and Performance Stefan Melber-Wilkending / DLR Braunschweig Folie 1 > Vortrag > Stefan Melber-Wilkending
Wind-Tunnel Simulation using TAU on a PC-Cluster: Resources and Performance
Outline
  New Linux PC-Cluster at Braunschweig (DLR-AS)
  Performance measurements of TAU on PC-Clusters:
    Platforms
    Results
  Example of an application on a PC-Cluster: wind-tunnel simulation
    Wind-tunnel boundary condition
    Example: simulation of the DLR ALVAST high-lift configuration in the low-speed wind tunnel DNW-NWB
New Linux PC-Cluster at DLR-AS
Technical Data - General
  New Linux PC-Cluster at DLR-AS / Braunschweig:
    For medium-sized CFD problems
    Production use for research and contract work
    Size: 276 Opteron 2.6 GHz CPUs
    Hardware installation and testing: 09/2005
    Open for user access: 10/2005
New Linux PC-Cluster at DLR-AS
Technical Data - Nodes
  138 Dual-Opteron (AMD) nodes (V20z, SUN)
    CPU clock speed: 2.6 GHz
    4 GByte DDR1/400 memory
    2 x 73 GB Ultra320 SCSI HDs
    Management processor (remote power reset, monitoring, error analysis, ...)
    Infiniband HPC interconnect
    100 MBit Ethernet interconnect
    Size: 1 HU
    SuSE Linux 9.3 professional
New Linux PC-Cluster at DLR-AS
Technical Data - Frontends
  2 Frontends (V40z, SUN)
    4 x Opteron 2.2 GHz (AMD)
    8 GByte DDR1/333 memory
    2 x 73 GB Ultra320 SCSI HDs
    100 MBit Ethernet interconnect
    Size: 3 HU
    SuSE Linux 9.3 professional
  RAID system: 10 TByte
  Infiniband switch: 144 ports (Voltaire)
  PBS Pro queuing system / MAUI scheduler
New Linux PC-Cluster at DLR-AS Technical Data - Setup
New Linux PC-Cluster at DLR-AS
Performance: Compared Systems
  NEC-Cluster (DLR-AS): 32 nodes / 64 CPUs, Intel Xeon 3.06 GHz
    2 GByte RAM / node, Myrinet 2000 interconnect
  Cray-Cluster (HWW): 128 nodes / 256 CPUs, AMD Opteron 2.0 GHz
    4 GByte RAM / node, Myrinet 2000 interconnect
  SUN-Cluster (DLR-AT): 192 nodes / 384 CPUs, AMD Opteron 2.4 GHz
    4 GByte RAM / node, Infiniband (Voltaire) interconnect
  Cray XD1-Cluster (Cray): 36 nodes / 72 CPUs, AMD Opteron 2.2 GHz
    4 GByte RAM / node, RapidArray interconnect (direct connection between network and HyperTransport channel on the CPU)
  Cray XD1-Cluster (Cray): 72 nodes / 144 CPUs, AMD Opteron 2.4 GHz
    8 GByte RAM / node, RapidArray interconnect
New Linux PC-Cluster at DLR-AS
Performance: Setup
  All clusters running under the Linux operating system
  Compiler: GnuCC 3.2.3
  TAU-Code, version 2004.1.2, with typical settings for complex configurations:
    Central discretization
    Implicit time integration (LU-SGS)
    CFL number: 5
    Multigrid: 3v
    Turbulence model: Menter k-ω SST
    Low-Mach-number preconditioning
    Cache optimization
  Case: glider with laminar-turbulent transition
    Free-stream conditions: Ma = 0.078, Re = 1.1e6
    Grid: 10 million points, 30 layers
New Linux PC-Cluster at DLR-AS
Performance: Test Results
CPU time for 50 cycles [s] for different numbers of CPUs

  CPUs   NEC Xeon   Cray Opteron   Cray Opteron   Cray Opteron   SUN Opteron    SUN Opteron
         3.06 GHz   2.0 GHz        2.2 GHz        2.4 GHz        2.4 GHz        2.6 GHz
         (AS)                                                    (AT)           (AS)
     8   -          1667           1307           1222           1266           1126
    12   1564       1108            881            811            987            743
    16   1203        760            661            621            669            572
    32    643        436            347            326            339            306

  Incomplete rows, values in the order given on the slide (column assignment not recoverable):
     6 CPUs: 2303, 1947, 1702
    48 CPUs: 241, 236
    60 CPUs: 176, 183, 165
New Linux PC-Cluster at DLR-AS
Performance: Test Results
Relative speedup compared to the Cray Opteron cluster at HWW (2.0 GHz = 100)

  CPUs   NEC Xeon   Cray Opteron   Cray Opteron   Cray Opteron   SUN Opteron    SUN Opteron
         3.06 GHz   2.0 GHz        2.2 GHz        2.4 GHz        2.4 GHz        2.6 GHz
         (AS)                                                    (AT)           (AS)
     8   -          100            128            136            132            148
    12   71         100            126            137            121            149
    16   63         100            115            114            114            133
    32   68         100            126            134            129            143

  Incomplete row, values in the order given on the slide (column assignment not recoverable):
     6 CPUs: 100, 118, 135
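The relative speedup values appear to be the ratio of the reference CPU time (Cray Opteron 2.0 GHz) to the CPU time of each system, expressed in percent. The formula is an assumption inferred from the numbers, not stated on the slide. A minimal Python sketch for the 8-CPU row, with the times copied from the CPU-time table:

```python
# CPU times for 50 cycles [s] at 8 CPUs, copied from the timing table.
# Column order: Cray 2.0 GHz (reference), Cray 2.2 GHz, Cray 2.4 GHz,
#               SUN 2.4 GHz (AT), SUN 2.6 GHz (AS)
TIMES_8_CPUS = [1667, 1307, 1222, 1266, 1126]


def relative_speedup(t_ref, t):
    """Speedup in percent relative to the reference time (assumed formula)."""
    return 100.0 * t_ref / t


speedups_8 = [round(relative_speedup(TIMES_8_CPUS[0], t)) for t in TIMES_8_CPUS]
print(speedups_8)  # [100, 128, 136, 132, 148]
```

Applying the same formula to the other rows reproduces most tabulated values. A few entries come out differently (for example 12 CPUs on the SUN 2.4 GHz cluster: 1108 / 987 gives 112 rather than 121), which may point to a transcription error in one of the two tables.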
New Linux PC-Cluster at DLR-AS
Performance: Test Results
  The speed of TAU on Opteron CPUs scales linearly with CPU clock speed
  Compared to the Cray Opteron 2.0 GHz cluster, the new cluster is about 1.5 times faster
  Compared to the NEC Xeon 3.06 GHz cluster (standard cluster at AS-BS), the new cluster is about 2.1 times faster
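The two factors quoted here can be cross-checked against the CPU-time table. The sketch below averages the time ratios over the CPU counts for which all three systems have an entry (12, 16 and 32 CPUs). Averaging over these rows is an assumption made for illustration; the slide does not say how the factors were derived.

```python
# CPU times for 50 cycles [s] from the timing table (12, 16 and 32 CPUs).
NEC_XEON_306 = {12: 1564, 16: 1203, 32: 643}   # NEC Xeon 3.06 GHz (AS)
CRAY_OPT_20 = {12: 1108, 16: 760, 32: 436}     # Cray Opteron 2.0 GHz (HWW)
SUN_OPT_26 = {12: 743, 16: 572, 32: 306}       # SUN Opteron 2.6 GHz (AS, new cluster)


def mean_ratio(slow, fast):
    """Average of slow/fast CPU-time ratios over the CPU counts of `fast`."""
    ratios = [slow[n] / fast[n] for n in fast]
    return sum(ratios) / len(ratios)


vs_cray = mean_ratio(CRAY_OPT_20, SUN_OPT_26)
vs_nec = mean_ratio(NEC_XEON_306, SUN_OPT_26)
print(f"new cluster vs. Cray 2.0 GHz: {vs_cray:.2f}x")
print(f"new cluster vs. NEC Xeon:     {vs_nec:.2f}x")
```

With these three rows the ratio to the NEC Xeon cluster comes out close to 2.1, and the ratio to the Cray 2.0 GHz cluster at roughly 1.4, slightly below the rounded figure of 1.5.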
New Linux PC-Cluster at DLR-AS
Performance: Test Results
  Speedup compared to 8 CPUs (memory restrictions of the test case)
  Nearly linear scalability of the TAU-Code up to 60 CPUs
  Tested interconnects (Myrinet, Infiniband, RapidArray) have enough reserve for TAU parallelisation
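The scalability statement can be illustrated with the CPU times of the new cluster (SUN Opteron 2.6 GHz) from the timing table. The sketch uses 8 CPUs as the baseline, as on the slide, and defines parallel efficiency as measured speedup divided by ideal speedup. That definition is the usual one but is an assumption here, and only the rows with an unambiguous column assignment (8 to 32 CPUs) are used.

```python
# CPU times for 50 cycles [s], SUN Opteron 2.6 GHz (AS), from the timing table.
SUN_26_TIMES = {8: 1126, 12: 743, 16: 572, 32: 306}
BASE_CPUS = 8  # baseline imposed by the memory restrictions of the test case


def scaling(times, base):
    """Return {n_cpus: (speedup, efficiency)} relative to the base CPU count."""
    result = {}
    for n, t in times.items():
        speedup = times[base] / t      # measured speedup vs. baseline
        ideal = n / base               # ideal (linear) speedup
        result[n] = (speedup, speedup / ideal)
    return result


scaling_table = scaling(SUN_26_TIMES, BASE_CPUS)
for n, (speedup, efficiency) in sorted(scaling_table.items()):
    print(f"{n:3d} CPUs: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

For these rows the efficiency stays above 0.9 up to 32 CPUs, which is consistent with the "nearly linear" statement.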
Wind-Tunnel simulation using TAU-Code
General
  Simulation of a wind tunnel including test section and nozzle
  Background:
    Avoid uncertainties of wind-tunnel corrections
    Uncorrected measurements directly comparable to CFD
    Validation of wind-tunnel corrections
    Extrapolation of wind-tunnel results to free flight using CFD
    DLR project ForMEx (Fortschrittliche Methoden zur Extrapolation von Windkanalergebnissen auf den Freiflug; advanced methods for extrapolating wind-tunnel results to free flight)
  Problem:
    Numerical simulation of the wind tunnel including the model
    Large grids (about 20 million points)
    HPC resources needed: new PC-Cluster at AS-BS
Wind-Tunnel simulation using TAU-Code
Wind-Tunnel Boundary Condition
  Idea: usage and extension of the engine boundary condition
  Wind-tunnel inlet: total pressure and total temperature are given
  Regulation of flow speed in the wind tunnel:
    Imaginary probe in the numerical test section (same position as in the experiment)
    Comparison with the given Mach number
    Input for static-pressure regulation at the tunnel outlet
  Applicable for 0 < Ma < 1
  [Diagram labels: Numerical Wind-Tunnel, Imaginary Probe, Bound. Cond. (TAU-Code), Pressure on Outlet]
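The regulation idea can be sketched as a simple feedback loop. The example below is a toy model and not the TAU implementation: the flow solver is replaced by the isentropic relation between static pressure, total pressure and Mach number, the probe is assumed to see the outlet static pressure, and the proportional update law, the gain, the target Mach number of 0.2 and all function names are hypothetical.

```python
import math

GAMMA = 1.4  # ratio of specific heats for air


def probe_mach(p_static, p_total):
    """Isentropic Mach number from static and total pressure (toy probe)."""
    ratio = max(p_total / p_static, 1.0)  # guard: no flow if p_static >= p_total
    return math.sqrt(2.0 / (GAMMA - 1.0) * (ratio ** ((GAMMA - 1.0) / GAMMA) - 1.0))


def regulate_outlet_pressure(p_total, mach_target, gain=0.1, steps=200):
    """Adjust the outlet static pressure until the probe reads mach_target.

    Hypothetical proportional law: if the probe Mach number is too high,
    the outlet pressure is raised, and vice versa.
    """
    p_out = 0.999 * p_total  # start close to stagnation (almost no flow)
    for _ in range(steps):
        mach = probe_mach(p_out, p_total)  # stand-in for one flow solution
        p_out += gain * p_total * (mach - mach_target)
    return p_out, probe_mach(p_out, p_total)


p_out, mach = regulate_outlet_pressure(p_total=101325.0, mach_target=0.2)
print(f"outlet pressure {p_out:.1f} Pa, probe Mach number {mach:.4f}")
```

In a real simulation each loop step would correspond to a number of solver iterations, and the probe value would come from the computed flow field at the probe position rather than from a closed-form relation.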
Wind-Tunnel simulation using TAU-Code
Validation
  Measurements in the empty low-speed wind tunnel DNW-NWB
  Database for validation of numerical results
  Measurements:
    Boundary-layer profiles
    Static pressure at the tunnel outlet
Wind-Tunnel simulation using TAU-Code
Preliminary Results DNW-NWB / ALVAST
  DLR-ALVAST half-model in high-lift configuration in DNW-NWB
    DLR-ALVAST: similar to the Airbus A320
    Half-model mounted on a peniche
  Grids:
    Hybrid unstructured
    Centaur grid generator
    20 million points
    Full Navier-Stokes
    Chimera technique: rotation of the model without new grid generation
Wind-Tunnel simulation using TAU-Code
Preliminary Results DNW-NWB / ALVAST
  Simulation of complete lift polars including maximum lift
  Geometry variations:
    Wing-root geometry (e.g. slat horn, 16 configurations)
  Comparison of wind-tunnel simulation against free flight (wind-tunnel corrections)
  Influence of peniche height
Wind-Tunnel simulation using TAU-Code
Preliminary Results DNW-NWB / ALVAST
  [Figure: ALVAST TAU F11 Wind-Tunnel; horse-shoe vortex around peniche]
Wind-Tunnel simulation using TAU-Code Preliminary Results DNW-NWB / ALVAST
Conclusions
  TAU tested on Linux PC-Clusters:
    Good scalability and performance
    New cluster at AS/BS available for production: 10/2005
  Implementation of a wind-tunnel boundary condition in TAU:
    Validation with empty wind-tunnel measurements
    First results of the simulation of the ALVAST high-lift configuration in DNW-NWB compared to the experiment
  Further work: investigation of half-model influence, variation of geometry, ...
Special thanks for testing support and debugging of the TAU parallelisation
  W. Hafemann, C. Simmendinger (T-Systems)
  N. Gal, Y. Shahar (Voltaire)
  J. Redmer, T. Warschko (Linux NetWorx)
  Axel Köhler (SUN)
  Institute of Propulsion Technology (DLR-AT)
  R. Dwight, T. Alrutz (DLR-AS)
  M. Wierse (Cray)