High-performance computing in mechanical engineering
SIMPRO VTT subproject, task 1
Janne Keränen, Juha Kortelainen, Marko Antila, Kai Katajamäki, Aino Manninen, Vesa Nieminen, Aku Karvinen
Task 1.1 Computational resource management systems
Motivation: Why use computational resource management
- Computational resource management means
  - Managing concrete computational resources, such as processing and storage resources and possible peripheral systems, i.e. additional hardware resources
  - Managing computational queues and computing load balancing, i.e. execution of computational jobs
- Distributed resource management systems (DRMS; also called resource management systems, resource managers, job schedulers, or workload management systems) are meant to
  - Provide fair access and increase the utilisation rate of the computational resources
  - Help the users to find the best available resources for their computations
  - Simplify the submission, execution, monitoring, management, and results retrieval of large computational cases
  - The user need not know the computation hardware: the user only gives some requirements for the hardware and the DRMS handles the rest
- When the management of the computations becomes too complicated or too inefficient to be done manually, it is time to consider using a computational resource management system
13/11/2015
Distributed resource management systems utilised in the case studies
- Grid Engine (Univa Grid Engine, Son of Grid Engine, Open Grid Scheduler)
  - Originally developed by Sun Microsystems; presently the proprietary version is owned by Univa
  - Open source versions based on the original Sun Grid Engine (SGE): Son of Grid Engine and Open Grid Scheduler
- SLURM (Simple Linux Utility for Resource Management)
  - Development and maintenance coordinated by SchedMD LLC, which also gives commercial support
  - Open source software (GNU GPL v2)
- HTCondor
  - Developed and maintained by the University of Wisconsin-Madison, USA
  - Open source software (Apache License v2.0)
  - Commercial support available from third-party companies (e.g. Red Hat, Inc.)
- Techila
  - Developed and maintained by Techila Technologies Oy, Finland
  - Proprietary software
- Some others
  - Portable Batch System PBS (PBS Professional, TORQUE)
  - Grid resource management systems, so-called meta-schedulers (e.g. open source GridWay and Globus Toolkit), which connect local resource management systems into systems of systems
Different needs, different solutions
- There are different environments where computational resource management systems can be used:
  - Distributed, heterogeneous computational resources: e.g. office networks
  - Server systems or workstations: one system with one or more processors and several computational cores
  - Cluster systems: several computational nodes (computers), each with one or more processors and several computational cores
  - Grid systems: network of distributed computational subsystems, such as clusters and servers (a system of systems)
- Different resource management systems are focused on different needs and concepts
  - E.g. Grid Engine, SLURM, and TORQUE are focused on cluster (or server) systems, and their operation is based on computational queues
  - E.g. Techila and HTCondor are focused on distributed resources, utilising unused office computer resources; their operation is based on selecting resources for given computation attributes
- No single DRMS handles all the use scenarios well, hence the multitude of different workable systems
Example case 1: Grid Engine in the VTT computational cluster
- Grid Engine (SGE) used to be the most common DRMS; presently there are several competing versions
- SGE is the DRMS of the VTT computational cluster
  - Close to 2000 cores, Rocks Cluster Distribution, InfiniBand network, NetApp storage system
  - Both the last Sun version (SGE 6.2u5) and Open Grid Scheduler 2011.11p1 were tested
- An SGE cluster is composed of execution machines and a master machine (and a possible backup master)
- Experiences:
  - The queuing system is neither fair nor practical for HPC
  - With the default configuration SGE overloads the nodes, which causes problems with HPC
  - The job scheduling is based on free CPUs, not e.g. on free memory
  - If memory is the limiting resource, SGE is challenging to use
Example case 2: HTCondor with DAKOTA optimisation software, two scenarios
- Scenario 1: DAKOTA feeds separate jobs into the HTCondor computational pool
  - Each case is executed separately in the computational pool
  - Can utilise heterogeneous independent resources in the pool
  - Collecting the data for the whole study requires additional tricks
- Scenario 2: A DAKOTA job is submitted into the HTCondor computational pool
  - The whole DAKOTA study is treated as one job
  - Best suited to larger resources in the computational pool, such as a server
  - Execution of the DAKOTA run is straightforward, but retrieving the result files needs additional tricks
Task 1.2 Scripting in high performance computing environment
Why scripting in HPC?
- Automate repeating tasks
- Own routines to speed up modelling, analysis, and post-processing
- Interface between different software
- Calculation engine in optimisation
- Python is presently the most common and, in the authors' view, the best scripting language
  - Easy to learn and use, even supports classes
  - Special libraries for many practical needs
    - NumPy for efficient vector calculation
    - Matplotlib for plotting
    - Much more: web development etc.
  - Development environments: Python IDLE, Komodo, NetBeans, Pystudio, etc.
Scripting with Python/NumPy: powerful in vector calculation
- Case study: correlation between two sets of vectors, e.g. mode shape vectors in dynamics, the so-called modal assurance criterion (MAC)
- Sizes of vectors can be huge in FEM
  - Vector size can be tens of millions, and the number of vectors can be several hundred
  - Direct matrix multiplication (MATLAB or similar) is not possible
- The NumPy library is extremely efficient in HPC
- Matrix visualisation with Matplotlib
- Easy to couple with optimisation for mode shape identification
- To take into use:
    import numpy
    import matplotlib
- Example output:
    Reference mode 10 35.436Hz corresponds to 10 31.838Hz mac value is 0.922056806543
    Reference mode 12 39.148Hz corresponds to 11 37.946Hz mac value is 0.924408685671
    Reference mode 14 48.975Hz corresponds to 12 42.478Hz mac value is 0.919284572632
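The MAC calculation can be sketched with NumPy as follows. This is a minimal illustration, not the script used in the case study: it assumes real-valued mode shapes stored column-wise, uses the standard definition MAC(i, j) = (phi_i . psi_j)^2 / ((phi_i . phi_i)(psi_j . psi_j)), and accumulates the products over blocks of rows so that very long vectors need not be multiplied in one piece. The function name `mac_matrix`, the `chunk` parameter, and the synthetic data are assumptions made for the example.

```python
import numpy as np


def mac_matrix(ref_modes, test_modes, chunk=100_000):
    """Return the MAC matrix between the columns of two real mode shape arrays.

    ref_modes:  (n_dof, n_ref) array, one reference mode shape per column
    test_modes: (n_dof, n_test) array, one compared mode shape per column
    chunk:      number of rows (degrees of freedom) processed at a time
    """
    n_dof = ref_modes.shape[0]
    cross = np.zeros((ref_modes.shape[1], test_modes.shape[1]))
    ref_sq = np.zeros(ref_modes.shape[1])
    test_sq = np.zeros(test_modes.shape[1])
    for start in range(0, n_dof, chunk):
        a = ref_modes[start:start + chunk]
        b = test_modes[start:start + chunk]
        cross += a.T @ b                    # partial inner products phi_i . psi_j
        ref_sq += (a * a).sum(axis=0)       # partial squared norms of reference modes
        test_sq += (b * b).sum(axis=0)      # partial squared norms of compared modes
    return cross**2 / np.outer(ref_sq, test_sq)


# Synthetic data only: two compared modes are scaled copies of reference modes.
rng = np.random.default_rng(0)
phi = rng.standard_normal((1000, 3))
psi = np.column_stack([2.0 * phi[:, 0], -phi[:, 1], rng.standard_normal(1000)])

mac = mac_matrix(phi, psi, chunk=300)
best = mac.argmax(axis=1)                   # best-matching compared mode per reference mode
for i, j in enumerate(best):
    print(f"Reference mode {i} corresponds to {j}, MAC value is {mac[i, j]:.3f}")
```

Because MAC is insensitive to the scaling and sign of a mode shape, the scaled and sign-flipped copies in the synthetic data still pair with their originals.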
Python scripting in model updating
- In model updating, the open source optimisation software DAKOTA was used
- Abaqus was used to solve the natural modes and frequencies
- Workflow, run by a Python script:
  1. From parameters given by DAKOTA, the Python script creates Abaqus input data by applying the parameters to template files
  2. The script runs the Abaqus analysis
  3. MAC calculation with Python/NumPy
  4. The script returns results to DAKOTA
- DAKOTA starts a new iteration step
- Matplotlib can be used for visualisation
[Figure: workflow diagram with the boxes: DAKOTA parameter file; Apply parameters into template file; Abaqus input file; Write modes to file; Write frequencies to file; MAC calculation; DAKOTA results file]
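Step 1 of the workflow (applying parameters to a template file) can be sketched with the Python standard library. This is an illustration under assumptions, not the project's script: the template content, the placeholder names (`youngs_modulus`, `poisson_ratio`, `density`), and the simplified "value name" parameter layout are all hypothetical, and the functions `read_parameters` and `fill_template` are invented for the example.

```python
from string import Template

# Hypothetical Abaqus-like template; the placeholder names are assumptions.
TEMPLATE_TEXT = """*MATERIAL, NAME=STEEL
*ELASTIC
$youngs_modulus, $poisson_ratio
*DENSITY
$density
"""


def read_parameters(lines):
    """Parse 'value name' pairs, one per line (an assumed, simplified layout)."""
    params = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            value, name = parts
            params[name] = float(value)
    return params


def fill_template(template_text, params):
    """Substitute parameter values into the template.

    Raises KeyError if the template needs a parameter that was not given.
    """
    formatted = {name: f"{value:.6g}" for name, value in params.items()}
    return Template(template_text).substitute(formatted)


# Stand-in for the contents of a parameter file written by the optimiser.
param_lines = ["2.1e11 youngs_modulus", "0.3 poisson_ratio", "7850 density"]
params = read_parameters(param_lines)
input_deck = fill_template(TEMPLATE_TEXT, params)
print(input_deck)
```

Using `Template.substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a missing parameter stops the run instead of silently producing an input file with an unfilled placeholder.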
Task 1.3 Multi-physics simulations for electrical machine development
Elmer open-source FEM: Electrical machine end-winding model
- Electromagnetic fields and forces for end-windings
- Parallel computation, partitioning, linear algebra, etc.
- Elmer effectively utilises 500 cores, whereas the commercial FEM code utilises fewer than 20 cores
[Figure panels: Elmer, Sisu; Elmer, VTT cluster (Doctor); Commercial FEM, VTT cluster]
Elmer open-source FEM: Induction machine model
- Multiphysics: electromagnetics, rotating model, electrical circuits
- Comparison of the parallel performance in the VTT cluster and the CSC supercomputer Sisu
[Figure panels: VTT cluster; Sisu]
- Partitioning with rotation: Elmer can utilise 50-500 cores, depending on the model complexity
- Computational time reduces from weeks or days to days or hours
- Enables a new type of multiphysical modelling: accurate electromagnetic-thermal and vibro-acoustic analysis
Analytical design tool for permanent magnet electrical machines, for optimisation
- An analytical tool for the design of permanent magnet machines was implemented with MATLAB
- About 30-40 design variables have to be determined during the design process
- Initial parameters: pole pair number, air-gap flux density (T), stator current density (A/m²), ratio between diameter and height, air-gap (m)
- Output: efficiency, mass of magnets, mass of copper, mass of iron
- Design procedure:
  1. Choose initial parameters and define rotor dimensions
  2. Design armature (stator) winding
  3. Define magnet dimensions
  4. Define stator tooth and slot dimensions based on target values of magnetic flux density in stator
  Decision: Back-EMF is large enough? (Yes / No)
  5. Calculate machine properties, e.g. shaft power, efficiency, power factor, losses, temperatures in different parts
[Figure: machine cross-section with the labels Magnets, Stator, Rotor, Windings]
Multi-objective optimisation problem for the electrical machine case
- 6 objectives
  - Output power, target 3 MW
  - Torque density, maximised
  - Mass, minimised
  - Efficiency, maximised
  - Power factor, maximised
  - Cost, minimised
- Constraints:
  - Slot pitch > 7 mm
  - 1 ≤ P_out / P_target ≤ 1.05
  - Temperature of permanent magnets < 100 °C
- 14 input parameters
  - Pole pair number, desired current densities, air-gap length, magnet width, tangential stress, flux density, number of slots, stator outer diameter, slot shape, etc.
- Results with DAKOTA: optimal torque density (Nm/m³) with different maximum numbers of function evaluations; selected values and measures from 50 runs

  Max. function evaluations   500    1000   5000   10000  15000  20000  25000
  Best                        26.18  47.88  40.36  40.32  49.38  40.32  40.32
  Worst                       16.48  19.06  21.82  21.82  21.67  21.82  21.82
  Average                     21.04  25.76  29.59  29.43  29.83  29.37  29.43
  Median                      20.68  24.88  28.30  28.30  28.14  28.30  28.30
  Standard deviation           2.18   4.77   5.52   5.32   6.13   5.30   5.32
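The three constraints of the optimisation problem can be written as a simple feasibility check. This is only a sketch: the function name `is_feasible` and its argument names are hypothetical, and the power constraint is read here as 1 ≤ P_out / P_target ≤ 1.05, which is an interpretation of the slide text rather than a confirmed formulation.

```python
def is_feasible(slot_pitch_mm, p_out_mw, magnet_temp_c, p_target_mw=3.0):
    """Check one candidate design against the three listed constraints.

    The bounds follow the slide; the interface itself is an assumption.
    """
    power_ratio = p_out_mw / p_target_mw
    return (
        slot_pitch_mm > 7.0                  # slot pitch > 7 mm
        and 1.0 <= power_ratio <= 1.05       # output power within 0-5 % above target
        and magnet_temp_c < 100.0            # permanent magnets below 100 degrees C
    )


# Made-up candidate designs, only to exercise the check.
candidates = [
    {"slot_pitch_mm": 8.0, "p_out_mw": 3.10, "magnet_temp_c": 90.0},
    {"slot_pitch_mm": 6.5, "p_out_mw": 3.10, "magnet_temp_c": 90.0},
    {"slot_pitch_mm": 8.0, "p_out_mw": 3.30, "magnet_temp_c": 90.0},
]
for design in candidates:
    print(design, "feasible" if is_feasible(**design) else "infeasible")
```

In an optimiser such a check would typically be reported as constraint values rather than a single boolean, but the boolean form is enough to show what the constraints exclude.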
Task 1.4 Co-simulation and parallelisation in technical simulation
Case study for fluid-structure interaction (FSI)
- Rotating propeller
- A non-rotating cylinder in the wake represents, for example, the body of an azimuthing thruster
- Structural parts are surrounded by a tube, which forms a cavity for the fluid (water)
- Dimensions:
  - Propeller diameter: 600 mm
  - Cylinder diameter: 150 mm
  - Fluid tube diameter: 960 mm
Fluid-structure interaction: co-simulation process with MpCCI
- FSI co-simulation coupling interface: MpCCI (developed at Fraunhofer Institute SCAI)
- So-called weak coupling: each problem is solved separately and variables are exchanged before each time step in both directions
  - During an iteration step, data is not exchanged between the codes
- Coupled variables in this FSI are displacements and pressure
- Non-conforming meshes: a shape function mapping is used for data exchange between the two non-matching grids
- Explicit-transient coupling
- Sequential serial coupling algorithm
- Time step used: 100 µs
- Code A: solid mechanics code (Abaqus)
- Code B: CFD code (Fluent)
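The weak, sequential coupling scheme can be illustrated with a toy loop in which two stand-in "solvers" exchange one variable each per time step and never iterate within a step. This is a sketch of the coupling pattern only: it does not use MpCCI, Abaqus, or Fluent, and the one-degree-of-freedom structure, the linear pressure law, and all numerical values are made up for the example.

```python
def fluid_step(displacement):
    """Toy 'CFD' solve: surface pressure for a given structural displacement."""
    p0, sensitivity = 1000.0, 2.0e5          # Pa, Pa/m (made-up values)
    return p0 - sensitivity * displacement


def structure_step(x, v, pressure, dt):
    """Toy 'structural' solve: one semi-implicit Euler step of a 1-DOF oscillator."""
    m, c, k, area = 1.0, 20.0, 1.0e4, 0.01   # kg, N s/m, N/m, m^2 (made-up values)
    a = (pressure * area - c * v - k * x) / m
    v_new = v + dt * a
    x_new = x + dt * v_new
    return x_new, v_new


def co_simulate(n_steps, dt=100e-6):
    """Sequential (staggered) coupling: one data exchange per time step."""
    x, v = 0.0, 0.0
    history = []
    for _ in range(n_steps):
        pressure = fluid_step(x)                     # fluid code receives the displacement
        x, v = structure_step(x, v, pressure, dt)    # structural code receives the pressure
        history.append((x, pressure))
    return history


history = co_simulate(n_steps=20000)
print(f"final displacement: {history[-1][0]:.6e} m")
```

The codes run one after the other and each uses the other's most recent result without sub-iterations, which is what distinguishes this explicit scheme from a strongly coupled one that would iterate within each time step until both solutions agree.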
Fluid-structure interaction: Results (1/3)
- In total, 8000 time steps were simulated, corresponding to 4 full rotations of the propeller
- The total co-simulation wall-clock time on a workstation was about 80 hours
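The figures above, together with the 100 µs time step of the coupling setup, imply the simulated time, the propeller speed, and the cost per time step. The short calculation below only combines those stated numbers; the variable names are chosen for the example.

```python
n_steps = 8000          # simulated time steps
dt = 100e-6             # s, time step used in the coupling
n_rotations = 4         # full propeller rotations covered
wall_clock_h = 80       # h, approximate total wall-clock time

simulated_time = n_steps * dt                     # s of physical time
rotation_period = simulated_time / n_rotations    # s per revolution
speed_rpm = 60.0 / rotation_period                # revolutions per minute
wall_per_step = wall_clock_h * 3600 / n_steps     # s of wall-clock time per time step

print(f"simulated time:      {simulated_time:.2f} s")
print(f"rotation period:     {rotation_period:.2f} s")
print(f"propeller speed:     {speed_rpm:.0f} rpm")
print(f"wall-clock per step: {wall_per_step:.0f} s")
```

This works out to 0.8 s of simulated time, i.e. 0.2 s per revolution or 300 rpm, at roughly 36 s of wall-clock time per coupled time step.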
Fluid-structure interaction: Results (2/3)
Fluid-structure interaction: Results (3/3)
- Axial displacements at the tip of the propeller blade and at the propeller hub: the blade frequency dominates
  - Caused by the cylinder structure located in the wake of the propeller
- Transversal and vertical displacements of the hub: the rotating frequency dominates
Task 1.5 Large-scale visualisation and open source tools in technical computations
Open source tools in technical photorealistic large-scale visualisation
- Benefits of open source
  - Availability of the source code: possibility to see what the code does and possibility to modify the code
  - Free of licence fees (important in HPC)
  - Continuous software development process; bugs are (usually) corrected faster, which improves security
- Tools used:
  - Salome platform for pre-processing
  - snappyHexMesh for grid generation
  - OpenFOAM for solution
  - ParaView for post-processing
  - Blender for rendering
Technical photorealistic visualisation: Example case
- Radio-controlled (RC) car
- Complex domain
- 16 million control volumes
- Tool chain: Salome, OpenFOAM, ParaView, Blender
Results from the RC car case
- Visualisation of the velocity at the symmetry plane using ParaView
Results
- Time-averaged streamlines over the RC car, rendered using Blender
Surrogate-based optimisation of an airfoil using open source software
- Case study: drag minimisation of an airfoil when the minimum lift is given as an inequality constraint
- Optimisation in DAKOTA, calculation of the objective function using the computational fluid dynamics (CFD) software OpenFOAM
- CFD results produce a non-smooth objective function due to the numerical errors arising from the coarseness of the grid
- Surrogate-based optimisation methods are therefore used, where the objective function is replaced by a simpler surrogate function
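The idea of replacing a non-smooth objective by a simpler surrogate can be shown with a one-variable toy problem. This is a sketch only: it does not call DAKOTA or OpenFOAM, the function `noisy_objective` is a made-up stand-in for a CFD evaluation, and a least-squares quadratic is just one possible surrogate choice, not necessarily the one used in the case study.

```python
import numpy as np


def noisy_objective(x, rng):
    """Stand-in for a CFD evaluation: smooth trend plus small numerical 'noise'."""
    return (x - 1.3) ** 2 + 0.5 + 0.02 * rng.standard_normal()


rng = np.random.default_rng(1)
samples_x = np.linspace(0.0, 3.0, 15)                  # sampled design points
samples_y = np.array([noisy_objective(x, rng) for x in samples_x])

# Quadratic surrogate fitted by least squares: s(x) = a*x**2 + b*x + c
a, b, c = np.polyfit(samples_x, samples_y, deg=2)

# Analytic minimiser of the surrogate (valid when a > 0), kept inside the sampled range.
x_opt = float(np.clip(-b / (2.0 * a), samples_x[0], samples_x[-1]))

print(f"surrogate minimum at x = {x_opt:.3f}")
```

The optimiser works on the smooth fitted curve, so the small sample-to-sample noise no longer creates spurious local minima. In practice the surrogate would be rebuilt around the current best point and the loop repeated, with the lift constraint handled in the same way.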
Airfoil optimisation results
[Figure: three plots against optimisation cycle (0 to 8): design variables m and p; drag coefficient Cd; lift coefficient Cl]
- Design variables define the shape of the airfoil
- Cd = drag coefficient
- Cl = lift coefficient (constrained to be > 0.55)
[Figure: optimum airfoil shape (black) compared with the initial shape (red)]