The UberCloud HPC Experiment: Compendium of Case Studies
Digital manufacturing technology and convenient access to High Performance Computing (HPC) in industry R&D are essential to increase the quality of our products and the competitiveness of our companies. Progress can only be achieved by educating our engineers, especially those in the "missing middle," and by making HPC easier to access and use for everyone who can benefit from this advanced technology.

The UberCloud HPC Experiment actively promotes the wider adoption of digital manufacturing technology. It is an example of a grass-roots effort to foster collaboration among engineers, HPC experts, and service providers to address challenges at scale. The UberCloud HPC Experiment started in mid-2012 with the aim of exploring the end-to-end process employed by digital manufacturing engineers to access and use remote computing resources in HPC centers and in the cloud. In the meantime, the Experiment has achieved the participation of 500 organizations and individuals from 48 countries. Over 80 teams have been involved so far. Each team consists of an industry end-user and a software provider; the organizers match them with a well-suited resource provider and an HPC expert. Together, the team members work on the end-user's application: defining the requirements, implementing the application on the remote HPC system, running and monitoring the job, getting the results back to the end-user, and writing a case study.

Intel decided to sponsor this Compendium of 25 case studies, selected from the first 60 teams, to raise awareness in the digital manufacturing community about the benefits and best practices of using remote HPC capabilities. This document is an invaluable resource for engineers, managers, and executives who believe in the strategic importance of this technology for their organizations.
Very special thanks to Wolfgang Gentzsch and Burak Yenier for making the UberCloud HPC Experiment possible. This UberCloud HPC Compendium of Case Studies has been sponsored by Intel and produced in conjunction with Tabor Communications Custom Publishing, which includes HPCwire, HPC in the Cloud, and Digital Manufacturing Report. If you are interested in participating in this experiment, either actively as a team member or passively as an observer, please register at the experiment's website.

A Tabor Communications, Inc. (TCI) publication. This report cannot be duplicated without prior permission from TCI. While every effort is made to assure accuracy of the contents, we do not assume liability for that accuracy, its presentation, or for any opinions presented.
Table of Contents

- Welcome Note
- Finding the Missing Middle
- Executive Summary

Case Studies:

- Team 1: Heavy Duty Abaqus Structural Analysis Using HPC in the Cloud
- Team 2: Simulation of a Multi-resonant Antenna System Using CST MICROWAVE STUDIO
- Team 4: Simulation of Jet Mixing in the Supersonic Flow with Shock
- Team 5: Two-phase Flow Simulation of a Separation Column
- Team 8: Flash Dryer Simulation with Hot Gas Used to Evaporate Water from a Solid
- Team 9: Simulation of Flow in Irrigation Systems to Improve Product Reliability
- Team 14: Electromagnetic Radiation and Dosimetry for High Resolution Human Body Phantoms and a Mobile Phone Antenna Inside a Car as Radiation Source
- Team 15: Weather Research and Forecasting on Remote Computing Resources
- Team 19: Parallel Solver of Incompressible, 2D and 3D Navier-Stokes Equations, Using the Finite Volume Method
- Team 20: NPB2.4 Benchmarks and Turbo-machinery Application on Amazon EC2
- Team 22: Optimization Study of Side Door Intrusion Bars
- Team 25: Simulation of Spatial Hearing
- Team 26: Development of Stents for a Narrowed Artery
- Team 30: Heat Transfer Use Case
- Team 34: Analysis of Vertical and Horizontal Wind Turbine
- Team 36: Advanced Combustion Modeling for Diesel Engines
- Team 40: Simulation of Spatial Hearing (Round 2)
- Team 44: CFD Simulation of Drifting Snow
- Team 46: CAE Simulation of Water Flow Around a Ship Hull
- Team 47: Heavy Duty Abaqus Structural Analysis Using HPC in the Cloud (Round 2)
- Team 52: High-Resolution Computer Simulations of Blow-off in Combustion Systems
- Team 53: Understanding Fluid Flow in Microchannels
- Team 54: Analysis of a Pool in a Desalinization Plant
- Team 56: Simulating Radial and Axial Fan Performance
- Team 58: Simulating Wind Tunnel Flow Around Bicycle and Rider
Welcome!

The UberCloud HPC Experiment started one year ago when Burak sent an email to Wolfgang with a seemingly simple question: "Hi Wolfgang, I am Burak. Why is cloud adoption in high performance computing so slow, compared to the rapid adoption of cloud computing in our enterprise community?" After several discussions and Skype conferences that elaborated on the fundamental differences between enterprise and high performance computing, Wolfgang got on a plane to San Francisco for a face-to-face meeting with Burak. Four days of long discussions and many cups of tea later, a long list of challenges, hurdles, and solutions for HPC in the cloud covered the whiteboard in Burak's office. The idea of the experiment was born. Later, after more than 30 HPC cloud providers had joined, we called it the UberCloud HPC Experiment.

We found that, in particular, small- and medium-sized enterprises in digital manufacturing would strongly benefit from HPC in the Cloud (or HPC as a Service). The major benefits they would realize by having access to additional remote compute resources are: the agility gained by speeding up product design cycles through shorter simulation run times; the superior quality achieved by simulating more sophisticated geometries or physics; and the discovery of the best product design by running many more iterations. These are benefits that increase a company's competitiveness. Tangible benefits like these make HPC, and more specifically HPC as a Service, quite attractive.

But how far away are we from an ideal HPC cloud model? At this point, we don't know. However, in the course of this experiment, as we followed each team closely and monitored its challenges and progress, we gained excellent insight into these roadblocks and how our teams have tackled them.
We are proud to present this Compendium of 25 selected use cases in digital manufacturing, with a focus on computational fluid dynamics and material analysis. It documents the results of over six months of hard work by the participating teams: their findings, challenges, lessons learned, and recommendations. We were amazed by how engaged all participants were, despite the fact that this was not their day job. Their inquiring minds, the chance to collaborate with some of the brightest people and companies in the world, and the opportunity to tackle some of today's greatest challenges associated with accessing remote computing resources were certainly their strongest motivators.

We want to thank all participants for their continuous commitment and for their voluntary contribution to their individual teams, to the Experiment, and thus to our whole HPC and digital manufacturing community. We want to thank John Kirkley from Kirkley Communications for his support with editing these use cases, and our media sponsor Tabor Communications for this publication. Last but not least, we are deeply grateful to our sponsor, Intel, who made this Compendium possible.

Enjoy reading!

Wolfgang Gentzsch and Burak Yenier
Neutraubling and Los Altos, June 1, 2013
Finding the Missing Middle

So far the application of high performance computing (HPC) to the manufacturing sector hasn't lived up to expectations. Despite an attractive potential payoff, companies have been slow to take full advantage of today's advanced HPC-based technologies such as modeling, simulation, and analysis. Fortunately, the situation is changing. Recently a number of important initiatives have gotten underway designed to bring the benefits of HPC to small- to medium-sized manufacturers (SMMs), the so-called "missing middle." For example, in the United States the National Center for Manufacturing Sciences (NCMS) is launching a network of Predictive Information Centers to bring the technology to these smaller manufacturers. The NCMS initiative is designed to help SMMs apply HPC-based modeling and simulation (M&S) to solve their manufacturing problems and be more competitive in the global marketplace.

A Unique Initiative

The UberCloud HPC Experiment is one of those initiatives, but with a difference. It's a grass-roots effort, the result of the vision of two working HPC professionals, Wolfgang Gentzsch and Burak Yenier. It's also international in scope, involving more than 500 organizations and individuals from around the globe. So far more than 80 teams, each consisting of an industry end-user (typically an SMM), a resource provider, a software provider, and an HPC expert, have explored the challenges and benefits associated with accessing and running engineering applications on cloud-provisioned HPC resources. This team approach is unique. Its success is a tribute to the organizers, who not only conceived the idea but also play matchmaker and mentor, bringing together winning combinations of team members, often from widely separated geographic locations. In Rounds 1 and 2, reported in this document, the teams have enthusiastically addressed these challenges at scale.
In the process they have identified and solved major problems that have limited the adoption of HPC solutions by the missing middle: those hundreds of thousands of small- to medium-sized companies worldwide that have yet to realize the full benefits of HPC. As you read through the case studies in this Compendium, you may feel, as Yogi Berra once famously said, that "it's déjà vu all over again." Among the 25 reports you will unquestionably find scenarios that resonate with your own situation. You will benefit from the candid descriptions of problems encountered, problems solved, and lessons learned. These situations, many involving computational fluid dynamics, finite element analysis, and multiphysics, are no longer the exception in the digital manufacturing universe; they have become the rule. These reports are down-to-earth examples that speak directly to what you are trying to accomplish within your own organizations.

Intel Involvement

Why is Intel supporting this Compendium and showing such an interest in the UberCloud HPC Experiment? For one thing, it is clear that the potential market for HPC worldwide is much larger than what we see today. And the sector has its problems. Recently the number of participants in the HPC community has been somewhat stagnant. Without an influx of new talent across the board, basic skill sets are being lost. We need to include more participants to ensure the sector's vibrancy over time. Initiatives like the UberCloud HPC Experiment do just that. By addressing the barriers confronting the missing middle, we are finding that we can indeed broaden the adoption of HPC within this underserved market segment. It's a win-win situation all around: the SMMs gain new advanced capabilities and competitiveness; the HPC ecosystem expands; and companies like Intel and others that support HPC are part of a robust and growing business environment.
The UberCloud HPC Experiment fuels innovation, not just by end-users who are using HPC tools to create new solutions to their manufacturing problems, but also on the part of the hardware and software vendors and the resource providers. The initiative creates a virtuous cycle leading to the democratization of HPC: it's making M&S available to the masses. The initiative satisfies the four strategies set forth by NCMS on what's needed to revitalize manufacturing through the application of digital manufacturing. First is to educate: providing a low-risk environment that allows end users to learn about HPC and M&S. Next is to entice: clarifying the value of advanced M&S through the use of HPC via entry-level evaluative solutions. Engage and elevate take end users to the next levels of digital manufacturing as they become proficient in the use of HPC, either through cloud services or by developing in-house capabilities.

Lessons Learned

One lesson that's become very clear as the UberCloud HPC Experiment continues: one size does not fit all. Manufacturing has many facets, and virtually every solution reported in this Compendium had to be tailored to the individual situation. As one team commented in their report, "From an end user perspective, we observed that each cluster provider has a unique way of bringing the cloud HPC option to the end user." Other teams ran into issues such as scalability, licensing, and unexpected fees for running their applications in the cloud. Despite this diversity, there are a number of common threads running through all 25 reports that provide invaluable information about what to anticipate when running HPC-based applications and how to avoid or solve the speed bumps that inevitably arise. The applications themselves are not the problem; it's a question of understanding how the capabilities inherent in, say, a CFD or FEA solver can meet your needs.
This is where the team approach shines: by bringing to bear a wide range of experience from all four categories of team members, the chances of finding a solution are greatly enhanced. As the saying goes, "To compete, you must compute." The sooner you become familiar with and start using this technology, the sooner you can compete more vigorously and broaden your marketplace. You can not only make your existing products more effective and desirable, but also create new products that are only possible with the application of HPC technology. The competitive landscape is shifting. You need to ask: do I want to remain in the old world of manufacturing or embrace the new? Reading these 25 case studies will not only show you what's possible, but also how to kick off the activities that will allow you to take a quantum leap in competitiveness. The UberCloud HPC Experiment, this energetic grass-roots movement to bring HPC to the missing middle, continues. You just might want to become a part of it.

Dr. Stephen R. Wheat
General Manager, High Performance Computing, Intel Corp.
Executive Summary

This is an extraordinary document. It is a collection of selected case studies written by the participants in Rounds 1 and 2 of the ambitious UberCloud HPC Experiment. The goal of the HPC Experiment is to explore the end-to-end processes of accessing remote computing resources in HPC centers and HPC clouds. The project is the brainchild of Wolfgang Gentzsch and Burak Yenier, and had its inception in July 2012. What makes this collection so unusual is that, without exception, the 25 teams reporting their experiences are totally frank and open. They share information about their failures as well as successes, and are more than willing to discuss in detail what they learned about the ins and outs of working with HPC in the cloud. When Round 1 wrapped up in October 2012, 160 participating organizations and individuals from 25 countries, working together in 25 widely dispersed but tightly knit teams, had been involved. With Round 2, completed in March 2013, another 35 teams and 360 individuals, some of them veterans of Round 1, took up the challenge. (As of this writing, Round 3 is underway, with almost 500 participating organizations and another 25 enthusiastic teams.)
The Participants

Each HPC Experiment team is made up of four types of participants:

- Industry end-users, many of them small- to medium-sized manufacturers, who stand to realize substantial benefits from applying HPC to their manufacturing processes
- Computing and storage resource providers, with particular emphasis on those offering HPC in the cloud
- Software providers, ranging from ISVs to open source and government software in the public domain
- HPC and cloud computing experts helping the teams, an essential ingredient

In addition to organizing the teams, Gentzsch and Yenier, along with four dedicated team mentors from the HPC Experiment core team (Dennis Nagy, Gregory Shirin, Margarette Kesselman, and Sharan Kalwani), provided guidance and mentoring whenever and wherever it was needed to help the teams navigate the sometimes rocky road to running applications on remote HPC services.

CFD a Hit

By far, computational fluid dynamics (CFD) was the main application run in the cloud by the Round 1 and Round 2 teams: 11 of the 25 teams presented here concentrated their efforts in this area. The other areas of interest included finite element analysis (FEA), multiphysics, and a variety of miscellaneous applications, including biotech. As you'll read in the reports, quite a few teams encountered major speed bumps during the three months spent on their projects. Many of these problems were solved, sometimes with simple fixes, in other cases with ingenious solutions. Some proved difficult, others intractable.

For example, the pay-per-use billing feature of cloud computing solves a major end-user dilemma: whether or not to make the considerable investment needed to build in-house computational resources, which includes not just the HPC hardware, but also the infrastructure and human resources necessary to support the company's foray into high performance computing. It seems like a no-brainer: pay only for what you need and leave all the rest to your cloud resource provider.
But as several of the Experiment's teams discovered, unless you pay close attention to the costs you're incurring in the cloud, the price tag associated with remote computing can quickly mount up.

Other Speed Bumps

In addition to unpredictable costs associated with pay-per-use billing, incompatible software licensing models are a major headache. Fortunately many of the software vendors, especially those participating in the Experiment, are working on creating more flexible, compatible licensing models, including on-demand licensing in the cloud. Other teams ran into problems of scalability when attempting to run jobs on multiple cores. Yet another group found that the main difficulty they encountered was the development of interactive visualization tools to work with simulation data stored in the cloud. Overall, the challenges were many and varied and, in most cases, they were solved. However, in a few instances, despite a team's valiant efforts, the experiment had to be abandoned or postponed for a future round. On balance, though, most of the teams, with the help of their incumbent HPC/cloud expert, worked their way to a solution and describe in helpful detail the lessons learned in the process.

Benefits of HPC in the Cloud

In addition to recounting the challenges the teams confronted, each report contains a benefits section. As you read these results, it quickly becomes clear why the HPC Experiment has proven so popular and why many of the Round 1 and Round 2 teams have continued their explorations into Round 3. The teams were not the only ones moving up the learning curve. In the course of the experiment, the organizers Gentzsch and Yenier have learned, and are continuing to learn, their own set of lessons. As a result, they are continually modifying how the HPC Experiment is conducted to make the process run even more smoothly and the rewards even greater for the participants. This compendium is a treasure trove of information.
We recommend you take your time reading through the individual reports; there is much of value to be gained. Each team seems to have run into and solved many problems that are sometimes ubiquitous and other times unique to their company's situation and industry. Either way, the information is invaluable. This report underscores the fact that HPC in the cloud is a viable and growing solution, especially for small- to medium-sized manufacturers looking to leverage the technology to speed up time to market, cut costs, improve quality, and be more competitive in the global marketplace. The HPC Experiment is helping to make this a reality for companies both large and small that wish to make the most of what high performance computing has to offer.

John Kirkley, Co-Editor, Kirkley Communications, June 5, 2013
Team 1: Heavy Duty Abaqus Structural Analysis Using HPC in the Cloud

USE CASE

Abaqus/Explicit and Abaqus/Standard are the major applications for this project; they provide the driving force behind using the HPC cloud to address sudden demand for compute. The applications in this experiment range from solving anchorage tensile capacity and steel and wood connector load capacity, to special moment frame cyclic pushover analysis. The existing HPC cluster at Simpson Strong-Tie is modest, consisting of about 32 cores of Intel x86-based gear. Therefore, when emergencies arise, the need for cloud bursting is critical. Also challenging is the ability to handle sudden large data transfers, as well as the need to perform visualization to ensure that the design simulation is proceeding along correct lines. The end-to-end process began with widely dispersed demand in the Pacific Time Zone, expertise at the other end of the US, and resources in the middle. Network bandwidth and latency were expected to play a major role, since they impact the workflow and the user's perception of the ability to access cloud HPC capabilities.

Here is an example of the workflow:

1. Pre-processing on the end user's local workstation to prepare the CAE model.
2. The Abaqus input file is transferred to the HPC cloud data staging area using a secured FTP process.
3. The end user submits the job through the HPC cloud provider's (Nimbix.net) web portal.
4. Once the job finishes, the end user receives a notification email. The result files can be transferred back to the end user's workstation for post-processing, or the post-processing can be done using a remote desktop tool like HP RGS on the HPC provider's visualization node.

Typical data transfer sizes (upstream) were modest, ranging in the few hundred megabytes.
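The four workflow steps above can be sketched in code. This is a minimal illustration only: the function names, paths, and job identifiers below are hypothetical placeholders, not Nimbix's actual portal API.

```python
# Minimal sketch of the four-step workflow. The transfer and portal calls are
# stubs; the staging path and job id are hypothetical, not the provider's real
# interfaces.

def preprocess(model_name: str) -> str:
    """Step 1: prepare the Abaqus input file on the local workstation."""
    return f"{model_name}.inp"

def sftp_upload(input_file: str, staging_area: str) -> str:
    """Step 2: stub for the secured FTP transfer to the cloud staging area."""
    return f"{staging_area}/{input_file}"

def submit_job(staged_file: str) -> str:
    """Step 3: stub for job submission through the provider's web portal."""
    return "job-0001"

def collect_results(job_id: str) -> list[str]:
    """Step 4: stub for retrieving result files after the notification."""
    return [f"{job_id}.odb", f"{job_id}.dat"]

staged = sftp_upload(preprocess("anchorage_tensile"), "/staging/user1")
results = collect_results(submit_job(staged))
print(results)
```

The point of the sketch is the shape of the pipeline, not its internals: each step depends only on the previous step's output, which is what makes the local/remote split workable.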
The large number of output files (anywhere from 5 to 20) and output sizes of a few gigabytes characterized the data domain in this use case.

CHALLENGES

Keeping everyone's time demands in mind, we set up a weekly schedule and kept it very simple. We first
identified the HPC simulation jobs and ensured that they were representative of a typical workload. The cloud-based infrastructure at Nimbix was the first challenge. For MPI parallel jobs, the Abaqus application needed a fast interconnect such as InfiniBand, which was not available. However, this was solved with fat nodes (with scale readily available, as is true in the cloud): the large number of cores and large memory allowed the job to run close to local cluster performance and avoided the need for a very fast interconnect. As this cluster was just a sandbox for testing out the cloud HPC workflow, the actual interconnect performance of this 12-core cluster was not a concern.

The second challenge was to address the need for simple and secure file storage and transfer. Surprisingly, this was accomplished very quickly using GLOBUS technology. This speaks volumes to the fact that these days cloud-based storage is mature and ready for prime-time HPC, especially in the CAE arena.

The third challenge was how to push the limits and stream several tens of jobs simultaneously to the remote HPC cloud resource. This would provide solid evidence that bursting was indeed feasible. To the whole team's surprise, it worked admirably and had no adverse impact whatsoever.

The fourth and final challenge was perhaps the most critical: end user perception and acceptance of the cloud as a smooth part of the workflow. Here remote visualization was necessary to see whether the simulation results (left remotely in the cloud) could be viewed and manipulated as if they were local on the end user's desktop. After several iterations and brainstorming sessions, HP's RGS was chosen to help deliver this capability.
RGS was selected because it:

- Is application neutral
- Has a clean separation between the (free) client and the server component
- Provides tuning parameters that can help overcome bandwidth issues

Several settings were tested and carefully iterated, such as image update rate, bandwidth selection, and codecs. A screen shot of the final, user-accepted remote visualization settings is shown below.

BENEFITS

Clearly one of the first things established was that the HPC cloud model can indeed be made to work. What is required is a well-defined use case, which will vary by industry vertical. It is also important to have very capable and experienced participants: the ISV, the end user, and the providers of the entire solution. This is distinct from the requirement to spin everything up as a first-time instance, since practically everyone's infrastructure differs ever so slightly, mandating the need for good service delivery setups.

CONCLUSIONS AND RECOMMENDATIONS

At the conclusion of the experiment, a few key factors emerged. Anyone who wishes to go down this road should heed these four lessons:

1. Result file transfers are a major source of concern, since most CAE result files can easily run over several gigabytes. Requirements depend on the individual use case (urgency, sizes, etc.). For this CAE use case, a minimum of 2-4 MB/sec sustained, delivered bandwidth is necessary to be considered an acceptable alternative to local cluster performance.
2. The same applies to remote visualization. In this case, 4 MB/sec is the threshold at which a CAE analyst can perform work without being hindered by bandwidth limitations. Latency is also a key concern, but in this case it was not an issue when connecting the US East and West coasts to the Texas-based cloud facility.
3. In addition to the cloud service provider, a network-savvy ISP is perhaps a necessary part of the team infrastructure in order to deliver robust, production-quality HPC cloud services.
Everyone's mileage will vary; an ROI analysis is recommended to help uncover the necessary SLA requirements and the costs associated with connectivity to and from the cloud.

4. Remote visualization provides a convenient collaboration platform for a CAE analyst to access the analysis results wherever they are needed, but it requires a secure, behind-the-firewall remote workspace.

Fig. 1 - Cloud Infrastructure: Nimbix Accelerated Compute Cloud

Case Study Authors: Frank Ding, Matt Dunbar, Steve Hebert, Rob Sherrard, and Sharan Kalwani
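The bandwidth lessons above are easy to sanity-check with a quick calculation. The file sizes here are illustrative, in line with the "several gigabytes" mentioned in lesson 1:

```python
# How long does it take to move a multi-gigabyte CAE result file at the
# 2-4 MB/sec the team found to be the minimum acceptable sustained bandwidth?
# (Illustrative figures, not measurements from the experiment.)

def transfer_minutes(file_gb: float, mb_per_sec: float) -> float:
    """Transfer time in minutes for `file_gb` gigabytes at `mb_per_sec` MB/s."""
    return (file_gb * 1024) / mb_per_sec / 60

# A 4 GB result file at the 4 MB/sec threshold takes roughly 17 minutes;
# at 2 MB/sec it doubles to about 34 minutes.
for size_gb in (1, 4):
    for rate in (2, 4):
        print(f"{size_gb} GB at {rate} MB/s: {transfer_minutes(size_gb, rate):.0f} min")
```

At these rates a result-file round trip stays within a coffee break, which is why the team treated 2-4 MB/sec as the floor for an acceptable alternative to local cluster performance.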
Team 2: Simulation of a Multi-resonant Antenna System Using CST MICROWAVE STUDIO

"The cloud is normally advertised as enabling agility and elasticity, but in several cases it was our own project team that was required to be agile and nimble, simply to react to the rapid rate of change within the AWS environment."

USE CASE

The end user uses CAE for virtual prototyping and design optimization of sensors and antenna systems used in NMR spectrometers. Advances in hardware and software have enabled the end-user to simulate the complete RF portion of the antenna system. Simulation of the full system is still computationally intensive, although there are parallelization and scale-out techniques that can be applied depending on the particular solver method being used. The end-user has a highly tuned and over-clocked local HPC cluster. Benchmarks suggest that for certain solvers the local HPC cluster nodes are roughly 2x faster than the largest of the cloud-based Amazon Web Services resources used for this experiment. However, the local HPC cluster averages 70% utilization at all times, and the larger research-oriented simulations the end-user was interested in could not be run during normal business hours without impacting production engineering efforts. Remote cloud-based HPC resources offered the end-user the ability to burst out of the local HPC system and onto the cloud. This was facilitated both by the architecture of the commercial CAE software and by the parallelizable nature of many of the solver methods.

The CST software offers multiple methods to accelerate simulation runs. On the node level (single machine), multithreading and GPGPU computing (for a subset of the available solvers) can be used to accelerate simulations still small enough to be handled by a single machine. If a simulation project needs multiple independent simulation runs (e.g.
in a parameter sweep or for the calculation of different frequency points) that are independent of each other, these simulations can be sent to different machines to execute in parallel. This is done by the CST Distributed Computing System, which takes care of all the data transfer operations necessary to perform this parallel execution. In addition, very large models can be handled by MPI parallelization using a domain decomposition approach.

End-user effort: more than 25 hours for setup, problem resolution, and benchmarking; more than 100 hours for software-related issues due to large simulation projects, bugs, and post-processing issues that would also have occurred for purely local work.
ISV effort: approximately 2-3 working days for creating license files, assembling documentation, following discussions, and debugging problems with models in the setup and with hardware resources.

PROCESS

1. Define the ideal end-user experiment.
2. Make initial contacts with the software provider (CST) and the resource provider (AWS).
3. Solicit feedback from the software provider on recommended cloud bursting methods; secure licenses.
4. Propose Hybrid Windows/Linux Cloud Architecture #1 (EU based).
5. Abandon Cloud Architecture #1. The user prefers to keep simulation input data within EU-protected regions; however, AWS resources we required did not yet exist in EU AWS regions. The end-user modifies the experiment to use synthetic simulation data, which enables the use of US-based cloud systems.
6. Propose Hybrid Windows/Linux Cloud Architecture #2 (US based) and implement it at small scale for testing.
7. Abandon Cloud Architecture #2. Heavily secured virtual private cloud (VPC) resource segregation, front-ended by an internet-accessible VPN gateway, looked good on paper; however, AWS did not have GPU nodes (or the large cc2.* instance types) within VPC at the time, and the commercial CAE software had functionality issues when forced to deal with NAT translation via a VPN gateway server.
8. Propose Hybrid Windows/Linux Cloud Architecture #3 and implement it at small scale for testing.
9. The third design pattern works well; the user begins to scale up simulation size.
10. Amazon announces support for GPU nodes in the EU region and for GPU nodes within VPC environments; the end-user, now more familiar with AWS, begins experimenting with the Amazon Spot Market to reduce hourly operating costs by a very significant amount.
11. Hybrid Windows/Linux Cloud Architecture #3 is slightly modified. The license server remains in the U.S. because moving the server would have required a new license file from the software provider.
However, all solver and simulation systems are relocated to the Amazon EU region in Ireland for performance reasons. The end-user switches all simulation work to inexpensively sourced nodes from the Amazon Spot Market.

12. The modified Design #3, in which solver/simulation systems run on AWS Spot Market instances in Ireland while a small license server remains in the U.S., reflects the final design.

As far as we understood, the VPN solution that did not work at the beginning of the project would actually have worked at the end of the project period because of changes within AWS. In addition, the preferred heavily secured solution would have provided fixed MAC addresses, thus avoiding having to run a license instance all the time.

Front-end and two GPU solvers in action

CHALLENGES

Geographic constraints on data: The end-user had real simulation and design data that should not leave the EU.

Unequal availability of AWS resources between regions: At the start of the experiment, some of the preferred EC2 instance types (including GPU nodes) were not yet available in the EU region (Ireland). This disparity was fixed by Amazon during the course of the experiment, and by the end of the experiment we had migrated the majority of our simulation systems back to Ireland.

Performance of Remote Desktop Protocol: The CAE software used in this experiment relies on Microsoft Windows for experiment design, submission, and visualization. Using RDP to access remote Windows systems was very difficult for the end-user, especially when the Windows systems were operating in the U.S.

CAE software and Network Address Translation (NAT): The simulation software assumes direct connections between the participating client, solver, and front-end systems. The cloud architecture was redesigned so that essential systems were no longer isolated within secured VPC network zones.
Bandwidth between Linux solvers and Windows front-end
The technical requirements of the CAE software allow the Windows components to run on relatively small AWS instance types. However, when large simulations are underway, a tremendous volume of data flows between the Windows system and the Linux solver nodes. This was a significant performance bottleneck throughout the experiment. The project team ended up running Windows on much larger AWS instance types to gain access to 10GbE network connectivity options.

Node-locked software licenses
The CAE software license breaks if the license server node changes its network hardware (MAC address). The project team ended up leveraging multiple AWS services (VPC, ENI, Elastic IP) in order to operate a persistent, reliable license-serving framework. We had to leave the license server in the US and let it run 24/7 because it would have lost its MAC address upon reboot. Only in the first setup did it have a fixed MAC and IP.

Spanning Amazon regions
It is easy in theory to talk
about cloud architectures that span multiple geographic regions. It is much harder to implement this for real. Our HPC resources switched between US- and EU-based Amazon facilities several times during the lifespan of the project. Our project required the creation, management and maintenance of multiple EU- and US-specific SSH keys, server images (AMIs) and EBS disk volumes. Managing and maintaining the capability to operate in the EU or US (or both) required significant effort and investment.

BENEFITS

End-User
Confirmation that a full system simulation is indeed possible even though there are heavy constraints, mostly due to the CAE software. Model setup, meshing and post-processing are not optimal and require huge efforts in terms of manpower and CPU time.
Confirmation that a full system simulation can reproduce certain problems occurring in real devices and can help to solve those issues.
Realization that the financial investment for the additional computational resources needed for cloud bursting approaches is reasonable.
Realization that internet connection speed was the major bottleneck for a cloud bursting approach and also very limiting for RDP work.

Software Provider
Confirmation that the software can be set up and run within a cloud environment and also, in principle, using a cloud bursting approach (see the comments regarding network speed).
Some very valuable knowledge was gained on how to set up an elastic cluster in the cloud using best practices regarding security, stability and price in the Amazon EC2 environment.
Experience with the limitations and pitfalls specific to the Amazon EC2 configuration (e.g. availability of resources in different regions, the VPC needed to preserve MAC addresses for the licensing setup, network speed, etc.).
Experience with the restrictions imposed by a company's IT department when it comes to the integration of cloud resources (specific to the cloud bursting approach).
HPC Expert
The chance to use Windows-based HPC systems in the cloud in a significant way was very helpful.
New appreciation for the difficulties in spanning US/EU regions within Amazon Web Services.

CONCLUSIONS AND RECOMMENDATIONS

End-User
Internet transfer speed is the major bottleneck for serious integration of cloud computing resources into the end-user's design flow and local HPC systems. Internet transfer speed is also a limiting factor for remote visualization.
Security and data protection issues, as well as fears on the part of the end-user's IT department, create a huge administrative limitation for the integration of cloud-based resources.
Confirmation that a 10 GbE network can considerably speed up certain simulation tasks compared to the local cluster's GbE network. The local cluster has in the meantime been upgraded to an InfiniBand network.

HPC Expert
The rapid evolution of our provider's capabilities constantly forced the project team to re-architect the HPC system design. The cloud is normally advertised as enabling agility and elasticity, but in several cases it was our own project team that had to be agile and nimble simply to react to the rapid rate of change within the AWS environment.
The AWS Spot Market has huge potential for HPC in the cloud. The price difference is extremely compelling, and the relative stability of spot prices over time makes HPC usage worth pursuing.
Our design pattern for the commercial license server is potentially a useful best practice. By leveraging custom, persistent MAC addresses via Elastic Network Interfaces (ENI) within Amazon VPC, we were able to build a license server that would not break should the underlying hardware characteristics change (common in the cloud).
In a real-world effort we would not have made as much use of the hourly on-demand server instance types. Outside of this experiment it is clear that a mixture of AWS Reserved Instances (license server, Windows front-end, etc.)
and AWS Spot Market instances (solvers and compute nodes) would deliver the most power at the lowest cost.
In a real-world effort we would not have done all of our software installation, configuration management and patching by hand. These tasks would have been automated and orchestrated by a proper cloud-aware configuration management system such as Opscode Chef.

Software Provider
Creating a working setup in the cloud is quite complex and requires considerable IT/Amazon EC2 expertise. Supporting such a setup can be quite challenging for an ISV as well as for an end-user. Tools providing simplified access to EC2 would be helpful.

Case Study Authors Felix Wolfheimer and Chris Dagdigian
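The reserved-plus-spot mixture recommended above can be illustrated with a rough monthly cost model. All hourly rates, instance counts and usage hours below are hypothetical placeholders, not actual AWS prices:

```python
# Rough cost comparison: all-on-demand vs. reserved (persistent services)
# plus spot (solver nodes). Numbers are hypothetical, for illustration only.

HOURS_PER_MONTH = 730

def monthly_cost(on_demand_rate, spot_rate, reserved_rate,
                 n_persistent, n_compute, compute_hours):
    """Persistent services (license server, Windows front-end) run 24/7;
    compute nodes (solvers) run only for compute_hours per month."""
    all_on_demand = (n_persistent * on_demand_rate * HOURS_PER_MONTH
                     + n_compute * on_demand_rate * compute_hours)
    mixed = (n_persistent * reserved_rate * HOURS_PER_MONTH
             + n_compute * spot_rate * compute_hours)
    return all_on_demand, mixed

# Hypothetical: $2.40/h on demand, $0.35/h spot, $1.50/h effective reserved,
# 2 persistent servers, 8 solver nodes, 200 solver hours per month.
od, mixed = monthly_cost(2.40, 0.35, 1.50, 2, 8, 200)
print(f"all on-demand: ${od:,.0f}/month, reserved+spot: ${mixed:,.0f}/month")
```

Even with these made-up rates, the persistent/elastic split drives most of the savings, which is the pattern the team converged on.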
Team 4: Simulation of Jet Mixing in the Supersonic Flow with Shock

MEET THE TEAM

It (remote visualization) provides exciting possibilities for remote, very large data management, limiting the unpleasant (and unavoidable) remote-rendering delay effect.

USE CASE
The hardware platform consisted of a single desktop node running Ubuntu, with 8 GB RAM and 5.7 TB RAID storage. Currently available expertise: two PhD research scientists with industrial-level CFD expertise, and a professor in fluid dynamics. Benchmarking the OpenFOAM solver against an in-house FORTRAN code for supersonic CFD applications on remote HPC resources included the following steps:
1. Test the OpenFOAM solver (sonicFoam) on a 2D case at the same conditions as in an in-house simulation
2. Test the OpenFOAM solver with dynamic mesh refinement (sonicDyMFoam) on the same 2D case, with a series of simulations performed to find suitable refinement parameters for acceptable accuracy and mesh size
3. Production simulations with dynamic mesh refinement, either a 3D case or a series of 2D simulations with different parameters
The total resource estimate was 1,120 CPU hours and 320 GB of disk space.

CHALLENGES
Generally, the web interface provided by CILEA was quite convenient for running the jobs, although some extra effort was required to download the results. Both the traditional approach (secure shell access) and the web interface were used to handle simulations. The complete workflow included:
- Create the test case on the end-user desktop
- Upload it to the CILEA computing resource through ssh
- Run the case using the web interface
- Receive notification when the case is finished
- Download the results through ssh
- Post-process the results on the end-user desktop
Direct access is beneficial for transferring large amounts of data, providing a noticeable advantage when using the rsync utility. In fact, it might be desirable to run the jobs from the command line as well, although this may just be a matter of habit.
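The ssh/rsync part of the workflow above can be sketched as a small script. The hostname and paths are hypothetical, and the commands are only constructed here, not executed:

```python
# Sketch of the upload/download steps of the workflow described above.
# Hostname and directory names are hypothetical placeholders.
import shlex

HOST = "user@hpc.example.org"   # hypothetical cluster login node

def upload_cmd(local_case, remote_dir):
    # rsync only sends deltas and can resume partial transfers, which is
    # why direct access pays off for large case and result directories.
    return ["rsync", "-az", "--partial", local_case, f"{HOST}:{remote_dir}"]

def download_cmd(remote_results, local_dir):
    return ["rsync", "-az", "--partial", f"{HOST}:{remote_results}", local_dir]

cmd = upload_cmd("jetCase/", "/scratch/jetCase/")
print(shlex.join(cmd))   # pass the list to subprocess.run(cmd) to execute
```

In a production setup these calls would be wrapped with error handling and driven by the job-completion notification mentioned above.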
An alternative means of accessing remote data offered by CILEA (Consorzio Interuniversitario Lombardo per l'Elaborazione Automatica) is remote visualization. This approach receives maximum benefit from the relevant HPC facilities, where the remote visualization system sits on top of a 512 GB RAM node plus video/image compression tools. It provides exciting possibilities for remote, very large data management, limiting the unpleasant (and unavoidable) remote-rendering delay effect.

CONCLUSIONS
The simulations were not completed beyond step 1 due to unexpected numerical problems and the time spent investigating these problems. Approximately 4 CPU hours were used in total. Because the initial test program runs were not completed during this round of the experiment, both the end-user and the resource provider indicated they would like to participate in the second round. The end-user was also interested in evaluating the GPU capabilities of OpenFOAM.

Case Study Authors Ferry Tap and Claudio Arlandini
Team 5: Two-phase Flow Simulation of a Separation Column

MEET THE TEAM

Such detailed flow simulations of separation columns provide an insight into flow phenomena that was previously not possible.

USE CASE
This use case investigates the dynamics of a vapor-liquid compound in a distillation column with trays. Chemical reactions and phase transitions are not considered; instead, the fluid dynamics is resolved with a high level of detail in order to predict properties of the column such as the pressure drop or the residence time of the liquid on a tray.

CHALLENGES
The main challenge addressed in this use case is the need for computational power, as a consequence of the large mesh resolution. The need for a large mesh (close to 10^9 mesh nodes, an ambitious value in this field) stems from the complex physics of turbulent two-phase flow and from the complex structure of fine droplet dispersions in the vapor. These difficulties are addressed, in the present case, with the help of a highly efficient and scalable computational approach, combining a lattice Boltzmann method with a volume-of-fluid representation of the two-phase physics.

Technology Involved
The hardware used was a 128-core cluster with an InfiniBand interconnect and an NFS file system. The nodes used 8-core Intel E-series processors at 2.7 GHz. The use case was implemented and executed with the help of the open-source Palabos library, which is based on the lattice Boltzmann method.

Project Execution
End-to-end process

Finding a match between resource provider, software provider and end-user
While the needs for a software solution had been preliminarily analyzed by the end-user, and the match to the software provider was therefore straightforward, some effort was invested by the organizers of the HPC Experiment to find a resource provider adequate to the substantial resource needs brought by the end-user.
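The mesh size quoted above implies a substantial memory footprint. A back-of-the-envelope estimate, assuming a D3Q19 lattice in double precision plus one volume-of-fluid scalar per node (an assumption, since the actual Palabos data layout is not given in the text):

```python
# Rough memory estimate for the lattice data of a ~10^9-node simulation.
# D3Q19 with one extra VOF scalar per node is an assumed layout.

nodes = 1e9            # close to 10^9 mesh nodes (from the use case)
populations = 19       # D3Q19: 19 distribution functions per node (assumed)
vof_scalar = 1         # volume-of-fluid fraction per node (assumed)
bytes_per_value = 8    # double precision

total_bytes = nodes * (populations + vof_scalar) * bytes_per_value
gib = total_bytes / 2**30
print(f"~{gib:.0f} GiB for the lattice data alone")
```

Roughly 150 GiB for the field data alone, before halo copies and temporaries, which makes clear why a 128-core cluster with a fast interconnect was a minimum requirement.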
Team setup and initial organization
The project partners agreed on the need for hardware resources and the general organization.

Exchange of concepts and needs
The resource provider explained his approach to cloud computing. It was agreed that the resource provider would not provide direct ssh access to the computing server. Instead, the application was compiled and installed by the resource provider. The interface provided to the end-user consisted of an online XML editor to set up the parameters of the program, plus means of interactivity to launch the program and post-process the results.

Benchmarks and feasibility tests
The compatibility of the software with the provided hardware resources was carefully tested as described above.

Setup of the execution environment (ongoing at the time of this writing)
Palabos was recompiled on XF by the resource provider's team using gcc and OpenMPI (including InfiniBand support). The team also published a simple Palabos application in extreme Factory Studio by configuring a new job submission web form, including the relevant job parameters for this Palabos experiment.

Final test run (ongoing at the time of this writing)
The challenges encountered and solved consisted mostly of human interactions; there were no technical challenges. We were concerned about inadequate hard-disk properties.
Palabos job submission web form

BENEFITS

For the software provider
The software provider had the opportunity to learn about a new approach to cloud computing, involving the use of efficient dedicated hardware and the implementation of a lightweight Software-as-a-Service model. At the end of this HPC experiment, the software provider is considering including this new approach in his business model.

For the hardware provider
This was a great opportunity to host new, exciting and very scalable CFD software, get in touch with the FlowKit and Palabos user community, and therefore envision a great partnership targeting real customers and real production.

For the end-user
Such detailed flow simulations of separation columns provide an insight into flow phenomena that was previously not possible. In order to resolve these phenomena, large computational meshes are absolutely necessary. This HPC experiment allows for an assessment of whether such big simulations are computationally feasible for industry. Furthermore, we gained valuable experience in handling these kinds of big simulation setups.

CONCLUSIONS AND RECOMMENDATIONS

Definition of cloud computing
Different partners have a different understanding of the term cloud computing. While some consider it simple remote execution of a given software on pay-as-you-go hardware, others find it useful only when it provides a software solution that is fully integrated in the web browser. In this project, we learned about an intermediate viewpoint defended by the resource provider, where an adequate web interface can be custom-tailored to every software/end-user pair.

Availability of specialized resources
Before entering this experiment, the software provider was familiar with generic cloud computing resources like those provided by Amazon Web Services. The project revealed the existence of custom cloud solutions for high performance computing, with substantially more powerful and cost-effective hardware.
Communication between the three involved partners
The interaction between the partners was dominated by technical aspects that had to be worked out between the software and resource providers. It appears that such a process leaves little room for the application end-user to influence the choice of the user interface and the cloud computing model deployed between the software and resource providers. Instead, it is likely that in such a framework the application end-user accepts the provided solution as it is and decides whether it is suitable for his needs.

Notes from the resource provider
In our model, the end-user can ask for basic improvements to job submission forms for free; more complex ones are charged for.
Most HPC applications provide neither a cloud API nor a web user interface. The extreme Factory SaaS-enabling model partners with the open-source science community and software vendors to expose their applications in the cloud.
Enabling HPC applications in a cloud is not something everyone can do on their own. It requires a lot of experience and R&D, plus a team to deploy and tune applications and support software users. This is one reason we think HPC as a Service is not yet ready for total cloud automation and online billing.
SSH connections are, surprisingly, less secure than web portal isolation (we optionally add network isolation), because people who know how to use a job scheduler can discover a lot about the architecture and potentially look for security holes. This is because HPC job schedulers are, if not impossible, pretty difficult to secure.
Web portal ergonomics coupled with remote visualization make it possible to execute a complete pre-processing/simulation/post-processing workflow online, which is highly appreciated.

Improving the HPC Experiment
The partners of this team appreciated the structure and management of the HPC experiment and find that its execution was highly appropriate.
The following minor suggestions might be helpful for future repetitions of the Experiment:
The time frame of three months is very short, especially if a large amount of time must be spent finding appropriate partners and connecting them to each other.
Participating partners are likely to have both industrial and academic interests in the HPC experiment. In the latter case, a certain amount of conceptual and theoretical guidance could be inspiring. As an example, it might have been more realistic for the participants to contribute and exchange their opinions on topics such as the actual definition of cloud computing if the framework for such a topic had been sketched more precisely by the organizers.
It would be highly interesting for all participants, as the project advances, to know more about the content and progress of the other teams' projects. Why not conceive a mid-term webinar that is more practically oriented, based on concrete examples of the results achieved so far in the various projects?

Case Study Authors Jonas Latt, Marc Levrier, and Felix Muggli
Team 8: Flash Dryer Simulation with Hot Gas Used to Evaporate Water from a Solid

The company was interested in reducing the solution time and, if possible, increasing mesh size to improve the accuracy of their simulation results without investing in a computing cluster that would be utilized only occasionally.

USE CASE
CFD multiphase flow models are used to simulate a flash dryer. Increasing plant sizes in the cement and mineral industries mean that current designs need to be expanded to fulfill customers' requests. The process is described by the Process Department and the structural geometry by the Mechanical Department; both departments come together using CFD tools that are part of the end-user's extensive CAE portfolio. Currently, the multiphase flow model takes about five days for a realistic particle loading scenario on our local infrastructure (Intel Xeon X5667, 12 MB cache, 3.06 GHz, 6.40 GT/s, 24 GB RAM). The differential equation solver of the Lagrangian particle tracking model requires several GB of memory. ANSYS CFX 14 is used as the solver. Simulations for this problem use 1.4 million cells, five species and a time step of one millisecond for a total simulated time of two seconds.
A cloud solution should allow the end-user to run the models faster, increasing the turnover of sensitivity analyses and reducing time to customer implementation. It would also allow the end-user to focus on engineering aspects instead of spending valuable time on IT and infrastructure problems.

Fig. 1 - Flash dryer model viewed with ANSYS CFD-Post

The Project
The most recent addition to the company's offerings is a flash dryer designed for a phosphate processing plant in Morocco. The dryer takes a wet filter cake and produces a dry product suitable for transport to markets around the world.
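The solver settings quoted in the use case imply the following raw workload, a simple consistency check using only the numbers given above:

```python
# Raw workload implied by the quoted ANSYS CFX settings:
# 1.4 million cells, 1 ms time step, 2 s of simulated time.

cells = 1.4e6          # mesh cells
dt = 1e-3              # time step in seconds (one millisecond)
total_time = 2.0       # simulated seconds

steps = total_time / dt
cell_updates = cells * steps
print(f"{steps:.0f} time steps, {cell_updates:.2e} cell updates in total")
```

Two thousand transient steps over 1.4 million cells, each step involving multiphase physics and Lagrangian particle tracking, is consistent with the five-day runtime reported on the local workstation.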
The company was interested in reducing the solution time and, if possible, increasing mesh size to improve the accuracy of their simulation results without investing in a computing cluster that would be utilized only occasionally. The project goal was defined based on current experience with the in-house compute power. For the chosen model, a challenge to reaching
this goal was the scalability of the problem with the number of cores. Next, the end-user needed to register for XF. After the organizational steps were completed, the XF team integrated ANSYS CFX for the end-user into their web user interface. This made it easy for the end-user to transfer data and run the application in the pre-configured batch system on the dedicated XF resources. The model was then run on up to 128 Intel E-series cores. The work was accomplished in three phases:

Setup phase
During the project period XF was very busy with production customers and was also migrating their Bull B500 blades (Intel Xeon X5670 sockets, 2.93 GHz, 6 cores, 6.40 GT/s, 12 MB) to B510 blades (Intel E-series sockets, 2.70 GHz, 8 cores, 8.0 GT/s, 20 MB). The nodes are equipped with 64 GB RAM and 500 GB hard disks, and are connected with InfiniBand QDR.

Execution phase
After an initial hardware problem with the new blades, a solver run crashed after 35 hours due to a CFX stack memory overflow. This was handled by adding a new parameter to the job submission web form. A run using 64 cores still crashed after 12 hours despite 20% additional stack memory. This issue is not related to overall memory usage, as the model never used more than 10% of the available memory, as observed for one of the 64-core runs. Finally, a run on 128 cores with 30% additional stack memory successfully ran up to the 2 s point. An integer stack memory error occurred at a later point; this still needs to be looked into.

Post-processing phase
The XF team installed ANSYS CFD-Post, visualization software for ANSYS CFX, and made it available from the portal in a 3D remote visualization session. It was also possible to monitor the runs from the Solver Manager GUI and hence avoid downloading large output log files. Because the ANSYS CFX solver was designed from the ground up for parallel efficiency, all numerically intensive tasks are performed in parallel and all physical models work in parallel.
Only administrative tasks, such as simulation control and user interaction, as well as the input/output phases of a parallel run, were performed in sequential mode by the master process.

BENEFITS
The extreme Factory team was quickly able to provide ANSYS CFX as SaaS and configure any kind of HPC workflow in extreme Factory Studio (XF's web front-end). The XF team spent around three man-days to set up, configure, execute and help debug the ANSYS CFX experiment. FLSmidth spent around two man-days to understand, set up and utilize the XF Portal methodology. XF also provides 3D remote visualization with good performance, which helps solve the problem of downloading large result files for local post-processing and of checking the progress of the simulation.

Fig. 2 - ANSYS CFX job submission web form

For the end-user, the primary goal of running the job in one to two days was met. The runtime of the successful job was about 46.5 hours. There was not enough time in the end to perform scalability tests; these would have been helpful to balance the size of the resources required against the runtime of the job. The ANSYS CFX technology incorporates optimizations for the latest multi-core processors and benefits greatly from recent improvements in processor architecture, algorithms for model partitioning combined with optimized communications, and dynamic load balancing between processors.

CONCLUSIONS AND RECOMMENDATIONS
No special problems occurred during the project, only hardware provisioning delays. Pressure from production made it difficult to find free resources and tuning phases to get good results. Providing the HPC application in the form of SaaS made it easy for the end-user to get started with the cloud and concentrate on his core business.
It would be helpful to have more information about cluster metrics beyond what is currently readily available, e.g. memory and I/O usage. The time needed for downloading the result files and minimizing risks to proprietary data need to be considered for each use case. Due to the size of the output data and transfer speed limitations, we determined that a remote visualization solution is required.

Case Study Authors Ingo Seipp, Marc Levrier, Sam Zakrzewski, and Wim Slagter

Note: Some parts of this report are excerpted from a story on the project featured in the Digital Manufacturing Report. You can read the full story at
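The transfer-speed limitation cited in the recommendations above is easy to quantify. The result sizes and link bandwidth below are hypothetical, chosen only to show the order of magnitude that makes remote visualization attractive:

```python
# Download time for large result files over a typical office link.
# File sizes and bandwidth are hypothetical illustrations.

def transfer_hours(size_gb, mbit_per_s):
    bits = size_gb * 8e9              # decimal GB to bits
    return bits / (mbit_per_s * 1e6) / 3600

for size in (10, 50, 100):
    print(f"{size} GB at 50 Mbit/s: {transfer_hours(size, 50):.1f} h")
```

At tens of gigabytes per run, downloads take hours even on a good link, so post-processing next to the data via remote visualization quickly becomes the only practical option.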
Team 9: Simulation of Flow in Irrigation Systems to Improve Product Reliability

MEET THE TEAM

HPC and cloud computing will certainly be a valuable tool as our company seeks to increase its reliance on CFD simulation to reduce costs and time associated with the build-and-test iteration model of prototyping and design.

USE CASE
In the industry of residential and commercial irrigation products, product reliability is paramount: customers want their equipment to work every time, with low maintenance, over a long product lifetime. For engineers, this means designing affordable products that are rigorously tested before the device begins production. Irrigation equipment companies employ a large force of designers, researchers and engineers who use CAD packages to develop and manufacture the products, and CAE analysis programs to determine the products' reliability, specifications and features.

CHALLENGES
As the industry continues to demand more efficiency along with greater environmental stewardship, the usage rate of recycled and untreated water for irrigation grows. Fine silt and other debris often exist in untreated water sources (e.g. lakes, rivers and wells) and cause malfunction of internal components over the life of the product. In order to protect against product failure, engineers are turning to increasingly fine meshes for CFD analysis, outpacing the resources of in-house workstations. To continue expanding the fidelity of these analyses within reasonable product design cycles, manufacturers are looking to cloud-based and remote computing for the heavy computation loads. The single largest challenge we
faced as end-users was the coordination with and application of the various resources presented to us. For example, one roadblock was that when presented with a high-powered cluster, we discovered that the interface was Linux, which is prevalent throughout HPC. As industry engineers with a focus on manufacturing, we have little or no experience with Linux and its navigation. In the end, we were assigned another cluster with a Windows virtualization to allow for quicker adoption. We consistently found that while the resources had great potential, we didn't have the knowledge to take full advantage of all of the possibilities because of the Linux interface and the complications of HPC cluster configurations.
Additionally, we found that HPC required becoming familiar with software programs that we were not accustomed to. Engineers typically use multiple software packages on a daily basis, and the addition of a new operating environment, GUI and user controls added another roadblock to the process. The increased use of scripting and software automation lengthened the learning curve.
Knowledge of HPC-oriented simulation was lacking for the end-user. As the end-user engineer's knowledge was limited to in-house and small-scale simulation, optimizing the model and mesh(es) for more powerful clusters proved to be cumbersome and time-intensive.
As we began to experiment with extremely fine mesh conditions, we ran into a major issue. While the CFD solver itself scaled well across the computing cluster, every increase in mesh size took significantly more time for mesh generation, in addition to dramatically slowing the set-up times. Therefore, with larger/finer meshes, the bottleneck moved from the solve time to the preparation time.

BENEFITS
At the conclusion of the experiment, the end-user was able to determine the potential of HPC for the future of simulation within the company.
Another crucial benefit was the comparison of mesh refinements to find an accurate compromise between fidelity and practicality. A sweet spot was suggested by the results: one that would balance user set-up time with computing costs and would deliver timely, consistent, precise results. As suggested by the experiment, a fine mesh run on 32 compute cores proved to be a good balance of affordable hardware and timely, accurate results.

CONCLUSIONS AND RECOMMENDATIONS
The original cluster configuration offered by SDSC was Linux, but the standard Linux interface provided was not user-friendly for the end-user's purposes. In order to accommodate the end-user's needs, the SDSC team decided to try running Windows in a large, virtual shared memory machine using the vSMP software on SDSC's Gordon supercomputer. Using vSMP with Windows on the Gordon supercomputer offers the opportunity to provision a one-terabyte Windows virtual machine, which can provide a significant capability for large modeling and simulation problems that do not scale well on a conventional cluster. Although the team was successful in getting ANSYS CFX to run in this configuration on up to 16 cores (we discovered the 16-core limitation was due to the version of Windows installed on the virtual machine), various technical issues with remote access and licensing could not be completely addressed within the timeframe of this project and precluded running actual simulations for this phase. Following the Windows test, the SDSC team recommended moving back to the proven Linux environment, which as noted previously was not ideal for this particular end-user. Due to time constraints and the aforementioned Linux vs. Windows issues, end-user simulations were not run on the SDSC resources for this phase of the project. However, SDSC has made the resource available for an additional time period should the end-user desire to try simulations on the SDSC system.
The end-user states that they learned a lot and still intend to benchmark the results for the team members' data, but do not have any performance or scalability data to show at this time. The results given above in terms of HPC performance were gathered using the SimuTech Group's SimuCloud cloud computing HPC offerings.
From the SDSC perspective, this was a valuable exercise in interacting with and discovering the use cases and requirements of a typical SME end-user. The experiment in running CFX for Windows on a large shared memory (1 TB) cluster was valuable and provided SDSC with an opportunity to explore how this significant capability might be configured for scientific and industrial users computing on Big Data. Another finding is that offering workshops for SMEs in running simulation software at HPC centers may be a service that SDSC can offer in the future, in conjunction with its Industrial Affiliates (IA) program.
The end-user noted, "Having short-term licenses which scale with the need of a simulation greatly reduces our costs by preventing the purchase of under-utilized HPC packs for our company's in-house simulation."
Summarizing his overall reaction to the project, the end-user had this to say: "HPC and cloud computing will certainly be a valuable tool as our company seeks to increase its reliance on CFD simulation to reduce costs and time associated with the build-and-test iteration model of prototyping and design."

Case Study Authors Rick James, Wim Slagter, and Ron Hawkins
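The bottleneck shift described in this team's challenges, where the solve scales across the cluster but mesh generation and setup do not, is Amdahl's law in practice. A sketch with hypothetical timings:

```python
# Amdahl's law applied to the prep-vs-solve bottleneck described above.
# All timings are hypothetical, chosen only to illustrate the effect.

def wall_time(prep_hours, solve_hours_serial, cores):
    # Meshing and setup stay effectively serial; only the solve parallelizes.
    return prep_hours + solve_hours_serial / cores

coarse = wall_time(prep_hours=1, solve_hours_serial=20, cores=32)
fine = wall_time(prep_hours=10, solve_hours_serial=80, cores=32)
print(f"coarse mesh: {coarse:.2f} h wall time (prep is {1/coarse:.0%})")
print(f"fine mesh:   {fine:.2f} h wall time (prep is {10/fine:.0%})")
```

With these illustrative numbers, the solve shrinks to minutes on 32 cores while the serial preparation dominates the fine-mesh run, which is exactly the shift from solve time to preparation time that the end-user observed.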
Team 14: Electromagnetic Radiation and Dosimetry for High Resolution Human Body Phantoms and a Mobile Phone Antenna Inside a Car as Radiation Source

The goals were to reduce the runtime of the current job and to increase model resolution to more than 750 million cells.

MEET THE TEAM

USE CASE
The use case is a simulation of electromagnetic radiation from mobile phone technology and dosimetry in human body phantoms inside a car model. The scenario is a car interior with a seat and a highly detailed human body phantom with a hands-free mobile phone. The simulation software is CST Studio Suite; the transient solver of CST Microwave Studio was used during the experiment.

CHALLENGES
The goals were to reduce the runtime of the current job and to increase model resolution to more than 750 million cells. A challenge to achieving these goals was the scalability of the problems on many nodes with or without GPUs and high-speed network connections. Based on experience with the performance of the problem, the preferred infrastructure included Windows nodes with GPUs and fast network connectivity, i.e. InfiniBand. If no GPUs were available, a multiple of the number of cores would be required to run the selected problem and achieve the same performance as with GPUs.

THE PROJECT
In the beginning, the project was identified by the end-user. The goals were set based on current experience with the existing compute power. The runtime of the problem on the existing environment was from several days up to one or two weeks. Output data sizes were in the range of GB, depending on the size of the problem. The project was planned in three steps. First, a chip model simulation would be performed as a benchmark problem; the aim was to set up the simulation environment, check the speed of the system itself and its visualization, and analyze first problems. The second step would then be a simulation with a car seat, hands-free equipment and a human body phantom. The last step featured a full car model.
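The 750-million-cell target above already implies a sizable memory footprint for the field data alone. A rough estimate, assuming six electric and six magnetic field components per cell in single precision (an assumption, since the actual CST storage scheme is not documented in the text):

```python
# Rough field-data memory estimate for the 750-million-cell target.
# Twelve single-precision values per cell is an assumed storage layout.

cells = 750e6
values_per_cell = 6 + 6      # assumed: E and H field components
bytes_per_value = 4          # single precision

gib = cells * values_per_cell * bytes_per_value / 2**30
print(f"~{gib:.0f} GiB of field data alone")
```

Tens of gigabytes of live field data per run, plus the mesh and material maps, explains both the preference for GPU nodes with large memory and the long result-transfer times noted in the conclusions.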
Setup Phase Access to the resource provider was established via a VPN connection. HSR provides 33 compute nodes with 12 cores each, InfiniBand interconnect, and workstations with GPUs. Some VPN client versions did not succeed in connecting from the end-user location to the resource provider, although they worked from other locations. With the latest VPN client version from one provider it was possible to connect. To let the Job Manager connect from a local machine to the resource provider, it was necessary to connect with the appropriate credentials and save them. Access to the cluster was then available through the Windows HPC client. Batch jobs could be submitted and managed through Windows HPC Job Manager. Execution Phase With the commitment from people at CST and HSR, the installation of CST on the HPC cluster was completed and the first jobs were run. Testing the installation and debugging requires RDP access to the compute nodes, something that only cluster administrators are commonly allowed to do. BENEFITS The benefits of a cloud model for the end-user are: the availability of additional resources on project demand; no taxable hardware costs remaining after the project; and no hardware aging. Fig. 1 - Applications for high fidelity simulations: seat with human body & hands-free equipment. CONCLUSIONS AND RECOMMENDATIONS Establishing access to the cloud resources through the VPN and HPC clients is complicated to set up. Once established, it works reliably, but an automated process is needed for integration into a workflow. Because of the size of the result files for big problems, the time required for transferring the results can be very long. For post-processing, an RDP connection is required to reduce the amount of data that needs to be transferred. Remote visualization for big problems would require a high-performance graphics card. The CST software on Windows uses the graphical frontend in batch mode.
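Batch submission through the Windows HPC scheduler, mentioned in the setup phase, can also be driven from the command line with HPC Pack's job utility. The sketch below is illustrative only: the head node name, core count, and solver command line are hypothetical placeholders, not the team's actual configuration.

```shell
:: Sketch: submit a batch solver run to a Windows HPC cluster
:: via HPC Pack's "job" CLI. "headnode" and all paths are placeholders.
job submit /scheduler:headnode /numcores:24 /jobname:cst_run ^
  mpiexec \\headnode\apps\solver.exe \\headnode\data\model.cst

:: Monitor queued and running jobs from the same client:
job list /scheduler:headnode
```

These commands require a configured HPC Pack client and cluster, so they are shown as a command sketch rather than a runnable script.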
For debugging and monitoring a job, an RDP connection to the front-end node is required. This conflicts with many HPC cluster policies, where direct user access to the compute nodes is prohibited. Data availability and accountability for data security and loss must be defined. Case Study Authors Ingo Seipp, Carsten Cimala, Felix Wolfheimer, and Henrik Nordborg.
Team 15: Weather Research and Forecasting on Remote Computing Resources In our view, the end-user is the key beneficiary from this short three-month experiment. MEET THE TEAM USE CASE In this HPC Experiment, we attempted to evaluate the performance of the open-source WRF (Weather Research & Forecasting) software on a computer cluster larger than our existing one. The application domain is weather research and forecasting. The WRF software is currently implemented on a Beowulf-class computer cluster consisting of 12 nodes, each node having one 8-core CPU. This cluster is used 24x7, and the execution time is 24 hours for a 12-hour weather prediction cycle. It has been empirically determined that the improvement in system performance becomes negligible when the number of parallel computing cores is increased beyond the current 96. With the use of applicable High Performance Computing methods, we would like to investigate whether it is feasible to reduce the overall processing time of the WRF software and thus provide faster and/or higher-resolution weather predictions. As our computer cluster is in non-stop use and the team needs to send out regular weather reports, it is not possible to take it offline for experiments, let alone instrument and measure the various system parameters, as these would slow down the process as well. In addition, we had yet to determine whether there would be a time reduction at all and, if so, what the ideal cluster size would be, before we could recommend building one internally. This was the reason we looked to third-party resource providers. End-to-end process Our experiment was fairly straightforward, as we were using open-source software and already had it running on a computer cluster.
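The scaling plateau beyond 96 cores described above is what Amdahl's law predicts once the serial fraction of the workload dominates. A minimal sketch, assuming an illustrative 2% serial fraction rather than measured WRF data:

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Ideal speedup over 1 core when serial_fraction of the work
    cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Assume 2% of the run is inherently serial -- illustrative, not measured.
s = 0.02
for n in (24, 48, 96, 192, 384):
    print(n, round(amdahl_speedup(s, n), 1))

# Speedup climbs quickly at first, then flattens: doubling 96 -> 192
# cores gains far less than doubling 24 -> 48, matching the team's
# observation that returns diminish beyond roughly 96 cores.
```

Fitting such a curve to a few timed runs on rented cloud nodes is one way to estimate the ideal cluster size without taking the production cluster offline.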
As the resource provider's setup was similar (Beowulf class) and the software (WRF) was already installed, there were no setup challenges either. We executed a couple of runs and are currently reviewing the results. CHALLENGES We would like to highlight a few (minor) challenges: 1. The time difference between the team members delayed responses from both ends. 2. The Spanish resource provider had to make additional efforts to create documentation in English for us so we could learn how to use the system. 3. The job was not accepted initially when the ORTE parallel environment was set. When the parallel environment was changed to pmpi, the job was accepted, but it was then shut down immediately because the queue was full. This hinted that the resource was not dedicated to us. 4. Finally, we were not able to use all 256 cores. We had to settle for 192 cores, as the other cores were assigned to another HPC Experiment team. BENEFITS In our view, the end-user is the key beneficiary from this short three-month experiment. The end-user was able to use additional resources to try out different software runtime configurations beyond those tried before. We look forward to the next three months of the Round 2 experiment to iron out some of the issues we faced in this phase and produce effective results. CONCLUSIONS AND RECOMMENDATIONS Resource providers should consider having staff respond to user queries on a 24x7 basis. This will reduce turnaround time and also ensure that their resources are being used effectively. Resource providers should state the language in which their technical documentation is available. In addition, a YouTube video on how to access the system and submit jobs would be useful. The HPC Experiment might consider listing the public holidays of the various countries participating, especially when team members are in different countries.
We had at least two instances where one party tried to reach the other on a public holiday in Singapore or in Spain. Case Study Authors S. P. T. Krishnan, Bharadwaj Veeravali, and Ranjani M. R
Team 19: Parallel Solver of Incompressible, 2D and 3D Navier-Stokes Equations, Using the Finite Volume Method USE CASE This team's application used MPI to parallelize a solver of the incompressible Navier-Stokes equations (both 2D and 3D) in rectangular domains, using the Finite Volume Method. In this application, pressure and velocity were linked by the Semi-Implicit Method for Pressure-Linked Equations (SIMPLE) algorithm. The resulting discretized equations were solved by a line-by-line Gauss-Seidel solver. End-user Pratanu Roy developed this application on an IBM iDataPlex HPC cluster as a graduate student at Texas A&M University. The team attempted to port what is essentially a traditional HPC application to Indiana University's FutureGrid Nimbus IaaS cloud, with the intention of analyzing performance within the virtualized runtime environment of the cloud. CHALLENGES AND CONCLUSIONS Team efforts made substantial progress toward building a customized VM image to launch on Nimbus resources. Team 19 members were introduced to one another in late August 2012 and registered for access to FutureGrid resources over the Labor Day weekend. Rapid support from FutureGrid experts Fox and von Laszewski facilitated prompt establishment of the team's project under the FutureGrid access regime. Team members worked to become familiar with FutureGrid computing resources during early-to-mid September. FG tutorial documentation led to early successes, including accessing the India cluster's resources using batch job submission methods. Subsequent attempts to establish Nimbus credentials and thereby access true IaaS-style cloud platforms (the hotel and sierra virtualized cloud resources) required multiple support tickets to be filed. von Laszewski and Wang worked with the team members to address issues with hung VMs, failed Nimbus credential disposition, and tutorial documentation problems.
MEET THE TEAM One problem with the FG ticket-routing system resulted in a two-week delay in response to a technical issue, which was not resolved until HPC Experiment organizers went out-of-channel to request support for the issue. FG experts routed the issue, and it was resolved immediately by John Bresnahan once the routing issue was recognized. (He updated the tutorial to point to the correct, current VM/tarball, with additional coordination by Pierre Riteau to assure availability on all FG clouds.) During early-to-mid October, the team worked through a learning curve regarding FG authorization and authentication methods. The anatomy and physiology of SSH authentication was a significant challenge for an inexperienced user (relying on a busy, remote collaborator for support), which complicated working through the process of obtaining all the necessary credentials and getting them to the right locations for Nimbus cloud access. Team efforts in late October made substantial progress toward building a customized VM image to launch on Nimbus resources. The application runtime environment requires loading a number of modules, as well as MPI, OpenSSL, and many other dependencies that are not available in any known image that also includes the Torque resource manager (for job queuing of multiple runs with varying data sizes and inputs). Installing all the necessary packages on top of the base "hello cloud" image, which does include Torque, was a work in progress at the end of Round 1. Significant computational results should be possible during Round 2, given this progress. Case Study Authors Lyn Gerner and Pratanu Roy
Team 20: NPB2.4 Benchmarks and Turbo-machinery Application on Amazon EC2 MEET THE TEAM We obtained first-hand experience in running engineering applications in the commercial cloud computing environment. USE CASE We used an application from machinery manufacturing in this experiment. Engineers who use CAE software on physical workstations or compute clusters can perform the same operations on cloud resources, given skills and knowledge in creating, configuring, and connecting to instances on Amazon EC2. Advantages and Challenges of Complex Engineering Applications in Clouds The rapid development and popularity of cloud computing are profoundly affecting and changing the resource supply, resource management, and computing modes of future applications. As one of the main supporting technologies of cloud computing, virtualization enables cloud computing to offer dynamic scalability, flexible customization and isolation of the environment, and transparent fault tolerance of applications, which are missing in the traditional platform for engineering applications. Cloud computing is therefore a good option for meeting engineering applications' demands, and it also brings new opportunities for the development of engineering applications and software. Compared to traditional computing environments, the two main advantages of cloud computing for engineering applications are a customizable environment, and flexible usage and management of resources. However, the majority of current cloud systems and the corresponding techniques primarily target Internet-based applications.
Engineering applications, especially complex ones, pose grand challenges to cloud computing, since they differ significantly from service-oriented Internet-based applications in their inherent features, such as workload variations, process control, resource requirements, environment configurations, lifecycle management, and reliability maintenance. The End-to-End Process 1. Define end-user project with help from end-user During the definition of our experiment, we consulted with researchers from the School of Mechanical Science and Engineering, Huazhong University of Science and Technology, and then decided to choose the NAS Parallel Benchmarks (NPB) and a real engineering application as the two experimental subjects. 2. Contact resource providers, set up project environment Amazon EC2 was our resource provider. After obtaining an Amazon EC2 redeem-code voucher from the project organizers, we redeemed the resource immediately, then launched and logged in to an instance, which was later made into an AMI file. This set up the experiment environment with OpenMPI 1.4.3 (a commonly used MPI communication library), the NAS Parallel Benchmarks (NPB2.4), and industry-renowned CFD software. We used scripts to build virtual clusters, with the AMI file created above as the experiment environment. Three types of EC2 instances were used in our experiment: EC2 Cluster Compute Instances (cc1.4xlarge), EC2 High-CPU Extra Large Instances (c1.xlarge), and EC2 Extra Large Instances (m1.large). Each Cluster Compute Instance has 8 cores (computing capability equal to 33.5 EC2 Compute Units) with 23 GB of memory. Each High-CPU Extra Large Instance has 8 cores (20 EC2 Compute Units) with 7 GB of memory. Each Extra Large Instance has 4 cores (8 EC2 Compute Units) with 23 GB of memory. We built virtual clusters consisting of 9 or 17 nodes to run the NPB programs.
One node of the virtual cluster is the NFS server, which is an EC2 Extra Large Instance; the other nodes are compute nodes, which run EC2 Cluster Compute Instances or EC2 High-CPU Extra Large Instances. The detailed setup of the NPB experiment is depicted in Table 1.

  Number of processes | Instance type                                  | Compute nodes | NFS nodes
  64                  | EC2 Extra Large Instance (m1.large)            | 16            | 1
  64                  | EC2 Cluster Compute Instance (cc1.4xlarge)     | 8             | 1
  128                 | EC2 High-CPU Extra Large Instance (c1.xlarge)  | 16            | 1

Table 1 - Detailed setup of the NPB experiment. 3. Initiate execution of the end-user project After setting up the experiment environment, we first ran both Class C and Class D of the NPB benchmarks on three virtual clusters consisting of 16 m1.large instances (64 cores), 16 c1.xlarge instances (128 cores), and 8 cc1.4xlarge instances (64 cores), separately. We ran each benchmark 10 times, and the experimental results are shown in the appendix. Since we were using Amazon EC2 for the first time, and expected cheap prices, we did the tests in the same way that we would have done on the HPC cluster in our laboratory. This approach incurred significant costs at this stage of the experiment; EC2 is not as cheap as we originally thought when used this way. 4. Monitoring We did not use specialized monitoring tools; instead we used Xshell 4, a terminal emulator for Windows, to connect to the virtual cluster of EC2 instances and monitor the progress of our experiment. We used the runtime of the benchmark as the metric for evaluating the performance of the cloud resource. 5. Review results (where needed) Experimental results show that there is essentially no performance fluctuation of tightly coupled benchmarks on the virtual cluster consisting of CCI instances.
By contrast, the performance fluctuation of tightly coupled benchmarks on the virtual cluster consisting of non-CCI instances is obvious, and we find that the application performance on the virtual cluster of non-CCI instances shows an increasingly positive trend from the first round to the tenth round. CHALLENGES An objective fact is that China's Internet speed still lags behind comparable network speeds worldwide. Currently, Chinese users cannot connect directly to Amazon EC2 instances; in order to do the scientific experiments, we accessed EC2 instances temporarily through some technical means. While uploading large files to EC2 instances with Xshell 4, we often encountered instance crashes. Our solution was to first upload files to Dropbox, a free file-hosting service, and then download them on the EC2 instances. We also had budget issues. Running the NPB tests excessively on virtual clusters of non-CCI instances in the first step of the experiment led to a budget overrun. The project organizers solved this issue. BENEFITS We are very grateful to the organizers for the opportunity to participate in the project; we obtained first-hand experience in running engineering applications in the commercial cloud computing environment. We gained a more profound understanding of the EC2 billing methods through the experiment. We verified the two main advantages of cloud computing for engineering applications: a customizable environment, and flexible usage and management of resources. We found that the standard deviation of the benchmarks' performance on the virtual cluster consisting of CCI instances is small. By contrast, the performance fluctuation of benchmarks on the virtual cluster consisting of non-CCI instances is obvious.
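The fluctuation comparison above comes down to a simple standard-deviation check across the ten rounds. A sketch using made-up run times (not the actual NPB measurements):

```python
from statistics import mean, stdev

# Hypothetical per-round runtimes in seconds -- illustrative only,
# not the team's measured data.
cci_runs     = [118, 120, 119, 121, 118, 120, 119, 120, 121, 119]
non_cci_runs = [260, 245, 238, 231, 220, 214, 205, 199, 196, 190]

for name, runs in (("CCI", cci_runs), ("non-CCI", non_cci_runs)):
    print(f"{name}: mean={mean(runs):.1f}s stdev={stdev(runs):.1f}s")

# The non-CCI series also drifts downward round over round, mirroring
# the "increasingly positive trend" the team observed.
```

A small standard deviation on the CCI cluster and a large, drifting one on the non-CCI cluster would reproduce exactly the pattern the team reports.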
We observed that the performance of the real application on a virtual cluster consisting of non-CCI instances showed an increasingly positive trend from the first to the last round. CONCLUSIONS AND RECOMMENDATIONS Do not use commercial cloud resources in the same way that you use your own physical cluster, because each operation in a commercial cloud may incur costs. Although you don't need an upfront investment in infrastructure and commercial software in the cloud environment, you still need to pay for the hardware and software you have consumed. That is to say, you need comprehensive schedules for using the cloud resources, in order to adapt to the billing method, before running engineering applications in the cloud. Although non-CCI instances are cheaper, manufacturers need a precise model to choose the suitable instance type according to their budget and task deadlines. Manufacturers need solutions, which are the basis of the comprehensive schedules mentioned above, to predict the running times of engineering applications when the physical environment is replaced by a cloud environment. Relative to general engineering applications, we are more concerned about performance issues of complex engineering applications in the cloud computing environment, because they are more challenging! The key issues are: how to build a new cloud resource organization model suited to the characteristics of complex engineering applications; how to design virtualized resource management techniques for complex engineering applications in the cloud environment; and how to schedule cloud resources in order to enhance complex engineering applications' performance and cloud system capacity. Case Study Authors Haibao Chen, Zhenjiang Xie, Song Wu, and Wenguang Chen
Team 22: Optimization Study of Side Door Intrusion Bars MEET THE TEAM The main benefits from the current experiment were exposure to cloud computing and the discovery of the limitations of the cloud computing environment. USE CASE Research into optimization techniques was conducted and a novel approach was developed. To evaluate optimization scores, each model is subjected to a finite element analysis. A case study was designed, and high-performance computing resources were required for its execution. A cluster was utilized to analyze a large number of moderately sized jobs. Pre-processing was done locally, but the post-processing was done remotely, although no visualization was required. The FEA for this optimization study of side door intrusion bars was planned to be solved using the ABAQUS/Standard solver. There are a limited number of institutions capable of providing the required resources (for example, there are only five high-performance computing centers with ABAQUS licenses in all of Australia). Even where HPC cluster access is secured, there is a problem with the limited availability of academic licenses. Having a reliable cloud service would definitely provide alternative options and versatility, and help in meeting the demand. End-to-End Process 1. End-user project The user was performing an optimization study of side door intrusion bars. The profiles of the bars are unconstrained and so have non-uniform thickness along and across the bars. The bars are meshed with solid elements, with between 15,000 and 20,000 nodes and 30,000 to 50,000 elements per model. All models were computer generated and auto-meshed. The user expected 2,000-3,000 design variations in this study. The user communicated their needs to the resource provider (PBS scripts, resources needed). 2. Experiment setup environment The Linux HPC cluster was set up on Microsoft Azure with the PBS scheduling system and the CentOS operating system.
The FEA software for the end-user, ABAQUS, was deployed and the license server configured as per the requirements. 3. Experiment Initiation The end-user submitted his test jobs to the cluster. The HPC expert tested the cluster environment and observed seamless access to the cloud resources. The resource provider provided all the necessary access to the end-user and provided technical support when needed.
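Test-job submission on a PBS-scheduled cluster of this kind typically goes through a short PBS script. The sketch below is illustrative, with hypothetical job, file, and resource names rather than the team's actual scripts:

```shell
#!/bin/bash
#PBS -N door_bar_opt
#PBS -l nodes=1:ppn=8
#PBS -l walltime=04:00:00
#PBS -j oe

cd "$PBS_O_WORKDIR"

# Run one design variant with the Abaqus/Standard solver; the job and
# input names are placeholders for the auto-generated models.
abaqus job=intrusion_bar_0001 input=intrusion_bar_0001.inp \
       cpus=8 interactive
```

With 2,000-3,000 auto-generated design variations, such a script would be templated and submitted once per model via qsub; it requires a live PBS cluster and an Abaqus installation, so it is shown here only as a sketch.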
4. Experiment Monitoring The end-user constantly monitored the execution of the jobs and raised a support request when encountering a problem. The nodes that failed during simulation were corrected and reset as needed. The HPC expert monitored the progress of the experiment and contributed his inputs. CHALLENGES Limited and unreliable access to the compute nodes was addressed by restricting the experiment to nodes that were online. A mismatch between the number of license tokens and compute nodes was solved by devising a strategy that maximized the available infrastructure. Due to a lack of scalability, adding more cores did not improve speed, voiding the devised strategy and resulting in the use of only 5 (on average) out of 10 nodes at a time. Running license servers in the cloud is problematic, as the licenses are tied to static hardware resources. Access to application licensing was resolved by using a VPN back to an on-premise license server. BENEFITS The main benefits from the current experiment were exposure to cloud computing and the discovery of the limitations of the cloud computing environment. Benefits for the resource provider: proved that doing CAE simulation with Linux on Azure is feasible. Benefits for the HPC expert: access to the cloud resources proved seamless, and the expert was able to identify on-demand resource availability, with technology enhancements as needed and shorter turnaround times. Shortfalls of the current experiment were the very limited time (seven days) to assess suggested problem resolutions, and that cloud resources and the general Internet are not as stable as on-premise resources. In the next experiment, we expect more time and access to a reliable cloud infrastructure. CONCLUSIONS AND RECOMMENDATIONS For the End-User Clusters built on public clouds are more cost-effective than dedicated hosted providers, but are generally not as reliable as hosted clusters.
For the Resource Provider Doing pre/post-processing in the cloud avoids data transfer issues. Licensing with dynamic resources requires putting license servers in the data center; a VPN is the simplest mechanism to reach back into the data center to the needed license servers. Cloud HPC clusters (true cloud, not co-located or hosted clusters) and Internet WAN are not as stable as on-premise clusters, so the system should be resilient to failure conditions. For the HPC Expert Cloud HPC seems feasible. However, a few things need to be addressed: e.g., company-owned ISV licenses; security of the company's intellectual property; company compliance norms like ITAR; and confidence in the reliability of the resources and the resource provider's capabilities. Case Study Authors Mldenko Kajtaz, Rod Mach, Matt Dunbar, and Satyanarayanaraju P.V.
Team 25: Simulation of Spatial Hearing Our main challenge was to develop interactive visualization tools for simulation data that was stored in the cloud. MEET THE TEAM USE CASE A sound emitted by an audio device is perceived by the user of the device. The human perception of sound is, however, a personal experience. For example, spatial hearing (the capability to distinguish the direction of a sound) depends on the individual shape of the torso, head, and pinna (the so-called head-related transfer function, HRTF). To produce directional sounds via headphones, one needs HRTF filters that model sound propagation in the vicinity of the ear. These filters can be generated using computer simulations but, to date, the computational challenges of simulating the HRTFs have been enormous due to: the need for a detailed geometry of the head and torso; the large number of frequency steps needed to cover the audible frequency range; and the need for a dense set of observation points to cover the full 3D space surrounding the listener. In this project, we investigated the fast generation of HRTFs using simulations in the cloud. The simulation method relied on an extremely fast boundary element solver, which is scalable to a large number of CPUs. The process of developing filters for 3D audio is long, but the simulation work of this study constitutes a crucial part of the development chain. In the first phase, a sufficient number of 3D head-and-torso geometries needed to be generated. A laser-scanned geometry of a commercially available test dummy was used in these simulations. Next, acoustic simulations to characterize the acoustic field surrounding the head and torso were performed. This was our task in the HPC Experiment. Finally, the filters were generated from the simulated data and evaluated by a listening test. This final part will be done by Aalto University and the end-user after the data from the HPC Experiment is available.
The Environment Simulations were run via Kuava's Waveller Cloud simulation tool using the system described below. The number of concurrent instances ranged between 6 and 20. Service: Amazon Elastic Compute Cloud. Total CPU hour usage: 341 h. Type: High-CPU Extra Large Instance (7 GiB of memory; 20 EC2 Compute Units, i.e. 8 virtual cores with 2.5 EC2 Compute Units each; GB of instance storage). One EC2 Compute Unit provides the equivalent CPU capacity of a GHz 2007 Opteron or 2007 Xeon processor; this is also equivalent to an early GHz Xeon processor. CHALLENGE Our main challenge was to develop interactive visualization tools for simulation data stored in the cloud. BENEFITS The main benefit came from the flexible resource allocation that is necessary for efficient acoustic simulations; that is, a large number of instances can be obtained for a short period of time. There was no need to invest in our own computing capacity. In audio simulations especially, capacity is needed in short bursts, for fast simulation turnaround times, with idle time between bursts while the next simulation is planned. The fact that no in-house computational capacity is needed is significant. CONCLUSIONS AND RECOMMENDATIONS The main lessons learned were related to the optimal use of cloud capacity. In particular, we gained important experience in running large simulations in the cloud. For example, the optimal number of instances depended on the size of the simulation task and the amount of data that needed to be transferred to and from the cloud. The man-hours logged during the experiment were: Kuava (60 h), Aalto (1 h), and end-user (4 h). Total CPU hour usage during the experiment was 341 h using High-CPU Extra Large Instances. Case Study Authors Antti Vanne, Kimmo Tuppurainen, Tomi Huttunen, Ville Pulkki, and Marko Hiipakka
Team 26: Development of Stents for a Narrowed Artery For an Abaqus user, SGI Cyclone is a viable solution for both compute and visualization. USE CASE This project focused on simulating stent deployment using SIMULIA's Abaqus/Standard, and on using remote visualization software from NICE to run Abaqus/CAE on SGI Cyclone. The intent was to determine the viability of shifting similar work to the cloud during periods of full utilization of in-house compute resources. Information on Software and Resource Providers Abaqus from SIMULIA, the Dassault Systèmes brand for realistic simulation, is an industry-leading product family that provides a comprehensive and scalable set of Finite Element Analysis (FEA) and multiphysics solvers and modeling tools for simulating a wide range of linear and nonlinear model types. It is used for stress, heat transfer, crack initiation, failure, and other types of analysis in mechanical, structural, aerospace, automotive, bio-medical, civil, energy, and related engineering and research applications. Abaqus includes four core products: Abaqus/CAE, Abaqus/Standard, Abaqus/Explicit, and Abaqus/CFD. Abaqus/CAE provides users with a modeling and visualization environment for Abaqus analyses. NICE Desktop Cloud Visualization (DCV) is an advanced technology that enables technical computing users to remotely access 2D/3D interactive applications over a standard network. Engineers and scientists are immediately empowered by taking full advantage of high-end graphics cards, fast I/O performance, and large-memory nodes hosted in a public or private 3D cloud, rather than waiting for the next upgrade of their workstations. SGI Cyclone is the world's first large-scale on-demand cloud computing service specifically dedicated to technical applications.
Cyclone capitalizes on over twenty years of SGI HPC expertise to address the growing science and engineering technical markets that rely on extremely high-end computational hardware, software, and networking equipment to achieve rapid results. Current State The end user currently has two 8-core PC workstations for pre- and post-processing with Abaqus/CAE, and a Linux-based compute server with 40 cores and 128 GB of available memory. They do not use any batch job scheduling software. The typical stent design model that they run has 2-6 million degrees of freedom (DOF). A typical job uses 20 cores and takes six hours. After the job is run, the data is transferred to the workstation for post-processing. For the experiment it was agreed that SIMULIA and SGI would provide the end user with Abaqus licenses for up to 128 cores, in order to see if running a job on more cores could reduce the time to finish it, as well as access to NICE DCV remote graphics software to view the results in Northern California before downloading them to the end-user office in New Hampshire. MEET THE TEAM End-To-End Process 1. Set up Cyclone account for End User. 2. SGI license server info sent to Software Provider. 3. Issuance of a 128-core temporary license of Abaqus by Software Provider.
4. End user uploads model to his home directory on the Cyclone login node and sends it to the CAE Expert. 5. Benchmark scaling exercise to find the core-count sweet spot is done by the CAE Expert. 6. Results of the benchmark scaling exercise sent to the End User by the CAE Expert. 7. Remote viz session to view data using Abaqus/CAE is set up by the CAE Expert. 8. Remote viz demo via WebEx with the End User. 9. PBS submission script written by the CAE Expert and shared with the End User. 10. End user uploads, runs, views, and downloads a test case. A period of free access is given to the End User. CHALLENGES The team met via a conference call and agreed upon the list of steps that made up the end-to-end process. Setting up the end-user account and having the software licenses issued was quickly done. In order for the End User to upload their model via SSH, they needed to get permission from their internal IT group, which took some time. Once the model was uploaded, the CAE Expert ran the model at various core counts and produced a routine benchmark report for the End User to review (see results in the table below). The remote viz demo went smoothly, but when the End User tried to run the software themselves it took both the Resource Provider's and End User's IT network teams to open the necessary ports, which took much longer than anticipated. Once the ports were open, the remote viz post-processing experience was better than expected. Analysis output files still needed to be shipped back to the End User for future reuse, additional post-processing, etc. Data transfer via the network was found to be slow; final results might be better transferred on an external USB hard drive via FedEx. BENEFITS Here are the top 3 benefits of participating in the experiment for each of the team members: Software Provider 1. I was able to hear from an experienced Abaqus user that doing remote post-processing using a client machine in New Hampshire against an SGI Cyclone server in California provided a good user experience. 2.
I was able to hear from an end user that managing the networking requirements (opening ports in firewalls) took some work but was manageable. 3. I have a reference point for an Abaqus user who views executing his Abaqus workflow on SGI Cyclone to be a viable solution. CAE Expert 1. Expanded my knowledge of analytical methods used in medical stent engineering with Abaqus/Standard. 2. Increased awareness of user interactions with cloud based solution and networking requirements. 3. The geographic distance of ~3100 miles between customer and SGI Cyclone Cloud resources confirms distance is no longer a barrier in high performance computing and remote visualization. Based on the Abaqus Engineer, he comments the SGI Remote Visualization for cloud computing was faster and smoother than I expected. Resource Provider: 1. The ability to walk a new customer through our HPC cloud process for usage. 2. Testing our remote visualization solution, which is in beta. 3. Working with a long time CAE ISV partner to offer a joint cloud base solution to run and view Abaqus jobs. CONCLUSION For an Abaqus user using SGI Cyclone this is a viable solution for both compute and visualization. The Viz side was impressive. End User 1. Gained an increased understanding of what is involved in turning on and using a cloud-based solution for computational work with the Abaqus suite of finite element software. 2. Determined that shifting computational work to the cloud during periods of full-utilization of in-house compute resources is a viable approach to ensuring analysis throughput. 3. Participation in the experiment allowed direct assessment of the speed and integrity of remote visualization of computational models (both pre- and post-processing) for a variety of model and output database sizes. SGI/Nice DCV provided a robust solution, which permitted fast, and accurate manipulation of the computational models used in the study.
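The slow-network observation above can be made concrete with a rough back-of-the-envelope estimate. The file size and line speed below are illustrative assumptions, not figures from the experiment:

```shell
# Rough transfer-time estimate: illustrative numbers only.
# A large Abaqus output database (assumed 50 GB) over an assumed
# effective 10 Mbit/s WAN link, vs. overnight shipping of a USB drive.
size_gb=50
wan_mbit=10

# hours = GB * 8000 Mbit/GB / (Mbit/s) / 3600 s/h
wan_hours=$(awk -v gb="$size_gb" -v mbit="$wan_mbit" \
    'BEGIN { printf "%.1f", gb * 8000 / mbit / 3600 }')

echo "WAN transfer: ${wan_hours} hours"   # ~11.1 hours for 50 GB at 10 Mbit/s
echo "Shipped USB drive: ~24 hours, independent of file size"
```

At these assumed figures the network and the courier are already comparable; for larger result sets or slower effective bandwidth, the shipped drive wins.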
Team 30: Heat Transfer Use Case

USE CASE
Background
In many engineering problems fluid dynamics is coupled with heat transfer and many other multiphysics scenarios. The simulation of such problems in real cases produces large numerical models, so that substantial computational power is required for simulation cycles to be affordable. For SME industrial companies in particular it is hard to implement this kind of technology in-house, because of its investment cost and the IT specialization needed. There is great interest in making these technologies available to SME companies in the form of easy-to-use HPC platforms that can be used on demand. Biscarri Consultoria SL is committed to disseminating parallel open source simulation tools and HPC resources in the cloud. CloudBroker offers its platform, along with related services, for various multiphysics, fluid dynamics, and other engineering applications, as well as life science, for small, medium and large corporations. The CloudBroker Platform is also offered as a licensed in-house solution.

Current State
Biscarri Consultoria SL is exploring
the capabilities of cloud computing resources for performing highly coupled computational mechanics simulations, as an alternative to the acquisition of new computing servers to increase the computing power available. For a small company such as BCSL, the strategy of using cloud computing resources to cover HPC needs has the benefit of not needing an IT expert to maintain in-house parallel servers, thus concentrating our efforts on our main field of competence. To solve the needs of the end user, the following hardware and software resources existing on the provider side were employed by the team:
- Elmer, an open source multi-physical simulation software mainly developed by the CSC IT Center for Science
- CAELinux, a CAE Linux distribution including the Elmer software, as well as a CAELinux virtual machine image at the AWS Cloud
- CloudBroker Platform (public version under platform.cloudbroker.com), CloudBroker's web-based application store offering scientific and technical Software as a Service (SaaS) on top of Infrastructure as a Service (IaaS) cloud resources, already interfaced to AWS and other clouds
- Amazon Web Services (AWS), in particular Amazon's IaaS cloud offerings EC2 (Elastic Compute Cloud) for compute and S3 (Simple Storage Service) for storage resources

Experiment Procedure
Technical Setup
The technical setup for the HPC Experiment was performed in several steps. These followed the principle of starting with the simplest possible solution and then growing it to fulfil more complex requirements in an agile fashion. If possible, each step was first tested and iteratively improved before the next step was taken. The main steps were:
1. All team members were given access to the public CloudBroker Platform via their own account under a shared organization created specifically for the HPC Experiment.
A new AWS account was opened by CloudBroker, the AWS credit loaded onto it, and the account registered in the CloudBroker Platform exclusively for the experiment team.
2. Elmer software on the existing CAELinux AWS machine image was made available in the CloudBroker Platform for serial runs and tested with minimal test cases by CloudBroker and Joël Cugnoni. The setup was then extended to allow parallel runs using NFS and MPI.
3. Via Skype calls, screen sharing, chatting, and contributions on Basecamp, the team members exchanged knowledge on how to work with Elmer on the CloudBroker Platform. The CloudBroker team gave further support for its platform throughout HPC Experiment Round 2. CloudBroker and BCSL performed corresponding validation case runs to test the functionality.
4. The original CAELinux image was only available for normal, non-HPC AWS virtual machine instance types. Therefore, Joël Cugnoni provided Elmer 6.2 as optimized and non-optimized binaries for Cluster Compute instances, and the CloudBroker team deployed these on the CloudBroker Platform for the AWS HPC instance types with 10 Gbit Ethernet network backbone, called Cluster Compute instances.
5. BCSL created a medium benchmark case, and performed scalability and performance runs with different numbers of cores and nodes of the Amazon Cluster Compute Quadruple and Eight Extra Large instance types and different I/O settings. The results were logged, analyzed and discussed within the team.
6. The CloudBroker Platform setup was improved as needed. This included, for example, a better display of the number of cores in the web UI, the addition of artificial AWS instance types with fewer cores, and the ability to change the shared disk space.
7. BCSL tried to run a bigger benchmark case on the AWS instance type configuration that turned out to be preferable from the scalability runs, that is, single AWS Cluster Compute Eight Extra Large instances.

Fig. 1 - This figure shows the model employed in the scalability benchmark. The image on the right shows the temperature field, while the left image shows the velocity field at a certain time of the transient simulation.

Validation Case
First a validation case was defined to test the whole simulation procedure. This case was intentionally simple, but had the same characteristics as the more complex problems that were used for the rest of the experiment. It was an idealized 2D room with a cold air inlet on the roof (T = 23ºC, V = 1 m/s), a warm section on the floor (T = 30ºC, V = 0.01 m/s) and an outlet on a lateral wall near the floor (P = 0.0 Pa). The initial air temperature was 25ºC. The mesh was created with Salome V6. It consists of 32,000 nodes and 62,000 linear triangular elements. The solution is transient. The Navier-Stokes and heat equations were solved in a strongly coupled way. No turbulence model was used. Free convection effects were included. The
mesh of the benchmark analysis was a much finer one of the same geometry domain, consisting of about 500,000 linear triangular elements. The warm section on the floor was removed and the lateral boundaries had an open condition (P = 0.0 Pa).

Job Execution
The submission of jobs to be run at AWS was done through the web interface of the CloudBroker Platform. The procedure was as follows:
- A job was created on the CloudBroker Platform, specifying Job Name, Software, Instance Type and AWS Region
- Case and mesh partition files were compressed and uploaded to the CloudBroker Platform, attached to the created job
- The job was submitted to the selected AWS resource
- Result files were downloaded from the CloudBroker Platform and post-processed on a local workstation
- Scalability parameters were calculated from job output log file data

Fig. 2 - Streamlines on the inlet section.

CHALLENGES
End User
The first challenge for BCSL in this project was to learn whether the procedure to run Elmer jobs on a cloud computing resource such as AWS is easy enough to be a practical alternative to in-house calculation servers. The second challenge was to determine the level of scalability of the Elmer solver running at AWS. Here we encountered good scalability when the instance employed is the only computational node. When running a job on instances using two or more computational nodes the scalability is reduced dramatically, showing that communication between cores of different computational nodes slows down the process. AWS uses 10 Gbit Ethernet as backbone network, which seems to be a limitation for this kind of simulation. After the scalability study with the mesh of 500k elements was performed, a second scalability test was tried with a new mesh of about 2,000k elements. However, jobs submitted for this study to Cluster Compute Quadruple Extra Large and Cluster Compute Eight Extra Large instances have not been successfully run yet.
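The scalability parameters mentioned above are typically the parallel speedup and efficiency derived from measured wall-clock times; a minimal sketch of that calculation (the timings below are made-up placeholders, not the experiment's results):

```shell
# Compute speedup S(n) = T(1)/T(n) and efficiency E(n) = S(n)/n
# from solver wall-clock times. The timings here are hypothetical.
t1=3600          # seconds on 1 core (assumed)
tn=450           # seconds on n cores (assumed)
n=16

speedup=$(awk -v t1="$t1" -v tn="$tn" 'BEGIN { printf "%.2f", t1 / tn }')
efficiency=$(awk -v s="$speedup" -v n="$n" 'BEGIN { printf "%.2f", s / n }')

echo "cores=$n speedup=$speedup efficiency=$efficiency"
# prints: cores=16 speedup=8.00 efficiency=0.50
```

An efficiency well below 1 at higher core counts, as in this hypothetical case, is exactly the signature of the inter-node communication bottleneck discussed above.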
Further investigations are in progress to better characterize the network bottleneck issue as a function of problem size (number of elements per core) and to establish whether it is related to MPI communication latency or NFS throughput of the results.

Resource Provider and Team Expert
On the technical side, most challenges were mastered by already existing features of the CloudBroker Platform or by small improvements. For this it was essential to follow the stepwise agile procedure as outlined above, partly ignoring the stiffer framework suggested by the default HPC Experiment tasks on Basecamp. Unfortunately, AWS HPC cloud resources are limited to a 10 Gbit Ethernet network, which was not sufficient in terms of latency and throughput to run the experiment efficiently on more than one node in parallel. The following options are possible:
1. Run the experiment on one large node only, that is, an AWS Cluster Compute Eight Extra Large instance with 16 cores
2. Run several experiment jobs independently in parallel with different parameters on the AWS Cluster Compute Eight Extra Large instances
3. Run the experiment on another cloud infrastructure which provides low latency and high throughput using technology such as InfiniBand
The CloudBroker Platform allows for all the variants described above. Variants 2 and 3 were not part of this experiment, but would be the next reasonable step to explore in a further experiment round. In the given time, it was also not possible to try out all the different I/O optimization possibilities, which could provide another route to improve scalability. A further challenge of the HPC Experiment was to bring together the expertise from all the different partners involved. Each of them has experience on a separate set of the technical layers that needed to be combined here (actual engineering use case, Elmer CAE algorithms, Elmer software package, CloudBroker Platform, AWS Cloud).
For example, often it is difficult to say from the outset which layer causes a certain issue, or whether the issue results from the combination of layers. Here it was essential for the success of the project to stimulate and coordinate the contributions of the team members. For the future, we envision making this procedure more efficient through decoupling, for example by the software provider directly offering an already optimized Elmer setup in the CloudBroker Platform to the end users.
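Under the hood, the parallel Elmer runs that the platform automates follow Elmer's standard MPI workflow: partition the mesh with ElmerGrid, then launch the MPI solver. A rough sketch, written out as a helper script; the mesh directory name, core count and solver invocation details are assumptions for illustration:

```shell
# Write a sketch of a standard parallel Elmer run. The mesh directory
# "room2d" and the 8-way partitioning are hypothetical examples.
cat > run_elmer_parallel.sh <<'EOF'
#!/bin/bash
# Partition the Elmer mesh directory "room2d" into 8 pieces using Metis
ElmerGrid 2 2 room2d -metis 8
# Launch the parallel solver; each MPI rank reads its own mesh partition
mpirun -np 8 ElmerSolver_mpi case.sif
EOF
chmod +x run_elmer_parallel.sh
echo "wrote helper script run_elmer_parallel.sh"
```

Offering a pre-optimized Elmer setup, as envisioned above, essentially means baking this partition-and-run sequence into the platform so end users never have to issue these commands themselves.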
Finally, a general challenge of the HPC Experiment concept is that it is a non-funded effort (apart from the AWS credit). This means that the involved partners can only provide manpower on a best-effort basis, and paid projects during the same time usually have precedence. It is thus important that future HPC Experiment rounds take realistic business and commercialization aspects into account.

BENEFITS
Concerning the ease of using cloud computing resources, we concluded that this working methodology is very friendly and easy to use through the CloudBroker Platform. The main benefits for BCSL regarding the use of cloud computing resources were:
- Having external HPC capabilities available to run medium-sized CAE simulations
- Having the ability to perform parametric studies, in which a large number of small/medium-size simulations have to be submitted
- Externalizing all the IT work necessary to maintain in-house calculation servers
For CloudBroker, it was a pleasure to extend its platform and services to a new set of users and to Elmer as a new software package. Through the responses and results we were able to further improve our platform and to gain additional experience on the performance and scalability of AWS cloud resources, particularly for the Elmer software.

CONCLUSIONS AND RECOMMENDATIONS
The main lesson learned at Biscarri Consultoria SL from our participation in HPC Experiment Round 2 is that collaborative work through the Internet, using online resources like cloud computing hardware, open source software such as Elmer and CAELinux, and middleware platforms like CloudBroker, is a very interesting alternative to in-house calculation servers. A backbone network such as 10 Gbit Ethernet connecting the computational nodes of a cloud computing platform seems not to be suitable for computational mechanics calculations that need to run on more than one large AWS Cluster Compute node in parallel.
The need for network bandwidth in the solution of the strongly coupled equations involved in such simulations makes faster interconnects such as InfiniBand necessary to achieve time savings when running in parallel on more than a single AWS Cluster Compute instance with 16 cores. For CloudBroker, HPC Experiment Round 2 has provided another proof of its methodology, which combines its automated web application platform with remote consulting and support in an agile fashion. The CloudBroker Platform could easily work with CAELinux and the Elmer software at AWS. User requirements and test outcomes even resulted in additional improvements, which are now available to all platform users. On the other hand, this round has shown again that there are still needs to be fulfilled by dynamic cloud providers such as AWS regarding highly scalable parallel HPC resources, for example a reduction of latency and improvement of throughput (i.e., by using InfiniBand instead of 10 Gbit Ethernet). Their cloud infrastructure is currently best suited for loosely or embarrassingly parallel jobs such as parameter sweeps, or for highly coupled parallel jobs limited to single big machines. Finally, despite online tools, the effort necessary for a project involving several partners like this one should not be underestimated. CloudBroker expects, though, that in the future more software like Elmer can be offered directly through its platform in an already optimized way, making usage more efficient.

Case Study Authors - Lluís M. Biscarri, Pierre Lafortune, Wibke Sudholt, Nicola Fantini, Joël Cugnoni, and Peter Råback.
Team 34: Analysis of Vertical and Horizontal Wind Turbines

USE CASE
The goal was to optimize the design of wind turbines using numerical simulations. The case of vertical axis turbines is particularly interesting, since the upwind turbine blades create vortices that interact with the blades downstream. The full influence of this can only be understood using transient flow simulations, requiring large models to run for a long time.

CHALLENGES
In order to test the performance of a particular wind turbine design, a transient simulation had to be performed for each wind speed and each rotational velocity. This led to a large number of very long simulations, even though each model might not be very large. Since the different wind speeds and rotational velocities were independent, the computations could be trivially distributed on a cluster or in the cloud.

Figure 1: 2D simulation of a rotating vertical wind turbine.
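Because each (wind speed, rotational velocity) pair is independent, the sweep can be farmed out as separate batch jobs; a minimal sketch, where the solver command, wind speeds and rotor speeds are hypothetical placeholders:

```shell
# Generate one job script per (wind speed, rotational velocity) pair.
# The solver invocation inside each script is a placeholder.
mkdir -p jobs
for v in 6 9 12; do              # wind speeds in m/s (assumed)
  for rpm in 30 60 90 120; do    # rotor speeds in rpm (assumed)
    cat > "jobs/turbine_v${v}_rpm${rpm}.sh" <<EOF
#!/bin/bash
# Placeholder solver call for one operating point of the turbine
run_cfd_solver --wind-speed ${v} --rotor-rpm ${rpm}
EOF
  done
done
n=$(ls jobs | wc -l | tr -d ' ')
echo "generated $n independent jobs"   # prints: generated 12 independent jobs
```

Each generated script can then be handed to the cluster's batch scheduler, so all operating points run concurrently rather than back to back.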
Another important use of HPC and cloud computing for wind power is parametric optimization. Again, if the efficiency of the turbine is used as the target function, very long transient simulations have to be performed to evaluate every configuration.

BENEFITS
The massive computing power required to optimize a wind turbine is typically not available locally. Since only some steps of the design require HPC, and an on-site cluster would never be fully utilized, cloud computing offers an obvious solution.

CONCLUSIONS AND RECOMMENDATIONS
The problem with cloud computing for simulations using commercial tools is that the number of licenses is typically the bottleneck. Obviously, having a large number of cores does not help if there are not enough parallel licenses. In our case, a number of test licenses were provided by ANSYS, which was very helpful. It is not practical to transfer data back and forth between the cluster and a local workstation. Therefore, any HPC facility needs to provide remote access for interactive use. Unfortunately, this was not available in our case. A test performed on the Penguin cluster showed an 8% increase in speed (per core) compared with our local Windows cluster. This speedup was surprisingly small, given that Penguin uses a newer generation of CPUs with a much better theoretical floating-point performance. This again demonstrates that simulations on an unstructured grid are bandwidth limited. To conclude, cloud computing would be an excellent option for these kinds of simulations if the HPC provider offered remote visualization and access to the required software licenses.

Figure 2: CFD simulation of a vertical wind turbine with 3 helical rotors.

Case Study Author - Juan Enriquez Paraled
Team 36: Advanced Combustion Modeling for Diesel Engines

USE CASE
Modeling combustion in Diesel engines with CFD is a challenging task. The physical phenomena occurring in the short combustion cycle are not fully understood. This especially applies to the liquid spray injection, the auto-ignition and flame development, and the formation of undesired emissions like NOx, CO and soot. Dacolt has developed an advanced combustion model named Dacolt PSR+PDF, specifically meant to address these types of challenging cases where combustion-initiating chemistry plays a large role. The Dacolt PSR+PDF model has been implemented in ANSYS Fluent and was validated on an academic test case (SAE paper). An IC engine validation case is the next step, tackled in the context of the HPC Experiment in the Penguin Computing HPC cloud.

Simulation result showing the flame (red) located on top of the evaporating fuel spray (light blue in the center)

CHALLENGES
Current challenges for the end user operating with just his in-house resources include the fact that the computational resources needed for these simulations are significant (i.e. more than 16 CPUs and one to three days of continuous running).

BENEFITS
The benefit for the end user of using remote resources was that remote clusters allow small companies to conduct simulations that previously were only possible for large companies and government labs. End-user findings on the provided cloud access include:
Startup:
o POD environment setup went smoothly
o ANSYS software installation and licensing as well
System:
o POD system OS comparable to OS used at Dacolt
o ANSYS Fluent version same as used at Dacolt
Running:
o Getting used to POD job scheduling
o No portability issues of the CFD model in general
o Some MPI issues related to Dacolt's User Defined Functions (UDFs)
o Solver crash during injection + combustion phase, to be investigated
Overall, we experienced easy-to-use ssh access to the POD cluster. The environment and software setup went smoothly with collaboration between POD and ANSYS. The remote environment, which nearly equaled the Dacolt environment, provided a head start. The main issue encountered: the uploaded Dacolt UDF library for Fluent did not work in parallel out of the box. It is likely the Dacolt User Defined Functions would have to be recompiled on the remote system.

Project results
An IC engine case was successfully run until solver divergence, to be reviewed by Dacolt with ANSYS support. Dacolt model validation seems promising.

Anticipated challenges included:
o Account setup and end-user access
o Configuring the end user's CFD environment with ANSYS Fluent v14.5
o Educating the end user in using the batch queuing system
o Getting data in and out of the POD cloud
Actual barriers encountered:
o Running end-user UDFs with Fluent in parallel gave MPI problems

CONCLUSIONS AND RECOMMENDATIONS
o Use of POD remote HPC resources worked well with ANSYS Fluent
o Although the local and remote systems were quite comparable in terms of OS, etc., subsystems like MPI may not work out of the box
o Local and remote network bandwidth was good enough for data transfer, but not for tunneling CAE graphics using X
o Future use of remote HPC resources depends on the availability of pay-as-you-go commercial CFD licensing schemes

Case Study Author - Ferry Tap
Team 40: Simulation of Spatial Hearing (Round 2)

USE CASE
A sound emitted by an audio device is perceived by the user of the device. The human perception of sound is, however, a personal experience. For example, spatial hearing (the capability to distinguish the direction of sound) depends on the individual shape of the torso, head and pinna, described by the so-called head-related transfer function (HRTF). To produce directional sounds via headphones, one needs to use HRTF filters that model sound propagation in the vicinity of the ear. These filters can be generated using computer simulations, but, to date, the computational challenges of simulating the HRTFs have been enormous due to: the need for a detailed geometry of the head and torso; the large number of frequency steps needed to cover the audible frequency range; and the need for a dense set of observation points to cover the full 3D space surrounding the listener. In this project, we investigated the fast generation of HRTFs using simulations in the cloud. The simulation method relied on an extremely fast boundary element solver, which is scalable to a large number of CPUs. The process for developing filters for 3D audio is long, but the simulation work of this study constitutes a crucial part of the development chain. In the first phase, a sufficient number of 3D head-and-torso geometries needed to be generated. A laser-scanned geometry of a commercially available test dummy was used in these simulations. Next, acoustic simulations to characterize the acoustic field surrounding the head-and-torso were performed. This was our task in the HPC Experiment. The Round 2 simulations focused on the effect of the acoustic impedance of the test dummy on the HRTFs. Finally, the filters were generated from the simulated data and they will be evaluated by a listening test.
The final part was done by the end user. Simulations were run via Kuava's Waveller Cloud simulation tool using the system described below. The number of concurrent instances ranged between 6 and 20.

Service: Amazon Elastic Compute Cloud
Total CPU hour usage: 371h
Type: High-CPU Extra Large Instance
High-CPU Extra Large Instance:
- 7 GiB of memory
- 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
- 1690 GB of instance storage
- 64-bit platform
- I/O Performance: High
One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor; this is also equivalent to an early-2006 1.7 GHz Xeon processor.

The man-hours accumulated during the experiment included Kuava (50h) and the end user (5h). Total CPU hour usage during the experiment was 371h using the High-CPU Extra Large Instance.

CHALLENGES
Our main challenge was to develop interactive visualization tools for simulation data stored in the cloud.

BENEFITS
The main benefit resulted from the flexible resource allocation, which is necessary for efficient acoustic simulations. That is, a large number of instances can be obtained for a short period of time. Other benefits included not having to invest in our own computing capacity. Especially in audio simulations, the capacity is needed in short bursts for fast simulation turnaround times, and the time between the simulation bursts, while the next simulation is planned (i.e., when no computational capacity is needed), is significant.

CONCLUSIONS AND RECOMMENDATIONS
The main lessons learned during Round 2 were related to using CPU optimization when compiling the code for cloud simulations. We observed that Amazon did not support all optimization features, even though the optimization should be available in the instances used for simulations. The problems were solved (with the kind help of Amazon support) by disabling some of the optimizations when compiling the code.

Fig. 1 - Simulation model (an acoustic test dummy). The dots indicate all locations of monopole sound sources that were used in the simulations. The red dots are the sound sources used in this image. The figure in the middle shows the sound pressure level (SPL) in the left ear as a function of the sound direction and the frequency. On the right, the SPL relative to sound sources in the far-field is shown.

Case Study Author - Tomi Huttunen
Team 44: CFD Simulation of Drifting Snow

USE CASE
Binkz Inc. is a Canadian-based CFD consultancy firm with fewer than five employees, active in the areas of aerospace, automotive, environmental and wind engineering, as well as naval hydrodynamics and process technologies. For Binkz's consultancy activities, simulation of drifting snow is necessary in order to predict the redistribution of accumulated snow by the wind around arbitrary structures. Such computations can be used to determine the snow load design parameters of rooftops, which are not properly addressed by building codes at present. Other applications can be found in hydrology and avalanche effects mitigation. Realistic simulation of snow drift requires a 3D two-phase fully-coupled CFD model that easily takes several months of computing time on a powerful workstation (~16 cores), with memory requirements that can exceed 100GB in some cases; hence the need for computing clusters to reduce the computing time. The pay-per-use model of the cloud paradigm could be ideal for a small consultancy firm, reducing the fixed costs of acquiring and maintaining a computing cluster and allowing the direct billing of the computing resources in each project. The snowdrift simulations were performed with a customized OpenFOAM two-phase solver. OpenFOAM is a free, open source CFD software package developed by OpenCFD Ltd at ESI Group and distributed by the OpenFOAM Foundation. It has a large user base across most areas of engineering and science, from both commercial and academic organizations. The input data consisted of a computational mesh of several million cells and a number of ASCII input files to provide the physical and numerical parameters of the simulation.
The output data consisted of several files containing the values of the velocity, pressure, volume fraction and turbulence variables for each of the air and snow phases, in every computational cell and for each required flow time. These were used to generate snapshots of the flow field and drifting snow, as well as values of snow loads where required on and around the structure being analyzed.

End-to-end process:
The project definition was agreed upon in an online meeting between the team expert and the end user, and SDSC's compute cluster Triton was selected as the hardware resource to fulfill the large memory demands (~100GB RAM) and fast interconnect required for good scalability. An initial budget of 1,000 core hours was assigned to the project. OpenFOAM was downloaded into
the home directories. An initial attempt to build the solver with the PGI compilers was unsuccessful. Building with the Intel compilers was successful, but subsequent computational tests ended in segmentation faults never observed on other platforms. As a last-ditch effort a final build was done with the gcc compiler, the OpenFOAM native compiler, albeit with several non-optimal fixes to make sure the build was available in time to get some tests done before the project deadline. At that point about 40% of the allocated CPU time had been spent. Limited speedup tests were done with the gcc build due to the scarcity of time and resources left. The speedup tests showed the expected scalability behavior, with one anomalous occurrence never before observed on other platforms. A thorough investigation of the anomaly was considered outside the context of the Experiment, considering the non-optimal nature of the gcc build.

Efforts invested:
- Triton support: <10 hours on build attempts, system configuration, and tracking of the build.
- End user: more than 100 hours in build attempts, solver and test case setup, testing the builds and analyzing the test results.
- Team expert: basic support, reporting and overall experiment management.
- Resources: ~900 core hours for building the software, testing the builds and performing initial tests for running large jobs.

CHALLENGES
The main challenge during the setup of the configuration was getting a successful build of OpenFOAM on the hardware resource. The main challenge during test execution was scheduling a test MPI simulation job requiring several parallel compute nodes on queues occupied by a high number of serial runs by other users prioritized in the queuing system. This resulted in deployment waiting times that were not acceptable in the workflow of the end user.
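For reference, the gcc build path that finally succeeded follows OpenFOAM's standard source-build procedure; a rough sketch written out as a helper script, where the source-tree path and version are hypothetical, and on Triton several non-optimal fixes were still needed on top of this:

```shell
# Write a sketch of the standard OpenFOAM gcc source build.
# The source-tree path/version below is an assumption for illustration.
cat > build_openfoam_gcc.sh <<'EOF'
#!/bin/bash
cd "$HOME/OpenFOAM/OpenFOAM-2.1.x" || exit 1
export WM_COMPILER=Gcc          # select the native gcc toolchain
source etc/bashrc               # load the OpenFOAM build environment
# Build everything, keeping a log so failed build steps can be traced
./Allwmake > log.Allwmake 2>&1
EOF
chmod +x build_openfoam_gcc.sh
echo "wrote helper script build_openfoam_gcc.sh"
```

Switching compilers amounts to changing `WM_COMPILER` and rebuilding, which is why the PGI and Intel attempts could be retried relatively cheaply before falling back to gcc.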
There exist, however, other queues on Triton that could provide better prioritization and response times, but they were not tested due to the limited time frame of the experiment.

BENEFITS
The first benefit of the experiment was the learning experience of building OpenFOAM with different compilers on different platforms. Past experience in compiling OpenFOAM on other CentOS systems led us to believe this would not be a problem on Triton. Unfortunately, it was, and in the future one should make sure in advance that an optimized OpenFOAM build exists on the target resource, or the project plan should anticipate the time and labor required to obtain a good build. In this experiment, SDSC had agreed to provide computing time only, but even so support staff committed a significant amount of their own time to assist with the OpenFOAM build. Given enough time it is certain that the Triton support staff would have managed to provide optimal builds of OpenFOAM with all tested compilers. Another lesson from the experiment is that, apart from a well-fitting hardware platform (as Triton would be), it is also important for production jobs to be launched on appropriate MPI queues that do not allow high numbers of smaller serial jobs to delay large parallel MPI jobs.

CONCLUSIONS AND RECOMMENDATIONS
OpenFOAM (or more generally, a large open-source software package such as OpenFOAM) is best built on the platform it will run on. OpenFOAM is most easily built with the third-party software as provided within the distribution. For the application of snow drift simulations, running on a public/academic resource using a standard (i.e. non-prioritized) account yields unpredictable waiting times and significant computing delays when running concurrently with a high number of serial runs by other users.
In another experiment round, we would recommend testing an alternative platform/queue with a different capacity, user base, or job queuing system that is a better fit for the end user's workflow.
Fig. 1 - Close-up of the building model with simplified roof structure. The structured mesh is 1.25 million hexahedral cells.
Case Study Authors - Ziad Boutanios and Koos Huijssen
Team 46: CAE Simulation of Water Flow Around a Ship Hull
The results of the simulation, performed in a wide range of towing speeds on a grid with about 1 million computational cells, showed good agreement with the experimental data.
USE CASE
The goal of this project was to run CAE simulations of water flow around the hull of a ship much more quickly than was possible using available resources. Current simulations took a long time to compute, limiting the usefulness and usability of CAE for this problem. For instance, on the resources currently available to the end user, a simulation of seconds of real-time water flow took two to three weeks of computational time. We decided to run the existing software on an HPC resource to realize whatever runtime gains might be achieved by using larger amounts of computing resources.
Application software requirements
This project required the TESIS FlowVision 3.08.xx software. FlowVision is already parallelized using MPI, so we expected it to be able to utilize the HPC resources. However, it does require the ability to connect to the software from a remote location while the software is running, in order to access the software licenses and steer the computation. For the license keys, see the description in FlowVision installation and preferences on Calendula Cluster.docx.
MEET THE TEAM
Digital Marine Technology
Fig. 1 - Wave pattern around the ship hull
Custom code or configuration of end-user
This project necessitated upgrading the operating system on the HPC system to support some libraries required by the FlowVision software. The Linux version on the system was not recent enough, and one of the system's main components, the glibc libraries, was not the correct version. We also had to open a number of ports to enable the software to connect to and from specific external machines (specified by their IP address).
Computing Resource:
Resource requirement from the end-user: about 8-16 nodes of the HPC machine, used for 5 runs of 24 hours each.
Resource details: There are two processors on each node, Intel Xeon 3.00GHz, with 4 real cores per processor, so each compute node has 8 real cores. Each node also has 16GB of memory, two 1Gb Ethernet cards, and one Mellanox InfiniBand card. This experiment had been assigned 32 nodes (so 256 cores) to use for simulations.
How to request resources: To get access to the resources you contact the resource provider. They provide an account quickly (in around a day).
How to access resources: The front end of the resource is accessed using ssh; you will need an account on the system to do this, using a command such as this: ssh -X [email protected] -p 2222
Once you have logged into the system, you can run jobs using the Open Grid Scheduler/Grid Engine batch system. To use that system you need to submit a job script using the qsub command.
CHALLENGES
Current simulations take a long time to compute, limiting the usefulness and usability of the CAE approach for this problem. For instance, on the resources currently available to the end user, a simulation of seconds of real-time water flow takes two to three weeks of computational time. To improve this time to solution we need access to larger computational resources than we currently have available.
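The submission step described above can be illustrated with a minimal Grid Engine job script. The job name, parallel environment name, core count, wall-clock limit, and solver path below are all hypothetical, not the team's actual configuration:

```shell
# Write a minimal SGE/Grid Engine job script; all values are illustrative.
cat > flowvision_job.sh <<'EOF'
#!/bin/bash
#$ -N flowvision_hull
#$ -pe mpi 64
#$ -cwd
#$ -l h_rt=24:00:00
mpirun -np $NSLOTS /opt/flowvision/bin/solver project.fvproj
EOF
# On the cluster front end this would then be submitted with:
#   qsub flowvision_job.sh
echo "job script written"
```

The `$NSLOTS` variable is filled in by Grid Engine with the number of slots granted by the `-pe` request, so the same script works unchanged for different core counts.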
Scientific Challenge
Simulation of the viscous flow around the hull of a ship with a free surface was performed. The object of research was the hull of a river-sea dry-cargo vessel with an extremely high block coefficient (Cb = 0.9). The hull flow included complex phenomena, e.g. the wave pattern on the free surface and fully developed turbulent flow in the boundary layer. The main purpose of the simulation was the determination of towing resistance. In general, the dependence of towing resistance on the speed of the ship is used for the prime mover's power prediction at the design stage. The present case considered a test example for which reliable experimental data exist. In contrast to the conventional method of model tests, the methods of CFD simulation have not been fully studied regarding the reliability of the results, as well as the computational resources and time costs, etc. For these reasons, the computational grid formation and the scalability of the solution were the focus of this research.
Resources
FCSCL, the Foundation of Supercomputing Center of Castile and León, Spain, provided HPC resources in the form of a system of 288 HP blade nodes with 8 cores and 16GB RAM per node.
Software
FlowVision is a new-generation multi-purpose simulation system for solving practical CFD (computational fluid dynamics) problems. The modern C++ implementation offers modularity and flexibility that allows addressing the most complex CFD areas. A unique approach to grid generation (geometry-fitted sub-grid resolution) provides a natural link with CAD geometry and FE meshes. The ABAQUS integration through the Multi-Physics (MP) Manager supports the most complex fluid-structure interaction (FSI) simulations (e.g., hydroplaning of automotive tires). FlowVision integrates 3D partial differential equations (PDEs) describing different flows, viz., the mass, momentum (Navier-Stokes), and energy conservation equations. The system of governing equations is completed by state equations.
If the flow is coupled with physical-chemical processes like turbulence, free surface evolution, combustion, etc., the corresponding PDEs are added to the basic equations. Altogether, the PDEs, state equations, and closure correlations (e.g., wall functions) constitute the mathematical model of the flow. FlowVision is based on the finite-volume approach to discretization of the governing equations. An implicit velocity-pressure split algorithm is used for integration of the Navier-Stokes equations. FlowVision is integrated CFD software: its pre-processor, solver, and post-processor are combined into one system. A user sets the flow model(s), physical and method parameters, initial and boundary conditions, etc. (pre-processor), performs and controls calculations (solver), and visualizes the results (post-processor) in the same window. The user can stop the calculations at any time to change the required parameters, and continue or recommence the calculations.
Additional Challenges
This project continued from the first round of the cloud experiment. In the first round we faced the challenge that
the end user for this project had a particular piece of commercial simulation software they needed to use for this work. The software required a number of ports to be open from the front end of the HPC system to the end user's machines, both for accessing the licenses for the software and to enable visualization, computational steering, and job preparation for the simulations. There were a number of issues to be resolved to enable these ports to be opened, including security issues for the resource provider (requiring the open ports to be restricted to a single IP address or small range of IP addresses), and educating the end user about the configuration of the HPC system (with front-end and back-end resources and a batch system to access the main back-end resources). These issues were successfully tackled. However, another issue was encountered: the Linux version of the operating system on the HPC resources was not recent enough, and one of the system's main components, the glibc libraries, was not the required version for the commercial software to be run. The resource provider was willing to upgrade the glibc libraries to the required version; however, this impacted another team during the first round. At the start of this second round of the experiment this problem was resolved so simulations could be undertaken.
Outcome
The dependence of the towing resistance on the resolution of the computational grid (grid convergence) was investigated. The results show that grid convergence becomes good when grids with more than 1 million computational cells are used. The results of the simulation, performed in a wide range of towing speeds (Froude numbers) on a grid with about 1 million computational cells, showed good agreement with the experimental data. CFD calculations were performed at full scale. The experimental results were obtained in the deepwater towing tank of the Krylov State Research Centre (model scale is 1:18.7).
The full-scale CFD results were compared to the recalculated results of the model test. The maximum error in the towing resistance of the hull reached only 2.5%.
Fig. 2 - Grid convergence, speed 12.5 knots
Fig. 3 - Comparison of the CFD and experimental data in dimensionless form (residual resistance coefficient versus Froude number)
Visualization of the free surface demonstrated the wave pattern, which is in good correspondence with the photos of the model tests. High-quality visualization of other flow characteristics was also available.
Fig. 4 - Free surface CFD, speed 13 knots (Fn = 0.182)
Fig. 5 - Pressure distribution on the hull surface (scale in Pa)
Fig. 6 - Shear stress distribution on the hull surface (scale in Pa)
Fig. 7 - Scalability test results
CONCLUSIONS AND RECOMMENDATIONS
Using HPC clouds offers users incredible access to supercomputer resources. CFD users, with the help of commercial software, can greatly speed up their simulation of hard industrial problems. Nevertheless, existing access to these resources has the following drawbacks:
1. Commercial software must first be installed on the remote supercomputer.
2. It is necessary to provide the license for the software, or to connect to a remote license server.
3. The user can face many problems during the installation process: e.g., incompatibility of the software with the operating system, and incompatibility of additional 3rd-party software like MPI, TBB libraries, etc.
4. All these steps require that the user be in contact with the software vendor or cluster administrator for technical support.
From our point of view, it is necessary to overcome all these problems in order to use commercial software on HPC clouds. Commercial software packages used for simulation often have licensing and operational requirements that mean either the resources they run on need to access external machines, or software needs to be installed locally to handle licenses, etc. New users of HPC resources often require education in the general setup and use of such systems (e.g., the fact that you generally access the computational resources through a batch system rather than logging on directly). Basecamp has been useful for enabling communication between the project partners, sharing information, and ensuring that one person does not hold up the whole project. Communication between the client side and the solver side of modern CAE systems ordinarily uses network protocols. Thus the organization of work over the SSH protocol requires additional operations, including port forwarding and data translation.
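The port forwarding mentioned above can be sketched with a small helper script. The host names and port numbers below are hypothetical examples, not the actual project configuration:

```shell
# Write a helper that opens the tunnels a client/solver CAE split may need.
# Hosts and ports below are illustrative assumptions.
cat > tunnel.sh <<'EOF'
#!/bin/bash
# -L: forward a local port to the solver service on the cluster front end.
# -R: reverse-forward so the remote solver can reach a local license server.
# -N: open tunnels only, run no remote command.
ssh -N \
    -L 10001:localhost:10001 \
    -R 27000:localhost:27000 \
    -p 2222 user@frontend.example.org
EOF
chmod +x tunnel.sh
echo "tunnel helper written"
```

With tunnels like these in place, the client software talks to localhost as if the solver and license server were on the local network, which is the "data translation" burden the text refers to.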
On the other hand, when properly configured, the client interface is able to manage the solving in the same manner as in a local network.
Case Study Authors - Adrian Jackson, Jesus Lorenzana, Andrew Pechenyuk, and Andrey Aksenov.
Team 47: Heavy Duty Abaqus Structural Analysis using HPC in the Cloud Round 2
The major challenge, now widely accepted to be the most critical, was the end user perception and acceptance of the cloud as a smooth part of the workflow.
USE CASE
In Round 1 of the HPC Cloud experiment, the team established that computational use cases could indeed be submitted successfully using the cloud API and infrastructure. The objective of this Round 2 was to explore the following: How can the end user experience be improved? For example, how could the post-processing of HPC CAE results kept in the cloud be viewed on a remote desktop? Was there any impact of the security layer on the end user experience? The end-to-end process remains widely dispersed; end user demand was tested in two different geographic areas, the continental USA and Europe. The network bandwidth and latency were expected to play a major role, since they impact the workflow and user perception of the ability to deliver cloud HPC capability not in the compute, but in the pixel manipulation domain. Here is an example of the workflow:
1. Once the job finishes, the end user receives a notification; the results files remain at the cloud facility, i.e. they are NOT transferred back to
the end user's workstation for post-processing.
2. The post-processing is done using a remote desktop tool, in this case the NICE Software DCV infrastructure layer on the HPC provider's visualization node(s).
Typical network transfer sizes (upstream and downstream) were expected to be modest, and it is this impact that we hoped to measure, thus making them tunable. This represented the major component of the end user experience. The team also expanded by almost 100% to bring in more expertise and support, to tackle the last stage of the whole process and make the end user experience adjustable depending on several network-layer-related factors.
CHALLENGE
The major challenge, now widely accepted to be the most critical, was the end user perception and acceptance of the cloud as a smooth part of the workflow. Here remote visualization was necessary to see if the simulation results (left remotely in the cloud) could be viewed and manipulated as if they were local on the end user desktop. To contrast with Round 1, and to bring real network expertise to bear on this aspect, NICE's DCV was chosen to help deliver this, as it is:
Application neutral
Has a clean and separate client (free) and server component
Provides some tuning parameters which can help overcome bandwidth issues
Several tests were conducted and carefully iterated, varying parameters such as image update rate, bandwidth selection, codecs, etc. A screen shot is shown below for the final successful user acceptance of remote visualization settings.
TABLE 1. CAST-IN-PLACE MECHANICAL ANCHOR CONCRETE ANCHORAGE PULLOUT CAPACITY ANALYSIS (FEA STATS)
Materials: Steel & Concrete
Procedure: 3D Nonlinear Contact, Fracture & Damage Analysis
Number of Elements: 1,626,338
Number of DOF: 1,937,301
Solver: ABAQUS/Explicit in Parallel
Solving Time: 11.5 hours on a 32-core Linux Cluster
ODB Result File Size: 2.9 GB
Fig 1. - Typical end user screen manipulation(s)
Fig 2. - The post-processing underlying infrastructure (cloud end): DCV layer setup
Fig 3. - The post-processing underlying infrastructure (end user space): DCV-enabled post-processing (end user view)
Setup
We made a number of end user trials. First the DCV was installed, with both a Windows and a Linux client. Next a portal window was opened, usually at the same time as the end user trial, to observe the demand on the serving infrastructure (see diagram). This ensured that there was sufficient bandwidth and capacity at the cloud end. The end node was hosting an NVIDIA graphics accelerator card; an initial concern was whether the card version was supported or had an impact. DCV has the ability to apply a sliding scale of pixel compression, which involves skipping certain frames in order to keep the flow smooth.
Fig 4. - Ingress/egress test results/profile
Figure 4 shows that the cloud Internet measurements peaked at 12 Mbits/sec, but generally hovered at or below 8 Mbits/sec for this particular session. This profile graph is a good representation of what has been seen in the past on DCV sessions. The red line (2 Mbits/sec) is where a consistent end user experience for this particular graphic size was observed.
CONCLUSIONS AND RECOMMENDATIONS
Here is a summary of the key results found during our Round 2 experiment:
End point Internet bandwidth variability: Depending on when it is conducted, a vendor-neutral test applet result ranges from 1 Mbps to 10 Mbps. The pipe bandwidth was expected to be 20 Mbits/sec, but when it was shared by the office site running normal enterprise applications such as Exchange Server, Citrix, etc., such variation was not conducive to a qualitative end user experience.
Switching to another pipe (with burst mode of 50 to 100 Mbits/sec): More testing showed that the connection was not stable, and ABAQUS/Viewer graphics window freezes were experienced after being idle for a while. This required local IT to troubleshoot the issue.
There were no significant differences between Windows- and Linux-hosted platforms. The NICE DCV/EnginFrame is a good platform for remote visualization if stable Internet bandwidth is available.
Some of the parameters for the connection performance:
o VNC connection line-speed estimate: 1 ~ 6 Mbps, RTT ~ 62 ms
o DCV bandwidth usage: AVG 100 KiB ~ 1 MiB
o DCV frame rate: 0~10 FPS; >5 FPS acceptable, >10 FPS smooth
We tried both Linux and Windows desktops. Because of the bandwidth randomness and variability, it was not possible to create a good baseline to compare the performance of the two desktops. The graphics cards did not have any impact on the end user experience. However, the model size and graphic image pixel size perhaps play a major role, and the current experiment did not have enough time to study and characterize this issue. The ABAQUS model used in this test case does not put much demand on the graphics card; we've seen only 2% usage on the card. There was usually sufficient network capacity and bandwidth at the cloud serving end. The last-mile delivery capability at the end user site was the most important, and perhaps the only determining, factor influencing the end user experience and perception. Beyond the cloud service provider, a local or end user IT support person with network savvy is perhaps a necessary part of the infrastructure team in order to deliver robust and repeatable post-processing visual delivery. This incurs a cost. The security aspect could not be tested, as the time and effort required exceeded what was available in the time allotted. Part of the end user experience learned from Round 1 was to better document the setup, which can be found in the Appendix and clearly shows a smooth and easy-to-follow flow.
Major single conclusion and recommendation
Any site that wishes to benefit from this experience needs to prioritize the last mile issue.
End User Experience Observations & Data Tables: Bandwidth Usage
Note: Image Quality: Specify the quality level of dynamic images when using TCP connections. Higher values correspond to higher image quality and more data transfer. Lower values reduce quality and reduce bandwidth usage.
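The interplay of image pixel size and frame rate noted above can be put into a rough back-of-the-envelope form. Every number below is an illustrative assumption, not a measured DCV figure:

```shell
# Crude bandwidth estimate for a remote visualization session.
WIDTH=1280; HEIGHT=1024; BPP=3   # frame dimensions and bytes per pixel (assumed)
FPS=5                            # target frame rate; >5 FPS was deemed acceptable
RATIO=50                         # assumed effective codec compression ratio
BYTES_PER_SEC=$(( WIDTH * HEIGHT * BPP * FPS / RATIO ))
MBITS=$(( BYTES_PER_SEC * 8 / 1000000 ))
echo "approximately ${MBITS} Mbit/s needed"   # prints "approximately 3 Mbit/s needed"
```

With these assumptions the estimate lands at about 3 Mbit/s, the same order of magnitude as the 2 Mbits/sec consistent-experience line observed in Figure 4; doubling the frame dimensions or the frame rate pushes the requirement well past a shared office pipe.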
Network latency for round trip from the DCV remote visualization server:
Ping statistics for : Packets: Sent = 4, Received = 3, Lost = 1 (25% loss)
Approximate round trip times in milliseconds: Min = 56ms, Max = 58ms, Average = 56ms
Case Study Authors - Frank Ding, Matt Dunbar, Steve Hebert, Rob Sherrard and Sharan Kalwani.
Team 52: High-Resolution Computer Simulations of Blow-off in Combustion Systems
Remote clusters allow small companies to conduct simulations that were previously only possible for large companies and government labs.
USE CASE
The undesired blow-off of turbulent flames in combustion devices can be a very serious safety hazard. Hence, it is of interest to study how flames blow off. Simulations offer an attractive way to do this. However, due to the multi-scale nature of turbulent flames, and the fact that the simulations are unsteady, these simulations require significant computer resources. This makes the use of large, remote computational resources extremely useful. In this project, a canonical test problem of a turbulent premixed flame is simulated with OpenFOAM and run on extremefactory.com.
MEET THE TEAM
Computing resource requirements
At least 40 cores.
CHALLENGES
The current challenge for the end-user (with just his in-house resources) is that the computational resources needed for these simulations are significant (i.e., more than 100 CPUs and 1-3 days of continuous running).
BENEFITS
Remote clusters allow small companies to conduct simulations that were previously only possible for large companies and government labs.
Fig. 1 - Schematic of the bluff-body flame holder experiment. Sketch of the Volvo case: a premixed mixture of air and propane enters the left of a plane rectangular channel. A triangular cylinder is located at the center of the channel and serves as a flame holder.
Fig. 2 - Predicted temperature contour field for the Volvo case using OpenFOAM.
Application software requirements
OpenFOAM can handle this problem very well. It can be downloaded from:
Custom code or configuration of end-user
OpenFOAM input files are available at com/u/ /3dcoarse_125.tar.gz. These files were used in a 3D simulation that ran OpenFOAM (reactingFoam, to be precise) on 40 cores.
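The standard OpenFOAM workflow for such a 40-core run can be sketched as below. The case directory name and the availability guard are assumptions; the subdomain count would be set via numberOfSubdomains in system/decomposeParDict:

```shell
# Sketch of a parallel reactingFoam run; guarded so the solver commands
# only execute where OpenFOAM is actually installed.
run_case() {
    cd 3Dcoarse_125 || return 1          # hypothetical case directory name
    decomposePar                         # split the mesh into 40 processor dirs
    mpirun -np 40 reactingFoam -parallel > log.reactingFoam 2>&1
    reconstructPar                       # merge processor results for post-processing
}
if command -v reactingFoam >/dev/null 2>&1; then
    run_case
else
    echo "OpenFOAM not available; commands shown for reference"
fi
```

The decomposePar/mpirun/reconstructPar sequence is the same whether the case runs on a workstation or a cloud cluster, which is what made moving this workload to a remote resource straightforward.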
To get an idea of how to run these files, have a look at the section Run in parallel in: sites.google.com/site/estebandgj/openfoam-training
CONCLUSIONS
Running reactingFoam for a simulation of a bluff-body-stabilized premixed flame requires a mesh of less than a quarter million cells. This is not much, but the simulations need to run for a long time, and are part of a parametric study that needs more than 100 combinations of parameters. Running one or two huge simulations is not the goal here. The web interface was easy to use, so much easier than running in Amazon's EC2 that I did not even read the instructions and was able to properly run OpenFOAM. Nonetheless, it was not very clear how to download all the data once the simulation ended. In addition, the simulation ran satisfactorily. There were some errors at the end, but these were expected. The team has one suggestion: a key advantage of using OpenFOAM is that it allows us to tailor OpenFOAM applications to different problems. This requires making some changes in the code and compiling with wmake, something that can be done in Amazon's EC2. It is not clear how this can be done with the present interface. A future test might be to run myreactingfoam instead of reactingFoam.
Case Study Author - Ferry Tap
Team 53: Understanding Fluid Flow in Microchannels
The experiment serves as a proof of concept of applying a user-oriented computational federation to solve large-scale computational problems in engineering.
USE CASE
Problem Description
The end-user developed a parallel MPI solver for the Navier-Stokes equations. With this solver, the end-user can simulate the flow in a microchannel with an obstacle for a single configuration of the fluid speed, the microchannel size and the obstacle geometry (see Figure 1). A single simulation typically requires hundreds of CPU-hours.
MEET THE TEAM
Fig. 1 - Example flow in a microchannel with a pillar. Four variables characterize the simulation: channel height, pillar location, pillar diameter, and Reynolds number.
The end-user sought to construct a phase diagram of possible fluid flow behaviors to understand how input parameters affect the flow. Additionally, the end-user wanted to create a library of fluid flow patterns to enable analysis of their combinations. The problem has many significant applications in the context of medical diagnostics, bio-medical engineering, constructing structured materials, etc.
CHALLENGES
The problem was challenging for the end-user as it required thousands of MPI-based simulations, which collectively exceeded the computational throughput offered by any individual HPC machine. Although the end-user had access to several high-end HPC resources, executing thousands of simulations requires complex coordination and fault tolerance, which were not readily available. Finally, the simulations are highly heterogeneous, and their computational requirements were hard to estimate a priori, adding another layer of complexity.
The Solution
To tackle the problem, the team decided to use multiple federated heterogeneous HPC resources. The team proceeded in four stages:
1. A preparatory phase in which the HPC experts gained an understanding of the domain problem and formulated a detailed plan to solve it; this phase included a face-to-face meeting between the end-user and the HPC experts.
2. A software-hardware deployment stage in which the HPC experts deployed the end-user's software and implemented the required integration components. Here, minimal or no interaction with systems administrators was required, thanks to the flexibility of the CometCloud platform used in the experiment.
3. A computational phase in which the actual simulations were executed.
4. Data analysis, in which the output of the simulations was summarized and post-processed by the end-user.
The developed approach is based on the federation of distributed heterogeneous HPC resources, aggregated completely in user space. Each aggregated resource acts as a worker executing simulations. However, each resource can join or leave the federation at any point in time without interrupting the overall progress of the computations. Each aggregated resource also acts as temporary storage for the output data. The data is compressed on-the-fly and transferred using the rsync protocol to the central repository for simple, sequential post-processing. In general, the computational platform used in this experiment takes the concept of volunteer computing to the next level, in which desktops are replaced with HPC resources. As a result the end-user's application gains cloud-like capabilities. In addition to solving an important and urgent problem for the end-user, the experiment serves as a proof of concept of applying a user-oriented computational federation to solve large-scale computational problems in engineering.
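The compress-and-transfer step described above can be sketched as follows. The directory names are hypothetical, and a local directory stands in for the central repository; in production the rsync destination would be a remote host reached over ssh:

```shell
# Each worker compresses finished output on the fly, ships it to the
# central repository, then frees its local scratch space.
# All names below are illustrative assumptions.
SIM_DIR=sim_00427                 # one simulation's output directory
REPO=/tmp/central_repo            # stand-in for the central repository
mkdir -p "$SIM_DIR" "$REPO"
echo "u v p" > "$SIM_DIR/fields.dat"     # stand-in output data
tar czf "${SIM_DIR}.tar.gz" "$SIM_DIR"   # on-the-fly compression
rsync -a "${SIM_DIR}.tar.gz" "$REPO/"    # production: rsync -az -e ssh ... user@host:/path
rm -rf "$SIM_DIR" "${SIM_DIR}.tar.gz"    # free local scratch once transferred
```

Compressing before transfer and deleting after a successful rsync keeps each federated resource's scratch footprint small, which matters when 12,845 simulations produce hundreds of gigabytes of output.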
CONCLUSIONS AND RECOMMENDATIONS
Several observations emerged from the experiment:
A good understanding of the domain-specific details by the HPC experts was important to the smooth progress of the experiment.
Close collaboration with the end-user, including face-to-face meetings, was critical for the entire process.
Although it may at first seem counterintuitive, working within the limits set by different HPC centers, i.e. using only SSH access without special privileges, greatly simplified the development process. At the same time, maintaining a friendly relationship with the respective systems administrators helped to shorten the response time to address common operational issues.
General Challenges and Benefits of Using UberCloud
The main difficulty was obtaining a sufficient number of HPC resources that collectively would provide the throughput needed to solve the end-user's problem. This challenge was solved by interacting with several HPC centers, and then exploiting the elasticity offered by CometCloud to add extra resources during the experiment. For example, several machines were federated after the experiment had already been running for five days. UberCloud greatly simplified the process of obtaining computational resources. The ability to quickly contact various HPC providers was central to the success of the experiment. UberCloud provided a well-structured and organized environment to test new approaches for solving large-scale scientific and engineering problems. Following well-planned steps with clearly defined deadlines, as well as having a central message board and documents repository, greatly simplified and accelerated the development process.
Experiment Highlights
The main highlights of the experiment are summarized below:
10 different HPC resources from 3 countries federated using CometCloud
16 days, 12 hours, 59 minutes and 28 seconds of continuous execution
12,845 MPI-based flow simulations executed
2,897,390 core-hours consumed
400 GB of output data generated
The most comprehensive data to date on the effect of pillars on microfluidic channel flow gathered
Case Study Authors - Javier Diaz-Montes, Baskar Ganapathysubramanian, Manish Parashar, Ivan Rodero, Yu Xie, and Jaroslaw Zola.
Acknowledgments
This work is supported in part by the National Science Foundation (NSF) via grants number IIP and DMS (RDI2 group), and CAREER and PHY (Iowa State group). This project used resources provided by: the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the NSF grant number OCI ; FutureGrid, which is supported in part by the NSF grant number OCI ; and the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy (DOE) under the contract number DE-AC02-05CH. The authors would like to thank the SciCom research group at the Universidad de Castilla-La Mancha, Spain (UCLM) for providing access to Hermes, and the Distributed Computing research group at the Institute of High Performance Computing, Singapore (IHPC) for providing access to Libra. The authors would like to acknowledge the Consorzio Interuniversitario del Nord est Italiano Per il Calcolo Automatico, Italy (CINECA), Leibniz-Rechenzentrum, Germany (LRZ), Centro de Supercomputacion de Galicia, Spain (CESGA), and the National Institute for Computational Sciences (NICS) for their willingness to share their computational resources. The authors would like to thank Dr. Olga Wodo for discussion and help with the development of the simulation software, and Dr. Dino DiCarlo for discussions about the problem definition.
The authors express gratitude to all administrators of the systems used in this experiment, especially to Prentice Bisbal from RDI2 and Koji Tanaka from FutureGrid, for their efforts to minimize downtime of the computational resources and for their general support. The authors are grateful to Wolfgang Gentzsch and Burak Yenier for their overall support.
Team 54: Analysis of a Pool in a Desalinization Plant
MEET THE TEAM
The bottleneck in all CAE simulations using commercial software is the cost of the commercial CFD licenses.
USE CASE
Many areas in the world have no available fresh water even though they are located in coastal areas. As a result, in recent years a completely new industry has been created to treat seawater and transform it into tap water. This water transformation requires that the water be pumped into special equipment, which is very sensitive to cavitation. Therefore, a correct and precise water flow intake must be forecasted before building the installation. The CFD analysis of air-water applications using free surface modeling is highly complex. The computational mesh must correctly capture the fluid interface, and the number of iterations required to obtain a physically and numerically converged solution is very high. If both of these requirements are not met, the forecasted solution will not even be close to the real-world solution.
CHALLENGES
The end-user needed to obtain a physical solution in a short period of time, as the time to analyze the current design stage was limited. The time limitation mandated the use of remote HPC resources to meet the customer's time requirements. As usual, the main problem was the result data transfer between the end-user and the HPC resources. To overcome this problem, the end-user used the visualization software EnSight to look at the solution and obtain images and animations entirely over the Internet. The sections below provide an evaluation of the Gompute on-demand solution.
Remote Visualization
The end user categorized the Gompute VNC-based solution as excellent. It is possible to request a graphically accelerated node when starting programs with a GUI. This functionality substantially cuts virtual prototyping lead time, since all the data generated from a CAE simulation can be post-processed directly in Gompute.
Also, this avoids time-consuming data transfers and increases data security by removing the need to have multiple copies of the same data at different locations, sometimes on insecure workstations. Gompute accelerators allow the use of the desktop over links with latency over 300 ms. This allows Gompute resources
to be used from locations separated by as much as 160 degrees of longitude; i.e., the user may be in India and the cluster in Detroit. Collaborative workflows are enabled by the Gompute remote desktop sharing option, so two users at different geographical locations can work together on the same simulation.

Ease of Use
Gompute on demand provides a ready-to-use environment with an integrated repository of the requested applications, license connections, and a queuing system based on SGE. To establish the connection to the cluster, you just open ports 22 and 443 on the company's firewall. Downloading Gompute Explorer and opening a remote desktop gives you the same user experience as working on your own in-house machine. Compared to other HPC connection modes tested, Gompute connections were easy to set up and use. The connection allowed connecting to and disconnecting from the HPC account to check how the calculations were progressing. As to costs, the Gompute quotation clearly described the services provided. The technical support from Gompute personnel was also good.

BENEFITS
- Compute remotely
- Pre/post-process remotely
- Gompute can be used as an extension of in-house resources
- Able to burst into Gompute On-Demand from an in-house cluster
- Accelerated file transfers
- Possible to have exclusive desktops
- Support for multiple users on each graphics node
- Applications integrated and ready to use
- GPFS storage available
- Handles high-latency links between the user and the Gompute cluster
- Facilitates collaboration with clients and support

CONCLUSIONS AND RECOMMENDATIONS
The bottleneck using commercial software in CAE is the cost of the commercial CFD licenses. There were two lessons learned: ANSYS has no on-demand CFD license that allows use of the maximum number of available cores in a system, while competitor software, such as STAR-CCM+, already has such a license.
Supercomputing centers must provide analysis and post-processing tools so customers can check results without needing to download result files; otherwise, many of the advantages of using cloud computing are lost to long data-transfer times. The future for the wider use of supercomputing centers lies in finding a way to offer commercial CAE (CFD and FEA) licenses on demand, so that users pay for actual software usage. Commercial software must take full advantage of current and future hardware developments for virtual engineering tools to spread more widely.

Case Study Authors
Juan Enriquez Paraled, Manager of ANALISIS-DSC; Ramon Diaz, Gompute
Team 56: Simulating Radial and Axial Fan Performance

MEET THE TEAM
The main reason to look to HPC in the cloud is cost.

USE CASE
For the end user, the aim of the exercise was to evaluate the HPC cloud service, not to obtain new engineering insights. That's why a relatively basic test case was chosen: a case for which they already had results from the end user's own cluster, and which had a minimum of confidential content. The test case was the simulation of the performance of an axial fan in a duct, similar to those found in the AMCA standard. A single ANSYS Fluent run simulated the performance of the fan under 10 different conditions to reconstruct the fan curve. The mesh consisted of 12 million tetrahedral cells and was well suited to testing parallel scalability.

CHALLENGES
The main reason to look to HPC in the cloud is cost. The end user has a highly fluctuating simulation load. This means that their current on-site cluster rarely has the correct capacity: when it is too large, they pay too much for hardware and licenses; when it is too small, they lose money because the design teams are waiting for results. With a flexible HPC solution in the cloud, the end user can in theory avoid both costs.

Evaluation
HPC as a service will only be an alternative to the current on-site solution if it meets a series of well-defined criteria set by the end user.
Criteria                     | Local HPC  | Ideal cloud HPC | Actual cloud HPC | Pass/Fail
Upload speed                 | 11.5 MB/s  | 2 MB/s          | 0.2 MB/s         | Fail
Download speed               | 11.5 MB/s  | 2 MB/s          | 4-5 MB/s         | Pass
Graphical output             | possible   | possible        | inconvenient     | Fail
Quality of the image         | excellent  | excellent       | good             | Pass
Refresh rate                 | excellent  | excellent       | good             | Pass
Latency                      | excellent  | excellent       | good             | Pass
Command line access          | possible   | possible        | possible         | Pass
Output file access           | possible   | possible        | possible         | Pass
Run on the reserved cluster  | easy       | easy            | easy             | Pass
Run on the on-demand cluster | N/A        | easy            | easy             | Pass
Graphical node               | excellent  | excellent       | good             | Pass
Using UDFs on the cluster    | possible   | possible        | possible         | Pass
State-of-the-art hardware    | reasonable | good            | good             | Pass
Scalability                  | poor       | excellent       | excellent        | Pass
Security                     | excellent  | excellent       | good             | Pass
Hardware cost                | good       | excellent       | N/A              | N/A
License cost                 | good       | excellent       | N/A              | N/A

Table 1 - Evaluation results
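The transfer-speed thresholds in Table 1 translate directly into wait times for a typical result file. A minimal sketch of that arithmetic, using the figures from the table:

```python
def transfer_minutes(size_mb: float, rate_mb_s: float) -> float:
    """Time in minutes to move size_mb megabytes at rate_mb_s MB/s."""
    return size_mb / rate_mb_s / 60.0

# 1 GB (1024 MB) at the 2 MB/s target: about 8.5 minutes.
print(f"target rate:   {transfer_minutes(1024, 2.0):.1f} min")
# The same gigabyte at the measured 0.2 MB/s upload speed: over an hour.
print(f"measured rate: {transfer_minutes(1024, 0.2):.1f} min")
```

This is why the upload-speed row alone is enough to fail the solution: multi-gigabyte cases become impractical to move at 0.2 MB/s.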
Cluster Access
Gridcore allows you to connect to its clusters through the GomputeXplorer, a Java-based program that lets you monitor your jobs and launch virtual desktops. Establishing the connection was actually not that easy. If the standard SSH and SSL ports (22 and 443) are open in your company's firewall, then connecting is straightforward. This is, however, rarely the case. Alternatively, you can make the connection through a VPN. Both options require the end user to make changes to the firewall. Because the end user had to wait a long time for these changes to be implemented, valuable time was lost. Only the port changes were implemented, so the VPN option was never tested.

Transfer Speed
Input files, and certainly result files, for typical calculations range from a few hundred megabytes to a few gigabytes in size. A good transfer speed is therefore of vital importance. The target is a minimum of 2 MB/s for both upload and download, which makes it theoretically possible to transfer 1 GB of data in about 8.5 minutes. When transferring files with the GomputeXplorer, upload speeds of 0.2 MB/s and download speeds of about 4-5 MB/s were measured. When transferring the files with a regular SSH client, the upload speed was 1.7 MB/s and the download speed 0.9 MB/s. These speeds were measured by transferring the same files several times, with the tests performed one after the other to ensure a fair comparison. The measurements show that reasonable to good transfer speeds are theoretically possible, but so far no solution was found to bring the GomputeXplorer's upload speed up to par. As noted by the resource provider, most clients see transfer speeds limited mainly by their own bandwidth, and the low numbers measured here are quite abnormal. Several tests were performed on the system to find the root cause of the issue, but none was found.
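Before waiting on firewall change requests, it is worth verifying which of the two required ports (22 and 443) are actually reachable from inside the company network. A minimal stdlib sketch of such a check (the cluster hostname is a placeholder, not a real Gompute address):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the SSH and SSL ports the connection modes depend on.
# "cluster.example.com" stands in for the real cluster address.
for port in (22, 443):
    status = "open" if port_open("cluster.example.com", port) else "blocked"
    print(f"port {port}: {status}")
```

Running this from an end-user workstation tells you immediately whether the direct GomputeXplorer connection can work or whether the VPN route (and its firewall changes) is unavoidable.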
The investigations would have continued until a solution was found, but not within the time frame of the experiment. It may be more practical to wait for a new file transfer tool that Gompute plans to roll out shortly, which might resolve this issue.

Graphical Output in Batch
To see how the flow develops over time, it is common practice to output images of the flow field. Fluent cannot do this from the command line alone; it requires an X window to render to. The end user was not able to make this option work on the Gompute cluster within the allocated time frame. Several suggestions (mainly different command-line arguments) have been put forward to resolve this issue.

Remote Visualization
The end user used the HP Remote Graphics Software package, which gave a like-local experience. If HP RGS is categorized as excellent, the VNC-based solution of Gompute can surely be categorized as good. There was a noticeable difference between the dedicated cluster and the on-demand one with regard to the quality of the remote visualization (both are remote Gompute clusters; the dedicated one was specifically reserved for the end user). The dedicated cluster's render quality and latency were much better. It is entirely possible to do pre- and post-processing on the cluster. It is also possible to request a graphically accelerated node when starting programs with a GUI.

Ease of Use
The Gompute remote cluster uses the same queuing system (SGE) as the end user's cluster, so the commands are familiar. The fact that you can request a full virtual desktop makes using the system a breeze. This virtual desktop allows easy compilation of the UDFs (C code that extends the capabilities of Fluent) on the architecture of the remote cluster. Submitting and monitoring jobs is just as easy as on the local cluster. The process is also identical on the dedicated and the on-demand cluster.
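Because both clusters run SGE, a batch job looks the same in either place. A sketch of generating a minimal SGE submit script for a run like this one; the parallel-environment name, queue, and journal file are hypothetical, site-specific values, not taken from the actual Gompute setup:

```python
def sge_job_script(name: str, slots: int, command: str,
                   pe: str = "mpi", queue: str = "") -> str:
    """Build a minimal SGE submit script; the result is fed to qsub."""
    lines = [
        "#!/bin/bash",
        f"#$ -N {name}",         # job name
        "#$ -cwd",               # run in the submission directory
        f"#$ -pe {pe} {slots}",  # parallel environment and slot count
    ]
    if queue:
        lines.append(f"#$ -q {queue}")  # optional explicit queue
    lines.append(command)
    return "\n".join(lines) + "\n"

# A 16-core Fluent batch run driven by a journal file (illustrative names).
print(sge_job_script("fan_curve", 16, "fluent 3ddp -g -t16 -i fan.jou"))
```

Since the commands and directives are identical on the local, dedicated, and on-demand clusters, the same script can be submitted unchanged wherever capacity is available.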
Apart from the billing method, there is no additional overhead when you temporarily want to expand your simulation capacity by using the on-demand cluster.

Hardware
The hardware made available to the end user was less than two years old (Westmere Xeons). This was considered good; Sandy Bridge-based Xeons would have been considered excellent. The test case was used to benchmark the Gompute cluster against the end user's own aging cluster.

Fig. 1 - Comparison of run times of the test case. The speedup is defined relative to the time it took to run the simulation on 16 cores of the local cluster. The blue curve represents the old, local cluster and the red curve the on-demand cluster from Gompute. The green point is from a run on a workstation with a hardware configuration similar to the Gompute cluster's, but running Windows instead of Linux.

The following points can be concluded from this graph:
- The old cluster isn't performing all that badly considering its age. Either that, or a larger speedup was expected from the new hardware.
- The simulation scales nicely on the Gompute cluster, but not as well on the local cluster.
- The performance of the workstation is similar to that of the Gompute cluster.

Cost
The resource provider only provides hardware; the customer is still responsible for acquiring the necessary software licenses. The cost benefit is therefore limited to hardware and support. The most likely customers for the on-demand cluster service are companies that either rarely run a simulation or occasionally need extra capacity. In both cases they would have to pay for a set of licenses that are rarely used. This does not seem to be a very good deal and may become a showstopper for adopting HPC in the cloud. Hopefully, ANSYS will come up with a license model that enables a service more in line with HPC in the cloud.

BENEFITS
End User
- Ease of use.
- Post- and pre-processing can be done remotely.
- Excellent opportunity to test the state of the art in cloud-based HPC.

CONCLUSIONS AND RECOMMENDATIONS
HPC in the cloud is technically feasible. Most remaining issues are implementation related, and the resource provider should be able to solve them. The remote visualization solution was good and allowed the user to perform some real work. Of course, it remains to be seen whether a stress test with multiple users from the same company yields the same results. The value of the HPC-in-the-cloud solution is limited by the absence of appropriate license models from the software vendors that would allow Gompute to actually sell simulation time and not just hardware and support. Further rounds of this experiment can be used to analyze the abnormal upload speed. File transfer might be tested over the VPN connection to rule out restrictions from the company's firewall. Also of interest is testing the new release of the Gompute file transfer tool, which implements a transfer accelerator.
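The speedup curves in Fig. 1 are computed from raw wall-clock times against a fixed reference run. A minimal sketch of that calculation with purely illustrative timings (the actual measurements are not reproduced in this report):

```python
def speedups(run_times: dict, ref_time: float) -> dict:
    """Speedup per core count, relative to a fixed reference run time."""
    return {cores: ref_time / t for cores, t in sorted(run_times.items())}

# Hypothetical wall-clock times in minutes; the reference is the
# 16-core run on the local cluster, as in Fig. 1.
local_16_core_time = 120.0
cloud_times = {16: 60.0, 32: 32.0, 64: 18.0}

for cores, s in speedups(cloud_times, local_16_core_time).items():
    print(f"{cores:3d} cores: speedup {s:.2f}x")
```

Pinning the reference to the local 16-core run, as the figure does, is what lets curves from different machines (old cluster, cloud cluster, workstation) be compared on one axis.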
Different graphical node configurations can be tested to enhance the user experience.

Case Study Authors
Wim Slagter, Ramon Diaz, Oleh Khoma, and Dennis Nagy.

Note: The illustration at the top of this report shows pressure contours in front of and behind a 6-bladed axial fan.
Team 58: Simulating Wind Tunnel Flow Around Bicycle and Rider

Being able to quickly adapt a solution to a certain environment is a key competitiveness factor in the cloud-based CAE arena.

MEET THE TEAM

USE CASE
The CAPRI to OpenFOAM Connector and the Sabalcore HPC Computing Cloud infrastructure were used to analyze the airflow around bicycle design iterations from Trek Bicycle. The goal was to establish a strong synergy between iterative CAD design, CFD analysis, and HPC cloud environments. Trek is heavily invested in engineering R&D and does extensive prototyping before settling on a final production design. CAE has been an integral part of the design process, accelerating the pace of R&D and rapidly increasing the number of design iterations. Advanced CAE capabilities have helped Trek reduce cost and keep up with the demanding product development timelines necessary to stay competitive. Automating iterative design changes in Computer Aided Design (CAD) models coupled with Computational Fluid Dynamics (CFD) simulations can significantly enhance the productivity of engineers and enable them to make better decisions in order to achieve optimal product designs. Using a cloud-based or on-demand solution to meet the HPC requirements of computationally intensive applications decreases the turnaround time in iterative design scenarios and reduces the overall cost of the design. With most of the software available today, the process of importing CAD models into CAE tools and executing a simulation workflow requires years of experience and remains, for the most part, a human-intensive task. Coupling parametric CAD systems with analysis tools to ensure reliable automation also presents significant interoperability challenges. The upfront and ongoing costs of purchasing a high performance computing system are often underestimated. As most companies' HPC needs fluctuate, it is difficult to size a system adequately.
Inevitably, this means resources will sit idle for many hours and, at other times, will be inadequate for a project's requirements. In addition, as servers age and more advanced hardware becomes available, companies may recognize a performance gap between themselves and their competitors. Beyond the price of the hardware itself, a large computer cluster demands specialized power resources, consumes vast amounts of electrical power, and requires specialized cooling systems, valuable floor space, and experienced experts to maintain and manage it. Using an HPC provider in the cloud overcomes these challenges in a cost-effective, pay-per-use model.

Experiment Development
The experiment was defined as an iterative analysis of the performance of a bike. Mio Suzuki at Trek, the end user, supplied the CAD model. The analysis was performed on two Sabalcore-provided cluster accounts. The CADNexus CFD Connector, an iterative preprocessor, was used to generate OpenFOAM cases using the SolidWorks CAD model as geometry. A custom version of the CAPRI-CAE interface, in the form of an Excel spreadsheet, was delivered to the end user by the team expert Mihai Pruna, who represented the software provider, CADNexus.

Fig. 1 - Setting up the CAD model for tessellation
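In a workflow like this, each sweep iteration reduces to deriving a flow condition and a case directory from a design parameter. A minimal, hypothetical sketch of that parameter handling, using the two test speeds from this study (10 and 15 mph); the naming scheme is illustrative, not CADNexus's actual implementation:

```python
MPH_TO_MS = 0.44704  # exact miles-per-hour to metres-per-second factor

def inlet_velocity(speed_mph: float) -> float:
    """Convert a test speed in mph to the inlet velocity in m/s."""
    return speed_mph * MPH_TO_MS

def case_name(speed_mph: float) -> str:
    """Directory name for one iteration of the sweep."""
    return f"bike_{speed_mph:g}mph"

# The two test speeds used for the bicycle model.
for mph in (10, 15):
    print(f"{case_name(mph)}: U_inlet = {inlet_velocity(mph):.3f} m/s")
```

An automation layer like the CAPRI-CAE interface then writes each computed velocity into the corresponding OpenFOAM boundary-condition dictionary and ships the case to the cluster.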
The CAPRI-CAE interface was modified to allow the deployment and execution of OpenFOAM cases on Sabalcore cluster machines. Mihai Pruna also ran test simulations and provided advice on setting up the CAD model for tessellation, that is, the generation of an STL file suitable for meshing (Figure 1). The cluster environment was set up by Kevin Van Workum of Sabalcore, allowing rapid and frequent access to the cluster accounts via SSH, as needed by the automation involved in copying and executing the OpenFOAM cases. The provided bicycle was tested at two speeds: 10 and 15 mph. The CADNexus CFD Connector was used to generate cutting planes and wake velocity linear plots. In addition, the full simulation results were archived and provided to the end user for review using ParaView, a free tool (see the figure at the top of this report). ParaView and other graphical post-processing applications can also be run directly on Sabalcore using its accelerated Remote Graphical Display capability. Thanks to the modular design of the CAPRI-powered OpenFOAM Connector and the flexible environment provided by Sabalcore Computing, integrating the software and HPC provider resources was quite simple.

CHALLENGES
General
Considering the interoperability required between several technologies, the setup went fairly smoothly. The CAPRI-CAE interface had to be modified to work with an HPC cluster: the production version was designed to work with discrete local or cloud-based Ubuntu Linux machines. For the cluster environment, some programmatically generated scripts had to be changed to send jobs to a solver queue rather than execute the OpenFOAM utilities directly. The CAD model was not a native SolidWorks project but rather a series of imported bodies, and the surfaces exhibited topological errors that were picked up by the CAPRI middleware.
Defeaturing in SolidWorks, as well as turning off certain consistency checks in CAPRI, helped alleviate these issues and produce quality tessellations.

Data Transfer Issues
Sometimes a particular OpenFOAM dictionary would fail to copy to the client, causing the OpenFOAM scripts to fail. This issue has not yet been resolved; it seems to occur only with large geometry files, although it is not the geometry file itself that fails to copy. Possible solutions include zipping up each case and sending it as a single file. Retrieving the full results can take a long time. Solutions already developed involve doing some of the post-processing on the client and retrieving only the simulation results data specified by the user, as implemented by CADNexus in the Excel-based CAPRI-CAE interface, or running ParaView directly on the cluster, as implemented by Sabalcore.

End User's Perspective
CAPRI is a fantastic tool for connecting the end user's desktop environment directly to a remote cluster. As an end user, the first challenge I faced was thoroughly understanding the formatting of the Excel sheet. As soon as I was able to identify what was wrong with my Excel entries, the rest of the workflow went relatively smoothly, exactly as specified in the template workflow. I also experienced slowness in building up and running the cases. If there were a way to increase the speed at each step (synchronizing the CAD, generating cases on the server, and running), that would enhance the user experience.

Figure 2 - Z=0 velocity color plot generated with the CADNexus Visualizer lightweight postprocessor

BENEFITS
The CAPRI-CAE Connector and the CAPRI-FOAM Connector dramatically simplify the generation of design-analysis iterations. The user has far fewer inputs to fill in; the rest are generated automatically. The end user does not need to be proficient in OpenFOAM or Linux.
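The case-zipping workaround suggested under Data Transfer Issues takes only a few lines with the standard library (the case directory path here is a hypothetical example):

```python
import shutil
from pathlib import Path

def pack_case(case_dir: str) -> str:
    """Bundle an OpenFOAM case directory into a single .zip archive,
    so it is transferred as one file instead of many small ones."""
    case = Path(case_dir)
    # The archive lands next to the case directory, e.g. bike_10mph.zip,
    # and contains the directory itself as its top-level entry.
    return shutil.make_archive(str(case), "zip",
                               root_dir=case.parent, base_dir=case.name)
```

Sending one archive sidesteps per-file copy failures such as the missing dictionary; on the cluster side it can be unpacked with `shutil.unpack_archive` before the solver scripts run.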
With respect to the HPC resource provider, the environment provided to the user by Sabalcore was already set up to run OpenFOAM, which helped speed up the process of integrating the CADNexus OpenFOAM Connector with Sabalcore's services. The only required modification to the HPC environment made by Sabalcore was to allow a greater than normal number of SSH connections from the user, which the software required. With Sabalcore's flexible environment, these changes were easily made.

CONCLUSIONS AND RECOMMENDATIONS
Among the lessons learned in the course of this project:
- Being able to quickly adapt a solution to a certain environment is a key competitiveness factor in the cloud-based CAE arena.
- A modular approach when developing your CAE solution for HPC/cloud deployment helps speed up the process of adapting your solution to a new provider.
- Selecting an HPC resource provider with a flexible environment is also vital to quickly deploying a custom CAE solution.
- From an end user perspective, we observed that each cluster provider has a unique way of bringing the cloud HPC option to the end user. Many of them are very flexible with respect to the services and interfaces they provide, based on the user's preference. When choosing a cloud cluster service, we suggest that a CAE engineer investigate and select the service that is most suitable for the organization's particular engineering needs.

Case Study Authors
Mihai Pruna, Mio Suzuki, and Kevin Van Workum.
Thank you for your interest in the free and voluntary UberCloud HPC Experiment. If you or your organization would like to participate in this Experiment to explore hands-on the end-to-end process of HPC as a Service for your business, then please register at: If you are interested in promoting your service/product at the UberCloud Exhibit, then please register at
Very special thanks to Wolfgang Gentzsch and Burak Yenier for making the UberCloud HPC Experiment possible.
Amazon Web Services Primer. William Strickland COP 6938 Fall 2012 University of Central Florida
Amazon Web Services Primer William Strickland COP 6938 Fall 2012 University of Central Florida AWS Overview Amazon Web Services (AWS) is a collection of varying remote computing provided by Amazon.com.
WhitePaper. Private Cloud Computing Essentials
Private Cloud Computing Essentials The 2X Private Cloud Computing Essentials This white paper contains a brief guide to Private Cloud Computing. Contents Introduction.... 3 About Private Cloud Computing....
Private Cloud for the Enterprise: Platform ISF
Private Cloud for the Enterprise: Platform ISF A Neovise Vendor Perspective Report 2009 Neovise, LLC. All Rights Reserved. Background Cloud computing is a model for enabling convenient, on-demand network
How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
Public, Private and Hybrid Clouds
Public, Private and Hybrid Clouds When, Why and How They are Really Used Sponsored by: Research Summary 2013 Neovise, LLC. All Rights Reserved. [i] Table of Contents Table of Contents... 1 i Executive
Simulation Platform Overview
Simulation Platform Overview Build, compute, and analyze simulations on demand www.rescale.com CASE STUDIES Companies in the aerospace and automotive industries use Rescale to run faster simulations Aerospace
Hybrid: The Next Generation Cloud Interviews Among CIOs of the Fortune 1000 and Inc. 5000
Hybrid: The Next Generation Cloud Interviews Among CIOs of the Fortune 1000 and Inc. 5000 IT Solutions Survey Wakefield Research 2 EXECUTIVE SUMMARY: Hybrid The Next Generation Cloud M ost Chief Information
Parallels Virtuozzo Containers
Parallels Virtuozzo Containers White Paper Virtual Desktop Infrastructure www.parallels.com Version 1.0 Table of Contents Table of Contents... 2 Enterprise Desktop Computing Challenges... 3 What is Virtual
Sistemi Operativi e Reti. Cloud Computing
1 Sistemi Operativi e Reti Cloud Computing Facoltà di Scienze Matematiche Fisiche e Naturali Corso di Laurea Magistrale in Informatica Osvaldo Gervasi [email protected] 2 Introduction Technologies
Terms and Conditions
- 1 - Terms and Conditions LEGAL NOTICE The Publisher has strived to be as accurate and complete as possible in the creation of this report, notwithstanding the fact that he does not warrant or represent
Using WebSphere Application Server on Amazon EC2. Speaker(s): Ed McCabe, Arthur Meloy
Using WebSphere Application Server on Amazon EC2 Speaker(s): Ed McCabe, Arthur Meloy Cloud Computing for Developers Hosted by IBM and Amazon Web Services October 1, 2009 1 Agenda WebSphere Application
Amazon Relational Database Service (RDS)
Amazon Relational Database Service (RDS) G-Cloud Service 1 1.An overview of the G-Cloud Service Arcus Global are approved to sell to the UK Public Sector as official Amazon Web Services resellers. Amazon
Chapter 19 Cloud Computing for Multimedia Services
Chapter 19 Cloud Computing for Multimedia Services 19.1 Cloud Computing Overview 19.2 Multimedia Cloud Computing 19.3 Cloud-Assisted Media Sharing 19.4 Computation Offloading for Multimedia Services 19.5
Making the Business and IT Case for Dedicated Hosting
Making the Business and IT Case for Dedicated Hosting Overview Dedicated hosting is a popular way to operate servers and devices without owning the hardware and running a private data centre. Dedicated
Fujitsu Cloud IaaS Trusted Public S5. shaping tomorrow with you
Fujitsu Cloud IaaS Trusted Public S5 shaping tomorrow with you Realizing the cloud opportunity: Fujitsu Cloud iaas trusted Public s5 All the benefits of the public cloud, with enterprise-grade performance
Using a Java Platform as a Service to Speed Development and Deployment Cycles
Using a Java Platform as a Service to Speed Development and Deployment Cycles Dan Kirsch Senior Analyst Sponsored by CloudBees Using a Java Platform as a Service to Speed Development and Deployment Cycles
PRACTICAL USE CASES BPA-AS-A-SERVICE: The value of BPA
BPA-AS-A-SERVICE: PRACTICAL USE CASES How social collaboration and cloud computing are changing process improvement TABLE OF CONTENTS 1 Introduction 1 The value of BPA 2 Social collaboration 3 Moving to
Migration Scenario: Migrating Batch Processes to the AWS Cloud
Migration Scenario: Migrating Batch Processes to the AWS Cloud Produce Ingest Process Store Manage Distribute Asset Creation Data Ingestor Metadata Ingestor (Manual) Transcoder Encoder Asset Store Catalog
Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework
Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework Many corporations and Independent Software Vendors considering cloud computing adoption face a similar challenge: how should
When Does Colocation Become Competitive With The Public Cloud? WHITE PAPER SEPTEMBER 2014
When Does Colocation Become Competitive With The Public Cloud? WHITE PAPER SEPTEMBER 2014 Table of Contents Executive Summary... 2 Case Study: Amazon Ec2 Vs In-House Private Cloud... 3 Aim... 3 Participants...
Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks
WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance
CLOUD PERFORMANCE TESTING - KEY CONSIDERATIONS (COMPLETE ANALYSIS USING RETAIL APPLICATION TEST DATA)
CLOUD PERFORMANCE TESTING - KEY CONSIDERATIONS (COMPLETE ANALYSIS USING RETAIL APPLICATION TEST DATA) Abhijeet Padwal Performance engineering group Persistent Systems, Pune email: [email protected]
Performance Test Process
A white Success The performance testing helped the client identify and resolve performance bottlenecks which otherwise crippled the business. The ability to support 500 concurrent users was a performance
Building Blocks of the Private Cloud
www.cloudtp.com Building Blocks of the Private Cloud Private clouds are exactly what they sound like. Your own instance of SaaS, PaaS, or IaaS that exists in your own data center, all tucked away, protected
When Does Colocation Become Competitive With The Public Cloud?
When Does Colocation Become Competitive With The Public Cloud? PLEXXI WHITE PAPER Affinity Networking for Data Centers and Clouds Table of Contents EXECUTIVE SUMMARY... 2 CASE STUDY: AMAZON EC2 vs IN-HOUSE
HPC Deployment of OpenFOAM in an Industrial Setting
HPC Deployment of OpenFOAM in an Industrial Setting Hrvoje Jasak [email protected] Wikki Ltd, United Kingdom PRACE Seminar: Industrial Usage of HPC Stockholm, Sweden, 28-29 March 2011 HPC Deployment
RightScale mycloud with Eucalyptus
Swiftly Deploy Private and Hybrid Clouds with a Single Pane of Glass View into Cloud Infrastructure Enable Fast, Easy, and Robust Cloud Computing with RightScale and Eucalyptus Overview As organizations
Estimating the Cost of a GIS in the Amazon Cloud. An Esri White Paper August 2012
Estimating the Cost of a GIS in the Amazon Cloud An Esri White Paper August 2012 Copyright 2012 Esri All rights reserved. Printed in the United States of America. The information contained in this document
Benchmarking Large Scale Cloud Computing in Asia Pacific
2013 19th IEEE International Conference on Parallel and Distributed Systems ing Large Scale Cloud Computing in Asia Pacific Amalina Mohamad Sabri 1, Suresh Reuben Balakrishnan 1, Sun Veer Moolye 1, Chung
With Red Hat Enterprise Virtualization, you can: Take advantage of existing people skills and investments
RED HAT ENTERPRISE VIRTUALIZATION DATASHEET RED HAT ENTERPRISE VIRTUALIZATION AT A GLANCE Provides a complete end-toend enterprise virtualization solution for servers and desktop Provides an on-ramp to
Virtual Desktop Infrastructure Planning Overview
WHITEPAPER Virtual Desktop Infrastructure Planning Overview Contents What is Virtual Desktop Infrastructure?...2 Physical Corporate PCs. Where s the Beef?...3 The Benefits of VDI...4 Planning for VDI...5
CLOUD COMPUTING IN HIGHER EDUCATION
Mr Dinesh G Umale Saraswati College,Shegaon (Department of MCA) CLOUD COMPUTING IN HIGHER EDUCATION Abstract Technology has grown rapidly with scientific advancement over the world in recent decades. Therefore,
The Virtualization Practice
The Virtualization Practice White Paper: Managing Applications in Docker Containers Bernd Harzog Analyst Virtualization and Cloud Performance Management October 2014 Abstract Docker has captured the attention
Deploy Remote Desktop Gateway on the AWS Cloud
Deploy Remote Desktop Gateway on the AWS Cloud Mike Pfeiffer April 2014 Last updated: May 2015 (revisions) Table of Contents Abstract... 3 Before You Get Started... 3 Three Ways to Use this Guide... 4
ABAQUS High Performance Computing Environment at Nokia
ABAQUS High Performance Computing Environment at Nokia Juha M. Korpela Nokia Corporation Abstract: The new commodity high performance computing (HPC) hardware together with the recent ABAQUS performance
