The UberCloud HPC Experiment: Compendium of Case Studies
Digital manufacturing technology and convenient access to High Performance Computing (HPC) in industry R&D are essential to increase the quality of our products and the competitiveness of our companies. Progress can only be achieved by educating our engineers, especially those in the "missing middle," and by making HPC easier to access and use for everyone who can benefit from this advanced technology.

The UberCloud HPC Experiment actively promotes the wider adoption of digital manufacturing technology. It is an example of a grass-roots effort to foster collaboration among engineers, HPC experts, and service providers to address challenges at scale. The UberCloud HPC Experiment started in mid-2012 with the aim of exploring the end-to-end process employed by digital manufacturing engineers to access and use remote computing resources in HPC centers and in the cloud. In the meantime, the Experiment has achieved the participation of 500 organizations and individuals from 48 countries. Over 80 teams have been involved so far. Each team consists of an industry end-user and a software provider; the organizers match them with a well-suited resource provider and an HPC expert. Together, the team members work on the end-user's application: defining the requirements, implementing the application on the remote HPC system, running and monitoring the job, getting the results back to the end-user, and writing a case study.

Intel decided to sponsor this Compendium of 25 case studies, selected from the first 60 teams, to raise awareness in the digital manufacturing community about the benefits and best practices of using remote HPC capabilities. This document is an invaluable resource for engineers, managers, and executives who believe in the strategic importance of this technology for their organizations.
Very special thanks to Wolfgang Gentzsch and Burak Yenier for making the UberCloud HPC Experiment possible. This UberCloud HPC Compendium of Case Studies has been sponsored by Intel and produced in conjunction with Tabor Communications Custom Publishing, which includes HPCwire, HPC in the Cloud, and Digital Manufacturing Report. If you are interested in participating in this experiment, either actively as a team member or passively as an observer, please register at the experiment's website.

A Tabor Communications, Inc. (TCI) publication. This report cannot be duplicated without prior permission from TCI. While every effort is made to assure accuracy of the contents, we do not assume liability for that accuracy, its presentation, or for any opinions presented.
Table of Contents

- Welcome Note
- Finding the Missing Middle
- Executive Summary

Case Studies:

- Team 1: Heavy Duty Abaqus Structural Analysis Using HPC in the Cloud
- Team 2: Simulation of a Multi-resonant Antenna System Using CST MICROWAVE STUDIO
- Team 4: Simulation of Jet Mixing in the Supersonic Flow with Shock
- Team 5: Two-phase Flow Simulation of a Separation Column
- Team 8: Flash Dryer Simulation with Hot Gas Used to Evaporate Water from a Solid
- Team 9: Simulation of Flow in Irrigation Systems to Improve Product Reliability
- Team 14: Electromagnetic Radiation and Dosimetry for High Resolution Human Body Phantoms and a Mobile Phone Antenna Inside a Car as Radiation Source
- Team 15: Weather Research and Forecasting on Remote Computing Resources
- Team 19: Parallel Solver of Incompressible, 2D and 3D Navier-Stokes Equations, Using the Finite Volume Method
- Team 20: NPB2.4 Benchmarks and Turbo-machinery Application on Amazon EC2
- Team 22: Optimization Study of Side Door Intrusion Bars
- Team 25: Simulation of Spatial Hearing
- Team 26: Development of Stents for a Narrowed Artery
- Team 30: Heat Transfer Use Case
- Team 34: Analysis of Vertical and Horizontal Wind Turbine
- Team 36: Advanced Combustion Modeling for Diesel Engines
- Team 40: Simulation of Spatial Hearing (Round 2)
- Team 44: CFD Simulation of Drifting Snow
- Team 46: CAE Simulation of Water Flow Around a Ship Hull
- Team 47: Heavy Duty Abaqus Structural Analysis Using HPC in the Cloud (Round 2)
- Team 52: High-Resolution Computer Simulations of Blow-off in Combustion Systems
- Team 53: Understanding Fluid Flow in Microchannels
- Team 54: Analysis of a Pool in a Desalinization Plant
- Team 56: Simulating Radial and Axial Fan Performance
- Team 58: Simulating Wind Tunnel Flow Around Bicycle and Rider
Welcome!

The UberCloud HPC Experiment started one year ago when Burak sent an email to Wolfgang with a seemingly simple question: "Hi Wolfgang, I am Burak. Why is cloud adoption in high performance computing so slow, compared to the rapid adoption of cloud computing in our enterprise community?" After several discussions and Skype conferences that elaborated on the fundamental differences between enterprise and high performance computing, Wolfgang got on a plane to San Francisco for a face-to-face meeting with Burak. Four days of long discussions and many cups of tea later, a long list of challenges, hurdles, and solutions for HPC in the cloud covered the whiteboard in Burak's office. The idea of the experiment was born. Later, after more than 30 HPC cloud providers had joined, we called it the UberCloud HPC Experiment.

We found that, in particular, small- and medium-sized enterprises in digital manufacturing would strongly benefit from HPC in the Cloud (or HPC as a Service). The major benefits they would realize by having access to additional remote compute resources are: the agility gained by speeding up product design cycles through shorter simulation run times; the superior quality achieved by simulating more sophisticated geometries or physics; and the discovery of the best product design by running many more iterations. These are benefits that increase a company's competitiveness. Tangible benefits like these make HPC, and more specifically HPC as a Service, quite attractive.

But how far away are we from an ideal HPC cloud model? At this point, we don't know. However, in the course of this experiment, as we followed each team closely and monitored its challenges and progress, we gained excellent insight into these roadblocks and how our teams have tackled them.
We are proud to present this Compendium of 25 selected use cases in digital manufacturing, with a focus on computational fluid dynamics and material analysis. It documents the results of over six months of hard work by the participating teams: their findings, challenges, lessons learned, and recommendations. We were amazed by how engaged all participants were, despite the fact that this was not their day job. Their inquiring minds, the chance to collaborate with some of the brightest people and companies in the world, and the opportunity to tackle some of today's greatest challenges associated with accessing remote computing resources were certainly their strongest motivators.

We want to thank all participants for their continuous commitment and for their voluntary contribution to their individual teams, to the Experiment, and thus to our whole HPC and digital manufacturing community. We want to thank John Kirkley from Kirkley Communications for his support with editing these use cases, and our media sponsor Tabor Communications for this publication. Last but not least, we are deeply grateful to our sponsor, Intel, who made this Compendium possible.

Enjoy reading!

Wolfgang Gentzsch and Burak Yenier
Neutraubling and Los Altos, June 1, 2013
Finding the Missing Middle

So far the application of high performance computing (HPC) to the manufacturing sector hasn't lived up to expectations. Despite an attractive potential payoff, companies have been slow to take full advantage of today's advanced HPC-based technologies such as modeling, simulation, and analysis. Fortunately, the situation is changing. Recently a number of important initiatives have gotten underway designed to bring the benefits of HPC to small- to medium-sized manufacturers (SMMs), the so-called "missing middle." For example, in the United States the National Center for Manufacturing Sciences (NCMS) is launching a network of Predictive Information Centers to bring the technology to these smaller manufacturers. The NCMS initiative is designed to help SMMs apply HPC-based modeling and simulation (M&S) to solve their manufacturing problems and be more competitive in the global marketplace.

A Unique Initiative

The UberCloud HPC Experiment is one of those initiatives, but with a difference. It's a grass-roots effort, the result of the vision of two working HPC professionals, Wolfgang Gentzsch and Burak Yenier. It's also international in scope, involving more than 500 organizations and individuals from around the globe. So far more than 80 teams, each consisting of an industry end-user (typically an SMM), a resource provider, a software provider, and an HPC expert, have explored the challenges and benefits associated with accessing and running engineering applications on cloud-provisioned HPC resources. This team approach is unique. Its success is a tribute to the organizers, who not only conceived the idea but also play matchmaker and mentor, bringing together winning combinations of team members, often from widely separated geographic locations. In Rounds 1 and 2, reported in this document, the teams have enthusiastically addressed these challenges at scale.
In the process they have identified and solved major problems that have limited the adoption of HPC solutions by the missing middle: those hundreds of thousands of small- to medium-sized companies worldwide that have yet to realize the full benefits of HPC. As you read through the case studies in this Compendium, you may feel, as Yogi Berra once famously said, that "it's déjà vu all over again." Among the 25 reports you will unquestionably find scenarios that resonate with your own situation. You will benefit from the candid descriptions of problems encountered, problems solved, and lessons learned. These situations, many involving computational fluid dynamics, finite element analysis, and multiphysics, are no longer the exception in the digital manufacturing universe; they have become the rule. These reports are down-to-earth examples that speak directly to what you are trying to accomplish within your own organizations.

Intel Involvement

Why is Intel supporting this Compendium and showing such an interest in the UberCloud HPC Experiment? For one thing, it is clear that the potential market for HPC worldwide is much larger than what we see today. And the sector has its problems. Recently the number of participants in the HPC community has been somewhat stagnant. Without an influx of new talent across the board, basic skill sets are being lost. We need to include more participants to ensure the sector's vibrancy over time. Initiatives like the UberCloud HPC Experiment do just that. By addressing the barriers confronting the missing middle, we are finding that we can indeed broaden the adoption of HPC within this underserved market segment. It's a win-win situation all around: the SMMs gain new advanced capabilities and competitiveness; the HPC ecosystem expands; and companies like Intel and others that support HPC are part of a robust and growing business environment.
The UberCloud HPC Experiment fuels innovation, not just by end-users who are using HPC tools to create new solutions to their manufacturing problems, but also on the part of the hardware and software vendors and the resource providers. The initiative creates a virtuous cycle leading to the democratization of HPC: it's making M&S available to the masses. The initiative satisfies the four strategies set forth by NCMS on what's needed to revitalize manufacturing through the application of digital manufacturing. First is to educate: providing a low-risk environment that allows end users to learn about HPC and M&S. Next is to entice: clarifying the value of advanced M&S through the use of HPC via entry-level evaluative solutions. Engage and elevate take end users to the next levels of digital manufacturing as they become proficient in the use of HPC, either through cloud services or by developing in-house capabilities.

Lessons Learned

One lesson that's become very clear as the UberCloud HPC Experiment continues: one size does not fit all. Manufacturing has many facets, and virtually every solution reported in this Compendium had to be tailored to the individual situation. As one team commented in their report, "From an end user perspective, we observed that each cluster provider has a unique way of bringing the cloud HPC option to the end user." Other teams ran into issues such as scalability, licensing, and unexpected fees for running their applications in the cloud. Despite this diversity, there are a number of common threads running through all 25 reports that provide invaluable information about what to anticipate when running HPC-based applications and how to avoid or solve the speed bumps that inevitably arise. The applications themselves are not the problem; it's a question of understanding how the capabilities inherent in, say, a CFD or FEA solver can meet your needs.
This is where the team approach shines: by bringing to bear a wide range of experience from all four categories of team members, the chances of finding a solution are greatly enhanced. As the saying goes, "To compete, you must compute." The sooner you become familiar with and start using this technology, the sooner you can compete more vigorously and broaden your marketplace. You can not only make your existing products more effective and desirable, but also create new products that are only possible with the application of HPC technology. The competitive landscape is shifting. You need to ask: do I want to remain in the old world of manufacturing or embrace the new? Reading these 25 case studies will not only show you what's possible, but also how to kick off the activities that will allow you to take a quantum leap in competitiveness. The UberCloud HPC Experiment, this energetic grass-roots movement to bring HPC to the missing middle, continues. You just might want to become a part of it.

Dr. Stephen R. Wheat
General Manager, High Performance Computing, Intel Corp.
Executive Summary

This is an extraordinary document. It is a collection of selected case studies written by the participants in Rounds 1 and 2 of the ambitious UberCloud HPC Experiment. The goal of the HPC Experiment is to explore the end-to-end processes of accessing remote computing resources in HPC centers and HPC clouds. The project is the brainchild of Wolfgang Gentzsch and Burak Yenier, and had its inception in July 2012. What makes this collection so unusual is that, without exception, the 25 teams reporting their experiences are totally frank and open. They share information about their failures as well as successes, and are more than willing to discuss in detail what they learned about the ins and outs of working with HPC in the cloud. When Round 1 wrapped up in October 2012, 160 participating organizations and individuals from 25 countries, working together in 25 widely dispersed but tightly knit teams, had been involved. With Round 2, completed in March 2013, another 35 teams and 360 individuals, some of them veterans of Round 1, took up the challenge. (As of this writing, Round 3 is underway, with almost 500 participating organizations and another 25 enthusiastic teams.)
The Participants

Each HPC Experiment team is made up of four types of participants:

- Industry end-users, many of them small- to medium-sized manufacturers, who stand to realize substantial benefits from applying HPC to their manufacturing processes
- Computing and storage resource providers, with particular emphasis on those offering HPC in the cloud
- Software providers, ranging from ISVs to open source and government software in the public domain
- HPC and cloud computing experts helping the teams, an essential ingredient

In addition to organizing the teams, Gentzsch and Yenier, along with four dedicated team mentors from the HPC Experiment core team (Dennis Nagy, Gregory Shirin, Margarette Kesselman, and Sharan Kalwani), provided guidance and mentoring whenever and wherever it was needed to help the teams navigate the sometimes rocky road to running applications on remote HPC services.

CFD a Hit

By far, computational fluid dynamics (CFD) was the main application run in the cloud by the Round 1 and Round 2 teams: 11 of the 25 teams presented here concentrated their efforts in this area. The other areas of interest included finite element analysis (FEA), multiphysics, and a variety of miscellaneous applications, including biotech. As you'll read in the reports, quite a few teams encountered major speed bumps during the three months spent on their projects. Many of these problems were solved, sometimes with simple fixes, in other cases with ingenious solutions. Some proved difficult, others intractable.

For example, the pay-per-use billing feature of cloud computing solves a major end-user dilemma: whether or not to make the considerable investment needed to build in-house computational resources, which includes not just the HPC hardware, but also the infrastructure and human resources necessary to support the company's foray into high performance computing. It seems like a no-brainer: pay only for what you need and leave all the rest to your cloud resource provider.
But as several of the Experiment's teams discovered, unless you pay close attention to the costs you're incurring in the cloud, the price tag associated with remote computing can quickly mount up.

Other Speed Bumps

In addition to unpredictable costs associated with pay-per-use billing, incompatible software licensing models are a major headache. Fortunately many of the software vendors, especially those participating in the Experiment, are working on creating more flexible, compatible licensing models, including on-demand licensing in the cloud. Other teams ran into problems of scalability when attempting to run jobs on multiple cores. Yet another group found that the main difficulty they encountered was the development of interactive visualization tools to work with simulation data stored in the cloud. Overall, the challenges were many and varied and, in most cases, they were solved. However, in a few instances, despite a team's valiant efforts, the experiment had to be abandoned or postponed for a future round. On balance, though, most of the teams, with the help of their incumbent HPC/cloud expert, worked their way to a solution and describe in helpful detail the lessons learned in the process.

Benefits of HPC in the Cloud

In addition to recounting the challenges the teams confronted, each report contains a benefits section. As you read these results, it quickly becomes clear why the HPC Experiment has proven so popular and why many of the Round 1 and Round 2 teams have continued their explorations into Round 3. The teams were not the only ones moving up the learning curve. In the course of the experiment, the organizers Gentzsch and Yenier have learned, and are continuing to learn, their own set of lessons. As a result, they are continually modifying how the HPC Experiment is conducted to make the process run even more smoothly and the rewards even greater for the participants. This compendium is a treasure trove of information.
We recommend you take your time reading through the individual reports; there is much of value to be gained. Each team seems to have run into and solved many problems that are sometimes ubiquitous and other times unique to their company's situation and industry. Either way, the information is invaluable. This report underscores the fact that HPC in the cloud is a viable and growing solution, especially for small- to medium-sized manufacturers looking to leverage the technology to speed up time to market, cut costs, improve quality, and be more competitive in the global marketplace. The HPC Experiment is helping to make this a reality for companies both large and small that wish to make the most of what high performance computing has to offer.

John Kirkley, Co-Editor, Kirkley Communications, June 5, 2013
Team 1: Heavy Duty Abaqus Structural Analysis Using HPC in the Cloud

USE CASE

Abaqus/Explicit and Abaqus/Standard are the major applications for this project; they provide the driving force behind using the HPC cloud to address sudden demand for compute. The applications in this experiment range from solving anchorage tensile capacity and steel and wood connector load capacity, to special moment frame cyclic pushover analysis. The existing HPC cluster at Simpson Strong-Tie is modest, consisting of about 32 cores of Intel x86-based gear. Therefore, when emergencies arise, the need for cloud bursting is critical. Also challenging is the ability to handle sudden large data transfers, as well as the need to perform visualization to ensure that the design simulation is proceeding along correct lines. The end-to-end process began with widely dispersed demand in the Pacific Time Zone, expertise at the other end of the US, and resources in the middle. Network bandwidth and latency were expected to play a major role, since they impact the workflow and the user's perception of the ability to access cloud HPC capabilities.

Here is an example of the workflow:

1. Pre-processing on the end user's local workstation to prepare the CAE model.
2. The Abaqus input file is transferred to the HPC cloud data staging area using a secured FTP process.
3. The end user submits the job through the HPC cloud provider's (Nimbix.net) web portal.
4. Once the job finishes, the end user receives a notification email. The result files can be transferred back to the end user's workstation for post-processing, or the post-processing can be done using a remote desktop tool like HP RGS on the HPC provider's visualization node.

Typical data transfer sizes (upstream) were modest, ranging in the few hundred megabytes.
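The four workflow steps above can be sketched in code. This is a minimal illustration only: the function names, paths, and job identifiers below are hypothetical placeholders, not Nimbix's actual portal API.

```python
# Minimal sketch of the four-step workflow. The transfer and portal calls are
# stubs; the staging path and job id are hypothetical, not the provider's real
# interfaces.

def preprocess(model_name: str) -> str:
    """Step 1: prepare the Abaqus input file on the local workstation."""
    return f"{model_name}.inp"

def sftp_upload(input_file: str, staging_area: str) -> str:
    """Step 2: stub for the secured FTP transfer to the cloud staging area."""
    return f"{staging_area}/{input_file}"

def submit_job(staged_file: str) -> str:
    """Step 3: stub for job submission through the provider's web portal."""
    return "job-0001"

def collect_results(job_id: str) -> list[str]:
    """Step 4: stub for retrieving result files after the notification."""
    return [f"{job_id}.odb", f"{job_id}.dat"]

staged = sftp_upload(preprocess("anchorage_tensile"), "/staging/user1")
results = collect_results(submit_job(staged))
print(results)
```

The point of the sketch is the shape of the pipeline, not its internals: each step depends only on the previous step's output, which is what makes the local/remote split workable.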
The large number of output files (anywhere from 5 to 20) and output sizes of a few gigabytes characterized the data domain in this use case.

CHALLENGES

Keeping everyone's time demands in mind, we set up a weekly schedule and kept it very simple. We first
identified the HPC simulation jobs and ensured that they were representative of a typical workload. The cloud-based infrastructure at Nimbix was the first challenge. For MPI parallel jobs, the Abaqus application needed a fast interconnect such as InfiniBand, which was not available. However, this was solved with fat nodes (with scale readily available, as is true in the cloud): the large number of cores and large memory allowed the job to run close to local cluster performance and avoided the need for a very fast interconnect. As this cluster was just a sandbox for testing out the cloud HPC workflow, the actual interconnect performance of this 12-core cluster was not a concern.

The second challenge was to address the need for simple and secure file storage and transfer. Surprisingly, this was accomplished very quickly using GLOBUS technology. This speaks volumes to the fact that these days cloud-based storage is mature and ready for prime-time HPC, especially in the CAE arena.

The third challenge was how to push the limits and stream several tens of jobs simultaneously to the remote HPC cloud resource. This would provide solid evidence that bursting was indeed feasible. To the whole team's surprise, it worked admirably and had no adverse impact whatsoever.

The fourth and final challenge was perhaps the most critical: end user perception and acceptance of the cloud as a smooth part of the workflow. Here remote visualization was necessary to see whether the simulation results (left remotely in the cloud) could be viewed and manipulated as if they were local on the end user's desktop. After several iterations and brainstorming sessions, HP's RGS was chosen to help deliver this capability.
RGS was selected because it:

- Is application neutral
- Has a clean separation between the (free) client and the server component
- Provides tuning parameters that can help overcome bandwidth issues

Several settings were tested and carefully iterated, such as image update rate, bandwidth selection, and codecs. A screen shot of the final, user-accepted remote visualization settings is shown below.

BENEFITS

Clearly one of the first things established was that the HPC cloud model can indeed be made to work. What is required is a well-defined use case, which will vary by industry vertical. It is also important to have very capable and experienced participants: the ISV, the end user, and the providers of the entire solution. This is distinct from the requirement to spin everything up as a first-time instance, since practically everyone's infrastructure differs ever so slightly, mandating the need for good service delivery setups.

CONCLUSIONS AND RECOMMENDATIONS

At the conclusion of the experiment, a few key factors emerged. Anyone who wishes to go down this road should heed these four lessons:

1. Result file transfers are a major source of concern, since most CAE result files can easily run over several gigabytes. Requirements depend on the individual use case (urgency, sizes, etc.). For this CAE use case, a minimum of 2-4 MB/sec sustained, delivered bandwidth is necessary to be considered an acceptable alternative to local cluster performance.
2. The same applies to remote visualization. In this case, 4 MB/sec is the threshold at which a CAE analyst can perform work without being hindered by bandwidth limitations. Latency is also a key concern, but in this case it was not an issue when connecting the US East and West coasts to the Texas-based cloud facility.
3. In addition to the cloud service provider, a network-savvy ISP is perhaps a necessary part of the team infrastructure in order to deliver robust, production-quality HPC cloud services.
Everyone's mileage will vary; an ROI analysis is recommended to help uncover the necessary SLA requirements and the costs associated with connectivity to and from the cloud.

4. Remote visualization provides a convenient collaboration platform for a CAE analyst to access the analysis results wherever they are needed, but it requires a secure, behind-the-firewall remote workspace.

Fig. 1 - Cloud Infrastructure: Nimbix Accelerated Compute Cloud

Case Study Authors: Frank Ding, Matt Dunbar, Steve Hebert, Rob Sherrard, and Sharan Kalwani
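The bandwidth lessons above are easy to sanity-check with a quick calculation. The file sizes here are illustrative, in line with the "several gigabytes" mentioned in lesson 1:

```python
# How long does it take to move a multi-gigabyte CAE result file at the
# 2-4 MB/sec the team found to be the minimum acceptable sustained bandwidth?
# (Illustrative figures, not measurements from the experiment.)

def transfer_minutes(file_gb: float, mb_per_sec: float) -> float:
    """Transfer time in minutes for `file_gb` gigabytes at `mb_per_sec` MB/s."""
    return (file_gb * 1024) / mb_per_sec / 60

# A 4 GB result file at the 4 MB/sec threshold takes roughly 17 minutes;
# at 2 MB/sec it doubles to about 34 minutes.
for size_gb in (1, 4):
    for rate in (2, 4):
        print(f"{size_gb} GB at {rate} MB/s: {transfer_minutes(size_gb, rate):.0f} min")
```

At these rates a result-file round trip stays within a coffee break, which is why the team treated 2-4 MB/sec as the floor for an acceptable alternative to local cluster performance.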
Team 2: Simulation of a Multi-resonant Antenna System Using CST MICROWAVE STUDIO

"The cloud is normally advertised as enabling agility and elasticity, but in several cases it was our own project team that was required to be agile and nimble, simply to react to the rapid rate of change within the AWS environment."

USE CASE

The end user uses CAE for virtual prototyping and design optimization of sensors and antenna systems used in NMR spectrometers. Advances in hardware and software have enabled the end-user to simulate the complete RF portion of the antenna system. Simulation of the full system is still computationally intensive, although there are parallelization and scale-out techniques that can be applied depending on the particular solver method being used. The end-user has a highly tuned and over-clocked local HPC cluster. Benchmarks suggest that for certain solvers the local HPC cluster nodes are roughly 2x faster than the largest of the cloud-based Amazon Web Services resources used for this experiment. However, the local HPC cluster averages 70% utilization at all times, and the larger research-oriented simulations the end-user was interested in could not be run during normal business hours without impacting production engineering efforts. Remote cloud-based HPC resources offered the end-user the ability to burst out of the local HPC system and onto the cloud. This was facilitated both by the architecture of the commercial CAE software and by the parallelizable nature of many of the solver methods.

The CST software offers multiple methods to accelerate simulation runs. On the node level (single machine), multithreading and GPGPU computing (for a subset of the available solvers) can be used to accelerate simulations still small enough to be handled by a single machine. If a simulation project needs multiple independent simulation runs (e.g.
in a parameter sweep or for the calculation of different frequency points) that are independent of each other, these simulations can be sent to different machines to execute in parallel. This is done by the CST Distributed Computing System, which takes care of all the data transfer operations necessary to perform this parallel execution. In addition, very large models can be handled by MPI parallelization using a domain decomposition approach.

End-user effort: more than 25 hours for setup, problem resolution, and benchmarking; more than 100 hours for software-related issues due to large simulation projects, bugs, and post-processing issues that would also have occurred for purely local work.
ISV effort: approximately 2-3 working days for creating license files, assembling documentation, following discussions, and debugging problems with models in the setup and with hardware resources.

PROCESS

1. Define the ideal end-user experiment.
2. Make initial contacts with the software provider (CST) and the resource provider (AWS).
3. Solicit feedback from the software provider on recommended cloud bursting methods; secure licenses.
4. Propose Hybrid Windows/Linux Cloud Architecture #1 (EU based).
5. Abandon Cloud Architecture #1. The user prefers to keep simulation input data within EU-protected regions; however, AWS resources we required did not yet exist in EU AWS regions. The end-user modifies the experiment to use synthetic simulation data, which enables the use of US-based cloud systems.
6. Propose Hybrid Windows/Linux Cloud Architecture #2 (US based) and implement it at small scale for testing.
7. Abandon Cloud Architecture #2. Heavily secured virtual private cloud (VPC) resource segregation, front-ended by an internet-accessible VPN gateway, looked good on paper; however, AWS did not have GPU nodes (or the large cc2.* instance types) within VPC at the time, and the commercial CAE software had functionality issues when forced to deal with NAT translation via a VPN gateway server.
8. Propose Hybrid Windows/Linux Cloud Architecture #3 and implement it at small scale for testing.
9. The third design pattern works well; the user begins to scale up simulation size.
10. Amazon announces support for GPU nodes in the EU region and for GPU nodes within VPC environments; the end-user, now more familiar with AWS, begins experimenting with the Amazon Spot Market to reduce hourly operating costs by a very significant amount.
11. Hybrid Windows/Linux Cloud Architecture #3 is slightly modified. The license server remains in the U.S. because moving the server would have required a new license file from the software provider.
However, all solver and simulation systems are relocated to the Amazon EU region in Ireland for performance reasons. The end-user switches all simulation work to inexpensively sourced nodes from the Amazon Spot Market.

12. The modified Design #3, in which solver/simulation systems run on AWS Spot Market instances in Ireland while a small license server remains in the U.S., reflects the final design.

As far as we understood, the VPN solution that did not work at the beginning of the project would actually have worked at the end of the project period because of changes within AWS. In addition, the preferred heavily secured solution would have provided fixed MAC addresses, thus avoiding having to run a license instance all the time.

Front-end and two GPU solvers in action

CHALLENGES

Geographic constraints on data: The end-user had real simulation and design data that should not leave the EU.

Unequal availability of AWS resources between regions: At the start of the experiment, some of the preferred EC2 instance types (including GPU nodes) were not yet available in the EU region (Ireland). This disparity was fixed by Amazon during the course of the experiment, and by the end of the experiment we had migrated the majority of our simulation systems back to Ireland.

Performance of Remote Desktop Protocol: The CAE software used in this experiment relies on Microsoft Windows for experiment design, submission, and visualization. Using RDP to access remote Windows systems was very difficult for the end-user, especially when the Windows systems were operating in the U.S.

CAE software and Network Address Translation (NAT): The simulation software assumes direct connections between the participating client, solver, and front-end systems. The cloud architecture was redesigned so that essential systems were no longer isolated within secured VPC network zones.
Bandwidth between Linux solvers and Windows front-end
The technical requirements of the CAE software allow the Windows components to run on relatively small AWS instance types. However, when large simulations are underway, a tremendous volume of data flows between the Windows system and the Linux solver nodes. This was a significant performance bottleneck throughout the experiment. The project team ended up running Windows on much larger AWS instance types to gain access to 10GbE network connectivity options.

Node-locked software licenses
The CAE software license breaks if the license server node changes its network hardware (MAC address). The project team ended up leveraging multiple AWS services (VPC, ENI, Elastic IP) in order to operate a persistent, reliable license-serving framework. We had to leave the license server in the US and let it run 24/7 because it would have lost its MAC address upon reboot. Only in the first setup did it have a fixed MAC and IP.

Spanning Amazon regions
It is easy in theory to talk
about cloud architectures that span multiple geographic regions. It is much harder to implement this for real. Our HPC resources switched between US- and EU-based Amazon facilities several times during the lifespan of the project. Our project required the creation, management and maintenance of multiple EU- and US-specific SSH keys, server images (AMIs) and EBS disk volumes. Managing and maintaining the capability to operate in the EU or US (or both) required significant effort and investment.

BENEFITS

End-User
Confirmation that a full system simulation is indeed possible even though there are heavy constraints, mostly due to the CAE software. Model setup, meshing and post-processing are not optimal and require huge efforts in terms of manpower and CPU time.
Confirmation that a full system simulation can reproduce certain problems occurring in real devices and can help to solve those issues.
Realization that the financial investment for the additional computational resources needed for cloud bursting approaches is reasonable.
Realization that internet connection speed was the major bottleneck for a cloud bursting approach and also very limiting for RDP work.

Software Provider
Confirmation that the software can be set up and run within a cloud environment and also, in principle, using a cloud bursting approach (see the comments regarding network speed).
Some very valuable knowledge was gained on how to set up an elastic cluster in the cloud using best practices regarding security, stability and price in the Amazon EC2 environment.
Experience with the limitations and pitfalls specific to the Amazon EC2 configuration (e.g. availability of resources in different regions, the VPC needed to preserve MAC addresses for the licensing setup, network speed, etc.).
Experience with the restrictions imposed by a company's IT department when it comes to the integration of cloud resources (specific to the cloud bursting approach).
HPC Expert
The chance to use Windows-based HPC systems in the cloud in a significant way was very helpful.
New appreciation for the difficulties in spanning US/EU regions within Amazon Web Services.

CONCLUSIONS AND RECOMMENDATIONS

End-User
Internet transfer speed is the major bottleneck for serious integration of cloud computing resources into the end-user's design flow and local HPC systems. Internet transfer speed is also a limiting factor for remote visualization.
Security and data protection issues, as well as fears on the part of the end-user's IT department, create a huge administrative limitation for the integration of cloud-based resources.
Confirmation that a 10 GbE network can considerably speed up certain simulation tasks compared to the local cluster's GbE network. The local cluster has in the meantime been upgraded to an InfiniBand network.

HPC Expert
The rapid evolution of our provider's capabilities constantly forced the project team to re-architect the HPC system design. The cloud is normally advertised as enabling agility and elasticity, but in several cases it was our own project team that had to be agile and nimble simply to react to the rapid rate of change within the AWS environment.
The AWS Spot Market has huge potential for HPC in the cloud. The price difference is extremely compelling, and the relative stability of spot prices over time makes HPC usage worth pursuing.
Our design pattern for the commercial license server is potentially a useful best practice. By leveraging custom, persistent MAC addresses via Elastic Network Interfaces (ENI) within Amazon VPC, we were able to build a license server that would not break should the underlying hardware characteristics change (common in the cloud).
In a real-world effort we would not have made as much use of the hourly on-demand server instance types. Outside of this experiment it is clear that a mixture of AWS Reserved Instances (license server, Windows front-end, etc.)
and AWS Spot Market instances (solvers and compute nodes) would deliver the most power at the lowest cost.
In a real-world effort we would not have done all of our software installation, configuration management and patching by hand. These tasks would have been automated and orchestrated by a proper cloud-aware configuration management system such as Opscode Chef.

Software Provider
Creating a working setup in the cloud is quite complex and requires considerable IT/Amazon EC2 expertise. Supporting such a setup can be quite challenging for an ISV as well as for an end-user. Tools providing simplified access to EC2 would be helpful.

Case Study Authors Felix Wolfheimer and Chris Dagdigian
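The reserved-plus-spot mixture recommended above can be illustrated with a rough monthly cost model. All hourly rates, instance counts and usage hours below are hypothetical placeholders, not actual AWS prices:

```python
# Rough cost comparison: all-on-demand vs. reserved (persistent services)
# plus spot (solver nodes). Numbers are hypothetical, for illustration only.

HOURS_PER_MONTH = 730

def monthly_cost(on_demand_rate, spot_rate, reserved_rate,
                 n_persistent, n_compute, compute_hours):
    """Persistent services (license server, Windows front-end) run 24/7;
    compute nodes (solvers) run only for compute_hours per month."""
    all_on_demand = (n_persistent * on_demand_rate * HOURS_PER_MONTH
                     + n_compute * on_demand_rate * compute_hours)
    mixed = (n_persistent * reserved_rate * HOURS_PER_MONTH
             + n_compute * spot_rate * compute_hours)
    return all_on_demand, mixed

# Hypothetical: $2.40/h on demand, $0.35/h spot, $1.50/h effective reserved,
# 2 persistent servers, 8 solver nodes, 200 solver hours per month.
od, mixed = monthly_cost(2.40, 0.35, 1.50, 2, 8, 200)
print(f"all on-demand: ${od:,.0f}/month, reserved+spot: ${mixed:,.0f}/month")
```

Even with these made-up rates, the persistent/elastic split drives most of the savings, which is the pattern the team converged on.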
Team 4: Simulation of Jet Mixing in the Supersonic Flow with Shock

MEET THE TEAM

It (remote visualization) provides exciting possibilities for remote, very large data management, limiting the unpleasant (and unavoidable) remote-rendering delay effect.

USE CASE
The hardware platform consisted of a single desktop node running Ubuntu, with 8 GB RAM and 5.7 TB RAID storage. Currently available expertise: two PhD research scientists with industrial-level CFD expertise, and a professor in fluid dynamics. Benchmarking the OpenFOAM solver against an in-house FORTRAN code for supersonic CFD applications on remote HPC resources included the following steps:
1. Test the OpenFOAM solver (sonicFoam) on a 2D case at the same conditions as in an in-house simulation
2. Test the OpenFOAM solver with dynamic mesh refinement (sonicDyMFoam) on the same 2D case, with a series of simulations performed to find suitable refinement parameters for acceptable accuracy and mesh size
3. Production simulations with dynamic mesh refinement, either a 3D case or a series of 2D simulations with different parameters
The total resource estimate was 1,120 CPU hours and 320 GB of disk space.

CHALLENGES
Generally, the web interface provided by CILEA was quite convenient for running the jobs, although some extra effort was required to download the results. Both the traditional approach (secure shell access) and the web interface were used to handle simulations. The complete workflow included:
- Create the test case on the end-user desktop
- Upload it to the CILEA computing resource through ssh
- Run the case using the web interface
- Receive notification when the case is finished
- Download the results through ssh
- Post-process the results on the end-user desktop
Direct access is beneficial for transferring large amounts of data, providing a noticeable advantage when using the rsync utility. In fact, it might be desirable to run the jobs from the command line as well, although this may just be a matter of habit.
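The ssh/rsync part of the workflow above can be sketched as a small script. The hostname and paths are hypothetical, and the commands are only constructed here, not executed:

```python
# Sketch of the upload/download steps of the workflow described above.
# Hostname and directory names are hypothetical placeholders.
import shlex

HOST = "user@hpc.example.org"   # hypothetical cluster login node

def upload_cmd(local_case, remote_dir):
    # rsync only sends deltas and can resume partial transfers, which is
    # why direct access pays off for large case and result directories.
    return ["rsync", "-az", "--partial", local_case, f"{HOST}:{remote_dir}"]

def download_cmd(remote_results, local_dir):
    return ["rsync", "-az", "--partial", f"{HOST}:{remote_results}", local_dir]

cmd = upload_cmd("jetCase/", "/scratch/jetCase/")
print(shlex.join(cmd))   # pass the list to subprocess.run(cmd) to execute
```

In a production setup these calls would be wrapped with error handling and driven by the job-completion notification mentioned above.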
An alternative means of accessing remote data offered by CILEA (Consorzio Interuniversitario Lombardo per l'Elaborazione Automatica) is remote visualization. This approach receives maximum benefit from the relevant HPC facilities, where the remote visualization system sits on top of a 512 GB RAM node plus video/image compression tools. It provides exciting possibilities for remote, very large data management, limiting the unpleasant (and unavoidable) remote-rendering delay effect.

CONCLUSIONS
The simulations were not completed beyond step 1 due to unexpected numerical problems and the time spent investigating these problems. Approximately 4 CPU hours were used in total. Because the initial test program runs were not completed during this round of the experiment, both the end-user and the resource provider indicated they would like to participate in the second round. The end-user was also interested in evaluating the GPU capabilities of OpenFOAM.

Case Study Authors Ferry Tap and Claudio Arlandini
Team 5: Two-phase Flow Simulation of a Separation Column

MEET THE TEAM

Such detailed flow simulations of separation columns provide an insight into flow phenomena that was previously not possible.

USE CASE
This use case investigates the dynamics of a vapor-liquid compound in a distillation column with trays. Chemical reactions and phase transitions are not considered; instead, the fluid dynamics is resolved with a high level of detail in order to predict properties of the column such as the pressure drop or the residence time of the liquid on a tray.

CHALLENGES
The main challenge addressed in this use case is the need for computational power, as a consequence of the large mesh resolution. The need for a large mesh (close to 10^9 mesh nodes, an ambitious value in this field) stems from the complex physics of turbulent two-phase flow and from the complex structure of fine droplet dispersions in the vapor. These difficulties are addressed, in the present case, with the help of a highly efficient and scalable computational approach, combining a lattice Boltzmann method with a volume-of-fluid representation of the two-phase physics.

Technology Involved
The hardware used was a 128-core cluster with an InfiniBand interconnect and an NFS file system. The nodes used 8-core Intel E-series processors at 2.7 GHz. The use case was implemented and executed with the help of the open-source Palabos library, which is based on the lattice Boltzmann method.

Project Execution
End-to-end process

Finding a match between resource provider, software provider and end-user
While the needs for a software solution had been preliminarily analyzed by the end-user, and the match to the software provider was therefore straightforward, some effort was invested by the organizers of the HPC Experiment to find a resource provider adequate to the substantial resource needs brought by the end-user.
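The mesh size quoted above implies a substantial memory footprint. A back-of-the-envelope estimate, assuming a D3Q19 lattice in double precision plus one volume-of-fluid scalar per node (an assumption, since the actual Palabos data layout is not given in the text):

```python
# Rough memory estimate for the lattice data of a ~10^9-node simulation.
# D3Q19 with one extra VOF scalar per node is an assumed layout.

nodes = 1e9            # close to 10^9 mesh nodes (from the use case)
populations = 19       # D3Q19: 19 distribution functions per node (assumed)
vof_scalar = 1         # volume-of-fluid fraction per node (assumed)
bytes_per_value = 8    # double precision

total_bytes = nodes * (populations + vof_scalar) * bytes_per_value
gib = total_bytes / 2**30
print(f"~{gib:.0f} GiB for the lattice data alone")
```

Roughly 150 GiB for the field data alone, before halo copies and temporaries, which makes clear why a 128-core cluster with a fast interconnect was a minimum requirement.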
Team setup and initial organization
The project partners agreed on the need for hardware resources and the general organization.

Exchange of concepts and needs
The resource provider explained his approach to cloud computing. It was agreed that the resource provider would not provide direct ssh access to the computing server. Instead, the application was compiled and installed by the resource provider. The interface provided to the end-user consisted of an online XML editor to set up the parameters of the program, plus means of interactivity to launch the program and post-process the results.

Benchmarks and feasibility tests
The compatibility of the software with the provided hardware resources was carefully tested as described above.

Setup of the execution environment (ongoing at the time of this writing)
Palabos was recompiled on XF by the resource provider's team using gcc and OpenMPI (including InfiniBand support). The team also published a simple Palabos application in extreme Factory Studio by configuring a new job submission web form, including the relevant job parameters for this Palabos experiment.

Final test run (ongoing at the time of this writing)
The challenges encountered and solved consisted mostly of human interactions; there were no technical challenges. We were concerned about inadequate hard-disk properties.
Palabos job submission web form

BENEFITS

For the software provider
The software provider had the opportunity to learn about a new approach to cloud computing, involving the use of efficient dedicated hardware and the implementation of a lightweight Software-as-a-Service model. At the end of this HPC experiment, the software provider is considering including this new approach in his business model.

For the hardware provider
This was a great opportunity to host new, exciting and very scalable CFD software, get in touch with the FlowKit and Palabos user community, and therefore envision a great partnership targeting real customers and real production.

For the end-user
Such detailed flow simulations of separation columns provide an insight into flow phenomena that was previously not possible. In order to resolve these phenomena, large computational meshes are absolutely necessary. This HPC experiment allows for an assessment of whether such big simulations are computationally feasible for industry. Furthermore, we gained valuable experience in handling these kinds of big simulation setups.

CONCLUSIONS AND RECOMMENDATIONS

Definition of cloud computing
Different partners have a different understanding of the term cloud computing. While some consider it simple remote execution of a given software on pay-as-you-go hardware, others find it useful only when it provides a software solution that is fully integrated in the web browser. In this project, we learned about an intermediate viewpoint defended by the resource provider, where an adequate web interface can be custom-tailored to every software/end-user pair.

Availability of specialized resources
Before entering this experiment, the software provider was familiar with generic cloud computing resources like those provided by Amazon Web Services. The project revealed the existence of custom cloud solutions for high performance computing, with substantially more powerful and cost-effective hardware.
Communication between the three involved partners
The interaction between the partners was dominated by technical aspects that had to be worked out between the software and resource providers. It appears that such a process leaves little room for the application end-user to influence the choice of the user interface and the cloud computing model deployed between the software and resource providers. Instead, it is likely that in such a framework the application end-user accepts the provided solution as it is and decides whether it is suitable for his needs.

Notes from the resource provider
In our model, the end-user can ask for basic improvements to job submission forms for free; more complex ones are charged for.
Most HPC applications provide neither a cloud API nor a web user interface. The extreme Factory SaaS-enabling model partners with the open-source science community and software vendors to expose their applications in the cloud.
Enabling HPC applications in a cloud is not something everyone can do on their own. It requires a lot of experience and R&D, plus a team to deploy and tune applications and support software users. This is one reason we think HPC as a Service is not yet ready for total cloud automation and online billing.
SSH connections are, surprisingly, less secure than web portal isolation (we optionally add network isolation), because people who know how to use a job scheduler can discover a lot about the architecture and potentially look for security holes. This is because HPC job schedulers are, if not impossible, pretty difficult to secure.
Web portal ergonomics coupled with remote visualization make it possible to execute a complete pre-processing/simulation/post-processing workflow online, which is highly appreciated.

Improving the HPC Experiment
The partners of this team appreciated the structure and management of the HPC experiment and find that its execution was highly appropriate.
The following minor suggestions might be helpful for future repetitions of the Experiment:
The time frame of three months is very short, especially if a large amount of time must be spent finding appropriate partners and connecting them to each other.
Participating partners are likely to have both industrial and academic interests in the HPC experiment. In the latter case, a certain amount of conceptual and theoretical guidance could be inspiring. As an example, it might have been more realistic for the participants to contribute and exchange their opinions on topics such as the actual definition of cloud computing if the framework for such a topic had been sketched more precisely by the organizers.
It would be highly interesting for all participants, as the project advances, to know more about the content and progress of the other teams' projects. Why not conceive a mid-term webinar that is more practically oriented, based on concrete examples of the results achieved so far in the various projects?

Case Study Authors Jonas Latt, Marc Levrier, and Felix Muggli
Team 8: Flash Dryer Simulation with Hot Gas Used to Evaporate Water from a Solid

The company was interested in reducing the solution time and, if possible, increasing mesh size to improve the accuracy of their simulation results without investing in a computing cluster that would be utilized only occasionally.

USE CASE
CFD multiphase flow models are used to simulate a flash dryer. Increasing plant sizes in the cement and mineral industries mean that current designs need to be expanded to fulfill customers' requests. The process is described by the Process Department and the structural geometry by the Mechanical Department; both departments come together using CFD tools that are part of the end-user's extensive CAE portfolio. Currently, the multiphase flow model takes about five days for a realistic particle loading scenario on our local infrastructure (Intel Xeon X5667, 12 MB cache, 3.06 GHz, 6.40 GT/s, 24 GB RAM). The differential equation solver of the Lagrangian particle tracking model requires several GB of memory. ANSYS CFX 14 is used as the solver. Simulations for this problem use 1.4 million cells, five species and a time step of one millisecond for a total simulated time of two seconds.
A cloud solution should allow the end-user to run the models faster, increasing the turnover of sensitivity analyses and reducing time to customer implementation. It would also allow the end-user to focus on engineering aspects instead of spending valuable time on IT and infrastructure problems.

Fig. 1 - Flash dryer model viewed with ANSYS CFD-Post

The Project
The most recent addition to the company's offerings is a flash dryer designed for a phosphate processing plant in Morocco. The dryer takes a wet filter cake and produces a dry product suitable for transport to markets around the world.
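The solver settings quoted in the use case imply the following raw workload, a simple consistency check using only the numbers given above:

```python
# Raw workload implied by the quoted ANSYS CFX settings:
# 1.4 million cells, 1 ms time step, 2 s of simulated time.

cells = 1.4e6          # mesh cells
dt = 1e-3              # time step in seconds (one millisecond)
total_time = 2.0       # simulated seconds

steps = total_time / dt
cell_updates = cells * steps
print(f"{steps:.0f} time steps, {cell_updates:.2e} cell updates in total")
```

Two thousand transient steps over 1.4 million cells, each step involving multiphase physics and Lagrangian particle tracking, is consistent with the five-day runtime reported on the local workstation.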
The company was interested in reducing the solution time and, if possible, increasing mesh size to improve the accuracy of their simulation results without investing in a computing cluster that would be utilized only occasionally. The project goal was defined based on current experience with the in-house compute power. For the chosen model, a challenge to reaching
this goal was the scalability of the problem with the number of cores. Next, the end-user needed to register for XF. After the organizational steps were completed, the XF team integrated ANSYS CFX for the end-user into their web user interface. This made it easy for the end-user to transfer data and run the application in the pre-configured batch system on the dedicated XF resources. The model was then run on up to 128 Intel E-series cores. The work was accomplished in three phases:

Setup phase
During the project period XF was very busy with production customers and was also migrating their Bull B500 blades (Intel Xeon X5670 sockets, 2.93 GHz, 6 cores, 6.40 GT/s, 12 MB) to B510 blades (Intel E-series sockets, 2.70 GHz, 8 cores, 8.0 GT/s, 20 MB). The nodes are equipped with 64 GB RAM and 500 GB hard disks, and are connected with InfiniBand QDR.

Execution phase
After an initial hardware problem with the new blades, a solver run crashed after 35 hours due to a CFX stack memory overflow. This was handled by adding a new parameter to the job submission web form. A run using 64 cores still crashed after 12 hours despite 20% additional stack memory. This issue is not related to overall memory usage, as the model never used more than 10% of the available memory, as observed for one of the 64-core runs. Finally, a run on 128 cores with 30% additional stack memory successfully ran up to the 2 s point. An integer stack memory error occurred at a later point; this still needs to be looked into.

Post-processing phase
The XF team installed ANSYS CFD-Post, visualization software for ANSYS CFX, and made it available from the portal in a 3D remote visualization session. It was also possible to monitor the runs from the Solver Manager GUI and hence avoid downloading large output log files. Because the ANSYS CFX solver was designed from the ground up for parallel efficiency, all numerically intensive tasks are performed in parallel and all physical models work in parallel.
Only administrative tasks, such as simulation control and user interaction, as well as the input/output phases of a parallel run, were performed in sequential mode by the master process.

BENEFITS
The extreme Factory team was quickly able to provide ANSYS CFX as SaaS and configure any kind of HPC workflow in extreme Factory Studio (XF's web front-end). The XF team spent around three man-days to set up, configure, execute and help debug the ANSYS CFX experiment. FLSmidth spent around two man-days to understand, set up and utilize the XF Portal methodology. XF also provides 3D remote visualization with good performance, which helps solve the problem of downloading large result files for local post-processing and of checking the progress of the simulation.

Fig. 2 - ANSYS CFX job submission web form

For the end-user, the primary goal of running the job in one to two days was met. The runtime of the successful job was about 46.5 hours. There was not enough time in the end to perform scalability tests; these would have been helpful to balance the size of the resources required against the runtime of the job. The ANSYS CFX technology incorporates optimizations for the latest multi-core processors and benefits greatly from recent improvements in processor architecture, algorithms for model partitioning combined with optimized communications, and dynamic load balancing between processors.

CONCLUSIONS AND RECOMMENDATIONS
No special problems occurred during the project, only hardware provisioning delays. Pressure from production made it difficult to find free resources and tuning phases to get good results. Providing the HPC application in the form of SaaS made it easy for the end-user to get started with the cloud and concentrate on his core business.
It would be helpful to have more information about cluster metrics beyond what is currently readily available, e.g. memory and I/O usage. The time needed for downloading the result files and minimizing risks to proprietary data need to be considered for each use case. Due to the size of the output data and transfer speed limitations, we determined that a remote visualization solution is required.

Case Study Authors Ingo Seipp, Marc Levrier, Sam Zakrzewski, and Wim Slagter

Note: Some parts of this report are excerpted from a story on the project featured in the Digital Manufacturing Report. You can read the full story at
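The transfer-speed limitation cited in the recommendations above is easy to quantify. The result sizes and link bandwidth below are hypothetical, chosen only to show the order of magnitude that makes remote visualization attractive:

```python
# Download time for large result files over a typical office link.
# File sizes and bandwidth are hypothetical illustrations.

def transfer_hours(size_gb, mbit_per_s):
    bits = size_gb * 8e9              # decimal GB to bits
    return bits / (mbit_per_s * 1e6) / 3600

for size in (10, 50, 100):
    print(f"{size} GB at 50 Mbit/s: {transfer_hours(size, 50):.1f} h")
```

At tens of gigabytes per run, downloads take hours even on a good link, so post-processing next to the data via remote visualization quickly becomes the only practical option.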
Team 9: Simulation of Flow in Irrigation Systems to Improve Product Reliability

MEET THE TEAM

HPC and cloud computing will certainly be a valuable tool as our company seeks to increase its reliance on CFD simulation to reduce costs and time associated with the build-and-test iteration model of prototyping and design.

USE CASE
In the industry of residential and commercial irrigation products, product reliability is paramount: customers want their equipment to work every time, with low maintenance, over a long product lifetime. For engineers, this means designing affordable products that are rigorously tested before the device begins production. Irrigation equipment companies employ a large force of designers, researchers and engineers who use CAD packages to develop and manufacture the products, and CAE analysis programs to determine the products' reliability, specifications and features.

CHALLENGES
As the industry continues to demand more efficiency along with greater environmental stewardship, the usage rate of recycled and untreated water for irrigation grows. Fine silt and other debris often exist in untreated water sources (e.g. lakes, rivers and wells) and cause malfunction of internal components over the life of the product. In order to protect against product failure, engineers are turning to increasingly fine meshes for CFD analysis, outpacing the resources of in-house workstations. To continue expanding the fidelity of these analyses within reasonable product design cycles, manufacturers are looking to cloud-based and remote computing for the heavy computation loads. The single largest challenge we
faced as end-users was the coordination with and application of the various resources presented to us. For example, one roadblock was that when presented with a high-powered cluster, we discovered that the interface was Linux, which is prevalent throughout HPC. As industry engineers with a focus on manufacturing, we have little or no experience with Linux and its navigation. In the end, we were assigned another cluster with a Windows virtualization to allow for quicker adoption. We consistently found that while the resources had great potential, we didn't have the knowledge to take full advantage of all of the possibilities because of the Linux interface and the complications of HPC cluster configurations.
Additionally, we found that HPC required becoming familiar with software programs that we were not accustomed to. Engineers typically use multiple software packages on a daily basis, and the addition of a new operating environment, GUI and user controls added another roadblock to the process. The increased use of scripting and software automation lengthened the learning curve.
Knowledge of HPC-oriented simulation was lacking for the end-user. As the end-user engineer's knowledge was limited to in-house and small-scale simulation, optimizing the model and mesh(es) for more powerful clusters proved to be cumbersome and time-intensive.
As we began to experiment with extremely fine mesh conditions, we ran into a major issue. While the CFD solver itself scaled well across the computing cluster, every increase in mesh size took significantly more time for mesh generation, in addition to dramatically slowing the set-up times. Therefore, with larger/finer meshes, the bottleneck moved from the solve time to the preparation time.

BENEFITS
At the conclusion of the experiment, the end-user was able to determine the potential of HPC for the future of simulation within the company.
Another crucial benefit was the comparison of mesh refinements to find an accurate compromise between fidelity and practicality. A sweet spot was suggested by the results: one that would balance user set-up time with computing costs and would deliver timely, consistent, precise results. As suggested by the experiment, a fine mesh run on 32 compute cores proved to be a good balance of affordable hardware and timely, accurate results.

CONCLUSIONS AND RECOMMENDATIONS
The original cluster configuration offered by SDSC was Linux, but the standard Linux interface provided was not user-friendly for the end-user's purposes. In order to accommodate the end-user's needs, the SDSC team decided to try running Windows in a large, virtual shared memory machine using the vSMP software on SDSC's Gordon supercomputer. Using vSMP with Windows on the Gordon supercomputer offers the opportunity to provision a one-terabyte Windows virtual machine, which can provide a significant capability for large modeling and simulation problems that do not scale well on a conventional cluster. Although the team was successful in getting ANSYS CFX to run in this configuration on up to 16 cores (we discovered the 16-core limitation was due to the version of Windows installed on the virtual machine), various technical issues with remote access and licensing could not be completely addressed within the timeframe of this project and precluded running actual simulations for this phase. Following the Windows test, the SDSC team recommended moving back to the proven Linux environment, which as noted previously was not ideal for this particular end-user. Due to time constraints and the aforementioned Linux vs. Windows issues, end-user simulations were not run on the SDSC resources for this phase of the project. However, SDSC has made the resource available for an additional time period should the end-user desire to try simulations on the SDSC system.
The end-user states that they learned a lot and still intend to benchmark the results for the team members' data, but do not have any performance or scalability data to show at this time. The results given above in terms of HPC performance were gathered using the SimuTech Group's SimuCloud cloud computing HPC offerings.
From the SDSC perspective, this was a valuable exercise in interacting with and discovering the use cases and requirements of a typical SME end-user. The experiment in running CFX for Windows on a large shared memory (1 TB) cluster was valuable and provided SDSC with an opportunity to explore how this significant capability might be configured for scientific and industrial users computing on Big Data. Another finding is that offering workshops for SMEs in running simulation software at HPC centers may be a service that SDSC can offer in the future, in conjunction with its Industrial Affiliates (IA) program.
The end-user noted, "Having short-term licenses which scale with the need of a simulation greatly reduces our costs by preventing the purchase of under-utilized HPC packs for our company's in-house simulation."
Summarizing his overall reaction to the project, the end-user had this to say: "HPC and cloud computing will certainly be a valuable tool as our company seeks to increase its reliance on CFD simulation to reduce costs and time associated with the build-and-test iteration model of prototyping and design."

Case Study Authors Rick James, Wim Slagter, and Ron Hawkins
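The bottleneck shift described in this team's challenges, where the solve scales across the cluster but mesh generation and setup do not, is Amdahl's law in practice. A sketch with hypothetical timings:

```python
# Amdahl's law applied to the prep-vs-solve bottleneck described above.
# All timings are hypothetical, chosen only to illustrate the effect.

def wall_time(prep_hours, solve_hours_serial, cores):
    # Meshing and setup stay effectively serial; only the solve parallelizes.
    return prep_hours + solve_hours_serial / cores

coarse = wall_time(prep_hours=1, solve_hours_serial=20, cores=32)
fine = wall_time(prep_hours=10, solve_hours_serial=80, cores=32)
print(f"coarse mesh: {coarse:.2f} h wall time (prep is {1/coarse:.0%})")
print(f"fine mesh:   {fine:.2f} h wall time (prep is {10/fine:.0%})")
```

With these illustrative numbers, the solve shrinks to minutes on 32 cores while the serial preparation dominates the fine-mesh run, which is exactly the shift from solve time to preparation time that the end-user observed.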
Team 14: Electromagnetic Radiation and Dosimetry for High Resolution Human Body Phantoms and a Mobile Phone Antenna Inside a Car as Radiation Source

The goals were to reduce the runtime of the current job and to increase model resolution to more than 750 million cells.

MEET THE TEAM

USE CASE
The use case is a simulation of electromagnetic radiation from mobile phone technology and dosimetry in human body phantoms inside a car model. The scenario is a car interior with a seat and a highly detailed human body phantom with a hands-free mobile phone. The simulation software is CST Studio Suite; the transient solver of CST Microwave Studio was used during the experiment.

CHALLENGES
The goals were to reduce the runtime of the current job and to increase model resolution to more than 750 million cells. A challenge to achieving these goals was the scalability of the problems on many nodes with or without GPUs and high-speed network connections. Based on experience with the performance of the problem, the preferred infrastructure included Windows nodes with GPUs and fast network connectivity, i.e. InfiniBand. If no GPUs were available, a multiple of the number of cores would be required to run the selected problem and achieve the same performance as with GPUs.

THE PROJECT
In the beginning, the project was identified by the end-user. The goals were set based on current experience with the existing compute power. The runtime of the problem on the existing environment was from several days up to one or two weeks. Output data sizes were in the range of GB, depending on the size of the problem. The project was planned in three steps. First, a chip model simulation would be performed as a benchmark problem; the aim was to set up the simulation environment, check the speed of the system itself and its visualization, and analyze first problems. The second step would then be a simulation with a car seat, hands-free equipment and a human body phantom. The last step featured a full car model.
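The 750-million-cell target above already implies a sizable memory footprint for the field data alone. A rough estimate, assuming six electric and six magnetic field components per cell in single precision (an assumption, since the actual CST storage scheme is not documented in the text):

```python
# Rough field-data memory estimate for the 750-million-cell target.
# Twelve single-precision values per cell is an assumed storage layout.

cells = 750e6
values_per_cell = 6 + 6      # assumed: E and H field components
bytes_per_value = 4          # single precision

gib = cells * values_per_cell * bytes_per_value / 2**30
print(f"~{gib:.0f} GiB of field data alone")
```

Tens of gigabytes of live field data per run, plus the mesh and material maps, explains both the preference for GPU nodes with large memory and the long result-transfer times noted in the conclusions.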
Setup Phase Access to the resource provider was established via a VPN connection. HSR provides 33 compute nodes with 12 cores each, InfiniBand interconnect, and workstations with GPUs. Some VPN client versions did not succeed in connecting from the end-user location to the resource provider, although they worked from other locations. With the latest VPN client version from one provider it was possible to connect. To let the Job Manager connect from a local machine to the resource provider, it was necessary to connect with the appropriate credentials and save them. Access to the cluster was then available through the Windows HPC client. Batch jobs could be submitted and managed through Windows HPC Job Manager. Execution Phase With the commitment from people at CST and HSR, the installation of CST on the HPC cluster was completed and the first jobs were run. Testing the installation and debugging requires RDP access to the compute nodes, something that only cluster administrators are commonly allowed to do. BENEFITS The benefits of a cloud model for the end-user are: the availability of additional resources on project demand; no taxable hardware costs remaining after the project; and no hardware aging. Fig. 1 - Applications for high fidelity simulations: seat with human body & hands-free equipment. CONCLUSIONS AND RECOMMENDATIONS Establishing access to the cloud resources through the VPN and HPC clients is complicated to set up. Once established, it works reliably, but an automated process is needed for integration into a workflow. Because of the size of the result files for big problems, the time required for transferring the results can be very long. For post-processing, an RDP connection is required to reduce the amount of data that needs to be transferred. Remote visualization for big problems would require a high-performance graphics card. The CST software on Windows uses the graphical frontend in batch mode.
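Batch submission through the Windows HPC scheduler, mentioned in the setup phase, can also be driven from the command line with HPC Pack's job utility. The sketch below is illustrative only: the head node name, core count, and solver command line are hypothetical placeholders, not the team's actual configuration.

```shell
:: Sketch: submit a batch solver run to a Windows HPC cluster
:: via HPC Pack's "job" CLI. "headnode" and all paths are placeholders.
job submit /scheduler:headnode /numcores:24 /jobname:cst_run ^
  mpiexec \\headnode\apps\solver.exe \\headnode\data\model.cst

:: Monitor queued and running jobs from the same client:
job list /scheduler:headnode
```

These commands require a configured HPC Pack client and cluster, so they are shown as a command sketch rather than a runnable script.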
For debugging and monitoring a job, an RDP connection to the front-end node is required. This conflicts with many HPC cluster policies, where direct user access to the compute nodes is prohibited. Data availability and accountability for data security and loss must be defined. Case Study Authors Ingo Seipp, Carsten Cimala, Felix Wolfheimer, and Henrik Nordborg.
Team 15: Weather Research and Forecasting on Remote Computing Resources In our view, the end-user is the key beneficiary from this short three-month experiment. MEET THE TEAM USE CASE In this HPC Experiment, we attempted to evaluate the performance of the open-source WRF (Weather Research & Forecasting) software on a computer cluster larger than our existing one. The application domain is weather research and forecasting. The WRF software is currently implemented on a Beowulf-class computer cluster consisting of 12 nodes, each node having one 8-core CPU. This cluster is used 24x7, and the execution time is 24 hours for a 12-hour weather prediction cycle. It has been empirically determined that the improvement in system performance becomes negligible when the number of parallel computing cores is increased beyond the current 96. With the use of applicable High Performance Computing methods, we would like to investigate whether it is feasible to reduce the overall processing time of the WRF software and thus provide faster and/or higher-resolution weather predictions. As our computer cluster is in non-stop use and the team needs to send out regular weather reports, it is not possible to take it offline for experiments, let alone instrument and measure the various system parameters, as these would slow down the process as well. In addition, we had yet to determine whether there would be a time reduction at all and, if so, what the ideal cluster size would be, before we could recommend building one internally. This was the reason we looked to third-party resource providers. End-to-end process Our experiment was fairly straightforward, as we were using open-source software and already had it running on a computer cluster.
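The scaling plateau beyond 96 cores described above is what Amdahl's law predicts once the serial fraction of the workload dominates. A minimal sketch, assuming an illustrative 2% serial fraction rather than measured WRF data:

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Ideal speedup over 1 core when serial_fraction of the work
    cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Assume 2% of the run is inherently serial -- illustrative, not measured.
s = 0.02
for n in (24, 48, 96, 192, 384):
    print(n, round(amdahl_speedup(s, n), 1))

# Speedup climbs quickly at first, then flattens: doubling 96 -> 192
# cores gains far less than doubling 24 -> 48, matching the team's
# observation that returns diminish beyond roughly 96 cores.
```

Fitting such a curve to a few timed runs on rented cloud nodes is one way to estimate the ideal cluster size without taking the production cluster offline.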
As the resource provider's setup was similar (Beowulf class) and the software (WRF) was already installed, there were no setup challenges either. We executed a couple of runs and are currently reviewing the results. CHALLENGES We would like to highlight a few (minor) challenges: 1. The time difference between the team members delayed responses from both ends. 2. The Spanish resource provider had to make additional efforts to create documentation in English for us so we could learn how to use the system. 3. The job was not accepted initially when the ORTE parallel environment was set. When the parallel environment was changed to pmpi, the job was accepted, but it was then shut down immediately because the queue was full. This hinted that the resource was not dedicated to us. 4. Finally, we were not able to use all 256 cores. We had to settle for 192 cores, as the other cores were assigned to another HPC Experiment team. BENEFITS In our view, the end-user is the key beneficiary from this short three-month experiment. The end-user was able to use additional resources to try out different software runtime configurations beyond those tried before. We look forward to the next three months of the Round 2 experiment to iron out some of the issues we faced in this phase and produce effective results. CONCLUSIONS AND RECOMMENDATIONS Resource providers should consider having staff respond to user queries on a 24x7 basis. This will reduce turnaround time and also ensure that their resources are being used effectively. Resource providers should state the language in which their technical documentation is available. In addition, a YouTube video on how to access the system and submit jobs would be useful. The HPC Experiment might consider listing the public holidays of the various countries participating, especially when team members are in different countries.
We had at least two instances where one party tried to reach the other on a public holiday in Singapore or in Spain. Case Study Authors S. P. T. Krishnan, Bharadwaj Veeravali, and Ranjani M. R
Team 19: Parallel Solver of Incompressible, 2D and 3D Navier-Stokes Equations, Using the Finite Volume Method USE CASE This team's application used MPI to parallelize a solver of the incompressible Navier-Stokes equations (both 2D and 3D) in rectangular domains, using the Finite Volume Method. In this application, pressure and velocity were linked by the Semi-Implicit Method for Pressure-Linked Equations (SIMPLE) algorithm. The resulting discretized equations were solved by a line-by-line Gauss-Seidel solver. End-user Pratanu Roy developed this application on an IBM iDataPlex HPC cluster as a graduate student at Texas A&M University. The team attempted to port what is essentially a traditional HPC application to Indiana University's FutureGrid Nimbus IaaS cloud, with the intention of analyzing performance within the virtualized runtime environment of the cloud. CHALLENGES AND CONCLUSIONS Team efforts made substantial progress toward building a customized VM image to launch on Nimbus resources. Team 19 members were introduced to one another in late August 2012 and registered for access to FutureGrid resources over the Labor Day weekend. Rapid support from FutureGrid experts Fox and von Laszewski facilitated prompt establishment of the team's project under the FutureGrid access regime. Team members worked to become familiar with FutureGrid computing resources during early-to-mid September. FG tutorial documentation led to early successes, including accessing the India cluster's resources using batch job submission methods. Subsequent attempts to establish Nimbus credentials and thereby access true IaaS-style cloud platforms (the hotel and sierra virtualized cloud resources) required multiple support tickets to be filed. von Laszewski and Wang worked with the team members to address issues with hung VMs, failed Nimbus credential disposition, and tutorial documentation problems.
MEET THE TEAM One problem with the FG ticket-routing system resulted in a two-week delay in response to a technical issue, which was not resolved until HPC Experiment organizers went out-of-channel to request support for the issue. FG experts routed the issue, and it was resolved immediately by John Bresnahan once the routing issue was recognized. (He updated the tutorial to point to the correct, current VM/tarball, with additional coordination by Pierre Riteau to assure availability on all FG clouds.) During early-to-mid October, the team worked through a learning curve regarding FG authorization and authentication methods. The anatomy and physiology of SSH authentication was a significant challenge for an inexperienced user (relying on a busy, remote collaborator for support), which complicated working through the process of obtaining all the necessary credentials and getting them to the right locations for Nimbus cloud access. Team efforts in late October made substantial progress toward building a customized VM image to launch on Nimbus resources. The application runtime environment requires loading a number of modules, as well as MPI, OpenSSL, and many other dependencies that are not available in any known image that also includes the Torque resource manager (for job queuing of multiple runs with varying data sizes and inputs). Installing all the necessary packages on top of the base "hello cloud" image, which does include Torque, was a work in progress at the end of Round 1. Significant computational results should be possible during Round 2, given this progress. Case Study Authors Lyn Gerner and Pratanu Roy
Team 20: NPB2.4 Benchmarks and Turbo-machinery Application on Amazon EC2 MEET THE TEAM We obtained first-hand experience in running engineering applications in the commercial cloud computing environment. USE CASE We used an application from machinery manufacturing in this experiment. Engineers who use CAE software on physical workstations or compute clusters can perform the same operations on cloud resources, given skills and knowledge in creating, configuring, and connecting to instances on Amazon EC2. Advantages and Challenges of Complex Engineering Applications in Clouds The rapid development and popularity of cloud computing are profoundly affecting and changing the resource supply, resource management, and computing modes of future applications. As one of the main supporting technologies of cloud computing, virtualization enables cloud computing to offer dynamic scalability, flexible customization and isolation of the environment, and transparent fault tolerance of applications, which are missing in the traditional platform for engineering applications. Cloud computing is therefore a good option for meeting engineering applications' demands, and it also brings new opportunities for the development of engineering applications and software. Compared to traditional computing environments, the two main advantages of cloud computing for engineering applications are a customizable environment, and flexible usage and management of resources. However, the majority of current cloud systems and the corresponding techniques primarily target Internet-based applications.
Engineering applications, especially complex ones, pose grand challenges to cloud computing, since they differ significantly from service-oriented Internet-based applications in their inherent features, such as workload variations, process control, resource requirements, environment configurations, lifecycle management, and reliability maintenance. The End-to-End Process 1. Define end-user project with help from end-user During the definition of our experiment, we consulted with researchers from the School of Mechanical Science and Engineering, Huazhong University of Science and Technology, and then decided to choose the NAS Parallel Benchmarks (NPB) and a real engineering application as the two experimental subjects. 2. Contact resource providers, set up project environment Amazon EC2 was our resource provider. After obtaining an Amazon EC2 redeem-code voucher from the project organizers, we redeemed the resource immediately, then launched and logged in to an instance, which was later made into an AMI file. This set up the experiment environment with OpenMPI 1.4.3 (a commonly used MPI communication library), the NAS Parallel Benchmarks (NPB2.4), and industry-renowned CFD software. We used scripts to build virtual clusters, with the AMI file created above as the experiment environment. Three types of EC2 instances were used in our experiment: EC2 Cluster Compute Instances (cc1.4xlarge), EC2 High-CPU Extra Large Instances (c1.xlarge), and EC2 Extra Large Instances (m1.large). Each Cluster Compute Instance has 8 cores (computing capability equal to 33.5 EC2 Compute Units) with 23 GB of memory. Each High-CPU Extra Large Instance has 8 cores (20 EC2 Compute Units) with 7 GB of memory. Each Extra Large Instance has 4 cores (8 EC2 Compute Units) with 23 GB of memory. We built virtual clusters consisting of 9 or 17 nodes to run the NPB programs.
One node of the virtual cluster is the NFS server, which is an EC2 Extra Large Instance; the other nodes are compute nodes, which run EC2 Cluster Compute Instances or EC2 High-CPU Extra Large Instances. The detailed setup of the NPB experiment is depicted in Table 1.

  Number of processes | Instance type                                  | Compute nodes | NFS nodes
  64                  | EC2 Extra Large Instance (m1.large)            | 16            | 1
  64                  | EC2 Cluster Compute Instance (cc1.4xlarge)     | 8             | 1
  128                 | EC2 High-CPU Extra Large Instance (c1.xlarge)  | 16            | 1

Table 1 - Detailed setup of the NPB experiment. 3. Initiate execution of the end-user project After setting up the experiment environment, we first ran both Class C and Class D of the NPB benchmarks on three virtual clusters consisting of 16 m1.large instances (64 cores), 16 c1.xlarge instances (128 cores), and 8 cc1.4xlarge instances (64 cores), separately. We ran each benchmark 10 times, and the experimental results are shown in the appendix. Since we were using Amazon EC2 for the first time, and expected cheap prices, we did the tests in the same way that we would have done on the HPC cluster in our laboratory. This approach incurred significant costs at this stage of the experiment; EC2 is not as cheap as we originally thought when used this way. 4. Monitoring We did not use specialized monitoring tools; instead we used Xshell 4, a terminal emulator for Windows, to connect to the virtual cluster of EC2 instances and monitor the progress of our experiment. We used the runtime of the benchmark as the metric for evaluating the performance of the cloud resource. 5. Review results (where needed) Experimental results show that there is essentially no performance fluctuation of tightly coupled benchmarks on the virtual cluster consisting of CCI instances.
By contrast, the performance fluctuation of tightly coupled benchmarks on the virtual cluster consisting of non-CCI instances is obvious, and we find that the application performance on the virtual cluster of non-CCI instances shows an increasingly positive trend from the first round to the tenth round. CHALLENGES An objective fact is that China's Internet speed still lags behind comparable network speeds worldwide. Currently, Chinese users cannot connect directly to Amazon EC2 instances; in order to do the scientific experiments, we accessed EC2 instances temporarily through some technical means. While uploading large files to EC2 instances with Xshell 4, we often encountered instance crashes. Our solution was to first upload files to Dropbox, a free file-hosting service, and then download them on the EC2 instances. We also had budget issues. Running the NPB tests excessively on virtual clusters of non-CCI instances in the first step of the experiment led to a budget overrun. The project organizers solved this issue. BENEFITS We are very grateful to the organizers for the opportunity to participate in the project; we obtained first-hand experience in running engineering applications in the commercial cloud computing environment. We gained a more profound understanding of the EC2 billing methods through the experiment. We verified the two main advantages of cloud computing for engineering applications: a customizable environment, and flexible usage and management of resources. We found that the standard deviation of the benchmarks' performance on the virtual cluster consisting of CCI instances is small. By contrast, the performance fluctuation of benchmarks on the virtual cluster consisting of non-CCI instances is obvious.
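The fluctuation comparison above comes down to a simple standard-deviation check across the ten rounds. A sketch using made-up run times (not the actual NPB measurements):

```python
from statistics import mean, stdev

# Hypothetical per-round runtimes in seconds -- illustrative only,
# not the team's measured data.
cci_runs     = [118, 120, 119, 121, 118, 120, 119, 120, 121, 119]
non_cci_runs = [260, 245, 238, 231, 220, 214, 205, 199, 196, 190]

for name, runs in (("CCI", cci_runs), ("non-CCI", non_cci_runs)):
    print(f"{name}: mean={mean(runs):.1f}s stdev={stdev(runs):.1f}s")

# The non-CCI series also drifts downward round over round, mirroring
# the "increasingly positive trend" the team observed.
```

A small standard deviation on the CCI cluster and a large, drifting one on the non-CCI cluster would reproduce exactly the pattern the team reports.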
We observed that the performance of the real application on a virtual cluster consisting of non-CCI instances showed an increasingly positive trend from the first to the last round. CONCLUSIONS AND RECOMMENDATIONS Do not use commercial cloud resources in the same way that you use your own physical cluster, because each operation in a commercial cloud may incur costs. Although you don't need an upfront investment in infrastructure and commercial software in the cloud environment, you still need to pay for the hardware and software you have consumed. That is to say, you need comprehensive schedules for using the cloud resources, in order to adapt to the billing method, before running engineering applications in the cloud. Although non-CCI instances are cheaper, manufacturers need a precise model to choose the suitable instance type according to their budget and task deadlines. Manufacturers need solutions, which are the basis of the comprehensive schedules mentioned above, to predict the running times of engineering applications when the physical environment is replaced by a cloud environment. Relative to general engineering applications, we are more concerned about performance issues of complex engineering applications in the cloud computing environment, because they are more challenging! The key issues are: how to build a new cloud resource organization model suited to the characteristics of complex engineering applications; how to design virtualized resource management techniques for complex engineering applications in the cloud environment; and how to schedule cloud resources in order to enhance complex engineering applications' performance and cloud system capacity. Case Study Authors Haibao Chen, Zhenjiang Xie, Song Wu, and Wenguang Chen
Team 22: Optimization Study of Side Door Intrusion Bars MEET THE TEAM The main benefits from the current experiment were exposure to cloud computing and the discovery of the limitations of the cloud computing environment. USE CASE Research into optimization techniques was conducted and a novel approach was developed. To evaluate optimization scores, each model is subjected to a finite element analysis. A case study was designed, and high-performance computing resources were required for its execution. A cluster was utilized to analyze a large number of moderately sized jobs. Pre-processing was done locally, but the post-processing was done remotely, although no visualization was required. The FEA for this optimization study of side door intrusion bars was planned to be solved using the ABAQUS/Standard solver. There are a limited number of institutions capable of providing the required resources (for example, there are only five high-performance computing centers with ABAQUS licenses in all of Australia). Even where HPC cluster access is secured, there is a problem with the limited availability of academic licenses. Having a reliable cloud service would definitely provide alternative options and versatility, and help in meeting the demand. End-to-End Process 1. End-user project The user was performing an optimization study of side door intrusion bars. The profiles of the bars are unconstrained and so have non-uniform thickness along and across the bars. The bars are meshed with solid elements, with between 15,000 and 20,000 nodes and 30,000 to 50,000 elements per model. All models were computer generated and auto-meshed. The user expected 2,000-3,000 design variations in this study. The user communicated their needs to the resource provider (PBS scripts, resources needed). 2. Experiment setup environment The Linux HPC cluster was set up on Microsoft Azure with the PBS scheduling system and the CentOS operating system.
The FEA software for the end-user, ABAQUS, was deployed and the license server configured as per the requirements. 3. Experiment Initiation The end-user submitted his test jobs to the cluster. The HPC expert tested the cluster environment and observed seamless access to the cloud resources. The resource provider provided all the necessary access to the end-user and provided technical support when needed.
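Test-job submission on a PBS-scheduled cluster of this kind typically goes through a short PBS script. The sketch below is illustrative, with hypothetical job, file, and resource names rather than the team's actual scripts:

```shell
#!/bin/bash
#PBS -N door_bar_opt
#PBS -l nodes=1:ppn=8
#PBS -l walltime=04:00:00
#PBS -j oe

cd "$PBS_O_WORKDIR"

# Run one design variant with the Abaqus/Standard solver; the job and
# input names are placeholders for the auto-generated models.
abaqus job=intrusion_bar_0001 input=intrusion_bar_0001.inp \
       cpus=8 interactive
```

With 2,000-3,000 auto-generated design variations, such a script would be templated and submitted once per model via qsub; it requires a live PBS cluster and an Abaqus installation, so it is shown here only as a sketch.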
4. Experiment Monitoring The end-user constantly monitored the execution of the jobs and raised a support request when encountering a problem. The nodes that failed during simulation were corrected and reset as needed. The HPC expert monitored the progress of the experiment and contributed his inputs. CHALLENGES Limited and unreliable access to the compute nodes was addressed by restricting the experiment to nodes that were online. A mismatch between the number of license tokens and compute nodes was solved by devising a strategy that maximized the available infrastructure. Due to a lack of scalability, adding more cores did not improve speed, voiding the devised strategy and resulting in the use of only 5 (on average) out of 10 nodes at a time. Running license servers in the cloud is problematic, as the licenses are tied to static hardware resources. Access to application licensing was resolved by using a VPN back to an on-premise license server. BENEFITS The main benefits from the current experiment were exposure to cloud computing and the discovery of the limitations of the cloud computing environment. Benefits for the resource provider: proved that doing CAE simulation with Linux on Azure is feasible. Benefits for the HPC expert: access to the cloud resources proved seamless, and the expert was able to identify on-demand resource availability, with technology enhancements as needed and shorter turnaround times. Shortfalls of the current experiment were the very limited time (seven days) to assess suggested problem resolutions, and that cloud resources and the general Internet are not as stable as on-premise resources. In the next experiment, we expect more time and access to a reliable cloud infrastructure. CONCLUSIONS AND RECOMMENDATIONS For the End-User Clusters built on public clouds are more cost-effective than dedicated hosted providers, but are generally not as reliable as hosted clusters.
For the Resource Provider Doing pre/post-processing in the cloud avoids data transfer issues. Licensing with dynamic resources requires putting license servers in the data center; a VPN is the simplest mechanism to reach back into the data center to the needed license servers. Cloud HPC clusters (true cloud, not co-located or hosted clusters) and Internet WAN are not as stable as on-premise clusters, so the system should be resilient to failure conditions. For the HPC Expert Cloud HPC seems feasible. However, a few things need to be addressed: e.g., company-owned ISV licenses; security of the company's intellectual property; company compliance norms like ITAR; and confidence in the reliability of the resources and the resource provider's capabilities. Case Study Authors Mldenko Kajtaz, Rod Mach, Matt Dunbar, and Satyanarayanaraju P.V.
Team 25: Simulation of Spatial Hearing Our main challenge was to develop interactive visualization tools for simulation data that was stored in the cloud. MEET THE TEAM USE CASE A sound emitted by an audio device is perceived by the user of the device. The human perception of sound is, however, a personal experience. For example, spatial hearing (the capability to distinguish the direction of a sound) depends on the individual shape of the torso, head, and pinna (the so-called head-related transfer function, HRTF). To produce directional sounds via headphones, one needs HRTF filters that model sound propagation in the vicinity of the ear. These filters can be generated using computer simulations but, to date, the computational challenges of simulating the HRTFs have been enormous due to: the need for a detailed geometry of the head and torso; the large number of frequency steps needed to cover the audible frequency range; and the need for a dense set of observation points to cover the full 3D space surrounding the listener. In this project, we investigated the fast generation of HRTFs using simulations in the cloud. The simulation method relied on an extremely fast boundary element solver, which is scalable to a large number of CPUs. The process of developing filters for 3D audio is long, but the simulation work of this study constitutes a crucial part of the development chain. In the first phase, a sufficient number of 3D head-and-torso geometries needed to be generated. A laser-scanned geometry of a commercially available test dummy was used in these simulations. Next, acoustic simulations to characterize the acoustic field surrounding the head and torso were performed. This was our task in the HPC Experiment. Finally, the filters were generated from the simulated data and evaluated by a listening test. This final part will be done by Aalto University and the end-user after the data from the HPC Experiment is available.
The Environment Simulations were run via Kuava's Waveller Cloud simulation tool using the system described below. The number of concurrent instances ranged between 6 and 20. Service: Amazon Elastic Compute Cloud. Total CPU hour usage: 341 h. Type: High-CPU Extra Large Instance (7 GiB of memory; 20 EC2 Compute Units, i.e. 8 virtual cores with 2.5 EC2 Compute Units each; GB of instance storage). One EC2 Compute Unit provides the equivalent CPU capacity of a GHz 2007 Opteron or 2007 Xeon processor; this is also equivalent to an early GHz Xeon processor. CHALLENGE Our main challenge was to develop interactive visualization tools for simulation data stored in the cloud. BENEFITS The main benefit came from the flexible resource allocation that is necessary for efficient acoustic simulations; that is, a large number of instances can be obtained for a short period of time. There was no need to invest in our own computing capacity. In audio simulations especially, capacity is needed in short bursts, for fast simulation turnaround times, with idle time between bursts while the next simulation is planned. The fact that no in-house computational capacity is needed is significant. CONCLUSIONS AND RECOMMENDATIONS The main lessons learned were related to the optimal use of cloud capacity. In particular, we gained important experience in running large simulations in the cloud. For example, the optimal number of instances depended on the size of the simulation task and the amount of data that needed to be transferred to and from the cloud. The man-hours logged during the experiment were: Kuava (60 h), Aalto (1 h), and end-user (4 h). Total CPU hour usage during the experiment was 341 h using High-CPU Extra Large Instances. Case Study Authors Antti Vanne, Kimmo Tuppurainen, Tomi Huttunen, Ville Pulkki, and Marko Hiipakka
Team 26: Development of Stents for a Narrowed Artery For an Abaqus user, SGI Cyclone is a viable solution for both compute and visualization. USE CASE This project focused on simulating stent deployment using SIMULIA's Abaqus/Standard, and on using remote visualization software from NICE to run Abaqus/CAE on SGI Cyclone. The intent was to determine the viability of shifting similar work to the cloud during periods of full utilization of in-house compute resources. Information on Software and Resource Providers Abaqus from SIMULIA, the Dassault Systèmes brand for realistic simulation, is an industry-leading product family that provides a comprehensive and scalable set of Finite Element Analysis (FEA) and multiphysics solvers and modeling tools for simulating a wide range of linear and nonlinear model types. It is used for stress, heat transfer, crack initiation, failure, and other types of analysis in mechanical, structural, aerospace, automotive, bio-medical, civil, energy, and related engineering and research applications. Abaqus includes four core products: Abaqus/CAE, Abaqus/Standard, Abaqus/Explicit, and Abaqus/CFD. Abaqus/CAE provides users with a modeling and visualization environment for Abaqus analyses. NICE Desktop Cloud Visualization (DCV) is an advanced technology that enables technical computing users to remotely access 2D/3D interactive applications over a standard network. Engineers and scientists are immediately empowered by taking full advantage of high-end graphics cards, fast I/O performance, and large-memory nodes hosted in a public or private 3D cloud, rather than waiting for the next upgrade of their workstations. SGI Cyclone is the world's first large-scale on-demand cloud computing service specifically dedicated to technical applications.
Cyclone capitalizes on over twenty years of SGI HPC expertise to address the growing science and engineering technical markets that rely on extremely high-end computational hardware, software, and networking equipment to achieve rapid results. Current State The end user currently has two 8-core PC workstations for pre- and post-processing with Abaqus/CAE, and a Linux-based compute server with 40 cores and 128 GB of available memory. They do not use any batch job scheduling software. The typical stent design model that they run has 2-6 million degrees of freedom (DOF). A typical job uses 20 cores and takes six hours. After the job is run, the data is transferred to the workstation for post-processing. For the experiment it was agreed that SIMULIA and SGI would provide the end user with Abaqus licenses for up to 128 cores, in order to see if running a job on more cores could reduce the time to finish it, as well as access to NICE DCV remote graphics software to view the results in Northern California before downloading them to the end-user office in New Hampshire. MEET THE TEAM End-To-End Process 1. Set up Cyclone account for End User. 2. SGI license server info sent to Software Provider. 3. Issuance of a 128-core temporary license of Abaqus by Software Provider.
4. End user uploads model to his home directory on the Cyclone login node and sends it to the CAE Expert. 5. Benchmark scaling exercise to find the core-count sweet spot is done by the CAE Expert. 6. Results of the benchmark scaling exercise sent to the End User by the CAE Expert. 7. Remote viz session to view data using Abaqus/CAE is set up by the CAE Expert. 8. Remote viz demo via WebEx with the End User. 9. PBS submission script written by the CAE Expert and shared with the End User. 10. End user uploads, runs, views, and downloads a test case. A period of free access is given to the End User. CHALLENGES The team met via a conference call and agreed upon the list of steps that made up the end-to-end process. Setting up the end-user account and having the software licenses issued was quickly done. In order for the End User to upload their model via SSH, they needed to get permission from their internal IT group, which took some time. Once the model was uploaded, the CAE Expert ran the model at various core counts and produced a routine benchmark report for the End User to review (see results in the table below). The remote viz demo went smoothly, but when the End User tried to run the software themselves it took both the Resource Provider's and End User's IT network teams to open the necessary ports, which took much longer than anticipated. Once the ports were open, the remote viz post-processing experience was better than expected. Analysis output files still needed to be shipped back to the End User for future reuse, additional post-processing, etc. Data transfer via the network was found to be slow; final results might be better transferred on an external USB hard drive via FedEx. BENEFITS Here are the top 3 benefits of participating in the experiment for each of the team members: Software Provider 1. I was able to hear from an experienced Abaqus user that doing remote post-processing using a client machine in New Hampshire against an SGI Cyclone server in California provided a good user experience. 2.
I was able to hear from an end user that managing the networking requirements (opening ports in firewalls) took some work but was manageable. 3. I have a reference point for an Abaqus user who views executing his Abaqus workflow on SGI Cyclone to be a viable solution. CAE Expert 1. Expanded my knowledge of analytical methods used in medical stent engineering with Abaqus/Standard. 2. Increased awareness of user interactions with cloud based solution and networking requirements. 3. The geographic distance of ~3100 miles between customer and SGI Cyclone Cloud resources confirms distance is no longer a barrier in high performance computing and remote visualization. Based on the Abaqus Engineer, he comments the SGI Remote Visualization for cloud computing was faster and smoother than I expected. Resource Provider: 1. The ability to walk a new customer through our HPC cloud process for usage. 2. Testing our remote visualization solution, which is in beta. 3. Working with a long time CAE ISV partner to offer a joint cloud base solution to run and view Abaqus jobs. CONCLUSION For an Abaqus user using SGI Cyclone this is a viable solution for both compute and visualization. The Viz side was impressive. End User 1. Gained an increased understanding of what is involved in turning on and using a cloud-based solution for computational work with the Abaqus suite of finite element software. 2. Determined that shifting computational work to the cloud during periods of full-utilization of in-house compute resources is a viable approach to ensuring analysis throughput. 3. Participation in the experiment allowed direct assessment of the speed and integrity of remote visualization of computational models (both pre- and post-processing) for a variety of model and output database sizes. SGI/Nice DCV provided a robust solution, which permitted fast, and accurate manipulation of the computational models used in the study.
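The slow-network observation above can be made concrete with a rough back-of-the-envelope estimate. The file size and line speed below are illustrative assumptions, not figures from the experiment:

```shell
# Rough transfer-time estimate: illustrative numbers only.
# A large Abaqus output database (assumed 50 GB) over an assumed
# effective 10 Mbit/s WAN link, vs. overnight shipping of a USB drive.
size_gb=50
wan_mbit=10

# hours = GB * 8000 Mbit/GB / (Mbit/s) / 3600 s/h
wan_hours=$(awk -v gb="$size_gb" -v mbit="$wan_mbit" \
    'BEGIN { printf "%.1f", gb * 8000 / mbit / 3600 }')

echo "WAN transfer: ${wan_hours} hours"   # ~11.1 hours for 50 GB at 10 Mbit/s
echo "Shipped USB drive: ~24 hours, independent of file size"
```

At these assumed figures the network and the courier are already comparable; for larger result sets or slower effective bandwidth, the shipped drive wins.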
Team 30: Heat Transfer Use Case

USE CASE
Background
In many engineering problems fluid dynamics is coupled with heat transfer and many other multiphysics scenarios. The simulation of such problems in real cases produces large numerical models, so that substantial computational power is required for simulation cycles to be affordable. For SME industrial companies in particular it is hard to implement this kind of technology in-house, because of its investment cost and the IT specialization needed. There is great interest in making these technologies available to SME companies in the form of easy-to-use HPC platforms that can be used on demand. Biscarri Consultoria SL is committed to disseminating parallel open source simulation tools and HPC resources in the cloud. CloudBroker offers its platform, along with related services, for various multiphysics, fluid dynamics, and other engineering applications, as well as life science, for small, medium and large corporations. The CloudBroker Platform is also offered as a licensed in-house solution.

Current State
Biscarri Consultoria SL is exploring
the capabilities of cloud computing resources for performing highly coupled computational mechanics simulations, as an alternative to the acquisition of new computing servers to increase the computing power available. For a small company such as BCSL, the strategy of using cloud computing resources to cover HPC needs has the benefit of not needing an IT expert to maintain in-house parallel servers, thus concentrating our efforts on our main field of competence. To solve the needs of the end user, the following hardware and software resources existing on the provider side were employed by the team:
- Elmer, an open source multi-physical simulation software mainly developed by the CSC IT Center for Science
- CAELinux, a CAE Linux distribution including the Elmer software, as well as a CAELinux virtual machine image at the AWS Cloud
- CloudBroker Platform (public version under platform.cloudbroker.com), CloudBroker's web-based application store offering scientific and technical Software as a Service (SaaS) on top of Infrastructure as a Service (IaaS) cloud resources, already interfaced to AWS and other clouds
- Amazon Web Services (AWS), in particular Amazon's IaaS cloud offerings EC2 (Elastic Compute Cloud) for compute and S3 (Simple Storage Service) for storage resources

Experiment Procedure
Technical Setup
The technical setup for the HPC Experiment was performed in several steps. These followed the principle of starting with the simplest possible solution and then growing it to fulfil more complex requirements in an agile fashion. If possible, each step was first tested and iteratively improved before the next step was taken. The main steps were:
1. All team members were given access to the public CloudBroker Platform via their own account under a shared organization created specifically for the HPC Experiment.
A new AWS account was opened by CloudBroker, the AWS credit loaded onto it, and the account registered in the CloudBroker Platform exclusively for the experiment team.
2. Elmer software on the existing CAELinux AWS machine image was made available in the CloudBroker Platform for serial runs and tested with minimal test cases by CloudBroker and Joël Cugnoni. The setup was then extended to allow parallel runs using NFS and MPI.
3. Via Skype calls, screen sharing, chatting, and contributions on Basecamp, the team members exchanged knowledge on how to work with Elmer on the CloudBroker Platform. The CloudBroker team gave further support for its platform throughout HPC Experiment Round 2. CloudBroker and BCSL performed corresponding validation case runs to test the functionality.
4. The original CAELinux image was only available for normal, non-HPC AWS virtual machine instance types. Therefore, Joël Cugnoni provided Elmer 6.2 as optimized and non-optimized binaries for Cluster Compute instances, and the CloudBroker team deployed these on the CloudBroker Platform for the AWS HPC instance types with 10 Gbit Ethernet network backbone, called Cluster Compute instances.
5. BCSL created a medium benchmark case, and performed scalability and performance runs with different numbers of cores and nodes of the Amazon Cluster Compute Quadruple and Eight Extra Large instance types and different I/O settings. The results were logged, analyzed and discussed within the team.
6. The CloudBroker Platform setup was improved as needed. This included, for example, a better display of the number of cores in the web UI, the addition of artificial AWS instance types with fewer cores, and the ability to change the shared disk space.
7. BCSL tried to run a bigger benchmark case on the AWS instance type configuration that turned out to be preferable from the scalability runs, that is, single AWS Cluster Compute Eight Extra Large instances.

Fig. 1 - This figure shows the model employed in the scalability benchmark. The image on the right shows the temperature field, while the left image shows the velocity field at a certain time of the transient simulation.

Validation Case
First a validation case was defined to test the whole simulation procedure. This case was intentionally simple, but had the same characteristics as the more complex problems that were used for the rest of the experiment. It was an idealized 2D room with a cold air inlet on the roof (T = 23ºC, V = 1 m/s), a warm section on the floor (T = 30ºC, V = 0.01 m/s) and an outlet on a lateral wall near the floor (P = 0.0 Pa). The initial air temperature was 25ºC. The mesh was created with Salome V6. It consists of 32,000 nodes and 62,000 linear triangular elements. The solution is transient. The Navier-Stokes and heat equations were solved in a strongly coupled way. No turbulence model was used. Free convection effects were included. The
mesh of the benchmark analysis was a much finer one of the same geometry domain, consisting of about 500,000 linear triangular elements. The warm section on the floor was removed and the lateral boundaries had an open condition (P = 0.0 Pa).

Job Execution
The submission of jobs to be run at AWS was done through the web interface of the CloudBroker Platform. The procedure was as follows:
- A job was created on the CloudBroker Platform, specifying Job Name, Software, Instance Type and AWS Region
- Case and mesh partition files were compressed and uploaded to the CloudBroker Platform, attached to the created job
- The job was submitted to the selected AWS resource
- Result files were downloaded from the CloudBroker Platform and post-processed on a local workstation
- Scalability parameters were calculated from job output log file data

Fig. 2 - Streamlines on the inlet section.

CHALLENGES
End User
The first challenge for BCSL in this project was to learn whether the procedure to run Elmer jobs on a cloud computing resource such as AWS is easy enough to be a practical alternative to in-house calculation servers. The second challenge was to determine the level of scalability of the Elmer solver running at AWS. Here we encountered good scalability when the instance employed is the only computational node. When running a job on instances using two or more computational nodes the scalability is reduced dramatically, showing that communication between cores of different computational nodes slows down the process. AWS uses 10 Gbit Ethernet as backbone network, which seems to be a limitation for this kind of simulation. After the scalability study with the mesh of 500k elements was performed, a second scalability test was tried with a new mesh of about 2,000k elements. However, jobs submitted for this study to Cluster Compute Quadruple Extra Large and Cluster Compute Eight Extra Large instances have not been successfully run yet.
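The scalability parameters mentioned above are typically the parallel speedup and efficiency derived from measured wall-clock times; a minimal sketch of that calculation (the timings below are made-up placeholders, not the experiment's results):

```shell
# Compute speedup S(n) = T(1)/T(n) and efficiency E(n) = S(n)/n
# from solver wall-clock times. The timings here are hypothetical.
t1=3600          # seconds on 1 core (assumed)
tn=450           # seconds on n cores (assumed)
n=16

speedup=$(awk -v t1="$t1" -v tn="$tn" 'BEGIN { printf "%.2f", t1 / tn }')
efficiency=$(awk -v s="$speedup" -v n="$n" 'BEGIN { printf "%.2f", s / n }')

echo "cores=$n speedup=$speedup efficiency=$efficiency"
# prints: cores=16 speedup=8.00 efficiency=0.50
```

An efficiency well below 1 at higher core counts, as in this hypothetical case, is exactly the signature of the inter-node communication bottleneck discussed above.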
Further investigations are in progress to better characterize the network bottleneck issue as a function of problem size (number of elements per core) and to establish whether it is related to MPI communication latency or NFS throughput of the results.

Resource Provider and Team Expert
On the technical side, most challenges were mastered by already existing features of the CloudBroker Platform or by small improvements. For this it was essential to follow the stepwise agile procedure as outlined above, partly ignoring the stiffer framework suggested by the default HPC Experiment tasks on Basecamp. Unfortunately, AWS HPC cloud resources are limited to a 10 Gbit Ethernet network, which was not sufficient in terms of latency and throughput to run the experiment efficiently on more than one node in parallel. The following options are possible:
1. Run the experiment on one large node only, that is, an AWS Cluster Compute Eight Extra Large instance with 16 cores
2. Run several experiment jobs independently in parallel with different parameters on the AWS Cluster Compute Eight Extra Large instances
3. Run the experiment on another cloud infrastructure which provides low latency and high throughput using technology such as InfiniBand
The CloudBroker Platform allows for all the variants described above. Variants 2 and 3 were not part of this experiment, but would be the next reasonable step to explore in a further experiment round. In the given time, it was also not possible to try out all the different I/O optimization possibilities, which could provide another route to improve scalability. A further challenge of the HPC Experiment was to bring together the expertise from all the different partners involved. Each of them has experience on a separate set of the technical layers that needed to be combined here (actual engineering use case, Elmer CAE algorithms, Elmer software package, CloudBroker Platform, AWS Cloud).
For example, often it is difficult to say from the outset which layer causes a certain issue, or whether the issue results from the combination of layers. Here it was essential for the success of the project to stimulate and coordinate the contributions of the team members. For the future, we envision making this procedure more efficient through decoupling, for example by the software provider directly offering an already optimized Elmer setup in the CloudBroker Platform to the end users.
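Under the hood, the parallel Elmer runs that the platform automates follow Elmer's standard MPI workflow: partition the mesh with ElmerGrid, then launch the MPI solver. A rough sketch, written out as a helper script; the mesh directory name, core count and solver invocation details are assumptions for illustration:

```shell
# Write a sketch of a standard parallel Elmer run. The mesh directory
# "room2d" and the 8-way partitioning are hypothetical examples.
cat > run_elmer_parallel.sh <<'EOF'
#!/bin/bash
# Partition the Elmer mesh directory "room2d" into 8 pieces using Metis
ElmerGrid 2 2 room2d -metis 8
# Launch the parallel solver; each MPI rank reads its own mesh partition
mpirun -np 8 ElmerSolver_mpi case.sif
EOF
chmod +x run_elmer_parallel.sh
echo "wrote helper script run_elmer_parallel.sh"
```

Offering a pre-optimized Elmer setup, as envisioned above, essentially means baking this partition-and-run sequence into the platform so end users never have to issue these commands themselves.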
Finally, a general challenge of the HPC Experiment concept is that it is a non-funded effort (apart from the AWS credit). This means that the involved partners can only provide manpower on a best-effort basis, and paid projects during the same time usually have precedence. It is thus important that future HPC Experiment rounds take realistic business and commercialization aspects into account.

BENEFITS
Concerning the ease of using cloud computing resources, we concluded that this working methodology is very friendly and easy to use through the CloudBroker Platform. The main benefits for BCSL regarding the use of cloud computing resources were:
- Having external HPC capabilities available to run medium-sized CAE simulations
- Having the ability to perform parametric studies, in which a large number of small/medium-size simulations have to be submitted
- Externalizing all the IT work necessary to maintain in-house calculation servers
For CloudBroker, it was a pleasure to extend its platform and services to a new set of users and to Elmer as a new software package. Through the responses and results we were able to further improve our platform and to gain additional experience on the performance and scalability of AWS cloud resources, particularly for the Elmer software.

CONCLUSIONS AND RECOMMENDATIONS
The main lesson learned at Biscarri Consultoria SL from our participation in HPC Experiment Round 2 is that collaborative work through the Internet, using online resources like cloud computing hardware, open source software such as Elmer and CAELinux, and middleware platforms like CloudBroker, is a very interesting alternative to in-house calculation servers. A backbone network such as 10 Gbit Ethernet connecting the computational nodes of a cloud computing platform seems not to be suitable for computational mechanics calculations that need to run on more than one large AWS Cluster Compute node in parallel.
The need for network bandwidth in the solution of the strongly coupled equations involved in such simulations makes faster interconnects such as InfiniBand necessary to achieve time savings when running in parallel on more than a single AWS Cluster Compute instance with 16 cores. For CloudBroker, HPC Experiment Round 2 has provided another proof of its methodology, which combines its automated web application platform with remote consulting and support in an agile fashion. The CloudBroker Platform could easily work with CAELinux and the Elmer software at AWS. User requirements and test outcomes even resulted in additional improvements, which are now available to all platform users. On the other hand, this round has shown again that there are still needs to be fulfilled by dynamic cloud providers such as AWS regarding highly scalable parallel HPC resources, for example a reduction of latency and improvement of throughput (i.e., by using InfiniBand instead of 10 Gbit Ethernet). Their cloud infrastructure is currently best suited for loosely or embarrassingly parallel jobs such as parameter sweeps, or for highly coupled parallel jobs limited to single big machines. Finally, despite online tools, the effort necessary for a project involving several partners like this one should not be underestimated. CloudBroker expects, though, that in the future more software like Elmer can be offered directly through its platform in an already optimized way, making usage more efficient.

Case Study Authors - Lluís M. Biscarri, Pierre Lafortune, Wibke Sudholt, Nicola Fantini, Joël Cugnoni, and Peter Råback.
Team 34: Analysis of Vertical and Horizontal Wind Turbines

USE CASE
The goal was to optimize the design of wind turbines using numerical simulations. The case of vertical axis turbines is particularly interesting, since the upwind turbine blades create vortices that interact with the blades downstream. The full influence of this can only be understood using transient flow simulations, requiring large models to run for a long time.

CHALLENGES
In order to test the performance of a particular wind turbine design, a transient simulation had to be performed for each wind speed and each rotational velocity. This led to a large number of very long simulations, even though each model might not be very large. Since the different wind speeds and rotational velocities were independent, the computations could be trivially distributed on a cluster or in the cloud.

Figure 1: 2D simulation of a rotating vertical wind turbine.
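Because each (wind speed, rotational velocity) pair is independent, the sweep can be farmed out as separate batch jobs; a minimal sketch, where the solver command, wind speeds and rotor speeds are hypothetical placeholders:

```shell
# Generate one job script per (wind speed, rotational velocity) pair.
# The solver invocation inside each script is a placeholder.
mkdir -p jobs
for v in 6 9 12; do              # wind speeds in m/s (assumed)
  for rpm in 30 60 90 120; do    # rotor speeds in rpm (assumed)
    cat > "jobs/turbine_v${v}_rpm${rpm}.sh" <<EOF
#!/bin/bash
# Placeholder solver call for one operating point of the turbine
run_cfd_solver --wind-speed ${v} --rotor-rpm ${rpm}
EOF
  done
done
n=$(ls jobs | wc -l | tr -d ' ')
echo "generated $n independent jobs"   # prints: generated 12 independent jobs
```

Each generated script can then be handed to the cluster's batch scheduler, so all operating points run concurrently rather than back to back.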
Another important use of HPC and cloud computing for wind power is parametric optimization. Again, if the efficiency of the turbine is used as the target function, very long transient simulations have to be performed to evaluate every configuration.

BENEFITS
The massive computing power required to optimize a wind turbine is typically not available locally. Since only some steps of the design require HPC, and an on-site cluster would never be fully utilized, cloud computing offers an obvious solution.

CONCLUSIONS AND RECOMMENDATIONS
The problem with cloud computing for simulations using commercial tools is that the number of licenses is typically the bottleneck. Obviously, having a large number of cores does not help if there are not enough parallel licenses. In our case, a number of test licenses were provided by ANSYS, which was very helpful. It is not practical to transfer data back and forth between the cluster and a local workstation. Therefore, any HPC facility needs to provide remote access for interactive use. Unfortunately, this was not available in our case. A test performed on the Penguin cluster showed an 8% increase in speed (per core) compared with our local Windows cluster. This speedup was surprisingly small, given that Penguin uses a newer generation of CPUs with a much better theoretical floating-point performance. This again demonstrates that simulations on an unstructured grid are bandwidth limited. To conclude, cloud computing would be an excellent option for these kinds of simulations if the HPC provider offered remote visualization and access to the required software licenses.

Figure 2: CFD simulation of a vertical wind turbine with 3 helical rotors.

Case Study Author - Juan Enriquez Paraled
Team 36: Advanced Combustion Modeling for Diesel Engines

USE CASE
Modeling combustion in Diesel engines with CFD is a challenging task. The physical phenomena occurring in the short combustion cycle are not fully understood. This especially applies to the liquid spray injection, the auto-ignition and flame development, and the formation of undesired emissions like NOx, CO and soot. Dacolt has developed an advanced combustion model named Dacolt PSR+PDF, specifically meant to address these types of challenging cases where combustion-initiating chemistry plays a large role. The Dacolt PSR+PDF model has been implemented in ANSYS Fluent and was validated on an academic test case (SAE paper). An IC engine validation case is the next step, tackled in the context of the HPC Experiment in the Penguin Computing HPC cloud.

Simulation result showing the flame (red) located on top of the evaporating fuel spray (light blue in the center)

CHALLENGES
Current challenges for the end user operating with just his in-house resources include the fact that the computational resources needed for these simulations are significant (i.e. more than 16 CPUs and one to three days of continuous running).

BENEFITS
The benefit for the end user of using remote resources was that remote clusters allow small companies to conduct simulations that previously were only possible for large companies and government labs. End-user findings on the provided cloud access include:
Startup:
o POD environment setup went smoothly
o ANSYS software installation and licensing as well
System:
o POD system OS comparable to OS used at Dacolt
o ANSYS Fluent version same as used at Dacolt
Running:
o Getting used to POD job scheduling
o No portability issues of the CFD model in general
o Some MPI issues related to Dacolt's User Defined Functions (UDFs)
o Solver crash during injection + combustion phase, to be investigated
Overall, we experienced easy-to-use ssh access to the POD cluster. The environment and software setup went smoothly with collaboration between POD and ANSYS. The remote environment, which nearly equaled the Dacolt environment, provided a head start. The main issue encountered: the uploaded Dacolt UDF library for Fluent did not work in parallel out of the box. It is likely the Dacolt User Defined Functions would have to be recompiled on the remote system.

Project results
An IC engine case was successfully run until solver divergence, to be reviewed by Dacolt with ANSYS support. Dacolt model validation seems promising.

Anticipated challenges included:
o Account setup and end-user access
o Configuring the end user's CFD environment with ANSYS Fluent v14.5
o Educating the end user in using the batch queuing system
o Getting data in and out of the POD cloud
Actual barriers encountered:
o Running end-user UDFs with Fluent in parallel gave MPI problems

CONCLUSIONS AND RECOMMENDATIONS
o Use of POD remote HPC resources worked well with ANSYS Fluent
o Although the local and remote systems were quite comparable in terms of OS, etc., subsystems like MPI may not work out of the box
o Local and remote network bandwidth was good enough for data transfer, but not for tunneling CAE graphics using X
o Future use of remote HPC resources depends on the availability of pay-as-you-go commercial CFD licensing schemes

Case Study Author - Ferry Tap
Team 40: Simulation of Spatial Hearing (Round 2)

USE CASE
A sound emitted by an audio device is perceived by the user of the device. The human perception of sound is, however, a personal experience. For example, spatial hearing (the capability to distinguish the direction of sound) depends on the individual shape of the torso, head and pinna, described by the so-called head-related transfer function (HRTF). To produce directional sounds via headphones, one needs to use HRTF filters that model sound propagation in the vicinity of the ear. These filters can be generated using computer simulations, but, to date, the computational challenges of simulating the HRTFs have been enormous due to: the need for a detailed geometry of the head and torso; the large number of frequency steps needed to cover the audible frequency range; and the need for a dense set of observation points to cover the full 3D space surrounding the listener. In this project, we investigated the fast generation of HRTFs using simulations in the cloud. The simulation method relied on an extremely fast boundary element solver, which is scalable to a large number of CPUs. The process for developing filters for 3D audio is long, but the simulation work of this study constitutes a crucial part of the development chain. In the first phase, a sufficient number of 3D head-and-torso geometries needed to be generated. A laser-scanned geometry of a commercially available test dummy was used in these simulations. Next, acoustic simulations to characterize the acoustic field surrounding the head-and-torso were performed. This was our task in the HPC Experiment. The Round 2 simulations focused on the effect of the acoustic impedance of the test dummy on the HRTFs. Finally, the filters were generated from the simulated data and they will be evaluated by a listening test.
The final part was done by the end user. Simulations were run via Kuava's Waveller Cloud simulation tool using the system described below. The number of concurrent instances ranged between 6 and 20.

Service: Amazon Elastic Compute Cloud
Total CPU hour usage: 371h
Type: High-CPU Extra Large Instance
High-CPU Extra Large Instance:
- 7 GiB of memory
- 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
- 1690 GB of instance storage
- 64-bit platform
- I/O Performance: High
One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor; this is also equivalent to an early-2006 1.7 GHz Xeon processor.

The man-hours accumulated during the experiment included Kuava (50h) and the end user (5h). Total CPU hour usage during the experiment was 371h using the High-CPU Extra Large Instance.

CHALLENGES
Our main challenge was to develop interactive visualization tools for simulation data stored in the cloud.

BENEFITS
The main benefit resulted from the flexible resource allocation, which is necessary for efficient acoustic simulations. That is, a large number of instances can be obtained for a short period of time. Other benefits included not having to invest in our own computing capacity. Especially in audio simulations, the capacity is needed in short bursts for fast simulation turnaround times, and the time between the simulation bursts, while the next simulation is planned (i.e., when no computational capacity is needed), is significant.

CONCLUSIONS AND RECOMMENDATIONS
The main lessons learned during Round 2 were related to using CPU optimization when compiling the code for cloud simulations. We observed that Amazon did not support all optimization features, even though the optimization should be available in the instances used for simulations. The problems were solved (with the kind help of Amazon support) by disabling some of the optimizations when compiling the code.

Fig. 1 - Simulation model (an acoustic test dummy). The dots indicate all locations of monopole sound sources that were used in the simulations. The red dots are the sound sources used in this image. The figure in the middle shows the sound pressure level (SPL) in the left ear as a function of the sound direction and the frequency. On the right, the SPL relative to sound sources in the far-field is shown.

Case Study Author - Tomi Huttunen
Team 44: CFD Simulation of Drifting Snow

USE CASE
Binkz Inc. is a Canadian-based CFD consultancy firm with fewer than five employees, active in the areas of aerospace, automotive, environmental and wind engineering, as well as naval hydrodynamics and process technologies. For Binkz's consultancy activities, simulation of drifting snow is necessary in order to predict the redistribution of accumulated snow by the wind around arbitrary structures. Such computations can be used to determine the snow load design parameters of rooftops, which are not properly addressed by building codes at present. Other applications can be found in hydrology and avalanche effects mitigation. Realistic simulation of snow drift requires a 3D two-phase fully-coupled CFD model that easily takes several months of computing time on a powerful workstation (~16 cores), with memory requirements that can exceed 100GB in some cases; hence the need for computing clusters to reduce the computing time. The pay-per-use model of the cloud paradigm could be ideal for a small consultancy firm, reducing the fixed costs of acquiring and maintaining a computing cluster and allowing the direct billing of the computing resources in each project. The snowdrift simulations were performed with a customized OpenFOAM two-phase solver. OpenFOAM is a free, open source CFD software package developed by OpenCFD Ltd at ESI Group and distributed by the OpenFOAM Foundation. It has a large user base across most areas of engineering and science, from both commercial and academic organizations. The input data consisted of a computational mesh of several million cells and a number of ASCII input files to provide the physical and numerical parameters of the simulation.
The output data consisted of several files containing the values of the velocity, pressure, volume fraction and turbulence variables for each of the air and snow phases, in every computational cell and for each required flow time. These were used to generate snapshots of the flow field and drifting snow, as well as values of snow loads where required on and around the structure being analyzed.

End-to-end process:
The project definition was agreed upon in an online meeting between the team expert and the end user, and SDSC's compute cluster Triton was selected as the hardware resource to fulfill the large memory demands (~100GB RAM) and fast interconnect required for good scalability. An initial budget of 1,000 core hours was assigned to the project. OpenFOAM was downloaded into
the home directories. An initial attempt to build the solver with the PGI compilers was unsuccessful. Building with the Intel compilers was successful, but subsequent computational tests ended in segmentation faults never observed on other platforms. As a last-ditch effort a final build was done with the gcc compiler, the OpenFOAM native compiler, albeit with several non-optimal fixes to make sure the build was available in time to get some tests done before the project deadline. At that point about 40% of the allocated CPU time had been spent. Limited speedup tests were done with the gcc build due to the scarcity of time and resources left. The speedup tests showed the expected scalability behavior, with one anomalous occurrence never before observed on other platforms. A thorough investigation of the anomaly was considered outside the context of the Experiment, considering the non-optimal nature of the gcc build.

Efforts invested:
- Triton support: <10 hours on build attempts, system configuration, and tracking of the build.
- End user: more than 100 hours in build attempts, solver and test case setup, testing the builds and analyzing the test results.
- Team expert: basic support, reporting and overall experiment management.
- Resources: ~900 core hours for building the software, testing the builds and performing initial tests for running large jobs.

CHALLENGES
The main challenge during the setup of the configuration was getting a successful build of OpenFOAM on the hardware resource. The main challenge during test execution was scheduling a test MPI simulation job requiring several parallel compute nodes on queues occupied by a high number of serial runs by other users prioritized in the queuing system. This resulted in deployment waiting times that were not acceptable in the workflow of the end user.
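For reference, the gcc build path that finally succeeded follows OpenFOAM's standard source-build procedure; a rough sketch written out as a helper script, where the source-tree path and version are hypothetical, and on Triton several non-optimal fixes were still needed on top of this:

```shell
# Write a sketch of the standard OpenFOAM gcc source build.
# The source-tree path/version below is an assumption for illustration.
cat > build_openfoam_gcc.sh <<'EOF'
#!/bin/bash
cd "$HOME/OpenFOAM/OpenFOAM-2.1.x" || exit 1
export WM_COMPILER=Gcc          # select the native gcc toolchain
source etc/bashrc               # load the OpenFOAM build environment
# Build everything, keeping a log so failed build steps can be traced
./Allwmake > log.Allwmake 2>&1
EOF
chmod +x build_openfoam_gcc.sh
echo "wrote helper script build_openfoam_gcc.sh"
```

Switching compilers amounts to changing `WM_COMPILER` and rebuilding, which is why the PGI and Intel attempts could be retried relatively cheaply before falling back to gcc.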
There exist, however, other queues on Triton that could provide better prioritization and response times, but they were not tested due to the limited time frame of the experiment.

BENEFITS
The first benefit of the experiment was the learning experience of building OpenFOAM with different compilers on different platforms. Past experience in compiling OpenFOAM on other CentOS systems led us to believe this would not be a problem on Triton. Unfortunately, it was, and in the future one should make sure in advance that an optimized OpenFOAM build exists on the target resource, or the project plan should anticipate the time and labor required to obtain a good build. In this experiment, SDSC had agreed to provide computing time only, but even so support staff committed a significant amount of their own time to assist with the OpenFOAM build. Given enough time it is certain that the Triton support staff would have managed to provide optimal builds of OpenFOAM with all tested compilers. Another lesson from the experiment is that, apart from a well-fitting hardware platform (as Triton would be), it is also important for production jobs to be launched on appropriate MPI queues that do not allow high numbers of smaller serial jobs to delay large parallel MPI jobs.

CONCLUSIONS AND RECOMMENDATIONS
OpenFOAM (or more generally, a large open-source software package such as OpenFOAM) is best built on the platform it will run on. OpenFOAM is most easily built with the third-party software as provided within the distribution. For the application of snow drift simulations, running on a public/academic resource using a standard (i.e. non-prioritized) account yields unpredictable waiting times and significant computing delays when running concurrently with a high number of serial runs by other users.
In another experiment round, we would recommend testing an alternative platform/queue with a different capacity, user base, or job queuing system that is a better fit for the end user's workflow.
Fig. 1 - Close-up of the building model with simplified roof structure. The structured mesh is 1.25 million hexahedral cells.
Case Study Authors - Ziad Boutanios and Koos Huijssen
Team 46: CAE Simulation of Water Flow Around a Ship Hull
The results of the simulation, performed in a wide range of towing speeds on a grid with about 1 million computational cells, showed good agreement with the experimental data.
USE CASE
The goal of this project was to run CAE simulations of water flow around the hull of a ship much more quickly than was possible using available resources. Current simulations took a long time to compute, limiting the usefulness and usability of CAE for this problem. For instance, on the resources currently available to the end user, a simulation of seconds of real-time water flow took two to three weeks of computational time. We decided to run the existing software on an HPC resource to realize whatever runtime gains might be achieved by using larger amounts of computing resources.
Application software requirements
This project required the TESIS FlowVision 3.08.xx software. FlowVision is already parallelized using MPI, so we expected it to be able to utilize the HPC resources. However, it does require the ability to connect to the software from a remote location while the software is running, in order to access the software licenses and steer the computation. For the license keys, see the description in FlowVision installation and preferences on Calendula Cluster.docx.
MEET THE TEAM
Digital Marine Technology
Fig. 1 - Wave pattern around the ship hull
Custom code or configuration of end-user
This project necessitated upgrading the operating system on the HPC system to support some libraries required by the FlowVision software. The Linux version on the system was not recent enough, and one of the system's main components, the glibc libraries, was not the correct version. We also had to open a number of ports to enable the software to connect to and from specific external machines (specified by their IP address).
Computing Resource:
Resource requirement from the end-user: about 8-16 nodes of the HPC machine, used for 5 runs of 24 hours each.
Resource details: There are two processors on each node, Intel Xeon 3.00GHz, with 4 real cores per processor, so each compute node has 8 real cores. Each node also has 16GB of memory, two 1Gb Ethernet cards, and one Mellanox InfiniBand card. This experiment had been assigned 32 nodes (so 256 cores) to use for simulations.
How to request resources: To get access to the resources you contact the resource provider. They provide an account quickly (in around a day).
How to access resources: The front end of the resource is accessed using ssh; you will need an account on the system to do this, using a command such as this: ssh -X [email protected] -p 2222
Once you have logged into the system, you can run jobs using the Open Grid Scheduler/Grid Engine batch system. To use that system you need to submit a job script using the qsub command.
CHALLENGES
Current simulations take a long time to compute, limiting the usefulness and usability of the CAE approach for this problem. For instance, on the resources currently available to the end user, a simulation of seconds of real-time water flow takes two to three weeks of computational time. To improve this time to solution we need access to larger computational resources than we currently have available.
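The submission step described above can be illustrated with a minimal Grid Engine job script. The job name, parallel environment name, core count, wall-clock limit, and solver path below are all hypothetical, not the team's actual configuration:

```shell
# Write a minimal SGE/Grid Engine job script; all values are illustrative.
cat > flowvision_job.sh <<'EOF'
#!/bin/bash
#$ -N flowvision_hull
#$ -pe mpi 64
#$ -cwd
#$ -l h_rt=24:00:00
mpirun -np $NSLOTS /opt/flowvision/bin/solver project.fvproj
EOF
# On the cluster front end this would then be submitted with:
#   qsub flowvision_job.sh
echo "job script written"
```

The `$NSLOTS` variable is filled in by Grid Engine with the number of slots granted by the `-pe` request, so the same script works unchanged for different core counts.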
Scientific Challenge
Simulation of the viscous flow around the hull of a ship with a free surface was performed. The object of research was the hull of a river-sea dry-cargo vessel with an extremely high block coefficient (Cb = 0.9). The hull flow included complex phenomena, e.g. the wave pattern on the free surface and fully developed turbulent flow in the boundary layer. The main purpose of the simulation was the determination of towing resistance. In general, the dependence of towing resistance on the speed of the ship is used for the prime mover's power prediction at the design stage. The present case considered a test example for which reliable experimental data exist. In contrast to the conventional method of model tests, the methods of CFD simulation have not been fully studied regarding the reliability of the results, as well as the computational resources and time costs, etc. For these reasons, the computational grid formation and the scalability of the solution were the focus of this research.
Resources
FCSCL, the Foundation of Supercomputing Center of Castile and León, Spain, provided HPC resources in the form of a system of 288 HP blade nodes with 8 cores and 16GB RAM per node.
Software
FlowVision is a new-generation multi-purpose simulation system for solving practical CFD (computational fluid dynamics) problems. The modern C++ implementation offers modularity and flexibility that allows addressing the most complex CFD areas. A unique approach to grid generation (geometry-fitted sub-grid resolution) provides a natural link with CAD geometry and FE meshes. The ABAQUS integration through the Multi-Physics (MP) Manager supports the most complex fluid-structure interaction (FSI) simulations (e.g., hydroplaning of automotive tires). FlowVision integrates 3D partial differential equations (PDEs) describing different flows, viz., the mass, momentum (Navier-Stokes), and energy conservation equations. The system of governing equations is completed by state equations.
If the flow is coupled with physical-chemical processes like turbulence, free surface evolution, combustion, etc., the corresponding PDEs are added to the basic equations. Altogether, the PDEs, state equations, and closure correlations (e.g., wall functions) constitute the mathematical model of the flow. FlowVision is based on the finite-volume approach to discretization of the governing equations. An implicit velocity-pressure split algorithm is used for integration of the Navier-Stokes equations. FlowVision is integrated CFD software: its pre-processor, solver, and post-processor are combined into one system. A user sets the flow model(s), physical and method parameters, initial and boundary conditions, etc. (pre-processor), performs and controls calculations (solver), and visualizes the results (post-processor) in the same window. The user can stop the calculations at any time to change the required parameters, and continue or recommence the calculations.
Additional Challenges
This project continued from the first round of the cloud experiment. In the first round we faced the challenge that
the end user for this project had a particular piece of commercial simulation software they needed to use for this work. The software required a number of ports to be open from the front end of the HPC system to the end user's machines, both for accessing the licenses for the software and to enable visualization, computational steering, and job preparation for the simulations. There were a number of issues to be resolved to enable these ports to be opened, including security issues for the resource provider (requiring the open ports to be restricted to a single IP address or small range of IP addresses), and educating the end user about the configuration of the HPC system (with front-end and back-end resources and a batch system to access the main back-end resources). These issues were successfully tackled. However, another issue was encountered: the Linux version of the operating system on the HPC resources was not recent enough, and one of the system's main components, the glibc libraries, was not the required version for the commercial software to be run. The resource provider was willing to upgrade the glibc libraries to the required version; however, this impacted another team during the first round. At the start of this second round of the experiment this problem was resolved so simulations could be undertaken.
Outcome
The dependence of the towing resistance on the resolution of the computational grid (grid convergence) was investigated. The results show that grid convergence becomes good when grids with more than 1 million computational cells are used. The results of the simulation, performed in a wide range of towing speeds (Froude numbers) on a grid with about 1 million computational cells, showed good agreement with the experimental data. CFD calculations were performed at full scale. The experimental results were obtained in the deepwater towing tank of the Krylov State Research Centre (model scale is 1:18.7).
The full-scale CFD results were compared to the recalculated results of the model test. The maximum error in the towing resistance of the hull reached only 2.5%.
Fig. 2 - Grid convergence, speed 12.5 knots
Fig. 3 - Comparison of the CFD and experimental data in dimensionless form (residual resistance coefficient versus Froude number)
Visualization of the free surface demonstrated the wave pattern, which is in good correspondence with the photos of the model tests. High-quality visualization of other flow characteristics was also available.
Fig. 4 - Free surface CFD, speed 13 knots (Fn = 0.182)
Fig. 5 - Pressure distribution on the hull surface (scale in Pa)
Fig. 6 - Shear stress distribution on the hull surface (scale in Pa)
Fig. 7 - Scalability test results
CONCLUSIONS AND RECOMMENDATIONS
Using HPC clouds offers users incredible access to supercomputer resources. CFD users, with the help of commercial software, can greatly speed up their simulation of hard industrial problems. Nevertheless, existing access to these resources has the following drawbacks:
1. Commercial software must first be installed on the remote supercomputer.
2. It is necessary to provide the license for the software, or to connect to a remote license server.
3. The user can face many problems during the installation process: e.g., incompatibility of the software with the operating system, and incompatibility of additional 3rd-party software like MPI, TBB libraries, etc.
4. All these steps require that the user be in contact with the software vendor or cluster administrator for technical support.
From our point of view, it is necessary to overcome all these problems in order to use commercial software on HPC clouds. Commercial software packages used for simulation often have licensing and operational requirements that mean either the resources they run on need to access external machines, or software needs to be installed locally to handle licenses, etc. New users of HPC resources often require education in the general setup and use of such systems (e.g., the fact that you generally access the computational resources through a batch system rather than logging on directly). Basecamp has been useful for enabling communication between the project partners, sharing information, and ensuring that one person does not hold up the whole project. Communication between the client side and the solver side of modern CAE systems ordinarily uses network protocols. Thus the organization of work over the SSH protocol requires additional operations, including port forwarding and data translation.
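The port forwarding mentioned above can be sketched with a small helper script. The host names and port numbers below are hypothetical examples, not the actual project configuration:

```shell
# Write a helper that opens the tunnels a client/solver CAE split may need.
# Hosts and ports below are illustrative assumptions.
cat > tunnel.sh <<'EOF'
#!/bin/bash
# -L: forward a local port to the solver service on the cluster front end.
# -R: reverse-forward so the remote solver can reach a local license server.
# -N: open tunnels only, run no remote command.
ssh -N \
    -L 10001:localhost:10001 \
    -R 27000:localhost:27000 \
    -p 2222 user@frontend.example.org
EOF
chmod +x tunnel.sh
echo "tunnel helper written"
```

With tunnels like these in place, the client software talks to localhost as if the solver and license server were on the local network, which is the "data translation" burden the text refers to.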
On the other hand, when properly configured, the client interface is able to manage the solving in the same manner as in a local network.
Case Study Authors - Adrian Jackson, Jesus Lorenzana, Andrew Pechenyuk, and Andrey Aksenov.
Team 47: Heavy Duty Abaqus Structural Analysis using HPC in the Cloud Round 2
The major challenge, now widely accepted to be the most critical, was the end user perception and acceptance of the cloud as a smooth part of the workflow.
USE CASE
In Round 1 of the HPC Cloud experiment, the team established that computational use cases could indeed be submitted successfully using the cloud API and infrastructure. The objective of this Round 2 was to explore the following: How can the end user experience be improved? For example, how could the post-processing of HPC CAE results kept in the cloud be viewed on a remote desktop? Was there any impact of the security layer on the end user experience? The end-to-end process remains widely dispersed; end user demand was tested in two different geographic areas, the continental USA and Europe. The network bandwidth and latency were expected to play a major role, since they impact the workflow and user perception of the ability to deliver cloud HPC capability not in the compute, but in the pixel manipulation domain. Here is an example of the workflow:
1. Once the job finishes, the end user receives a notification; the results files remain at the cloud facility, i.e. they are NOT transferred back to
the end user's workstation for post-processing.
2. The post-processing is done using a remote desktop tool, in this case the NICE Software DCV infrastructure layer on the HPC provider's visualization node(s).
Typical network transfer sizes (upstream and downstream) were expected to be modest, and it is this impact that we hoped to measure, thus making them tunable. This represented the major component of the end user experience. The team also expanded by almost 100% to bring in more expertise and support, to tackle the last stage of the whole process and make the end user experience adjustable depending on several network-layer-related factors.
CHALLENGE
The major challenge, now widely accepted to be the most critical, was the end user perception and acceptance of the cloud as a smooth part of the workflow. Here remote visualization was necessary to see if the simulation results (left remotely in the cloud) could be viewed and manipulated as if they were local on the end user desktop. To contrast with Round 1, and to bring real network expertise to bear on this aspect, NICE's DCV was chosen to help deliver this, as it is:
Application neutral
Has a clean and separate client (free) and server component
Provides some tuning parameters which can help overcome bandwidth issues
Several tests were conducted and carefully iterated, varying parameters such as image update rate, bandwidth selection, codecs, etc. A screen shot is shown below for the final successful user acceptance of remote visualization settings.
TABLE 1. CAST-IN-PLACE MECHANICAL ANCHOR CONCRETE ANCHORAGE PULLOUT CAPACITY ANALYSIS (FEA STATS)
Materials: Steel & Concrete
Procedure: 3D Nonlinear Contact, Fracture & Damage Analysis
Number of Elements: 1,626,338
Number of DOF: 1,937,301
Solver: ABAQUS/Explicit in Parallel
Solving Time: 11.5 hours on a 32-core Linux Cluster
ODB Result File Size: 2.9 GB
Fig 1. - Typical end user screen manipulation(s)
Fig 2. - The post-processing underlying infrastructure (cloud end): DCV layer setup
Fig 3. - The post-processing underlying infrastructure (end user space): DCV-enabled post-processing (end user view)
Setup
We made a number of end user trials. First the DCV was installed, with both a Windows and a Linux client. Next a portal window was opened, usually at the same time as the end user trial, to observe the demand on the serving infrastructure (see diagram). This ensured that there was sufficient bandwidth and capacity at the cloud end. The end node was hosting an NVIDIA graphics accelerator card; an initial concern was whether the card version was supported or had an impact. DCV has the ability to apply a sliding scale of pixel compression, which involves skipping certain frames in order to keep the flow smooth.
Fig 4. - Ingress/egress test results/profile
Figure 4 shows that the cloud Internet measurements peaked at 12 Mbits/sec, but generally hovered at or below 8 Mbits/sec for this particular session. This profile graph is a good representation of what has been seen in the past on DCV sessions. The red line (2 Mbits/sec) is where a consistent end user experience for this particular graphic size was observed.
CONCLUSIONS AND RECOMMENDATIONS
Here is a summary of the key results found during our Round 2 experiment:
End point Internet bandwidth variability: Depending on when it is conducted, a vendor-neutral test applet result ranges from 1 Mbps to 10 Mbps. The pipe bandwidth was expected to be 20 Mbits/sec, but when it was shared by the office site running normal enterprise applications such as Exchange Server, Citrix, etc., such variation was not conducive to a qualitative end user experience.
Switching to another pipe (with burst mode of 50 to 100 Mbits/sec): More testing showed that the connection was not stable, and ABAQUS/Viewer graphics window freezes were experienced after being idle for a while. This required local IT to troubleshoot the issue.
There were no significant differences between Windows- and Linux-hosted platforms. The NICE DCV/EnginFrame is a good platform for remote visualization if stable Internet bandwidth is available.
Some of the parameters for the connection performance:
o VNC connection line-speed estimate: 1 ~ 6 Mbps, RTT ~ 62 ms
o DCV bandwidth usage: AVG 100 KiB ~ 1 MiB
o DCV frame rate: 0~10 FPS; >5 FPS acceptable, >10 FPS smooth
We tried both Linux and Windows desktops. Because of the bandwidth randomness and variability, it was not possible to create a good baseline to compare the performance of the two desktops. The graphics cards did not have any impact on the end user experience. However, the model size and graphic image pixel size perhaps play a major role, and the current experiment did not have enough time to study and characterize this issue. The ABAQUS model used in this test case does not put much demand on the graphics card; we've seen only 2% usage on the card. There was usually sufficient network capacity and bandwidth at the cloud serving end. The last-mile delivery capability at the end user site was the most important, and perhaps the only determining, factor influencing the end user experience and perception. Beyond the cloud service provider, a local or end user IT support person with network savvy is perhaps a necessary part of the infrastructure team in order to deliver robust and repeatable post-processing visual delivery. This incurs a cost. The security aspect could not be tested, as the time and effort required exceeded what was available in the time allotted. Part of the end user experience learned from Round 1 was to better document the setup, which can be found in the Appendix and clearly shows a smooth and easy-to-follow flow.
Major single conclusion and recommendation
Any site that wishes to benefit from this experience needs to prioritize the last mile issue.
End User Experience Observations & Data Tables: Bandwidth Usage
Note: Image Quality: Specify the quality level of dynamic images when using TCP connections. Higher values correspond to higher image quality and more data transfer. Lower values reduce quality and reduce bandwidth usage.
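The interplay of image pixel size and frame rate noted above can be put into a rough back-of-the-envelope form. Every number below is an illustrative assumption, not a measured DCV figure:

```shell
# Crude bandwidth estimate for a remote visualization session.
WIDTH=1280; HEIGHT=1024; BPP=3   # frame dimensions and bytes per pixel (assumed)
FPS=5                            # target frame rate; >5 FPS was deemed acceptable
RATIO=50                         # assumed effective codec compression ratio
BYTES_PER_SEC=$(( WIDTH * HEIGHT * BPP * FPS / RATIO ))
MBITS=$(( BYTES_PER_SEC * 8 / 1000000 ))
echo "approximately ${MBITS} Mbit/s needed"   # prints "approximately 3 Mbit/s needed"
```

With these assumptions the estimate lands at about 3 Mbit/s, the same order of magnitude as the 2 Mbits/sec consistent-experience line observed in Figure 4; doubling the frame dimensions or the frame rate pushes the requirement well past a shared office pipe.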
Network latency for round trip from the DCV remote visualization server:
Ping statistics for : Packets: Sent = 4, Received = 3, Lost = 1 (25% loss)
Approximate round trip times in milliseconds: Min = 56ms, Max = 58ms, Average = 56ms
Case Study Authors - Frank Ding, Matt Dunbar, Steve Hebert, Rob Sherrard and Sharan Kalwani.
Team 52: High-Resolution Computer Simulations of Blow-off in Combustion Systems
Remote clusters allow small companies to conduct simulations that were previously only possible for large companies and government labs.
USE CASE
The undesired blow-off of turbulent flames in combustion devices can be a very serious safety hazard. Hence, it is of interest to study how flames blow off. Simulations offer an attractive way to do this. However, due to the multi-scale nature of turbulent flames, and the fact that the simulations are unsteady, these simulations require significant computer resources. This makes the use of large, remote computational resources extremely useful. In this project, a canonical test problem of a turbulent premixed flame is simulated with OpenFOAM and run on extremefactory.com.
MEET THE TEAM
Computing resource requirements
At least 40 cores.
CHALLENGES
The current challenge for the end-user (with just his in-house resources) is that the computational resources needed for these simulations are significant (i.e., more than 100 CPUs and 1-3 days of continuous running).
BENEFITS
Remote clusters allow small companies to conduct simulations that were previously only possible for large companies and government labs.
Fig. 1 - Schematic of the bluff-body flame holder experiment. Sketch of the Volvo case: a premixed mixture of air and propane enters the left of a plane rectangular channel. A triangular cylinder is located at the center of the channel and serves as a flame holder.
Fig. 2 - Predicted temperature contour field for the Volvo case using OpenFOAM.
Application software requirements
OpenFOAM can handle this problem very well. It can be downloaded from:
Custom code or configuration of end-user
OpenFOAM input files are available at com/u/ /3dcoarse_125.tar.gz. These files were used in a 3D simulation that ran OpenFOAM (reactingFoam, to be precise) on 40 cores.
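The standard OpenFOAM workflow for such a 40-core run can be sketched as below. The case directory name and the availability guard are assumptions; the subdomain count would be set via numberOfSubdomains in system/decomposeParDict:

```shell
# Sketch of a parallel reactingFoam run; guarded so the solver commands
# only execute where OpenFOAM is actually installed.
run_case() {
    cd 3Dcoarse_125 || return 1          # hypothetical case directory name
    decomposePar                         # split the mesh into 40 processor dirs
    mpirun -np 40 reactingFoam -parallel > log.reactingFoam 2>&1
    reconstructPar                       # merge processor results for post-processing
}
if command -v reactingFoam >/dev/null 2>&1; then
    run_case
else
    echo "OpenFOAM not available; commands shown for reference"
fi
```

The decomposePar/mpirun/reconstructPar sequence is the same whether the case runs on a workstation or a cloud cluster, which is what made moving this workload to a remote resource straightforward.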
To get an idea of how to run these files, have a look at the section Run in parallel in: sites.google.com/site/estebandgj/openfoam-training
CONCLUSIONS
Running reactingFoam for a simulation of a bluff-body-stabilized premixed flame requires a mesh of less than a quarter million cells. This is not much, but the simulations need to run for a long time, and are part of a parametric study that needs more than 100 combinations of parameters. Running one or two huge simulations is not the goal here. The web interface was easy to use, so much easier than running in Amazon's EC2 that I did not even read the instructions and was able to properly run OpenFOAM. Nonetheless, it was not very clear how to download all the data once the simulation ended. In addition, the simulation ran satisfactorily. There were some errors at the end, but these were expected. The team has one suggestion: a key advantage of using OpenFOAM is that it allows us to tailor OpenFOAM applications to different problems. This requires making some changes in the code and compiling with wmake, something that can be done in Amazon's EC2. It is not clear how this can be done with the present interface. A future test might be to run myreactingfoam instead of reactingFoam.
Case Study Author - Ferry Tap
Team 53: Understanding Fluid Flow in Microchannels
The experiment serves as a proof of concept of applying a user-oriented computational federation to solve large-scale computational problems in engineering.
USE CASE
Problem Description
The end-user developed a parallel MPI solver for the Navier-Stokes equations. With this solver, the end-user can simulate the flow in a microchannel with an obstacle for a single configuration of the fluid speed, the microchannel size and the obstacle geometry (see Figure 1). A single simulation typically requires hundreds of CPU-hours.
MEET THE TEAM
Fig. 1 - Example flow in a microchannel with a pillar. Four variables characterize the simulation: channel height, pillar location, pillar diameter, and Reynolds number.
The end-user sought to construct a phase diagram of possible fluid flow behaviors to understand how input parameters affect the flow. Additionally, the end-user wanted to create a library of fluid flow patterns to enable analysis of their combinations. The problem has many significant applications in the context of medical diagnostics, bio-medical engineering, constructing structured materials, etc.
CHALLENGES
The problem was challenging for the end-user as it required thousands of MPI-based simulations, which collectively exceeded the computational throughput offered by any individual HPC machine. Although the end-user had access to several high-end HPC resources, executing thousands of simulations requires complex coordination and fault tolerance, which were not readily available. Finally, the simulations are highly heterogeneous, and their computational requirements were hard to estimate a priori, adding another layer of complexity.
The Solution
To tackle the problem, the team decided to use multiple federated heterogeneous HPC resources. The team proceeded in four stages:
1. A preparatory phase in which the HPC experts gained an understanding of the domain problem and formulated a detailed plan to solve it; this phase included a face-to-face meeting between the end-user and the HPC experts.
2. A software-hardware deployment stage in which the HPC experts deployed the end-user's software and implemented the required integration components. Here, minimal or no interaction with systems administrators was required, thanks to the flexibility of the CometCloud platform used in the experiment.
3. A computational phase in which the actual simulations were executed.
4. Data analysis, in which the output of the simulations was summarized and post-processed by the end-user.
The developed approach is based on the federation of distributed heterogeneous HPC resources, aggregated completely in user space. Each aggregated resource acts as a worker executing simulations. However, each resource can join or leave the federation at any point in time without interrupting the overall progress of the computations. Each aggregated resource also acts as temporary storage for the output data. The data is compressed on-the-fly and transferred using the rsync protocol to the central repository for simple, sequential post-processing. In general, the computational platform used in this experiment takes the concept of volunteer computing to the next level, in which desktops are replaced with HPC resources. As a result the end-user's application gains cloud-like capabilities. In addition to solving an important and urgent problem for the end-user, the experiment serves as a proof of concept of applying a user-oriented computational federation to solve large-scale computational problems in engineering.
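The compress-and-transfer step described above can be sketched as follows. The directory names are hypothetical, and a local directory stands in for the central repository; in production the rsync destination would be a remote host reached over ssh:

```shell
# Each worker compresses finished output on the fly, ships it to the
# central repository, then frees its local scratch space.
# All names below are illustrative assumptions.
SIM_DIR=sim_00427                 # one simulation's output directory
REPO=/tmp/central_repo            # stand-in for the central repository
mkdir -p "$SIM_DIR" "$REPO"
echo "u v p" > "$SIM_DIR/fields.dat"     # stand-in output data
tar czf "${SIM_DIR}.tar.gz" "$SIM_DIR"   # on-the-fly compression
rsync -a "${SIM_DIR}.tar.gz" "$REPO/"    # production: rsync -az -e ssh ... user@host:/path
rm -rf "$SIM_DIR" "${SIM_DIR}.tar.gz"    # free local scratch once transferred
```

Compressing before transfer and deleting after a successful rsync keeps each federated resource's scratch footprint small, which matters when 12,845 simulations produce hundreds of gigabytes of output.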
CONCLUSIONS AND RECOMMENDATIONS
Several observations emerged from the experiment:
A good understanding of the domain-specific details by the HPC experts was important to the smooth progress of the experiment.
Close collaboration with the end-user, including face-to-face meetings, was critical for the entire process.
Although it may at first seem counterintuitive, working within the limits set by different HPC centers, i.e. using only SSH access without special privileges, greatly simplified the development process. At the same time, maintaining a friendly relationship with the respective systems administrators helped to shorten the response time to address common operational issues.
General Challenges and Benefits of Using UberCloud
The main difficulty was obtaining a sufficient number of HPC resources that collectively would provide the throughput needed to solve the end-user's problem. This challenge was solved by interacting with several HPC centers, and then exploiting the elasticity offered by CometCloud to add extra resources during the experiment. For example, several machines were federated after the experiment had already been running for five days. UberCloud greatly simplified the process of obtaining computational resources. The ability to quickly contact various HPC providers was central to the success of the experiment. UberCloud provided a well-structured and organized environment to test new approaches for solving large-scale scientific and engineering problems. Following well-planned steps with clearly defined deadlines, as well as having a central message board and documents repository, greatly simplified and accelerated the development process.
Experiment Highlights
The main highlights of the experiment are summarized below:
10 different HPC resources from 3 countries federated using CometCloud
16 days, 12 hours, 59 minutes and 28 seconds of continuous execution
12,845 MPI-based flow simulations executed
2,897,390 core-hours consumed
400 GB of output data generated
The most comprehensive data to date on the effect of pillars on microfluidic channel flow gathered
Case Study Authors - Javier Diaz-Montes, Baskar Ganapathysubramanian, Manish Parashar, Ivan Rodero, Yu Xie, and Jaroslaw Zola.
Acknowledgments
This work is supported in part by the National Science Foundation (NSF) via grants number IIP and DMS (RDI2 group), and CAREER and PHY (Iowa State group). This project used resources provided by: the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the NSF grant number OCI ; FutureGrid, which is supported in part by the NSF grant number OCI ; and the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy (DOE) under the contract number DE-AC02-05CH. The authors would like to thank the SciCom research group at the Universidad de Castilla-La Mancha, Spain (UCLM) for providing access to Hermes, and the Distributed Computing research group at the Institute of High Performance Computing, Singapore (IHPC) for providing access to Libra. The authors would like to acknowledge the Consorzio Interuniversitario del Nord est Italiano Per il Calcolo Automatico, Italy (CINECA), Leibniz-Rechenzentrum, Germany (LRZ), Centro de Supercomputacion de Galicia, Spain (CESGA), and the National Institute for Computational Sciences (NICS) for their willingness to share their computational resources. The authors would like to thank Dr. Olga Wodo for discussion and help with the development of the simulation software, and Dr. Dino DiCarlo for discussions about the problem definition.
The authors express gratitude to all administrators of the systems used in this experiment, especially to Prentice Bisbal from RDI2 and Koji Tanaka from FutureGrid, for their efforts to minimize downtime of the computational resources and for their general support. The authors are grateful to Wolfgang Gentzsch and Burak Yenier for their overall support.
Team 54: Analysis of a Pool in a Desalinization Plant
MEET THE TEAM
The bottleneck in all CAE simulations using commercial software is the cost of the commercial CFD licenses.
USE CASE
Many areas in the world have no available fresh water even though they are located in coastal areas. As a result, in recent years a completely new industry has been created to treat seawater and transform it into tap water. This water transformation requires that the water be pumped into special equipment, which is very sensitive to cavitation. Therefore, a correct and precise water flow intake must be forecasted before building the installation. The CFD analysis of air-water applications using free surface modeling is highly complex. The computational mesh must correctly capture the fluid interface, and the number of iterations required to obtain a physically and numerically converged solution is very high. If both of these requirements are not met, the forecasted solution will not even be close to the real-world solution.
CHALLENGES
The end-user needed to obtain a physical solution in a short period of time, as the time to analyze the current design stage was limited. The time limitation mandated the use of remote HPC resources to meet the customer's time requirements. As usual, the main problem was the result data transfer between the end-user and the HPC resources. To overcome this problem, the end-user used the visualization software EnSight to look at the solution and obtain images and animations entirely over the Internet. The sections below provide an evaluation of the Gompute on-demand solution.
Remote Visualization
The end user categorized the Gompute VNC-based solution as excellent. It is possible to request a graphically accelerated node when starting programs with a GUI. This functionality substantially cuts virtual prototyping lead time, since all the data generated from a CAE simulation can be post-processed directly in Gompute.
Also, this avoids time-consuming data transfers and increases data security by removing the need to have multiple copies of the same data at different locations, sometimes on insecure workstations. Gompute accelerators allow the use of the desktop over links with latency over 300 ms. This allows Gompute resources
to be used from locations separated by as much as 160 degrees of longitude; i.e., the user may be in India and the cluster in Detroit. Collaborative workflows are enabled by the Gompute remote desktop sharing option, so two users at different geographical locations can work together on the same simulation.

Ease of Use
Gompute on demand provides a ready-to-use environment with an integrated repository of the requested applications, license connections, and a queuing system based on SGE. To establish the connection to the cluster, you just open ports 22 and 443 on the company's firewall. Downloading Gompute Explorer and opening a remote desktop gives you the same user experience as working on your own in-house machine. Compared to other HPC connection modes tested, Gompute connections were easy to set up and use. The connection allowed connecting to and disconnecting from the HPC account to check how the calculations were progressing. As to costs, the Gompute quotation clearly described the services provided. The technical support from Gompute personnel was also good.

BENEFITS
- Compute remotely
- Pre/post-process remotely
- Gompute can be used as an extension of in-house resources
- Able to burst into Gompute On-Demand from an in-house cluster
- Accelerated file transfers
- Possible to have exclusive desktops
- Support for multiple users on each graphics node
- Applications integrated and ready to use
- GPFS storage available
- Handles high-latency links between the user and the Gompute cluster
- Facilitates collaboration with clients and support

CONCLUSIONS AND RECOMMENDATIONS
The bottleneck using commercial software in CAE is the cost of the commercial CFD licenses. There were two lessons learned: ANSYS has no on-demand CFD license that allows use of the maximum number of available cores in a system, while competitor software, such as STAR-CCM+, already has such a license.
Supercomputing centers must provide analysis and post-processing tools so customers can check results without needing to download result files; otherwise, many of the advantages of using cloud computing are lost to long data-transfer times. The future for the wider use of supercomputing centers lies in finding a way to offer commercial CAE (CFD and FEA) licenses on demand, so that users pay for actual software usage. Commercial software must take full advantage of current and future hardware developments for virtual engineering tools to spread more widely.

Case Study Authors
Juan Enriquez Paraled, Manager of ANALISIS-DSC; Ramon Diaz, Gompute
Team 56: Simulating Radial and Axial Fan Performance

MEET THE TEAM
The main reason to look to HPC in the cloud is cost.

USE CASE
For the end user, the aim of the exercise was to evaluate the HPC cloud service, not to obtain new engineering insights. That's why a relatively basic test case was chosen: a case for which they already had results from the end user's own cluster, and which had a minimum of confidential content. The test case was the simulation of the performance of an axial fan in a duct, similar to those found in the AMCA standard. A single ANSYS Fluent run simulated the performance of the fan under 10 different conditions to reconstruct the fan curve. The mesh consisted of 12 million tetrahedral cells and was well suited to testing parallel scalability.

CHALLENGES
The main reason to look to HPC in the cloud is cost. The end user has a highly fluctuating simulation load. This means that their current on-site cluster rarely has the correct capacity: when it is too large, they pay too much for hardware and licenses; when it is too small, they lose money because the design teams are waiting for results. With a flexible HPC solution in the cloud, the end user can in theory avoid both costs.

Evaluation
HPC as a service will only be an alternative to the current on-site solution if it meets a series of well-defined criteria set by the end user.
Criteria                     | Local HPC  | Ideal cloud HPC | Actual cloud HPC | Pass/Fail
Upload speed                 | 11.5 MB/s  | 2 MB/s          | 0.2 MB/s         | Fail
Download speed               | 11.5 MB/s  | 2 MB/s          | 4-5 MB/s         | Pass
Graphical output             | possible   | possible        | inconvenient     | Fail
Quality of the image         | excellent  | excellent       | good             | Pass
Refresh rate                 | excellent  | excellent       | good             | Pass
Latency                      | excellent  | excellent       | good             | Pass
Command line access          | possible   | possible        | possible         | Pass
Output file access           | possible   | possible        | possible         | Pass
Run on the reserved cluster  | easy       | easy            | easy             | Pass
Run on the on-demand cluster | N/A        | easy            | easy             | Pass
Graphical node               | excellent  | excellent       | good             | Pass
Using UDFs on the cluster    | possible   | possible        | possible         | Pass
State-of-the-art hardware    | reasonable | good            | good             | Pass
Scalability                  | poor       | excellent       | excellent        | Pass
Security                     | excellent  | excellent       | good             | Pass
Hardware cost                | good       | excellent       | N/A              | N/A
License cost                 | good       | excellent       | N/A              | N/A

Table 1 - Evaluation results
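The transfer-speed thresholds in Table 1 translate directly into wait times for a typical result file. A minimal sketch of that arithmetic, using the figures from the table:

```python
def transfer_minutes(size_mb: float, rate_mb_s: float) -> float:
    """Time in minutes to move size_mb megabytes at rate_mb_s MB/s."""
    return size_mb / rate_mb_s / 60.0

# 1 GB (1024 MB) at the 2 MB/s target: about 8.5 minutes.
print(f"target rate:   {transfer_minutes(1024, 2.0):.1f} min")
# The same gigabyte at the measured 0.2 MB/s upload speed: over an hour.
print(f"measured rate: {transfer_minutes(1024, 0.2):.1f} min")
```

This is why the upload-speed row alone is enough to fail the solution: multi-gigabyte cases become impractical to move at 0.2 MB/s.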
Cluster Access
Gridcore allows you to connect to its clusters through the GomputeXplorer, a Java-based program that lets you monitor your jobs and launch virtual desktops. Establishing the connection was actually not that easy. If the standard SSH and SSL ports (22 and 443) are open in your company's firewall, then connecting is straightforward. This is, however, rarely the case. Alternatively, you can make the connection through a VPN. Both options require the end user to make changes to the firewall. Because the end user had to wait a long time for these changes to be implemented, valuable time was lost. Only the port changes were implemented, so the VPN option was never tested.

Transfer Speed
Input files, and certainly result files, for typical calculations range from a few hundred megabytes to a few gigabytes in size. A good transfer speed is therefore of vital importance. The target is a minimum of 2 MB/s for both upload and download, which makes it theoretically possible to transfer 1 GB of data in about 8.5 minutes. When transferring files with the GomputeXplorer, upload speeds of 0.2 MB/s and download speeds of about 4-5 MB/s were measured. When transferring the files with a regular SSH client, the upload speed was 1.7 MB/s and the download speed 0.9 MB/s. These speeds were measured by transferring the same files several times, with the tests performed one after the other to ensure a fair comparison. The measurements show that reasonable to good transfer speeds are theoretically possible, but so far no solution was found to bring the GomputeXplorer's upload speed up to par. As noted by the resource provider, most clients see transfer speeds limited mainly by their own bandwidth, and the low numbers measured here are quite abnormal. Several tests were performed on the system to find the root cause of the issue, but none was found.
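Before waiting on firewall change requests, it is worth verifying which of the two required ports (22 and 443) are actually reachable from inside the company network. A minimal stdlib sketch of such a check (the cluster hostname is a placeholder, not a real Gompute address):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the SSH and SSL ports the connection modes depend on.
# "cluster.example.com" stands in for the real cluster address.
for port in (22, 443):
    status = "open" if port_open("cluster.example.com", port) else "blocked"
    print(f"port {port}: {status}")
```

Running this from an end-user workstation tells you immediately whether the direct GomputeXplorer connection can work or whether the VPN route (and its firewall changes) is unavoidable.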
The investigations would have continued until a solution was found, but not within the time frame of the experiment. It may be more practical to wait for a new file transfer tool that Gompute plans to roll out shortly, which might resolve this issue.

Graphical Output in Batch
To see how the flow develops over time, it is common practice to output images of the flow field. Fluent cannot do this from the command line alone; it requires an X window to render to. The end user was not able to make this option work on the Gompute cluster within the allocated time frame. Several suggestions (mainly different command-line arguments) have been put forward to resolve this issue.

Remote Visualization
The end user used the HP Remote Graphics Software package, which gave a like-local experience. If HP RGS is categorized as excellent, the VNC-based solution of Gompute can surely be categorized as good. There was a noticeable difference between the dedicated cluster and the on-demand one with regard to the quality of the remote visualization (both are remote Gompute clusters; the dedicated one was specifically reserved for the end user). The dedicated cluster's render quality and latency were much better. It is entirely possible to do pre- and post-processing on the cluster. It is also possible to request a graphically accelerated node when starting programs with a GUI.

Ease of Use
The Gompute remote cluster uses the same queuing system (SGE) as the end user's cluster, so the commands are familiar. The fact that you can request a full virtual desktop makes using the system a breeze. This virtual desktop allows easy compilation of the UDFs (C code that extends the capabilities of Fluent) on the architecture of the remote cluster. Submitting and monitoring jobs is just as easy as on the local cluster. The process is also identical on the dedicated and the on-demand cluster.
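Because both clusters run SGE, a batch job looks the same in either place. A sketch of generating a minimal SGE submit script for a run like this one; the parallel-environment name, queue, and journal file are hypothetical, site-specific values, not taken from the actual Gompute setup:

```python
def sge_job_script(name: str, slots: int, command: str,
                   pe: str = "mpi", queue: str = "") -> str:
    """Build a minimal SGE submit script; the result is fed to qsub."""
    lines = [
        "#!/bin/bash",
        f"#$ -N {name}",         # job name
        "#$ -cwd",               # run in the submission directory
        f"#$ -pe {pe} {slots}",  # parallel environment and slot count
    ]
    if queue:
        lines.append(f"#$ -q {queue}")  # optional explicit queue
    lines.append(command)
    return "\n".join(lines) + "\n"

# A 16-core Fluent batch run driven by a journal file (illustrative names).
print(sge_job_script("fan_curve", 16, "fluent 3ddp -g -t16 -i fan.jou"))
```

Since the commands and directives are identical on the local, dedicated, and on-demand clusters, the same script can be submitted unchanged wherever capacity is available.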
Apart from the billing method, there is no additional overhead when you temporarily want to expand your simulation capacity by using the on-demand cluster.

Hardware
The hardware made available to the end user was less than two years old (Westmere Xeons). This was considered good; Sandy Bridge-based Xeons would have been considered excellent. The test case was used to benchmark the Gompute cluster against the end user's own aging cluster.

Fig. 1 - Comparison of run times of the test case. The speedup is defined relative to the time it took to run the simulation on 16 cores of the local cluster. The blue curve represents the old, local cluster and the red curve the on-demand cluster from Gompute. The green point is from a run on a workstation with a hardware configuration similar to the Gompute cluster's, but running Windows instead of Linux.

The following points can be concluded from this graph:
- The old cluster isn't performing all that badly considering its age. Either that, or a larger speedup was expected from the new hardware.
- The simulation scales nicely on the Gompute cluster, but not as well on the local cluster.
- The performance of the workstation is similar to that of the Gompute cluster.

Cost
The resource provider only provides hardware; the customer is still responsible for acquiring the necessary software licenses. The cost benefit is therefore limited to hardware and support. The most likely customers for the on-demand cluster service are companies that either rarely run a simulation or occasionally need extra capacity. In both cases they would have to pay for a set of licenses that are rarely used. This does not seem to be a very good deal and may become a showstopper for adopting HPC in the cloud. Hopefully, ANSYS will come up with a license model that enables a service more in line with HPC in the cloud.

BENEFITS
End User
- Ease of use.
- Post- and pre-processing can be done remotely.
- Excellent opportunity to test the state of the art in cloud-based HPC.

CONCLUSIONS AND RECOMMENDATIONS
HPC in the cloud is technically feasible. Most remaining issues are implementation related, and the resource provider should be able to solve them. The remote visualization solution was good and allowed the user to perform some real work. Of course, it remains to be seen whether a stress test with multiple users from the same company yields the same results. The value of the HPC-in-the-cloud solution is limited by the absence of appropriate license models from the software vendors that would allow Gompute to actually sell simulation time and not just hardware and support. Further rounds of this experiment can be used to analyze the abnormal upload speed. File transfer might be tested over the VPN connection to rule out restrictions from the company's firewall. Also of interest is testing the new release of the Gompute file transfer tool, which implements a transfer accelerator.
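The speedup curves in Fig. 1 are computed from raw wall-clock times against a fixed reference run. A minimal sketch of that calculation with purely illustrative timings (the actual measurements are not reproduced in this report):

```python
def speedups(run_times: dict, ref_time: float) -> dict:
    """Speedup per core count, relative to a fixed reference run time."""
    return {cores: ref_time / t for cores, t in sorted(run_times.items())}

# Hypothetical wall-clock times in minutes; the reference is the
# 16-core run on the local cluster, as in Fig. 1.
local_16_core_time = 120.0
cloud_times = {16: 60.0, 32: 32.0, 64: 18.0}

for cores, s in speedups(cloud_times, local_16_core_time).items():
    print(f"{cores:3d} cores: speedup {s:.2f}x")
```

Pinning the reference to the local 16-core run, as the figure does, is what lets curves from different machines (old cluster, cloud cluster, workstation) be compared on one axis.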
Different graphical node configurations can be tested to enhance the user experience.

Case Study Authors
Wim Slagter, Ramon Diaz, Oleh Khoma, and Dennis Nagy.

Note: The illustration at the top of this report shows pressure contours in front of and behind a 6-bladed axial fan.
Team 58: Simulating Wind Tunnel Flow Around Bicycle and Rider

Being able to quickly adapt a solution to a certain environment is a key competitiveness factor in the cloud-based CAE arena.

MEET THE TEAM

USE CASE
The CAPRI to OpenFOAM Connector and the Sabalcore HPC Computing Cloud infrastructure were used to analyze the airflow around bicycle design iterations from Trek Bicycle. The goal was to establish a strong synergy between iterative CAD design, CFD analysis, and HPC cloud environments. Trek is heavily invested in engineering R&D and does extensive prototyping before settling on a final production design. CAE has been an integral part of the design process, accelerating the pace of R&D and rapidly increasing the number of design iterations. Advanced CAE capabilities have helped Trek reduce cost and keep up with the demanding product development timelines necessary to stay competitive. Automating iterative design changes in Computer Aided Design (CAD) models coupled with Computational Fluid Dynamics (CFD) simulations can significantly enhance the productivity of engineers and enable them to make better decisions in order to achieve optimal product designs. Using a cloud-based or on-demand solution to meet the HPC requirements of computationally intensive applications decreases the turnaround time in iterative design scenarios and reduces the overall cost of the design. With most of the software available today, the process of importing CAD models into CAE tools and executing a simulation workflow requires years of experience and remains, for the most part, a human-intensive task. Coupling parametric CAD systems with analysis tools to ensure reliable automation also presents significant interoperability challenges. The upfront and ongoing costs of purchasing a high performance computing system are often underestimated. As most companies' HPC needs fluctuate, it is difficult to size a system adequately.
Inevitably, this means resources will sit idle for many hours and, at other times, will be inadequate for a project's requirements. In addition, as servers age and more advanced hardware becomes available, companies may recognize a performance gap between themselves and their competitors. Beyond the price of the hardware itself, a large computer cluster demands specialized power resources, consumes vast amounts of electrical power, and requires specialized cooling systems, valuable floor space, and experienced experts to maintain and manage it. Using an HPC provider in the cloud overcomes these challenges in a cost-effective, pay-per-use model.

Experiment Development
The experiment was defined as an iterative analysis of the performance of a bike. Mio Suzuki at Trek, the end user, supplied the CAD model. The analysis was performed on two Sabalcore-provided cluster accounts. The CADNexus CFD Connector, an iterative preprocessor, was used to generate OpenFOAM cases using the SolidWorks CAD model as geometry. A custom version of the CAPRI-CAE interface, in the form of an Excel spreadsheet, was delivered to the end user by the team expert Mihai Pruna, who represented the software provider, CADNexus.

Fig. 1 - Setting up the CAD model for tessellation
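In a workflow like this, each sweep iteration reduces to deriving a flow condition and a case directory from a design parameter. A minimal, hypothetical sketch of that parameter handling, using the two test speeds from this study (10 and 15 mph); the naming scheme is illustrative, not CADNexus's actual implementation:

```python
MPH_TO_MS = 0.44704  # exact miles-per-hour to metres-per-second factor

def inlet_velocity(speed_mph: float) -> float:
    """Convert a test speed in mph to the inlet velocity in m/s."""
    return speed_mph * MPH_TO_MS

def case_name(speed_mph: float) -> str:
    """Directory name for one iteration of the sweep."""
    return f"bike_{speed_mph:g}mph"

# The two test speeds used for the bicycle model.
for mph in (10, 15):
    print(f"{case_name(mph)}: U_inlet = {inlet_velocity(mph):.3f} m/s")
```

An automation layer like the CAPRI-CAE interface then writes each computed velocity into the corresponding OpenFOAM boundary-condition dictionary and ships the case to the cluster.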
The CAPRI-CAE interface was modified to allow the deployment and execution of OpenFOAM cases on Sabalcore cluster machines. Mihai Pruna also ran test simulations and provided advice on setting up the CAD model for tessellation, that is, the generation of an STL file suitable for meshing (Figure 1). The cluster environment was set up by Kevin Van Workum of Sabalcore, allowing rapid and frequent access to the cluster accounts via SSH, as needed by the automation involved in copying and executing the OpenFOAM cases. The provided bicycle was tested at two speeds: 10 and 15 mph. The CADNexus CFD Connector was used to generate cutting planes and wake velocity linear plots. In addition, the full simulation results were archived and provided to the end user for review using ParaView, a free tool (see the figure at the top of this report). ParaView and other graphical post-processing applications can also be run directly on Sabalcore using its accelerated Remote Graphical Display capability. Thanks to the modular design of the CAPRI-powered OpenFOAM Connector and the flexible environment provided by Sabalcore Computing, integrating the software and HPC provider resources was quite simple.

CHALLENGES
General
Considering the interoperability required between several technologies, the setup went fairly smoothly. The CAPRI-CAE interface had to be modified to work with an HPC cluster: the production version was designed to work with discrete local or cloud-based Ubuntu Linux machines. For the cluster environment, some programmatically generated scripts had to be changed to send jobs to a solver queue rather than execute the OpenFOAM utilities directly. The CAD model was not a native SolidWorks project but rather a series of imported bodies, and the surfaces exhibited topological errors that were picked up by the CAPRI middleware.
Defeaturing in SolidWorks, as well as turning off certain consistency checks in CAPRI, helped alleviate these issues and produce quality tessellations.

Data Transfer Issues
Sometimes a particular OpenFOAM dictionary would fail to copy to the client, causing the OpenFOAM scripts to fail. This issue has not yet been resolved; it seems to occur only with large geometry files, although it is not the geometry file itself that fails to copy. Possible solutions include zipping up each case and sending it as a single file. Retrieving the full results can take a long time. Solutions already developed involve doing some of the post-processing on the client and retrieving only the simulation results data specified by the user, as implemented by CADNexus in the Excel-based CAPRI-CAE interface, or running ParaView directly on the cluster, as implemented by Sabalcore.

End User's Perspective
CAPRI is a fantastic tool for connecting the end user's desktop environment directly to a remote cluster. As an end user, the first challenge I faced was thoroughly understanding the formatting of the Excel sheet. As soon as I was able to identify what was wrong with my Excel entries, the rest of the workflow went relatively smoothly, exactly as specified in the template workflow. I also experienced slowness in building up and running the cases. If there were a way to increase the speed at each step (synchronizing the CAD, generating cases on the server, and running), that would enhance the user experience.

Figure 2 - Z=0 velocity color plot generated with the CADNexus Visualizer lightweight postprocessor

BENEFITS
The CAPRI-CAE Connector and the CAPRI-FOAM Connector dramatically simplify the generation of design-analysis iterations. The user has far fewer inputs to fill in; the rest are generated automatically. The end user does not need to be proficient in OpenFOAM or Linux.
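The case-zipping workaround suggested under Data Transfer Issues takes only a few lines with the standard library (the case directory path here is a hypothetical example):

```python
import shutil
from pathlib import Path

def pack_case(case_dir: str) -> str:
    """Bundle an OpenFOAM case directory into a single .zip archive,
    so it is transferred as one file instead of many small ones."""
    case = Path(case_dir)
    # The archive lands next to the case directory, e.g. bike_10mph.zip,
    # and contains the directory itself as its top-level entry.
    return shutil.make_archive(str(case), "zip",
                               root_dir=case.parent, base_dir=case.name)
```

Sending one archive sidesteps per-file copy failures such as the missing dictionary; on the cluster side it can be unpacked with `shutil.unpack_archive` before the solver scripts run.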
With respect to the HPC resource provider, the environment provided to the user by Sabalcore was already set up to run OpenFOAM, which helped speed up the process of integrating the CADNexus OpenFOAM Connector with Sabalcore's services. The only required modification to the HPC environment made by Sabalcore was to allow a greater than normal number of SSH connections from the user, which the software required. With Sabalcore's flexible environment, these changes were easily made.

CONCLUSIONS AND RECOMMENDATIONS
Among the lessons learned in the course of this project:
- Being able to quickly adapt a solution to a certain environment is a key competitiveness factor in the cloud-based CAE arena.
- A modular approach when developing your CAE solution for HPC/cloud deployment helps speed up the process of adapting your solution to a new provider.
- Selecting an HPC resource provider with a flexible environment is also vital to quickly deploying a custom CAE solution.
- From an end user perspective, we observed that each cluster provider has a unique way of bringing the cloud HPC option to the end user. Many of them are very flexible with respect to the services and interfaces they provide, based on the user's preference. When choosing a cloud cluster service, we suggest that a CAE engineer investigate and select the service that is most suitable for the organization's particular engineering needs.

Case Study Authors
Mihai Pruna, Mio Suzuki, and Kevin Van Workum.
Thank you for your interest in the free and voluntary UberCloud HPC Experiment. If you or your organization would like to participate in this Experiment to explore hands-on the end-to-end process of HPC as a Service for your business, then please register at: If you are interested in promoting your service/product at the UberCloud Exhibit, then please register at
Very special thanks to Wolfgang Gentzsch and Burak Yenier for making the UberCloud HPC Experiment possible.
Amazon Web Services Primer. William Strickland COP 6938 Fall 2012 University of Central Florida
Amazon Web Services Primer William Strickland COP 6938 Fall 2012 University of Central Florida AWS Overview Amazon Web Services (AWS) is a collection of varying remote computing provided by Amazon.com.
WhitePaper. Private Cloud Computing Essentials
Private Cloud Computing Essentials The 2X Private Cloud Computing Essentials This white paper contains a brief guide to Private Cloud Computing. Contents Introduction.... 3 About Private Cloud Computing....
Private Cloud for the Enterprise: Platform ISF
Private Cloud for the Enterprise: Platform ISF A Neovise Vendor Perspective Report 2009 Neovise, LLC. All Rights Reserved. Background Cloud computing is a model for enabling convenient, on-demand network
How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
Public, Private and Hybrid Clouds
Public, Private and Hybrid Clouds When, Why and How They are Really Used Sponsored by: Research Summary 2013 Neovise, LLC. All Rights Reserved. [i] Table of Contents Table of Contents... 1 i Executive
Simulation Platform Overview
Simulation Platform Overview Build, compute, and analyze simulations on demand www.rescale.com CASE STUDIES Companies in the aerospace and automotive industries use Rescale to run faster simulations Aerospace
Hybrid: The Next Generation Cloud Interviews Among CIOs of the Fortune 1000 and Inc. 5000
Hybrid: The Next Generation Cloud Interviews Among CIOs of the Fortune 1000 and Inc. 5000 IT Solutions Survey Wakefield Research 2 EXECUTIVE SUMMARY: Hybrid The Next Generation Cloud M ost Chief Information
Parallels Virtuozzo Containers
Parallels Virtuozzo Containers White Paper Virtual Desktop Infrastructure www.parallels.com Version 1.0 Table of Contents Table of Contents... 2 Enterprise Desktop Computing Challenges... 3 What is Virtual
Sistemi Operativi e Reti. Cloud Computing
1 Sistemi Operativi e Reti Cloud Computing Facoltà di Scienze Matematiche Fisiche e Naturali Corso di Laurea Magistrale in Informatica Osvaldo Gervasi [email protected] 2 Introduction Technologies
Terms and Conditions
- 1 - Terms and Conditions LEGAL NOTICE The Publisher has strived to be as accurate and complete as possible in the creation of this report, notwithstanding the fact that he does not warrant or represent
Using WebSphere Application Server on Amazon EC2. Speaker(s): Ed McCabe, Arthur Meloy
Using WebSphere Application Server on Amazon EC2 Speaker(s): Ed McCabe, Arthur Meloy Cloud Computing for Developers Hosted by IBM and Amazon Web Services October 1, 2009 1 Agenda WebSphere Application
Amazon Relational Database Service (RDS)
Amazon Relational Database Service (RDS) G-Cloud Service 1 1.An overview of the G-Cloud Service Arcus Global are approved to sell to the UK Public Sector as official Amazon Web Services resellers. Amazon
Chapter 19 Cloud Computing for Multimedia Services
Chapter 19 Cloud Computing for Multimedia Services 19.1 Cloud Computing Overview 19.2 Multimedia Cloud Computing 19.3 Cloud-Assisted Media Sharing 19.4 Computation Offloading for Multimedia Services 19.5
Making the Business and IT Case for Dedicated Hosting
Making the Business and IT Case for Dedicated Hosting Overview Dedicated hosting is a popular way to operate servers and devices without owning the hardware and running a private data centre. Dedicated
Fujitsu Cloud IaaS Trusted Public S5. shaping tomorrow with you
Fujitsu Cloud IaaS Trusted Public S5 shaping tomorrow with you Realizing the cloud opportunity: Fujitsu Cloud iaas trusted Public s5 All the benefits of the public cloud, with enterprise-grade performance
Using a Java Platform as a Service to Speed Development and Deployment Cycles
Using a Java Platform as a Service to Speed Development and Deployment Cycles Dan Kirsch Senior Analyst Sponsored by CloudBees Using a Java Platform as a Service to Speed Development and Deployment Cycles
PRACTICAL USE CASES BPA-AS-A-SERVICE: The value of BPA
BPA-AS-A-SERVICE: PRACTICAL USE CASES How social collaboration and cloud computing are changing process improvement TABLE OF CONTENTS 1 Introduction 1 The value of BPA 2 Social collaboration 3 Moving to
Migration Scenario: Migrating Batch Processes to the AWS Cloud
Migration Scenario: Migrating Batch Processes to the AWS Cloud Produce Ingest Process Store Manage Distribute Asset Creation Data Ingestor Metadata Ingestor (Manual) Transcoder Encoder Asset Store Catalog
Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework
Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework Many corporations and Independent Software Vendors considering cloud computing adoption face a similar challenge: how should
When Does Colocation Become Competitive With The Public Cloud? WHITE PAPER SEPTEMBER 2014
When Does Colocation Become Competitive With The Public Cloud? WHITE PAPER SEPTEMBER 2014 Table of Contents Executive Summary... 2 Case Study: Amazon Ec2 Vs In-House Private Cloud... 3 Aim... 3 Participants...
Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks
WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance
CLOUD PERFORMANCE TESTING - KEY CONSIDERATIONS (COMPLETE ANALYSIS USING RETAIL APPLICATION TEST DATA)
CLOUD PERFORMANCE TESTING - KEY CONSIDERATIONS (COMPLETE ANALYSIS USING RETAIL APPLICATION TEST DATA) Abhijeet Padwal Performance engineering group Persistent Systems, Pune email: [email protected]
Performance Test Process
A white Success The performance testing helped the client identify and resolve performance bottlenecks which otherwise crippled the business. The ability to support 500 concurrent users was a performance
Building Blocks of the Private Cloud
www.cloudtp.com Building Blocks of the Private Cloud Private clouds are exactly what they sound like. Your own instance of SaaS, PaaS, or IaaS that exists in your own data center, all tucked away, protected
When Does Colocation Become Competitive With The Public Cloud?
When Does Colocation Become Competitive With The Public Cloud? PLEXXI WHITE PAPER Affinity Networking for Data Centers and Clouds Table of Contents EXECUTIVE SUMMARY... 2 CASE STUDY: AMAZON EC2 vs IN-HOUSE
HPC Deployment of OpenFOAM in an Industrial Setting
HPC Deployment of OpenFOAM in an Industrial Setting Hrvoje Jasak [email protected] Wikki Ltd, United Kingdom PRACE Seminar: Industrial Usage of HPC Stockholm, Sweden, 28-29 March 2011 HPC Deployment
RightScale mycloud with Eucalyptus
Swiftly Deploy Private and Hybrid Clouds with a Single Pane of Glass View into Cloud Infrastructure Enable Fast, Easy, and Robust Cloud Computing with RightScale and Eucalyptus Overview As organizations
Estimating the Cost of a GIS in the Amazon Cloud. An Esri White Paper August 2012
Estimating the Cost of a GIS in the Amazon Cloud An Esri White Paper August 2012 Copyright 2012 Esri All rights reserved. Printed in the United States of America. The information contained in this document
Benchmarking Large Scale Cloud Computing in Asia Pacific
2013 19th IEEE International Conference on Parallel and Distributed Systems ing Large Scale Cloud Computing in Asia Pacific Amalina Mohamad Sabri 1, Suresh Reuben Balakrishnan 1, Sun Veer Moolye 1, Chung
With Red Hat Enterprise Virtualization, you can: Take advantage of existing people skills and investments
RED HAT ENTERPRISE VIRTUALIZATION DATASHEET RED HAT ENTERPRISE VIRTUALIZATION AT A GLANCE Provides a complete end-toend enterprise virtualization solution for servers and desktop Provides an on-ramp to
Virtual Desktop Infrastructure Planning Overview
WHITEPAPER Virtual Desktop Infrastructure Planning Overview Contents What is Virtual Desktop Infrastructure?...2 Physical Corporate PCs. Where s the Beef?...3 The Benefits of VDI...4 Planning for VDI...5
CLOUD COMPUTING IN HIGHER EDUCATION
Mr Dinesh G Umale Saraswati College,Shegaon (Department of MCA) CLOUD COMPUTING IN HIGHER EDUCATION Abstract Technology has grown rapidly with scientific advancement over the world in recent decades. Therefore,
The Virtualization Practice
The Virtualization Practice White Paper: Managing Applications in Docker Containers Bernd Harzog Analyst Virtualization and Cloud Performance Management October 2014 Abstract Docker has captured the attention
Deploy Remote Desktop Gateway on the AWS Cloud
Deploy Remote Desktop Gateway on the AWS Cloud Mike Pfeiffer April 2014 Last updated: May 2015 (revisions) Table of Contents Abstract... 3 Before You Get Started... 3 Three Ways to Use this Guide... 4
ABAQUS High Performance Computing Environment at Nokia
ABAQUS High Performance Computing Environment at Nokia Juha M. Korpela Nokia Corporation Abstract: The new commodity high performance computing (HPC) hardware together with the recent ABAQUS performance
