HPC on AWS
Hiroshi Kobayashi, Dev./Lab. IT System, HGST Japan, Ltd.
Jun 3, 2015
HPC on AWS
- HPC = High Performance Computing
- AWS = Amazon Web Services
Agenda
- HGST
- Why choose Cloud?
- Performance
- Flexibility
- What's Next
- Summary
HGST Company Profile
- Founded in 2003 through the combination of the hard drive businesses of IBM, the inventor of the hard drive, and Hitachi, Ltd. ("Hitachi")
- Acquired by Western Digital in 2012
- Headquartered in San Jose, California
- Approximately 38,000 employees worldwide
- More than 4,700 active worldwide patents
- Develops innovative, advanced hard disk drives, enterprise-class solid state drives, external storage solutions and services
- Delivers intelligent storage devices that tightly integrate hardware and software to maximize solution performance
Broadening Lineup of Storage Solutions
RECENT INNOVATIONS
- HDD Storage Solutions with HelioSeal Technology: HGST 10TB SMR HDD, HGST Ultrastar He8
- Petabyte-scale Data Center Storage Solutions: Active Archive Platform
- Solid State Storage Solutions: FlashMAX III PCIe; Ultrastar SSD800MH.B, SSD1600MM & SSD1600MR SAS SSD; Ultrastar SN100 Series NVMe PCIe
- HGST Storage Software: HGST Virident Solutions, HGST Virident Space
HGST Active Archive System
Our first fully integrated system, with 4.7PB raw capacity per rack!
- Complete scale-out object storage system for cloud data centers
- 4.7PB raw capacity per rack
- Optimized for active archive workloads
- Breakthrough TCO
- Highest density improves data center efficiency
- Lowest power per TB with fast data access
- Beats white box economics
- Scales to exabytes of capacity
Market Leadership
Why choose Cloud?
Background
- A few years ago, HGST started an HPC implementation project.
- The project team investigated several cloud HPC services other than AWS, but none of them satisfied HGST's requirements.
- CIO Steve Phillpott recommended AWS for HPC. He had extensive experience with HPC on AWS in the life-science industry.
- Through several proof-of-concept projects, the team began to understand the pros and cons of on-premise and cloud HPC.
- Key factors: scalability, data transfer, remote visualization, commercial applications, and cost
Scalability
- CD-adapco provided the benchmark data on their cluster.
- C3 provides a significant improvement in scalability: C3 is 1.81x faster than CR1.
- C3 is still behind a physical cluster with InfiniBand: 1.70x slower.
Chart notes: (1) EN = Enhanced Networking; (2) placement group enabled; (3) evaluated by elapsed time; (4) only 200 steps.
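The two ratios on this slide can be chained as a rough worked example. This is a sketch under one assumption that the slide does not state explicitly: that the 1.70x figure compares C3 against the physical InfiniBand cluster on the same benchmark and by the same elapsed-time measure. The function names are illustrative only.

```python
def speedup(baseline_elapsed: float, candidate_elapsed: float) -> float:
    """Speedup of a candidate over a baseline, evaluated by elapsed time."""
    if baseline_elapsed <= 0 or candidate_elapsed <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_elapsed / candidate_elapsed


def chain_speedups(*ratios: float) -> float:
    """Multiply pairwise speedups measured on the same workload."""
    product = 1.0
    for ratio in ratios:
        product *= ratio
    return product


# Ratios taken from the slide.
C3_OVER_CR1 = 1.81        # C3 is 1.81x faster than CR1
PHYSICAL_OVER_C3 = 1.70   # assumption: C3 is 1.70x slower than the physical cluster

physical_over_cr1 = chain_speedups(C3_OVER_CR1, PHYSICAL_OVER_C3)
print(f"Physical InfiniBand cluster vs CR1: about {physical_over_cr1:.2f}x")
```

Under that assumption, the physical cluster would be roughly 3.08x faster than CR1 on this benchmark.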
Remote Visualization
- Result data is too large to download; transferring huge data is NOT an option.
- Remote visualization is required for huge result data.
Remote Desktop Console (diagram: users' client, remote access via RDC/VNC, AWS G2 graphics server)
- Consumes server-side GPU resources and license
- Performance is not good: slower response, slower rendering
Server-Client Mode (diagram: users' client, AWS file server)
- Consumes client-side GPU resources
- Consumes server-side license
- Great performance!!! Almost the same performance as a local workstation with a high-end graphics card
Data Collaboration
- Transferring huge data is NOT an option.
- Even the 48TB of a d2.8xlarge may not be sufficient for a long-term, large data repository.
- Re-computing a large-scale model is costly.
- AWS Simple Storage Service (S3)
Diagram labels: Users, Client, job submission, Cluster Master, Computing Nodes, Shared storage, S3 bucket, small data back to client
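A back-of-the-envelope estimate helps show why moving result data out of the cloud is impractical. The sketch below assumes a hypothetical sustained 1 Gbit/s link and decimal terabytes; only the 48TB volume comes from the slide, and real links rarely sustain their nominal rate.

```python
def transfer_hours(size_tb: float, link_gbps: float) -> float:
    """Hours needed to move size_tb terabytes over a link of link_gbps Gbit/s.

    Uses decimal units (1 TB = 1e12 bytes, 1 Gbit/s = 1e9 bit/s) and
    ignores protocol overhead, so it is an optimistic lower bound.
    """
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


# 48 TB is the d2.8xlarge capacity quoted on the slide;
# the 1 Gbit/s link speed is a hypothetical assumption.
hours = transfer_hours(48, 1.0)
print(f"{hours:.1f} hours (about {hours / 24:.1f} days)")
```

With these inputs the transfer takes more than four days, which is why the slides keep large results in S3 and send only small data back to the client.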
Performance
Scalability:
- C3.8xlarge improved scalability dramatically
- Higher scalability is better
Remote Visualization:
- Star-CCM+ is ready
- Other applications are NOT ready
Data Collaboration:
- No need to struggle with storage capacity and durability
AWS can support the whole process of simulation work!!!
Hybrid HPC Architecture
- Local + Cloud = Hybrid HPC environment
- AWS + Cycle Computing http://www.cyclecomputing.com/
Diagram labels: Auto Scale Out / In; Cluster Master; attached data I/O; Computing Nodes; Fixed Capability; Users; Client; Shared Storage; Virtual Private Cloud; HGST Local Cluster; S3 bucket; AWS
Shape Compute To Match Work To Be Done
- All jobs run in parallel on AWS
- 1.67x throughput improvement
Chart (time axis):
- Before, on the shared cluster computer: 512-core, 256-core, and 128-core jobs each spend time waiting before they run
- Today: AWS EC2 CC2 cluster (max total 512 cores)
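The effect of removing queue wait can be illustrated with a toy model. This is only a sketch: the job list and runtimes are made up, "before" is simplified to jobs running one after another on a shared cluster, and "today" assumes every job starts at once on its own nodes within a 512-core total. The 2.0x it prints comes from those made-up numbers and is unrelated to the 1.67x measured on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    cores: int
    hours: float  # hypothetical runtime, not from the slides


def queued_makespan(jobs: list[Job]) -> float:
    """Simplified 'before' model: jobs run one after another."""
    return sum(job.hours for job in jobs)


def parallel_makespan(jobs: list[Job], max_total_cores: int) -> float:
    """Simplified 'today' model: all jobs start at once on their own nodes."""
    if sum(job.cores for job in jobs) > max_total_cores:
        raise ValueError("jobs exceed the total core limit")
    return max((job.hours for job in jobs), default=0.0)


# Hypothetical workload that fits within a 512-core total.
jobs = [Job("A", 256, 4.0), Job("B", 128, 2.0), Job("C", 128, 2.0)]

before = queued_makespan(jobs)
today = parallel_makespan(jobs, max_total_cores=512)
improvement = before / today
print(f"before: {before} h, today: {today} h, improvement: {improvement:.2f}x")
```

A real scheduler would let smaller jobs share the cluster, so the actual gain depends on the job mix and queue policy.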
Shape Compute To Match Work To Be Done (Cont.)
Shape Storage To Match Work To Be Done
- No need to struggle with storage capacity and durability!!!
Diagram labels: Users, Client, job submission, Cluster Master, Computing Nodes, Shared storage, S3 bucket, small data back to client
Shape Cost To Match Work To Be Done
- Workload is NOT constant
- Server reservation discount = Reserved Instances (RI)
- Analyzing workload
- Utilizing RI
- Optimizing cost
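One way to picture "analyze workload, utilize RI, optimize cost" is a small cost model. This is a sketch under loud assumptions: the hourly demand profile and both prices are hypothetical placeholders, not AWS price-list values, and the model ignores RI terms, upfront payments, and instance types.

```python
def total_cost(demand: list[int], reserved: int,
               ri_hourly: float, on_demand_hourly: float) -> float:
    """Cost of serving an hourly instance demand with `reserved` RIs.

    Reserved instances are paid for every hour, used or not;
    demand above the reserved count is served on demand.
    """
    ri_cost = reserved * ri_hourly * len(demand)
    overflow_hours = sum(max(d - reserved, 0) for d in demand)
    return ri_cost + overflow_hours * on_demand_hourly


def best_reserved_count(demand: list[int],
                        ri_hourly: float, on_demand_hourly: float) -> int:
    """Brute-force the RI count that minimizes total cost."""
    candidates = range(0, max(demand, default=0) + 1)
    return min(candidates,
               key=lambda n: total_cost(demand, n, ri_hourly, on_demand_hourly))


# Hypothetical, non-constant workload: instances needed in each hour.
demand = [2, 2, 8, 8, 2, 2, 16, 2]
RI_HOURLY = 0.6         # hypothetical effective hourly RI price
ON_DEMAND_HOURLY = 1.0  # hypothetical on-demand hourly price

best = best_reserved_count(demand, RI_HOURLY, ON_DEMAND_HOURLY)
print("reserve", best, "instances; cost",
      round(total_cost(demand, best, RI_HOURLY, ON_DEMAND_HOURLY), 2))
```

In this toy profile only the steady baseline is worth reserving; the peaks are cheaper on demand, which matches the slide's point that the workload is not constant.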
What's next for Cloud HPC
Computing Performance:
- More scalability, like InfiniBand
Remote Visualization:
- Higher performance than RDC-TCP/IP
- PC over IP? NICE DCV?
- Star-CCM+ is ready!!!
Commercial Application License:
- End User License Agreement (EULA)
- Hybrid License Server
- Consumption Based License
- Power On Demand!!!
- Local License Server
Summary
- At this moment, HPC on AWS is NOT perfect: scalability, and remote visualization except for Star-CCM+
- HPC on AWS has extremely high flexibility: Hybrid HPC, Shape Compute/Storage/Cost To Match Work To Be Done
- Flexibility will help in responding to a changing business model
- The benefit of HPC on AWS should be verified for each application, based on its characteristics
- Collaboration with application vendors is required
Helping the World Harness the Power of Data with Smarter Storage Solutions