8th Int. Conference on Parallel Processing and Applied Mathematics, Wroclaw, Poland, Sep 13-16, 2009
e-Infrastructures for Science and Industry - Clusters, Grids, and Clouds
Wolfgang Gentzsch, The DEISA Project and OGF

HPC Centers
- They are service providers, for the past 40 years
- For research, education, and industry
- Computing, storage, apps, data, services
- Very professional
- To end-users, they look (almost) like Clouds
PPAM, Wroclaw, Sept 2009 Wolfgang Gentzsch 2

Grids
- 1998: The Grid: Blueprint for a New Computing Infrastructure (Ian Foster, Carl Kesselman)
- 2002: The Anatomy of the Grid (Ian Foster, Carl Kesselman, Steve Tuecke)

Grids (Sun in 2001), Clouds
- Departmental Grids
- Enterprise Grids
- Global Grids

Clouds
- IaaS, PaaS, SaaS
- Access, Elasticity, Abstraction
- Public Clouds: outside the corporate data center, access over the Internet
- Virtual (VMware, Xen, ...): abstraction of the hardware
- Public, private, hybrid
- CapEx => OpEx, pay-per-use
- Service oriented: SaaS, PaaS, IaaS, HaaS
- Variable cost of services (QoS)
- Pay-per-use IT services
- Scaling up/down
... and we have all the components available today

Benefits of moving HPC to Grids
- Closer collaboration with your colleagues (VCs)
- More resources allow faster/more processing
- Different architectures serve more users
- Failover: move jobs to another system
... and Clouds
- No upfront cost for additional resources
- CapEx => OpEx, pay-per-use
- Elasticity, scaling up and down
- Hybrid solution (private and public cloud)

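The CapEx => OpEx trade-off can be made concrete with a little arithmetic. The following is a minimal sketch, not a cost model from the talk: the function names and every price used with them are hypothetical, and real comparisons would also account for staff, power, and utilization.

```python
def breakeven_core_hours(capex, own_cost_per_core_hour, cloud_price_per_core_hour):
    """Core-hours beyond which owning hardware is cheaper than pay-per-use.

    All inputs are hypothetical illustration values, not data from the talk.
    Returns None when the cloud price is not higher than the own running cost,
    because the upfront investment is then never recovered.
    """
    saving_per_core_hour = cloud_price_per_core_hour - own_cost_per_core_hour
    if saving_per_core_hour <= 0:
        return None
    return capex / saving_per_core_hour


def cheaper_option(expected_core_hours, capex, own_cost_per_core_hour,
                   cloud_price_per_core_hour):
    """Compare total cost of owning (CapEx + running cost) with pay-per-use."""
    own_total = capex + expected_core_hours * own_cost_per_core_hour
    cloud_total = expected_core_hours * cloud_price_per_core_hour
    return "cloud" if cloud_total <= own_total else "own"
```

With these assumptions, a small or bursty workload stays below the break-even point and favors pay-per-use, while a steadily loaded system favors ownership; a hybrid setup covers the base load locally and the peaks in the cloud.
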
The Cloud of Cloud Companies
Amazon, Google, Sun, Salesforce, Microsoft, IBM, Oracle, EMC, Cloudera, Cloudsoft, Akamai, Areti Internet, Enki, Fortress ITX, Joyent, Layered Technologies, Rackspace, Terremark, Xcalibre, Manjrasoft / Aneka, GridwiseTech / Momentum, NICE / EnginFrame

NICE EnginFrame Cluster/Grid/Cloud Portal
Remote, interactive, transparent, secure access to apps & data on corporate Intranet or Internet, or in the Cloud.
[Diagram labels: Users and Administrators; clients (Win, Mac, LX, UX); standard protocols; Intranet; Cloud Portal / Gateway; Virtualized Data Center with Clusters, Licenses, Batch Applications, Interactive Applications, Virtualized Storage]

A Scalable Data Cloud Infrastructure
Example: GridwiseTech Momentum

ANEKA Cloud Platform
- SaaS: Cloud applications (social computing, enterprise, ISV, scientific, CDNs, ...)
- PaaS, Cloud Programming Models & SDK: Task Model, Thread Model, MapReduce Model, Workflow Model, Third Party Models
- PaaS, Core Cloud Services: SLA Management, QoS Negotiation, Pricing, Billing, Metering, Admission Control, Job Scheduling, Execution Management, Monitoring, Data Storage
- Aneka Cloud Platform: VM Management, VM Deployment
- IaaS: Virtual Machines; Private Cloud (LAN network, Data Center) on Windows, Mac with Mono, Linux with Mono; Amazon, Microsoft, Google, Sun
Courtesy: Manjrasoft

Courtesy: Werner Vogels

[Figure: Animoto EC2 image usage from Day 1 to Day 8; vertical axis from 0 to 4000]

My current project: DEISA: Grid or Cloud?
Distributed European Infrastructure for Supercomputing Applications
Ecosystem for HPC Grand-Challenge Applications

DEISA HPC Centers
- DEISA1: May 1st, 2004 to April 30th, 2008
- DEISA2: May 1st, 2008 to April 30th, 2011

DEISA UNICORE Infrastructure
[Diagram: users submit jobs through a UNICORE Gateway at each site; behind each Gateway an NJS with its IDB and UUDB fronts the local system. Sites and systems as labeled: BSC (IBM PPC), CINECA (IBM P5), CSC (Cray XT4/5), ECMWF (IBM P5), FZJ (IBM), HLRS (NEC SX8), HPCX (Cray XT4), IDRIS (IBM P6), LRZ (SGI ALTIX), RZG (IBM), SARA (IBM). Operating system / batch system labels: AIX LL, AIX LL-MC, Super-UX NQS II, UNICOS/lc PBS Pro, LINUX PBS Pro, LINUX Maui/Slurm, LINUX LL. Also labeled: GridFTP.]

Categories of DEISA services
[Diagram: three categories, Operations, Technologies, and Applications, connected by arrows labeled: requests configuration, requests support, offers product, offers service, requests development, offers technology]

DEISA Service Layers
- Presentation layer: multiple ways to access, workflow management, common production environment
- Job management and monitoring layer: single monitor system, job rerouting, co-reservation and co-allocation
- Data management layer: data staging tools, data transfer tools, WAN shared file system
- Network and AAA layers: unified AAA, DEISA sites, network connectivity

DEISA Global File System
Global transparent file system based on the Multi-Cluster General Parallel File System (MC-GPFS of IBM)
[Diagram labels, systems: IBM P6 & BlueGene/P, NEC SX8, IBM P6, Cray XT4, Cray XT4/5, SGI ALTIX, IBM P5, IBM PPC, IBM P5+ / P6. Operating system / batch system labels: AIX LL, AIX/Linux LL-MC, AIX LL-MC, Super-UX NQS II, UNICOS/lc PBS Pro, LINUX PBS Pro, LINUX Maui/Slurm, LINUX LL. Also labeled: GridFTP.]

User management in DEISA
- A dedicated LDAP-based distributed repository administers DEISA users.
- Trusted LDAP servers are authorized to access each other (based on X.509 certificates), and encrypted communication is used to maintain confidentiality.
[Diagram: LDAP servers at BSC, CINECA, CSC, ECMWF, EPCC, FZJ, HLRS, IDRIS, LRZ, RZG, SARA]

DEISA: Grid or Cloud?
- Built on top of the proven, professional infrastructure of HPC centers with expertise in implementation, operation, and services.
- The ecosystem of resources, middleware, and applications respects the administrative, cultural, and political autonomy of partners.
- Globalizing existing HPC services, from local to global, according to user requirements: revolution by evolution.
- User support: user-friendly access to resources, porting user apps onto turnkey architecture.
- After EU funding, the DEISA HPC ecosystem will operate in a sustainable way, in the interest of the global scientist, as ... almost an HPC Cloud!

There are still many Challenges with Clouds
- Technical
- Cultural
- Legal & regulatory
[Diagram label: Sustainable Competitive Advantage]

Challenges, Potential Cloud Inhibitors
- Not all applications are cloud-ready or cloud-enabled
- Interoperability of clouds (standards?)
- Sensitive data, sensitive applications (medical patient records)
- Different organizations have different ROI
- Security: end-to-end from your resources to the cloud!
- Current IT culture is not predisposed to sharing resources
- Static licensing model doesn't embrace cloud
- Protection of intellectual property
- Legal issues (FDA, HIPAA)

A Cloud Checklist for HPC
When is your HPC app ready for the Cloud?
- ... no issues with licenses, IP, secrecy, privacy, sensitive data and big data movement, legal or regulatory issues, trust, ...
- ... your app is architecture independent, not optimized for a specific architecture (single process, loosely-coupled low-level parallel, I/O-robust)
- ... it's just one app and zillions of parameters
- ... latency and bandwidth are not an issue
Ideally, your meta-scheduler knows your requirements and schedules automatically

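The checklist can be written down as a simple predicate that a meta-scheduler might evaluate before sending a job to a cloud. This is only a sketch under stated assumptions: the `HpcApp` fields and the functions `cloud_blockers` and `is_cloud_ready` are hypothetical names chosen to mirror the checklist items, not part of any real scheduler.

```python
from dataclasses import dataclass


@dataclass
class HpcApp:
    """Hypothetical description of an application, one flag per checklist item."""
    name: str
    has_legal_or_license_issues: bool  # licenses, IP, secrecy, privacy, regulation, trust
    moves_big_or_sensitive_data: bool  # sensitive data or big data movement
    architecture_specific: bool        # optimized for one specific architecture
    latency_or_bandwidth_bound: bool   # tightly coupled, interconnect matters


def cloud_blockers(app):
    """Return the checklist items that currently keep the app out of the cloud."""
    checks = [
        (app.has_legal_or_license_issues, "license, IP, privacy, legal, or trust issues"),
        (app.moves_big_or_sensitive_data, "sensitive data or big data movement"),
        (app.architecture_specific, "optimized for a specific architecture"),
        (app.latency_or_bandwidth_bound, "latency or bandwidth is an issue"),
    ]
    return [reason for failed, reason in checks if failed]


def is_cloud_ready(app):
    """An app passes only if no checklist item is violated."""
    return not cloud_blockers(app)
```

For example, a parameter sweep with no flags set passes, while a tightly coupled code tuned for one interconnect reports two blockers. Returning the reasons rather than a bare yes/no lets the user see what would have to change.
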
Hybrid Grid/Cloud Resource Management
Define policies according to priorities, budget, and time
[Diagram labels: Department 1, Department 2, Department 3, Department 4; User 1, User 2; Project A, Project C; Team B; Contractor X; External Cloud Resources; department resource access; campus-wide resource demand]

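A policy based on priorities, budget, and time could look like the following sketch. It is an illustration only: the `Job` fields, the `place_job` function, the priority threshold, and the three placement labels are all hypothetical and do not describe any particular resource manager.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description used by the placement policy."""
    name: str
    priority: int          # higher means more important
    core_hours: float      # estimated compute demand
    deadline_hours: float  # time by which results are needed


def place_job(job, local_finish_hours, cloud_price_per_core_hour,
              budget_left, min_priority_for_cloud=5):
    """Decide where a job runs: locally, in the external cloud, or queued locally.

    local_finish_hours is the estimated time until the job would complete
    on local resources (queue wait plus run time).
    """
    # Time: if local resources meet the deadline, stay local (no extra cost).
    if local_finish_hours <= job.deadline_hours:
        return "local"
    # Priority and budget: burst to the cloud only for important jobs we can afford.
    cloud_cost = job.core_hours * cloud_price_per_core_hour
    if job.priority >= min_priority_for_cloud and cloud_cost <= budget_left:
        return "cloud"
    # Otherwise the job waits in the local queue and misses its deadline.
    return "local-queued"
```

The order of the checks encodes the policy: local resources are preferred, and external cloud resources are used only when the deadline would otherwise be missed and both priority and budget allow it.
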
Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;login:, October 2008.

A Closer Look at HPC Load *)
- Single parallel job, CPU-intensive, tightly coupled, highly scalable, peta, exa, ...
- Single parallel job, CPU-intensive, weakly scalable
- Capacity computing, throughput, parameter jobs
- Managing massive data sets, possibly geographically distributed
- Analysis and visualization of data sets
*) Similar to the analysis of T. Sterling and D. Stark, LSU, HPCwire

Clouds and supercomputers: Conventional wisdom?
- Clouds/clusters: too slow for tightly coupled applications
- Supercomputers: too expensive for loosely coupled applications
Courtesy Ian Foster

Loosely coupled problems
- Ensemble runs to quantify climate model uncertainty
- Identify potential drug targets by screening a database of ligand structures against target proteins
- Study economic model sensitivity to parameters
- Analyze turbulence dataset from many perspectives
- Perform numerical optimization to determine optimal resource assignment in energy problems
- Mine collection of data from advanced light sources
- Construct databases of computed properties of chemical compounds
- Analyze data from the Large Hadron Collider
- Analyze log data from 100,000-node parallel computations
... all can run in the cloud

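What makes these problems cloud-friendly is that each run is independent of the others, so they can be farmed out without any communication between workers. The sketch below illustrates the pattern with a toy ensemble: `toy_model`, its forcing constant, and the parameter values are hypothetical stand-ins, and a local thread pool takes the place of cloud instances.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev


def toy_model(sensitivity):
    """Hypothetical stand-in for one ensemble member (e.g. one model run)."""
    forcing = 2.0  # made-up constant, for illustration only
    return sensitivity * forcing


def run_ensemble(parameters, workers=4):
    """Run every ensemble member independently; no member talks to another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input parameters.
        return list(pool.map(toy_model, parameters))


def summarize(results):
    """Ensemble mean and spread, a simple measure of model uncertainty."""
    return mean(results), pstdev(results)
```

Because the members share nothing, latency and bandwidth between workers do not matter, and the number of workers can be scaled up or down freely. In a real setting each call would be a full simulation dispatched to a separate machine.
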
Thank You
Dziękuję
gentzsch@rzg.mpg.de