S06: Open-Source Stack for Cloud Computing
  Milind Bhandarkar, Yahoo!
  Richard Gass, Intel
  Michael Kozuch, Intel
  Michael Ryan, Intel
Agenda
  Sessions:
    (A) Introduction  8.30-9.00
    (B) Hadoop        9.00-10.00
        Break         10.00-10.15
        Hadoop        10.15-11.30
        Lunch         11.30-12.30
    (C) Pig           12.30-1.30
        Break         1.30-1.45
    (D) Tashi         1.45-3.30
        Break         3.30-3.45
    (E) PRS           3.45-5.00
  I. Speaker intros
  II. Motivation
  III. Open Cirrus
  IV. Open Cirrus software stack
  V. Getting involved
Session A: Introduction
Michael Kozuch (Intro)
  Michael Kozuch is a Principal Engineer with Intel Labs Pittsburgh and manager of the ILP Systems Research and Engineering group
  Manages the Intel Open Cirrus cluster and is the PI for the Tashi research project
  Michael is a 12-year veteran of Intel and contributed to the development of Intel's VT and TXT technologies
  He has published 25+ scientific papers and 20+ patents
Milind Bhandarkar (Hadoop)
  Lead Yahoo! Grid Solutions Team since June 2005
  Contributor to Hadoop since January 2006
  Trained 1000+ Hadoop users at Yahoo! & elsewhere
  20+ years of experience in Parallel Programming
Michael Ryan (Tashi)
  Michael is currently a research engineer with Intel Labs Pittsburgh
  Lead developer for Tashi
  Serves as sysadmin for the Intel Open Cirrus site
  Coordinates the Global Monitoring service for Open Cirrus
Richard Gass (PRS)
  Richard is currently a research engineer with Intel Labs Pittsburgh
  Lead developer for PRS
  Serves as sysadmin for the Intel Open Cirrus site
  Richard has published 9+ scientific papers and is also an (imminent) PhD candidate with University Pierre and Marie Curie (LIP6) in Paris
Motivation
Why Open and Cloud makes sense
  Cloud Computing is a new, critical technology
    Efficiency: Admin costs aggregated
    Scalability: From 1 to 1000 servers in 10 sec. flat
    Empowerment: Anyone can buy a cluster
  Open Communities enable rapid innovation
    Exchange of ideas: Knowledge grows
    Constructive Darwinism: Best tools survive/evolve
    Empowerment: Anyone can build a LAMP stack
  Rapidly developing and deploying innovative computing technologies
Research Interest: Big Data
  Interesting applications are data hungry
  The data grows over time
  The data is immobile
    100 TB @ 1 Gbps ~= 10 days
  Compute comes to the data
  Big Data clusters are the new libraries
    (Data-Rich Computing theme proposal. J. Campbell, et al., 2007)
  The value of a cluster is its data
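The "100 TB @ 1 Gbps ~= 10 days" figure can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and a link that runs at full rate with no protocol overhead; the function name is illustrative and not taken from the slides.

```python
def transfer_days(size_tb: float, link_gbps: float) -> float:
    """Days needed to move size_tb terabytes over a link_gbps link at full rate."""
    bits = size_tb * 1e12 * 8              # decimal terabytes -> bits
    seconds = bits / (link_gbps * 1e9)     # idealized: no overhead, no contention
    return seconds / 86_400                # seconds per day


if __name__ == "__main__":
    print(f"{transfer_days(100, 1):.1f} days")  # prints "9.3 days"
```

Under these assumptions the transfer takes a little over nine days, which matches the slide's rounded estimate; any real link with overhead or sharing would take longer.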
Open Cirrus
Open Cirrus Cloud Computing Testbed
  Collaboration between industry and academia, sharing
    hardware infrastructure
    software infrastructure
    research applications and data sets
  Sponsored by HP, Intel, and Yahoo! (with additional support from NSF)
  9 sites currently, target of around 20 in the next two years
  [Figure: site labels UIUC*, KIT*, ISPRAS*, ETRI*, IDA*, MIMOS*]
Open Cirrus Objectives
  Foster systems research around cloud computing
  Vendor-neutral open-source stacks and APIs for the cloud
  Expose research community to enterprise-level requirements
  Provide realistic traces of cloud workloads
  How are we unique?
    Support for systems research and applications research
    Federation of heterogeneous datacenters
    Collection of interesting data sets
  Independently managed sites providing a cooperative research testbed
User Access to Open Cirrus
  User access is organized around Research Projects
    Led by Principal Investigator (PI)
  Project PIs apply to each site separately
    Identifying additional team members
  Contact information for applications to each site is available on the Open Cirrus Web site (http://opencirrus.org)
  Each Open Cirrus site decides which users and projects get access to its site.
Open Cirrus* Research Projects
  Example research areas of interest
    Datacenter federation
    Datacenter management
    Web services
    Data-intensive systems
  Projects typically not of interest
    Traditional HPC app development
    Production apps looking for free cycles
    Closed-source system development
Software Stack
Open Cirrus* Software Components
  Global Services
    Single Sign-On
    Global Monitoring
    Global User Directories
  Application Services (Hadoop)
  Virtual Machine Allocation (AWS* Compatible, e.g. Tashi or Eucalyptus)
  Site Services
    Data Location
    Resource Telemetry
    Billing/Accounting
  Cluster Storage (HDFS)
  Physical Machine Allocation (PRS)
  Compute Node Services
Physical Machine Allocation: PRS
  PRS dynamically divides compute nodes into isolated subdomains
    Provides each project with a mini-datacenter
    Isolation of experiments
  [Figure labels: Open service research; Tashi development; Production storage service; Proprietary service research; Apps running in a VM mgmt infrastructure (e.g., Tashi, Eucalyptus); Open workload monitoring and trace collection]
Cluster Storage: HDFS
  Storage system aggregating standard devices
  High-performance, parallel access
  High data reliability through replication
  Exposing location information enables intelligent placement of computation
  [Figure: Storage Service, six Nodes]
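To illustrate how exposed location information can guide placement of computation, here is a minimal sketch of a locality-aware placement rule. It is not the HDFS or Hadoop scheduler API: `place_task`, the `replicas` map, and the block and node names are all hypothetical. The only assumption borrowed from HDFS is that each block is stored on several nodes (three replicas is its default).

```python
from typing import Dict, List, Optional, Set


def place_task(block: str,
               replicas: Dict[str, List[str]],
               free_nodes: Set[str]) -> Optional[str]:
    """Pick a node for a task that reads `block`, preferring data-local nodes."""
    # First choice: a free node that already stores a replica of the block.
    for node in replicas.get(block, []):
        if node in free_nodes:
            return node
    # Fallback: any free node (the block must then be read over the network).
    return min(free_nodes) if free_nodes else None


if __name__ == "__main__":
    # Hypothetical replica map: block id -> nodes holding a copy.
    replicas = {
        "blk_1": ["node2", "node4", "node5"],
        "blk_2": ["node1", "node3", "node6"],
    }
    free = {"node3", "node4"}
    print(place_task("blk_1", replicas, free))  # node4 (holds a replica)
    print(place_task("blk_2", replicas, free))  # node3 (holds a replica)
```

Because replication puts each block on more than one node, a scheduler following this rule has several chances to find a free node that already holds the data before it has to fall back to a remote read.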
Virtual Machine Allocation: Tashi
  An open-source Apache Software Foundation incubator project
  Infrastructure for cloud computing on Big Data
  http://incubator.apache.org/projects/tashi
  Support for AWS* interface
  OS, FS, and VMM agnostic
  Research focus:
    Location-aware co-scheduling of compute, storage, and power
    Seamless physical/virtual migration
Application Service: Hadoop
  An open-source Apache Software Foundation project sponsored by Yahoo!
  http://hadoop.apache.org
  Provides a scalable, parallel programming model (MapReduce) and the associated runtime
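The MapReduce programming model named above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps using word count as the example; it does not use Hadoop's API, and the function names are chosen here for illustration only. In Hadoop the runtime performs the shuffle and distributes the map and reduce calls across the cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["open cirrus open source", "open cloud"]))
    # {'open': 3, 'cirrus': 1, 'source': 1, 'cloud': 1}
```

The map and reduce functions are independent per line and per key, which is what lets a runtime run them in parallel on many nodes.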
Getting Involved
Summary
  Open Communities can shape the development of Cloud Computing
  Open Cirrus* is a multi-partner testbed for research in Cloud Computing
  The Open Cirrus software stack provides a good starting point for open-source cloud computing software development
Getting Involved
  http://opencirrus.org
  Contact Open Cirrus* with research proposals
  Contribute to the Open Cirrus software stack
    PRS, Tashi, Hadoop
    Apache Software Foundation*