Cloud computing - Architecting in the cloud
anna.ruokonen@tut.fi

Outline
- Cloud computing
  - What is it?
  - Levels of cloud computing: IaaS, PaaS, SaaS
  - Moving to the cloud?
- Architecting in the cloud
  - Best practices and solutions
- Amazon Web Services
  - Building a cloud application: an example

What is cloud computing?
- In principle, cloud computing implementations offer
  - a seemingly infinite pooled computing resource over the network
  - power that users can start, stop, and scale (up and down) at will
- Comes close to the idea of utility computing: ideally, computing is provided the same way as e.g. water or electricity, available in every home and charged based on consumption. All hardware is outsourced and you are charged by use.

What is cloud computing?
- Three criteria for a cloud service:
  (1) The service is accessible via a web browser or a web services API (no installation needed)
  (2) No capital investment is needed to get started
  (3) You pay only for what you use
[CloudArchitectures]

What is cloud computing?
- Stefan Tai, Karlsruhe Institute of Technology: cloud computing provides scalable, network-centric, abstracted IT infrastructure, platforms, and applications as on-demand services that are billed by consumption.
- Three important viewpoints
  - Business opportunities
  - Internet-scale service computing
  - Efficient management and utilization of systems

What's new and what's old?
- Cloud computing combines features of cluster computing and grid computing...
- ...with the help of virtualization.
- [Figure: virtualization stack reached over the Internet. Layers from top to bottom: VM 1, VM 2, VM 3; virtualization layer; host operating system; hardware]

Levels of cloud computing
- Infrastructure as a Service (IaaS)
  - Offers a computer infrastructure, often a virtual hardware infrastructure, that is immediately accessible and ready to use
- Platform as a Service (PaaS)
  - Offers a computing platform and/or software stack as a service
  - Often consumes IaaS and sustains SaaS cloud applications
- Software as a Service (SaaS)
  - While IaaS and PaaS are aimed at software developers, SaaS is often aimed directly at the end user
  - Accessible via a browser and/or API (SOA services)

Infrastructure as a Service (IaaS)
- The service provider offers a computer infrastructure, including storage, hardware, servers and networking components, often as a virtual hardware infrastructure
- The service provider owns the equipment and is responsible for housing, running and maintaining it.

IaaS: Some reasons to use it
- Scalability
  - The pay-as-you-go model allows you to scale up or down
- Error recovery
  - Your hardware and data are located at your IaaS provider and housed in (hopefully) secure data centers
- Time back
  - You can focus on value-added tasks
- Efficient payment model
  - No hardware investments

IaaS: Business model
- Charging is based on the resources and services used: time, bandwidth, transactions, storage, etc.
- Custom units and different measuring methods make it harder to compare provider prices.

Platform as a Service (PaaS)
- A PaaS service provides the hosting infrastructure and tools for development and deployment.
- Sandboxed and more locked-in, but more tasks are handled by the service provider (automation, load balancing, billing, etc.)
- Payment is based on e.g.
  - Outgoing bandwidth
  - Incoming bandwidth
  - CPU time
  - Data storage space used
  - Recipients emailed

PaaS: Support for
- Application design
- Application development
- Testing
- Deployment
- Hosting
- Team collaboration
- Web service integration
- Database integration
- Security
- Scalability
- Storage
- Persistence
- State management
- Application versioning
- Application instrumentation
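
The usage-based charging described for IaaS and PaaS above comes down to simple arithmetic: metered usage multiplied by a unit price, summed over the metered items. The sketch below illustrates this under loud assumptions: the item names and every price in `HYPOTHETICAL_PRICES` are invented for illustration and do not come from any provider, and `usage_bill` is a hypothetical helper, not part of any SDK.

```python
# Hypothetical unit prices, invented for illustration only; real providers
# use their own units and rates, which is what makes comparison hard.
HYPOTHETICAL_PRICES = {
    "cpu_hours": 0.10,          # per CPU hour
    "storage_gb_months": 0.05,  # per GB stored for one month
    "outgoing_gb": 0.12,        # per GB of outgoing bandwidth
    "incoming_gb": 0.00,        # incoming bandwidth assumed free here
}


def usage_bill(usage, prices=HYPOTHETICAL_PRICES):
    """Return (total, per-item costs) for a dict of metered usage."""
    unknown = set(usage) - set(prices)
    if unknown:
        raise KeyError(f"no price defined for: {sorted(unknown)}")
    items = {name: amount * prices[name] for name, amount in usage.items()}
    return round(sum(items.values()), 2), items


if __name__ == "__main__":
    total, items = usage_bill(
        {"cpu_hours": 200, "storage_gb_months": 50, "outgoing_gb": 10}
    )
    print(total, items)
```

With no usage the bill is zero, which is the "no capital investment, pay only for what you use" property in its simplest form.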

PaaS: Some reasons to use it
- Lower investment
  - Jump-start development
  - No maintenance cost
- Lower risk factor
  - If your project fails, just free the reserved resources and pay the usage bill
- Business
  - Provides a marketplace (e.g. Google Apps Marketplace) and/or a customer pool (e.g. Facebook)

SaaS: Software as a Service
- Software is provided and used through a web browser (or API)
- A one-to-many model
- Activities are managed from central locations
- As many business models as there are companies

Deployment models
- Public cloud
- Community cloud
- Private cloud
- Hybrid cloud

Moving to the cloud? It's about the architecture...
- Transactional Web application architecture
  - Separation into presentation, business logic, and data storage
  - [Figure: Internet, load balancer, application server, database cluster]
- Grid application architecture
  - Separation of the core application from its data
  - [Figure: a data manager pushes jobs to a job queue; processing nodes get jobs from the queue and publish results; the data manager reads the results]
[CloudArchitectures]
Anna Ruokonen, OHJ, TTY

Moving to the cloud? Options for IT infrastructure..

                           Internal      Managed services   The Cloud
Capital investment         Significant   Moderate           Negligible
Ongoing costs              Moderate      Significant        Based on usage
Provisioning time          Significant   Moderate           None
Flexibility                Limited       Moderate           Flexible
Staff expertise required   Significant   Limited            Moderate
Reliability                Varies        High               Moderate to high

[CloudArchitectures]

Cloud best practices
1. Design for failure
2. Loose coupling
3. Implement elasticity
4. Think parallel
5. Build security in every layer - Design with security in mind
6. Don't fear constraints - Re-think architectural constraints
7. Leverage different storage options - One DOES NOT fit all
[CloudBestPractices]

Cloud best practices: Design for failure
- No single points of failure
- Replication, monitoring, load balancing, backups, snapshots..
- [Figure: geographical redundancy with master-slave replication. Labels: Internet, CloudFront, Region1 (Zone1, Zone2), Region2, DB master, DB slave, permanent storage]
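
One way to read "no single points of failure" with master-slave replication is that a read should survive the loss of one database node. The sketch below shows that idea for reads only, in plain Python. Everything in it is an assumption made for illustration: `read_with_failover`, `failing_master` and `healthy_replica` are hypothetical names, and the two functions merely stand in for database clients in different regions. It does not cover writes, which in a master-slave setup go to the master.

```python
def read_with_failover(replicas, key):
    """Try each (name, read_function) pair in order.

    Returns (name, value) from the first replica that answers and raises
    ConnectionError only if every replica fails.
    """
    errors = []
    for name, read in replicas:
        try:
            return name, read(key)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next replica.
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all replicas failed: " + "; ".join(errors))


# Hypothetical stand-ins for database clients in two regions.
def failing_master(key):
    raise ConnectionError("region1 unreachable")


def healthy_replica(key):
    return {"user:1": "anna"}[key]


if __name__ == "__main__":
    replicas = [("master", failing_master), ("slave", healthy_replica)]
    print(read_with_failover(replicas, "user:1"))
```

The order of the list encodes the preference: the master is tried first and the slave only serves the read when the master is unreachable.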

Cloud best practices: Decouple your components
- Loose coupling
  - Use message queues for communication (isolating, buffering)
- Component design
  - As stateless as possible
- [Figure: tight coupling: Component 1, Component 2 and Component 3 connected directly; loose coupling: the same components connected through Queue 1, Queue 2 and Queue 3]

Cloud best practices: Elasticity
- Scaling (e.g. machine configurations, storage, computing capacity)
- Monitor system metrics
- Use load balancing tools
- Automate
- Scale based on variability in usage
- [Figure: manual scaling (up and down): small instance, medium instance, large instance, medium instance; automatic scaling (out and in): one instance, two instances, four instances, two instances]

Cloud best practices: Parallel and distributed computing
- Create job flows using MapReduce
  - Designed for scalable processing of large amounts of data
  - Automatic distribution of the workload
  - Two simple programs, map(key, value) and reduce(key, values), are distributed over several machines for parallel computation
- [Figure: the input is split across several map tasks; their output is combined and passed to reduce tasks, each of which produces output]
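
The two programs named above, map(key, value) and reduce(key, values), can be sketched in plain Python with a grouping step between them. This is only a single-machine illustration of the programming model: it runs sequentially, whereas a real framework such as Hadoop distributes the map and reduce calls over many machines. The helper `map_reduce` and the word-count functions are hypothetical names chosen for this example.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over (key, value) records, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in records:
        # Map phase: each input pair yields intermediate (key, value) pairs.
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Sorting the keys mirrors the sort/combine step between the phases.
    # Reduce phase: one call per key with the list of its values.
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


def count_words_map(doc_name, text):
    """map(key, value): emit (word, 1) for every word in the text."""
    return [(word.lower(), 1) for word in text.split()]


def count_words_reduce(word, counts):
    """reduce(key, values): sum the partial counts for one word."""
    return sum(counts)


if __name__ == "__main__":
    docs = [
        ("doc1", "cloud computing in the cloud"),
        ("doc2", "the cloud scales"),
    ]
    print(map_reduce(docs, count_words_map, count_words_reduce))
```

Because the mapper sees one record at a time and the reducer sees one key at a time, neither needs shared state, which is what allows the work to be split across machines.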

Cloud best practices: MapReduce cont.
1. Map: run on each key-value pair of the input; produces a list of intermediate values
2. Sort/combine: values are sorted according to their keys
3. Reduce: run on the list of values for each key; produces a list of final values

<key_input, value_input> -> map -> <key_output, value_intermediate>
<key_output, list(value_intermediate)> -> reduce -> <key_output, list(value_output)>
[MapReduce]

Amazon Web Services (AWS)
- Elastic Compute Cloud (EC2)
  - Virtual machines
- Simple Queue Service (SQS)
  - Message queue
- Simple Storage Service (S3)
  - Persistent storage
  - Files are held in buckets
  - A file is identified by a key and a URI
- SimpleDB (SDB)
  - No predefined schemas
  - domain:item:attribute, UTF-8 string
  - Attributes can be added dynamically and they can have multiple values
- Elastic MapReduce (EMR)
  - Implements Google's MapReduce architecture
  - Uses the Hadoop implementation on top of EC2 instances and S3
[AWS]

Amazon Machine Image (AMI)
- Used to instantiate a virtual machine
- Bundles the operating system (VM), application software and associated configuration settings
- Provides an API for configuration and management
- [Figure: a generic AMI stack (app server, your code, framework, libraries, OS) and a concrete one (Tomcat, your code, libraries, J2EE, Linux), instantiated as three identical virtual machines on Amazon EC2]

Building an AWS application
[Figure: layered view of the AWS building blocks, from top to bottom]
- Your application
- SDB (domains), SQS (queues)
- Auto Scaling, Elastic LB, Amazon Elastic MapReduce, CloudWatch
- Amazon EC2 (instances), Amazon S3 (objects and buckets), Amazon CloudFront
- SOAP and REST APIs, command line tools, admin console
- Amazon worldwide physical infrastructure (geographical regions, availability zones, edge locations)

Example: Image search service
- An example of building a SaaS application using AWS
- Implemented with Python, Boto, and Django
1. Create an SQS queue and fill it with website URIs
2. Process pages from SQS: extract images and keywords
3. Store images in S3
4. Store keywords and S3 URIs in SDB
5. Use EMR to find search suggestions (find common keyword relations)
6. Store search suggestions in SDB
[Figure: the numbered steps mapped to components: list of websites, SQS, image processing (EC2 instance), image storage (S3), KeywordDB (SDB), MapReduce (EMR on EC2 and S3), and the WebApp (EC2 instance), which searches the KeywordDB and downloads images from S3]
[OHJ-5202]

Example: Find search suggestions
- Fig_1: finland, snow
- Fig_2: finland, lapland, snow
- Fig_3: finland, winter
- Map: <finland.snow, 1>, <finland.lapland, 1>, <finland.snow, 1>..
- Reduce: <finland.snow, 2>, <finland.lapland, 1>..
- Suggest: <finland, snow>
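
The search-suggestion example above can be reproduced locally in a few lines of plain Python, as a sketch of what the EMR job computes rather than the actual job. The function names (`map_pairs`, `reduce_counts`, `suggest`) are hypothetical, the data is the three figures from the slide, and the sketch assumes that keywords never contain a dot, since the dot is used as the pair separator. Unlike the abbreviated list on the slide, it also emits the pair lapland.snow for Fig_2.

```python
from collections import Counter
from itertools import combinations


def map_pairs(keywords):
    """Map: emit ('a.b', 1) for every keyword pair attached to one image."""
    return [(f"{a}.{b}", 1) for a, b in combinations(sorted(set(keywords)), 2)]


def reduce_counts(emitted):
    """Reduce: sum the 1s emitted for each pair key."""
    totals = Counter()
    for pair, count in emitted:
        totals[pair] += count
    return totals


def suggest(totals, keyword):
    """Return the keyword that co-occurs most often with `keyword`, or None."""
    best, best_count = None, 0
    for pair, count in totals.items():
        a, b = pair.split(".")
        if keyword in (a, b) and count > best_count:
            best, best_count = (b if a == keyword else a), count
    return best


# Keyword lists taken from the slide.
images = {
    "Fig_1": ["finland", "snow"],
    "Fig_2": ["finland", "lapland", "snow"],
    "Fig_3": ["finland", "winter"],
}
emitted = [kv for keywords in images.values() for kv in map_pairs(keywords)]
totals = reduce_counts(emitted)

if __name__ == "__main__":
    print(totals)
    print(suggest(totals, "finland"))
```

Sorting the keywords before pairing makes finland.snow and snow.finland the same key, so the counts for a pair are not split between two orderings.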

Other tools..
- Google App Engine: http://code.google.com/appengine/
- Windows Azure: http://www.windowsazure.com/en-us/ [WindowsAzure]

References
[AWS] Amazon Web Services: http://aws.amazon.com/
[CloudArchitectures] George Reese: Cloud Application Architectures: Building Applications and Infrastructure in the Cloud, O'Reilly Media, 2009.
[CloudBestPractices] Architecting for the Cloud: Best Practices: http://media.amazonwebservices.com/aws_cloud_best_practices.pdf
[Hadoop] Open Source MapReduce, Apache Hadoop: http://hadoop.apache.org/
[Huhtanen2010] Karri Huhtanen: Cloud computing business models, 2010: http://www.cs.tut.fi/~tsysta/cloud-computing-business-models.pdf
[MapReduce] MapReduce: Simplified Data Processing on Large Clusters: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/mapreduce-osdi04.pdf
[OHJ-5202] OHJ-5202 Palvelupohjaiset järjestelmät (Service-based systems): http://www.cs.tut.fi/~palpo/