Part V: Applications
Cloud Computing: General concepts
Copyright K.Goseva 2010 CS 736 Software Performance Engineering Slide 1

What is cloud computing?
- SaaS: Software as a Service
- Cloud: datacenter hardware and software
- Public cloud: a cloud made available to the public on a pay-as-you-go basis
  - The service being sold is Utility Computing
  - Examples: Amazon Web Services, Google AppEngine, Microsoft Azure
- Private cloud: internal datacenters not available to the public
- CLOUD COMPUTING = SaaS + Utility Computing (typically does not include private clouds)

SaaS has been around for a while
- Advantages for service providers
  - Simplified software installation and maintenance
  - Centralized control over versioning
- Advantages for end users
  - Service access anytime, anywhere
  - Share data and collaborate more easily
  - Keep data stored safely in the infrastructure
- Cloud computing does not change these principles
  - It allows application providers to deploy their products as SaaS without provisioning a datacenter
New hardware aspects
- Illusion of infinite computing resources available on demand
  - No need for cloud computing users to plan ahead for provisioning
- Elimination of an up-front commitment by cloud users
  - Start small and increase hardware resources as the need increases
- Pay for computing resources on a short-term basis as needed, and release them when they are no longer needed
  - Pay for processors by the hour and for storage by the day

New hardware aspects (cont)
- Example: Amazon Web Services (AWS) charges
  - $0.10 per hour for a 1.0 GHz x86 ISA slice; a new slice can be added in 2 to 5 minutes
  - $0.12 to $0.15 per gigabyte-month for storage (Amazon's Simple Storage Service, S3)
  - $0.10 to $0.15 per gigabyte for data transfer in and out of AWS over the Internet
- Limiting factors
  - Data movement cost and/or latency may limit what can be moved in and out of the cloud
  - Example: stock trading

Cloud computing providers
- Building, provisioning, and launching a very large datacenter costs hundreds of millions of dollars
- Who? Most cloud computing providers already have existing investments in very large datacenters, large-scale software infrastructure, and the operational expertise to run them
- Why? Make a lot of money
  - Very large datacenters purchase hardware, network bandwidth, and power for 1/5 to 1/7 of the regular prices
  - The fixed cost of software development and deployment is spread over many more machines
  - Datacenters are built in areas where the costs of electricity, cooling, labor, and property, as well as taxes, are lower
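The pay-as-you-go prices quoted on the "New hardware aspects" slide can be turned into a rough bill estimate. The following is a minimal sketch, not Amazon's billing logic: the function name, its parameters, and the two example workloads are hypothetical, and the storage and transfer rates simply take the upper end of the quoted 2010 ranges.

```python
# Illustrative pay-as-you-go estimate using the 2010-era prices quoted above.
# The function name, parameters, and workload figures are hypothetical.

COMPUTE_PER_SLICE_HOUR = 0.10  # USD per 1.0 GHz x86 slice-hour
STORAGE_PER_GB_MONTH = 0.15    # USD, upper end of the $0.12 to $0.15 range
TRANSFER_PER_GB = 0.15         # USD, upper end of the $0.10 to $0.15 range


def monthly_cost(slices, hours_each, stored_gb, transferred_gb):
    """Return an estimated one-month bill in USD."""
    compute = slices * hours_each * COMPUTE_PER_SLICE_HOUR
    storage = stored_gb * STORAGE_PER_GB_MONTH
    transfer = transferred_gb * TRANSFER_PER_GB
    return compute + storage + transfer


if __name__ == "__main__":
    # Start small: 2 slices for a 720-hour month, 100 GB stored, 50 GB moved.
    print(f"small deployment: ${monthly_cost(2, 720, 100, 50):.2f}")
    # Grow as the need increases: 20 slices, 1 TB stored, 500 GB moved.
    print(f"large deployment: ${monthly_cost(20, 720, 1000, 500):.2f}")
```

Because compute is billed per slice-hour, the estimate depends only on the product of slices and hours, which is what makes starting small and growing later attractive.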
Cloud computing providers (cont)
- Why? (continued)
  - Leverage existing investment: adding cloud computing on top of existing infrastructure provides new revenue
  - Defend a franchise: provide a path for migrating existing customers to a cloud environment
  - Attack an incumbent: get established before a single "800-pound gorilla" has emerged
  - Leverage customer relationships: preserve the investment in existing customer relationships

Applications suitable for the cloud
- Mobile interactive applications that rely on large datasets conveniently hosted in large datacenters
- Parallel batch processing that analyzes terabytes of data and can take hours to finish
  - Using 1000 computers for one hour costs the same as using one computer for 1000 hours
- Analytics aimed at understanding customers, supply chains, buying habits, ranking, etc.
- Compute-intensive desktop applications
  - E.g., Matlab and Mathematica are now capable of using cloud computing

Computation models of different Cloud Computing providers
Amazon Elastic Compute Cloud (EC2)
- Users can control nearly the entire software stack, from the kernel upwards
  - No limit on the applications that can be hosted
  - The low level of virtualization allows developers to code whatever they want
- Difficult for Amazon to offer automatic scalability and failover
  - Replication and other state management issues are highly application dependent

Google AppEngine
- Targeted exclusively at traditional Web applications with a clean separation between a stateless computation tier and a stateful storage tier
  - Not suitable for general-purpose computing
- Automatic scaling and high-availability mechanisms

Microsoft Azure
- Intermediate solution: a tradeoff between flexibility and programmer convenience
- Applications are written using .NET libraries and compiled to the language-independent managed environment (Common Language Runtime)
  - Supports general-purpose computing
  - Users can choose the language but cannot control the underlying OS
- To some degree, the libraries provide automatic network configuration, scalability, and failover, but they require developers to specify some application properties
Cloud computing business model
- Shifting the risk due to elasticity
- Converting capital expenses to operating expenses (e.g., "pay-as-you-go")
- Demand is not uniform over time, so a fixed amount of purchased capacity leads either to wasted resources or to turning away the excess users
  - For many servers the peak workload exceeds the average by factors of 2 to 10
  - Seasonal and other periodic demand variations
  - AWS hosts Target.com, which experienced only gradual degradation of service on Black Friday
- The risk of misestimating the workload is shifted from the service operator to the cloud vendor

Cloud computing business model (cont)

$$\text{UserHours}_{\text{cloud}} \times (\text{revenue} - \text{Cost}_{\text{cloud}}) \;\ge\; \text{UserHours}_{\text{datacenter}} \times \left(\text{revenue} - \frac{\text{Cost}_{\text{datacenter}}}{\text{Utilization}}\right)$$

- If Utilization = 1, the two sides look the same
- Queueing theory teaches us that as Utilization approaches 1, response time approaches infinity
- In practice, usable Utilization is 0.6 to 0.8

Cloud computing business model (cont)
- Other benefits come from provisioning resources on the scale of hours rather than years
- Technology trends suggest that over the useful lifetime of purchased equipment
  - Hardware costs will fall
  - New hardware and software technologies will become available
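The business model comparison above sets the profit from serving user-hours in the cloud against the profit from serving them in an owned datacenter, where the datacenter's hourly cost is divided by its utilization. The sketch below evaluates both sides for a few utilization levels. All prices are invented for illustration, and the response-time function assumes a simple M/M/1 queue, a model the slides do not name.

```python
# Sketch of the cloud-versus-datacenter comparison. All numbers are
# hypothetical; the M/M/1 formula is an assumed queueing model.

def cloud_profit(user_hours, revenue, cost_cloud):
    """Left-hand side: profit when every user-hour is served from the cloud."""
    return user_hours * (revenue - cost_cloud)


def datacenter_profit(user_hours, revenue, cost_datacenter, utilization):
    """Right-hand side: the effective hourly cost rises as utilization falls."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return user_hours * (revenue - cost_datacenter / utilization)


def mm1_response_time(service_time, utilization):
    """Mean response time of an M/M/1 queue: R = S / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    revenue, cost_cloud, cost_dc = 1.00, 0.10, 0.07  # USD per user-hour
    for u in (1.0, 0.8, 0.6, 0.4):
        lhs = cloud_profit(1000, revenue, cost_cloud)
        rhs = datacenter_profit(1000, revenue, cost_dc, u)
        print(f"utilization={u:.1f}  cloud={lhs:.0f}  datacenter={rhs:.0f}  "
              f"cloud at least as good: {lhs >= rhs}")
    for u in (0.6, 0.8, 0.95, 0.99):
        r = mm1_response_time(1.0, u)
        print(f"utilization={u:.2f}  response time = {r:.1f} x service time")
```

With these made-up prices the datacenter is cheaper at full utilization but loses its advantage in the usable 0.6 to 0.8 range, and the second loop shows why utilization cannot simply be pushed toward 1.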
10 obstacles (and opportunities) for cloud computing

#1: Availability of service
- Would the Utility Computing service be available when needed?
- Google Search sets high standards in availability
- Can similar availability be achieved by Utility Computing?
  - Amazon Simple Storage Service (S3): 2-hour outage due to authentication service overload
  - Amazon S3: 6 to 8 hour outage; gossip protocol blowup due to a single-bit error
  - Google AppEngine: 5-hour partial outage due to a programming error
  - Gmail: 1.5-hour outage due to an outage in the contacts system
- Is using multiple cloud computing providers a solution?

#2: Data lock-in
- Can customers easily extract their data and programs from one site to run on another?
- Proprietary APIs
  - The online storage service Linkup shut down after losing 45% of its customers' data
  - Linkup relied on the online storage service Nirvanix
  - Linkup's 20,000 users were told the service was no longer available
- Standardization of the APIs is the obvious solution
  - In addition to mitigating data lock-in concerns, standardized APIs would avoid a single point of failure by allowing the use of multiple cloud computing providers for availability
#3: Data confidentiality & auditability
- Can customers trust the cloud?
- Can a cloud computing environment be as secure as in-house IT environments?
  - Encrypting data before placing it in the cloud
  - Security of the cloud is an open issue
- Requirements for auditability in compliance with HIPAA regulations
  - A healthcare company, TC3, moved its HIPAA-compliant application to AWS, storing encrypted data

#4: Data transfer bottleneck
- Data-intensive applications may lead to considerable data placement and transfer costs
- For large datasets it is faster and cheaper to transfer data by physically shipping hard disks via an overnight delivery service
  - Example: measured S3 write bandwidth is 5 to 18 Mbit/s. Even at about 20 Mbit/s, transferring 10 TB takes more than 45 days and costs around $1,000 in transfer fees
  - Sending ten 1 TB disks via overnight shipping would take less than a day and cost around $400
- Once data is in the cloud, it may no longer be a bottleneck and may enable new services
  - Amazon is hosting large public datasets, such as US Census data, for free on S3

#5: Performance unpredictability
- Multiple virtual machines can share CPUs and memory well, but I/O sharing is more problematic
- Possible solutions:
  - Improve architectures and operating systems to efficiently virtualize interrupts and I/O channels
    - IBM mainframes largely overcame these problems in the 1980s
  - Flash memory will decrease I/O interference
    - Semiconductor memory is much faster (microseconds vs. milliseconds) and uses less energy than mechanical hard disks
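The arithmetic behind the data transfer example in obstacle #4 can be checked with a few lines. This is a back-of-the-envelope sketch: the 20 Mbit/s link speed and the $0.10 per gigabyte fee are assumptions chosen because they reproduce the "45 days" and "$1,000" figures on the slide, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the network transfer figures for 10 TB.
# The 20 Mbit/s link speed and $0.10/GB fee are assumptions, not measurements.

SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_bits_per_second):
    """Days needed to push data_bytes through a link of the given speed."""
    seconds = data_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


def transfer_fee(data_bytes, usd_per_gb):
    """Transfer fee in USD, counting 1 GB as 10**9 bytes."""
    return data_bytes / 1e9 * usd_per_gb


if __name__ == "__main__":
    ten_tb = 10 * 10**12  # bytes
    for mbit in (5, 18, 20):
        days = transfer_days(ten_tb, mbit * 10**6)
        print(f"{mbit:>2} Mbit/s: {days:6.1f} days")
    print(f"fee at $0.10/GB: ${transfer_fee(ten_tb, 0.10):,.0f}")
    print("overnight shipping of ten 1 TB disks: under 1 day, about $400 (from the slide)")
```

At the lower measured bandwidths the network transfer takes even longer, which strengthens the case for shipping disks.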
#5: Performance unpredictability (cont)
- One third of today's server market is high-performance computing (HPC)
- Applications with potential parallelism (e.g., financial analysis, movie animation) can benefit from elastic computing
- However, HPC applications need to ensure that all threads of a program are running simultaneously, and today's virtual machines and operating systems do not provide a programmer-visible way to ensure this

#6: Scalable storage
- Properties of cloud computing such as short-term usage (scaling down and up), no up-front cost, and infinite capacity on demand are easier to apply to computation than to persistent storage
- Creating storage that scales arbitrarily up and down on demand and meets programmers' expectations for scalability, data durability, and high availability is still a research problem

#7: Bugs in large-scale distributed systems
- It is hard to reproduce and fix these bugs in smaller configurations, so debugging must occur at scale in the production datacenters
#8: Scaling quickly
- Pay-as-you-go
  - Storage and network bandwidth are charged by the byte
  - Computation is charged as follows
    - Google AppEngine automatically scales as the load increases and decreases, and charges users by the cycles used
    - AWS charges by the hour for the number of instances occupied, even if the machines are idle
- Being able to scale up and down automatically in response to load saves money
- Per-byte and per-hour costs encourage programmers to pay attention to efficiency (i.e., to acquire resources only when necessary)

#8: Scaling quickly (cont)
- Conservation of resources is another reason for scaling efficiently
  - An idle computer uses about 2/3 of the power of a busy computer
  - Careful use of resources would reduce the impact of datacenters on the environment

#9: Reputation fate sharing
- One customer's bad behavior can affect the reputation of the cloud as a whole
- The question of the transfer of legal liability is open
  - Cloud computing providers want legal liability to remain with the customer and not be transferred to them
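The claim under obstacle #8 that scaling with the load saves money under per-instance-hour billing can be illustrated with a toy comparison. This is a sketch under stated assumptions: the per-instance capacity, the 24-hour load profile, and the function names are all hypothetical, and only the $0.10 per instance-hour price comes from the slides.

```python
import math

# Toy comparison of peak provisioning and hourly re-sizing under
# per-instance-hour billing. Capacity and load figures are hypothetical.

PRICE_PER_INSTANCE_HOUR = 0.10  # USD, the EC2 price quoted in the slides
REQUESTS_PER_INSTANCE = 100     # assumed requests one instance serves per hour


def instances_needed(load):
    """Smallest number of instances that can carry the given hourly load."""
    return max(1, math.ceil(load / REQUESTS_PER_INSTANCE))


def fixed_cost(hourly_load):
    """Provision for the peak hour and keep every instance running throughout."""
    peak = max(instances_needed(load) for load in hourly_load)
    return peak * len(hourly_load) * PRICE_PER_INSTANCE_HOUR


def elastic_cost(hourly_load):
    """Re-size the fleet each hour so that idle instances are released."""
    instance_hours = sum(instances_needed(load) for load in hourly_load)
    return instance_hours * PRICE_PER_INSTANCE_HOUR


if __name__ == "__main__":
    # One day: a quiet night, a 4-hour peak, and a moderate remainder.
    day = [200] * 8 + [1000] * 4 + [400] * 12
    print(f"peak-provisioned: ${fixed_cost(day):.2f}")
    print(f"elastic:          ${elastic_cost(day):.2f}")
```

The gap between the two figures is the cost of paying for idle machines. Since an idle computer still draws about 2/3 of the power of a busy one, the same gap also hints at the energy that releasing idle instances could save.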
#10: Software licensing
- Cloud computing providers originally relied on open source software because the licensing model for commercial software is not a good match for Utility Computing
- Commercial software vendors may consider changing their licensing structure
  - Microsoft and Amazon now offer pay-as-you-go licensing for Windows Server and SQL Server
  - An EC2 instance running Microsoft Windows costs $0.15 per hour instead of the $0.10 per hour of the open source version

References
- Above the Clouds: A Berkeley View of Cloud Computing
- You may find the authors' conversation on YouTube interesting as well
- Is Cloud Computing hyped and overblown? (also on YouTube)