Cloud Computing. Up until now




Cloud Computing
Lecture 20: Cloud Platform Comparison & Load Balancing
2010-2011

Up until now
  Introduction, Definition of Cloud Computing
  Pre-Cloud Large Scale Computing:
    Grid Computing
    Content Distribution Networks
    Cycle-Sharing
    Distributed Scheduling
  Cloud:
    Map Reduce
    Storage
    Execution
    Monitoring
    Programming

Outline
  Cloud Platform Comparison
  Load Balancing

Comparison of Cloud Platforms
  Google / Google App Engine
  Hadoop
  Amazon Web Services / Eucalyptus
  Microsoft Azure

3 visions for Cloud Computing: Who will win?

            Amazon Web Services      Microsoft Azure                         Google App Engine
Computing   x86                      CLR (VM)                                Application framework (Python, Java)
Storage     Disk blocks              SQL Server API                          BigTable
Network     Blocks of IP addresses   Declarative but automatic (endpoints)   3-level application topology

This is the ideal model! In practice, the overlap is much larger!

Comparison: Storage

                 AWS / Eucalyptus             Microsoft Azure   Google / Hadoop
SQL              RDS                          SQL Azure         X
Tables           SimpleDB                     Tables            (Datastore [BigTable]) / HBase
Objects/Blocks   S3                           Blobs             GFS / HDFS
Queues           Simple Queue Service (SQS)   Queues            (Task Queue)

Comparison: Storage
There are two general complaints:
  Performance (latency).
  Coherency models do not scale.
The bottom line is that the storage scalability problem is not solved.
  There are no reliable metrics available.
  The market is still too dynamic.
Google services are not accessible remotely.
  It is always possible to make an intermediary bridge service.

Comparison: Programming Model
Programming languages:
  Amazon: Language not relevant. The program is a VM.
  Google: Java and Python.
  Azure: Any .NET language - C#, J#, VB.NET, etc.
Google (servlet/JSP) has the most restrictive model.
It is the simplest choice and will tend to be the first one until limitations are found.

Comparison: Remote Interaction Model
There are few differences/variations.
  All systems are based on Web Services.
  Most services support both REST and SOAP protocols.
  In most cases, applications/machines/services/stores have their own DNS names.
  Stored objects are identified by typeless strings.

Comparison: Integration
  The Amazon VM model permits normal interactions between servers.
  Google requires that other servers be accessible via Web Services.
  Azure supports richer integration mechanisms with external servers: AppFabric, Access Control and Queues.
  DryadLINQ transparently integrates local and remote applications.

Comparison: Price

Resource               Unit             Amazon                         Google   Microsoft
Bandwidth (outgoing)   GB               $0.03 - $0.085                 $0.12    $0.15
Bandwidth (ingoing)    GB               $0.10                          $0.10    $0.10
Computation            Instance hour    $0.10 - $1.201                 $0.10    $0.12
Storage                GB per month     $0.05 (>5PB) to 0.14 (<1TB)    $0.15    $0.15
Storage Calls          Each 10k calls   $0.01 (GET), $0.10 (others)             $0.01

Prices are very similar.
AWS, because it uses system VMs, has a larger granularity.

Platform/Application Match

Scenario: Application ported to the cloud
  Characteristics: Monolithic application in Java or .NET.
  Amazon: Normal EC2 instance. System configuration needed.
  Google: May require porting and requires data and logic refactoring.
  Microsoft: If .NET, refactor data. Otherwise more complex.

Scenario: Web Application
  Characteristics: Web app with load balancer, logic layer and database.
  Amazon: Normal EC2 instance + RDS. Requires system config. and AutoScale. If RDS does not scale, requires port to S3.
  Google: Very good match with Google App Engine. Automatic scalability. Requires DB rewrite.
  Microsoft: Well adapted to the Web Role model.

Scenario: Parallel Processing
  Characteristics: Long-lasting calculations without GUI.
  Amazon: Many pre-built instances with infrastructure, e.g. MPI. MapReduce instances may be used.
  Google: No support for larger scale applications.
  Microsoft: Worker roles + blobs and queues provide some/adequate support.

Scenario: Mixed Application
  Characteristics: Cloud application integrated with external servers.
  Amazon: EC2 instance may access external servers.
  Google: No direct support. Some integration possible using a bridge app to the Datastore.
  Microsoft: AppFabric ServiceBus supports integration with external applications.
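As a rough illustration of how the per-unit prices in the price comparison combine into a bill, the sketch below adds up bandwidth, computation and storage for one month. It uses only the single-valued Google and Microsoft cells from the table. The workload numbers (720 instance hours, 100 GB out, 20 GB in, 50 GB stored) and all names in the code are assumptions made up for the example, and storage-call charges are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Per-unit prices in USD, copied from one column of the price table."""
    out_gb: float            # outgoing bandwidth, per GB
    in_gb: float             # ingoing bandwidth, per GB
    instance_hour: float     # computation, per instance hour
    storage_gb_month: float  # storage, per GB per month


# Single-valued columns of the table; the Amazon column has ranges, so it is left out.
GOOGLE = Prices(out_gb=0.12, in_gb=0.10, instance_hour=0.10, storage_gb_month=0.15)
MICROSOFT = Prices(out_gb=0.15, in_gb=0.10, instance_hour=0.12, storage_gb_month=0.15)


def monthly_cost(p, out_gb, in_gb, instance_hours, stored_gb):
    """Sum the four metered resources for one month (storage calls ignored)."""
    return (p.out_gb * out_gb
            + p.in_gb * in_gb
            + p.instance_hour * instance_hours
            + p.storage_gb_month * stored_gb)


if __name__ == "__main__":
    # Hypothetical workload: one instance running all month.
    workload = dict(out_gb=100, in_gb=20, instance_hours=720, stored_gb=50)
    for name, prices in (("Google", GOOGLE), ("Microsoft", MICROSOFT)):
        print(f"{name}: ${monthly_cost(prices, **workload):.2f}")
```

With a workload like this one, instance hours dominate the bill, which is consistent with the remark that the listed prices are very similar.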

Hurdles to CC on the 3 Main Platforms
1. Availability: Depends on the SLA and the provider's track record.
2. Lock-In: Stronger with Google App Engine, then Azure, weaker with AWS.
3. Confidentiality and Auditing: In general confidentiality is guaranteed. No open auditing is available. Regarding applications, EC2 provides higher isolation.
4. Data transfer costs: Similar prices. AWS now has bulk transfer services (you can send them your disks). Cost/benefit is application dependent. Must be analyzed.
5. Reliable Performance: For general applications, the situation is similar: there are recovery and repetition mechanisms for most services. In the case of MapReduce there is a skipping mode to recover tasks.

Hurdles to Cloud Computing
6. Scalable storage
7. Large-scale software errors
8. Speed of scale-up: Clearer feedback with EC2 instances.
9. Reputation propagation: Similar situation on all 3 major platforms. Not solved. Less relevant for Google App Engine.
10. Compatible licensing: Only relevant at AWS (solved!)

Conclusions
The main difference between the main providers is the application model:
  Google has the most restrictive model.
  The cost of an easy-to-program system is more lock-in than lack of functionality.
  I can do whatever I want on EC2, but a scalable application will require distributed scalable services.

Scalability: What is the Best Approach for Cloud Computing Clients?
Handling flash crowds from your garage, USENIX '08

Flash Crowds!
We have seen several examples of scalability in a cloud platform. What about the clients?
What if I have a server running an application and need to scale? How do I adapt the front-ends?
Three main requirements:
  The system must scale to a very large size.
  The system must scale quickly.
  Off-peak operation must be cheap.

Available Tools (i)
Data storage services:
  Pros: they are cheap and they scale transparently for the user.
  Cons: they only solve the problem of static content.
Virtual servers:
  Before the cloud it was already possible to rent virtual servers from ISPs (even at different geographical locations).
  Cons: this only solves the bandwidth problem. Mostly, the computation of the distributed applications doesn't really scale.

Available Tools (ii)
Cloud computing services.
External DNS services: Prevent the service from facing a bottleneck on DNS requests.
MISSING! Scalable relational database service:
  As we have seen, it's not trivial to scale a classical relational database service.
  There are many similar services, but they always sacrifice some aspect: transactional model, features of the query language, scalability.

Scalable Architectures (i)
What is the best approach to matching a large set of clients with a multi-server service?
Hyp. 1: Use only a storage service.
  Good for servers with a large percentage of static content.

Scalable Architectures (ii)
Hyp. 2: Cluster with DNS load balancing
  Rent several machines (e.g. EC2).
  Add machines to the DNS record. By default, addresses are used in round-robin fashion.
  Causes delays for the clients that cached the DNS record, but in general the issue is the large number of clients and not a large number of requests from the same client.
  There are commercial implementations (e.g. RightScale).

Scalable Architectures (iii)
Hyp. 3: HTTP Redirection
  A server redirects the initial client request to one of a set of back-end servers. Subsequent requests don't go through the redirection.
Hyp. 4: L4 or L7 Rerouting
  A front-end server analyzes the request source (OSI level 4, e.g. TCP) or the content (OSI level 7, e.g. HTTP) and reroutes the request to the corresponding back-end server.
  Requires a high-performance server or switch, but the client does not see the redirection.
  There are commercial implementations (e.g. Flexiscale).
Hyp. 5: Hybrids of the 4 previous hypotheses.
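To make Hyp. 2 and Hyp. 3 more concrete, here is a minimal in-memory sketch of both ideas. It is not a real DNS server or web server: the class names, the address strings and the "least reported load" policy of the redirector are all assumptions chosen for the example. The first class rotates the address list on every lookup, and the second answers only the first request of a session with a redirect to one back-end.

```python
class RoundRobinDNS:
    """Hyp. 2 sketch: every lookup returns the address list rotated by one."""

    def __init__(self, addresses):
        self._addresses = list(addresses)
        self._offset = 0

    def add(self, address):
        # Scaling up means adding a machine to the record.
        self._addresses.append(address)

    def resolve(self):
        n = len(self._addresses)
        answer = [self._addresses[(self._offset + i) % n] for i in range(n)]
        self._offset = (self._offset + 1) % n
        return answer


class Redirector:
    """Hyp. 3 sketch: redirect the first request of a session to a back-end."""

    def __init__(self, backends):
        self._load = {backend: 0 for backend in backends}

    def report_load(self, backend, sessions):
        # Back-ends periodically report how many sessions they hold.
        self._load[backend] = sessions

    def redirect(self):
        # Assumed policy: pick the back-end with the lowest reported load.
        target = min(self._load, key=self._load.get)
        self._load[target] += 1
        return 302, {"Location": f"http://{target}/"}


if __name__ == "__main__":
    dns = RoundRobinDNS(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    print(dns.resolve())
    print(dns.resolve())

    front = Redirector(["a.example", "b.example"])
    print(front.redirect())
    print(front.redirect())
```

In the redirection case the front-end does a dictionary lookup per new session and nothing afterwards, since subsequent requests go straight to the chosen back-end.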

M: repl. front-end, N: repl. back-end
Approaches compared: Storage Service, HTTP Redirection, L4/L7 Rerouting, DNS Load Balancing.
Applicability: Static HTTP (Storage Service); HTTP (HTTP Redirection).
Scale Limitations: Significant; Client arrival rate.

Redirecting clients (especially if it is done only when a session begins) is very cheap, even if the front-end server is receiving back-end status reports and running a load balancing algorithm.

Scale Limitations: Significant; Client arrival rate; Request arrival rate; Unlimited.

The UDP-based DNS response has only 512 bytes (up to 25 back-end servers). Most ISPs complete the request using TCP if there are more than 25. However, some DNS clients only use the first reply.

Client Affinity (all client requests to the same server): Incoherent, but in the case of L4 there are growing hurdles to success: NAT, proxies, ...

Scale-up Time: Immediate; + DNS TTL.
Scale-down Time: Immediate; Session duration; Session duration; Days.

It is difficult to identify when sessions finish (e.g. webmail). There are DNS clients that ignore the TTL of DNS records and take days to invalidate their DNS cache.

Scale-up Time: The front-end VM start-up time of the storage service. Not the web server.

Front-end Fault, New Sessions: Significant Fault. BUT, it's cheaper than a replicated DNS service!

Front-end Fault, Ongoing Sessions: Has no effect; Rare Effect.
Replicated Front-end Fault, New Sessions: Improbable; Long delay for 1/m sessions?; Long delay for 1/m sessions?; Small Delay.

If there is load balancing of the redirection servers, one has to wait for the client to try another server. It should take at most 2.5 s, but in some Linux implementations it takes up to 3 min!

Back-end Fault: e.g., in S3 1% of first write attempts fail, but immediate retries succeed.
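The S3 remark above (about 1% of first write attempts fail, but immediate retries succeed) suggests wrapping storage writes in a small retry loop. The sketch below is one possible shape for such a wrapper, not the API of any real storage client: `write_with_retry`, its parameters, and the use of `IOError` as the failure signal are assumptions made for the example.

```python
import time


def write_with_retry(write, attempts=3, delay_s=0.0):
    """Call write() until it succeeds or the attempts are used up.

    write    -- hypothetical zero-argument callable that performs the write
                and raises IOError on a transient failure
    attempts -- maximum number of tries (must be at least 1)
    delay_s  -- pause between tries; 0 means retry immediately
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return write()
        except IOError as exc:
            last_error = exc
            if delay_s > 0:
                time.sleep(delay_s)
    raise last_error


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_write():
        # Simulated back-end: the first attempt fails, the retry succeeds.
        calls["count"] += 1
        if calls["count"] == 1:
            raise IOError("transient failure")
        return "stored"

    print(write_with_retry(flaky_write), "after", calls["count"], "attempts")
```

Because the retry is immediate by default, a failed first attempt costs the user one extra round trip rather than a visible error.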

M: repl. front-end, N: repl. back-end
Approaches compared (values listed in this order): Storage Service; HTTP Redir.; L4/L7 Rerouting; DNS Load Balancing

Applicability: Static HTTP; HTTP
Scale Limitations: Significant; Client arrival rate; Request arrival rate; Unlimited
Client Affinity: Incoherent
Scale-up Time: Immediate; + DNS TTL
Scale-down Time: Immediate; Session duration; Session duration; Days
Front-end Fault, New Sessions: Significant Fault
Front-end Fault, Ongoing Sessions: Has no effect; Rare Effect
Replicated Front-end Fault, New Sessions: Improbable; Long delay for 1/m sessions?; Long delay for 1/m sessions?; Small Delay
Replicated Front-end Fault, Ongoing Sessions: Improbable; Has no effect; 1/m sessions fail; Some sessions have small delay
Back-end Fault, New Sessions: Improbable; Has no effect; Has no effect; Long delay for 1/n sessions
Back-end Fault, Ongoing Sessions: Improbable; User recoverable fault; Occasional fault; Long delay for 1/n sessions

Example: MapCruncher
Map conversion site. Loaded with 25 GB of interactive demo maps.
Flash crowd due to Microsoft publicizing it.
The server had the theoretical capacity to handle the traffic (100 images/sec.), but the lack of reference locality (each client looking at different parts of the maps) made the thrashing unbearable.
Moved all the static content to S3: they pay $4 if there is no traffic.

Example 2: Asirra
CAPTCHA web service based on distinguishing cats from dogs.
EC2 servers + 100 GB of images placed on S3.
Database of image metadata: SQL server was slow.
Nightly transfer of an image-key-indexed structure (read-only DB) to each of the application servers.

Example 2: Asirra
How can the session state be maintained?
Hyp. 1: Inside S3. It's slow.
Hyp. 2: On the application servers' disks. Since they use DNS load balancing, it is not guaranteed that the question and the answer to the CAPTCHA go to the same server.
Solution: Forward all session requests to the same server. The server id is stored in the session id.
It's very cheap because it requires no disk accesses and only 10% change servers between request and response.
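The "server id stored in session id" idea can be sketched in a few lines. The session-id layout used here (server id, a dot, then a random token) and every function name are assumptions for illustration, since the slides do not say how Asirra encodes it. The point is only that any server can tell, without a disk access, which server owns a session and forward the request there.

```python
import secrets


def new_session_id(server_id):
    """Create a session id that embeds the id of the server holding its state."""
    return f"{server_id}.{secrets.token_hex(8)}"


def owner_of(session_id):
    """Recover the owning server; the hex token never contains a dot."""
    return session_id.rpartition(".")[0]


def handle(session_id, this_server, process, forward):
    """Serve the request locally if we own the session, otherwise forward it.

    process and forward are hypothetical callables supplied by the caller:
    process(session_id) handles the request on this server,
    forward(owner, session_id) relays it to the owning server.
    """
    owner = owner_of(session_id)
    if owner == this_server:
        return process(session_id)
    return forward(owner, session_id)


if __name__ == "__main__":
    sid = new_session_id("app-2")
    local = lambda s: "handled locally"
    relay = lambda owner, s: f"forwarded to {owner}"
    print(handle(sid, "app-2", local, relay))
    print(handle(sid, "app-1", local, relay))
```

Only the requests that DNS load balancing happens to send to the wrong server pay the cost of one extra hop.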

Example 2: Asirra
Again, a flash crowd after a trade fair appearance. 75000 requests in 24h.
Two interesting observations:
  30000 requests were from a DoS.
  Using more instances was cheap. The attacker gave up, but it would have been cheap to keep them running until a filter was set up.

Example 3: InkBlotPassword.com
Website for associating mnemonic images (Rorschach inkblots) to passwords.
After the two previous experiences, they simplified the development process.
Is it worth optimizing code?
  If optimizations are only for peak periods, it's better to pay for more machines.
The website was mentioned on Slashdot (tech news site) without the authors knowing.
They detected a flash crowd (request queue = 130!) and started 12 new nodes.
20 min. later, the website was stable.
Three days later they were again stable at only 3 servers.
Total cost of the flash crowd: $150.
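The InkBlotPassword.com story (a long request queue triggers extra nodes, and the site later shrinks back to a few servers) can be read as a simple scaling rule. The sketch below is one hypothetical version of such a rule. The target of 10 queued requests per node and the bounds of 3 and 20 nodes are invented for the example and are not the authors' actual policy, so the result is not meant to reproduce the 12 nodes from the slides.

```python
import math


def desired_nodes(queue_length, per_node_queue_target, min_nodes=1, max_nodes=20):
    """Return how many nodes to run for the current request queue.

    per_node_queue_target is the (assumed) number of queued requests one
    node is allowed to carry; the result is clamped to [min_nodes, max_nodes].
    """
    if per_node_queue_target <= 0:
        raise ValueError("per_node_queue_target must be positive")
    wanted = math.ceil(queue_length / per_node_queue_target)
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    # Flash crowd: a queue of 130 requests, as in the example above.
    print(desired_nodes(130, per_node_queue_target=10, min_nodes=3))
    # Off-peak: an empty queue falls back to the minimum.
    print(desired_nodes(0, per_node_queue_target=10, min_nodes=3))
```

Keeping the minimum small is what makes off-peak operation cheap, while the upper bound caps the bill if the queue is inflated by an attack.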

Next Time...
Cloud Data Centers