Cloud Computing, Lecture 24: Cloud Platform Comparison (2014-2015)

Up until now
- Introduction, Definition of Cloud Computing
- Pre-Cloud Large Scale Computing:
  - Grid Computing
  - Content Distribution Networks
  - Cycle-Sharing
  - Distributed Scheduling
- Cloud:
  - MapReduce
  - Storage
  - Execution
  - Monitoring
  - Programming

Outline
- Cloud Platform Comparison
- Load Balancing

Comparison of Cloud Platforms
- Google / Google App Engine
- Hadoop
- Amazon Web Services / Eucalyptus
- Microsoft Azure
- OpenStack

3 visions for Cloud Computing: Who will win?

| | Amazon Web Services | Microsoft Azure | Google App Engine |
|---|---|---|---|
| Computing | x86 | CLR (VM) | Application framework (Python, Java) |
| Storage | Disk blocks | SQL Server API | BigTable |
| Network | Blocks of IP addresses | Declarative but automatic (endpoints) | 3-level application topology |

- This is the ideal model!
- In practice, the overlap is much larger!

Comparison: Storage

| | AWS / Eucalyptus | Microsoft Azure | Google / Hadoop |
|---|---|---|---|
| SQL | RDS | SQL Azure | Google Cloud SQL (MySQL) |
| Tables | SimpleDB | Tables | (Datastore [BigTable]) / HBase |
| Objects/Blocks | S3 | Blobs | GFS / HDFS |
| Queues | Simple Queue Service (SQS) | Queues | (Task Queue) |

Comparison: Storage
- There are two general complaints:
  - Performance (latency).
  - Strict consistency models do not scale.
- The bottom line is that the storage scalability problem is not solved.
- There are no reliable metrics available.
  - The market is still too dynamic.
- Google services are not accessible remotely.
  - It is always possible to build an intermediary bridge service.

Comparison: Programming Model
- Programming languages:
  - Amazon: The language is not relevant. The program is a VM.
  - Google: Java, Go and Python.
  - Azure: Any .NET language (C#, J#, VB.NET, etc.).
- Google (servlet/JSP) has the most restrictive model.
- It is the simplest choice and will tend to be the first one adopted, until its limitations are found.

Comparison: Remote Interaction Model
- There are few differences or variations between the platforms.
- All systems are based on Web Services.
- Most services support both REST and SOAP protocols.
- In most cases, applications, machines, services and stores have their own DNS names.
- Stored objects are identified by typeless strings.

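To make the remote interaction model concrete, here is a minimal sketch of how a client might address a stored object over REST: the store has its own DNS name and the object is identified by a plain string key. The host name `storage.example.com`, the container name and the helper `object_url` are hypothetical and do not correspond to any particular provider's API.

```python
from urllib.parse import quote
from urllib.request import Request


def object_url(service_host: str, container: str, key: str) -> str:
    """Build a REST URL for an object identified by a typeless string key.

    Hypothetical layout: https://<service DNS name>/<container>/<key>.
    """
    # quote() percent-encodes characters that are unsafe in a URL path;
    # "/" is left untouched by default, so keys may look hierarchical.
    return f"https://{service_host}/{quote(container)}/{quote(key)}"


# The key is just a string; the store attaches no type to it.
url = object_url("storage.example.com", "lecture-notes", "2014/lecture 24.pdf")

# A REST read is a plain HTTP GET on that URL (built here, not sent).
request = Request(url, method="GET")
print(request.get_method(), request.full_url)
```

A SOAP call would carry the same container and key inside an XML envelope instead of in the URL path.
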
Comparison: Integration
- The Amazon VM model permits normal interactions between servers.
- Google requires that other servers be accessible via Web Services.
- Azure supports richer integration mechanisms with external servers: AppFabric, Access Control and Queues.
- DryadLINQ transparently integrates local and remote applications.

Comparison: Price

| Resource | Unit | Amazon | Google | Microsoft |
|---|---|---|---|---|
| Bandwidth (outgoing) | GB | $0.03 - $0.085 | $0.12 | $0.15 |
| Bandwidth (incoming) | GB | $0.10 | $0.10 | $0.10 |
| Computation | Instance hour | $0.10 - $1.201 | $0.10 | $0.12 |
| Storage | GB per month | $0.05 (>5PB) to $0.14 (<1TB) | $0.15 | $0.15 |
| Storage Calls | Each 10k calls | $0.01 (GET), $0.10 (others) | | $0.01 |

- Prices are very similar.
- AWS, because it uses system VMs, has a coarser granularity.

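As a rough illustration of how close the prices are, the sketch below applies the per-unit prices listed above to one workload. The workload figures are hypothetical, chosen only for the arithmetic. Amazon is left out because its prices are given as ranges rather than single values, and storage calls are ignored for simplicity.

```python
# Per-unit prices in USD, copied from the price comparison above.
PRICES = {
    "Google": {
        "out_gb": 0.12, "in_gb": 0.10,
        "instance_hour": 0.10, "storage_gb_month": 0.15,
    },
    "Microsoft": {
        "out_gb": 0.15, "in_gb": 0.10,
        "instance_hour": 0.12, "storage_gb_month": 0.15,
    },
}


def monthly_cost(prices, out_gb, in_gb, instance_hours, storage_gb):
    """Sum bandwidth, computation and storage charges for one month."""
    return (out_gb * prices["out_gb"]
            + in_gb * prices["in_gb"]
            + instance_hours * prices["instance_hour"]
            + storage_gb * prices["storage_gb_month"])


# Hypothetical workload: one instance running all month (30 days x 24 h),
# 100 GB sent, 50 GB received, 200 GB stored.
WORKLOAD = {"out_gb": 100, "in_gb": 50, "instance_hours": 720, "storage_gb": 200}

for provider, prices in PRICES.items():
    print(f"{provider}: ${monthly_cost(prices, **WORKLOAD):.2f}")
```

For this workload the instance hours dominate the bill, so the small difference in the hourly rate accounts for most of the gap between the two totals.
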
Platform/Application Match

Application ported to the cloud (monolithic application in Java or .NET)
- AWS: Normal EC2 instance. System configuration needed.
- Google App Engine: May require porting and requires data and logic refactoring.
- Azure: If .NET, refactor data. Otherwise more complex.

Web Application (web app with load balancer, logic layer and database)
- AWS: Normal EC2 instance + RDS. Requires system configuration and AutoScale. If RDS does not scale, requires a port to S3.
- Google App Engine: Very good match with Google App Engine. Automatic scalability. (Requires DB rewrite.)
- Azure: Well adapted to the Web Role model. If not .NET, port/cross-compile code.

Parallel Processing (long-lasting calculations without GUI)
- AWS: Many pre-built instances with infrastructure, e.g. MPI. MapReduce instances may be used.
- Google App Engine: No support for larger scale applications.
- Azure: Worker roles + blobs and queues provide some/adequate support.

Mixed Application / Integration / Workflow (cloud application integrated with external servers)
- AWS: EC2 instance may access external servers.
- Google App Engine: No direct support. Some integration possible using a bridge app to the Datastore.
- Azure: AppFabric ServiceBus supports integration with external applications.

Hurdles to Cloud Computing on the 3 Main Platforms
1. Availability: Depends on the SLA and the provider's track record.
2. Lock-in: Strongest with Google App Engine, then Azure, weakest with AWS.
3. Confidentiality and auditing: In general, confidentiality is guaranteed. No open auditing is available. Regarding applications, EC2 provides higher isolation.
4. Data transfer costs: Similar prices. AWS now has bulk transfer services (you can send them your disks). Cost/benefit is application dependent and must be analyzed.

Hurdles to Cloud Computing
5. Reliable performance: For general applications, the situation is similar: there are recovery and repetition mechanisms for most services. In the case of MapReduce, there is a skipping mode to recover tasks.
6. Scalable storage
7. Large-scale software errors
8. Speed of scale-out: Clearer feedback with EC2 instances.
9. Reputation propagation: Not solved. Less relevant for Google App Engine. Similar situation on all 3 major platforms.
10. Compatible licensing: Only relevant at AWS (solved!).

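The idea behind repetition with a skipping mode (hurdle 5) can be sketched in a few lines. This is a simplified illustration, not Hadoop's actual implementation: the function name `run_task_with_skipping` and its parameters are hypothetical. A task is re-run after a failure, and the record that caused the failure is skipped on the next attempt.

```python
def run_task_with_skipping(records, process, max_attempts=4):
    """Re-run a task after failures, skipping records that crashed it.

    Returns (results, skipped_indexes). Raises RuntimeError if the task
    still fails after max_attempts attempts.
    """
    skipped = set()
    for _ in range(max_attempts):
        results = []
        current = None  # index of the record being processed
        try:
            for index, record in enumerate(records):
                if index in skipped:
                    continue  # known bad record: skip it on this attempt
                current = index
                results.append(process(record))
            return results, sorted(skipped)
        except Exception:
            # Remember the offending record and repeat the whole task.
            skipped.add(current)
    raise RuntimeError("task failed after max_attempts attempts")


# Example: the third record cannot be parsed as an integer.
results, skipped = run_task_with_skipping(["1", "2", "x", "4"], int)
print(results, skipped)
```

The trade-off is that the output is computed from slightly fewer records, which is acceptable only when losing a small number of bad inputs does not invalidate the result.
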
Conclusions
- The main difference between the main providers is the application model:
  - Google has the most restrictive model.
- The cost of an easily programmable system is lock-in more than lack of functionality:
  - I can do whatever I want on EC2, but a scalable application will require distributed scalable services.

Next Time...
- Cloud Data Centers