INFO5011 Advanced Topics in IT: Cloud Computing Week 10: Consistency and Cloud Computing
Dr. Uwe Röhm, School of Information Technologies
INFO5011 "Cloud Computing" (U. Röhm and Y. Zhou)

Outline
- Notions of Consistency
- CAP Theorem
- CALM Conjecture
- Eventual Consistency
- Properties
- Dynamic Data Placement
- Data Consistency Properties and the Trade-offs in Commercial Cloud Storages

Revisit: The CAP Theorem [Brewer, PODC2000]
- Consistency
- Availability
- Partitioning Tolerance
- Theorem: You can have at most two of these properties for any shared-data system.

Notions of Consistency
- Strong Consistency (aka 1-copy-Serializability)
  - behavior as if there is only one copy of each data item, and
  - only serializable accesses permitted
- Weak Consistency
  - No guarantee that a subsequent (read) access from this or any other client will return a just-updated value
  - Might mean several versions of data (i.e. not 1-copy), might mean not serializable
  - Updates will be propagated eventually to all replicas after some delay ("inconsistency interval") => Eventual Consistency

Eventual Consistency
- A model originally proposed for disconnected operation (e.g., mobile computing)
- Different nodes keep replicas and each update is eventually propagated to each replica
  - And eventually, there is agreement on which update is the latest
- DNS is the most well-known system implementing eventual consistency
- Usual definition is counterfactual: once updating ceases, and the system stabilizes, then after a long enough period, all replicas will have the same value

Why Eventual Consistency?
- Argument 1: Availability is King, Network Partitioning is Fact
  - Cloud infrastructure, especially the lowest level (data storage), has a crucial "always on" requirement
  - But networking can fail (temporarily) even within one rack or between different racks of the same data center
  - Algorithms for strong consistency would block until the network is up again => seen as unacceptable
- Argument 2: Latency Penalty and Costs too high
  - In some applications, replication between different data centers is needed. Doing this synchronously (strongly consistent) would impose a huge performance penalty
  - Typical latency within one data center: 2-3 ms
  - Typical latency between continents: >100 ms
  - Also: replication needs bandwidth, which translates into costs

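The counterfactual definition on the Eventual Consistency slide can be illustrated with a small sketch. It assumes a last-writer-wins rule based on timestamps and a simple pull-based anti-entropy round; neither is stated on the slides, and the names `Replica` and `anti_entropy_round` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One replica holding versioned values; the highest timestamp wins (assumed rule)."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def merge_from(self, other):
        """Anti-entropy step: pull every entry from another replica."""
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)


def anti_entropy_round(replicas):
    """Every replica pulls once from every other replica."""
    for target in replicas:
        for source in replicas:
            if source is not target:
                target.merge_from(source)


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[0].write("x", "v1", timestamp=1)  # update accepted at replica a
replicas[2].write("x", "v2", timestamp=2)  # later update accepted at replica c

before = [r.read("x") for r in replicas]   # replicas disagree while updates are in flight
anti_entropy_round(replicas)               # updating has ceased; propagation catches up
after = [r.read("x") for r in replicas]    # every replica now holds the latest update
```

Before the round the three replicas return different answers; after it they agree on the update with the highest timestamp, which is the "agreement on which update is the latest" the slide refers to.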
Example from Yahoo! [VLDB2011]
- Scenario: a social networking application that uses this distributed database
- Globally distributed database, with replicas kept in sync via an asynchronous replication mechanism
  - Inter-data-center communication needed
- Users typically show some locality, which should direct the replication mechanism to avoid costs and improve performance

Figure 1: Globally replicated database that asynchronously propagates updates to remote datacenters.

Sidenote: Partial Replication to Save Bandwidth while Increasing Latency
- Goal: replication algorithm with low bandwidth and low latency
- Side-Constraints:
  - Latency SLAs (at least local access must be fast)
  - Legal Constraints (not all data is allowed to be stored everywhere)
- Approach:
  - Asynchronous, primary-copy replication
    - Any write is guaranteed to be done at the master first
  - Partial replication
    - Full copies vs. stubs
    - Read-everywhere, but if stub, it is actually a remote read (latency penalty)
  - Policy Constraints that define the allowable and mandatory locations for full replicas of each record, and the minimum number of full replicas for each record
  - Dynamic placement algorithm that takes local reads/writes into account for promoting stubs to full copies and vice versa

Excerpt from the paper shown on the slides:
... east coast of the U.S., and in France. For clarity in our discussion, we will focus on a single table containing records, but our techniques generalize directly to multiple tables or other data models. Each replica location stores a full or partial copy of the table. Because of the high latency for communicating between datacenters, replication is typically done asynchronously. Usually, writes are persisted to one or more local servers and acknowledged to the applications (e.g., made 1-safe). Later, updates are sent to other replica locations. An example of this architecture is shown in Figure 1. As the figure shows, we can think of the system as having two distinct components: a database system, which manages reads and writes of data records, and a replication system, which manages replication of updates between replica locations. In real systems these components might be on the same server (as in MySQL replication [3]) or different servers (as in PNUTS [9]). The replication system must ensure reliable delivery of updates to remote datacenters despite failures. Individual servers might fail (and even lose data), but local or remote copies can be used for recovery. In each location, a given record exists either as a full replica or as a stub. A full replica is a normal copy of the record, possibly enhanced with metadata to support selective replication, such as a list of other full replicas. A stub contains only the record's primary key and metadata, but no data values. Note that we do not consider selective replication at the field or column level in this paper.

Dynamic Placement
- Dynamic demotion/promotion of stubs
  - Reading a stub: stub promoted to full replica
  - Update on replica: if after retention interval -> stub

Extra Properties for Eventual Consistency
- Programming an application is much harder if storage supports only eventual consistency
  - What to do until everything settles down?
  - Handling inconsistency in the sequence of reads
  - Cf. Hellerstein et al., PODS '10 / CIDR '11: the eventual consistency data model supports monotonic programs (a very limited class)
- A range of extra properties, which (if the storage provides them) can make programming not quite so hard
  - e.g., read-your-own-writes, monotonic reads, ...

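The two rules on the Dynamic Placement slide can be sketched as follows. This is a minimal illustration, not the algorithm from the VLDB 2011 paper: it assumes the retention interval is measured from the most recent local read, it ignores the policy constraints on allowed locations and minimum replica counts, and the names `LocalCopy`, `read`, and `apply_update` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalCopy:
    """A record at one replica location: either a full replica or a stub."""
    key: str
    value: Optional[str]     # None means this location holds only a stub
    last_local_read: float   # time of the most recent local read

    @property
    def is_stub(self):
        return self.value is None


def read(copy, now, fetch_remote):
    """Reading a stub fetches the value remotely and promotes the stub."""
    if copy.is_stub:
        copy.value = fetch_remote(copy.key)  # remote read: latency penalty
    copy.last_local_read = now
    return copy.value


def apply_update(copy, new_value, now, retention_interval):
    """An update reaching a full replica that was not read recently demotes it."""
    if copy.is_stub:
        return                               # stubs carry no data values
    if now - copy.last_local_read > retention_interval:
        copy.value = None                    # assumed rule: demote to stub
    else:
        copy.value = new_value


copy = LocalCopy(key="user:42", value=None, last_local_read=0.0)
first = read(copy, now=10.0, fetch_remote=lambda key: "profile-v1")
promoted = not copy.is_stub
apply_update(copy, "profile-v2", now=15.0, retention_interval=60.0)
kept = copy.value
apply_update(copy, "profile-v3", now=500.0, retention_interval=60.0)
demoted = copy.is_stub
```

The first read promotes the stub, an update shortly afterwards is applied in place, and an update arriving long after the last local read turns the copy back into a stub so that later updates need not be shipped to this location.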
Properties of Eventual Consistency
- Causal Consistency
  - If client A has communicated to client B that an item has been updated by A, B will see A's updated value
- Read-your-writes Consistency
  - Special form of Causal Consistency for A=B
- Session Consistency
  - Practical version of Read-your-writes Consistency that guarantees this property just within one session, but not between separate sessions
- Monotonic Read Consistency
  - A subsequent read will never return an older version of a data item than a previous read
- Monotonic Write Consistency
  - Serializability just for writes

The CALM Conjecture: Consistency and Logical Monotonicity
- Observation 1: monotonic => eventually consistent
- Observation 2: coordination at every non-monotonic operation => eventually consistent
- Conjecture: non-monotonic and uncoordinated => inconsistent
[Hellerstein, PODS 2010 Keynote]

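Read-your-writes and monotonic reads, as defined on the Properties of Eventual Consistency slide, can be enforced on the client side if every value carries a version. The sketch below assumes integer versions that grow with each write and replicas modelled as plain dictionaries; the `Session` class and `session_read` function are hypothetical and not taken from any of the systems discussed.

```python
class Session:
    """Tracks the newest version this session has written or read, per key."""

    def __init__(self):
        self.min_version = {}  # key -> lowest version the session may accept

    def note_write(self, key, version):
        self.min_version[key] = max(self.min_version.get(key, 0), version)

    def accept_read(self, key, version):
        """Return True if the read respects read-your-writes and monotonic reads."""
        if version < self.min_version.get(key, 0):
            return False       # stale for this session: try another replica
        self.min_version[key] = version
        return True


def session_read(session, key, replicas):
    """Try replicas in order until one returns a version the session accepts."""
    for replica in replicas:
        version, value = replica[key]
        if session.accept_read(key, version):
            return value
    raise RuntimeError("no replica is fresh enough for this session")


stale_replica = {"x": (1, "old")}
fresh_replica = {"x": (2, "new")}

session = Session()
session.note_write("x", 2)  # this session wrote version 2 of x
result = session_read(session, "x", [stale_replica, fresh_replica])
```

The guarantee holds only within one `Session` object, which matches the slide's description of session consistency: a second session starts with no history and may still read the old value.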
State-of-the-Art with Current Offerings for Cloud Storage
- CIDR 2011 Paper: Data Consistency Properties and the Trade-offs in Commercial Cloud Storages: the Consumers' Perspective
- The following slides are from this talk, available in the original from

CIDR 2011: Consistency from the Consumer's Perspective
- Paper investigates consistency models provided by commercial cloud storages
  - If weak consistency, which extra properties are supported?
  - How often and in what circumstances is inconsistency (stale values) observed?
  - Any differences between what is observed and what is announced by the vendor?
- Investigation of the benefits for consumers of accepting weaker consistency models
  - Are the benefits significant enough to justify the consumers' effort?
  - When a vendor offers a choice of consistency model, how do they compare in practice?

Observed Platforms
- A variety of commercial cloud NoSQLs that are offered as storage service
- Amazon S3
  - Two options: Regular and Reduced redundancy (durability)
- Amazon SimpleDB
  - Two options: Consistent Reads and Eventual Consistent Reads
- Google App Engine datastore
  - Two options: Strong and Eventual Consistent Reads
- Windows Azure Table and Blob
  - No option available in data consistency

Frequency of Observing Stale Data
- Experimental Setup
  - A writer updates data once each 3 secs, for 5 mins
    - On GAE, it runs for 27 seconds
  - A reader(s) reads it 100 times/sec
  - Check if the data is stale by comparing value seen to the most recent value written
  - Plot against time since most recent write occurred
- Execute the above once every hour
  - On GAE, every 10 min
  - For at least 10 days
  - Done in Oct and Nov, 2010

SimpleDB: Read and Write from a Single Thread
- With eventual consistent read, 33% chance to read the freshest value within 500 ms
  - Perhaps one master and two other replicas. Read takes value randomly from one of these?
- First time for eventual consistent read to reach 99% fresh is stable at 500 ms
- Outlier cases of stale read after 500 ms, but no regular daily or weekly variation observed

Stale Data in Other Cloud Data Stores
(Cloud NoSQL and accessing source: what was observed)
- SimpleDB (access from one thread, two threads, two processes, two VMs or two regions)
  - Accessing source has no effect on the observable consistency.
  - Eventual consistent reads have 33% chance to see stale value, till 500 ms after write.
- S3 (with five access configurations)
  - No stale data was observed in ~4M reads/config.
  - Providing better consistency than SLA describes.
- GAE datastore (access from a single app or two apps)
  - Eventual consistent reads from different apps have very low (3.3E-4 %) chance to observe values older than previous reads.
  - Other reads never saw stale data.
- Azure Storages (with five access configurations)
  - No stale data observed.
  - Matches SLA described by the vendor (all reads are consistent).

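The slide's guess about SimpleDB ("perhaps one master and two other replicas, read takes value randomly from one of these") can be checked against the observed 33% fresh reads with a short simulation. The model below is only that guess turned into code: the write is assumed to reach the master at once and the other replicas after a fixed lag, and the function name and parameters are hypothetical.

```python
import random


def simulate_fresh_fraction(num_reads, num_replicas=3, lag_ms=500.0,
                            window_ms=500.0, seed=0):
    """Fraction of fresh reads issued within window_ms after a write.

    Assumed model (the slide's guess, not a confirmed design): the write
    reaches one master immediately and the other replicas after lag_ms;
    each read picks one replica uniformly at random.
    """
    rng = random.Random(seed)
    fresh = 0
    for _ in range(num_reads):
        elapsed = rng.uniform(0.0, window_ms)  # time since the write
        replica = rng.randrange(num_replicas)  # replica 0 is the master
        if replica == 0 or elapsed >= lag_ms:
            fresh += 1
    return fresh / num_reads


fraction = simulate_fresh_fraction(100_000)
```

Under these assumptions roughly one read in three is fresh during the first 500 ms, which is consistent with the measurement; it does not prove that SimpleDB is built this way.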
Additional Properties: Read-Your-Writes?
- Read-your-writes: a read always sees a value at least as fresh as the latest write from the same thread/session
- Our experiment: When reader and writer share 1 thread, all reads should be fresh
- SimpleDB with eventual consistent read: does not have this property
- GAE with eventual consistent read: may have this property
  - No counterexample observed in ~3.7M reads over two weeks

Additional Properties: Monotonic Reads?
- Monotonic Reads: Each read sees a value at least as fresh as that seen in any previous read from the same thread/session
- Our experiment: check for a fresh read followed by a stale one
- SimpleDB: Monotonic read consistency is not supported
  - In staleness, two successive eventual consistent reads are almost independent
  - The correlation between staleness in two successive reads (up to 450 ms after write) is very low

                2nd Stale         2nd Fresh
    1st Stale   39.94% (~1.9M)    21.08% (~1.0M)
    1st Fresh   23.36% (~1.1M)    15.63% (~0.7M)

- GAE with eventual consistent read: not supported
  - 3.3E-4 % chance to read values older than previous reads

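The claim that two successive reads are almost independent can be checked from the four percentages on the Monotonic Reads slide. The sketch below uses the phi coefficient for a 2x2 table; the slide does not say which correlation measure the authors used, so this choice is an assumption, as is the reading that 39.94% is stale-then-stale and 15.63% is fresh-then-fresh. Swapping the two mixed cells does not change the result.

```python
from math import sqrt


def phi_coefficient(both_stale, stale_then_fresh, fresh_then_stale, both_fresh):
    """Phi coefficient (correlation of two binary variables) for a 2x2 table."""
    first_stale = both_stale + stale_then_fresh
    first_fresh = fresh_then_stale + both_fresh
    second_stale = both_stale + fresh_then_stale
    second_fresh = stale_then_fresh + both_fresh
    numerator = both_stale * both_fresh - stale_then_fresh * fresh_then_stale
    denominator = sqrt(first_stale * first_fresh * second_stale * second_fresh)
    return numerator / denominator


# Percentages taken from the slide; the cell assignment is an assumption.
phi = phi_coefficient(both_stale=39.94, stale_then_fresh=21.08,
                      fresh_then_stale=23.36, both_fresh=15.63)
```

A phi value near zero means that knowing whether the first read was stale says little about the second, so a fresh read followed by a stale one is common and monotonic reads cannot be relied on.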
Additional Properties: Monotonic Writes?
- Monotonic Writes: Each write is completed in a replica after previous writes have been completed
  - Programming is notoriously hard if monotonic write consistency is missing
  - W. Vogels. Eventually consistent. Commun. ACM, 52(1), 2009.
- This is an implementation property, not directly visible to consumer. But we explore what happens when we do successive writes, and try to read the data

SimpleDB's Eventual Consistent Read: Monotonic Write
- A data item has value v0 before each run. Write value v1 and then v2 to it, then read it repeatedly
- [Plots: one panel for v1 != v2, one panel for v1 = v2]
- When v1 != v2, writing v2 pushes v1 to replicas immediately (previous value v0 is not observed)
  - Very different from the "only writing one value" case
- When v1 = v2, the second write does not push (v0 is observed)
  - Same as the "only writing one value" case

SimpleDB's Eventual Consistent Read: Further Exploration -- Inter-Element Consistency
- SimpleDB's Data Model: Domain, Item, Attribute, Value
- SimpleDB's API
  - Write a value
  - Write multiple values in an item
  - Write multiple values across items in a domain
  - Read a value
  - Read multiple values in an item
  - Read multiple values across items in a domain
- Consistency between two values when writing and reading them through various combinations of APIs
  - Reading two values independently: each read has 33% chance of freshness. Each read operation is independent
  - Writing two at once and reading two at once: both are stale or both are fresh. It seems batch write and batch read access one replica
  - Writing two in the same domain independently: the second write pushes the value of the first write (but only if the two values are different)

Trade-Off Analysis of SimpleDB: A Benefit for Consumer from Weak Consistency?
- No significant difference was observed in RTT, throughput, failure rate under various read-write ratios
  - If anything, it favors consistent read!
- Financial cost is exactly the same
* Each client sends 1 rps. All obtained under 99:1 read-write ratio

What Consumers Can Observe (as of the state of these experiments)
- SimpleDB platform showed frequent inconsistency
- It offers option for consistent read. No extra costs for the consumer were observed from our experiments
  - At least under the scale of our experiments (few KB stored in a domain and ~2,500 rps)
- Maybe the consumer should always program SimpleDB with consistent reads?

What Consumers Can Observe (contd) (as of the state of these experiments)
- Some platforms gave (mostly) better consistency than they promise
  - Consistency almost always (5-nines or better)
  - Perhaps consistency violated only when network or node failure occurs during execution of the operation
- Maybe the chance of seeing stale data is so rare on these platforms that it need not be considered in programming?
  - There are other, more frequent, sources of data corruption such as data entry errors
  - The manual processes that fix these may also be used for rare errors from stale data

Implications of these Experiments for Consumers?
- Can a consumer rely on our findings in decision-making? NO!
  - Vendors might change any aspect of implementation (including presence of properties) without notice to consumers
    - e.g., shift from replication in a data center to geographical distribution
  - Vendors might find a way to pass on to consumers the savings from eventual consistent reads (compared to consistent ones)
- The lesson is that clear SLAs are needed, that clarify the properties that consumers can expect

Summary
- Strong Consistency
  - is most desirable for consumers, but seen as not achievable with network partitioning tolerance
  - Cf. CAP Theorem
- Eventual Consistency
  - Allows nodes to disagree on current data value
  - But current algorithms/systems differ widely in how this is achieved
- Current Cloud Storage Implementations
  - Provide different variants of eventual consistency without disclosing the implementation details or clear SLAs
  - Currently missing SLAs (observable, but no guarantees):
    - Rate of inconsistent reads, time to convergence, performance under variety of workloads, availability, costs

References
- Werner Vogels: Eventually Consistent. Communications of the ACM, Volume 52, No. 1, Jan 2009.
- H. Wada, A. Fekete, L. Zhao, K. Lee and A. Liu: Data Consistency Properties and the Trade-offs in Commercial Cloud Storages: the Consumers' Perspective. In CIDR 2011.
- Sudarshan Kadambi, Jianjun Chen, Brian F. Cooper, David Lomax, Raghu Ramakrishnan, Adam Silberstein, Erwin Tam, Hector Garcia-Molina: Where in the World is My Data? In: VLDB 2011.
- Joseph M. Hellerstein: The Declarative Imperative: Experiences and Conjectures in Distributed Logic. SIGMOD Record 39:1, March
More informationDesign and Evolution of the Apache Hadoop File System(HDFS)
Design and Evolution of the Apache Hadoop File System(HDFS) Dhruba Borthakur Engineer@Facebook Committer@Apache HDFS SDC, Sept 19 2011 Outline Introduction Yet another file-system, why? Goals of Hadoop
More informationCluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.
Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one
More informationTransactions and ACID in MongoDB
Transactions and ACID in MongoDB Kevin Swingler Contents Recap of ACID transactions in RDBMSs Transactions and ACID in MongoDB 1 Concurrency Databases are almost always accessed by multiple users concurrently
More informationSPM rollouts in Large Ent erprise: different iat ing exist ing cloud architectures
SPM rollouts in Large Ent erprise: different iat ing exist ing cloud architectures 1 Table of contents Why this white paper?... 3 SPM for SMEs vs. SPM for LEs... 3 Why a multi-tenant and not single-tenant
More informationCan the Elephants Handle the NoSQL Onslaught?
Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented
More informationCS5412: ANATOMY OF A CLOUD
1 CS5412: ANATOMY OF A CLOUD Lecture VII Ken Birman How are cloud structured? 2 Clients talk to clouds using web browsers or the web services standards But this only gets us to the outer skin of the cloud
More informationENZO UNIFIED SOLVES THE CHALLENGES OF OUT-OF-BAND SQL SERVER PROCESSING
ENZO UNIFIED SOLVES THE CHALLENGES OF OUT-OF-BAND SQL SERVER PROCESSING Enzo Unified Extends SQL Server to Simplify Application Design and Reduce ETL Processing CHALLENGES SQL Server does not scale out
More informationFacebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election
More informationWindows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft
Windows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft Agenda: Windows Azure Storage Overview Architecture Key Design Points 2 Overview Windows Azure Storage Cloud Storage - Anywhere and
More informationCS2510 Computer Operating Systems
CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction