
1 Scalability
The ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth.
We can measure growth in almost any terms, but there are three particularly interesting things to look at:
- Size scalability: adding more nodes should make the system linearly faster; growing the dataset should not increase latency.
- Geographic scalability: it should be possible to use multiple data centers to reduce the time it takes to respond to user queries, while dealing with cross-data-center latency in some sensible manner.
- Administrative scalability: adding more nodes should not increase the administrative costs of the system (e.g. the administrators-to-machines ratio).
A scalable system is one that continues to meet the needs of its users as scale increases. There are two particularly relevant aspects - performance and availability - which can be measured in various ways.

2 Performance (and latency)
Characterization of the amount of useful work accomplished by a computer system compared to the time and resources used. Depending on the context, this may involve achieving one or more of the following:
- Short response time / low latency for a given piece of work
- High throughput (rate of processing work)
- Low utilization of computing resource(s)
Latency: the state of being latent; a delay, the period between the initiation of something and its occurrence.
Latent: from Latin latens, latentis, present participle of lateo ("lie hidden"). Existing or present but concealed or inactive.

3 Availability (and fault tolerance)
The proportion of time a system is in a functioning condition. If a user cannot access the system, it is said to be unavailable. As a formula: availability = uptime / (uptime + downtime).
From a technical perspective, availability is mostly about being fault tolerant. Fault tolerance is the ability of a system to behave in a well-defined manner once faults occur.

Availability | Nickname    | Downtime per year
90%          | one nine    | more than a month
99%          | two nines   | less than 4 days
99.9%        | three nines | less than 9 hours
99.99%       | four nines  | less than 1 hour
99.999%      | five nines  | about 5 minutes
99.9999%     | six nines   | about 31 seconds
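
The downtime figures in the table follow directly from the availability formula. A minimal sketch of the calculation (assuming a 365-day year; the function name is illustrative):

```python
# Downtime budget per year implied by a given availability level.
# Assumes a 365-day year and steady-state availability.

def downtime_per_year(availability: float) -> str:
    """availability is a fraction, e.g. 0.999 for 'three nines'."""
    seconds = (1.0 - availability) * 365 * 24 * 3600
    if seconds >= 86400:
        return f"{seconds / 86400:.1f} days"
    if seconds >= 3600:
        return f"{seconds / 3600:.1f} hours"
    if seconds >= 60:
        return f"{seconds / 60:.1f} minutes"
    return f"{seconds:.0f} seconds"

for nines in range(1, 7):
    availability = 1 - 10 ** -nines
    print(f"{nines} nines ({availability:.6%}): {downtime_per_year(availability)}")
```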

4 Coordination

5 Consensus & Agreement
It is generally important that the processes within a distributed system reach some sort of agreement. Coordination among multiple parties involves agreement among those parties.
Related notions: agreement, consensus, consistency.
Agreement is difficult in a dynamic asynchronous system in which processes may fail or join/leave.

6 System Model
A system model is a set of assumptions about the environment and facilities on which a distributed system is implemented. Assumptions include:
- what capabilities the nodes have and how they may fail
- how communication links operate and how they may fail
- overall properties of the system, such as assumptions about time and order
A robust system model is one that makes the weakest assumptions.

7 Node Model
Nodes serve as hosts for computation and storage. They have:
- the ability to execute a program
- the ability to store data into volatile memory (which can be lost upon failure) and into stable state (which can be read after a failure)
- a clock (which may or may not be assumed to be accurate)
Nodes execute deterministic algorithms: the local computation, the local state after the computation, and the messages sent are determined uniquely by the message received and the local state when the message was received.

8 Node Failures
There are many possible failure models which describe the ways in which nodes can fail.
- Crash faults: nodes can only fail by crashing, and can (possibly) recover after crashing at some later point.
- Omission faults: nodes can fail by crashing or by dropping messages.
- Byzantine faults: nodes can fail by misbehaving in any arbitrary way: sending wrong messages, ignoring messages, sending useless messages, faking state, etc.

9 Link Model
Links connect individual nodes to each other. Messages may be sent in either direction. The network is unreliable and subject to message loss and delays.
A network partition occurs when the network fails while the nodes themselves remain operational. Partitioned nodes may still be accessible by some clients, and so must be treated differently from crashed nodes.

10 Link Failures
There are many possible failure models which describe the ways in which links can fail.
- Omission faults: links can fail by losing or delaying messages, transiently or permanently.
- Byzantine faults: links can misbehave in any arbitrary way: corrupting, duplicating, reordering, or fabricating messages.

11 Communication Patterns
(diagram: request/response message exchange, shown for the synchronous and the asynchronous case)

12 Timing/Ordering
Synchronous system model: processes execute in lock-step; there is a known upper bound on message transmission delay; each process has an accurate clock.
Asynchronous system model: no timing assumptions; processes execute at independent rates; there is no bound on message transmission delay; useful clocks do not exist.

13 Impossibility Theorems
Two fundamental theorems, FLP and CAP, influence system design choices.
FLP theorem (asynchronicity vs. synchronicity): consensus is impossible to implement in such a way that it both a) is always correct and b) always terminates if even one machine might fail in an asynchronous system with crash-stop failures.
CAP theorem (what happens when network partitions are included in the failure model): you can't implement consistent storage and respond to all requests if you might drop messages between processes.

14 FLP
Impossibility of Distributed Consensus with One Faulty Process, by Fischer, Lynch and Paterson (1985).
Consensus problem: we have a set of processes, each one with a private input; the processes communicate; the processes must agree on some process's input.

15 Consensus is important
With consensus we can implement anything we can imagine: leader election, mutual exclusion, transaction commitment, and much more.
In some models consensus is possible; in other models, it is not. The goal is to learn whether, for a given model, consensus is possible or not - and prove it!

16 Consensus Properties
1. Agreement: every correct process must agree on the same value.
2. Integrity: every correct process decides at most one value, and if it decides some value, then it must have been proposed by some process.
3. Termination: all correct processes eventually reach a decision.
4. Validity: if all correct processes propose the same value V, then all correct processes decide V.

17 FLP System Model
Asynchronous communication model, i.e., no upper bound on the amount of time processors may take to receive, process and respond to an incoming message.
Communication links between processors are assumed to be reliable. It is well known that given arbitrarily unreliable links, no solution for consensus could be found even in a synchronous model.
Processors are allowed to fail according to the crash fault model; this simply means that processors that fail do so by ceasing to work correctly. There are more general failure models, such as Byzantine failures, where processors fail by deviating arbitrarily from the algorithm they are executing.

18 Consequences of FLP
There is no way to solve the consensus problem under a very minimal system model in a way that cannot be delayed forever. Complete correctness is not possible in asynchronous models.
In practice, we may live with a very low probability of disagreement (give up safety), or with a very low probability of blocking (give up liveness). Two-phase commit or even three-phase commit can block forever.
This result is particularly relevant to people designing algorithms.

19 CAP
Presented as Brewer's Conjecture in 2000. Formalized and proved in "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services", by Gilbert and Lynch (2002).
Consistency, availability and partition-tolerance cannot all be achieved at the same time in a distributed system. Simply put: in an asynchronous network where messages may be lost (partition tolerance), it is impossible to implement a service providing correct data (consistency) and eventually responding to every request (availability) under every pattern of message loss.
Slides from _Downloads/Workshop/Talks/2010-HS-DIS-I_Giangreco-CAP_Theorem-Talk.pdf

20 Consistency
All the nodes in the system see the same state of the data. Formally, we speak of atomic or linearizable consistency: there exists a sequential order on all operations which is consistent with the order of invocations and responses, such that each operation looks as if it were completed at a single instant.
(diagram: all replicas hold the same value V1)

21 Availability Every request received by a non-failing node should be processed and must result in a response

22 Partition Tolerance If some nodes crash and/or some communications fail, system still performs as expected

23 CAP Theorem
It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:
- Availability
- Atomic consistency
in all fair executions (including those in which messages are lost).
Asynchronous means there is no clock; nodes make decisions based only on the messages received and local computation.

24 Consequences of CAP

25 Consequences of CAP
When partitions are rare (e.g., parallel systems), CAP should allow perfect C and A most of the time. In distributed systems it is not possible to avoid network partitions. It is not a matter of choosing either C or A outright; rather, it is an act of balancing between the two properties.

26 Practical Consequences of CAP Many system designs used in early distributed relational database systems did not take into account partition tolerance (e.g. they were CA designs). There is a tension between strong consistency and high availability during network partitions. A distributed system consisting of independent nodes connected by an unpredictable network cannot behave in a way that is indistinguishable from a non-distributed system. There is a tension between strong consistency and performance in normal operation. Strong consistency requires that nodes communicate and agree on every operation. This results in high latency during normal operation. If we do not want to give up availability during a network partition, then we need to explore whether consistency models other than strong consistency are workable for our purposes.

27 Consistency
A consistency model is a contract between programmer and system, wherein the system guarantees that if the programmer follows some specific rules, the results of operations on the data store will be predictable.
Strong consistency models guarantee that the apparent order and visibility of updates is equivalent to a non-replicated system:
- Linearizable consistency
- Sequential consistency
Weak consistency models (everything that is not strong):
- Client-centric consistency models
- Causal consistency: the strongest model that remains available (during partitions)
- Eventual consistency models

28 Strong Consistency Models
Linearizable consistency: under linearizable consistency, all operations appear to have executed atomically in an order that is consistent with the global real-time ordering of operations. (Herlihy & Wing, 1990)
Sequential consistency: under sequential consistency, all operations appear to have executed atomically in some order that is consistent with the order seen at individual nodes and that is equal at all nodes. (Lamport, 1979)

29 Primary Copy Replication
All updates are performed on the primary; a log of operations (or alternatively, of changes) is shipped across the network to the backup replicas.
There are asynchronous and synchronous variants; the asynchronous variant is used by MySQL and MongoDB.
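
A minimal sketch of the primary-copy scheme just described, with in-memory stand-ins for the primary and its backups (class names and the single-key operation format are illustrative assumptions; real systems ship a durable log over the network):

```python
# Primary-copy replication sketch: the primary applies each update locally,
# appends it to its log, and ships it to the backups. In the synchronous
# variant it waits for the backups before answering; in the asynchronous
# variant (MySQL/MongoDB style) the log would be shipped in the background.

class Replica:
    def __init__(self):
        self.store = {}

    def apply(self, op):
        key, value = op
        self.store[key] = value
        return "ack"

class Primary(Replica):
    def __init__(self, backups, synchronous=True):
        super().__init__()
        self.backups = backups
        self.synchronous = synchronous
        self.log = []                      # log of operations to ship

    def write(self, key, value):
        op = (key, value)
        self.apply(op)                     # update the primary copy
        self.log.append(op)
        if self.synchronous:
            for b in self.backups:         # wait for every backup's ack
                b.apply(op)
        # asynchronous variant: ship self.log to the backups in the background
        return "ok"

backups = [Replica(), Replica()]
primary = Primary(backups, synchronous=True)
primary.write("cart:42", ["book"])
print(backups[0].store)                    # {'cart:42': ['book']}
```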

30 Primary Copy Fault Tolerance
Failures can cause violations of consistency guarantees: site failures, network partitions, undelivered messages.
We need an atomic commitment protocol (ACP), commonly used in database systems (e.g., for transactions).

31 Atomic Commitment Protocols
- 1 coordinator, N participants
- The coordinator knows the names of all participants; participants know the name of the coordinator, but not necessarily each other
- Each site contains a distributed transaction log that survives failures
- Each process may cast exactly one of two votes: Yes or No
- Each process can reach exactly one of two decisions: Commit or Abort
This is a consensus problem!

32 ACP Conditions
1. All processes that reach a decision reach the same one.
2. A process cannot reverse its decision after it has reached one.
3. The Commit decision can only be reached if all processes voted Yes.
4. If there are no failures and all processes voted Yes, then the decision will be to Commit.
5. Consider any execution containing only failures that the algorithm is designed to tolerate. At any point in this execution, if all existing failures are repaired and no new failures occur for sufficiently long, then all processes will eventually reach a decision.

33 Two Phase Commit (2PC)
(diagram: voting phase - the coordinator sends "Ready?" to the participants, and each participant replies Yes or No)

34 Two Phase Commit (2PC)
(diagram: decision phase - the coordinator sends Commit or Abort to the participants)
- Terminates if there are no faults
- Satisfies conditions 1-4
- Does not satisfy condition 5 (as expected)
- 3n messages for n nodes
What happens with participant faults?
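
A minimal, failure-free sketch of the two phases, run in a single process. The Participant class and its vote/commit/abort methods are illustrative assumptions; a real protocol exchanges messages and logs decisions to stable storage, and it is exactly the failure cases discussed next that this sketch ignores.

```python
# Failure-free two-phase commit (2PC) sketch.

class Participant:
    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.decision = None

    def vote(self):                      # phase 1: answer "Ready?"
        return "Yes" if self.will_vote_yes else "No"

    def commit(self):                    # phase 2: apply the decision
        self.decision = "Commit"

    def abort(self):
        self.decision = "Abort"

def two_phase_commit(participants):
    # Phase 1: collect votes from all participants
    votes = [p.vote() for p in participants]
    # Phase 2: commit only if every participant voted Yes (ACP condition 3)
    if all(v == "Yes" for v in votes):
        for p in participants:
            p.commit()
        return "Commit"
    for p in participants:
        p.abort()
    return "Abort"

ps = [Participant("A"), Participant("B"), Participant("C", will_vote_yes=False)]
print(two_phase_commit(ps))              # Abort, because C voted No
```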

35 2PC and multiple failures
Multiple participant failures: no problem, as long as the coordinator lives.
Coordinator and participant failures: if the coordinator and one participant fail after the decision has been delivered only to that participant (e.g. an Abort sent to a participant that then crashes, while the others have voted Yes), the remaining participants cannot take a decision.

36 2PC and multiple failures
Solution: add another phase to the protocol! The new phase precedes the commit phase. The goal is to inform all participants that all are ready to commit or abort. At the end of this phase, every participant knows whether or not all participants want to commit, before any participant has actually committed or aborted!
This protocol is called the three-phase commit protocol (3PC).

37 Three Phase Commit (3PC)
(diagram: voting phase - the coordinator sends "Ready?" to the participants, and each participant replies Yes)

38 Three Phase Commit (3PC)
(diagram: prepare phase - the coordinator sends Prepare to the participants, and each participant acknowledges)

39 Three Phase Commit (3PC)
(diagram: commit phase - the coordinator sends Commit to the participants)

40 Weak Consistency Models Client-centric consistency models are consistency models that involve the notion of a client or session in some way. The eventual consistency model says that if you stop changing values, then after some undefined amount of time all replicas will agree on the same value. Since it is trivially satisfiable (liveness property only), it is useless without supplemental information.

41 Eventual Consistency
If no new updates are made to the data object, eventually all accesses will return the last updated value.
- A special form of weak consistency
- Network partitions, should they occur, are tolerated
- Requires conflict resolution mechanisms; writes are disseminated through message exchange, a.k.a. anti-entropy

42 Dynamo
A service providing persistent data storage to other services running on Amazon's distributed computing infrastructure.
"Dynamo: Amazon's Highly Available Key-value Store", SOSP 2007: the first paper discussing the implementation of a weakly consistent, highly available data replication service.

43 Amazon's Platform Architecture
(diagram: client requests flow through page rendering components to stateless aggregator services and then, via request routing, to stateful services backed by datastores such as S3 and Dynamo instances)

44 Design Issues
Replication for high availability and durability:
- replication technique: synchronous or asynchronous?
- conflict resolution: when and who?
Dynamo's goal is to be "always writable": rejecting writes may result in a poor Amazon customer experience, so the data store needs to be highly available for writes, e.g., accept writes during failures and allow write conversations without prior context.
Design choices:
- optimistic (asynchronous) replication for non-blocking writes
- conflict resolution on read operations for high write throughput
- conflict resolution by the client for user-perceived consistency

45 Design Principles
Incremental scalability: scale out (and in) one node at a time, with minimal impact on both the operators of the system and the system itself.
Symmetry: every node should have the same set of responsibilities; there are no distinguished nodes or nodes that take on a special role. Symmetry simplifies system provisioning and maintenance.
Decentralization: favor decentralized peer-to-peer techniques to achieve a simpler, more scalable, and more available system.
Heterogeneity: exploit heterogeneity in the underlying infrastructure; load distribution is proportional to the capabilities of individual servers, allowing new nodes with higher capacity to be added without upgrading all nodes.

46 Interfaces
Data objects (blobs) are identified through a hash key. There are two single-object operations, and no group operations.
Read operation: GET(K) -> [OBJECT(S), CONTEXT]
- Locate the object replicas associated with key K
- Return a single object, or multiple conflicting versions of the same object together with a context
Write operation: PUT(K, OBJECT, CONTEXT)
- Determine where the replicas should be placed based on key K
- Write the replicas to disk
Context: system metadata, opaque to the caller; includes information on the object, such as versioning.
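
A hypothetical sketch of how a client uses this interface: GET returns possibly conflicting versions plus an opaque context, and PUT passes that context back. The ToyDynamoClient below is purely an in-memory stand-in for illustration, not Amazon's actual API.

```python
# Toy stand-in for a Dynamo-style GET/PUT interface.

class ToyDynamoClient:
    def __init__(self):
        self.versions = {}          # key -> list of (possibly conflicting) versions
        self.contexts = {}          # key -> opaque metadata (e.g. vector clocks)

    def get(self, key):
        return self.versions.get(key, []), self.contexts.get(key)

    def put(self, key, context, obj):
        # A real store would use the context to version the write; here we
        # simply install obj as the single current version.
        self.versions[key] = [obj]
        self.contexts[key] = context

def add_to_cart(client, cart_key, item):
    versions, context = client.get(cart_key)
    merged = set().union(*versions) if versions else set()   # reconcile conflicts
    merged.add(item)                                          # never lose an "add to cart"
    client.put(cart_key, context, merged)

client = ToyDynamoClient()
add_to_cart(client, "cart:42", "book")
add_to_cart(client, "cart:42", "pen")
print(client.get("cart:42")[0])     # one version containing both items
```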

47 Partitioning
Consistent hashing. Provides load distribution; node arrivals and departures only impact immediate neighbors on the ring.
(diagram: nodes A-E placed on a hash ring, with each key assigned to a node)

48 Replication
Every data item is replicated at N hosts (N is configured per instance of Dynamo).
- Each key k is assigned a coordinator host that handles write requests for k; the coordinator is also in charge of replication of data items within its range.
- The coordinator stores the data item with key k locally, and also stores it at the N-1 clockwise successor nodes.
- Every node is responsible for the region of the ring between itself and its predecessor.
The preference list enumerates the nodes that are responsible for storing a key k; it contains more than N nodes to account for node failures.
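
A minimal consistent-hashing sketch of the partitioning and replication scheme above: each node hashes to a position on the ring, a key is handled by the first node clockwise from its hash, and the preference list is that coordinator plus its distinct clockwise successors. The hash function, ring size, and node names are illustrative assumptions; Dynamo additionally uses virtual nodes, which are omitted here.

```python
# Consistent hashing ring with clockwise preference lists.
# MD5 and a 32-bit ring are arbitrary illustrative choices.

import hashlib
from bisect import bisect_right

def ring_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % (2 ** 32)

class Ring:
    def __init__(self, nodes):
        self.ring = sorted((ring_hash(n), n) for n in nodes)   # node positions
        self.hashes = [h for h, _ in self.ring]

    def preference_list(self, key: str, n_replicas: int):
        """Coordinator for key plus clockwise successors: n_replicas distinct nodes."""
        start = bisect_right(self.hashes, ring_hash(key)) % len(self.ring)
        prefs = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in prefs:
                prefs.append(node)
            if len(prefs) == n_replicas:
                break
        return prefs

ring = Ring(["A", "B", "C", "D", "E"])
# The first node in the list is the coordinator for the key, the rest are replicas.
print(ring.preference_list("cart:42", n_replicas=3))
```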

49 Data Versioning
Replication is performed after a response is sent to the client; this is called asynchronous replication. It may result in inconsistencies under network partitions, but operations should not be lost: an "add to cart" should not be rejected, but also not forgotten. If an "add to cart" is performed when the latest version is not available, it is performed on an older version; we may therefore have different versions of a key/value pair. Once a partition heals, versions are merged; the goal is not to lose any "add to cart".
Data versioning technique: vector clocks, which capture causality between different versions of an object.

50 Vector Clocks
Each write to a key k is associated with a vector clock VC(k). VC(k) is an array (map) of integers: in theory, one entry VC(k)[i] for each node i. When node i handles a write of key k, it increments VC(k)[i]. VCs are included in the context of the put call.
In practice, VC(k) will not have many entries (only nodes from the preference list should normally have entries), and Dynamo truncates entries if there are more than a threshold (say 10).
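
A minimal sketch of the vector-clock operations just described, representing VC(k) as a map from node id to counter (node names mirror the example on the next slide; Dynamo's truncation of old entries is omitted):

```python
# Per-key vector clocks: detect causal descent vs. conflicting versions.

def increment(vc: dict, node: str) -> dict:
    """Return a copy of vc with node's entry incremented (a write handled by node)."""
    new = dict(vc)
    new[node] = new.get(node, 0) + 1
    return new

def descends(a: dict, b: dict) -> bool:
    """True if version a causally descends from (or equals) version b."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def concurrent(a: dict, b: dict) -> bool:
    """Neither descends from the other: a conflict the client must reconcile."""
    return not descends(a, b) and not descends(b, a)

d2 = increment(increment({}, "N1"), "N1")     # D2 = [N1,2]
d3 = increment(d2, "N2")                      # D3 = [N1,2][N2,1]
d4 = increment(d2, "N3")                      # D4 = [N1,2][N3,1]
print(concurrent(d3, d4))                     # True: client-side reconciliation needed
# merge (element-wise max; here the shared entry N1 already agrees), write at N1
d5 = increment({**d3, **d4}, "N1")            # D5 = [N1,3][N2,1][N3,1]
print(descends(d5, d3) and descends(d5, d4))  # True
```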

51 Data Versioning
Example (Dx[Ny,z] denotes object version Dx with logical-clock entry z for handling node Ny):
1. Client A writes new object D; node N1 writes version D1[N1,1].
2. Client A updates object D; node N1 writes version D2[N1,2].
3. Client A updates object D; node N2 writes version D3[N1,2][N2,1].
4. Client B reads and updates object D; node N3 writes version D4[N1,2][N3,1].
5. Client C reads object D (i.e., both D3 and D4); the client performs reconciliation and node N1 writes version D5[N1,3][N2,1][N3,1].

52 Handling PUT/GET
Any node can receive a GET/PUT request for any key. This node is selected by:
- a generic load balancer, or
- a client library that goes directly to the coordinator nodes in a preference list.
If the request comes from the load balancer, a node serves the request only if it is in the preference list; otherwise, it routes the request to the first node in the preference list (each node has routing info to all other nodes).
Extended preference list: the N nodes from the preference list plus some additional nodes (following the circle) to account for failures. In the failure-free case, the nodes from the preference list are involved in get/put; in the failure case, the first N alive nodes from the extended preference list are involved.

53 R and W
R = number of nodes that need to participate in a GET; W = number of nodes that need to participate in a PUT; R + W > N (a quorum system).
Handling PUT (by the coordinator) // rough sketch
1. Generate a new VC, write the new version locally
2. Send the value and VC to N selected nodes from the preference list
3. Wait for W-1 acknowledgments (the coordinator's own write counts toward W)
Handling GET (by the coordinator) // rough sketch
1. Send the read to N selected nodes from the preference list
2. Wait for R responses
3. Select the highest versions per VC, return all such versions
4. Reconcile/merge the different versions
5. Write back the reconciled version
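
A minimal in-process sketch of the quorum interaction above with (R, W, N) = (2, 2, 3): a write succeeds once W replicas accept it, and because R + W > N any read of R replicas sees at least one copy of the latest write. Replicas here are plain objects with integer versions; real Dynamo nodes exchange messages and use vector clocks as versions.

```python
# Quorum read/write sketch with N=3, W=2, R=2 (R + W > N).

N, R, W = 3, 2, 2

class Replica:
    def __init__(self):
        self.version, self.value = 0, None

    def write(self, version, value):
        if version > self.version:
            self.version, self.value = version, value
        return "ack"

    def read(self):
        return self.version, self.value

def put(replicas, version, value):
    """Send the write to replicas in order; succeed once W have acknowledged."""
    acks = 0
    for r in replicas:
        if r.write(version, value) == "ack":
            acks += 1
        if acks >= W:
            return True                    # quorum reached; remaining replicas may lag
    return False

def get(replicas):
    """Read R replicas and return the highest version seen.
    R + W > N guarantees overlap with the latest successful write."""
    responses = [r.read() for r in replicas[-R:]]
    return max(responses, key=lambda version_value: version_value[0])

replicas = [Replica() for _ in range(N)]
put(replicas, version=1, value="v1")
print(get(replicas))                       # (1, 'v1')
```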

54 Quorum Systems
Formally, a quorum system S = {S1, ..., SN} over a universe U is a collection of quorum sets Si ⊆ U such that any two quorum sets have at least one element in common. Here U is the set of replicas, i.e., |U| = N.
For replication, we consider two kinds of quorum sets: read quorums R and write quorums W. Rules:
1. Any read quorum must overlap with any write quorum.
2. Any two write quorums must overlap.

55 Quorum Rules
Read rule: R + W > N (read and write quorums overlap).
Write rule: 2W > N (any two write quorums overlap).
The quorum sizes determine the costs of read and write operations. The minimum quorum sizes are:
min W = ⌊N/2⌋ + 1
min R = ⌈N/2⌉
Write quorums require a majority; read quorums require at least half of the nodes.
Typical Dynamo configuration: (R, W, N) = (2, 2, 3).

56 Handling Failures
The N selected nodes are the first N healthy nodes, which might change from request to request; hence these quorums are "sloppy" quorums. Sloppy vs. strict quorums: sloppy quorums allow availability under a much wider range of partitions (failures), but sacrifice consistency.
It is also important to handle failures of an entire data center (power outages, cooling failures, network failures, disasters); the preference list accounts for this (its nodes are spread across data centers).
Hinted handoff: if a node is unreachable, a hinted replica is sent to the next healthy node. Nodes receiving hinted replicas keep them in a separate database; the hinted replica is delivered to the original node when it recovers.
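
A small sketch of how the first N healthy nodes might be chosen from the extended preference list, tagging each substitute with a hint naming the node it stands in for (node names and the healthy predicate are illustrative assumptions):

```python
# Sloppy-quorum target selection with hinted handoff.

def select_targets(pref_list, healthy, n):
    """Return a list of (target, hint): hint is the intended node when a
    healthy substitute further down the extended preference list is used.
    Assumes enough healthy substitutes exist beyond the first n entries."""
    intended = pref_list[:n]
    substitutes = (node for node in pref_list[n:] if healthy(node))
    targets = []
    for node in intended:
        if healthy(node):
            targets.append((node, None))                 # normal replica
        else:
            targets.append((next(substitutes), node))    # hinted replica
    return targets

extended_pref_list = ["A", "B", "C", "D", "E"]
down = {"B"}
print(select_targets(extended_pref_list, lambda node: node not in down, n=3))
# [('A', None), ('D', 'B'), ('C', None)] - D holds a replica hinted for B
```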

57 Replica Synchronization
Using vector clocks, concurrent and out-of-date updates can be detected, and performing read repair becomes possible. In some cases (concurrent changes) we need to ask the client to pick a value.
What about nodes rejoining? Nodes rejoin the cluster after being partitioned, or when a failed node is replaced or partially recovered. Replica synchronization is used to bring nodes up to date after a failure, and to periodically synchronize replicas with each other.

58 Replica Reconciliation
A node holding data received via hinted handoff may crash before it can pass the data to the unavailable node in the preference list, so we need another way to ensure that each (k, v) pair is replicated N times: nodes nearby on the ring periodically compare the (k, v) pairs they hold, and copy any they are missing that are held by the other.

59 Merkle Trees
Idea: hierarchically summarize the (k, v) pairs a node holds, by ranges of keys.
- Leaf node: hash of one (k, v) pair
- Internal node: hash of the concatenation of its children
Compare the roots; if they match, done. If they don't match, compare the children and recur.
Benefits: minimizes the amount of data to be transferred for synchronization, and reduces the number of required disk reads.
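
A minimal Merkle-tree sketch of the idea above: hash each (key, value) pair at the leaves, hash concatenated child hashes upward, and compare roots first so that matching trees cost a single exchange. SHA-256, a fan-out of two, and the direct leaf comparison on mismatch are simplifying assumptions; a full implementation descends level by level.

```python
# Merkle tree over a node's (key, value) pairs for one key range.

import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(pairs):
    """pairs: (key, value) tuples sorted by key. Returns the list of levels,
    level 0 = leaf hashes, last level = [root]."""
    level = [h(f"{k}={v}".encode()) for k, v in pairs]
    levels = [level]
    while len(level) > 1:
        level = [h(b"".join(level[i:i + 2])) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def differing_leaves(t1, t2):
    """Compare roots; if they match there is nothing to sync. Otherwise
    (simplified) compare leaf hashes directly to find the differing keys."""
    if t1[-1] == t2[-1]:
        return []
    return [i for i, (a, b) in enumerate(zip(t1[0], t2[0])) if a != b]

node1 = build_tree([("k1", "v1"), ("k2", "v2"), ("k3", "v3"), ("k4", "v4")])
node2 = build_tree([("k1", "v1"), ("k2", "old"), ("k3", "v3"), ("k4", "v4")])
print(differing_leaves(node1, node2))      # [1]: only k2's data must be exchanged
```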

60 Anti-Entropy in Dynamo
Dynamo uses Merkle trees for anti-entropy as follows: each node maintains a separate Merkle tree for each key range it hosts; two nodes exchange the roots of the Merkle trees of the key ranges that they host in common; using the synchronization mechanism described before, the nodes determine whether they have any differences and perform the appropriate synchronization.
