Distributed Systems (5DV147): Replication




Replication, Fall 20

What is Replication?
- Make multiple copies of a data object and ensure that all copies are identical.
- Two types of access: reads and writes (updates).

Motivation
Reasons to replicate (have a backup plan):
- Handle more work (e.g. web servers)
- Keep data safe (fault tolerance)
- Reduce latencies (CDNs and caching)
- Keep data available

Replication requirements
- Transparency (illusion of a single copy): clients must be unaware of replication.
- Consistency: obtain identical results from different copies (is that true?).
- Figure: a client accesses one logical object that is backed by several physical objects. The physical objects are not always identical: some have received updates.

Problems that you may find
- Multiple clients access replicas
  - Concurrent access, rather than exclusive
  - Operations are interleaved
  - How do we ensure correctness?
- Replica placement
  - Placing servers
  - Placing content
- Overhead required to keep replicas up to date
  - Global synchronization (atomic operations)

Correctness

Types of ordering adapted to replication
- FIFO: if a client issues r and then r', any correct replica manager that handles r' handles r before it.
- Causal: if the issuing of r happened-before the issuing of r', then any correct replica manager that handles r' handles r before it.
- Total: if a correct replica manager handles r before r', then any correct replica manager that handles r' handles r before it.

Some definitions
- Sequential consistency property: the order of operations is consistent with the program order in which each individual process executed them.
- Linearizability property: the order of operations is consistent with the real times at which the operations occurred during execution.
- Basic correctness property: an interleaved sequence of operations must meet the specification of a single correct copy of the object(s), i.e., clients cannot tell the difference between a replicated system and a single-copy one.

Example of interleaved operations for 2 clients
- Client 1: A, B, C
- Client 2: d, e, f
- Real order during execution: A, B, d, C, e, f
- An interleaving with sequential consistency: A, B, d, e, f, C
- Interleaving with linearizability: A, B, d, C, e, f

Passive (primary-backup) replication
- One primary replica manager, many backup replicas
- If the primary fails, backups can take its place (election!)
- Implements linearizability if:
  - a failing primary is replaced by a unique backup
  - the replica managers agree on which operations were performed before the primary crashed
- View-synchronous group communication!
Figure adapted from Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design, Edn. 5, Pearson Education 2012, based on Figure 18.3.
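The two-client interleaving example from the correctness slides can be checked mechanically. The sketch below is only an illustration under simplifying assumptions: operations are treated as instantaneous, so an interleaving counts as linearizable only if it equals the real order, and the check that the sequence meets the specification of a single correct copy is left out. The function names are hypothetical.

```python
from typing import Dict, List, Sequence


def respects_program_order(interleaving: Sequence[str],
                           programs: Dict[str, List[str]]) -> bool:
    """Sequential-consistency ordering check: every client's operations
    appear in the interleaving in the order that client issued them."""
    all_ops = [op for ops in programs.values() for op in ops]
    # The interleaving must contain exactly the issued operations.
    if sorted(all_ops) != sorted(interleaving):
        return False
    position = {op: i for i, op in enumerate(interleaving)}
    for ops in programs.values():
        indices = [position[op] for op in ops]
        if indices != sorted(indices):
            return False
    return True


def matches_real_time_order(interleaving: Sequence[str],
                            real_order: Sequence[str]) -> bool:
    """Linearizability ordering check, assuming instantaneous operations:
    the interleaving must follow the real-time order exactly."""
    return list(interleaving) == list(real_order)


programs = {"client 1": ["A", "B", "C"], "client 2": ["d", "e", "f"]}
real_order = ["A", "B", "d", "C", "e", "f"]
sequential = ["A", "B", "d", "e", "f", "C"]

print(respects_program_order(sequential, programs))     # True
print(matches_real_time_order(sequential, real_order))  # False
print(matches_real_time_order(real_order, real_order))  # True
```

The interleaving A, B, d, e, f, C keeps each client's program order, so it passes the first check, but it moves C after e and f and therefore does not follow the real order.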

Models of Replication

Steps of passive replication
1. Request: the front end issues the request with a unique ID.
2. Coordination: the primary checks whether the request has already been carried out; if so, it returns the cached response.
3. Execution: the primary performs the operation and caches the result.
4. Agreement: the primary sends the updated state to the backups; the backups reply with an Ack.
5. Response: the primary sends the result to the front end, which forwards it to the client.

What happens if the primary crashes?
- Before agreement
- After agreement

Active replication
- Replica managers play equivalent roles
- All replica managers carry out all operations
- Front ends multicast one request at a time (FIFO)
- Requests are totally ordered
- Implements sequential consistency
- Tolerates Byzantine failures

Steps of active replication
1. Request: the front end adds a unique identifier to the request and multicasts it to the replica managers.
2. Coordination: totally ordered request delivery to the replica managers.
3. Execution: each replica manager executes the request.
4. Agreement: not needed.
5. Response: all replica managers respond to the front end; the front end interprets the responses and forwards a response to the client.
Figure adapted from Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design, Edn. 5, Pearson Education 2012, based on Figure 18.4.

Advantages of active replication
- Simple
- Same code everywhere
- Failure transparent

Comparing active and passive replication
- Both handle crash failures (but differently)
- Only active can handle arbitrary failures
- Passive may suffer from large overheads
- Optimizations?
  - Send reads to backups in passive: lose the linearizability property!
  - Send reads to a specific replica manager in active: lose fault tolerance
  - Exploit commutativity of requests to avoid ordering requests in active
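The five steps of passive replication can be sketched as a small in-process simulation. This is a minimal illustration and not the protocol from the course book: there is no network, no failure detection, no election of a new primary, and no view-synchronous group communication. The class names (FrontEnd, Primary, Backup) and the key/value write operation are assumptions made for the example.

```python
import itertools
from typing import Dict, List


class Backup:
    """Backup replica manager: stores the state the primary sends."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.responses: Dict[str, str] = {}

    def apply_update(self, state: Dict[str, str],
                     responses: Dict[str, str]) -> str:
        self.state = dict(state)
        self.responses = dict(responses)
        return "ack"


class Primary:
    """Primary replica manager holding the state and a response cache."""

    def __init__(self, backups: List[Backup]) -> None:
        self.state: Dict[str, str] = {}
        self.responses: Dict[str, str] = {}  # request ID -> cached response
        self.backups = backups

    def handle(self, request_id: str, key: str, value: str) -> str:
        # 2. Coordination: a request seen before gets the cached response.
        if request_id in self.responses:
            return self.responses[request_id]
        # 3. Execution: perform the operation and cache the result.
        self.state[key] = value
        response = f"stored {key}={value}"
        self.responses[request_id] = response
        # 4. Agreement: send the updated state to the backups, wait for acks.
        acks = [b.apply_update(self.state, self.responses)
                for b in self.backups]
        if any(ack != "ack" for ack in acks):
            raise RuntimeError("a backup did not acknowledge the update")
        # 5. Response: the result goes back to the front end.
        return response


class FrontEnd:
    """Front end that tags each request with a unique ID (step 1)."""

    def __init__(self, primary: Primary, name: str = "fe") -> None:
        self.primary = primary
        self._ids = (f"{name}-{i}" for i in itertools.count(1))

    def write(self, key: str, value: str) -> str:
        return self.primary.handle(next(self._ids), key, value)


backups = [Backup(), Backup()]
primary = Primary(backups)
front_end = FrontEnd(primary)
print(front_end.write("x", "1"))
# A retransmission with the same ID is answered from the cache, not re-executed.
print(primary.handle("fe-1", "x", "999"))
```

Because the backups also receive the response cache, a backup that takes over after the agreement step could still answer a retransmitted request without executing it twice.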

Semi-active replication
- Intermediate solution between active and passive replication
- Main difference with active replication: each time the replicas have to make a non-deterministic decision, one process, called the leader, makes the choice and sends it to the followers

Replication vs coding

Problem: how do you make replicas?
- In P2P systems
- In cloud systems
- RAID 5 and RAID 6
- Option 1: Make replicas and copy the data :)
- Option 2: Use coding theory to come up with something intelligent
  - Network coding
  - Erasure coding

What is erasure coding?
- Suppose you have a large file that you want to replicate
- Divide that file into m pieces
- Run an erasure coding algorithm on the pieces to produce m+n pieces
- You will be able to reconstruct the file if you have any m pieces

Example: Replication vs Erasure coding (1)
- One large file, let us say, of size 1 TB
- One large distributed system with 10000 servers
- If you replicate the file on 3 machines in your network, you require 3 TB to host the file and its replicas
- To have higher redundancy, you need more space
- If the three machines fail, the file is lost
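The "any m out of m+n pieces" idea is easiest to see in the special case n = 1, where the single extra piece is the XOR of the m data pieces (the parity idea used by RAID 5). The sketch below covers only that special case. Tolerating more than one lost piece needs a code such as Reed-Solomon over Galois fields, which is not shown here. The function names and the zero-padding scheme are assumptions made for the example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, m: int) -> List[bytes]:
    """Split data into m equal pieces (zero-padded) plus one XOR parity piece."""
    piece_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(piece_len * m, b"\0")
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(m)]
    parity = reduce(xor_bytes, pieces)
    return pieces + [parity]


def decode(pieces: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data from m+1 slots of which at most one is None (lost)."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity piece tolerates only one lost piece")
    restored = list(pieces)
    if missing:
        present = [p for p in pieces if p is not None]
        # XOR of all surviving pieces equals the lost piece.
        restored[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity piece and the padding.
    return b"".join(restored[:-1])[:original_len]


data = b"replication versus erasure coding"
stored = encode(data, m=4)        # 4 data pieces + 1 parity piece
stored[2] = None                  # one machine fails
print(decode(stored, len(data)) == data)  # True
```

With m = 4 this stores 5 pieces, i.e. 1.25 times the file size, and survives the loss of any one piece; full replication needs 2 times the file size to survive one failure.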

Some probability: for replication
- Let Ɛ be the maximum probability of unavailability tolerated for an object o
- a is the average node availability
- k is the number of replicas of o, assumed to fail independently of each other

\[
\varepsilon = P(\text{object } o \text{ is unavailable}) = P(\text{all } k \text{ replicas of } o \text{ are unavailable}) = P(\text{one replica is unavailable})^{k} = (1 - a)^{k}
\]

Taking the log of both sides:

\[
k = \frac{\log \varepsilon}{\log(1 - a)}
\]

Example: Replication vs Erasure coding (2)
- Take the same file
- But chop it into 10 parts (m) of equal size, i.e., 100 GB each
- Set n in the erasure coding algorithm to 5
- Run the algorithm to produce m+n = 15 pieces, all of size 100 GB
- Distribute them on 15 machines out of the 1000 machines, i.e., total disk size used is 1.5 TB
- Now, if up to 5 of the machines fail, you will still be able to reproduce the file
- And that is black magic (using Galois fields and XORs)

You're just too good to be true
- Sounds like we just solved the problem of data replication
- But have we?
- Can you think of why people are still using good ol' normal replication?

Points against coding
- Complexity added to the system
  - More complex systems, more bugs, harder testing, longer implementation times
- Download/read latency
  - Now you need to get your data from m machines with variable latency
  - What if you just want to read the first 100 lines in a text file?
    - Easy with replication
    - Not easy with coding
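The replica-count formula and the storage figures of the two examples can be reproduced with a few lines of arithmetic. This is a small sketch with hypothetical function names. It assumes independent node failures, as the formula does, and rounds k up to the next whole replica. The availability a = 0.5 and target Ɛ = 0.01 are illustrative values, not numbers from the slides.

```python
import math


def replicas_needed(epsilon: float, a: float) -> int:
    """Smallest whole k with (1 - a)**k <= epsilon, from k = log(eps) / log(1 - a)."""
    if not (0.0 < epsilon < 1.0 and 0.0 < a < 1.0):
        raise ValueError("epsilon and a must lie strictly between 0 and 1")
    return math.ceil(math.log(epsilon) / math.log(1.0 - a))


def unavailability(k: int, a: float) -> float:
    """Probability that all k independent replicas are unavailable."""
    return (1.0 - a) ** k


def replication_storage_tb(file_tb: float, copies: int) -> float:
    """Total storage when the whole file is stored `copies` times."""
    return file_tb * copies


def erasure_storage_tb(file_tb: float, m: int, n: int) -> float:
    """Total storage for m+n pieces, each of size file_tb / m."""
    return file_tb * (m + n) / m


# Illustrative values: nodes are up half of the time, target unavailability 1%.
print(replicas_needed(0.01, 0.5))        # 7
print(unavailability(7, 0.5))            # 0.0078125

# Numbers from the two examples: 1 TB file, 3 replicas vs m = 10, n = 5.
print(replication_storage_tb(1.0, 3))    # 3.0
print(erasure_storage_tb(1.0, 10, 5))    # 1.5
```

In the examples, replication uses 3 TB and is lost when 3 specific machines fail, while the coded file uses 1.5 TB and survives the failure of any 5 of its 15 machines.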

Summary

Required Readings
- Erasure Coding vs. Replication: A Quantitative Comparison
  https://docs.switzernet.com/people/emin-gabrielyan/0602-capillaryreferences/ref/weatherspoon02.pdf
- Understanding Replication in Databases and Distributed Systems (Until page , the rest is highly recommended to read, but optional)
  http://infoscience.epfl.ch/record/226/files/i_teh_report_999.pdf

Optional Readings
- Extra reading (some bonus questions will be based on this paper): A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems, by James S. Plank
  Available: http://web.eecs.utk.edu/~plank/plank/papers/s-96-2.pdf
  Note: you probably know everything you need as background to understand this. It will take some of you outside your comfort zone (Mathematics, yucky!), but it is worth your effort! I will be happy to help anyone after the 2nd of October on this :)

Next Lecture
- Consistency