Distributed Systems (DV47)
Replication
Fall 20

What is Replication?
- Make multiple copies of a data object and ensure that all copies are identical
- Two types of access: reads and writes (updates)

Motivation
Reasons to replicate (have a backup plan):
- Handle more work (e.g. web servers)
- Keep data safe (fault tolerance)
- Reduce latencies (CDNs and caching)
- Keep data available

Replication requirements
- Transparency (illusion of a single copy)
  - Clients must be unaware of replication
- Consistency
  - Obtain identical results from different copies (is that true?)
  - Not always identical: some copies have received updates that others have not
[Figure: a client accesses one logical object, which is backed by several physical objects]

Problems that you may find
- Multiple clients access replicas
  - Concurrent access, rather than exclusive
  - Operations are interleaved
  - How do we ensure correctness?
- Replica placement
  - Placing servers
  - Placing content
- Overhead required to keep replicas up to date
  - Global synchronization (atomic operations)

Correctness

Types of ordering adapted to replication
- FIFO: if a client issues r and then r', any correct Replica Manager (RM) that handles r' handles r before it
- Causal: if the issuing of r happened-before the issuing of r', then any correct RM that handles r' handles r before it
- Total: if a correct RM handles r before r', then any correct RM that handles r' handles r before it

Some definitions
- Sequential consistency property: the order of operations is consistent with the program order in which each individual process executed them
- Linearizability property: the order of operations is consistent with the real times at which the operations occurred during execution
- Basic correctness property: an interleaved sequence of operations must meet the specification of a single correct copy of the object(s), i.e., clients cannot tell a replicated system from a single-copy one

Example of interleaved operations for 2 clients
- Client 1: A, B, C
- Client 2: d, e, f
- Real order during execution: A, B, d, C, e, f
- An interleaving with sequential consistency: A, B, d, e, f, C
- Interleaving with linearizability: A, B, d, C, e, f

Passive (primary-backup) replication
- One primary replica manager, many backup replicas
- If the primary fails, backups can take its place (election!)
- Implements linearizability if:
  - A failing primary is replaced by a unique backup
  - RMs agree on which operations were performed before the primary crashed
- View-synchronous group communication!
[Figure: passive replication with one primary and several backups. Adapted from Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design, Edn. 5, Pearson Education 2012, based on Figure 18.3]
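
The two-client example above (client 1 issues A, B, C; client 2 issues d, e, f) can be checked mechanically. The sketch below is only an illustration under simplifying assumptions: operations are treated as instantaneous, so the real-time order is a single list, and the object's own specification is not checked. The function names are hypothetical, not taken from the slides.

```python
def respects_program_order(interleaving, programs):
    """Sequential-consistency ordering check: every client's operations
    must appear in the interleaving in the order that client issued them."""
    all_ops = [op for ops in programs.values() for op in ops]
    # Each issued operation must appear exactly once in the interleaving.
    if sorted(interleaving) != sorted(all_ops):
        return False
    for ops in programs.values():
        seen = [op for op in interleaving if op in ops]
        if seen != list(ops):
            return False
    return True


def respects_real_time(interleaving, real_order):
    """Linearizability ordering check under the assumption that operations
    do not overlap: the interleaving must equal the real-time order."""
    return list(interleaving) == list(real_order)


programs = {"client1": ["A", "B", "C"], "client2": ["d", "e", "f"]}
real_order = ["A", "B", "d", "C", "e", "f"]

sequential = ["A", "B", "d", "e", "f", "C"]    # C delayed past e and f
linearizable = ["A", "B", "d", "C", "e", "f"]  # same as the real order
broken = ["B", "A", "d", "C", "e", "f"]        # violates client 1's order

print(respects_program_order(sequential, programs))    # program order kept
print(respects_real_time(sequential, real_order))      # real-time order not kept
print(respects_program_order(linearizable, programs))
print(respects_real_time(linearizable, real_order))
print(respects_program_order(broken, programs))
```

The sequentially consistent interleaving passes the first check but not the second, which is exactly the gap between the two properties in this example.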

Steps of passive replication
1. Request: front end issues the request with a unique ID
2. Coordination: primary checks if the request has already been carried out; if so, it returns the cached response
3. Execution: perform the operation, cache the result
4. Agreement: primary sends the updated state to the backups, backups reply with Ack
5. Response: primary sends the result to the front end, which forwards it to the client

What happens if the primary crashes?
- Before agreement
- After agreement

Models of Replication

Active replication
- RMs play equivalent roles
- All replica managers carry out all operations
- Front ends multicast one request at a time (FIFO)
- Requests are totally ordered
- Implements sequential consistency
- Tolerates Byzantine failures

Steps of active replication
1. Request: front end adds a unique identifier to the request and multicasts it to the RMs
2. Coordination: totally ordered request delivery to the RMs
3. Execution: each RM executes the request
4. Agreement: not needed
5. Response: all RMs respond to the front end; the front end interprets the responses and forwards a response to the client
[Figure: active replication with front ends multicasting to all RMs. Adapted from Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design, Edn. 5, Pearson Education 2012, based on Figure 18.4]

Advantages of active replication
- Simple
- Same code everywhere
- Failure transparent

Comparing active and passive replication
- Both handle crash failures (but differently)
- Only active can handle arbitrary failures
- Passive may suffer from large overheads
- Optimizations?
  - Send reads to backups in passive: lose the linearizability property!
  - Send reads to a specific RM in active: lose fault tolerance
  - Exploit commutativity of requests to avoid ordering requests in active
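
The five steps of passive replication listed above can be sketched in a few lines. This is a minimal single-process illustration, not a real protocol: there is no network, no failure handling, no election and no view-synchronous group communication. The class names, the method names and the "add a delta to a counter" operation are all assumptions made for the example.

```python
import itertools


class Backup:
    """Backup RM: stores the state and cached responses sent by the primary."""

    def __init__(self):
        self.state = {}
        self.responses = {}

    def apply_update(self, request_id, new_state, response):
        self.state = dict(new_state)
        self.responses[request_id] = response
        return "ack"


class Primary:
    """Primary RM: executes requests and pushes the new state to backups."""

    def __init__(self, backups):
        self.backups = backups
        self.state = {}
        self.responses = {}  # request ID -> cached response

    def handle(self, request_id, key, delta):
        # 2. Coordination: a request that was already executed is answered
        #    from the cache instead of being executed a second time.
        if request_id in self.responses:
            return self.responses[request_id]
        # 3. Execution: perform the operation and cache the result.
        self.state[key] = self.state.get(key, 0) + delta
        response = self.state[key]
        self.responses[request_id] = response
        # 4. Agreement: send the updated state to every backup, collect Acks.
        acks = [b.apply_update(request_id, self.state, response)
                for b in self.backups]
        if any(ack != "ack" for ack in acks):
            raise RuntimeError("a backup did not acknowledge the update")
        # 5. Response: the result goes back to the front end.
        return response


class FrontEnd:
    """Front end: tags each request with a unique ID (step 1)."""

    _ids = itertools.count(1)

    def __init__(self, primary):
        self.primary = primary

    def add(self, key, delta, request_id=None):
        if request_id is None:
            request_id = "req-%d" % next(FrontEnd._ids)
        return self.primary.handle(request_id, key, delta)


backups = [Backup(), Backup()]
primary = Primary(backups)
front_end = FrontEnd(primary)

first = front_end.add("x", 5, request_id="client-req-1")
retry = front_end.add("x", 5, request_id="client-req-1")  # retransmission
print(first, retry, primary.state, [b.state for b in backups])
```

The retransmitted request is answered from the cache, so the counter is incremented only once, and after the agreement step every backup holds the same state as the primary.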

Semi-active replication
- Intermediate solution between active and passive replication
- Main difference from active replication: each time the replicas have to make a non-deterministic decision, one process, called the leader, makes the choice and sends it to the followers

Replication vs coding

How do you make replicas?
- In P2P systems
- In cloud systems
- RAID and RAID 6
- Option 1: make replicas and copy the data :)
- Option 2: use coding theory to come up with something intelligent
  - Network coding
  - Erasure coding

What is erasure coding?
- Suppose you have a large file that you want to replicate
- Divide that file into m pieces
- Run an erasure coding algorithm on the pieces to produce m+n pieces
- You will be able to reconstruct the file if you have any m pieces

Example: Replication vs Erasure coding (1)
- One large file, let us say, of size 1 TB
- One large distributed system with many servers
- If you replicate the file on 3 machines in your network, you require 3 TB to host the file and its replicas
- To have higher redundancy, you need more space
- If the three machines fail, the file is lost
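
The "any m pieces are enough" property of erasure coding described above can be illustrated with the simplest possible code: one XOR parity piece, i.e. n = 1. This is only a sketch of the idea under that assumption. It is not Reed-Solomon coding and not the algorithm used in the lecture, and the function names are hypothetical. Codes that tolerate n > 1 losses need Galois-field arithmetic.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, m):
    """Split data into m equal pieces (zero-padded) plus one XOR parity piece."""
    piece_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(piece_len * m, b"\0")
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(m)]
    parity = bytes(piece_len)  # all zeros
    for piece in pieces:
        parity = xor_bytes(parity, piece)
    return pieces + [parity]


def decode(pieces, original_len):
    """Rebuild the data from m+1 pieces, at most one of which is None (lost)."""
    missing = [i for i, piece in enumerate(pieces) if piece is None]
    if len(missing) > 1:
        raise ValueError("a single parity piece tolerates only one lost piece")
    pieces = list(pieces)
    if missing:
        present = [piece for piece in pieces if piece is not None]
        rebuilt = bytes(len(present[0]))
        for piece in present:
            rebuilt = xor_bytes(rebuilt, piece)  # XOR of the rest = lost piece
        pieces[missing[0]] = rebuilt
    # Drop the parity piece and the padding.
    return b"".join(pieces[:-1])[:original_len]


data = b"replication vs erasure coding"
m = 4
stored = encode(data, m)  # m + 1 = 5 pieces, one per machine

for lost in range(len(stored)):
    survivors = list(stored)
    survivors[lost] = None  # simulate one failed machine
    assert decode(survivors, len(data)) == data

print(len(stored), "pieces stored; any", m, "of them rebuild the file")
```

Storage overhead here is (m+1)/m times the file size, compared with 2 times the file size for a single extra full replica that tolerates the same single failure.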

Some probability
For replication:
- Let ε be the maximum probability of unavailability tolerated for an object o
- a is the average node availability

$$\varepsilon = P(\text{object } o \text{ is unavailable}) = P(\text{all } k \text{ replicas of } o \text{ are unavailable}) = P(\text{one replica is unavailable})^k = (1 - a)^k$$

Taking the log of both sides:

$$k = \frac{\log \varepsilon}{\log(1 - a)}$$

Example: Replication vs Erasure coding (2)
- Take the same file
- But chop it into 10 parts (m) of equal size, i.e., 100 GB each
- Choose n for the erasure coding algorithm
- Run the algorithm to produce m+n pieces, all of size 100 GB
- Distribute the pieces on m+n machines in the system, so the total disk space used is (m+n) × 100 GB
- Now, if up to n of the machines fail, you will still be able to reproduce the file
- And that is black magic (using Galois fields and XORs)

You're just too good to be true
- Sounds like we just solved the problem of data replication
- But have we?
- Can you think of why people are still using good ol' normal replication?

Points against coding
- Complexity added to the system
  - More complex systems, more bugs, harder testing, longer implementation times
- Download/read latency
  - Now you need to get your data from m machines with variable latency
- What if you just want to read the first lines of a text file?
  - Easy with replication
  - Not easy with coding
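
The replica-count formula from the probability slide above, k = log(maximum tolerated unavailability) / log(1 - a), can be evaluated directly. The sketch below rounds k up to a whole number of replicas and, like the derivation itself, assumes that replicas fail independently. The function name and the sample values for the availability a and the tolerated unavailability are assumptions chosen for illustration.

```python
import math


def replicas_needed(epsilon, a):
    """Smallest whole k with (1 - a)**k <= epsilon.

    epsilon: maximum tolerated probability that the object is unavailable
    a:       average availability of a single node
    Assumes independent node failures, as in the derivation.
    """
    if not (0 < epsilon < 1 and 0 < a < 1):
        raise ValueError("epsilon and a must lie strictly between 0 and 1")
    return math.ceil(math.log(epsilon) / math.log(1 - a))


# Hypothetical numbers: nodes that are up half of the time, and an object
# that may be unavailable at most 0.1% of the time.
k = replicas_needed(epsilon=0.001, a=0.5)
print(k, (1 - 0.5) ** k)
```

With these sample values one fewer replica would leave the object unavailable more often than tolerated, which is why the result is rounded up rather than to the nearest integer.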

Summary

Required Readings
- Erasure Coding vs. Replication: A Quantitative Comparison
  https://docs.switzernet.com/people/emin-gabrielyan/0602-capillaryreferences/ref/weatherspoon02.pdf
- Understanding Replication in Databases and Distributed Systems (Until page, the rest is highly recommended to read, but optional)
  http://infoscience.epfl.ch/record/226/files/i_teh_report_999.pdf

Optional Readings
- Extra reading (some bonus questions will be based on this paper): A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems, by James S. Plank
  Available: http://web.eecs.utk.edu/~plank/plank/papers/s-96-2.pdf
- Note: you probably know everything you need as background to understand this. It will take some of you outside your comfort zone (Mathematics, yucky!), but it is worth your effort! I will be happy to help anyone after the 2nd of October on this :)

Next Lecture
Consistency