DATABASE REPLICATION A TALE OF RESEARCH ACROSS COMMUNITIES

Similar documents
Database Replication: a Tale of Research across Communities

Database Replication Techniques: a Three Parameter Classification

Data Replication and Snapshot Isolation. Example: Cluster Replication

Data Distribution with SQL Server Replication

Survey on Comparative Analysis of Database Replication Techniques

Cloud DBMS: An Overview. Shan-Hung Wu, NetDB CS, NTHU Spring, 2015

Byzantium: Byzantine-Fault-Tolerant Database Replication

Amr El Abbadi. Computer Science, UC Santa Barbara

Database Replication with Oracle 11g and MS SQL Server 2008

CAP Theorem and Distributed Database Consistency. Syed Akbar Mehdi Lara Schmidt

Database Replication: A Survey of Open Source and Commercial Tools

Tashkent: Uniting Durability with Transaction Ordering for High-Performance Scalable Database Replication

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo

Transaction Management Overview

New method for data replication in distributed heterogeneous database systems

Replication: Analysis & Tackle in Distributed Real Time Database System

Postgres-R(SI): Combining Replica Control with Concurrency Control based on Snapshot Isolation

Conflict-Aware Load-Balancing Techniques for Database Replication

IV Distributed Databases - Motivation & Introduction -

Geo-Replication in Large-Scale Cloud Computing Applications

Data Management in the Cloud

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) /21/2013

Conflict-Aware Load-Balancing Techniques for Database Replication

Synchronization and replication in the context of mobile applications

Real-time Data Replication

Conflict-Aware Load-Balancing Techniques for Database Replication

A Taxonomy of Partitioned Replicated Cloud-based Database Systems

Distributed Systems. Tutorial 12 Cassandra

Ph.D. Thesis Proposal Database Replication in Wide Area Networks

Chapter 3 - Data Replication and Materialized Integration

Transactions and the Internet

Introduction to NOSQL

Correctness Criteria for Database Replication: Theoretical and Practical Aspects

Divy Agrawal and Amr El Abbadi Department of Computer Science University of California at Santa Barbara

Framework Model for Database Replication within the Availability Zones

How To Manage A Multi-Tenant Database In A Cloud Platform

Lecture 7: Concurrency control. Rasmus Pagh

Chapter 18: Database System Architectures. Centralized Systems

Data Consistency on Private Cloud Storage System

Pronto: High Availability for Standard Off-the-shelf Databases

Big Data, Deep Learning and Other Allegories: Scalability and Fault- tolerance of Parallel and Distributed Infrastructures.

Efficient Middleware for Byzantine Fault Tolerant Database Replication

Principles of Distributed Database Systems

On Non-Intrusive Workload-Aware Database Replication

Goals. Managing Multi-User Databases. Database Administration. DBA Tasks. (Kroenke, Chapter 9) Database Administration. Concurrency Control

The PostgreSQL Database And Its Architecture

Segmentation in a Distributed Real-Time Main-Memory Database

SWISSBOX REVISITING THE DATA PROCESSING SOFTWARE STACK

Syllabus for Course : Database Systems Engineering at Kinneret College

DISTRIBUTED AND PARALLELL DATABASE

TransPeer: Adaptive Distributed Transaction Monitoring for Web2.0 applications

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

Transactions and ACID in MongoDB

CMSC724: Concurrency

Low-Latency Multi-Datacenter Databases using Replicated Commit

The Cost of Increased Transactional Correctness and Durability in Distributed Databases

Designing a Data Solution with Microsoft SQL Server 2014

A Replication Protocol for Real Time database System

Designing a Data Solution with Microsoft SQL Server

Heterogeneous Database Replication Gianni Pucciani

Scalable and Highly Available Database Replication through Dynamic Multiversioning. Kaloian Manassiev

Course 20465: Designing a Data Solution with Microsoft SQL Server

Designing a Data Solution with Microsoft SQL Server

BIG DATA APPLIANCES. July 23, TDWI. R Sathyanarayana. Enterprise Information Management & Analytics Practice EMC Consulting

Transactions and Concurrency Control. Goals. Database Administration. (Manga Guide to DB, Chapter 5, pg , ) Database Administration

Guerrilla Warfare? Guerrilla Tactics - Performance Testing MS SQL Server Applications

Relational Databases in the Cloud

MS 20465C: Designing a Data Solution with Microsoft SQL Server

Managing Transaction Conflicts in Middleware-based Database Replication Architectures

Information Systems. Computer Science Department ETH Zurich Spring 2012

Serializability, not Serial: Concurrency Control and Availability in Multi Datacenter Datastores

Report Data Management in the Cloud: Limitations and Opportunities

Course 20465C: Designing a Data Solution with Microsoft SQL Server

Conflict-Aware Scheduling for Dynamic Content Applications

Practical Cassandra. Vitalii

Transcription:

DATABASE REPLICATION A TALE OF RESEARCH ACROSS COMMUNITIES Bettina Kemme Dept. of Computer Science McGill University Montreal, Canada Gustavo Alonso Systems Group Dept. of Computer Science ETH Zurich, Switzerland ETH, Zurich and McGill University, Montreal VLDB 2010 Singapore

Across communities Everything inside Everything fails the database Only Protocols Only correctness performance [Kemme, Alonso, ICDCS 98] Implementation matters matters [Kemme, Alonso, VLDB2000] Postgres R (Dragon project) Database person Distributed systems person

Postgres R Intro to Replication Postgres R In perspective Systems Today The next 10 years

A brief introduction to database replication

DBS DBS Database Copy DBS Database Copy DBS Database Copy DBS Scalability Fault tolerance Fast access Special purpose copies

Primary Copy vs. Update Everywhere

Eager (synchr.) vs. Lazy (asynchr.) 1 2 2 1

Theory of replication 10 years ago PRIMARY COPY UPDATE EVERYWHERE LAZY EAGER

Replication in practice 10 years ago PRIMARY COPY UPDATE EVERYWHERE LAZY EAGER

Replication in 2000

Read one / Write All Distributed locking (writes) 2 Phase Commit

Write Write BOT Read Write Read Write Write Write 2PC 2PC 2PC

The Dangers of Replication Gray et al. SIGMOD 1996

Response Time and Messages T= centralized database T= replicated database update: 2N messages 2PC 17

and that s not all Network becomes an issue Messages = copies x write operations Quorums? Reads must be local for complex SQL operations Different in key value stores (e.g., Cassandra)

Our goal Can we get scalability and consistency when replicating a database?

Postgres R in detail

Fundamentals Exploitation of group communication systems Ordering semantics Affect isolation / concurrency control Delivery semantics Affect atomicity

Key insight in Postgres R BEFORE AFTER Pre Ordering Mechanism Scheduler Scheduler Scheduler Scheduler 22

The devil is in the details Total order to serialization order Provide various levels of isolation / atomicity degrees Read operations always local Propagate changes on transaction basis No 2PC Rely on delivery guarantees Return to user once local replica commits Determinism Propagate changes vs. SQL statements

Distributed locking Response Time in ms Did we really avoided the dangers of replication? 200 150 100 50 Postgres- R 0 1 2 3 4 5 Number Servers Response Time in ms 800 600 400 200 Distributed locking 0 1 2 3 4 5 Number Servers Postrges-R removed a lot of the overhead of replication, providing scalability while maintaining strong consistency 24

In perspective

What worked Ordered and guaranteed propagation of changes through an agreement protocol external to the engine The implementation was crucial to prove the point Thinking through the optimizations / real system issues Levels of consistency

What did not work Modify the engine Today middleware based solutions Enforce serializability Today SI and session consistency Data warehousing less demanding Cloud computing has lowered the bar

Systems today

Very rich design space Applications (OLAP vs OLTP) Data layer (DB vs. others) Throughput/Response time Staleness Availability guarantees Partial vs. full replication Granularity of changes, operations

A Suite of Replication Protocols Serializability Correctness Local decisions Mssgs/txn 1 copy serializability write locks abort conflicting local read locks 2 mssgs write set confirm commit Problems very high abort rate Cursor Stability no lost updates no dirty reads write locks abort conflicting local read locks 2 mssgs write set confirm commit inconsistent reads Snapshot isolation allows write skew first writer to commit wins (deterministic) 1 mssg write set high abort rate for update hot spots Hybrid 1 copy serializability as serializability for update transactions 2 mssgs write set confirm commit requires to identify the queries 6 versions of each protocol depending on delivery guarantees and whether deferred updates or shadow copies are used (22 protocols) [Kemme and Alonso, ICDCS 98] 31

Architecture Eventually all this moved to a middleware layer above the database Middle R All systems out there today

Consistency variations Snapshot isolation, optimistic cc, conflict detection, relaxed consistency Middle R GORDA project Tashkent Pronto/Sprint C JDBC C. Amza et al.

Architectural variations Primary copy Specialized satellites VM Ganymed DBFarm SQL Azure Zimory (virtualized satellites)

Engine variations Xkoto (Teradata) Galera Continuent SQL Azure Ganymed MySQL, DB2, SQL Server, Oracle, Heterogeneous

Change unit variations Xkoto (Teradata) SQL statement propagation Continuent Log capture Determinism! SQL statement Log entries

Cloud solutions PAXOS Key value stores, files, tables Cassandra PNUTS Big Table Cloudy

With a lot of related topics Consistency (Khuzaima et al., Cahill et al.) Applications (Vandiver et al.) Agreement protocols (Lamport et al.) Determinism (Thomson et al.) Recovery (Kemme et al., Jimenez et al.)

The next 10 years

Understanding the full picture Paxos Group communication protocols differ in the exact properties they provide Often difficult to understand for outsiders Subtleties in implementation and efficiency Complex implementations Adjusting the agreement protocols to the needs of databases Properties that suffice Efficient implementation Plenty of use cases (one size does not fit all)

Database and distributed systems Databases and distributed systems have converged in practice Many similar concepts Research Work still done in separate communities Teaching Dire need for joint courses (thanks to Amr El Abbadi and Divy Agrawal from UCSB!)

Thanks Andre Schiper, Fernando Pedone, Matthias Wiesmann (EPFL) Marta Patiño, Ricardo Jiménez (UPM) The PostgreSQL community PhD and master students at ETH and McGill who have worked and are working on related ideas Many colleagues and friends...