Distributed Database Systems




Vera Goebel
Department of Informatics, University of Oslo, 2011

Contents
- Review: Layered DBMS Architecture
- Distributed DBMS Architectures
- DDBMS Taxonomy
- Client/Server Models
- Key Problems of Distributed DBMS:
  - Distributed data modeling
  - Distributed query processing & optimization
  - Distributed transaction management
  - Concurrency control
  - Recovery

Functional Layers of a DBMS (transaction management view)
- Interface: applications; data model / view management
- Control: semantic integrity control / authorization
- Compilation / Execution: query processing and optimization
- Data Access: storage structures, buffer management
- Consistency: concurrency control / logging
- Database

Dependencies among DBMS components (figure): the access path manager, transaction manager, sorting component, and log component (with savepoint management) all depend on the central components: the system buffer manager and the lock component.

Centralized DBS
- Logically integrated
- Physically centralized
- Traditionally: one large mainframe DBMS + n "dumb" terminals (T1..T5) connected over a network

Distributed DBS
- Data logically integrated (i.e., access based on one schema)
- Data physically distributed among multiple database nodes
- Processing distributed among multiple database nodes
Why a distributed DBS?
- Performance via parallel execution: more users, quicker response
- More data volume: maximum disks on multiple nodes
- Traditionally: m mainframes for the DBMSs + n terminals

DBMS Implementation Alternatives
Classified along three axes:
- Distribution: centralized vs. client/server vs. distributed
- Autonomy: DBMS vs. federated DBMS vs. multi-DBMS
- Heterogeneity: homogeneous vs. heterogeneous
Combining the axes yields the alternatives: centralized/distributed homogeneous DBMS, centralized/distributed heterogeneous DBMS, centralized/distributed homogeneous or heterogeneous federated DBMS, and centralized/distributed homogeneous or heterogeneous multi-DBMS.

Common DBMS Architectural Configurations
- No autonomy (fully integrated nodes): complete cooperation on the data schema and transaction management; every node is fully aware of all other nodes in the DDBS.
- Federated: independent DBMSs that implement some cooperation functions (transaction management, schema mapping); each DBS is aware of the other DBSs in the federation.
- Multi-DBMS: fully independent DBMSs; an independent global transaction manager & scheduler (GTM & SM) tries to coordinate the DBMSs using only existing DBMS services; the DBMSs are unaware of the GTM and of each other.

Parallel Database System Platforms (non-autonomous distributed DBMS)
- Shared everything: processors Proc1..ProcN and memories Mem1..MemM share disks Disk1..Diskp over a fast interconnection network. Typically a special-purpose hardware configuration, but it can be emulated in software.
- Shared nothing: each node (Proc i, Mem i, Disk i) is connected to the others only by the fast interconnection network. Can be a special-purpose hardware configuration, or a network of workstations.

Distributed DBMS
Advantages:
- Improved performance and efficiency
- Extensibility (addition of new nodes)
- Transparency of distribution: storage of data, query execution
- Autonomy of individual nodes
Problems:
- Complexity of design and implementation
- Data consistency
- Safety
- Failure recovery

Client/Server Database Systems: the simple case of distributed database systems?

Client/Server Environments
- Data (object) server + n smart clients (workstations)
- Objects stored and administered on the server
- Objects processed (accessed and modified) on the workstations [sometimes on the server too]
- CPU-time-intensive applications on the workstations: GUIs, design tools
- Uses the client systems' local storage capabilities
- Combination with distributed DBS services => distributed server architecture

Clients with Centralized Server Architecture (figure): workstations access a single data server over a local communication network; the server provides the interface and the database functions, storing the database on disks Disk1..Diskn.

Clients with Distributed Server Architecture (figure): workstations access m data servers over a local communication network; each data server runs an interface, a distributed DBMS layer, and local data management functions over its own database disks.

Clients with Data Server Approach: 3-tier client/server (figure). The client handles the user interface and query parsing; the application server provides the data server interface; the data server provides the application server interface and the database functions over the database disks; the tiers communicate over a communication channel.

In the client/server DBMS architecture, how are the DB services organized? There are several architectural options!

Client/Server Architectures (figure)
- Relational: the application in the client process sends SQL to the server process, which performs cursor management.
- Object-oriented: the application and object/page cache management in the client process exchange objects, pages, or files with the DBMS in the server process.

Object Server Architecture (figure): the client process holds the application, an object manager, and an object cache; the server process holds the log/lock manager, object manager, object cache, file/index manager, page cache manager, and storage allocation and I/O with a page cache over the database. Client and server exchange objects, object references, queries, method calls, locks, and log records.

Object Server Architecture: summary
- Unit of transfer: object(s)
- The server understands the object concept and can execute methods
- Most DBMS functionality is replicated on client(s) and server(s)
Advantages:
- Server and client can both run methods, so the workload can be balanced
- Simplifies concurrency control design (centralized in the server)
- Implementation of object-level locking
- Low cost for enforcing constraints on objects
Disadvantages/problems:
- Remote procedure calls (RPCs) for object references
- Complex server design
- Client cache consistency problems
- Page-level locking, objects get copied multiple times, large objects

Page Server Architecture (figure): the client process holds the application, object cache, object manager, file/index manager, and page cache manager with a page cache; the server process holds the log/lock manager, page cache manager, and storage allocation and I/O with a page cache over the database. Client and server exchange pages, page references, locks, and log records.

Page Server Architecture: summary
- Unit of transfer: page(s)
- The server deals only with pages (it understands no object semantics)
- Server functionality: storage/retrieval of pages, concurrency control, recovery
Advantages:
- Most DBMS functionality resides at the client
- Page transfer minimizes server overhead
- More clients can be supported
- Object clustering improves performance considerably
Disadvantages/problems:
- Method execution only on clients
- Object-level locking
- Object clustering
- A large client buffer pool is required

File Server Architecture (figure): the client process holds the application, object cache, object manager, file/index manager, page cache manager, and page cache; the server process holds the log/lock manager and space allocation, with the database accessed through NFS. Client and server exchange locks, log records, pages, and page references.

File Server Architecture: summary
- Unit of transfer: page(s)
- A simplification of the page server: clients use a remote file system (e.g., NFS) to read and write DB pages directly
- Server functionality: I/O handling, concurrency control, recovery
Advantages:
- Same as for the page server
- NFS: user-level context switches can be avoided
- NFS is widely used: stable software that will keep improving
Disadvantages/problems:
- Same as for the page server
- NFS writes are slow
- Read operations bypass the server, so requests cannot be combined
- Object clustering
- Coordination of disk space allocation

Cache Consistency in Client/Server Architectures
Avoidance-based algorithms:
- Synchronous: the client sends one message per lock to the server and waits; the server replies with ACK or NACK.
- Asynchronous: the client sends one message per lock to the server and continues; the server invalidates cached copies at other clients.
- Deferred: the client sends all write-lock requests to the server at commit time and waits; the server replies when all cached copies are freed.
Detection-based algorithms:
- Synchronous: the client sends an object status query to the server for each access and waits; the server replies.
- Asynchronous: the client sends one message per lock to the server and continues; after commit, the server sends updates to all cached copies.
- Deferred: the client sends all write-lock requests to the server at commit time and waits; the server replies based on write-write conflicts only.

Comparison of the 3 Client/Server Architectures
Page & file server:
- Simple server design, complex client design
- Fine-grained concurrency control is difficult
- Very sensitive to client buffer pool size and clustering
Object server:
- Complex server design, relatively simple client design
- Fine-grained concurrency control
- Reduces data movement; relatively insensitive to clustering
- Sensitive to client buffer pool size
Conclusions:
- No clear winner
- The choice depends on object size and the application's object access pattern
- The file server is ruled out by poor NFS performance

Problems in Distributed DBMS Services
- Distributed database design
- Distributed directory/catalogue management
- Distributed query processing and optimization
- Distributed transaction management:
  - Distributed concurrency control
  - Distributed deadlock management
  - Distributed recovery management
Dependencies (figure): directory management, query processing, and distributed DB design all influence transaction management, which in turn comprises reliability (log), concurrency control (lock), and deadlock management.

Distributed Storage in Relational DBMSs
- Horizontal fragmentation: distribution of rows (selection)
- Vertical fragmentation: distribution of columns (projection)
- Hybrid fragmentation: projected columns from selected rows
- Allocation: which fragment is assigned to which node?
- Replication: multiple copies at different nodes; how many copies?
Design factors:
- Most frequent query access patterns
- Available distributed query processing algorithms
Evaluation criteria:
- Cost metrics for network traffic, query processing, and transaction management
- A system-wide goal: maximize throughput or minimize latency
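The fragmentation kinds above can be sketched over an in-memory table. This is a minimal illustration, not a DBMS mechanism: the table contents, the selection predicate, and the column split are invented for the example.

```python
# Horizontal and vertical fragmentation of a small row set.
# All data values and the fragmentation criteria are illustrative.

rows = [
    {"id": 1, "name": "Ada",  "city": "Oslo",   "balance": 120},
    {"id": 2, "name": "Bob",  "city": "Bergen", "balance": 450},
    {"id": 3, "name": "Cleo", "city": "Oslo",   "balance": 300},
]

# Horizontal fragmentation: distribute rows by a selection predicate.
frag_oslo  = [r for r in rows if r["city"] == "Oslo"]
frag_other = [r for r in rows if r["city"] != "Oslo"]

# Vertical fragmentation: distribute columns by projection.
# Each fragment keeps the key "id" so rows can be re-joined.
frag_names    = [{"id": r["id"], "name": r["name"]} for r in rows]
frag_balances = [{"id": r["id"], "balance": r["balance"]} for r in rows]

# Reconstruction: union of horizontal fragments, join of vertical ones.
assert sorted(frag_oslo + frag_other, key=lambda r: r["id"]) == rows
rejoined = [{**a, **b} for a in frag_names for b in frag_balances
            if a["id"] == b["id"]]
```

The allocation and replication questions then decide which node stores `frag_oslo`, `frag_other`, etc., and how many copies of each exist.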

Distributed Storage in OODBMSs
Must fragment, allocate, and replicate object data among nodes.
Complicating factors:
- Encapsulation hides object structure
- Object methods (bound to the class, not the instance)
- Users dynamically create new classes
- Complex objects
- Effects on the garbage collection algorithm and its performance
Approaches:
- Store objects in relations and use RDBMS strategies to distribute the data:
  - All objects stored in binary relations [OID, attr-value]
  - One relation per class attribute
  - Nested relations to store complex (multi-class) objects
- Use object semantics
Evaluation criteria:
- Affinity metric (maximize)
- Cost metric (minimize)

Horizontal Partitioning using Object Semantics
- Divide the instances of a class into multiple groups:
  - Based on selected attribute values, e.g., Node1: Attr A < 100; Node2: Attr A >= 100
  - Based on subclass designation, e.g., for class C: Node1: subclass X; Node2: subclass Y
- Fragment an object's complex attributes from its simple attributes
- Fragment an object's attributes based on the typical method invocation sequence: keep sequentially referenced attributes together

Vertical Partitioning using Object Semantics
Fragment object attributes based on the class hierarchy:
- Fragment #1: instances of class X
- Fragment #2: instances of subclass subX
- Fragment #3: instances of subclass subsubX
Does this break object encapsulation?

Path Partitioning using Object Semantics
- Use the object composition graph for complex objects
- A path partition is the set of objects corresponding to the instance variables in the subtree rooted at the composite object
- Typically represented in a structure index: Composite Object OID -> {SubC#1-OID, SubC#2-OID, ..., SubC#n-OID}

More Issues in OODBMS Fragmentation
- Local methods, local object data: good performance. Issue: replicate methods so they are local to all instances?
- Local methods, remote object data: fetch the remote data and execute the methods. Issue: time/cost for the data transfer?
- Remote methods, local object data: send the data to the remote site, execute, and return the result, OR fetch the method and execute it. Issues: time/cost for two transfers? Ability to execute locally?
- Remote methods, remote object data: send the additional input values via RPC, execute, and return the result. Issues: RPC time? Execution load on the remote node?
Replication options: objects, classes (collections of objects), methods, class/type specifications.

Distributed Directory and Catalogue Management
Directory information:
- Description and location of records/objects
- Size, special data properties (e.g., executable, DB type, user-defined type, etc.)
- Fragmentation scheme
- Definitions for views and integrity constraints
Options for organizing the directory:
- Centralized. Issues: bottleneck, unreliable
- Fully replicated. Issues: consistency, storage overhead
- Partitioned. Issues: complicated access protocol, consistency
- Combination of partitioned and replicated, e.g., zoned or replicated zoned

Distributed Query Processing and Optimization
Construction and execution of query plans; query optimization.
Goals:
- Maximize parallelism (response time optimization)
- Minimize network data transfer (throughput optimization)
Basic approaches to distributed query processing:
- Pipelining (functional decomposition), e.g., Node 1 selects A.x > 100 from relation A and streams the result to Node 2, which joins it with relation B on y; the two nodes' processing timelines overlap.
- Parallelism (data decomposition), e.g., Nodes 1 and 2 each select A.x > 100 on their own fragment of A (A.1 and A.2), and Node 3 unions the results; the timelines of Nodes 1 and 2 run in parallel.

Creating the Distributed Query Processing Plan
Factors to be considered:
- Distribution of data
- Communication costs
- Lack of sufficient locally available information
Four processing steps:
1. Query decomposition (control site, uses global information)
2. Data localization (control site, uses global information)
3. Global optimization (control site, uses global information)
4. Local optimization (local sites, use local information)

Generic Layered Scheme for Planning a Distributed Query
At the control site:
1. Calculus query on distributed objects
2. Query decomposition (uses the global schema) -> algebraic query on distributed objects
3. Data localization (uses the fragment schema) -> fragment query
4. Global optimization (uses statistics on fragments) -> optimized fragment query with communication operations
At the local sites:
5. Local optimization (uses the local schema) -> optimized local queries
This approach was initially designed for distributed relational DBMSs; it also applies to distributed OODBMSs, multi-DBMSs, and distributed multi-DBMSs.

Distributed Query Processing Plans in OODBMSs
Two forms of queries:
- Explicit queries written in OQL
- Object navigation/traversal
Both reduce to logical algebraic expressions.
Planning and optimizing:
- Additional rewriting rules for sets of objects: union, intersection, and selection
- Cost function: should consider object size, structure, location, indexes, etc. Does obtaining this information break object encapsulation? Objects can reflect their own access cost estimate.
- Volatile object databases can invalidate optimizations:
  - Dynamic plan selection (compile N plans; select 1 at runtime)
  - Periodic replanning during long-running queries

Distributed Query Optimization
Information needed for optimization (fragment statistics):
- Size of objects, image sizes of attributes
- Transfer costs
- Workload among nodes
- Physical data layout
- Access paths, indexes, clustering information
- Properties of the result (objects)
- Formulas for estimating the cardinalities of operation results
Execution cost is expressed as a weighted combination of I/O, CPU, and communication costs (the latter mostly dominant):

    total-cost = C_CPU * #insts + C_I/O * #I/Os + C_MSG * #msgs + C_TR * #bytes

Optimization goals: response time of a single transaction, or system throughput.
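Evaluating the weighted cost formula for two candidate plans can be sketched as follows. All coefficient values and operation counts here are invented for illustration; in a real optimizer they come from the fragment statistics listed above.

```python
# total_cost = C_CPU*#insts + C_IO*#IOs + C_MSG*#msgs + C_TR*#bytes
# Coefficients (seconds per unit) are illustrative assumptions.

def total_cost(insts, ios, msgs, bytes_sent,
               c_cpu=1e-9, c_io=1e-3, c_msg=1e-4, c_tr=1e-7):
    return c_cpu * insts + c_io * ios + c_msg * msgs + c_tr * bytes_sent

# Plan A ships a large fragment; Plan B sends more messages but less data.
plan_a = total_cost(insts=5_000_000, ios=200, msgs=4,  bytes_sent=8_000_000)
plan_b = total_cost(insts=7_000_000, ios=150, msgs=40, bytes_sent=500_000)
best = min(("A", plan_a), ("B", plan_b), key=lambda p: p[1])
```

With these weights the communication term dominates, so plan B, which transfers far fewer bytes, wins despite using more CPU and messages.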

Distributed Transaction Management
TM in a centralized DBS achieves the transaction ACID properties by using:
- Concurrency control (CC)
- Recovery (logging)
TM in a DDBS achieves the ACID properties by using:
- Transaction manager
- Concurrency control (isolation)
- Commit/abort protocol (atomicity)
- Recovery protocol (durability)
- Replica control protocol (mutual consistency)
- Log manager and buffer manager
There are strong algorithmic dependencies among these components.

Classification of Concurrency Control Approaches
- Pessimistic:
  - Locking: centralized, primary copy, distributed
  - Timestamp ordering: basic, multi-version, conservative
  - Hybrid
- Optimistic:
  - Locking
  - Timestamp ordering
Phases of pessimistic transaction execution: validate, read, compute, write.
Phases of optimistic transaction execution: read, compute, validate, write.

Two-Phase Locking (2PL)
- 2PL lock graph: over the transaction duration, the number of locks held grows while the transaction obtains locks (up to the lock point) and then shrinks as it releases them before the transaction ends; no lock may be obtained after the first release.
- Strict 2PL lock graph: locks are obtained during the period of data item use and are all released only at the end of the transaction.
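The strict 2PL discipline can be sketched with a tiny lock table. This is a minimal single-process illustration with exclusive locks only (no shared locks, no wait queues, no distribution); the class and method names are invented for the example.

```python
# Strict 2PL sketch: locks are acquired as items are touched and
# nothing is released until commit/abort (release_all).

class LockTable:
    def __init__(self):
        self.owner = {}             # item -> owning transaction id
        self.held = {}              # txn  -> set of locked items

    def lock(self, txn, item):
        holder = self.owner.get(item)
        if holder not in (None, txn):
            return False            # conflict: caller must wait or abort
        self.owner[item] = txn
        self.held.setdefault(txn, set()).add(item)
        return True

    def release_all(self, txn):     # called only at commit/abort
        for item in self.held.pop(txn, set()):
            del self.owner[item]

lt = LockTable()
assert lt.lock("T1", "a")           # growing phase of T1
assert not lt.lock("T2", "a")       # T2 conflicts, must wait
lt.release_all("T1")                # T1 commits: shrinking all at once
assert lt.lock("T2", "a")           # now T2 can proceed
```

Plain (non-strict) 2PL would additionally allow releasing individual locks before the end of the transaction, as long as no new lock is obtained afterwards.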

Communication Structure of Centralized 2PL
Messages flow between the coordinating TM, the lock manager (LM) at the central site, and the data processors at the participating sites:
1. Lock request (coordinating TM to central LM)
2. Lock granted (central LM to coordinating TM)
3. Operation (coordinating TM to data processors)
4. End of operation (data processors to coordinating TM)
5. Release locks (coordinating TM to central LM)

Communication Structure of Distributed 2PL
Messages flow between the coordinating TM and the lock managers and data processors at the participating sites:
1. Lock request (coordinating TM to participating LMs)
2. Operation (each participating LM to its data processor)
3. End of operation (data processors to coordinating TM)
4. Release locks (coordinating TM to participating LMs)

Distributed Deadlock Management
Example of a deadlock across two sites:
- Site X: T1_x holds lock Lc and needs item a, so it waits for T1 on site Y to complete; T2_x holds lock Lb and waits for T1_x to release Lc.
- Site Y: T1_y holds lock La and waits for T2_y to release Ld; T2_y holds lock Ld and needs item b, so it waits for T2 on site X to complete.
Distributed waits-for graph:
- Requires many messages to update lock status
- Combine the wait-for tracing message with the lock status update messages
- Still an expensive algorithm
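Once the (global) waits-for graph has been assembled, deadlock detection is a cycle search. A minimal sketch, assuming the graph is already collected into one adjacency map (the expensive part in a DDBS is precisely gathering these edges from all sites):

```python
# Detect a cycle in a waits-for graph via depth-first search.
# Edge t -> u means "transaction t waits for transaction u".

def has_cycle(waits_for):
    visited, on_stack = set(), set()

    def dfs(t):
        visited.add(t)
        on_stack.add(t)
        for u in waits_for.get(t, ()):
            if u in on_stack or (u not in visited and dfs(u)):
                return True
        on_stack.discard(t)
        return False

    return any(t not in visited and dfs(t) for t in waits_for)

# The slide's example collapses to T1 -> T2 -> T1: a deadlock.
assert has_cycle({"T1": ["T2"], "T2": ["T1"]})
assert not has_cycle({"T1": ["T2"], "T2": []})
```

On detection, the system would abort one transaction in the cycle (the victim) to break the deadlock.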

Communication Structure of Centralized 2-Phase Commit (2PC) Protocol
1. The coordinating TM sends Prepare to all participating sites.
2. Each participant makes its local commit/abort decision, writes a log entry, and replies with Vote-Commit or Vote-Abort.
3. The coordinator counts the votes: if any vote is missing or any vote is Vote-Abort, it sends Global-Abort; otherwise it sends Global-Commit (writing a log entry).
4. Each participant acknowledges (ACK).
Who participates depends on the CC algorithm. Other communication structures are possible: linear and distributed.

State Transitions for the 2PC Protocol
- Coordinating TM: Initial -> (1: Prepare) -> Wait -> (3: Global-Abort or Global-Commit) -> Abort or Commit
- Participating sites: Initial -> (2: Vote-Commit or Vote-Abort) -> Ready -> (4: ACK) -> Commit or Abort
Advantages:
- Preserves atomicity
- All processes are synchronous within one state transition
Disadvantages:
- Many, many messages
- If a failure occurs, the 2PC protocol blocks!
Attempted solutions for the blocking problem:
1. Termination protocol
2. 3-phase commit
3. Quorum 3-phase commit
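The coordinator's decision rule in phase 2 can be sketched as a pure function over the collected votes. Message transport, logging, and timeouts are omitted; a missing vote is treated as a timeout and thus counts as abort, per the rule on the previous slide.

```python
# 2PC coordinator decision sketch: Global-Commit only if every
# participant voted Vote-Commit; any Vote-Abort or missing vote
# (timeout) forces Global-Abort. Vote values are plain strings.

def decide(votes, participants):
    received = {p: votes.get(p) for p in participants}
    if all(v == "Vote-Commit" for v in received.values()):
        return "Global-Commit"
    return "Global-Abort"

ps = ["N1", "N2", "N3"]
assert decide({"N1": "Vote-Commit", "N2": "Vote-Commit",
               "N3": "Vote-Commit"}, ps) == "Global-Commit"
assert decide({"N1": "Vote-Commit", "N2": "Vote-Abort",
               "N3": "Vote-Commit"}, ps) == "Global-Abort"
```

The blocking problem arises exactly when a participant is in Ready and the coordinator fails before the Global-Commit/Global-Abort message arrives: this function's inputs never materialize, and the participant cannot decide unilaterally.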

Failures in a Distributed System
Types of failure:
- Transaction failure
- Node failure
- Media failure
- Network failure: partitions, each containing one or more sites
Issues to be addressed:
- How to continue service
- How to maintain the ACID properties while providing continued service
- How to ensure the ACID properties after recovery from the failure(s)
Who addresses the problem?
- Termination protocols
- Modified concurrency control & commit/abort protocols
- Recovery protocols, termination protocols, & replica control protocols

Termination Protocols
Use timeouts to detect potential failures that could block protocol progress.
Timeout states:
- Coordinator: Wait, Commit, Abort
- Participant: Initial, Ready
Coordinator termination protocol:
- Timeout in Wait: send Global-Abort
- Timeout in Commit or Abort: BLOCKED!
Participant termination protocol:
- Timeout in Ready: query the coordinator; if that times out, query the other participants; if Global-Abort or Global-Commit can be learned, proceed and terminate, else BLOCKED!

Replica Control Protocols
Update propagation of committed write operations:
- Strict update: propagated before the commit point, within the 2-phase commit (obtain locks, period of data use, 2-phase commit, release locks).
- Lazy update: propagated after the commit point.

Strict Replica Control Protocol
- Read-One-Write-All (ROWA)
- Part of the concurrency control protocol and the 2-phase commit protocol:
  - CC locks all copies
  - 2PC propagates the updated values with the 2PC messages (or an update-propagation phase is inserted between the Wait and Commit states for those nodes holding an updateable value)

Lazy Replica Control Protocol
- Propagates updates from a primary node
- The concurrency control algorithm locks the primary copy node (the same node as the primary lock node)
- To preserve single-copy semantics, a transaction must be guaranteed to read a current copy:
  - Changes the CC algorithm for read locks
  - Adds an extra communication cost for reading data
- Extended transaction models may not require single-copy semantics

Recovery in Distributed Systems
Select COMMIT or ABORT (or blocked) for each interrupted subtransaction.
Commit approaches:
- Redo: use the undo/redo log to perform all the write operations again
- Retry: use the transaction log to redo the entire subtransaction (reads + writes)
Abort approaches:
- Undo: use the undo/redo log to back out all the writes that were actually performed
- Compensation: use the transaction log to select and execute reverse subtransactions that semantically undo the write operations
Implementation requires knowledge of:
- The buffer manager's algorithms for writing updated data from volatile storage buffers to persistent storage
- The concurrency control algorithm
- The commit/abort protocols
- The replica control protocol
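The redo and undo approaches can be sketched over a log of before/after images. The record layout here (one record per write, holding both images) is an illustrative assumption, not the lecture's log format, and buffering and force/steal policies are ignored.

```python
# Redo/undo sketch over an undo/redo log of before- and after-images.

log = [
    {"txn": "T1", "item": "x", "before": 0, "after": 10},
    {"txn": "T2", "item": "y", "before": 5, "after": 7},
]
db = {"x": 10, "y": 7}          # state after both writes were applied

def redo(db, log, txn):
    # Commit approach: replay the transaction's writes forward.
    for rec in log:
        if rec["txn"] == txn:
            db[rec["item"]] = rec["after"]

def undo(db, log, txn):
    # Abort approach: back out the transaction's writes in reverse order.
    for rec in reversed(log):
        if rec["txn"] == txn:
            db[rec["item"]] = rec["before"]

undo(db, log, "T2")             # T2's subtransaction is aborted
redo(db, log, "T1")             # T1's subtransaction is (re)committed
```

Compensation differs from undo in that it runs new, semantically inverse subtransactions rather than restoring physical before-images.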

Network Partitions in Distributed Systems
A network failure can split the nodes into multiple partitions (in the figure: partition #1 with N1-N3, partition #2 with N7-N8, partition #3 with N4-N6).
Issues:
- Termination of interrupted transactions
- Partition integration upon recovery from a network failure
- Data availability while the failure is ongoing

Data Availability in Partitioned Networks
The concurrency control model impacts data availability:
- ROWA: data replicated in multiple partitions is not available for reading or writing.
- Primary copy node CC: transactions can execute if the primary copy nodes for all of the read set and all of the write set are in the client's partition.
Availability is still very limited... we need a new idea!

Quorums
A quorum is a special type of majority. Quorums are used in the concurrency control, commit/abort, termination, and recovery protocols:
- CC uses a read quorum & a write quorum
- Commit/abort, termination, & recovery use a commit quorum & an abort quorum
Advantages:
- More transactions can be executed during site failures and network failures (and still retain the ACID properties)
Disadvantages:
- Many messages are required to establish a quorum
- The necessity for a read quorum slows down read operations
- Not quite sufficient (failures are not "clean")

Read-Quorums and Write-Quorums
The concurrency control algorithm serializes the valid transactions in a partition. It must obtain:
- a read quorum for each read operation
- a write quorum for each write operation
Let N be the total number of nodes in the system, and define the read-quorum size Nr and the write-quorum size Nw as follows:
- Nr + Nw > N
- Nw > N/2
Simple example: N = 8, Nr = 4, Nw = 5.
When failures occur, it is possible to have a valid read quorum but no valid write quorum.
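The two quorum conditions are easy to check mechanically. The first guarantees that every read quorum intersects every write quorum (so a read always sees the latest write); the second guarantees that any two write quorums intersect (so two partitions cannot both write):

```python
# Validate read/write quorum sizes against the slide's conditions.

def valid_quorums(n, nr, nw):
    return nr + nw > n and nw > n / 2

assert valid_quorums(8, 4, 5)        # the slide's example: N=8, Nr=4, Nw=5
assert not valid_quorums(8, 3, 5)    # Nr+Nw = N: a read may miss the write
assert not valid_quorums(8, 5, 4)    # Nw <= N/2: two write quorums may diverge
```

This also illustrates the remark above: with N = 8, Nr = 4, Nw = 5, a partition of exactly 4 live nodes can still form a read quorum but no write quorum.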

Commit-Quorums and Abort-Quorums
The commit/abort protocol requires votes from all participants to commit or abort a transaction:
- Commit a transaction if the accessible nodes can form a commit quorum
- Abort a transaction if the accessible nodes can form an abort quorum
- Elect a new coordinator node (if necessary)
- Try to form a commit quorum before attempting to form an abort quorum
Let N be the total number of nodes in the system, and define the commit-quorum size Nc and the abort-quorum size Na as follows:
- Na + Nc > N
- 0 <= Na, Nc <= N
Simple examples: N = 7, Nc = 4, Na = 4; or N = 7, Nc = 5, Na = 3.

Conclusions
- Nearly all commercial relational database systems offer some form of distribution, client/server at a minimum.
- Only a few commercial object-oriented database systems support distribution beyond N clients and 1 server.
Future research directions:
- Distributed data architecture and placement schemes (application-influenced)
- Distributed databases as part of Internet applications
- Continued service during disconnection or failure:
  - Mobile systems accessing databases
  - Relaxed transaction models for semantically correct continued operation