Database Tuning and Physical Design: Execution of Transactions



Similar documents
Transaction Management Overview

Course Content. Transactions and Concurrency Control. Objectives of Lecture 4 Transactions and Concurrency Control

Lecture 7: Concurrency control. Rasmus Pagh

Concurrency Control. Module 6, Lectures 1 and 2

Recovery Theory. Storage Types. Failure Types. Theory of Recovery. Volatile storage main memory, which does not survive crashes.

Concurrency Control. Chapter 17. Comp 521 Files and Databases Fall

Textbook and References

Concurrency control. Concurrency problems. Database Management System

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

Introduction to Database Management Systems

Transactions and the Internet

Database Concurrency Control and Recovery. Simple database model

Transactions and Concurrency Control. Goals. Database Administration. (Manga Guide to DB, Chapter 5, pg , ) Database Administration

INTRODUCTION TO DATABASE SYSTEMS

Review: The ACID properties

Transactions. SET08104 Database Systems. Napier University

Goals. Managing Multi-User Databases. Database Administration. DBA Tasks. (Kroenke, Chapter 9) Database Administration. Concurrency Control

Data Management in the Cloud

Chapter 14: Recovery System

The ConTract Model. Helmut Wächter, Andreas Reuter. November 9, 1999

Chapter 15: Recovery System

Transactional properties of DBS

Transactions and Recovery. Database Systems Lecture 15 Natasha Alechina

CMSC724: Concurrency

Transactions: Definition. Transactional properties of DBS. Transactions: Management in DBS. Transactions: Read/Write Model

! Volatile storage: ! Nonvolatile storage:

Transaction Processing Monitors

Chapter 6 The database Language SQL as a tutorial

Introduction to Database Systems. Module 1, Lecture 1. Instructor: Raghu Ramakrishnan UW-Madison

Recovery and the ACID properties CMPUT 391: Implementing Durability Recovery Manager Atomicity Durability

(Pessimistic) Timestamp Ordering. Rules for read and write Operations. Pessimistic Timestamp Ordering. Write Operations and Timestamps

Roadmap DB Sys. Design & Impl. Detailed Roadmap. Paper. Transactions - dfn. Reminders: Locking and Consistency

Topics. Introduction to Database Management System. What Is a DBMS? DBMS Types

Information Systems. Computer Science Department ETH Zurich Spring 2012

Crashes and Recovery. Write-ahead logging

Crash Recovery. Chapter 18. Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke

Chapter 10. Backup and Recovery

How To Recover From Failure In A Relational Database System

CS 245 Final Exam Winter 2013

Recovery algorithms are techniques to ensure transaction atomicity and durability despite failures. Two main approaches in recovery process

How To Write A Transaction System

Chapter 10: Distributed DBMS Reliability

Recovery System C H A P T E R16. Practice Exercises

COS 318: Operating Systems

Module 3 (14 hrs) Transactions : Transaction Processing Systems(TPS): Properties (or ACID properties) of Transactions Atomicity Consistency

The first time through running an Ad Hoc query or Stored Procedure, SQL Server will go through each of the following steps.

VALLIAMMAI ENGNIEERING COLLEGE SRM Nagar, Kattankulathur

Week 1 Part 1: An Introduction to Database Systems. Databases and DBMSs. Why Use a DBMS? Why Study Databases??

Unit 12 Database Recovery

Chapter 16: Recovery System

Lecture 18: Reliable Storage

Distributed Databases

PostgreSQL Concurrency Issues

Outline. Failure Types

Redo Recovery after System Crashes

CHAPTER 6: DISTRIBUTED FILE SYSTEMS

Last Class Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications

Failure Recovery Himanshu Gupta CSE 532-Recovery-1

UVA. Failure and Recovery. Failure and inconsistency. - transaction failures - system failures - media failures. Principle of recovery

Homework 8. Revision : 2015/04/14 08:13

Contents RELATIONAL DATABASES

Agenda. Transaction Manager Concepts ACID. DO-UNDO-REDO Protocol DB101

Synchronization and recovery in a client-server storage system

Concurrency Control: Locking, Optimistic, Degrees of Consistency

Lesson 12: Recovery System DBMS Architectures

Distributed Transactions

Introduction to Database Systems CS4320. Instructor: Christoph Koch CS

Microkernels & Database OSs. Recovery Management in QuickSilver. DB folks: Stonebraker81. Very different philosophies

A Proposal for a Multi-Master Synchronous Replication System

Inside Microsoft SQL Server 2005: The Storage Engine

Recovery Protocols For Flash File Systems

Serializable Isolation for Snapshot Databases

Distributed Architectures. Distributed Databases. Distributed Databases. Distributed Databases

CS 525 Advanced Database Organization - Spring 2013 Mon + Wed 3:15-4:30 PM, Room: Wishnick Hall 113

TRANSACÇÕES. PARTE I (Extraído de SQL Server Books Online )

The Oracle Universal Server Buffer Manager

File System Design and Implementation

Recovery: An Intro to ARIES Based on SKS 17. Instructor: Randal Burns Lecture for April 1, 2002 Computer Science Johns Hopkins University

THE UNIVERSITY OF TRINIDAD & TOBAGO

DATABASDESIGN FÖR INGENJÖRER - 1DL124

A Shared-nothing cluster system: Postgres-XC

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 5 - DBMS Architecture

Recovery: Write-Ahead Logging

Derby: Replication and Availability

DataBlitz Main Memory DataBase System

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

MS SQL Performance (Tuning) Best Practices:

David Dye. Extract, Transform, Load

DATABASE MANAGEMENT SYSTEMS. Question Bank:

Oracle EXAM - 1Z Oracle Database 11g Release 2: SQL Tuning. Buy Full Product.

Databases and Information Systems 1 Part 3: Storage Structures and Indices

Generalized Isolation Level Definitions

Logistics. Database Management Systems. Chapter 1. Project. Goals for This Course. Any Questions So Far? What This Course Cannot Do.

W I S E. SQL Server 2008/2008 R2 Advanced DBA Performance & WISE LTD.

1Z0-117 Oracle Database 11g Release 2: SQL Tuning. Oracle

B2.2-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

Recovery Principles in MySQL Cluster 5.1

Transcription:

Database Tuning and Physical Design: Execution of Transactions David Toman School of Computer Science University of Waterloo Introduction to Databases CS348 David Toman (University of Waterloo) Transaction Execution 1 / 20

Basics of Transaction Processing Query (and update) processing converts requests for sets of tuples to requests for reads and writes of physical objects in the database. database objects (depending on granularity) can be individual attributes records physical pages files (only for concurrency control purposes) Goals ) correct and concurrent execution of queries and updates ) guarantee that acknowledged updates are persistent David Toman (University of Waterloo) Transaction Execution 2 / 20

ACID Requirements Transactions are said to have the ACID properties: Atomicity: all-or-nothing execution Consistency: execution preserves database integrity Isolation: transactions execute independently (as if they were executed in the system alone) Durability: updates made by a committed transaction will not be destroyed by subsequent failures. Implementation of transactions in a DBMS comes in two parts: Concurrency Control: committed transactions do not interfere Recovery Management: committed transactions are durable, aborted transactions have no effect on the database David Toman (University of Waterloo) Transaction Execution 3 / 20

Concurrency Control: assumptions 1 we fix a database: a set of objects read/written by transactions: ) r i [x ]: transaction T i reads object x ) w i [x ]: transaction T i writes (modifies) object x 2 a transaction T i is a sequence of operations T i = r i [x 1 ]; r i [x 2 ]; w i [x 1 ]; : : : ; r i [x 4 ]; w i [x 2 ]; c i c i is the commit request of T i. 3 for a set of transactions T 1 ; : : : ; T k we want to produce a schedule S of operations such that ) every operation o i 2 T i appears also in S ) T i s operations in S are ordered the same way as in T i Goal: produce a correct schedule with maximal parallelism David Toman (University of Waterloo) Transaction Execution 4 / 20

Transactions and Schedules If T i and T j are concurrent transactions, then it is always correct to schedule the operations in such a way that: T i will appear to precede T j meaning that T j will see all updates made by T i, and T i will not see any updates made by T j, or T i will appear to follow T j, meaning that T i will see T j s updates and T j will not see T i s. Idea how to define Correctness: it must appear as if the transactions have been executed sequentially (in some serial order). David Toman (University of Waterloo) Transaction Execution 5 / 20

Serializable Schedules Definition An execution of is said to be serializable if it is equivalent to a serial execution of the same transactions. Example: An interleaved execution of two transactions: S a = w 1 [x ] r 2 [x ] w 1 [y] r 2 [y] An equivalent serial execution (T 1, T 2 ): S b = w 1 [x ] w 1 [y] r 2 [x ] r 2 [y] An interleaved execution with no equivalent serial execution: S c = w 1 [x ] r 2 [x ] r 2 [y] w 1 [y] David Toman (University of Waterloo) Transaction Execution 6 / 20

Conflict Equivalence How do we determine if two schedules are equivalent? Conflict Equivalence: ) cannot be based on any particular database instance two operations conflict if they (1) belong to different transactions (2) access the same data item x (3) at least one of them is a write operation w [x ]. we require that in two conflict-equivalent histories all conflicting operations are ordered the same way. yields conflict-serializable schedules ) conflict-equivalent to a serial schedule View Equivalence: allows more schedules, but it is harder (NP-hard) to compute David Toman (University of Waterloo) Transaction Execution 7 / 20

Other Properties of Schedules Serializability guarantees correctness. However, we d like to avoid other unpleasant situations. Recoverable Schedules: (RC) transaction T j reads a value T i has written, T j succeeds to commit, and T i tries to abort (in this order) ) to abort T 2 we need to undo effects of a committed transaction T 1. ) commits only in order of the read-from dependency Cascadeless Schedules (ACA): if T j above didn t commit we can abort it: may lead to cascading aborts of many transactions ) no reading of uncommitted data David Toman (University of Waterloo) Transaction Execution 8 / 20

How to Get a Serializable Schedule? So how do we build schedulers that produce serializable and cascadeless schedules? The scheduler receives requests from the query processor(s). For each operation it chooses one of the following actions: execute it (by sending to a lower module), delay it (by inserting in some queue), or reject it (thereby causing abort of the transaction) ignore it (as it has no effect) Two main kinds of schedulers: ) conservative (favors delaying operations) ) aggressive (favors rejecting operations) David Toman (University of Waterloo) Transaction Execution 9 / 20

Two Phase Locking (2PL) Transactions must have a lock on objects before access: a shared lock is required to read an object an exclusive lock is required to write an object It is insufficient just to acquire a lock, access the data item, and then release it immediately... 2PL Protocol A transaction has to acquire all locks before it releases any of them. Theorem Two-phase locking guarantees that the produced transaction schedules are (conflict) serializable. In practice: STRICT 2PL (locks held till commit; this guarantees ACA) David Toman (University of Waterloo) Transaction Execution 10 / 20

Deadlocks and What to do With 2PL we may end with a deadlock: r 1 [x ]; r 2 [y]; w 2 [x ](blocked by T 1 ); w 1 [y](blocked by T 2 ) How do we deal with this: deadlock prevention: ) locks granted only if they can t lead to a deadlock. ) ordered data items and locks granted in this order. deadlock detection: ) wait for graphs and cycle detection. ) resolution: the system aborts one of the offending transactions (involuntary abort). in practice: detection (or often just a timeout) and abort David Toman (University of Waterloo) Transaction Execution 11 / 20

Variations on Locking Multi-granularity Locking ) not all locked objects have the same size ) advantageous in presence of bulk vs. tiny updates Predicate Locking ) locks based on selection predicate rather than on a value Tree Locking ) tries to avoid congestion in roots of (B-)trees ) allows relaxation of 2PL due to tree structure of data Lock Upgrade protocols... David Toman (University of Waterloo) Transaction Execution 12 / 20

Inserts and Deletes We have been assuming a fixed set of data items. ) what if we try to insert or delete an item? does plain 2PL (correctly) handle this situation? NO: ) one transaction tries to count records in a table ) second transactions adds/ deletes a record this situation is called the phantom problem. Solution: operations that ask for all records have to lock against insertion/deletion of a qualifying record ) locks on tables ) index locking and other techniques David Toman (University of Waterloo) Transaction Execution 13 / 20

Isolation Levels in SQL The guarantee of serializable executions may carries a heavy price. Performance may be poor because of blocked transactions and deadlocks. Four isolation levels are supported: Level 3: (Serializability) ) essentially table-level strict 2PL Level 2: (Repeatable Read) ) tuple-level strict 2PL; phantom tuples may occur Level 1: (Cursor Stability) ) tuple-level exclusive-lock only strict 2PL reading the same object twice: different values Level 0: ) neither read nor write locks are acquired ) transaction may read uncommitted updates David Toman (University of Waterloo) Transaction Execution 14 / 20

Recovery: Goals and Setting Two goals: 1 allow transactions to be committed (with a guarantee that the effects are permanent) or aborted(with a guarantee that the effects disappear) 2 allow the database to be recovered to a consistent state in case on HW/power/... failure. Input: a 2PL, ACA schedule of operations produced by TM. Output: a schedule of reads/writes/forced writes. David Toman (University of Waterloo) Transaction Execution 15 / 20

Approaches to Recovery Two essential approaches: 1 Shadowing ) copy-on-write and merge-on-commit approach ) poor clustering ) used in system R, but not in modern systems 2 Logging ) use of LOG (separate disk) to avoid forced writes ) good utilization of buffers ) preserves original clusters David Toman (University of Waterloo) Transaction Execution 16 / 20

Log-Based Approaches A log is a read/append only data structure (a file) ) transactions add log records about what they do Log records contain several types of information: UNDO information: old versions of objects that have been modified by a transaction. UNDO information can be used to undo database changes made by a transaction that aborts. REDO information: new versions of objects that have been modified by a transaction. REDO records can be used to redo the work done by a transaction that commits. BEGIN/COMMIT/ABORT records are recorded whenever a transaction begins, commits, or aborts. David Toman (University of Waterloo) Transaction Execution 17 / 20

Example of a LOG log head! T 0,begin (oldest part) T 0,X,99,100 T 1,begin T 1,Y,199,200 T 2,begin T 2,Z,51,50 T 1,M,1000,10 T 1,commit T 3,begin T 2,abort T 3,Y,200,50 T 4,begin (newest part) T 4,M,10,100 log tail! T 3,commit David Toman (University of Waterloo) Transaction Execution 18 / 20

Write-Ahead Logging (WAL) How do we make sure the LOG is consistent with the main database? Write-Ahead Logging (WAL) approach requires: 1 UNDO rule: a log record for an update is written to log disk before the corresponding data (page) is written to the main disk (guarantees Atomicity) 2 REDO rule: all log records for a transaction are written to log disk before commit (guarantees Durability) David Toman (University of Waterloo) Transaction Execution 19 / 20

Summary ACID properties of transactions guarantee correctness of concurrent access to the database and of data storage. consistency and isolation based on serializability ) leads to definition of correct schedulers ) responsibility of the transaction manager durability and atomicity ) responsibility of the recovery manager ) synchronous writing is too inefficient replaced by synchronous writes to a LOG and WAL David Toman (University of Waterloo) Transaction Execution 20 / 20