CS 245 Final Exam Winter 2013

Similar documents
Recovery System C H A P T E R16. Practice Exercises

Databases and Information Systems 1 Part 3: Storage Structures and Indices

CSE 444 Midterm Test

Concurrency Control. Chapter 17. Comp 521 Files and Databases Fall

Lecture 7: Concurrency control. Rasmus Pagh

Transaction Management Overview

Chapter 15: Recovery System

Database Tuning and Physical Design: Execution of Transactions

! Volatile storage: ! Nonvolatile storage:

UVA. Failure and Recovery. Failure and inconsistency. - transaction failures - system failures - media failures. Principle of recovery

Introduction to Database Management Systems

Recovery and the ACID properties CMPUT 391: Implementing Durability Recovery Manager Atomicity Durability

Concurrency Control. Module 6, Lectures 1 and 2

How To Recover From Failure In A Relational Database System

Recovery: An Intro to ARIES Based on SKS 17. Instructor: Randal Burns Lecture for April 1, 2002 Computer Science Johns Hopkins University

Agenda. Transaction Manager Concepts ACID. DO-UNDO-REDO Protocol DB101

Crashes and Recovery. Write-ahead logging

The Oracle Universal Server Buffer Manager

Chapter 16: Recovery System

COS 318: Operating Systems

PSYCH 105S: General Psychology Tuesdays and Thursdays 11:00 12:50 CERAS, Room 300

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

Last Class Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications

CS 525 Advanced Database Organization - Spring 2013 Mon + Wed 3:15-4:30 PM, Room: Wishnick Hall 113

(Pessimistic) Timestamp Ordering. Rules for read and write Operations. Pessimistic Timestamp Ordering. Write Operations and Timestamps

Recovery Theory. Storage Types. Failure Types. Theory of Recovery. Volatile storage main memory, which does not survive crashes.

Textbook and References

Crash Recovery. Chapter 18. Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke

Windows NT File System. Outline. Hardware Basics. Ausgewählte Betriebssysteme Institut Betriebssysteme Fakultät Informatik

Redo Recovery after System Crashes

Unit 12 Database Recovery

Chapter 14: Recovery System

Course Content. Transactions and Concurrency Control. Objectives of Lecture 4 Transactions and Concurrency Control

Lecture 18: Reliable Storage

Recovery Principles in MySQL Cluster 5.1

Chapter 10. Backup and Recovery

Outline. Windows NT File System. Hardware Basics. Win2K File System Formats. NTFS Cluster Sizes NTFS

Transactions and Recovery. Database Systems Lecture 15 Natasha Alechina

Review: The ACID properties

Recovery: Write-Ahead Logging

Lecture 1: Data Storage & Index

Restore and Recovery Tasks. Copyright 2009, Oracle. All rights reserved.

Concurrency Control: Locking, Optimistic, Degrees of Consistency

PostgreSQL Concurrency Issues

Outline. Failure Types

Information Systems. Computer Science Department ETH Zurich Spring 2012

Synchronization and recovery in a client-server storage system

CIS 631 Database Management Systems Sample Final Exam

Technical Overview Simple, Scalable, Object Storage Software

Concurrency control. Concurrency problems. Database Management System

Transactional Information Systems:

Understanding Backup and Recovery Methods

Chapter 10: Distributed DBMS Reliability

INTRODUCTION TO DATABASE SYSTEMS

Physical Data Organization

Transactions, Views, Indexes. Controlling Concurrent Behavior Virtual and Materialized Views Speeding Accesses to Data

Storage and File Structure

Failure Recovery Himanshu Gupta CSE 532-Recovery-1

Lesson 12: Recovery System DBMS Architectures

DBMS Recovery Procedures Frederick Gallegos Daniel Manson

AN STUDY ON DATA CONSISTENCY IN SPATIAL DATABASE SYSTEM

Homework 8. Revision : 2015/04/14 08:13

Dr Markus Hagenbuchner CSCI319. Distributed Systems

Microkernels & Database OSs. Recovery Management in QuickSilver. DB folks: Stonebraker81. Very different philosophies

B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers

CS 155 Final Exam. CS 155: Spring 2013 June 11, 2013

DBMS Questions. 3.) For which two constraints are indexes created when the constraint is added?

Query Processing C H A P T E R12. Practice Exercises

DATABASDESIGN FÖR INGENJÖRER - 1DL124

High-Performance Concurrency Control Mechanisms for Main-Memory Databases

Flowcharting, pseudocoding, and process design

æ A collection of interrelated and persistent data èusually referred to as the database èdbèè.

Overview of Storage and Indexing

DATABASE DESIGN - 1DL400

3. PGCluster. There are two formal PGCluster Web sites.

Data Management in the Cloud

Recovery Protocols For Flash File Systems

Topics. Distributed Databases. Desirable Properties. Introduction. Distributed DBMS Architectures. Types of Distributed Databases

Distributed Databases

Database Concurrency Control and Recovery. Simple database model

Roadmap DB Sys. Design & Impl. Detailed Roadmap. Paper. Transactions - dfn. Reminders: Locking and Consistency

Distributed Architectures. Distributed Databases. Distributed Databases. Distributed Databases

Oracle Database Links Part 2 - Distributed Transactions Written and presented by Joel Goodman October 15th 2009

Derby: Replication and Availability

RARITAN VALLEY COMMUNITY COLLEGE COMPUTER SCIENCE (CS) DEPARTMENT. CISY-294, Oracle: Database Administration Fundamentals Part I

Symantec Endpoint Encryption Full Disk

SQL Tables, Keys, Views, Indexes

Transactions. SET08104 Database Systems. Napier University

Percona Server features for OpenStack and Trove Ops

CS435 Introduction to Big Data

DISTRIBUTED AND PARALLELL DATABASE

Recovery algorithms are techniques to ensure transaction atomicity and durability despite failures. Two main approaches in recovery process

Distributed Database Management Systems

Full and Complete Binary Trees

Transcription:

CS 245 Final Exam Winter 2013 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have 140 minutes (2 hours, 20 minutes) to complete it. Print your name: The Honor Code is an undertaking of the students, individually and collectively: 1. that they will not give or receive aid in examinations; that they will not give or receive unpermitted aid in class work, in the preparation of reports, or in any other work that is to be used by the instructor as the basis of grading; 2. that they will do their share and take an active part in seeing to it that others as well as themselves uphold the spirit and letter of the Honor Code. The faculty on its part manifests its confidence in the honor of its students by refraining from proctoring examinations and from taking unusual and unreasonable precautions to prevent the forms of dishonesty mentioned above. The faculty will also avoid, as far as practicable, academic procedures that create temptations to violate the Honor Code. While the faculty alone has the right and obligation to set academic requirements, the students and faculty will work together to establish optimal conditions for honorable academic work. I acknowledge and accept the Honor Code. Signed: Problem Points Maximum 1 10 2 10 3 10 4 10 5 10 6 10 7 10 Total 70 1

Problem 1 (10 points) (a) Consider the following schedule for transactions T 1, T 2, T 3 : R 1 (A)W 1(C)R 2 (B)W 3 (C)W 1 (A)R 3 (B)R 1 (A)W 2 (B)R 2 (C)W 3 (B)R 3 (A) Below are some pairs of transactions. Say yes if the first transaction in the pair must precede the second one in any equivalent serial schedule (if it exists), and no otherwise. T 1, T 2 : T 3, T 2 : T 2, T 3 : T 3, T 1 : (b) Consider a schedule: S 1 : R 1 (B)W 3 (A)R 2 (C)W 2 (C)R 3 (C)W 1 (B)R 3 (B)W 3 (C) Is S 1 serializable? : Is S 1 2-phase lockable? : Consider a schedule: S 2 : R 1 (A)W 1 (A)R 2 (B)W 2 (C)R 2 (A)R 3 (C)W 3 (B)R 1 (B) Is S 2 serializable? : Is S 2 2-phase lockable? : 2

Problem 2 (10 points) Consider a multi-granularity locking system, with lock modes S,X,IS,IX,SIX. The object hierarchy is as follows: There is a root R, with blocks A and B under it. Block A contains records a1 and a2, while block B contains records b1 and b2. (a) The next four questions give a pair of locks each, answer yes if that pair of locks is compatible, and no otherwise. If a lock is on something other than the root, assume that other suitable locks on parents have been taken, and consider compatibility with those other locks as well. For example, S on B is considered incompatible with X on b1, since the X on b1 will require IX on B. S on b1, X on B : IX on R, IX on R : SIX on A, X on a2 : IS on A, S on a1 : (b) For the next three questions, give the list of locks (both lock type and object being locked) required to do the operation mentioned. Start with the lock on the root, and go downwards in the hierarchy. The answer should have a format like: IS(R),IS(A),S(a1). Modify a1 : Insert a record into B : Scan A (read all records in A): 3

Problem 3 (10 points) Consider a database that has six elements - A, B, C, D, E and F. The initial values of the elements are: A = 10, B = 20, C = 30, D = 40, E = 50, F = 60 Let there be three transactions U, V and W that modify these elements concurrently. U: A := 5, B:= 15, D := 30 V: C := 25 W: E := 35, F:=45 While the elements are being modified by the transactions, the database system crashes. The recovery mechanism depends on the logging scheme we use. In the questions below, we present the contents of the log at the time of crash and state the logging scheme used. (For each log entry we give the relevant data, as discussed in class examples.) For each question, your task is to identify if a combination of values of the elements is possible in the disk at the time of crash and to help recover the database elements. (a) Pure undo logging is used. < ST ART U > < U, A, 10 > < ST ART V > < U, B, 20 > < V, C, 30 > < COMMIT V > < ST ART CKP T (U) > < ST ART W > < U, D, 40 > Is the following state of database elements (on disk) possible at the time of crash? A = 5, B = 20, C = 25, D = 40, E = 50, F = 60 : (True/ False) What are the values of the database elements after a successful recovery? A B C D E F 4

(b) Pure redo logging < ST ART U > < U, A, 5 > < ST ART V > < U, B, 15 > < V, C, 25 > < COMMIT V > < ST ART CKP T (U) > < ST ART W > < U, D, 30 > < W, E, 35 > < COMMIT W > Is the following state of database elements (on disk) possible at the time of crash? A = 5, B = 15, C = 25, D = 30, E = 35, F = 60 : (True/ False) What are the values of the database elements after a successful recovery? A B C D E F (c) Undo-redo logging < ST ART U > < U, A, 10, 5 > < ST ART V > < U, B, 20, 15 > < V, C, 10, 25 > < COMMIT V > < ST ART CKP T (U) > < ST ART W > < U, D, 40, 30 > < W, E, 50, 35 > < END CKP T > < COMMIT U > < W, F, 60, 45 > Is the following state of database elements (on disk) possible at the time of crash? A = 10, B = 20, C = 25, D = 40, E = 50, F = 60 : (True/ False) What are the values of the database elements after a successful recovery? A B C D E F 5

Problem 4 (10 points) Consider a database with six elements A, B, C, D, E and F. There are five transactions T, U, V, W and X that read and write to these database elements. The times at which the transactions start, try to validate and finish are as in the diagram below. The read and write sets of transactions T, U, W and V are also indicated. U RS = {A, C} WS = {D} W RS = {A} WS = {A, D} X RS =? WS =? Ο Ο Ο Ο Ο T RS = {D} WS = {B, C} V RS = {E} WS = {A, F} = start = validate Ο = finish (a) Does the transaction U validate?(yes/ No) (b) Does the transaction V validate?(yes/ No) (c) Does the transaction W validate?(yes/ No) (d) If we know that the transaction X validates, give the list of possible elements in the read set and write set of X. Elements that could be in the read set of X Elements that could be in the write set of X 6

Problem 5 (10 points) Consider a select query over relation R(A, B, C) with condition (A = a) (B = b). interested in estimating the number of resulting tuples. We are (a) Using the simple statistics covered in class, T (R), V (R, A) and V (R, B), what is the expected number of tuples in the result, E 1? Expected result size, E 1 = (b) Suppose that in addition we maintain V (R, A, B), the number of distinct A and B value pairs that appear in R. For example, if R contains these three tuples (and only these tuples): (1,2,20), (1,2,21), (2,3,20), (2,5,40), (3, 8,20), (3,8,45), (3,8,55), then V (R, A, B) = 4. With this new statistic, what is a better estimate for the result size, E 2? (This new estimate handles the correlation between A and B better than the previous estimate.) Expected result size, E 2 = (c) The ratio E 2 /E 1 tells us how much larger the improved estimate is. For example, if E 2 /E 1 = 5, then the improved estimate is 5 times larger than the old estimate. Write expressions (as a function of T (R), V (R, A) and V (R, B)) for the minimum and maximum values E 2 /E 1 can take. Minimum value for E 2 /E 1 = Maximum value for E 2 /E 1 = (d) Is it easier to maintain V (R, A) and V (R, B) or the new statistic V (R, A, B)? Circle easier values to maintain: (V (R, A) and V (R, B)) or (V (R, A, B)) 7

Problem 6 (10 points) You have implemented an extensible hash index, and you want to make it perform well when concurrent transactions access the index. The following pseudo-code implements your extensible hash index. There are two calls you handle, findkey and insertrecord. (Ignore deletions for this problem.) Variable d represents the directory. Procedure findkey(k) -> record r [ b <- findbucket(d, k); r <- findrecord(b,k);] Procedure insertrecord(d, k, r) [ b <- findbucket(d, k); if istherespace(b)= true then insert(b, k, r) else [ doubledirectory(d); splitbucket(k, b); b <- findbucket(d, k); insert(b, k, r) ] ] The meaning of the functions in this pseudo-code should be clear to anyone who has taken CS245. For instance, istherespace(b) returns true if there is room in bucket b to insert one more key. Our goal now is to add lock and unlock actions to this code so that we follow the tree locking protocol. In our case the tree has only two levels: The root is the directory d and the leaves are the hash buckets. (Assume that a bucket and all of its overflow buckets constitute a single tree leaf node.) We have available the following locking operations: RL(n): read (share) lock tree node n. WL(n): write (exclusive) lock tree node n. U(n): unlock tree node n (works for either read or write locks) In the code on the next page please add the locking actions that would implement the tree locking protocol. You want to be as efficient as possible, so do not write lock a node if you can get away with a read lock. Also, release locks as soon as it is safe to do so. For simplicity you do not have to show locks for a data record r. Finally, note that some of the black lines next page may need no lock, while others may need more than one lock action. If there is more than one lock on a line, write the actions in the correct order (left to right). 8

Procedure findkey(k) -> record r [ b <- findbucket(d, k); r <- findrecord(b,k); ] Procedure insertrecord(d, k, r) [ b <- findbucket(d, k); if istherespace(b)= true then [ insert(b, k, r) ] else [ doubledirectory(d); splitbucket(k, b); b <- findbucket(d, k); insert(b, k, r) ] ] 9

Problem 7 (10 points) Consider logical action logging and the recovery strategy discussed in class. The recovery strategy is reproduced at the end of this problem (Notes 10, slides 73-74). Note that step numbers have been added (e.g., 1.1, 2.1), so you can refer to them in your answers below. (a) Say you want to only record undo logical actions in the log. (You will not write any redo log entries in the log.) Which of the following rules will have to be enforced as you process transactions? (Your answer must include all relevant rules, and may include any number of these rules.) (R1) Flush database actions by transaction T to disk only after the T commit record has been written to the log. (R2) Before transaction T s commit record is written to the log, all of T s actions must have been flushed to the database. (R3) The log record for an action of T is only written to disk after the action has been reflected on the database on disk. (R4) The log record for an action of T must be written to disk before the action is reflected on the database on disk. (R5) Before the T commit record is written to the log, all of the action log records of T must be written to the log. (R6) The T commit record must precede all of the action log records of T in the log. Rules to enforce with pure undo logical logging: (b) Is it possible to simplify the general recovery strategy for the case where we only do undo logging? In your answer specify which steps (1.1, 1.2, 1.3, 2.1, 2.2, 2.3) if any can be safely removed. (You can of course assume that the rules you indicated in part (a) are enforced.) Steps that can be removed with pure undo logical logging: (c) Questions (c) and (d) are analogous to questions (a) and (b), but now for the pure redo logging case. Say you want to only record redo logical actions in the log. (You will not write any undo log entries in the log.) Which of the following rules will have to be enforced as you process transactions? (Your answer must include all relevant rules, and may include any number of these rules.) (R1) Flush database actions by transaction T to disk only after the T commit record has been written to the log. (R2) Before transaction T s commit record is written to the log, all of T s actions must have been flushed to the database. (R3) The log record for an action of T is only written to disk after the action has been reflected on the database on disk. 10

(R4) The log record for an action of T must be written to disk before the action is reflected on the database on disk. (R5) Before the T commit record is written to the log, all of the action log records of T must be written to the log. (R6) The T commit record must precede all of the action log records of T in the log. Rules to enforce with pure redo logical logging: (d) Is it possible to simplify the general recovery strategy for the case where we only do redo logging? In your answer specify which steps (1.1, 1.2, 1.3, 2.1, 2.2, 2.3) if any can be safely removed. (You can of course assume that the rules you indicated in part (c) are enforced.) Steps that can be removed with pure redo logical logging: Recovery Strategy ********* FOR REFERENCE ***************** [1] Reconstruct state at time of crash [1.1] Find latest valid checkpoint, Ck, and let ac be its set of active transactions Scan log from Ck to end: [1.2] For each log entry [lsn, page] do: if lsn(page) < lsn then redo action [1.3] If log entry is start or commit, update ac [2] Abort uncommitted transactions [2.1] Set ac contains transactions to abort [2.2] Scan log from end to Ck : For each log entry (not undo) of an ac transaction, undo action (making log entry) [2.3] For ac transactions not fully aborted, read their log entries older than Ck an undo their actions 11