Concepts Agenda Database Concepts Overview ging, REDO and UNDO Two Phase Distributed Processing Dr. Nick Bowen, VP UNIX and xseries SW Development October 17, 2003 Yale Oct 2003 Database System ACID index / 1 / 2.. / N GER Lock Mgr Buffer Supply Database Part # part name quantity index Customer Database Name=Bob Balance=21 Atomicity Consistency Durability All or nothing undo aborted redo committed coordinate with distributed A correct transformation of the source Aborting any transactions that fail to pass R.M. consistency tests Once done, it is done Forcing all log records of committed transactions to durable memory Redoing any recently committed work at restart File manager File manager Isolation Individual transactions do not bump into one another DB101 : DBM ecommerce Supply DB (inventory) decrement Billing - create bill Shipping - create order for factory for ship Failure reasons (3) Normally just overhead it all works Programming style error, e.g. abort changes System crashes, credit card authorized failure, DB crashes DO-UNDO-REDO Protocol Programming style for resource managers Structure each operation ( DO ) so it can be undone or redone Any operation should perform The actual operation ( DO ) A log record An UNDO program A REDO program 1
DO-UNDO-REDO Protocol DO: Normal Operations OLD DO NEW Begin() - Decrement widget (22->21) - Increment Bob s Balance (21->30) () Q=21 Bob B=30 UNDO: Back out changes NEW UNDO OLD Supply DB Block X Part #=22 Part Name=Widget Quantity=22 Customer DB Block Y Name=Bob Balance=21 Name=Fred Balance=10 REDO: Ensure changes OLD REDO NEW Begin_= 123 UNDO/REDO Widget OLD=22, NEW=21 REDO/UNDO Quantity OLD=21, NEW=30 = 123-2 is a critical component [x] [y] [z] [w] Buffer manager replaces page -2 Very asynchronous lots of I/O must occur concurrently Allows regular I/O to be asynchronous or deferred Buffer manager page -2 Start [x] -2 [z] -3 [w] Memory Numerous changes s can be the bridge between these Database Disk Blocks s (continued) Data Flow in a System Restart force at every would be a serial bottleneck If lots of s are concurrently committing (approximately) a log force can handle lots of s WADS concept tx 1 tx 2 tx 3 lots of transactions ( often duplexed) Physical log files 1. Find checkpoint 2. Read log forward 3. REDO each operation At END: 1. UNDO un-commit transactions REDO ( Requests) UNDO () 2
Normal (no failure) execution of an application interacting with a resource manager Begin_work() _work() TRID Normal functions Lock requests phase 1? YES/NO phase 2? ACK Join_work Lock Write_commit() record and a force log The Heart of the Issue: Force of Prepare & Begin_work() _work() Begin_= 123 TRID Normal functions phase 1? YES/NO phase 2? ACK UNDO/REDO Widget OLD=22, NEW=21 Lock requests Join_work Lock Write_commit() record and a force log Answer YES to Phase 1: prepared to OR Abort REDO/UNDO Quantity OLD=21, NEW=30 = 123 Which means forcing out all REDO & UNDO records. Two Phase We have shown two phase for one system, one RM Can be extended: Multiple RM's on a single system Multiple RM's on distributed systems Two phase commit: Making computations atomic asks RM s to commit Can be multiple ones Can be on different systems Each Vote YES or NO Vote YES means Prepared to commit or Abort proceeds to commit if all YES Forces End Phase 1 log record This is the atomic instance For example Restart Processing wants to commit TRID=555 1. must begin a 2. RM s must tell they want to join a transaction. 3. Major Stages: Phase 1 Phase 2 Prepare Decide Complete Ask each RM to vote: YES/NO mgr logs a commit record (if all yes) to durable storage. This is a precise commit point, when the bits hit the disk. tells RM s about decision Once all RM s ACK commit, write a completion record manager finds all Begin_work with no commit record => s to abort Begin_work with commit => RM s must be told with no complete about s Begin_work + commit + complete => Ignore RM Restart Must find work either process an abort (UNDO) or ensure everything is done (REDO) 3
Why is this complicated? between two products Begin RM 1 RM 2 MQ Series between distributed systems Distributed s Root s manager where application performs original begin_work Communication RM A RM A CRM B RM B A B Can get complicated Distributed Two Phase A B C A B C O 1!!! O 2 Local prepare call local RM s Distributed prepare send prepares on each outgoing session Decide all local RM s vote YES & outgoing sessions vote YES tell RM s & outgoing session Complete - same A coordinates A B C D E B, D, E participate B coordinates for C Prepare now has a time-out record has names of distributed RM s who need to know about phase 2 commit In doubt of CRM voted YES, but does not know. Must poll upstream Mgr Presumed abort protocol (optimization: coordinator is queried on that is not active, prepared, or committed, then assume aborted) presumption that no record found == abort participant sends ACK only when the completion state is durable coordinator only sends records the completion log when all commit messages have been acknowledged with A 4
Coordinator Participant Putting it All Together Active Prepared ting Write commit (force) Prepare Yes ACK Local Prepare Write prepare (force) Local Work Write completion (Lazy) ACK when durable Active Prepared ting Basic transaction & database concepts Normal commit processing Importance of the Two Phase Extending for Distributed Processing Further Reading Processing: Concepts and Techniques, Jim Gray and Andreas Reuter. Chapter 10 Concepts pages 529-582 Note - - 1000 other pages!!!! Write completion record in (Lazy) ted Force delays response time Not good for contention (lock held time goes up) Required to make protocol work Any failure ted 5