Database High Availability
DB2 9 DBA certification Solutions 2010 exam 731
P.O. Box 200, 5520 AE Eersel, The Netherlands
Tel.: (+31) 497-530190, Fax: (+31) 497-530191
E-mail: kbrant@kbce.nl

Disclaimer
The information contained in this presentation is based on techniques, algorithms, and documentation published by several authors and companies, and in addition is the result of research. It is therefore subject to change at any time without notice or warning. The information contained in this presentation has not been submitted to any formal tests or review and is distributed on an "as is" basis without any warranty, either expressed or implied. The use of this information or the implementation of any of these techniques is a client responsibility and depends on the client's ability to evaluate and integrate them into the client's operational environment. While each item may have been reviewed for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Clients attempting to adapt these techniques to their own environments do so at their own risk. Foils, handouts, and additional materials distributed as part of this presentation or seminar should be reviewed in their entirety.

Note: This presentation gives you an overview of techniques used by database vendors. It cannot be used for making company decisions regarding high availability without further studies.

Copyright KBCE b.v., 2010 All Intellectual Rights Reserved
Trademarks
This presentation contains many trademarks in use by database vendors. Where we are aware of a trademark, we put it in capitals.

Agenda
- What is downtime
- Techniques in use
- SQL Server Cluster
- DB2 Data Sharing / PureScale vs Oracle RAC
- Wise Words
What is downtime?

Terminology in use

Term                Business risk                                   Solution
Data Recovery       Downtime and data loss                          Redundant data
High Availability   Downtime                                        Redundant system components
Disaster Recovery   Permanent data loss and "unable to continue"    Redundant systems and facilities

Not investing in hardware, software and knowledge means a potentially high risk of downtime and data loss.
Permitted downtime?
About downtime and data loss

Uptime SLA   Downtime per year   Downtime per month
99.9%        8.76 hours          43.8 minutes
99.99%       52.6 minutes        4.38 minutes
99.999%      5.26 minutes        0.438 minutes

- Acceptable data/transaction loss (if any)?
- Mean time to recovery?
- Difference between "normal down" and disaster?
- How much damage is done after how much time ($$)?
- Who is first in case of disaster?

Note: database uptime is not the same as application availability
- Application failures
- Hardware outages (power, network, etc.)

What is causing downtime?
Downtime
- Unplanned down: Hardware Failure, Data Corruption
- Planned down: Data / Appl Changes, System Upgrades
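The figures in the uptime SLA table follow from simple arithmetic. The sketch below reproduces them, assuming a year of 365 days and a month of exactly one twelfth of a year; the function name is only illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, leap years ignored (assumption)


def permitted_downtime_minutes(uptime_percent):
    """Return (minutes per year, minutes per month) of permitted downtime."""
    unavailable_fraction = 1 - uptime_percent / 100
    per_year = unavailable_fraction * HOURS_PER_YEAR * 60
    per_month = per_year / 12  # a month taken as 1/12 of a year
    return per_year, per_month


if __name__ == "__main__":
    for sla in (99.9, 99.99, 99.999):
        per_year, per_month = permitted_downtime_minutes(sla)
        print(f"{sla}% uptime: {per_year:.2f} min/year, {per_month:.3f} min/month")
```

Each extra "nine" divides the permitted downtime by ten, which is why the step from 99.9% to 99.999% usually asks for a different class of solution rather than a tuned version of the same one.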
Unplanned: Hardware Failure
- Storage subsystem
  - Disk or controller
  - Firmware or driver problem
- Network
  - Often causes a partial down (difficult to measure)
  - Often relies on a third party (SLA!)
- Server
  - Cluster support: Oracle RAC / Microsoft MSCS / IBM Sysplex and PureScale
  - Where is the backup / cluster server?
  - Can a virtual server be a solution?
- Power outage
  - Environment change, too many requests (unstable grid) etc.
  - Third party, difficult to put under an SLA

Unplanned: Site failure
- Site failure
  - Complete server room down (e.g. fire)
  - Can always happen because you depend on external parties (e.g. power)
- More than just a database problem
  - All data and hardware are involved
- Can you handle it?
  - Network changes
  - Workload (different config)
  - Fail-back situation
- Isn't there a hidden Single Point of Failure?
  - E.g. glass fiber backup in the same bundle
Unplanned: Data Corruption
- Human error
  - Biggest problem
  - Difficult because there are many scenarios, each needing a different solution
- Logical corruption
  - Difficult to detect (sometimes strange data emerges after years)
  - Can you detect the cause (e.g. a program) and how much data is affected?
  - Special techniques to go back in time and select data again
  - Can you re-process the data / transactions?

Planned: Data / Application changes
- Application upgrades
  - Schema changes are still difficult
    - Still a market for vendor tools
    - DB2 versioning has a performance impact
  - Running systems need to shut down in order to reload
    - E.g. middleware transaction refresh
  - How are we testing / accepting this?
  - If the test fails, how do we undo the change?
- Data maintenance
  - Offline REORG (sometimes needed)
  - Roll-in / roll-out of data
Planned: System Upgrades
- Hardware upgrades
  - Growth
  - End of life cycle / vendor support
  - Redundant is not always hot-swap
- Software upgrades
  - Operating system, middleware, DBMS etc.
  - Wise to combine upgrades?
  - What if the new combination is not stable?
- How to respond to vendor patches
  - Needed? What if we don't?
  - Policy? One size fits all?

Down vs Solution
[Figure: Down vs Solution]
[Figure axis: Time Needed]

Most common down
- Partial down
  - The system works, but certain functions are unavailable
  - Examples:
    - Transactions with certain input abort
    - Certain locations cannot connect
    - Single database corrupt
- End-user to database goes through many layers
  - How to report?
- A layered approach can buy time during downtime
  - Let the user work with the front-end as if it is real-time
  - Bring down the back-end for maintenance and queue requests
  - E.g. online banking without full history or real-time transactions

How long does it take to fix it
Breakdown of recovery
- For many problems the analysis takes a (very) long time
  - Human errors
  - Corruptions
- Many companies suffer from a knowledge problem
- How to fix it
  - Creating the scenario
  - Testing the scenario?
- Speed of the recovery itself
  - Best parameters
  - Tools
[Chart: Recovery Time]
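The layered approach from the "Most common down" slide (the front-end keeps accepting work while the back-end is down for maintenance) can be sketched as follows. This is a minimal in-memory illustration, not a product design: the class and method names are hypothetical, and a real system would need a durable queue and a way to report delayed results to the user.

```python
from collections import deque


class QueueingFrontEnd:
    """Front-end that queues requests while the back-end is in maintenance."""

    def __init__(self, backend):
        self.backend = backend      # callable that applies one request
        self.backend_up = True
        self.pending = deque()      # requests waiting for the back-end

    def submit(self, request):
        if self.backend_up:
            self.backend(request)
            return "applied"
        self.pending.append(request)  # keep arrival order for later replay
        return "queued"

    def start_maintenance(self):
        self.backend_up = False

    def end_maintenance(self):
        self.backend_up = True
        while self.pending:           # replay queued work in arrival order
            self.backend(self.pending.popleft())


if __name__ == "__main__":
    applied = []
    front_end = QueueingFrontEnd(applied.append)
    front_end.submit("transfer 1")
    front_end.start_maintenance()
    print(front_end.submit("transfer 2"))
    front_end.end_maintenance()
    print(applied)
```

The price of this design is that the user sees data that is not fully up to date during the maintenance window, which is the "without full history or real-time transactions" trade-off of the online banking example.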
Techniques in use to minimize downtime

Traditional backup types
- Database backups
  - Full backup
    - Online / offline (z/OS Sharelevel)
  - Incremental & differential backup
  - Include log backup
    - Anything other than a full backup is a substitute for the log!
- Disk is better than tape
  - First back up to disk (separate physical disk volume)
  - Detect exceptions encountered during backup
  - Verify backup files
  - Copy backup files to tape, remote disk or storage manager (TSM)
- Data retention policy for backup files
  - What are you going to do with these backups?
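One small part of "verify backup files" is checking that the copy sent to tape, remote disk or a storage manager is bit-for-bit identical to the disk backup. The sketch below does this with SHA-256 checksums from the Python standard library. It is an assumption-laden illustration: the function names are hypothetical, and a matching checksum only proves the copy is intact, not that the DBMS can actually restore from it.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large backup files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_matches(original_path, copy_path):
    """True when the copied backup file has the same content as the original."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

A restore test on a separate system remains the only real proof that a backup is usable.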
Backup Retention Policy
- Location of backup files
- Duration of retention
- Protection of sensitive data!
  - Sarbanes-Oxley (SOX)
  - HIPAA
  - Internal policies for data management and protection
- Access to backups from offsite data storage
  - Often the weakest link in security scenarios

Snapshot technology
- Two techniques
  - Real disk mirroring
  - Share the disk after the snap until an update is done
- How useful is the snapped disk?
  - Was the DBMS aware of the snap? Was it up?
  - Did the DBMS participate in the snap?
- A snap can be extremely useful
  - As a backup
  - As a fallback (e.g. after a failed upgrade)
- Be careful
  - Not all scenarios work on the snapped copy
Rebuild database (restore & rollforward)
- DB2 LUW is more flexible than z/OS
  - Include log with backup
  - Differential and delta backups
- Used for:
  - Redirected restore (problem investigation)
  - Simple scenarios which allow for downtime
  - As a safety net when all else fails
- Less safe than you might imagine
  - Backup just reads files (does not analyze them)

Replication
- Create an offline solution
- Based on master / slave
  - Can be a "High Availability" solution
- All databases support this (sometimes very advanced)
  - Often horizontally / vertically segmented
  - Mostly row based but sometimes optimized
  - E.g. MySQL DRBD is like RAID1 over the network
- Replication does not take care of:
  - IP takeover
  - Heartbeat & automated takeover
  - Slave becoming master
  - Fail back and resync
  - Conflict resolution (slave is read-only)
- With further automation it can be a base for High Availability (MySQL Flipper)
- DB2 for LUW HADR
  - Replication using log shipping
  - Many limitations compared to Data Guard
  - Xkoto Gridscale for DB2
    - More options than HADR
- Data Guard
  - Supports both log shipping and SQL shipping
  - Very mature / flexible product
- DB2 for z/OS Tracker site
  - Very basic (really in use?)
- Microsoft SQL Server mirror
  - Witness server automates the takeover
  - Xkoto also has Gridscale for SQL Server

Clustering solution
- Withdraw a "node" from the solution
  - Network problem (IP needs to be re-routed)
  - Failover might not be active
- Can it be automated?
  - Heartbeat or timeout
  - Client re-route
  - Split brain problem
- What happens to running Units of Work?
  - Locked data, or backout by another node?
  - Take-over of transaction / restart of transaction / fail transaction
- Controlled "failure" for planned down (e.g. upgrade)
- Fail back / insert node into the cluster
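The "heartbeat or timeout" and "split brain" bullets of the clustering slide can be illustrated with a small sketch. It assumes a simple majority rule for deciding who may take over, which is one common technique and not something these slides prescribe for any particular product; all names are hypothetical. Note that a missed heartbeat only makes a node suspected: the monitor cannot tell a dead node from a broken network link.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that timed out."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock            # injectable clock, handy for testing
        self.last_seen = {}

    def beat(self, node):
        self.last_seen[node] = self.clock()

    def suspected_down(self):
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


def may_take_over(reachable_nodes, cluster_size):
    """Majority rule (assumption): continue only when more than half of the
    cluster, counting yourself, is reachable."""
    return 2 * len(reachable_nodes) > cluster_size
```

Under this rule a two-node cluster can never decide on its own after losing contact with its partner, which is one way to read the role of the witness server in the SQL Server mirror: with a third voter, a surviving node plus the witness forms a majority.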
Shared nothing vs shared everything
- Shared Everything
  - DB2 Data Sharing & PureScale (HA & Perf)
  - Oracle RAC (HA & Perf)
    - Does not share memory, only disk
  - Microsoft SQL Server Cluster (HA)
    - Based on a hardware / operating system solution
  - Sybase IQ??
- Shared Nothing
  - Microsoft SQL Server mirror (HA)
  - Oracle Data Guard (HA)
  - DB2 HADR (HA)
  - MySQL Cluster (HA)
  - MySQL Replication (Perf)
    - Both have many limitations; large systems still use them
  - Perf: Teradata, Postgres (Greenplum), Netezza, DB2 LUW DPF

High Availability: Replication vs. Cluster
- Both can have a Single Point of Failure (SPF)
- A wrong configuration can destroy data
- SAN/NAS
  - I/O overhead with shared storage
  - With RAID the SAN is no longer an SPF
  - Make sure the network to the SAN is not an SPF
- Replication is easy to break
  - Inconsistent data (e.g. in the middle of a transaction)
  - Painful start-up / restart
Split Brain condition
- Due to communication failures, nodes are separated
- Does a missing heartbeat mean the node is really down, or is it a communication failure?
- If multiple nodes take control of the cluster, it is called a split-brain condition
- If this happens, bad things will happen
- Special software solutions are needed to be 100% sure that the other node(s) are down
- This software can itself become a Single Point of Failure

SQL Server Cluster
SQL Server cluster: Failover clustering
[Figure: client PCs connected to Node A and Node B (the passive node), both running SQL and linked by a heartbeat, with disk cabinets A and B. When a failure occurs, the SCSI reservation is broken, a new reservation is established, and SQL fails over and is available to clients.]

SQL Server Cluster: Data Mirroring
[Figure: an application commit flowing through the SQL Server principal and the SQL Server mirror, each with its own log and data, plus a witness; steps numbered 1 to 5.]
SQL Server Comparison

Database Mirroring                  Failover Clustering
Scope: user database                Scope: full instance
Standard hardware                   Certified hardware
Very fast failover (seconds)        Automatic failover (minutes)
OS flexible (e.g. 32/64)            Enterprise OS
Independent storage                 Shared storage
Reporting on mirror (read-only)     Standby not available
Geographic separation OK            Servers co-located (site failure!)

Compare DB2 to Oracle
[Figure: three DB2 members, each with its own log, two CFs and shared data]
- DB2 Data Sharing / PureScale
  - Single system image
  - Dynamic workload management
  - Software / hardware solution
  - No Single Point of Failure
- Oracle RAC
  - High-speed inter-system links
    - Lots of communication
  - No global cache
    - Cache Fusion / Interconnect
    - Passes data around a lot
    - Extra communication overhead
    - Scalability?
Components in a RAC Cluster
- Global Cache Services (GCS)
  - Manages data page synchronization
  - Sends DATA to other nodes
- Global Enqueue Service (GES)
  - Manages global locks for non-data pages

Oracle RAC in action: cache fusion
Example of how Oracle RAC moves data around
[Figure: Instances A, B, C and D; Instance D is the master node for data block 8741]
1. Want to read data block 8741
2. Do READ data block 8741
3. Read into buffer cache
Instance B becomes "owner" of the block
Oracle RAC in action: cache fusion
Example of how Oracle RAC moves data around
[Figure: Instances A, B, C and D; Instance D is the master node; Instance B holds data block 8741]
1. Want to read data block 8741
2. Please send 8741 to node C
3. Send 8741
4. Received data block 8741 from Node B
Data no longer comes from disk; owners forward it. Multiple copies are around.

Oracle RAC in action: cache fusion
Example of how Oracle RAC moves data around
[Figure: Instances A, B and C each hold a copy of data block 8741; Instance D is the master node; steps numbered 1 to 6, with these labels: 3. Forward 8741 to node B; 4. Write data block 8741 (Instance B); 6. Flush PI data block 8741 (Instances A and C)]
- PI = Previous Image
- Owners have to write; after the write, caches are flushed
- New read requests move the data around again
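The three cache fusion slides can be summarized in a toy model. It is a deliberately simplified sketch of the idea shown on the slides (the first reader gets the block from disk and becomes owner, later readers get it forwarded by the owner, and a write by the owner flushes the other copies); it is not Oracle's actual protocol, and all class and method names are hypothetical.

```python
class ToyBlockDirectory:
    """Toy version of the bookkeeping a master node does for data blocks."""

    def __init__(self):
        self.owner = {}     # block id -> instance that owns the block
        self.holders = {}   # block id -> instances holding a copy in cache

    def read(self, instance, block):
        """Return where the requesting instance gets the block from."""
        holders = self.holders.setdefault(block, set())
        if instance in holders:
            source = "local cache"
        elif block not in self.owner:
            self.owner[block] = instance   # first reader becomes the owner
            source = "disk"
        else:
            source = self.owner[block]     # owner forwards its copy
        holders.add(instance)
        return source

    def write_to_disk(self, block):
        """Owner writes the block; return the instances that flush their copy."""
        owner = self.owner[block]
        flushed = sorted(self.holders[block] - {owner})
        self.holders[block] = {owner}
        return flushed


if __name__ == "__main__":
    directory = ToyBlockDirectory()
    print(directory.read("B", 8741))       # first read
    print(directory.read("C", 8741))       # forwarded by the owner
    print(directory.read("A", 8741))
    print(directory.write_to_disk(8741))   # copies that get flushed
```

Even in this toy form, every read by a new instance costs a message to the master and a block transfer between instances, which is the "passes data around a lot" point of the comparison slide.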
What happens if a RAC node fails
- Crash recovery by another node
  - Freeze GCS, not allowing updates to the database anymore
  - Remaster data blocks and recover the pages using redo/undo
  - Invalidate the recovered blocks
- False node failure detection
  - Can have a split brain problem
- Oracle Cluster software
  - Heartbeat based, Single Point of Failure (yes, says IBM)

Unavailable time
[Figure: Unavailable time]
RAC vs PureScale Scalability
- IBM PureScale benchmark:
  - 95% scalability with 32 members
  - 81% scalability with 112 members
- Oracle RAC??
  - No figures
  - License forbids publication of measurements

Even Larry admits
eWeek (www.eweek.com), 31-Oct-2003:
"I make fun of a lot of other databases, all other databases, in fact, except the mainframe version of DB2. It's a first-rate piece of technology."
Larry Ellison, Oracle's Founder and CEO
I guess we have to add DB2 LUW PureScale to it ;-)
Wise words
- Prepare for failure: DISASTER WILL HAPPEN
  - Ensure that no important data is lost
  - Think of the different types of unavailability (there is no silver bullet)
- Keep It Simple, Stupid (KISS)
  - Complexity is the enemy of reliability
- Saving on education is like stopping a watch to save time
- Automate as much as possible
  - Take care that you still understand it, so document it (incl. what-if scenarios)
- Test it! Frequently!! Use good scenarios!
- Audit it
  - You need a devil's advocate to find the holes

New technology
Source: channelinsider.com
- If you can see it and touch it, then it is: physical
- If you cannot see it but you can touch it, then it is: transparent
- If you can see it but not touch it, then it is: virtual
- If you cannot see it nor touch it, then: it's gone!
Be careful with new technology ;-)
QUESTIONS?