Kroll Ontrack Data Recovery
Oracle Data Loss: When the best of plans fail
Breaking the Zettabyte barrier
Over 1,000,000,000,000,000,000,000 bytes of data currently exist
Learning Objectives
1. Identify data loss impacts
2. Identify common data loss scenarios
3. Identify recovery options
4. Recommendations when data loss occurs
Learning Objective 1 Data Loss Impacts
Impact of data loss
Change in Amount of Data Managed Since Last Year
» 1-10%: 11%
» 11-25%: 36%
» 26-50%: 25%
» 51-100%: 11%
» > 100%: 5%
» No Change: 1%
» Decreased: 2%
» Don't Know/Unsure: 10%
Source: Database Trends and Applications, Mar 2011
Costs Associated With Data Loss
Impact - Why should you care?
» $18.2 billion cost per year
» $145,000 per hour
Companies with an outage lasting for more than 10 days
» Most will never fully recover financially
» 50% out of business within 5 years
» Of the 50%, 70% close their doors within twelve months
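To put the per-hour figure in perspective, the sketch below multiplies it out for a few outage lengths. It assumes the cost scales linearly with duration, which the slide does not state; the function name `outage_cost` is illustrative only.

```python
# Per-hour outage cost quoted on the slide (USD).
HOURLY_OUTAGE_COST_USD = 145_000


def outage_cost(hours, hourly_cost=HOURLY_OUTAGE_COST_USD):
    """Estimate outage cost, ASSUMING cost grows linearly with duration."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * hourly_cost


if __name__ == "__main__":
    # 240 hours corresponds to the 10-day outage threshold on the slide.
    for label, hours in [("1 hour", 1), ("1 day", 24), ("10 days", 240)]:
        print(f"{label:>8}: ${outage_cost(hours):,.0f}")
```

Under that linear assumption, a 10-day outage comes to $34.8 million.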
Impacts to Performance
Storage Choices
» The type of storage impacts performance and recoverability
Rebuild vs Recover - Unscheduled Downtime
» Meeting SLA for uptime guarantees
» Penalties for not meeting uptime
Rebuild vs Recover - Missing data
» Backup may not contain most current data
Rebuild vs Recover - Employee Costs
» Employees' time spent in recovery instead of generating revenue
Learning Objective 2 Data Loss Scenarios
Storage Types
Common Oracle Data Stores
» DAS
» SAN
» NAS
What do they all have in common?
» HARD DRIVES!
Common Data Loss Scenarios
Hardware failures
» RAID
» Disk
Software failures
» File System Data Corruption
» Database Corruption
» VMware Metadata Corruption
Human error
» Deleted
» Overwritten
» Formatted (Guest and Host level)
Common Causes of Data Loss
» Hardware failure: 44%
» Human error: 32%
» Software malfunctions: 14%
» Virus: 7%
» Natural Disasters: 3%
Source: Ontrack Data International
Learning Objective 3 Recovery Options
Traditional Data Loss
Hardware Failure Storage Layer Physical
Oracle/VMware RAID Recovery
Oracle/VMware Recovery Overview
RAID Failure
Common RAID failures:
» Multiple drive failure
» Incorrect RAID rebuild
» Drives forced online
» RAID re-configured
[Diagram: RAID array of Disk 0, Disk 1, Disk 2, Disk 3, Disk 4]
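Why is a multiple drive failure so much worse than a single one? The slide does not name a RAID level, so the sketch below assumes a single-parity layout (RAID 5 style) across five disks: parity is the XOR of the data blocks, so exactly one missing block per stripe can be recomputed, and a second loss leaves nothing to rebuild from. The function names are illustrative, not part of any Kroll Ontrack tool.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild the single missing block (marked None) in one stripe.

    `stripe` holds the data blocks plus the parity block, one per disk.
    Single parity can only recover exactly one missing block.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(
            f"cannot rebuild: {len(missing)} blocks missing, parity covers exactly 1"
        )
    survivors = [block for block in stripe if block is not None]
    return missing[0], xor_blocks(survivors)


if __name__ == "__main__":
    data = [b"ORCL", b"DATA", b"FILE", b"BLKS"]  # Disk 0..3 (made-up contents)
    stripe = data + [xor_blocks(data)]           # Disk 4 holds parity
    stripe[2] = None                             # one drive fails
    print(rebuild_missing(stripe))               # block on Disk 2 is recomputed
```

The same arithmetic suggests why an incorrect rebuild or a drive forced online is dangerous: if stale or wrong blocks are fed into the XOR, the "rebuilt" data is silently corrupt.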
VMware Recovery Overview
RAID Failure
[Diagram: recovery tool stack, from the guest file systems down to the physical disks]
» Specialized file system recovery tools: KO NTFS Recovery Tool, KO NSS Recovery Tool, KO EXT Recovery Tool, KO XFS Recovery Tool. All repair applications are developed by Kroll Ontrack.
» Kroll Ontrack VMFS Recovery Tool: RAID Device, VMFS Volume, VMDK1 to VMDK4 (NTFS, NSS, EXT, XFS)
» Kroll Ontrack RAID Manager
» Kroll Ontrack Rollback Layer: works like a VMware snapshot to catch any changes or repairs needed
» Locally attached drives, separated from RAID controller: Disk 0, Disk 1, Disk 2, Disk 3, Disk 4
Case Study - EMC MetaLUN VMware Recovery
Data Loss Event
» Data Processing service had two drives fail in a complex RAID array; 70TB MetaLUN
» 30 VMs accessing this storage; 16 clients with deliverables due; $4 Million USD at risk if deliverable date is missed.
Internal Recovery Efforts
» Called EMC Support: Given the option to force the MetaLUN online and truncate the volume, but there would be missing data.
Kroll Ontrack's Recovery Efforts
» Aug 9, ? PM: HDD Predictive Failure; Storage Support notified
» Aug 10, 5 AM: Predicted HDD fails; Global Hotspare starts rebuilding
» Aug 10, 9 AM: Second HDD failure; RAID rebuild stops
» Aug 10, 10 AM: Failed Drive sent to KO
» Aug 11, 11 PM: 1 HDD successfully recovered; 20hrs in C/R, 16hrs in Lab
» Aug 11, 11:30 PM: OEM verifies RAID, IT verifies data; RAID rebuild resumed to HS
» Aug 12, 1 AM: Volume opened up to data processing to meet client's deadlines
» Aug 12, 6 AM: Client data processed before SLA expiration
Where Does Data Loss Start?
Human Error
» Accidental Deletion of Files or DB Records
» Reformat of Storage
» Overwritten Data
» Disaster Recovery Errors
» Reinstall of Software
Deleted VMware Recovery
Deleted Virtual Machines
» When files are deleted in a VMFS volume, all the pointers to the data area are erased
» The data is still on the device, but there are no structures pointing to it
» This is one of the most challenging types of recoveries, because the VMDK files are often fragmented and Ontrack must find all the pieces and assemble them back into files
» With the KO VMFS Recovery Tools, Ontrack engineers can focus on unallocated areas of the volume to search for the deleted virtual machines.
[Diagram: KO VMFS Recovery Tool scanning a VMFS Volume; VMFS Metadata followed by interleaved extents VM1-VMDK1, VM1-VMDK2, VM3-VMDK1, VM2-VMDK1, VM3-VMDK2, VM2-VMDK2]
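The idea of searching a volume for data that no structure points to can be illustrated with a simple signature scan. This is not the KO VMFS Recovery Tool and says nothing about how it works; it is a minimal sketch that opens a raw image read-only and reports every offset where the text header of a VMDK descriptor file (`# Disk DescriptorFile`) appears. The function name, the chunk size and the choice of signature are assumptions made for illustration. Locating and reassembling the fragmented disk contents themselves is the hard part and is not attempted here.

```python
# First line of a VMDK text descriptor; used here as the search signature.
SIGNATURE = b"# Disk DescriptorFile"


def find_signatures(image_path, signature=SIGNATURE, chunk_size=1 << 20):
    """Yield byte offsets of every occurrence of `signature` in a raw image.

    The image is opened read-only and read in chunks; the last
    len(signature) - 1 bytes of each buffer are carried over so that a
    signature straddling a chunk boundary is still found exactly once.
    """
    if not signature:
        raise ValueError("signature must not be empty")
    overlap = len(signature) - 1
    offset = 0   # file offset of the next chunk to be read
    tail = b""   # bytes carried over from the previous buffer
    with open(image_path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            base = offset - len(tail)  # file offset of buf[0]
            pos = buf.find(signature)
            while pos != -1:
                yield base + pos
                pos = buf.find(signature, pos + 1)
            tail = buf[-overlap:] if overlap else b""
            offset += len(chunk)
```

Working from a read-only image rather than the live volume matches the recommendation elsewhere in this deck not to write anything to storage that has missing data.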
Case Study: Deleted Virtual Disk
Initial Facts
» 28 LUNs (28 DataStores)
» 440 VMDKs
» 1000+ snapshots
Additional Information
» Exchange, SQL, Oracle, File Servers all gone
» Backups also deleted
Case Study: Deleted Virtual Disk
Challenges
» Customer did not have any documentation
» Unknown content of virtual disks
» Changing priorities
Recovery Process
» Connected multiple machines to KO RDR
» Multiple engineers from around the world processed volumes 24 x 7
» 3 weeks to recover all of the critical VMs and snapshots
Oracle Specific Data Loss Scenarios
» Database Corruption
» Log File Corruption
» Backup Corruption
» ASM Corruption
Example 1: Database File Corruption
State Agency
Data Loss Scenario
» Xiotech Magnitude SAN
» Oracle 10
» Oracle database files corrupt due to SAN corruption
Ontrack Recovery
» Mine the corrupt files for good and partial row data
» Export to SQL*Loader (SQL_LDR) scripts for customer to import
Example 2: Log File Corruption
State Agency
Data Loss Scenario
» EMC Symmetrix
» Oracle 10.2.0.4, 1TB database
» Oracle ASM volumes on a Red Hat Enterprise Linux server
» Log files were corrupt
Ontrack Recovery
» Mine the logs and recover missing data
» Export to SQL*Loader (SQL_LDR) scripts for customer to import
Example 3: ASM Corruption
Fortune 500 Company
Data Loss Scenario
» Hitachi SAN with 112 LUNs
» Oracle 10, 14TB database
» Oracle ASM volumes corrupted
Ontrack Recovery
» Recover raw page data
» Virtually rebuild Oracle database files
» Export to SQL*Loader (SQL_LDR) scripts for customer to import
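All three examples end with recovered rows being handed to the customer as SQL*Loader input. The deck does not show what those scripts looked like, so the following is only a sketch of the general shape: a delimited data file plus a control file that tells SQL*Loader how to append it to a table. The table name `RECOVERED_ORDERS`, the column names and both helper functions are hypothetical.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Serialize row dicts to CSV text.

    Columns missing from a partially recovered row become empty fields.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow(["" if row.get(c) is None else row[c] for c in columns])
    return out.getvalue()


def make_control_file(table, columns, data_file):
    """Build a basic SQL*Loader control file for a comma-delimited data file."""
    column_list = ",\n  ".join(columns)
    return (
        "LOAD DATA\n"
        f"INFILE '{data_file}'\n"
        "APPEND\n"
        f"INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n"
        "TRAILING NULLCOLS\n"  # tolerate rows whose trailing columns are missing
        f"(\n  {column_list}\n)\n"
    )


if __name__ == "__main__":
    columns = ["ORDER_ID", "CUSTOMER", "AMOUNT"]       # hypothetical columns
    rows = [
        {"ORDER_ID": 1, "CUSTOMER": "Smith, J", "AMOUNT": "19.99"},
        {"ORDER_ID": 2, "CUSTOMER": "Lee"},            # partial row: AMOUNT lost
    ]
    print(rows_to_csv(rows, columns))
    print(make_control_file("RECOVERED_ORDERS", columns, "recovered_orders.csv"))
```

Loading into a staging table first, rather than straight into production tables, would let the customer review partial rows before merging them; that is a design suggestion, not something the case studies describe.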
Learning Objective 4 Recommendations
Recommendations When Data Loss Occurs
» Don't panic and don't update your resume
» When troubleshooting, do not write any data to the storage array or change storage configurations
» Don't format the volume that has missing data
» Use the support system offered by the software provider
» Restore data to an alternate location and contact a data recovery company with extensive virtual data recovery experience, including the ability to perform remote recoveries
VMware Recovery Tip Sheet
Design Recommendations
» Implement naming conventions for hosts, guests, physical servers and virtual file system volumes
» Control who has access to the environment
» Document the backup and recovery plan and include the contact information of your preferred data recovery vendor in the plan
» Test your backups on a regular basis
» Use the tools to manage your virtual environment; don't take shortcuts
» Be careful how you use snapshots and do your housekeeping
» Monitor the data stores, logs and swaps
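One way to act on "test your backups" is to restore a backup to an alternate location and compare it with the source file by file. The sketch below does that with SHA-256 checksums. It assumes both trees are reachable as ordinary directories and that the source has not changed since the backup was taken; the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = restored_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every source file was found in the restored tree with identical contents. For live databases a file comparison like this is not sufficient on its own, since the files change while in use; a trial restore followed by opening the database is the stronger test.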
Kroll Ontrack Worldwide Reach
Thank you!