Backup and restore of Oracle databases: introducing a disk layer




Backup and restore of Oracle databases: introducing a disk layer
by Ruben Gaspar, IT-DB-DBB
CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it
BR evolution: Backup to disk

Agenda
- CERN Oracle databases & Oracle backup basics
- Backup to disk implementation details
- Recovery platform
- Some bits of backup to disk backend
- Summary


Target Oracle databases for backup to disk
- ~70 Oracle databases, most of them running Oracle Clusterware (RAC)
  - 49 are backed up to disk and then to tape
  - 21 are backed up with snapshots only (test and development instances)
- 15 Data Guard RAC clusters in production
  - Active Data Guard since the upgrade to 11g
  - They are backed up to tape only
  [Diagram: redo transport]
- 10 Oracle single instances in DBaaS, also backed up using snapshots

Oracle backup basics
- The Oracle clock: System Change Number (SCN)
  - It will take 544 years to run out of SCNs at 16K/s
  - smon_scn_time tracks time versus SCN
- Types of backup
  - Consistent: taken after the database has been cleanly shut down. All redo is applied to the data files. Archive logs are not produced.
  - Inconsistent: taken while the database is running. The database must be in archivelog mode, which means archive logs will be produced. Point-in-time recoveries (PITR) are possible.
  - Drawback: clean-up of archive logs is critical, otherwise the database blocks. TSM was playing a critical role here.
- Backup sets: Oracle proprietary format for backups (binary files)
  - Backup sets are containers for one or several backup pieces
  - Backup pieces contain blocks of one or several data files (multiplexing)
- RMAN channels: disk, tape or proxy. They read the data files and write to the backup media.
  - We use the SBT (System Backup to Tape) API, with IBM Tivoli Data Protection 6.3 (provided by TSM support)
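The 544-year figure can be checked with a few lines of arithmetic. This is only a sketch, and it rests on one assumption the slide does not state: that the SCN is a 48-bit counter (the classic 6-byte SCN). The "16K/s" rate is read here as 16,384 SCNs per second, and the function name is illustrative.

```python
# Assumption (not stated on the slide): the SCN is a 48-bit counter.
SCN_BITS = 48
SCN_RATE_PER_SECOND = 16 * 1024           # "16K/s" read as 16,384 SCNs per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600     # Julian year


def years_until_scn_exhaustion(bits: int = SCN_BITS,
                               rate: float = SCN_RATE_PER_SECOND) -> float:
    """Years needed to consume every SCN value at a constant rate."""
    return (2 ** bits) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_until_scn_exhaustion():.1f} years")
```

Under these assumptions the result lands at a little over 544 years, in line with the slide.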

Oracle backup basics (II)
- Backup jobs based on templates (Recovery Manager API):
    --Full
    backup incremental level 0 database;
    --Cumulative
    backup incremental level 2 cumulative database;
    --Incremental
    backup incremental level 1 database;
    --Archivelogs
    backup tag 'BR_TAG' archivelog all delete all input;
- Retention policy from 60 to 90 days, depending on the DB:
    CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 90 DAYS;
- e.g. LEMONRAC: [1xfull + 6xdifferential + archivelogs] * 13 weeks
  [Diagram: timeline of full, cumulative, incremental and archivelog backups allowing PITR]
    DB        Fulls (GB)  Inc (GB)  Archived logs (GB)  Total (GB)
    LEMONRAC  87902.42    857.52    13319.39            102079.32
- Controlfile backup, automatically taken by each backup:
    CONFIGURE CONTROLFILE AUTOBACKUP ON;
- e.g. LHCBSTG: [2xfull + 5xdifferential + 24x4 archivelogs] * 13 weeks = 934GB
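The "13 weeks" multiplier follows from the 90-day recovery window, and the LEMONRAC total is the sum of its three components. The sketch below checks both; the weekly cycle length and the function names are assumptions made for illustration, while the GB figures are the LEMONRAC numbers quoted above.

```python
import math

RECOVERY_WINDOW_DAYS = 90   # from CONFIGURE RETENTION POLICY ... OF 90 DAYS
CYCLE_DAYS = 7              # assumption: one full backup cycle per week


def cycles_retained(window_days: int = RECOVERY_WINDOW_DAYS,
                    cycle_days: int = CYCLE_DAYS) -> int:
    """Number of weekly cycles needed to cover the whole recovery window."""
    return math.ceil(window_days / cycle_days)


# LEMONRAC volumes in GB, as quoted on the slide.
LEMONRAC_GB = {"fulls": 87902.42, "incrementals": 857.52, "archived_logs": 13319.39}


def total_gb(volumes: dict) -> float:
    """Total backup volume held for one database."""
    return sum(volumes.values())


if __name__ == "__main__":
    print(cycles_retained(), "weekly cycles")
    print(f"{total_gb(LEMONRAC_GB):.2f} GB")
```

The computed total agrees with the slide's 102079.32 GB to within rounding of the individual columns.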


What is there to be backed up?
Backup jobs using the RMAN API take care of:
- Database files: user and system files
- Control files: contain the structure and status of the data files, as well as the whole backup history
- Archived logs: backups of the redo logs, needed for inconsistent backup strategies. They must be backed up and removed from the active file system; otherwise, if space runs out, the database freezes/stops. 5.1TB of redo logs are produced per day.
ALL THREE ARE CRITICAL FOR A BACKUP/RECOVERY STRATEGY



Backup architecture
- Custom solution: about 15k lines of code, Perl + Bash
- Flexible: easy to adapt to new Oracle releases and backup media
- Based on Oracle Recovery Manager (RMAN) templates
- Central logging
- Easy to extend via Perl plug-ins: snapshot, exports, RO tablespaces, ...
- We send compressed: 1 out of 4 full backups, and all archivelogs

Impact on TSM
Savings depend on the database workload, e.g. backup sets on disk for three databases:
    DB        Full (GB)  Inc (GB)  Archived logs (GB)  Savings
    EDHP      29197.76   1216.697  2169.766            70%
    CASTORNS  4944.839   213.256   336.2889            71%
    ATLASSTG  1484.146   724.9567  3063.658            45%
- Only 1/4 of the full backups are sent to tape
- Backup sets are compressed (see later)
[Chart, source: TSM support. Savings ~71%]

Impact on TSM (II)
- 15 accounts (alicestg, atlasstg, cmsstg, castorns, ..): ~70% savings
- 29 accounts (pdb, wcernp, itcore, aisdbp, ...): ~47% savings
Source: TSM support

Workflow for disk/tape backups
- Same workflow as for tape backups, to ease maintenance
- Disk and tape templates are almost identical; only the channel allocation differs
- Disk channel allocation is calculated on the fly, considering the available space in the aggregate and file system, using the NetApp management API (ZAPI)
- About 75 templates to adapt to all types of backup strategy
- Tape and disk backup strategies co-exist
- Reversible: changing from one to the other is a matter of changing templates
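The idea that disk and tape templates share one body and differ only in channel allocation can be sketched as follows. The real templates are part of the Perl/Bash framework and are not shown in the slides, so the template text, function names and mount points below are hypothetical; only the RMAN statements themselves (ALLOCATE CHANNEL ... DEVICE TYPE DISK/SBT, BACKUP INCREMENTAL LEVEL 0 DATABASE) are standard syntax.

```python
# Hypothetical template: the slides do not show the real templates.
BACKUP_TEMPLATE = """run {{
{channels}
  backup incremental level 0 database;
}}"""


def disk_channels(mount_points):
    """One disk channel per backup file system; %U keeps piece names unique."""
    return "\n".join(
        f"  allocate channel d{i} device type disk format '{mp}/%U';"
        for i, mp in enumerate(mount_points, start=1)
    )


def tape_channels(count):
    """SBT channels; media-manager parameters are site specific and omitted."""
    return "\n".join(
        f"  allocate channel t{i} device type sbt;" for i in range(1, count + 1)
    )


def render(channels):
    """Fill the shared template body with a disk or tape channel block."""
    return BACKUP_TEMPLATE.format(channels=channels)


if __name__ == "__main__":
    # Hypothetical mount points following the /backup/dbsxx/<dbname> layout.
    print(render(disk_channels(["/backup/dbs01/castorns", "/backup/dbs02/castorns"])))
    print(render(tape_channels(2)))
```

Switching a database between disk and tape then amounts to rendering the same body with a different channel block, which is what makes the change reversible.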


Typical DB architecture
[Diagram: RAC nodes with a public interface on the 10GbE LAN and a 10GbE interconnect on a private network; NetApp C-mode and 7-mode storage (10GbE and 1GbE links, 1GbE cluster interconnect, management network, 6Gb/s links) holding archivelogs, controlfile and datafiles; backup volumes backup01 and backup02; Media Manager Server (IBM TSM)]
- At least 2 file systems for backup to disk: /backup/dbsxx/dbname

New C-mode features
- Transparent file system movements:
    cluster01::> volume move start -destination-aggregate aggr1_c01n02 -vserver vs1 -volume castorns03 -cutover-window 10
- DNS load balancing inside the cluster
- Automatic virtual IP rebalancing (based on failover groups)
- Access security via export-policy, which joins firewall and different authentication mechanisms: sys, krb5, ntlm
- Global namespace
- Compression and deduplication
- We rely strongly on compression to satisfy 2.3PB of backup set storage needs using 1.1PB of disk
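The last point implies a minimum compression saving, which a short calculation makes explicit. This is a sketch using only the two figures quoted on the slide; the function name is illustrative.

```python
LOGICAL_NEED_PB = 2.3    # backup set storage needs, from the slide
PHYSICAL_DISK_PB = 1.1   # usable disk space, from the slide


def required_saving(logical_pb: float, physical_pb: float) -> float:
    """Fraction of the logical data that compression must remove to fit on disk."""
    return 1.0 - physical_pb / logical_pb


if __name__ == "__main__":
    saving = required_saving(LOGICAL_NEED_PB, PHYSICAL_DISK_PB)
    print(f"Compression must save about {saving:.0%} of the logical volume")
```

The requirement comes out at roughly 52%, which is the order of magnitude of the per-database savings reported under "Compression: real values".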

Backup to disk configuration on database servers
- Global namespace in use: /backup/dbsxx
- Eases management: the mount point stays unchanged as data moves. It's a NetApp C-mode feature (see later)
    7-mode: mount -o priv-controllerip:/vol/castorns03 /ORA/dbs03/CASTOR
    C-mode: mount -o public-ip-cluster:/backup/dbs01/castorns /backup/dbs01/castorns
- File system layout:
    /backup/dbs01/<dbname>  autobackup controlfile + backupsets
    /backup/dbsxx/<dbname>  backupsets
- RMAN configuration parameters: minimal change
    CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/backup/dbs01/<dbname>/<dbname>_%F';

Particular cases
- Solution also operational in a Data Guard configuration: full and incremental backups are taken on the standby (more on this when discussing restores)
  [Diagram: primary database (archivelogs + controlfile) -> redo transport -> Active Data Guard standby for user access and disaster recovery (full + incremental + controlfile)]
- Multiple channels: rman_channels_connect, in order to distribute the backup load
    username/password@rac-node1
    username/password@rac-node2
- Plug-in for RO tablespaces backup (ACCLOG: size about 170TB, growth 70TB/year)
  - Automatic clean-up in case of a tablespace state change
  - One backup set per tablespace
- Extension to allow special mount points (ACCLOG): rman_mounts_readonly

Backup to disk performance
- Backups run ~50% faster than on tape
- ACCLOG full backup, 5TB:
    Tape: 34 hours, ~35 MB/s
    Disk: 14 hours, ~100 MB/s
- Sending backup sets from disk to tape needs optimisation; work in progress with TSM support
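The quoted rates can be cross-checked from the backup size and duration. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which the slide does not specify; the function name is illustrative.

```python
TB = 10 ** 12  # assumption: decimal terabytes
MB = 10 ** 6   # assumption: decimal megabytes


def throughput_mb_s(size_tb: float, hours: float) -> float:
    """Average throughput in MB/s for a backup of size_tb taking the given hours."""
    return size_tb * TB / (hours * 3600) / MB


if __name__ == "__main__":
    print(f"Tape: {throughput_mb_s(5, 34):.1f} MB/s")
    print(f"Disk: {throughput_mb_s(5, 14):.1f} MB/s")
```

With these round inputs the disk run comes out close to the ~100 MB/s quoted. The tape run comes out somewhat above the ~35 MB/s quoted, which suggests the size and duration on the slide are rounded.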

Backup to disk space consumption
- Channel order is important for storage management
- Space distribution should follow the planning to avoid imbalance: file systems should grow at the same pace
- The emptiest volume is always selected first
- Automatic size extension
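Selecting the emptiest volume first amounts to sorting the candidate file systems before channels are allocated. The sketch below assumes "emptiest" means "most free space" and uses made-up free-space figures; on the real system these values would come from the storage management API, which is not called here.

```python
def order_by_free_space(free_gb_by_mount):
    """Return mount points sorted so the one with the most free space comes first."""
    return sorted(free_gb_by_mount, key=free_gb_by_mount.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical free space in GB per backup file system.
    free_gb = {
        "/backup/dbs01/castorns": 500,
        "/backup/dbs02/castorns": 1200,
        "/backup/dbs03/castorns": 800,
    }
    for rank, mount in enumerate(order_by_free_space(free_gb), start=1):
        print(rank, mount)
```

Because the first channel tends to receive data first, putting the emptiest file system at the top helps the file systems fill at a similar pace.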


Recovery platform
- The only reliable proof of truth: run a recovery
- Any change introduced in the backup platform or backup strategy is always validated via test recoveries
- Isolation
  - Runs independently of the production database
  - Cannot access any other system (database network links)
  - No user jobs must run
- Flexible and easy to customize
- Maximizes use of the recovery server: several recoveries at the same time
- Exports taken after a successful recovery help in support cases, mainly logical errors
- Open source: http://sourceforge.net/projects/recoveryplat/

Recovery platform (II)
- Introducing a disk buffer greatly improves our recovery testing
- Also tested with Data Guard configurations (Oracle support ID 1070039.1):
    RMAN> set backup files for device type disk to accessible;
- Restores from disk are usually 50% faster
- More recoveries can be run: nowadays about 40 recoveries per week
- No blocking of tape resources that could be used by backups


Backup to disk cluster
- 2x FAS6240 NetApp controllers
- 24x DS4243 disk shelves, 24x 3TB SATA disks each (576 disks)
- raid_dp (RAID 6)
- 1.1 PB usable space split into 8 aggregates of ~135TB each
- 2x quad-core 64-bit Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
- 10 Gbps connectivity
- Multipath SAS loops, 3 Gbps
- Flash Cache: 512GB per node

How fast, how compressed
[Chart: RMAN backup to disk* throughput in MB/s versus number of channels (1 to 3), for knfs, dnfs and dnfs + Ontap compression]
*Ontap 8.1.1, FAS6240, 72x 3TB SATA disks
- Online compression of datafiles: ~55% saved by compression
- Backup set compression of a 501GB tablespace of random alphanumeric strings (dbms_random):
    Method                                          Size (time)            Saved
    no-compressed (t)                               501GB                  -
    basic                                           83GB (6h21')           83%
    low                                             116GB (49')            76.8%
    medium                                          88GB (07h23')          82.4%
    high                                            82GB (11h02')          83.6%
    Nocompressed-fs                                 459GB (41')            8.3%
    Inline / cron compression (NetApp 8.1.1)        188GB; 188GB (46')     62%; 62%
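The "percentage saved" values follow directly from each backup set size relative to the 501GB uncompressed baseline. The sketch below recomputes them from the sizes on the slide; the dictionary keys and the function name are illustrative labels, not identifiers from the backup framework.

```python
UNCOMPRESSED_GB = 501  # uncompressed size of the test tablespace, from the slide

# Backup set sizes in GB as reported on the slide.
SIZES_GB = {
    "basic": 83,
    "low": 116,
    "medium": 88,
    "high": 82,
    "Nocompressed-fs": 459,
    "Ontap compression": 188,
}


def percent_saved(size_gb: float, baseline_gb: float = UNCOMPRESSED_GB) -> float:
    """Space saved relative to the uncompressed baseline, as a percentage."""
    return round((1 - size_gb / baseline_gb) * 100, 1)


if __name__ == "__main__":
    for method, size in SIZES_GB.items():
        print(f"{method:18s} {size:4d} GB  {percent_saved(size):5.1f}% saved")
```

The recomputed values agree with the slide to within rounding. They also show the trade-off visible in the timings: "low" gives up a few points of saving compared with "basic" or "high" but finishes in minutes rather than hours.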

Compression: real values
    DB          Used (GB)*  Saved (GB)  % saved by compression
    AISDB_PROD  24719       25941       52
    CASTORNS    3629        3448        49
    CMSSTG      6510        6395        50
    CSR         20636       32008       61
    ITCORE      16387       23552       60
    EDHP        9631        24913       66
    LEMONRAC    47104       49152       51
*Space used on controller side
Logical space used: Used + Saved

NAS controllers throughput
[Chart: net_data_recv, disk_data_written and compression ratio]


Deduplication
- When combined with compression, it doesn't provide good results
- This is due to the way compression works: the compression group is 32k, our Oracle block is 8k, the WAFL block is 4k
- Control files are a different story: block size of 16k
    DB    Type         Location       Size (GB)
    PAYP  archives     /backup/dbs01  0.91
    PAYP  archives     /backup/dbs02  22.90
    PAYP  controlfile  /backup/dbs01  456.92
    PAYP  fullinc      /backup/dbs01  68.00
    PAYP  fullinc      /backup/dbs02  81.10


Summary
- Backup and recovery testing is critical
- Tape copies are essential, but TSM became a critical point of failure for DB services
- Adding a disk buffer:
  - Removes TSM criticality
  - Reduces DB volume in TSM
  - Speeds up backups and restores
  - Better response time
  - Better resource utilization
- Disk buffer plug-ins were easily integrated in our backup framework
- First system to exploit Ontap C-mode features: valuable experience for the future

Questions?