Tivoli Storage Manager Explained




IBM Software Group
Dave Cannon, IBM Tivoli Storage Management Development
Oxford University TSM Symposium 2003

Presentation Objectives
- Explain TSM behavior for selected operations
- Describe design goals and rationale
- Point out tradeoffs
- Provide guidance for administrators

Topics
- Basics
- Policy management
- Storage of objects
- Storage pools
- Database and recovery log

Basics
- TSM highlights
- Architecture
- Progressive backup

TSM Highlights
- Automated, policy-driven storage management
- Distributed, heterogeneous clients
- Centralized storage-management servers
- Support for numerous storage devices
- Comprehensive storage-management function
  - Backup/recovery
  - Archive/retrieve
  - Space management (HSM)
  - Data protection for specific applications
  - Application Programming Interface (API)
  - Content management

Architecture
[Figure: TSM clients (backup/archive clients, HSM clients, API applications, administrative clients) connected to the TSM server, which uses the database, the recovery log, and the storage hierarchy]
- The server database holds information on users, administrators, policy, and the location of objects in the storage hierarchy
- The server recovery log maintains information about database transactions so that they can be committed or rolled back atomically, maintaining referential integrity in the database
- The storage hierarchy is a collection of devices in which the TSM server stores client objects

Progressive Backup
- Each object backed up only once
  - Reduces network traffic
  - Avoids unnecessary copies of same data
  - Reduces impact to client application
- Consolidates client data on few tapes
  - Reduces storage requirements
  - Improves restore performance
[Figure: timeline showing initial backup, incremental backups, file expiration, automated space reclamation, and further incremental backups]

Policy Management
- Object-level policy management
- Constructs for policy management
- Inventory management
- Versioning and retention

Object-level Policy Management
- Not all data is the same
  - Some client nodes may be more critical or have different storage requirements than other nodes
  - Data types (backup, archive, space-managed) may need to be managed differently
  - Even for the same node and data type, some objects may have different requirements than others
- Manual management of stored data is expensive and error-prone
- TSM provides a granular, automated mechanism for administering policy
  - Node specificity is achieved by assigning the node to a policy domain
  - Data-type specificity is achieved through use of backup/archive copy groups and HSM policy
  - Object specificity is achieved by binding each object to a management class when that object is first stored on TSM
    - Policy attributes for the assigned management class can later be changed
    - Backup objects can be rebound to a different management class

Constructs for Policy Management
[Figure: application server nodes and workstation nodes are each assigned to a domain; each domain contains several management classes; each management class contains a backup copy group, an archive copy group, and HSM policy]

Backup copy group
- Destination storage pool
- What if file in use?
- Enforce frequency?
- Back up only if modified?
- How many versions?
- How long to retain?

Archive copy group
- Destination storage pool
- What if file in use?
- How long to retain?

HSM policy
- Destination storage pool
- Backup required before migration?
- Days before migration?
- Migration technique?

Inventory Management
- TSM database contains an inventory (catalog) of "end-user-visible" attributes (node, file space, name, policy) for managed client objects
- Objects tracked in the inventory include
  - Directories
  - Files
  - File system images
  - Delta objects (subfiles)
- Each distinct object is assigned a unique 64-bit object identifier that is used for operations on that object
- Via expiration, TSM server administers policy rules for retention and versioning of client objects

Versioning of Backup Objects
- During backup
  - If no corresponding object exists on the server, the new object becomes the active version
  - If a corresponding object already exists on the server, the new object becomes the active version and the existing active version is deactivated
  - Extraneous versions are marked for expiration (base date set to 0)
- The number of allowed versions of a backup object is determined by the object's management class and copy group

Expiration of Objects Based on Versioning or Retention
- During backup, extraneous versions are marked for subsequent expiration based on
  - VEREXISTS (backup object that exists on client system)
  - VERDELETED (backup object deleted from client system)
- Once an object has been marked for expiration, it cannot be queried from the client or restored, but database entries are not removed until expiration
- During expiration, an object is deleted if previously marked for expiration or if the retention period of the object has been exceeded
  - RETEXTRA (inactive backup object with other versions)
  - RETONLY (inactive backup object with no other versions)
  - RETVER (archive object)
- When do objects "disappear" from the client's view?
  - During backup (extraneous versions)
  - During expiration (objects whose retention time has elapsed)

Example of Versioning and Retention
VEREXISTS: 2, RETEXTRA: 30

Insertion Date        Object Id   State      Expiration Base Date
12:00:00 01/01/2002   10          Inactive   0
14:00:00 01/01/2002   20          Inactive   08:00:00 01/02/2002
08:00:00 01/02/2002   30          Active     None

- Object 10 is an extraneous version and would be expired during the next expiration operation
- Object 20 would be eligible for expiration on 08:00:00 02/01/2002 (30 days after the base date) if it is not versioned before then
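The versioning example above can be replayed with a small Python sketch. This is only an illustration of the rules as stated on the slides, not TSM server code: the names `Version`, `backup`, `expire`, and `MARKED` are hypothetical, `MARKED` stands in for "base date set to 0", and only VEREXISTS and RETEXTRA are modeled (VERDELETED, RETONLY, and RETVER are left out).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Hypothetical sentinel standing in for "base date set to 0" (marked for expiration).
MARKED = datetime.min


@dataclass
class Version:
    object_id: int
    inserted: datetime
    active: bool = True
    base_date: Optional[datetime] = None  # None while the version is active


def backup(versions: List[Version], object_id: int, now: datetime, verexists: int) -> None:
    """Store a new active version and apply the VEREXISTS limit."""
    # The existing active version is deactivated; its base date is the time of deactivation.
    for v in versions:
        if v.active:
            v.active = False
            v.base_date = now
    versions.append(Version(object_id, now))
    # Versions beyond the newest `verexists` are extraneous: mark them for expiration.
    unmarked = sorted((v for v in versions if v.base_date != MARKED),
                      key=lambda v: v.inserted, reverse=True)
    for v in unmarked[verexists:]:
        v.base_date = MARKED


def expire(versions: List[Version], now: datetime, retextra_days: int) -> List[Version]:
    """Return the versions that survive an expiration run at time `now`."""
    survivors = []
    for v in versions:
        if v.active:
            survivors.append(v)
        elif v.base_date == MARKED:
            continue  # previously marked for expiration
        elif now >= v.base_date + timedelta(days=retextra_days):
            continue  # retention period exceeded
        else:
            survivors.append(v)
    return survivors
```

Replaying the three backups from the table with `verexists=2` leaves object 10 marked, object 20 inactive with a base date of 08:00:00 01/02/2002, and object 30 active; an expiration run with `retextra_days=30` removes object 20 once 08:00:00 02/01/2002 is reached.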

Delta-Base Versioning and Expiration

Step  Operation               Versions after the operation
1     Backup of entire file   F V1 (active)
2     Backup of delta file    B V1 (inactive); D V2 (active)
3     Backup of delta file    B V1 (marked for expiration); D V2 (inactive); D V3 (active)
4     Backup of delta file    B V1 (marked for expiration); D V2 (marked and expired); D V3 (inactive); D V4 (active)
5     Backup of entire file   B V1 (marked for expiration); D V3 (marked and expired); D V4 (inactive); B V5 (active)
6     Backup of delta file    B V1 (marked and expired); D V4 (marked and expired); B V5 (inactive); D V6 (active)

- Example assumes VEREXISTS=2
- Once the base file is marked for expiration, it cannot be restored by itself
- Expiration does not delete a base file with dependent deltas (deletion of the base may not occur until the next expiration process after deletion of the last delta)

Storage of Objects
- File aggregation
- Movement of active files only?
- Data validation for stored objects
- Database regression and references to stored objects

What is File Aggregation?
- TSM server groups client objects into aggregates during backup or archive
- Information about individual client objects is maintained and used for certain operations (e.g., deletion, retrieval)
- For many operations, especially internal data transfer, the entire aggregate can be processed as a single entity
- Aggregate size and number of client objects per aggregate can be controlled
[Figure: physical file (non-aggregate) with logical file a; physical file (aggregate) with logical files b - f]

Why Aggregate Files?
- Performance!
- Aggregates can be transferred without examining constituent files
- Reduced overhead for database updates during transfer operations because storage information is not updated for each logical file (also reduced database size)
- Improved storage device performance because data is transferred in larger units
[Figure: without aggregation, files a - e are tracked and transferred individually; with aggregation, they are tracked and transferred as the single unit abcde]

Logical File Storage and Retrieval

Storage
1. Client sends logical files (a - l) to the server
2. Server aggregates and stores the objects
3. Server stores metadata in the database

Retrieval
1. Client requests file h
2. Server gets metadata from the database
3. Server performs a partial-object retrieve, using the offset of h within the aggregate
4. Server sends file h to the client

Reclaiming Space via Aggregate Reconstruction
- Over time, logical files within an aggregate may be deleted, leaving wasted space
- Remaining logical files are moved in "regions"
- Reconstructed aggregate contains all remaining logical files, without empty space
[Figure: aggregate a - m becomes aggregate a b c g h l m after the deleted files are removed]
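The storage, partial-object retrieval, and reconstruction steps above can be sketched in Python. This is a minimal illustration under stated assumptions: an aggregate is modeled as one byte string plus an in-memory index of (offset, length) per logical file, and the function names `build_aggregate`, `retrieve`, and `reconstruct` are hypothetical rather than TSM interfaces.

```python
from typing import Dict, Set, Tuple

Index = Dict[str, Tuple[int, int]]  # logical file name -> (offset, length)


def build_aggregate(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate logical files into one physical file and record their offsets."""
    blob = bytearray()
    index: Index = {}
    for name, data in files.items():
        index[name] = (len(blob), len(data))
        blob += data
    return bytes(blob), index


def retrieve(blob: bytes, index: Index, name: str) -> bytes:
    """Partial-object retrieve: read only the requested logical file."""
    offset, length = index[name]
    return blob[offset:offset + length]


def reconstruct(blob: bytes, index: Index, deleted: Set[str]) -> Tuple[bytes, Index]:
    """Copy the remaining logical files into a new aggregate without empty space."""
    remaining = {name: retrieve(blob, index, name)
                 for name in index if name not in deleted}
    return build_aggregate(remaining)
```

With logical files a through m in one aggregate, deleting d, e, f, i, j, and k and calling `reconstruct` yields an aggregate holding a, b, c, g, h, l, and m at new offsets, which is the situation that the aggregate aliases in the next slides address.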

Problem: Invalidation of Copies
[Figure: aggregate X (primary), containing a b c d, is duplicated to aggregate X (backup) in the copy pool; reconstruction then turns the primary aggregate into aggregate Y, containing a c d]
- X is available and recoverable from the copy pool
- The copy does not match Y, so Y is not available and recoverable from the copy pool
- Problem occurs because aggregates in different storage pools are not reconstructed at the same time
- Since Y is not found in the copy pool, should it be backed up?

Solution: Aggregate Aliases
[Figure: aggregate X (primary), containing a b c d, is duplicated to aggregate X (backup); reconstruction produces aggregate Y (primary), containing a c d; X and Y are aliases]
- X is available and recoverable from the copy pool
- Y is available and recoverable from the copy pool
- Aliases are aggregates that contain exactly the same files, but at different offsets
- New backup of Y is not required because its alias X already exists in the copy pool
- Files in X can be accessed if Y is unavailable
- Y can be restored from X

Benefits of Aggregate Aliases
[Figure labels: Copy Storage Pools, Aggregation, Availability/Recovery, Aliases, Performance, Reconstruction, Space Reclamation]

Movement of Active Files Only?
- Movement of only active files during internal data-transfer operations (storage pool backup, migration) would eliminate many of the benefits of file aggregation
  - Server would need to examine the contents of each aggregate to determine if it contains any active files
  - Aggregate could no longer be treated as a single entity during transfer operations
- Some aggregates contain a mixture of active and inactive files
  - Could transfer aggregates that contain at least one active file
  - This could result in the transfer of many inactive files
- What about reorganizing aggregates to segregate the active and inactive files?
  - Ongoing file deactivation would leave inactive files in the "active" aggregates
  - Aggregate aliases would not work because there would be no correspondence between the aggregates in different storage pools
- If TSM were to provide an option to disable aggregation and allow movement of active files only, the performance impact would be significant

Consistency of Database and Stored Objects
- TSM uses its database to locate objects in storage
- Data integrity requires that database information be consistent with data stored on storage pool volumes
- Inconsistency can occur for reasons such as
  - Hardware errors
  - Media damage or degradation
  - Storage pool volumes have been overwritten
  - Database has been regressed to an earlier version
- If inconsistencies are detected or suspected, consider
  - Are the errors permanent (errors written to media) or temporary (data written correctly, but errors during read)?
  - Are there duplicate copies of the data?
- Audit Volume can be used to check for consistency

Data Validation for Stored Objects
- TSM server embeds control information in stored data for integrity checking during client restore/retrieve, aggregate reconstruction, and Audit Volume
- Optionally, CRC checking can be enabled by storage pool
  - CRC is generated and stored with data as it is initially written to the storage pool volume
  - During Audit Volume, CRC is generated and compared with the stored CRC value
- Example:
  - Objects a, b, c, d, and e stored in aggregate X
  - Objects f and g stored in aggregate Y
  - Non-aggregated object h
[Figure: an object header is prepended before each object; frame headers and a trailer enclose each physical file (aggregate X, aggregate Y, and object h)]

Database Regression and References to Stored Objects
1. Database backup (storage pool volume holds files a b c)
2. Movement of files (reclamation moves a b c to other storage pool volumes)
3. Overwriting of storage (the original volume is reused for files d e f)
4. Database restore (the restored database still references a b c on the volume that now holds d e f)

- Regressing the database to an earlier point in time can lead to incorrect references for overwritten files
- Reuse delay ensures that empty sequential-access volumes are not overwritten for a specified time (should match the retention time of database backups)
- Use Audit Volume to detect inconsistencies
  - All random-access volumes
  - All reused/deleted volumes per volume history file

Storage Pools
- Why storage pools?
- Storage virtualization
- Random-access and sequential-access storage
- Copy storage pools
- Reclamation of offsite volumes

Why Storage Pools?
- Storage hierarchy exploits attributes of different device types
  - Performance
  - Concurrent access
  - Cost
  - Ability to remove media for remote storage
- Organization of storage into storage pools supports automatic, policy-based management of stored objects
[Figure: storage pool hierarchy with migration between levels, backup to a copy pool, and reclamation]

Storage Virtualization
[Figure: three mapping levels. Inventory maps object name and attributes to an object ID (15, 16, 19, 81, 82, 86, 98). Aggregation maps an object ID to an aggregate ID with offset/length (15, 16, 19 to aggregate 10; 81, 82, 86 to aggregate 80). Storage maps the ID of an aggregate or non-aggregated object (10, 80, 98) to storage pool, volume, and location]
- Each object is assigned a surrogate key, a unique 64-bit object ID
- If a file is aggregated, the object ID is mapped to one or more aggregates, each with a unique aggregate ID
- ID of an aggregate or non-aggregated object is mapped to storage locations, each consisting of storage pool, volume, and position
- Name-location independence allows movement and duplication of objects in storage, transparent to the user or application to which the data belongs
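The three mapping levels described under Storage Virtualization can be sketched with plain Python dictionaries. Everything here is an assumption made for illustration: the node, file, pool, and volume names, the offsets and lengths, and the helper functions `locate`, `move`, and `duplicate` are hypothetical, and the real server keeps these mappings in database tables rather than in memory.

```python
from typing import Dict, List, Optional, Tuple

Location = Tuple[str, str, int]  # (storage pool, volume, position)

# Inventory: end-user-visible name -> object ID (IDs taken from the slide figure).
inventory: Dict[Tuple[str, str, str], int] = {
    ("NODE1", "/home", "/a.txt"): 81,
    ("NODE1", "/home", "/b.txt"): 82,
    ("NODE1", "/home", "/c.txt"): 86,
    ("NODE1", "/home", "/big.iso"): 98,  # stored without aggregation
}

# Aggregation: object ID -> (aggregate ID, offset, length); sizes are made up.
aggregation: Dict[int, Tuple[int, int, int]] = {
    81: (80, 0, 100),
    82: (80, 100, 250),
    86: (80, 350, 50),
}

# Storage: ID of aggregate or non-aggregated object -> list of locations.
locations: Dict[int, List[Location]] = {
    80: [("DISKPOOL", "disk01", 0)],
    98: [("DISKPOOL", "disk01", 400)],
}


def locate(name: Tuple[str, str, str]) -> List[Tuple[str, str, int, Optional[int]]]:
    """Resolve a name to every stored copy as (pool, volume, position, length)."""
    object_id = inventory[name]
    stored_id, offset, length = aggregation.get(object_id, (object_id, 0, None))
    return [(pool, volume, position + offset, length)
            for pool, volume, position in locations[stored_id]]


def move(stored_id: int, from_pool: str, new_location: Location) -> None:
    """Move a stored object; the inventory and aggregation levels are untouched."""
    locations[stored_id] = [new_location if loc[0] == from_pool else loc
                            for loc in locations[stored_id]]


def duplicate(stored_id: int, copy_location: Location) -> None:
    """Record an additional copy, for example in a copy storage pool."""
    locations[stored_id].append(copy_location)
```

Because `move` and `duplicate` change only the last mapping, a client that asks for `("NODE1", "/home", "/b.txt")` keeps using the same name and object ID after a migration or a storage pool backup, which is the name-location independence the slide describes.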

Random-access and Sequential-access Storage
- Storage pools are classified as random-access or sequential-access
- Separate code paths for managing pools, volumes, and stored objects
- Some operations are supported for only one access type
  - Fundamental differences (e.g., caching supported for random-access only)
  - To reduce development/maintenance effort (e.g., copy pools are sequential-access only)
- Some processes have very different algorithms depending on access type (e.g., to optimize mounting/positioning of sequential-access volumes)

Random-access
- Primary pools only
- Block allocation
- Policies
  - Migration parameters
  - Cache?
  - Simultaneous write to copy pool?
  - Maximum physical file size
  - CRC data?

Sequential-access
- Primary or copy pools
- Volume selection
- Policies
  - Migration/reclamation parameters
  - Collocation?
  - Simultaneous write to copy pool?
  - Maximum physical file size
  - CRC data?
  - Maximum scratch volumes
  - Reuse delay

File-level Duplication Using Copy Pools
- Supports incremental backup of storage pool data
- Existing copies are still valid even if the primary volume is reclaimed or data is moved to another level in the hierarchy
- Primary and duplicate data can have different media/device types or capacities (short tapes)
- Device and platform independence (no reliance on system copy utilities)
- Copies can be made synchronously (during creation) or asynchronously (after creation)
- Performance of file-level duplication is improved through file aggregation
- Availability/recoverability at various levels
  - Damaged physical file
  - Primary volume
  - Storage pool
[Figure: storage pool hierarchy with migration between levels, backup to a copy pool, and reclamation]

Reclamation of Offsite Volumes
[Figure: server location (with the database) and offsite location (reclaimable offsite volumes and the database backup)]
1. Use database to identify remaining files on reclaimable volumes
2. Copy files from on-site location to new copy pool volumes
3. Update database to reference new location of files
4. Transport offsite
5. Transport onsite

- Automated data transfer per policy (steps 1-3 above)
- Allows space reclamation of volumes at offsite locations without a library
- Avoids risking data loss, which could occur if a volume were brought onsite for traditional reclamation
- Can be coupled with reuse delay for protection if the database is regressed

Database and Recovery Log
- Database objects and structure
- Database buffer pool
- Recovery log structure
- Recovery log utilization

Database Table Objects
- From the perspective of a TSM component that uses the database, the table schema describes key and data columns for the table
[Figure: a record consisting of key columns and data columns]
- Within the database, the table structure is a balanced tree (B+-tree), whose nodes are stored as database pages
[Figure: tree with a root, internal (non-leaf) nodes, and leaf nodes]

Database Bit-vector Objects
- From the perspective of a TSM component that uses the database, a bit-vector is a linear array of bits representing allocation of blocks on a disk storage pool volume
  11111100011111111000000
- Within the database, a bit-vector is stored as linked pages
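A bit-vector of the kind described above can be illustrated with a short Python sketch. The assumptions are loud ones: a bit of 1 is taken to mean "allocated" and 0 "free" (the slide does not say which), the vector is held as a plain string instead of linked database pages, and `find_free_run` and `allocate` are hypothetical helper names.

```python
def find_free_run(bits: str, n: int) -> int:
    """Return the index of the first run of n free ('0') blocks, or -1 if none."""
    return bits.find("0" * n)


def allocate(bits: str, start: int, n: int) -> str:
    """Return a new bit-vector with blocks start..start+n-1 marked allocated."""
    run = bits[start:start + n]
    if len(run) != n or set(run) != {"0"}:
        raise ValueError("requested blocks are not all free")
    return bits[:start] + "1" * n + bits[start + n:]


# The bit-vector shown on the slide, used here only as sample input.
volume_map = "11111100011111111000000"
first_fit = find_free_run(volume_map, 3)
if first_fit >= 0:
    volume_map = allocate(volume_map, first_fit, 3)
```

A first-fit search like this is only one possible allocation strategy; the slides do not state which strategy the server uses.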

Database Structure
[Figure: database layout with the page types below]
- smp: space map page every 4 MB (tracks allocated pages)
- ic: image copy header (persistent information for DB backup)
- dir: object directory
- root page: root page of a table or bit-vector object

Database Buffer Pool
- Memory cache for DB
- Defers database updates
- Buffer writer writes out DB pages so that they can be reused
- Checkpointer occasionally records all pages in the recovery log
[Figure: buffer pool pages (some empty) protected by a latch; the buffer writer writes pages to the database; checkpoints are written to the recovery log]

Recovery Log Structure

Physical segment   Segment 1   Segment 2   Segment 3   Segment 4
Logical segment    25          22          23          24
Size               1 MB        1 MB        1 MB        1 MB

Segments
- Log is composed of segments, each containing 256 4KB pages
- Logical segment numbers always increase
- Log will wrap

Log Sequence Number (LSN): 64 bits
- Logical segment number: 44 bits
- Page: 8 bits
- Offset: 12 bits
- Displayed as 3 dotted numbers: 23141.153.45

Log Modes and Location of Log Tail
[Figure: active records lie between the log tail and the log head]
- For normal mode, log tail is the oldest of these
  - Oldest active transaction
  - LSN for oldest page in buffer pool
  - Last checkpoint record (there MUST be a checkpoint in the active record area)
  - Start of currently running DB backup (log tail is pinned during backup)
- OR oldest transaction since last DB backup if the log is in rollforward mode
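The LSN layout above (44 + 8 + 12 = 64 bits) can be packed into and unpacked from a single integer. In this Python sketch the field order, with the logical segment number in the most significant bits, is an assumption consistent with the dotted display order, and the function names are hypothetical.

```python
SEGMENT_BITS, PAGE_BITS, OFFSET_BITS = 44, 8, 12  # field widths from the slide


def pack_lsn(segment: int, page: int, offset: int) -> int:
    """Combine the three fields into one 64-bit integer (segment in the high bits)."""
    if not 0 <= segment < (1 << SEGMENT_BITS):
        raise ValueError("segment out of range")
    if not 0 <= page < (1 << PAGE_BITS):
        raise ValueError("page out of range")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset out of range")
    return (segment << (PAGE_BITS + OFFSET_BITS)) | (page << OFFSET_BITS) | offset


def unpack_lsn(lsn: int) -> tuple:
    """Split a packed LSN back into (segment, page, offset)."""
    offset = lsn & ((1 << OFFSET_BITS) - 1)
    page = (lsn >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
    segment = lsn >> (PAGE_BITS + OFFSET_BITS)
    return segment, page, offset


def format_lsn(lsn: int) -> str:
    """Render an LSN as three dotted numbers, e.g. 23141.153.45."""
    return ".".join(str(part) for part in unpack_lsn(lsn))


def parse_lsn(text: str) -> int:
    """Parse the dotted form back into a packed LSN."""
    segment, page, offset = (int(part) for part in text.split("."))
    return pack_lsn(segment, page, offset)
```

The field widths line up with the segment geometry on the slide: 8 page bits address the 256 pages of a segment and 12 offset bits address the 4096 bytes of a page, so one segment spans 1 MB. With the segment in the high bits, packed LSNs also compare in log order, since logical segment numbers always increase.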

Recovery Log Utilization
[Figure: DB buffer pool (with empty pages), buffer writer, and checkpoint feeding the database and the recovery log; active log records lie between the log tail and the log head]
- Tortoise and hare problem: if the log head (hare) overtakes the tail (tortoise), the circular recovery log is full

Product Changes to Avoid Full Recovery Log
- Increased maximum recovery log size
- Cancellation of stalled sessions (Throughputdatathreshold / Throughputtimethreshold options)
- Enhanced database trigger
- Show Logpinned command to determine what is pinning the log
- Show Logstats and Show Logreset commands to determine log space consumed by checkpoint records
- Autonomic transaction throttling if recovery log approaches full condition
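The tortoise and hare problem can be made concrete with a small Python model of a circular log. This is a sketch under assumptions, not the server's implementation: positions are counted in whole segments, head and tail are ever-increasing logical numbers, and the names `CircularLog` and `normal_mode_tail` are hypothetical.

```python
from typing import Iterable, Optional


class CircularLog:
    """Toy model of a circular recovery log, measured in segments."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.head = 0  # next logical segment to be written (the hare)
        self.tail = 0  # oldest logical segment still needed (the tortoise)

    @property
    def used(self) -> int:
        return self.head - self.tail

    def utilization(self) -> float:
        return self.used / self.capacity

    def append(self, segments: int = 1) -> None:
        """Advance the head; fail if it would overtake the tail."""
        if self.used + segments > self.capacity:
            raise RuntimeError("recovery log full")
        self.head += segments

    def advance_tail(self, new_tail: int) -> None:
        """Release space; the tail only moves forward and never passes the head."""
        self.tail = max(self.tail, min(new_tail, self.head))


def normal_mode_tail(candidates: Iterable[Optional[int]]) -> int:
    """In normal mode the tail is the oldest (smallest) of the pinning positions.

    Candidates correspond to the list on the log-tail slide: oldest active
    transaction, oldest page in the buffer pool, last checkpoint record, and
    start of a running DB backup. None means that item is not present.
    """
    return min(c for c in candidates if c is not None)
```

In this model a long-running transaction or a running database backup keeps `normal_mode_tail` small, so `advance_tail` cannot free space and `append` eventually raises, which is the condition the product changes listed above are meant to avoid or diagnose.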

Conclusion
- TSM provides a comprehensive storage-management solution
- We've discussed some of the features that make this possible
  - Progressive backup
  - Object-level policy management
  - Aggregation of stored files for improved performance
  - Storage pools for hierarchical, policy-driven storage
  - Storage virtualization
  - Automated duplication of data and management of offsite volumes
  - Sophisticated database and recovery log