Understanding Disk Storage in Tivoli Storage Manager




Understanding Disk Storage in Tivoli Storage Manager
Dave Cannon, Tivoli Storage Manager Architect
Oxford University TSM Symposium, September 2005

Disclaimer
- Unless otherwise noted, functions and behavior described in this presentation apply to Tivoli Storage Manager 5.3, but may differ in past or future product levels.
- Description of potential future product enhancements does not constitute a commitment to deliver the described enhancements or to do so in a particular timeframe.

Agenda
Tivoli Storage, IBM Software Group
- Background
- Random- and sequential-access disk in TSM
- Special considerations for sequential-access disk
- Potential future enhancements for disk storage

Disk vs. Tape
Potential disk advantages:
- Faster access by avoiding delays for tape mounts and positioning
- Reduced management cost (no tape handling)
- Reliability (RAID; no tape robotics or media failures)
Potential tape advantages:
- Removability/portability for off-site storage (disaster recovery)
- High-speed data transfer for large objects
- Cost effectiveness (especially for long-term, off-site archiving)
- Reliability (less susceptible to hardware failure or file system corruption)
A tiered approach with copies on offsite tape exploits the strengths of both disk and tape.

Industry Trend Toward Increasing Use of Disk
- Lower cost of disk storage (ATA, SATA)
- Promotion of disk-based appliances and solutions (EMC, Network Appliance)
- Virtual tape library (VTL) products comprising preconfigured disk systems that emulate tape
- Disk-based technologies: replication, snapshots, continuous data protection (CDP)

TSM Is Designed for Disk in a Storage Hierarchy
- Disk has been an integral part of the TSM data storage hierarchy since 1993
- Virtualization of disk volumes in a storage pool allows objects to be stored across multiple volumes and file systems
- Automatic, policy-based provisioning of disk storage pool space, and allocation of that space during store operations
- Automatic, policy-based migration to tape or other media types in a tiered hierarchy
- Incremental backup of objects from the primary disk pool to a tape copy pool for availability or offsite vaulting
- Objects are automatically accessed in the copy pool if they are not available in the primary storage pool
[Diagram: a client stores data into the storage pool hierarchy; data migrates down the hierarchy and is backed up to a copy storage pool; the TSM database tracks the location of files in data storage]
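The policy-based migration described above can be sketched as follows. This is an illustrative toy model only, not TSM code; the pool names, capacities, and thresholds are invented:

```python
# Toy sketch of TSM-style policy-based migration in a storage hierarchy:
# when a pool's occupancy exceeds its high threshold, data moves to the
# next tier until occupancy falls to the low threshold.

class Pool:
    def __init__(self, name, capacity, high=90, low=70, next_pool=None):
        self.name = name
        self.capacity = capacity
        self.used = 0
        self.high = high            # start migration at this % occupancy
        self.low = low              # stop migration at this % occupancy
        self.next_pool = next_pool
        self.objects = []           # (object_id, size)

    def pct_used(self):
        return 100.0 * self.used / self.capacity

    def store(self, obj_id, size):
        self.objects.append((obj_id, size))
        self.used += size
        self.migrate_if_needed()

    def migrate_if_needed(self):
        # Move the oldest objects to the next tier until occupancy
        # drops to the low threshold.
        if self.next_pool is None or self.pct_used() < self.high:
            return
        while self.objects and self.pct_used() > self.low:
            obj_id, size = self.objects.pop(0)
            self.used -= size
            self.next_pool.store(obj_id, size)

tape = Pool("TAPEPOOL", capacity=10_000)
disk = Pool("DISKPOOL", capacity=100, high=90, low=70, next_pool=tape)

for i in range(10):
    disk.store(f"obj{i}", 10)     # fill the disk pool; migration kicks in

print(disk.used, tape.used)       # 80 20
```

Crossing the 90% threshold triggers migration down to 70%, after which new stores land on disk again, which is the behavior that lets disk absorb concurrent backups while tape absorbs the overflow.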

Disk Usage Trend in TSM
Traditional disk usage:
- LAN-based data transfer between client and disk storage
- Data initially stored on disk to allow concurrent client backups without tape delays
- Backup from disk to a tape copy storage pool for availability and disaster recovery
- Most data migrated to tape within 24 hours
Emerging disk usage:
- Increasing interest in LAN-free transfer between client and disk
- Data initially stored on disk to allow concurrent client backups without tape delays
- Backup from disk to a tape copy storage pool (may be the main reason for tape)
- Data may be stored on disk indefinitely

A Detour on File Aggregation
- The TSM server groups client objects into aggregates during backup or archive
- Information about individual client objects is maintained and used for certain operations (e.g., deletion, retrieval)
- For internal data transfer operations (migration, storage pool backup), the entire aggregate is processed as a single entity for greatly improved performance
- Over time, wasted space accumulates within an aggregate as logical files are deleted; aggregate reconstruction recovers that space
[Diagram: a non-aggregate physical file holds a single logical file; an aggregate physical file holds multiple logical files; deleting logical files leaves gaps in the aggregate until reconstruction compacts the survivors]
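Aggregate reconstruction, as described above, amounts to copying only the surviving logical files forward. A minimal sketch (invented structures, not actual TSM internals):

```python
# An aggregate modeled as a list of (logical_file_name, size) pairs.
# Reconstruction keeps only the logical files that have not been
# deleted, recovering the wasted space.

def reconstruct(aggregate, deleted):
    """Return a new aggregate containing only the surviving logical files."""
    return [(name, size) for name, size in aggregate if name not in deleted]

aggregate = [("a", 5), ("b", 3), ("c", 8), ("d", 2)]
deleted = {"b", "d"}

before = sum(size for _, size in aggregate)
compacted = reconstruct(aggregate, deleted)
after = sum(size for _, size in compacted)

print(compacted)        # [('a', 5), ('c', 8)]
print(before - after)   # 5 units of wasted space recovered
```

Until reconstruction runs, the deleted files' space stays inside the physical file, which is why long-term storage without reclamation accumulates waste.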

Agenda
- Background
- Random- and sequential-access disk in TSM
- Special considerations for sequential-access disk
- Potential future enhancements for disk storage

Overview of Random- and Sequential-Access Disk
- TSM supports two methods for storing and accessing data on magnetic disk:
  - Random-access storage pools (also known as DISK pools)
  - Sequential-access storage pools (also known as FILE pools)
- Random- and sequential-access disk pools differ in how TSM manages disk storage and in the operations that are supported
- TSM development views sequential-access disk as strategic:
  - Current functions on random-access disk will be supported for the foreseeable future
  - Future product enhancements involving disk storage may be offered only for sequential-access disk

Basics of Random- and Sequential-Access Disk

                                        Random-Access Disk                     Sequential-Access Disk
Storage pool definition                 Predefined device class DISK           Device class with device type of FILE
Pools spanning multiple file systems    Supported                              Supported
Storage pool volumes                    Files or raw logical volumes           Files
Volume creation                         Define Volume command; space trigger   Define Volume command; space trigger; scratch volumes
TSM caching                             Supported                              Not supported
Use for copy storage pool               Not supported                          Supported

Space Allocation and Tracking
Random access:
- Space is allocated in randomly located 4 KB blocks
- The TSM server tracks the volumes and blocks on which each file is stored
- A bit vector in the TSM database tracks allocated and free blocks for each volume
- Space allocation and tracking requires overhead and may not scale well for extremely large files
Sequential access (advantage):
- Files are written sequentially within a FILE volume
- The TSM database tracks only the volume and offset at which each file is stored
- TSM has less overhead for space allocation and tracking
[Diagram: on random-access disk, the database records the scattered block locations of each file via a bit vector; on sequential-access disk, it records only the location and length of each file]
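The contrast between the two bookkeeping schemes can be sketched as below. This is a simplified illustration with invented class names, not TSM's actual data structures:

```python
# Random-access pools track a per-volume bitmap of 4 KB blocks and must
# record every block a file lands on; sequential-access pools record
# only (offset, length) per file.

BLOCK = 4096  # random-access space is allocated in 4 KB blocks

class RandomVolume:
    def __init__(self, blocks):
        self.free = [True] * blocks   # bit vector of free blocks

    def allocate(self, size):
        """Grab enough free 4 KB blocks, wherever they happen to be."""
        needed = -(-size // BLOCK)    # ceiling division
        got = []
        for i, is_free in enumerate(self.free):
            if is_free:
                self.free[i] = False
                got.append(i)
                if len(got) == needed:
                    return got        # the whole block list goes in the DB
        raise RuntimeError("volume full")

class SequentialVolume:
    def __init__(self):
        self.offset = 0               # next write position

    def append(self, size):
        """Write at the current end; the DB stores just (offset, length)."""
        start = self.offset
        self.offset += size
        return start

rv = RandomVolume(blocks=256)
print(rv.allocate(10_000))            # [0, 1, 2] — three 4 KB blocks

sv = SequentialVolume()
print(sv.append(10_000))              # 0
print(sv.append(5_000))               # 10000
```

A multi-gigabyte file needs hundreds of thousands of block entries in the random-access scheme but a single (offset, length) record in the sequential one, which is the scaling difference the slide is pointing at.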

Concurrent Volume Access
Random access (advantage):
- Multiple TSM sessions or processes can concurrently use the same disk volume
- However, individual I/O operations for each volume are serialized
Sequential access:
- A disk volume is locked by the single process or session using that volume
- Other operations cannot access the volume until the lock is released, usually when the locking operation has completed all of its work on the volume
- To avoid volume contention, sequential-access volume sizes should be much smaller than random-access volume sizes
[Diagram: restore and backup sessions for several nodes sharing one random-access volume, versus each session holding its own sequential-access volume]

LAN-free Backup/Restore
Random access: Not supported
Sequential access (advantage): Supported, using either of the following to control shared access to sequential disk volumes:
- SANergy
- Storage pool volumes on SAN FS (supported in TSM 5.3.1)
LAN-free transfer reduces CPU cycles on the TSM server and moves network traffic from the LAN to the SAN.
An alternative approach for LAN-free operation would be a virtual tape library (VTL) appliance.
[Diagram: the client/storage agent sends data over the SAN to disk while control information flows over the LAN to the server]

Multi-Session Restore
- Multi-session restore allows only one session for all random-access disk volumes
- Multi-session restore allows one session per sequential-access volume (advantage)
- Multi-session restore is performed only for no-query restore (NQR) operations
[Diagram: a single session between client and server for random-access disk, versus one session per volume for sequential-access disk]

Migration
Random access:
- High/low migration thresholds are based on the percentage occupancy of the pool
- If the node is grouped and the target pool is collocated by group, parallel migration processes each work on a different group; otherwise, they each work on a different node
- Optimized for transfer by node and file space, making it an ideal intermediate buffer for transfer from non-collocated tape to collocated tape (e.g., restore from a copy pool to a collocated tape pool)
Sequential access:
- High/low migration thresholds are based on the percentage of volumes containing data
- Parallel migration processes each work on a different source volume, possibly dividing work more evenly among processes
- Collocated sequential disk can be used as a buffer for transfer from non-collocated tape to collocated tape
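The difference between the two migration triggers can matter in practice. A sketch with invented numbers (not TSM's actual calculation):

```python
# Random-access migration triggers on pool occupancy; sequential-access
# migration triggers on the percentage of volumes that contain any data.

def random_pool_should_migrate(used_mb, capacity_mb, highmig_pct):
    return 100.0 * used_mb / capacity_mb >= highmig_pct

def file_pool_should_migrate(volumes_with_data, total_volumes, highmig_pct):
    return 100.0 * volumes_with_data / total_volumes >= highmig_pct

# A FILE pool of 20 volumes where 19 volumes are each only ~10% full:
# overall occupancy is low, but 95% of volumes contain data, so the
# sequential-access trigger fires while the occupancy trigger would not.
print(random_pool_should_migrate(used_mb=190, capacity_mb=2000, highmig_pct=90))  # False
print(file_pool_should_migrate(volumes_with_data=19, total_volumes=20, highmig_pct=90))  # True
```

This is why the last slide's list of potential enhancements includes occupancy-based thresholds for sequential-access pools: many lightly filled volumes can trigger migration long before the pool is actually full.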

Storage Pool Backup
Random access:
- If the node is grouped and the target pool is collocated by group, parallel backup processes each work on a different group; otherwise, they each work on a different node
- Each physical file (aggregate or non-aggregated file) must be checked during every storage pool backup
Sequential access (advantage):
- Parallel backup processes each work on a different source volume
- Optimization: for each primary pool volume and copy pool, the database stores the offset up to which the volume has already been backed up, so there is no need to recheck during each backup
- This optimization can be especially important for long-term storage of data on disk
[Diagram: a volume backed up only up to the recorded offset]
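The offset optimization can be sketched in a few lines. The structures here are invented for illustration; TSM's actual bookkeeping lives in its database:

```python
# Per (volume, copy pool), remember the offset already copied; an
# incremental storage pool backup then copies only data appended after
# that point, instead of rechecking every physical file.

def backup_stgpool(volume_data, backed_up_offset):
    """Return (newly_copied_bytes, new_backed_up_offset)."""
    new_data = volume_data[backed_up_offset:]
    return len(new_data), len(volume_data)

volume = b"aaaa" + b"bbbb"           # 8 bytes written so far
copied, offset = backup_stgpool(volume, backed_up_offset=0)
print(copied, offset)                # 8 8 — the first backup copies everything

volume += b"cccc"                    # 4 more bytes appended later
copied, offset = backup_stgpool(volume, backed_up_offset=offset)
print(copied, offset)                # 4 12 — only the new data is copied
```

Because sequential volumes are append-only, a single stored offset is enough to know what is new; random-access volumes reuse blocks in place, so no such watermark exists and every file must be checked.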

Space Recovery
Random access:
- Space is recovered when a physical file is moved to another pool (if caching is not enabled); space occupied by cached data is recovered as needed
- Space is recovered when a physical file is deleted (for aggregates, all files in the aggregate must be deleted)
- There is no reconstruction of empty space within aggregates, a disadvantage if aggregated files are stored for long periods of time; empty space accumulates until the entire aggregate is deleted
Sequential access (advantage):
- Space is not immediately recovered after movement or deletion
- Reclamation recovers empty space created by deletion or movement of files
- Empty space within aggregates is recovered by aggregate reconstruction during reclamation
[Diagram: deleted logical files leave gaps in an aggregate; aggregate reconstruction compacts the surviving files]
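Reclamation eligibility can be sketched as a simple threshold check. The data layout and threshold value are invented for illustration:

```python
# A FILE volume becomes eligible for reclamation when the percentage of
# reclaimable (no-longer-referenced) space reaches the pool's
# reclamation threshold; surviving files are moved off and the volume
# is returned to use.

def pct_reclaimable(volume):
    total = sum(size for _, size, _ in volume)
    dead = sum(size for _, size, alive in volume if not alive)
    return 100.0 * dead / total

def reclaim(volume):
    """Keep only the files that are still referenced."""
    return [f for f in volume if f[2]]

# Entries are (name, size, still_referenced).
vol = [("a", 40, False), ("b", 30, True), ("c", 30, False)]

print(round(pct_reclaimable(vol)))   # 70
if pct_reclaimable(vol) >= 60:       # assumed reclamation threshold of 60%
    vol = reclaim(vol)
print(vol)                           # [('b', 30, True)]
```

Unlike tape reclamation, no mounts are involved, which is why the later slides recommend running disk reclamation regularly and with multiple processes.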

Fragmentation
Three sources of fragmentation affect disk storage pools:
1. Aggregate fragmentation caused by expiration of files
   - Random-access disk: empty space accumulates in aggregates until all logical files in the aggregate are deleted; this may result in wasted space for long-term storage on disk
   - Sequential-access disk (advantage): empty space is recovered by aggregate reconstruction during reclamation
2. Fragmentation of space within TSM volumes caused by deletion of physical files
   - Random-access disk: volume fragmentation can also occur due to allocation of multiple extents if the client's size estimate is too low; fragmentation can degrade performance, but is relieved by migration if TSM caching is not used
   - Sequential-access disk: deletion of physical files results in empty space within volumes, but this is recovered during reclamation
3. File system fragmentation leading to fragmentation of the files that constitute TSM volumes
   - Random-access disk: fragmentation is usually minimal because volumes are predefined or created by a space trigger
   - Sequential-access disk: use of scratch volumes causes fragmentation because volumes are extended as needed; this can be avoided either by predefining volumes or by using a space trigger

Database Regression
Scenario:
1. A database backup runs with references to files a, b, c
2. Files a, b, c are deleted and their space is overwritten by d, e, f
3. The database is restored and now holds invalid references to a, b, c
Random access:
- After database regression, all volumes must be audited
- This may be time-consuming for large DISK pools (for example, pools used for long-term data storage)
Sequential access (advantage):
- After database regression, only volumes that were reused or deleted after the database backup must be audited
- With REUSEDELAY set, volume audits can be avoided completely
- Time delays for volume audits during critical recovery operations can be minimized or eliminated
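The audit decision for sequential volumes can be sketched as a filter on volume metadata. The field names are invented for illustration; TSM tracks this state internally:

```python
# After restoring a database backup taken at time T, only sequential
# volumes that were reused or deleted after T can hold data that
# contradicts the restored database; only those need an audit. With a
# REUSEDELAY at least as long as the backup's age, no volume has been
# reused and the list is empty.

def volumes_to_audit(volumes, db_backup_time):
    return [v["name"] for v in volumes
            if v["reused_or_deleted_at"] is not None
            and v["reused_or_deleted_at"] > db_backup_time]

vols = [
    {"name": "vol001", "reused_or_deleted_at": None},   # never reused
    {"name": "vol002", "reused_or_deleted_at": 95},     # reused before the DB backup
    {"name": "vol003", "reused_or_deleted_at": 110},    # reused after the DB backup
]

print(volumes_to_audit(vols, db_backup_time=100))   # ['vol003']
```

Random-access volumes offer no such shortcut: blocks are rewritten in place with no reuse timestamp, so every volume must be audited after a regression.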

Random vs. Sequential Disk: Which Is Best?
- Traditional disk usage (LAN-based storage to disk; daily migration from disk to tape): either random or sequential, depending on requirements
- LAN-free data transfer between client and disk storage: sequential (or possibly a VTL)
- Long-term storage of data on disk: sequential offers significant advantages
  - Reconstruction recovers space in aggregates
  - Optimized storage pool backup
  - Reduced volume fragmentation
  - Multi-session restore
  - Avoidance of volume audits
- Exploitation of new disk storage features: sequential-access disk may be required

Agenda
- Background
- Random- and sequential-access disk in TSM
- Special considerations for sequential-access disk
- Potential future enhancements for disk storage

Optimizing Space Efficiency
Scratch volumes:
- Volumes are created and extended only as needed
- Space is conserved at the expense of file-system fragmentation
Non-scratch volumes (created by the Define Volume command or a space trigger):
- No collocation is more space-efficient
- Smaller volumes are more space-efficient
Reclamation should be performed regularly to recover space:
- Efficient because no mount/dismount is needed
- Many volumes can be reclaimed concurrently
[Chart: space efficiency increases with smaller volume sizes and with lower-granularity collocation (none, group, node, filespace)]

Reducing Volume Contention
- High-granularity collocation (by node or filespace) reduces contention
- Smaller volume sizes reduce contention
[Chart: volume contention decreases with smaller volume sizes and with higher-granularity collocation (none, group, node, filespace)]

Improving Client Restore Performance
- For no-query restore operations (used for most large restores), database scanning is greatly reduced if data is well collocated
- Multi-session restore operations achieve greater parallelism if data is spread over multiple sequential-access volumes, so parallelism may be increased by lower-granularity collocation and smaller volumes
[Charts: NQR database processing decreases with higher-granularity collocation; multi-session parallelism increases with smaller volume sizes and lower-granularity collocation]
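The parallelism bound described above can be sketched in one function. The numbers and the simplified model are invented for illustration:

```python
# For a no-query restore from sequential-access disk, the usable session
# count is bounded by how many volumes hold the node's data and by the
# client's session limit; on random-access disk it is effectively one
# session regardless.

def restore_sessions(volumes_with_data, max_sessions, random_access=False):
    if random_access:
        return 1                     # one session for all DISK volumes
    return min(volumes_with_data, max_sessions)

print(restore_sessions(8, max_sessions=4))                      # 4
print(restore_sessions(2, max_sessions=4))                      # 2
print(restore_sessions(8, max_sessions=4, random_access=True))  # 1
```

This is why smaller volumes and lower-granularity collocation help multi-session restore: both spread a node's data across more volumes, raising the first term of the minimum.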

Avoiding Fragmentation
- Perform reclamation regularly
- Avoid the use of scratch volumes:
  - Predefine volumes using the Define Volume command
  - Use a space trigger to provision additional volumes as needed

Striking a Balance
Configuration of sequential-access pools involves tradeoffs, but the following may be a reasonable starting point for most environments:
- Define volumes and use space triggers for additional volume provisioning
- Collocate by node
- Use a volume size scaled to the size of the stored objects:
  - For file systems, a volume size of 500 MB to 10 GB
  - For databases and other large objects, a volume size of 100 GB
- Set the reclamation threshold at 20-60% and allow multiple reclamation processes
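As a hedged illustration only, the starting point above might translate into administrative commands along these lines. The device-class, pool, and directory names are invented, and exact parameters vary by TSM level, so verify against the Administrator's Reference for your release before use:

```
/* Illustrative sketch: sequential-access (FILE) pool with predefined */
/* volumes, a space trigger, collocation by node, reclamation at 60%  */
DEFINE DEVCLASS filedc DEVTYPE=FILE DIRECTORY=/tsm/filepool MAXCAPACITY=5G
DEFINE STGPOOL filepool filedc MAXSCRATCH=0 COLLOCATE=NODE RECLAIM=60 NEXTSTGPOOL=tapepool
DEFINE VOLUME filepool /tsm/filepool/vol001.dsm FORMATSIZE=5000
DEFINE SPACETRIGGER STG FULLPCT=90 SPACEEXPANSION=25 EXPANSIONPREFIX=/tsm/filepool/ STGPOOL=filepool
```

MAXSCRATCH=0 avoids scratch-volume fragmentation per the previous slide, while the space trigger provisions additional predefined volumes when the pool reaches 90% full.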

Agenda
- Background
- Random- and sequential-access disk in TSM
- Special considerations for sequential-access disk
- Potential future enhancements for disk storage

Potential Future Enhancements for Disk Storage
Enhancements specifically for sequential-access disk pools:
- Migration thresholds based on percentage occupancy rather than volumes containing data
- Support for raw logical volumes
- Concurrent read access for volumes
- Performance improvements on the z/OS server
Other potential enhancements:
- Collocation of active data (probably sequential-access disk only)
- Management of redundant files
- Data shredding (overwrite) as data is deleted from disk
- Additional snapshot support
- Additional exploitation of continuous data protection (CDP) technology