IBM Software Group
Dave Cannon
IBM Tivoli Storage Management Development
Oxford University TSM Symposium 2003

Presentation Objectives
- Explain TSM behavior for selected operations
- Describe design goals and rationale
- Point out tradeoffs
- Provide guidance for administrators
Topics
- Basics
- Policy management
- Storage of objects
- Storage pools
- Database and recovery log

Basics
- TSM highlights
- Architecture
- Progressive backup
TSM Highlights
- Automated, policy-driven storage management
- Distributed, heterogeneous clients
- Centralized storage-management servers
- Support for numerous storage devices
- Comprehensive storage-management function:
  - Backup/recovery
  - Archive/retrieve
  - Space management (HSM)
  - Data protection for specific applications
  - Application Programming Interface (API)
  - Content management

Architecture
- TSM clients (backup/archive clients, HSM clients, API applications, and administrative clients) connect to the TSM server
- The server database holds information on users, administrators, policy, and the location of objects in the storage hierarchy
- The server recovery log maintains information about database transactions so they can be committed or rolled back atomically, maintaining referential integrity in the database
- The storage hierarchy is a collection of devices in which the TSM server stores client objects
Progressive Backup
- After the initial backup, each object is backed up only once:
  - Reduces network traffic
  - Avoids unnecessary copies of the same data
  - Reduces impact to the client application
- Consolidates client data on few tapes:
  - Reduces storage requirements
  - Improves restore performance
- Ongoing cycle: incremental backups, file expiration, automated space reclamation

Policy Management
- Object-level policy management
- Constructs for policy management
- Inventory management
- Versioning and retention
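Returning to progressive backup: the selection rule (send only objects that are new or changed since the last backup) can be sketched as below. This is an illustrative model, not TSM's actual implementation; the inventory here is a simple name-to-(mtime, size) map standing in for the server database.

```python
# Hypothetical sketch of progressive ("incremental forever") backup selection:
# only files that are new or changed since the last backup are sent.
# inventory maps file name -> (mtime, size) recorded at the last backup.

def select_for_backup(client_files, inventory):
    """Return names of client files that must be backed up."""
    changed = []
    for name, (mtime, size) in client_files.items():
        if inventory.get(name) != (mtime, size):
            changed.append(name)  # new file, or attributes differ
    return changed

inventory = {"a.txt": (100, 10), "b.txt": (200, 20)}
client = {"a.txt": (100, 10), "b.txt": (250, 22), "c.txt": (300, 5)}
print(select_for_backup(client, inventory))  # ['b.txt', 'c.txt']
```

Unchanged file a.txt is skipped entirely, which is the source of the network and storage savings listed above.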
Object-level Policy Management
- Not all data is the same:
  - Some client nodes may be more critical or have different storage requirements than other nodes
  - Data types (backup, archive, space-managed) may need to be managed differently
  - Even for the same node and data type, some objects may have different requirements than others
- Manual management of stored data is expensive and error-prone
- TSM provides a granular, automated mechanism for administering policy:
  - Node specificity is achieved by assigning the node to a policy domain
  - Data-type specificity is achieved through backup/archive copy groups and HSM policy
  - Object specificity is achieved by binding each object to a management class when that object is first stored on TSM
- Policy attributes for the assigned management class can later be changed
- Backup objects can be rebound to a different management class

Constructs for Policy Management
- Nodes (application servers, workstations) are assigned to policy domains; each domain contains management classes, and each management class can contain a backup copy group, an archive copy group, and HSM policy
- Backup copy group: destination storage pool; what if the file is in use? enforce frequency? back up only if modified? how many versions? how long to retain?
- Archive copy group: destination storage pool; what if the file is in use? how long to retain?
- HSM policy: destination storage pool; backup required before migration? days before migration? migration technique?
Inventory Management
- The TSM database contains an inventory (catalog) of "end-user-visible" attributes (node, file space, name, policy) for managed client objects
- Objects tracked in the inventory include directories, files, file system images, and delta objects (subfiles)
- Each distinct object is assigned a unique 64-bit object identifier that is used for operations on that object
- Via expiration, the TSM server administers policy rules for retention and versioning of client objects

Versioning of Backup Objects
- During backup:
  - If there is no corresponding object on the server, the new object becomes the active version
  - If a corresponding object already exists on the server, the new object becomes the active version and the existing active version is deactivated
  - Extraneous versions are marked for expiration (base date set to 0)
- The number of allowed versions of a backup object is determined by the object's management class and copy group
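The versioning rules above can be sketched as follows. This is an illustrative model rather than TSM's actual code: versions are kept newest-first, and any versions beyond the allowed count (VEREXISTS) are marked for expiration.

```python
# Illustrative sketch of backup versioning: the new object becomes the
# active version, the previous active version is deactivated, and
# extraneous versions beyond VEREXISTS are marked for expiration
# (the slides note TSM does this by setting the base date to 0).

def backup(versions, new_id, verexists):
    """Insert a new active version for an object; versions is newest-first."""
    for v in versions:
        if v["state"] == "active":
            v["state"] = "inactive"
    versions.insert(0, {"id": new_id, "state": "active"})
    for v in versions[verexists:]:   # extraneous versions
        v["state"] = "marked"
    return versions

vs = []
for oid in (10, 20, 30):             # three successive backups, VEREXISTS=2
    backup(vs, oid, 2)
print([(v["id"], v["state"]) for v in vs])
# [(30, 'active'), (20, 'inactive'), (10, 'marked')]
```

With VEREXISTS=2, the third backup leaves the oldest version marked for expiration, matching the behavior described on the slide.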
Expiration of Objects Based on Versioning or Retention
- During backup, extraneous versions are marked for subsequent expiration based on:
  - VEREXISTS (applies to a backup object that still exists on the client system)
  - VERDELETED (applies to a backup object deleted from the client system)
- Once an object has been marked for expiration, it cannot be queried from the client or restored, but its database entries are not removed until expiration runs
- During expiration, an object is deleted if it was previously marked for expiration or if its retention period has been exceeded:
  - RETEXTRA (inactive backup object with other versions)
  - RETONLY (inactive backup object with no other versions)
  - RETVER (archive object)
- When do objects "disappear" from the client's view?
  - During backup (extraneous versions)
  - During expiration (objects whose retention time has elapsed)

Example of Versioning and Retention (VEREXISTS=2, RETEXTRA=30)

  Insertion Date       Object Id  State     Expiration Base Date
  12:00:00 01/01/2002  10         Inactive  0
  14:00:00 01/01/2002  20         Inactive  08:00:00 01/02/2002
  08:00:00 01/02/2002  30         Active    None

- Object 10 is an extraneous version and would be expired during the next expiration operation
- Object 20 would be eligible for expiration at 08:00:00 02/01/2002 (30 days after the base date) if it is not versioned before then
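The expiration decision for a single backup version can be sketched as below. This is a hedged simplification of the rules above (times are in whole days, and the object record fields are illustrative, not TSM's schema).

```python
# Sketch of expiration eligibility for one backup version.
# base_date == 0 means the version was marked for expiration at backup time.

def eligible_for_expiration(obj, now, retextra, retonly):
    """True if the version should be deleted by the expiration process."""
    if obj["base_date"] == 0:        # extraneous version, already marked
        return True
    if obj["state"] != "inactive":
        return False                 # active versions are never expired
    # RETEXTRA applies when other versions exist, RETONLY when none do
    retention = retextra if obj["has_other_versions"] else retonly
    return now - obj["base_date"] >= retention

# Object 20 from the example: deactivated on day 2, RETEXTRA=30
obj = {"base_date": 2, "state": "inactive", "has_other_versions": True}
print(eligible_for_expiration(obj, now=10, retextra=30, retonly=60))  # False
print(eligible_for_expiration(obj, now=32, retextra=30, retonly=60))  # True
```

At day 32 the 30-day RETEXTRA period has elapsed, so the version becomes eligible, mirroring the 08:00:00 02/01/2002 date in the example.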
Delta-Base Versioning and Expiration (example assumes VEREXISTS=2; B = base file, D = delta file)
1. Backup of entire file: B is V1 (active)
2. Backup of delta file: V1 (inactive), D is V2 (active)
3. Backup of delta file: V1 (marked for expiration), V2 (inactive), V3 (active)
4. Backup of delta file: V1 (marked for expiration), V2 (marked and expired), V3 (inactive), V4 (active)
5. Backup of entire file: V1 (marked for expiration), V3 (marked and expired), V4 (inactive), new base V5 (active)
6. Backup of delta file: V1 (marked and expired), V4 (marked and expired), V5 (inactive), V6 (active)
- Once a base file is marked for expiration, it cannot be restored by itself
- Expiration does not delete a base file with dependent deltas (deletion of the base may not occur until the next expiration process after deletion of the last delta)

Storage of Objects
- File aggregation
- Movement of active files only?
- Data validation for stored objects
- Database regression and references to stored objects
What is File Aggregation?
- The TSM server groups client objects into aggregates during backup or archive
- Information about individual client objects is maintained and used for certain operations (e.g., deletion, retrieval)
- For many operations, especially internal data transfer, the entire aggregate can be processed as a single entity
- Aggregate size and the number of client objects per aggregate can be controlled
- A physical file may be a non-aggregate containing a single logical file (a), or an aggregate containing multiple logical files (b-f)

Why Aggregate Files? Performance!
- Aggregates can be transferred without examining constituent files
- Reduced overhead for database updates during transfer operations, because storage information is not updated for each logical file (this also reduces database size)
- Improved storage device performance, because data is transferred in larger units
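A minimal sketch of aggregation, assuming a simple in-memory model (not TSM's on-media format): logical files are concatenated into one physical file, and an offset table records where each logical file lives so it can still be addressed individually.

```python
# Sketch of building an aggregate: concatenate logical files and record
# each file's (offset, length) so individual files remain addressable.

def build_aggregate(logical_files):
    """logical_files: dict of name -> bytes. Returns (blob, offset_table)."""
    blob = b""
    table = {}
    for name, data in logical_files.items():
        table[name] = (len(blob), len(data))  # (offset, length)
        blob += data
    return blob, table

blob, table = build_aggregate({"b": b"BB", "c": b"CCC", "d": b"D"})
print(table)  # {'b': (0, 2), 'c': (2, 3), 'd': (5, 1)}
```

Internal transfers can then move `blob` as one unit, updating one database entry instead of one per logical file, which is the performance argument made above.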
Logical File Storage and Retrieval
- Storage: (1) the client sends logical files; (2) the server aggregates and stores the objects; (3) the server stores metadata in the database
- Retrieval: (1) the client requests file h; (2) the server gets metadata from the database, including the offset of h within its aggregate; (3) the server performs a partial-object retrieve; (4) the server sends file h to the client

Reclaiming Space via Aggregate Reconstruction
- Over time, logical files within an aggregate may be deleted, leaving wasted space
- Remaining logical files are moved in "regions"
- The reconstructed aggregate contains all remaining logical files, without empty space
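Both operations above can be sketched against the kind of offset table described earlier. This is an illustrative model: a partial-object retrieve reads one logical file out of the aggregate, and reconstruction rewrites the aggregate keeping only the surviving files, with no empty space.

```python
# Sketch of partial-object retrieve and aggregate reconstruction over a
# simple (offset, length) table; not TSM's actual on-media format.

def retrieve(blob, table, name):
    """Partial-object retrieve: read one logical file from the aggregate."""
    offset, length = table[name]
    return blob[offset:offset + length]

def reconstruct(blob, table, deleted):
    """Rebuild the aggregate without the deleted logical files."""
    new_blob, new_table = b"", {}
    for name, (offset, length) in table.items():
        if name not in deleted:
            new_table[name] = (len(new_blob), length)
            new_blob += blob[offset:offset + length]
    return new_blob, new_table

blob = b"AABBBCC"
table = {"a": (0, 2), "b": (2, 3), "c": (5, 2)}
print(retrieve(blob, table, "b"))               # b'BBB'
blob2, table2 = reconstruct(blob, table, {"b"})
print(blob2, table2)                            # b'AACC' {'a': (0, 2), 'c': (2, 2)}
```

Note that reconstruction changes the offsets of the surviving files, which is exactly why copies in other pools stop matching; that is the problem the next slides address with aliases.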
Problem: Invalidation of Copies
- Aggregate X (primary, files a b c d) is duplicated to a copy pool; X is available and recoverable from the copy pool
- After reconstruction, the primary becomes aggregate Y (files a c d), but the copy pool still holds X; the copy does not match, so Y is not available and recoverable from the copy pool
- The problem occurs because aggregates in different storage pools are not reconstructed at the same time
- Since Y is not found in the copy pool, should it be backed up?

Solution: Aggregate Aliases
- Aliases are aggregates that contain exactly the same files, but at different offsets
- After reconstruction, X and Y are recorded as aliases, so Y is available and recoverable from the copy pool
- A new backup of Y is not required, because its alias X already exists in the copy pool
- Files in X can be accessed if Y is unavailable
- Y can be restored from X
Benefits of Aggregate Aliases
- Aliases reconcile the performance benefits of aggregation and reconstruction/space reclamation with the availability and recovery benefits of copy storage pools

Movement of Active Files Only?
- Moving only active files during internal data-transfer operations (storage pool backup, migration) would eliminate many of the benefits of file aggregation:
  - The server would need to examine the contents of each aggregate to determine whether it contains any active files
  - An aggregate could no longer be treated as a single entity during transfer operations
  - Some aggregates contain a mixture of active and inactive files
- The server could transfer aggregates that contain at least one active file, but this could result in the transfer of many inactive files
- What about reorganizing aggregates to segregate the active and inactive files?
  - Ongoing file deactivation would leave inactive files in the "active" aggregates
  - Aggregate aliases would not work, because there would be no correspondence between the aggregates in different storage pools
- If TSM were to provide an option to disable aggregation and allow movement of active files only, the performance impact would be significant
Consistency of Database and Stored Objects
- TSM uses its database to locate objects in storage
- Data integrity requires that database information be consistent with the data stored on storage pool volumes
- Inconsistency can occur for reasons such as:
  - Hardware errors
  - Media damage or degradation
  - Storage pool volumes that have been overwritten
  - A database that has been regressed to an earlier version
- If inconsistencies are detected or suspected, consider:
  - Are the errors permanent (errors written to media) or temporary (data written correctly, but errors during read)?
  - Are there duplicate copies of the data?
- Audit Volume can be used to check for consistency

Data Validation for Stored Objects
- The TSM server embeds control information in stored data for integrity checking during client restore/retrieve, aggregate reconstruction, and Audit Volume
- Optionally, CRC checking can be enabled by storage pool:
  - A CRC is generated and stored with the data as it is initially written to the storage pool volume
  - During Audit Volume, a CRC is generated and compared with the stored CRC value
- Example: objects a-e stored in aggregate X; objects f and g stored in aggregate Y; non-aggregated object h
  - An object header is prepended before each object
  - Frame headers and a trailer delimit each physical file (aggregate X, aggregate Y, and object h)
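The CRC check described above can be illustrated with a short sketch. TSM's actual frame format and CRC algorithm are not shown on the slide; this example uses CRC-32 from Python's standard library purely as a stand-in.

```python
import zlib

# Illustration of CRC data validation: store a CRC with the data when it is
# written, then recompute and compare during an audit.

def write_with_crc(data):
    """Return the data plus the CRC stored alongside it."""
    return data, zlib.crc32(data)

def audit(data, stored_crc):
    """True if the recomputed CRC matches the stored value."""
    return zlib.crc32(data) == stored_crc

data, crc = write_with_crc(b"aggregate X payload")
print(audit(data, crc))                    # True: data intact
print(audit(b"aggregate X pavload", crc))  # False: corruption detected
```

A single flipped byte changes the CRC, so Audit Volume can distinguish intact data from media corruption without comparing against a second copy.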
Database Regression and References to Stored Objects
1. Database backup: the database is backed up while files a, b, c reside on a storage pool volume
2. Movement of files: reclamation moves a, b, c to another storage pool volume
3. Overwriting of storage: files d, e, f are written to the reused volume
4. Database restore: restoring the database backup regresses references, so the database again points at a, b, c on the volume that now contains d, e, f
- Regressing the database to an earlier point in time can lead to incorrect references for overwritten files
- Reuse delay ensures that empty sequential-access volumes are not overwritten for a specified time (this should match the retention time of database backups)
- Use Audit Volume to detect inconsistencies on:
  - All random-access volumes
  - All reused/deleted volumes per the volume history file

Storage Pools
- Why storage pools?
- Storage virtualization
- Random-access and sequential-access storage
- Copy storage pools
- Reclamation of offsite volumes
Why Storage Pools?
- The storage hierarchy exploits attributes of different device types:
  - Performance
  - Concurrent access
  - Cost
  - Ability to remove media for remote storage
- Organization of storage into storage pools supports automatic, policy-based management of stored objects: migration down the storage pool hierarchy, backup to copy pools, and reclamation

Storage Virtualization
- Each object is assigned a surrogate key, a unique 64-bit object ID, recorded in the inventory along with the object name and attributes
- If the file is aggregated, the object ID is mapped to one or more aggregates, each with a unique aggregate ID
- The ID of the aggregate or non-aggregated object is mapped to storage locations, each consisting of storage pool, volume, and position
- Name-location independence allows movement and duplication of objects in storage, transparent to the user or application to which the data belongs
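The three-level mapping described above can be sketched with plain dictionaries. The table contents and the arithmetic (adding the logical file's offset to the aggregate's position) are illustrative assumptions, not TSM's actual schema.

```python
# Sketch of storage virtualization: name -> object ID (inventory),
# object ID -> aggregate + offset/length (aggregation),
# aggregate ID -> pool/volume/position (location). All values illustrative.

inventory   = {"/home/a": 15, "/home/b": 16, "/home/c": 19}
aggregation = {15: (10, 0, 2), 16: (10, 2, 3), 19: (10, 5, 2)}  # (aggr, off, len)
location    = {10: ("POOL1", 80, 4096)}  # (storage pool, volume, position)

def locate(name):
    """Resolve a client object name to a physical storage location."""
    obj_id = inventory[name]
    aggr_id, offset, length = aggregation[obj_id]
    pool, volume, position = location[aggr_id]
    return pool, volume, position + offset, length

print(locate("/home/b"))  # ('POOL1', 80, 4098, 3)
```

Because clients only ever hold the name (and the server the object ID), the location table can be rewritten when data moves or is duplicated without the client noticing, which is the name-location independence the slide describes.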
Random-access and Sequential-access Storage
- Storage pools are classified as random-access or sequential-access
- Separate code paths exist for managing pools, volumes, and stored objects
- Some operations are supported for only one access type:
  - Fundamental differences (e.g., caching is supported for random-access only)
  - To reduce development/maintenance effort (e.g., copy pools are sequential-access only)
- Some processes have very different algorithms depending on access type (e.g., to optimize mounting/positioning of sequential-access volumes)
- Random-access: primary pools only; block allocation; policies include migration parameters, cache?, simultaneous write to copy pool?, maximum physical file size, CRC data?
- Sequential-access: primary or copy pools; volume selection; policies include migration/reclamation parameters, collocation?, simultaneous write to copy pool?, maximum physical file size, CRC data?, maximum scratch volumes, reuse delay

File-level Duplication Using Copy Pools
- Supports incremental backup of storage pool data
- Existing copies remain valid even if the primary volume is reclaimed or the data is moved to another level in the hierarchy
- Primary and duplicate data can have different media/device types or capacities (short tapes)
- Device and platform independence (no reliance on system copy utilities)
- Copies can be made synchronously (during creation) or asynchronously (after creation)
- Performance of file-level duplication is improved through file aggregation
- Availability/recoverability at various levels: damaged physical file, primary volume, storage pool
Reclamation of Offsite Volumes
1. Use the database to identify the files remaining on reclaimable offsite volumes
2. Copy those files from their on-site location to new copy pool volumes
3. Update the database to reference the new location of the files
4. Transport the new volumes (with a database backup) offsite
5. Transport the emptied volumes back onsite
- Data transfer (steps 1-3) is automated per policy
- Allows space reclamation of volumes at offsite locations without a library
- Avoids risking data loss, which could occur if a volume were brought onsite for traditional reclamation
- Can be coupled with reuse delay for protection if the database is regressed

Database and Recovery Log
- Database objects and structure
- Database buffer pool
- Recovery log structure
- Recovery log utilization
Database Table Objects
- From the perspective of a TSM component that uses the database, a table schema describes the key columns and data columns for each record in the table
- Within the database, the table structure is a balanced tree (B+-tree) with a root, internal (non-leaf) nodes, and leaf nodes, stored as database pages

Database Bit-vector Objects
- From the perspective of a TSM component that uses the database, a bit-vector is a linear array of bits representing allocation of blocks on a disk storage pool volume (e.g., 11111100011111111000000)
- Within the database, a bit-vector is stored as linked pages
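A bit-vector of this kind supports simple block allocation. The sketch below is a generic first-fit allocator over such a vector, offered as an illustration of the data structure rather than TSM's actual allocation algorithm.

```python
# Sketch of block allocation over an allocation bit-vector:
# one bit per block, 1 = allocated, 0 = free.

def first_fit(bits, nblocks):
    """Mark the first run of nblocks free bits allocated; return its start."""
    run = 0
    for i, b in enumerate(bits):
        run = run + 1 if b == 0 else 0
        if run == nblocks:
            start = i - nblocks + 1
            for j in range(start, i + 1):
                bits[j] = 1
            return start
    return None  # no free extent large enough

bits = [1, 1, 0, 0, 0, 1, 0]
print(first_fit(bits, 2))  # 2
print(bits)                # [1, 1, 1, 1, 0, 1, 0]
```

Storing the vector as linked database pages, as the slide notes, lets it grow with the volume while still being read and updated page by page.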
Database Structure
- The object directory (dir) maps to the root pages of tables and bit-vectors
- The image copy header (ic) holds persistent information for DB backup
- A space map page (smp) every 4 MB tracks allocated pages

Database Buffer Pool
- A memory cache for database pages, protected by a latch
- Defers database updates
- The buffer writer writes out dirty DB pages, making them available again
- The checkpointer occasionally records all pages in the recovery log
Recovery Log Structure
- The log is composed of segments, each containing 256 4KB pages (1 MB)
- Physical segments (e.g., 1-4) are reused as the log wraps; logical segment numbers (e.g., 22-25) always increase
- Log Sequence Number (LSN): 64 bits, composed of a logical segment number (44 bits), page (8 bits), and offset (12 bits)
- An LSN is displayed as three dotted numbers, e.g., 23141.153.45

Log Modes and Location of Log Tail
- Active records lie between the log tail and the log head
- For normal mode, the log tail is the oldest of:
  - The oldest active transaction
  - The LSN for the oldest page in the buffer pool
  - The last checkpoint record (there MUST be a checkpoint in the active record area)
  - The start of a currently running DB backup (the log tail is pinned during backup)
- If the log is in rollforward mode, the tail can instead be held by the oldest transaction since the last DB backup
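The 64-bit LSN layout described above (44 + 8 + 12 = 64 bits) can be packed and unpacked with simple shifts. A small sketch, using the field widths from the slide:

```python
# Sketch of the LSN layout: 44-bit logical segment number, 8-bit page,
# 12-bit offset, packed into a single 64-bit integer.

def pack_lsn(segment, page, offset):
    assert segment < 2**44 and page < 2**8 and offset < 2**12
    return (segment << 20) | (page << 12) | offset

def unpack_lsn(lsn):
    return lsn >> 20, (lsn >> 12) & 0xFF, lsn & 0xFFF

def display(lsn):
    """Dotted form used on the slide, e.g. 23141.153.45."""
    return "%d.%d.%d" % unpack_lsn(lsn)

lsn = pack_lsn(23141, 153, 45)
print(display(lsn))  # 23141.153.45
```

Because logical segment numbers always increase, packed LSNs compare as plain integers, so "older than" is a single comparison even after the log wraps.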
Recovery Log Utilization
- The recovery log is circular: active log records lie between the log tail and the log head
- The buffer writer and checkpointer allow the tail to advance as pages are written from the DB buffer pool; new log records advance the head
- Tortoise and hare problem: if the log head (hare) overtakes the tail (tortoise), the circular recovery log is full

Product Changes to Avoid a Full Recovery Log
- Increased maximum recovery log size
- Cancellation of stalled sessions (Throughputdatathreshold / Throughputtimethreshold options)
- Enhanced database trigger
- Show Logpinned command to determine what is pinning the log
- Show Logstats and Show Logreset commands to determine log space consumed by checkpoint records
- Autonomic transaction throttling if the recovery log approaches a full condition
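Returning to log utilization: the tortoise-and-hare condition reduces to modular arithmetic on head and tail positions. A minimal sketch, assuming positions are simple byte offsets into a circular log (an illustration of the concept, not TSM's bookkeeping):

```python
# Sketch of circular-log utilization: the active region runs from tail to
# head, wrapping at log_size; the log is full when used space reaches capacity.

def log_used(head, tail, log_size):
    """Bytes of active log records between tail and head (log wraps)."""
    return (head - tail) % log_size

log_size = 1_000_000
print(log_used(head=700_000, tail=200_000, log_size=log_size))  # 500000
print(log_used(head=100_000, tail=800_000, log_size=log_size))  # 300000 (wrapped)
```

This is why a pinned tail (a long-running transaction or DB backup) is dangerous: the head keeps advancing while the tail cannot, so used space grows until the log fills, which the product changes above are designed to prevent.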
Conclusion
- TSM provides a comprehensive storage-management solution
- We have discussed some of the features that make this possible:
  - Progressive backup
  - Object-level policy management
  - Aggregation of stored files for improved performance
  - Storage pools for hierarchical, policy-driven storage
  - Storage virtualization
  - Automated duplication of data and management of offsite volumes
  - A sophisticated database and recovery log