Tivoli Storage Manager Scalability Enhancements
Dave Cannon
Tivoli Storage Management Development
Oxford University TSM Symposium, September 2001

Agenda
- Recent enhancements
- Planned enhancements
- Potential future enhancements
Scalability Contributors

[Figure: TSM scalability rests on three contributors: performance, manageability, and resource sharing]

Recent Enhancements
- Subfile backup/restore
- Tivoli Data Protection for EMC
- Tivoli Data Protection for ESS
- LAN-free data movement
- Journal-based incremental backup
- 3590 performance for S/390
- TDS for Storage Management Analysis
- Increased recovery log size
- Volume-specific space reclamation
- TSM Management Console
- Tape library sharing
- LAN-free disk sharing with SANergy
- Unicode support
Planned Enhancements

These enhancements are planned:
- Move Nodedata command (V5.1)
- Export/import enhancements (V5.1+)
- Simultaneous writes to copy pools (V5.1)
- Windows 2000 image backup (V5.1)
- Multi-session restore (V5.1)
- New concepts (beginning in V4.2.1)
- TDP for NDMP (V4.2.1)
- SCSI-3 extended copy (V5.1)

Plans, schedules, and functional content are subject to change.

Move Nodedata Command

[Figure: consolidation of node BOB's data within a storage pool hierarchy, and movement of that data to disk]

    move nodedata bob fromstgpool=tapepool
    move nodedata bob fromstgpool=tapepool tostgpool=diskpool

- Moves files that belong to the specified node(s) and reside in a specified sequential-access storage pool
- Prepares for rapid client restore by:
  - Consolidating node data within a sequential-access pool
  - Moving data to disk for easy access
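Since move nodedata runs as a background server process, an administrator would typically monitor it and then verify the result. A minimal sketch using standard administrative query commands, with the node and pool names taken from the slide's example:

    move nodedata bob fromstgpool=tapepool tostgpool=diskpool
    query process                          (monitor the background data movement)
    query occupancy bob stgpool=diskpool   (verify where the node's data now resides)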
File-Filtering Options

Multiple nodes:

    move nodedata bob,ted fromstgpool=sourcepool tostgpool=targetpool

By file space:

    move nodedata bob filespace=/data,/work fromstgpool=sourcepool tostgpool=targetpool

By data type:

    move nodedata bob type=archive filespace=/data fromstgpool=sourcepool tostgpool=targetpool

If multiple nodes are specified, all file spaces are moved.

Data-Transfer Options

Aggregate reconstruction:

    move nodedata bob fromstgpool=sourcepool reconstruct=yes

Multiple processes:

    move nodedata bob fromstgpool=sourcepool tostgpool=targetpool maxprocess=2
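As a sketch of how these options might combine (the slides show each parameter separately; that they can all be specified together is an assumption, and the pool names here are hypothetical):

    move nodedata bob type=archive filespace=/data fromstgpool=oldtape tostgpool=newtape reconstruct=yes maxprocess=2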
Server-to-Server Export/Import

[Figure: data flows directly from the source server to the target server during a single export/import operation, with no export media in between]

- Export and import using server-to-server communication
- Export and import are performed in a single operation
- Eliminates the need for common media
- Eliminates copies to and from export media
- Avoids management and transportation of media
- Useful for splitting servers and for duplicating servers to achieve disaster-recovery protection

Server-to-Server Export/Import (cont.)

- Initiated with an export command on the source server:
  - Export Server
  - Export Node
  - Export Admin
  - Export Policy
- New export parameters:
  - Toserver
  - Previewimport
  - Replacedefs
  - Dates
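A sketch of what a direct server-to-server export might look like using the parameters listed above. The target server definition, its addresses, and the password are hypothetical; filedata=all follows the existing export syntax:

    define server drserver hladdress=dr.example.com lladdress=1500 serverpassword=secret
    export node bob filedata=all toserver=drserver previewimport=no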
Merging of File Spaces During Import

[Figure: a merge example keyed to the server's file insertion dates. A file inserted at T1 already exists on the target, so it is skipped; files inserted at T2 and T4 are new, so they are inserted. After the import the target holds versions T1-T4 with T4 active; because Verexists=3 permits three versions, the oldest (T1) is marked for expiration.]

- The Mergefilespaces parameter indicates whether import merges files into existing file spaces on the target server or generates new file spaces
- Allows transfer of backup and archive data in stages
- Provides restart capability

Incremental Export

[Figure: of the files inserted at T1-T4 on the source server, T1 and T2 fall outside the Fromdate/Todate window and are skipped; T3 and T4 are exported]

- Fromdate and Todate bound the insertion dates of the files to be exported: Fromdate specifies the earliest date and Todate the latest
- Coupled with file-space merging, this allows ongoing duplication of data between two servers
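Combining incremental export with file-space merging, ongoing duplication between two servers might look like the following sketch (the date format and the assumption that these parameters compose on one command are mine, not the slide's):

    export node bob filedata=all fromdate=09/01/2001 toserver=drserver mergefilespaces=yes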
Simultaneous Writes to Copy Pools

[Figure: files A, B, and C from a TSM client are written to the primary pool DISKPOOL and simultaneously to COPYPOOL1 and COPYPOOL2]

    define stgpool diskpool copystgpools=copypool1,copypool2

- Simultaneous writes to the primary pool and copy pool(s) during client backup, archive, and space-management operations
- Target pools can have different device classes
- Should be used in conjunction with incremental storage pool backup

Simultaneous Writes: Error Handling

- If an output error occurs while writing to the primary pool, all writes fail and the transaction rolls back
- If an output error occurs while writing to a copy pool:
  - Default behavior is to discontinue writing to that copy pool for the remainder of the session, but continue storing files into the primary pool and any other designated copy pools
  - Optional behavior is to fail all writes and end the session (all or nothing)
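The slides do not name the option that selects between these two error-handling behaviors. As a sketch, assuming a copycontinue parameter on the storage pool definition (an assumption) and the predefined DISK device class:

    define stgpool diskpool disk copystgpools=copypool1,copypool2 copycontinue=no   (all-or-nothing behavior)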
Windows 2000 Image Backup

[Figure: the TSM client (Windows 2000) provides the user interface and maps the disk blocks to be backed up on NTFS, FATx, or raw client volumes; data flows over the LAN, or LAN-free over the SAN, to the storage hierarchy on an OS/390, Windows NT/2000, AIX, Sun, or HP server]

- Optimized for backup/restore of an entire file system
- Fast, file-system image backup
- Minimal overhead for the TSM server database
- Online backup causes minimal disruption to applications
- Fast restore because there is no overhead for file creation
- Backup/restore can be over the LAN or LAN-free

Windows 2000 Image Backup (cont.)

Online backup:
- Requires the Logical Volume Snapshot Agent (LVSA)
- Uses a virtual snapshot to create a point-in-time image
- Steps:
  1. Quiesce applications (pre-snapshot command)
  2. Begin snapshot
  3. Resume application processing (post-snapshot command)
  4. Back up volume image
  5. End snapshot

Offline backup:
- Does not require the LVSA
- Steps:
  1. Quiesce applications
  2. Lock volume
  3. Back up volume image
  4. Unlock volume
  5. Resume application processing

Backup of used blocks:
- In-use blocks are sent to the server (possibly interleaved with occasional unused blocks for efficiency)
- Requires a file system (NTFS, FATx)

Backup of all blocks:
- All blocks in the file system are sent to the server
- Used for raw volumes
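A sketch of how an image backup and restore might be driven from the backup-archive client command line. The presnapshotcmd/postsnapshotcmd option names for the quiesce and resume steps are assumptions, as are the command-file names:

    dsmc backup image e: -presnapshotcmd="quiesce_app.cmd" -postsnapshotcmd="resume_app.cmd"
    dsmc restore image e: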
Multi-Session Restore

[Figure: three concurrent client-server sessions, each reading from its own storage pool volume]

- LAN-based restores use multiple client-server sessions for improved throughput
- Limited by:
  - The number of sequential-access volumes holding data to be restored
  - Mount points
  - The client's resourceutilization option (an options-file sketch follows the next slide)

Concept: Outboard Data Mover

[Figure: TSM issues a request to the data mover, which transfers data directly between the source and target devices; only control traffic touches the TSM client]

A TSM data mover:
- Is a named device, external to the TSM client or server
- Accepts requests from TSM to transfer data
- Reduces CPU cycles on the TSM client and server
- Avoids data movement over the LAN
- Examples: a NAS device, a SCSI-3 device
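A sketch of permitting more parallel restore sessions through the client options file; the value shown is illustrative:

    * dsm.opt (client options file)
    resourceutilization 5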
Concept: Data Format

[Figure: TAPEPOOL holds native-format data written by the TSM server; SCSIPOOL holds NBH-format data written by a SCSI-3 device; NASPOOL holds dump-format data written by a NAS device]

- The TSM server stores data in its own "native" format or in NBH format
- Outboard data movers may store data in other formats
- Each storage pool and data mover will have a designated data format
- Certain operations may be restricted for non-native storage pools

Concept: Path

[Figure: separate paths connect the TSM server, a storage agent, a NAS device, and a SCSI-3 device to a shared tape drive]

A TSM path consists of:
- A source and a target
- The method by which the source can access the target

Paths will replace:
- The Device parameter on library and drive definitions
- Drivemapping definitions for storage agents

Paths allow sharing of the target device for improved resource utilization and scalability.
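As a sketch of what a path definition might look like once paths replace the Device parameter. The slide names only the source, target, and access method; the remaining parameter names and the device special file are assumptions:

    define path server1 drive1 srctype=server desttype=drive library=lib1 device=/dev/rmt0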
TDP for NDMP Topology

TSM client:
- Displays NAS information
- Provides the user interface for backup/restore
- Monitors backup/restore
- Can cancel backup/restore

TSM server:
- Accepts requests from the client
- Provides backup/restore commands
- Runs backup/restore as a TSM process
- Initiates and monitors NDMP sessions
- Controls library/tape operations
- Stores meta-data for stored images
- Manages policy

NAS device (data mover):
- Accepts requests from the TSM server
- Performs tape/library operations
- Transfers data during backup/restore
- Reports results to the TSM server

[Figure: the client sends requests to the server over TCP/IP; the server controls the NAS device via NDMP over TCP/IP; the NAS device transfers data over SCSI/FC to a tape library, which can be shared; paths from the server to the drives are optional; the data format is NetApp dump]

TDP for NDMP Function

- NDMP-controlled backup of Network Appliance file servers running Data ONTAP 6.1.1 or higher:
  - Full file-system image
  - Differential file-system image (files that have changed since the last full backup)
- NDMP-controlled restore of:
  - A full file-system image
  - A full file-system image plus one differential file-system image
- Policy-based management of file-system images
- Data flow for backup/restore is LAN-free and outboard of the TSM client and server
- Parallel backup/restore operations when multiple NAS file systems are processed
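A sketch of the server-side setup and a full image backup, assuming definitions along these lines; the data mover name, addresses, credentials, library and drive names, NAS device file, and file-system path are all hypothetical:

    define datamover nasfiler type=nas hladdress=filer.example.com lladdress=10000 userid=root password=secret dataformat=netappdump
    define path nasfiler drive1 srctype=datamover desttype=drive library=naslib device=rst0l
    backup node nasfiler /vol/vol0 mode=full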
TDP for NDMP Function (cont.)

- SCSI-attached libraries controlled via:
  - Direct attachment to the TSM server
  - Passing of SCSI commands through the NAS device
- Sharing of tape drives:
  - Windows NT/2000 servers
  - UNIX servers planned for 2002
- Choice of user interfaces for initiating, monitoring, and canceling backup and restore operations:
  - Server console or administrative command-line client
  - Administrative web interface
  - Windows NT/2000, AIX, or 32-bit Sun Solaris client
  - Web client
- Scheduling of backup/restore operations using the administrative command scheduler (a sketch follows the SCSI-3 slide below)

SCSI-3 Extended Copy

[Figure: the TSM client (Windows 2000) provides the user interface and maps the disk blocks to be backed up; the TSM server (Windows NT/2000, AIX, or Sun) runs backup/restore as a background process, initiates and monitors the SCSI-3 copy, controls library/tape operations, and stores meta-data for stored images; a SCSI-3 device acting as data mover transfers data over the SAN between the client volumes (NTFS or raw) and the tape library; the data format is NBH]

- Features and benefits are similar to those of Windows 2000 image backup
- In addition, CPU cycles are offloaded from the TSM client and server, which has the potential to improve performance
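Returning to the scheduling item above: because NDMP operations run as server processes, a nightly backup might be set up with the administrative command scheduler as in this sketch (schedule, node, and file-system names are hypothetical):

    define schedule nasfull type=administrative cmd="backup node nasfiler /vol/vol0 mode=full" active=yes starttime=01:00 period=1 perunits=days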
Potential Future Enhancements

Examples of items that have been considered for possible future implementation, but are not currently in plan:
- Collocation enhancements
- Parallel migration
- Recovery log utilization
- Database reorganization
- Use of ESS/Timefinder for database backup
- TDP for NDMP extensions

Some Observations on Collocation

Volume capacity << size of collocation unit:
- No or insufficient collocation
- Scattering of data across many volumes
- Long restores

Volume capacity ~ size of collocation unit:
- Efficient tape utilization
- Efficient restores
- Efficient data transfer?

Volume capacity >> size of collocation unit:
- Collocation is too granular
- Either tapes are dedicated to a node, giving inefficient use of tapes and library slots, or nodes are mixed randomly on each tape, which can lead to inefficient internal data-transfer operations to maintain collocation:
  - Excessive mounting of target tapes
  - Multiple passes over input tapes
- As tape capacities increase, this case may become more common

[Figure: data for nodes A-E interleaved on a single source volume must be separated onto several filling target volumes to maintain collocation]
Collocation Groups

Define groups of nodes whose data will be collocated together on sequential media:

    define collocgroup a
    update node ted collocgroup=a
    update node sue collocgroup=a
    update node mary collocgroup=a

Group A: nodes TED, SUE, and MARY
Group B: nodes BOB, JOE, and ANN (a sketch defining group B follows the Parallel Migration slide)

[Figure: during migration, each group's data is written together, minimizing mounting of target volumes; during reclamation, grouping avoids multiple passes over source volumes and minimizes mounting of target volumes]

- For move/copy operations to sequential media, data transfer is performed by group
- Avoids multiple passes over source volumes and minimizes volume mounts
- Reduces the number of tapes required for effective collocation
- Increases the feasibility of collocated copy pools for offsite storage

Parallel Migration

Today: each process migrates data for a different set of nodes (for example, process 1 handles nodes BOB and SUE while process 2 handles node TED).

Enhancement: allow multiple processes to migrate data for the same node (for example, processes 1 and 2 both migrate data for node FILESERVER).

- Allows parallel processing of data for very large nodes
- Ensures participation by all processes until migration is complete
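Completing the collocation-group example from the previous slide, group B would be defined the same way, using the proposed syntax shown there:

    define collocgroup b
    update node bob collocgroup=b
    update node joe collocgroup=b
    update node ann collocgroup=b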
Recovery Log Utilization

[Figure: database buffer pool pages (empty, clean, or dirty) and the recovery log from tail to head; the buffer writer flushes dirty pages, and checkpoint records fall among the active log records]

- Maximum log size was increased from 5 GB to 13 GB in V4.2
- Potential future enhancements:
  - Provide a tool to diagnose the reason for a pinned recovery log
  - Reduce the log space occupied by checkpoint records:
    - Reduce checkpoint frequency
    - Avoid accumulation of checkpoint records in the log
  - Improve performance of the buffer writer

Database Reorganization

Benefits of database reorganization using the current unload/load (sketched after this slide):
- Reduces database space utilization
- Minimizes fragmented pages
- Improves performance for database scans

Potential enhancements:
- A tool to determine when database reorganization is needed
- Online, non-disruptive database reorganization
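The current unload/load reorganization is an offline procedure run with the dsmserv server utility while the server is halted. A minimal sketch, with the device class name hypothetical and the loadformat arguments omitted because they are platform- and configuration-specific:

    dsmserv unloaddb devclass=tapeclass
    dsmserv loadformat ...   (reinitialize the database and log volumes; arguments omitted)
    dsmserv loaddb devclass=tapeclass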
Use of ESS/Timefinder for DB Backup

IBM Enterprise Storage Server:
1. Quiesce DB activity
2. FlashCopy the TSM DB from the source volumes to the target volumes
3. Resume DB activity
4. Back up the database from the target volumes
(A scripted sketch of this sequence follows the Single-File Restore slide.)

EMC Timefinder:
1. Quiesce DB activity
2. Break the mirror between the primary volumes and the BCV volumes
3. Resume DB activity
4. Back up the database from the BCV volumes

Either approach gives a database backup with minimal impact to the TSM server.

TDP for NDMP: Single-File Restore

Restore of individual files or directories from a file-system image.

Basic restore:
- Administrator specifies the file name and directory
- The file server scans the backup to locate the specified file

Direct-access restore:
- TSM collects and stores file information during backup of the file-system image
- The client GUI displays the image contents and provides an interface for specifying the file(s) to be restored
- The file server positions to and restores the selected files
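As a sketch of how the ESS sequence might be scripted, every command below is a placeholder for a site-specific mechanism; the slide defines only the four steps:

    quiesce_tsm_db               (step 1: placeholder for quiescing TSM DB activity)
    flashcopy_source_to_target   (step 2: placeholder for the ESS FlashCopy operation)
    resume_tsm_db                (step 3: placeholder for resuming DB activity)
    backup_db_from_target        (step 4: placeholder for backing up the copied database)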
TDP for NDMP: 3-Way Configuration

- NAS device 1 does not need access to the tape drives
- Exploits tape libraries with integrated NDMP-server support
- Data flows over the LAN

[Figure: the TSM server controls NAS device 1 via NDMP over TCP/IP; data is transferred over TCP/IP to NAS device 2 (or to a tape library with integrated NDMP support), which writes it over SCSI/FC to the tape library]

TDP for NDMP: Filer to Server

- The NAS device does not need access to tape drives
- NAS data is stored in TSM's storage hierarchy in native data format
- Data flows over the LAN

[Figure: the TSM server controls the NAS device via NDMP over TCP/IP; data is transferred over TCP/IP into the server's storage hierarchy]
Other TDP for NDMP Enhancements

- Additional NAS vendors:
  - IBM
  - EMC Celerra
  - EMC CLARiiON IP4700
  - Auspex
- Additional tape libraries (V4.2.1 supports SCSI libraries only):
  - 3494
  - ACSLS
- Backup of NAS images to copy storage pools (V4.2.1 supports duplication by backing up to multiple primary pools under different node names)

Summary

- Recent releases of Tivoli Storage Manager have improved scalability through enhancements to:
  - Performance
  - Manageability
  - Resource sharing
- Future product releases will further improve scalability by offering new and improved functions and by exploiting emerging technologies