Tivoli Storage Manager Scalability: Past and Present
Dave Cannon
IBM Storage Systems Division
Tucson, Arizona

Scalability Contributors
  Servers
  Clients
  Performance
  Network
  DB
  Storage pools
  Hardware resources
  Time constraints
  Manageability
  Administrative staff
  Total cost of ownership

Scalability Enhancements
  ADSM 2.1 (1995): DB performance, storage pool backup, DB backup, admin schedules
  ADSM 3.1 (1997): file aggregation, SQL interface
  ADSM 3.1.2 (1998): enterprise configuration, command routing, event logging, command scripts
  TSM 3.7 (1999): image backup, backup sets, multi-session client, DB unload/load, improved DB audit, self tuning, MVS server changes, NT server changes
  Future possibilities (?): Tivoli Decision Support, SAN exploitation, hardware integration, server-server export, volume transfer, move node data, collocation groups, DB page allocation, self-healing DB, asynch transactions, self-tuning sessions, subfile backup, redundant files, opportunistic backup, server serialization, AIX server changes

ADSM 2.1
  Database performance improvements
  Storage pool backup and restore
  Database backup and restore
  Administrative command schedules

Database Performance Improvements
  Enhanced column encode/decode processing
  Word alignment of keys and records
  (Figure: database page containing a key directory and records)

Storage Pool Backup and Restore
  (Figure: primary storage pools are backed up to copy storage pools, which can be vaulted off-site)
  Incremental backup of primary storage pools
  Can be automated using administrative schedules
  Backup copies not affected by movement of primary files
  Assists with the management of offsite volumes
  Recovery options
    Client can directly access copies without first restoring to primary pool
    Restore of file, volume, or entire storage pool

Database Backup and Restore
  Full or incremental backup while server is active
  Can be automated using administrative schedules
  DB backup trigger causes automatic backup if log fills
  Recovery options
    Point in time
    Point of failure
    Single database volume
  (Figure: ADSM database recovered from full backup + incrementals + recovery log records)

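The recovery options can be illustrated with a small planning routine. This is a minimal sketch, not the server's actual logic: it assumes a point-in-time restore applies the latest full backup at or before the target time plus the incrementals that follow it, and that a point-of-failure restore additionally replays recovery log records. The function name `restore_plan` and the tuple layout are hypothetical.

```python
def restore_plan(backups, target_time=None):
    """Return the ordered list of items to apply for a database restore.

    backups: list of (time, kind) tuples, kind is "full" or "incr" (hypothetical layout).
    target_time: restore to this point in time; None means point of failure.
    """
    ordered = sorted(backups)
    eligible = [b for b in ordered if target_time is None or b[0] <= target_time]
    fulls = [b for b in eligible if b[1] == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the target time")
    base = fulls[-1]
    # Latest full backup, then every later incremental that is still eligible.
    plan = [base] + [b for b in eligible if b[1] == "incr" and b[0] > base[0]]
    if target_time is None:
        # Assumption: point of failure also replays the recovery log.
        plan.append((None, "recovery log"))
    return plan


BACKUPS = [(1, "full"), (2, "incr"), (3, "incr"), (5, "full"), (6, "incr")]
print(restore_plan(BACKUPS, target_time=3))
print(restore_plan(BACKUPS))
```

The two calls show the difference between the options: the first stops at the chosen time, the second uses the newest backup series and ends with the log.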
Administrative Command Schedules
  ADSM 2.1 provided capability to schedule any administrative command
    Automation of server operations (e.g., migration, storage pool backup)
    Server monitoring via query commands
  Capability was extended with introduction of
    Foreground processing option (ADSM 3.1)
    Server command scripts (ADSM 3.1.2)
  (Figure: clock triggering "Execute Administrative Command")

ADSM 3.1
  File aggregation
  Server SQL interface

File Aggregation
  Server groups client files into aggregates during backup or archive
  File level granularity for retrieve and delete operations
  For many operations, entire aggregate can be processed as a single entity
    Storage pool migration
    Reclamation
    Storage pool backup/restore
    Move Data command
  (Figure: small and large client files A to F are sent to the server and stored together in the storage pool)

Database Information for Stored Files
  Non-aggregated files
    Inventory: file name, directory, ... for each client file a, b, c
    Storage: one entry for each of a, b, c
  Aggregated files
    Inventory: file name, directory, ... for each client file a, b, c
    Mapping: a, b, c belong to Aggr X
    Storage: one entry for Aggr X

Performance Benefits of Aggregation
  Reduced overhead for database updates (storage information not updated for each logical file during move/copy operations)
    (Figure: without aggregation, files a, b, c, d, e each update the database; with aggregation, a single update covers abcde)
  Improved storage device performance because data is transferred in larger units
    (Figure: without aggregation, a, b, c, d, e are transferred separately; with aggregation, abcde is transferred as one unit)

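The database-update saving can be made concrete with a toy model. The following is a minimal sketch under stated assumptions: the size limit `max_bytes`, the `Aggregate` class, and the packing rule are hypothetical and only illustrate the idea that a move or copy touches one storage record per aggregate instead of one per file.

```python
from dataclasses import dataclass, field


@dataclass
class Aggregate:
    """A group of client files stored and moved as one physical object."""
    name: str
    files: list = field(default_factory=list)
    size: int = 0


def build_aggregates(files, max_bytes):
    """Pack (name, size) pairs, in arrival order, into aggregates of at most max_bytes.

    A file larger than max_bytes ends up alone in its own aggregate.
    """
    aggregates = []
    current = None
    for name, size in files:
        if current is None or current.size + size > max_bytes:
            current = Aggregate(name=f"aggr{len(aggregates) + 1}")
            aggregates.append(current)
        current.files.append(name)
        current.size += size
    return aggregates


def storage_updates_for_move(files, aggregates=None):
    """Count storage records updated when all of the data is moved."""
    return len(files) if aggregates is None else len(aggregates)


FILES = [("a", 10), ("b", 20), ("c", 30), ("d", 15), ("e", 25)]
one = build_aggregates(FILES, max_bytes=100)
print(storage_updates_for_move(FILES), "updates without aggregation")
print(storage_updates_for_move(FILES, one), "update(s) with aggregation")
```

With the sample sizes all five files fit in one aggregate, which mirrors the "abcde" picture on the slide; a smaller limit simply produces more aggregates.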
Server SQL Interface
  (Figure: the following query reaches the ADSM database through the Select command or ODBC)
    select node_name, sum(logical_mb) as DATA_IN_MB, sum(num_files) as NUM_OF_FILES
    from occupancy
    group by node_name
    having min(num_files) >= 1
    order by data_in_mb desc
  Features
    Administrative Select command
    ODBC driver
  Increases manageability by allowing customized reporting and analysis

Administrative Select Command
  (Figure: "select * from LIBVOLUMES" run against database tables such as LIBVOLUMES and BACKUPS)
    LIBRARY_NAME             VOLUME
    -----------------------  -------------
    DLTLIB                   DLT01
    DLTLIB                   DLT02
    DLTLIB                   DTLLIB
  Select command
    ADSM administrative command
    Consistent with other relational database products
  System catalog tables implemented in ADSM database to assist administrator
  Confirmation message allows resource-intensive queries to be canceled before execution

SQL Select Statement Example
  adsm> select node_name, sum(logical_mb) as DATA_IN_MB, sum(num_files) as NUM_OF_FILES from occupancy group by node_name having min(num_files)>= 1 order by data_in_mb desc

  NODE_NAME   DATA_IN_MB   NUM_OF_FILES
  ---------   ----------   ------------
  BAYKAL          136.25           3570
  YANGTZE          50.32            698
  LOIRE            30.59             11
  POLONIUM          2.01             24
  DANUBE            0.59             13

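The shape of this query can be tried outside the server. The sketch below runs the same statement against an in-memory SQLite table; SQLite stands in for the ADSM database here, the `occupancy` table has only the columns the query needs plus a hypothetical `filespace_name`, and the per-filespace rows are made-up sample data.

```python
import sqlite3

QUERY = """
select node_name,
       sum(logical_mb) as DATA_IN_MB,
       sum(num_files)  as NUM_OF_FILES
from occupancy
group by node_name
having min(num_files) >= 1
order by data_in_mb desc
"""

# Made-up rows: one row per node and filespace (hypothetical layout).
SAMPLE_ROWS = [
    ("LOIRE", "/", 30.59, 11),
    ("BAYKAL", "/home", 100.0, 3000),
    ("BAYKAL", "/usr", 36.25, 570),
    ("EMPTY", "/", 0.0, 0),  # excluded by the HAVING clause
]


def run_report(rows):
    """Load rows into a throwaway SQLite database and run the report query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table occupancy ("
        "node_name text, filespace_name text, logical_mb real, num_files integer)"
    )
    conn.executemany("insert into occupancy values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for node, data_mb, files in run_report(SAMPLE_ROWS):
    print(f"{node:<10} {data_mb:>10.2f} {files:>12}")
```

The HAVING clause drops any node that has a filespace with no files, and the ORDER BY on the alias puts the largest consumers first.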
ODBC Interface
  Graphical interface to SQL is available in many products
    Lotus Approach
    Microsoft Access
  ODBC driver available for Win32 client
    Windows 95
    Windows NT 3.51 and 4.0
  ODBC Version 2.5 API

ADSM 3.1.2
  Enterprise configuration
  Enterprise command routing
  Enterprise event logging
  Server command scripts

Enterprise Configuration
  Allows configuration and policy information to be defined once for any number of servers
  Configuration Manager
    Server that centrally stores information that is inherited by other servers
    Information "pushed" when updated
    Can be "dedicated" or regular server
  Managed Server
    Server that inherits information from other servers
    Information "pulled" at startup and merged with locally defined policy
    Information locally cached
  Simplified management of multiple servers
  (Figure: one ADSM configuration manager serving several ADSM managed servers)

Propagation of Object Definitions
  (Figure: on the configuration manager, objects are associated with profiles (profile 1, profile 2); managed servers subscribe to a profile and receive the propagated objects)
  Propagated object definitions
    Administrators
    Policy domains
    Administrative schedules
    Command scripts
    Client option sets
    Servers
    Server groups

Benefits of Enterprise Configuration
  Simplifies task of maintaining multiple ADSM servers with similar configurations
  Provides mechanism for propagating designated configuration items between servers
  Automatic propagation using server-to-server communication
  Caching of configuration information on target server

Enterprise Command Routing
  (Figure: a Web admin console or command-line client routes commands to ServerA, ServerZ, and the server groups HQ_group and Dev_group)
    ServerA,ServerZ: q session
    Dev_group,ServerA: q session
    HQ_group,Dev_group: q session
  Commands routed for execution on one or multiple servers
  Groups of servers can be predefined and named
  Command output returned to issuer
  Supports single, unified logon
  Simplifies control and monitoring of multiple servers throughout the enterprise

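The routing prefix in the examples can be read as a list of destinations followed by a colon and the command. The sketch below parses that form and expands group names into servers. It is only an illustration: the parser, the `GROUPS` table, and the membership of each group (ServerB, ServerC, ServerD) are hypothetical and not taken from the product.

```python
# Hypothetical group membership, for illustration only.
GROUPS = {
    "HQ_group": ["ServerA", "ServerB"],
    "Dev_group": ["ServerC", "ServerD"],
}


def parse_routed_command(text, groups):
    """Split 'dest1,dest2: command' and expand group names into server names.

    Returns (servers, command); each server appears once, in first-seen order.
    """
    targets, sep, command = text.partition(":")
    if not sep:
        raise ValueError("expected '<servers or groups>: <command>'")
    servers = []
    for name in (t.strip() for t in targets.split(",")):
        # A name that is not a known group is treated as a single server.
        for server in groups.get(name, [name]):
            if server not in servers:
                servers.append(server)
    return servers, command.strip()


print(parse_routed_command("Dev_group,ServerA: q session", GROUPS))
print(parse_routed_command("HQ_group,Dev_group: q session", GROUPS))
```

Removing duplicates matters when a server is named directly and is also a member of a listed group, so the command is issued to it only once.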
Enterprise Event Logging
  (Figure: events from clients, single servers, and network servers flow to receivers; figure labels: Global Report, User Exit, Event Viewer, Server Activity Log, NT Event Log, File, NetView, MVS, SNMP Manager)
  Client and server events can be sent to any selected receiver
  Events can be enabled/disabled based on severity and message id
  Simplifies monitoring of client/server events throughout the enterprise
  Provides consolidated reporting

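Enabling and disabling events per receiver can be pictured as a simple filter. The following is a minimal sketch under stated assumptions: the severity names, the message ids such as MSG0001, and the rule that a disabled message id overrides an enabled severity are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

SEVERITIES = ("INFO", "WARNING", "ERROR", "SEVERE")  # hypothetical severity names


@dataclass(frozen=True)
class Event:
    msg_id: str
    severity: str
    text: str


class Receiver:
    """An event destination with its own enable/disable rules."""

    def __init__(self, name, severities=(), enabled_ids=(), disabled_ids=()):
        self.name = name
        self.severities = set(severities)
        self.enabled_ids = set(enabled_ids)
        self.disabled_ids = set(disabled_ids)
        self.received = []

    def wants(self, event):
        # Assumption: an explicit disable by message id wins over everything else.
        if event.msg_id in self.disabled_ids:
            return False
        return event.msg_id in self.enabled_ids or event.severity in self.severities


def dispatch(event, receivers):
    """Deliver the event to every receiver that has it enabled."""
    delivered = []
    for receiver in receivers:
        if receiver.wants(event):
            receiver.received.append(event)
            delivered.append(receiver.name)
    return delivered


activity_log = Receiver("activity log", severities=SEVERITIES)
snmp = Receiver("snmp", severities=("ERROR", "SEVERE"), disabled_ids={"MSG0002"})
RECEIVERS = [activity_log, snmp]
print(dispatch(Event("MSG0001", "INFO", "session started"), RECEIVERS))
print(dispatch(Event("MSG0003", "SEVERE", "volume unavailable"), RECEIVERS))
```

Keeping the rules on each receiver lets one receiver collect everything for consolidated reporting while another sees only the serious events.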
Server Command Scripts
  (Figure: a scheduled script runs "Backup stgpool backuppool drmpool wait=yes" followed by "Backup DB dev=tape...")
  Defined sequence of server commands stored in server database
  Supports parameter substitution at execution time
  Supports conditional branch logic based on return code
  Can be scheduled as an administrative command
  Sample scripts are delivered with server
  Simplifies administration by automating server operations

Example Script
/* EXAMPLE - Sample Macro to perform actions using conditional logic */
/* Step 1: see if any sessions are running */
QUERY SESSION
/* If Sessions Are running, reschedule the macro */
/* Note that we get rc_ok when sessions are running and rc_notfound */
/* when there are no sessions */
if(rc_ok) goto reschedule
/* */
/* Backup the storage pool */
BACKUP STG BACKUPPOOL COPYPOOL WAIT=YES
/* */
/* Get out if an error occurred */
if(error) exit
/* */
/* backup the database */
BACKUP DB DEV=DLT3 TYPE=FULL
EXIT
/* */
/* This section reschedules the macro, if needed */
reschedule:
DELETE SCHEDULE RUNSAMPLE
DEFINE SCHEDULE RUNSAMPLE T=A CMD="RUN EXAMPLE" STARTT=NOW+0:20

Tivoli Storage Manager 3.7
  Image backup and restore
  Backup sets
  Multiple-session backup/archive
  Database unload/load
  Improved database audit
  Self tuning
  MVS server enhancements
  NT server pseudo-kernel changes

Image Backup and Restore
  Full filesystem and raw logical volume backup/restore
  Image sent from client as single object and tracked by server at image level
  Image backup operation is independent of any file-level backup
  Filesystem restore can be done in conjunction with file-level restore
  New/modified client (Web, CLI, and GUI) commands and options
  Modified administrative commands
  Supported on AIX, Solaris, and HP client platforms
  (Figure: FS 1 or /dev/hd9 on the client becomes 1 entry in the server DB and 1 object in the storage pools)

Integrated Backup/Restore Operations (client to server)
  1. Incremental backup of individual files in filesystem. Allows restore of any or all files.
  2. Image backup of entire filesystem. Allows fast filesystem restore.
  3. Incremental backup of individual files and periodic image backup. Allows fast restore of image with optional point-in-time reconciliation.
  4. Image backup with incremental backup from image date. Incremental backups cannot be used for complete file-level restore.

Image Backup/Restore Components
  (Figure: FS 1 or /dev/hd9, TSM backup/archive client, plug-in utility, TSM API, TSM server)
  Backup/archive client makes initial connection with server
  Plug-in architecture loads utility to process image
  Plug-in utility performs image backup/restore
  Image and meta-data transmitted via Tivoli Storage Manager API

Benefits of Image Backup/Restore
  Solution for backup of raw logical volumes (DBs, etc.)
  Full filesystem backup with minimal server database overhead
  Optimized performance for backup/restore of entire filesystem
  Point-in-time backup may reduce required number of file versions
  Point-in-time restore option offers benefits of both image and incremental restore
    Fast restore to image date
    Reconciliation of filesystem to later point in time by deleting/restoring files based on incremental backup data

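The point-in-time restore option can be sketched as two steps: restore the image, then reconcile it against what the incremental backups say was active at the target time. This is a minimal model, not the client's implementation: the filesystem is a dictionary of path to version, and `active_at_target` is a hypothetical summary of the incremental backup data.

```python
def point_in_time_restore(image, active_at_target):
    """Restore an image, then reconcile it to a later point in time.

    image: {path: version} captured by the image backup.
    active_at_target: {path: version} active at the target time according to
        incremental backups (hypothetical representation).
    Returns (filesystem, deleted_paths, restored_paths).
    """
    # Step 1: fast restore of the whole filesystem to the image date.
    filesystem = dict(image)

    # Step 2a: delete files that no longer existed at the target time.
    deleted = sorted(p for p in filesystem if p not in active_at_target)
    for path in deleted:
        del filesystem[path]

    # Step 2b: restore files that were created or changed after the image.
    restored = sorted(
        p for p, version in active_at_target.items() if filesystem.get(p) != version
    )
    for path in restored:
        filesystem[path] = active_at_target[path]

    return filesystem, deleted, restored


IMAGE = {"/a": 1, "/b": 1, "/c": 1}
TARGET = {"/a": 1, "/b": 2, "/d": 1}
fs, deleted, restored = point_in_time_restore(IMAGE, TARGET)
print("deleted:", deleted, "restored:", restored)
```

Only the files that differ from the image are touched in the second step, which is where the speed advantage over a full file-level restore comes from.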
Backup Sets
  1. Backup of client files from TSM client to TSM server storage pools
  2. Generate Backupset command copies active files and meta-data to sequential media
  3a. Server-based restore
  3b. LAN-free restore
  (Figure: backup set media, shown as an IBM QIC-5010 data cartridge, can also go to records retention or a vault)

Benefits of Backup Sets
  Supports long-term storage without resending data from client (instant archive)
  Provides high-performance restore (server-based or LAN-free)
  Minimal database overhead for tracking of backup sets
  Point-in-time snapshot may reduce required number of file versions

Multiple-Session Backup/Archive
  Example: inc filespeca filespecb
  Single-session backup
    Query filespeca, send files, query filespecb, send files (one after another)
  Multiple-session backup
    Producer thread(s): query filespeca, query filespecb
    Consumer thread(s): send data

Multiple-Session Example
  (Figure: the Tivoli Storage Manager client runs "dsmc incremental c: d:" and opens sessions 1, 2, and 3 to the Tivoli Storage Manager server; client threads shown include a compare thread, a monitor thread, and data threads)

    Sess    Comm.         Bytes    Bytes    Sess   Platform  Client Name
  Number    Method  ...    Sent    Recvd    Type
  ------    ------      -------  -------   -----  --------  ------------
       1    Tcp/Ip  ...   8.6 K      157    Node   WinNT     Earth
       2    Tcp/Ip          383   14.2 M    Node   WinNT     Earth
       3    Tcp/Ip          368   12.4 M    Node   WinNT     Earth

Benefits of Multiple-Session Backup/Archive
  Exploits multi-threaded client operating systems
  Increased parallelism leads to higher throughput
  Client controls number of sessions based on value of RESOURCEUTILIZATION option

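The producer/consumer split behind multiple-session backup can be sketched with a queue and a few threads. This is an illustration only: `list_changed` and `send` are hypothetical stand-ins for the server query and the data transfer, and the `consumers` parameter is not a documented mapping of the RESOURCEUTILIZATION option.

```python
import queue
import threading


def multi_session_backup(filespecs, list_changed, send, consumers=2):
    """Back up several filespecs with one producer and several consumer threads.

    list_changed(spec) returns the paths to back up (stands in for the query session).
    send(path) transfers one file (stands in for a data session).
    """
    work = queue.Queue()
    sent = []
    sent_lock = threading.Lock()

    def producer():
        for spec in filespecs:
            for path in list_changed(spec):
                work.put(path)
        for _ in range(consumers):
            work.put(None)  # one stop marker per consumer

    def consumer():
        while True:
            path = work.get()
            if path is None:
                return
            send(path)
            with sent_lock:
                sent.append(path)

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=consumer) for _ in range(consumers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sent


CHANGED = {"c:": ["c:/a.txt", "c:/b.txt"], "d:": ["d:/x.dat"]}
result = multi_session_backup(["c:", "d:"], CHANGED.__getitem__, lambda path: None)
print(sorted(result))
```

Because the producer keeps querying while the consumers send, data transfer for the first filespec can overlap with the query for the second, which is the source of the extra throughput.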
Database Unload/Load
  Improves performance as compared to existing dump/load
    Multiple threads with pipelining and automatic tuning
    I/O blocks increase from 4KB to 256KB
    Recovery log I/O eliminated
    Extraneous data encode/decode steps eliminated
  Reorganizes database by packing pages
    Reduces database space utilization
    Minimizes fragmented pages
    Improves performance for database scans

Page Packing
  Insertion of a new record into a full page
    If record fits in middle, page is split into two half-full pages
    If record fits at end, new page is created for inserted record

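The two insertion rules explain why loading records in key order produces packed pages. The sketch below models pages as small sorted lists and applies those rules; it is a toy model with assumed details (a page capacity of 4 keys, a linear search for the target page), not the server's b-tree code.

```python
import bisect


def insert(pages, key, capacity):
    """Insert key into a list of sorted pages using the two rules for full pages."""
    if not pages:
        pages.append([key])
        return
    # Target page: the last page whose first key is <= key (else the first page).
    idx = 0
    for i, page in enumerate(pages):
        if page[0] <= key:
            idx = i
    page = pages[idx]
    if len(page) < capacity:
        bisect.insort(page, key)
    elif key > page[-1]:
        # Record fits at the end of a full page: start a new page for it.
        pages.insert(idx + 1, [key])
    else:
        # Record fits in the middle of a full page: split into two halves.
        mid = len(page) // 2
        left, right = page[:mid], page[mid:]
        bisect.insort(left if key < right[0] else right, key)
        pages[idx:idx + 1] = [left, right]


def build(keys, capacity=4):
    pages = []
    for key in keys:
        insert(pages, key, capacity)
    return pages


def fill_factor(pages, capacity=4):
    return sum(len(p) for p in pages) / (len(pages) * capacity)


scattered = build(range(15, -1, -1))  # keys arrive out of order
packed = build(sorted(k for page in scattered for k in page))  # reload in key order
print(len(scattered), "pages before,", len(packed), "pages after")
```

Reloading the same keys in ascending order always hits the "fits at end" rule, so every page except possibly the last ends up full.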
Unload Processing
  (Figure: traverse tree, copy leaves in order)
  Unload processing copies records for each table in ascending key order
  Records from different tables are interleaved
  Requires a database whose internal b-tree tables are intact (NOT for salvage)
  Guarantees that database tables will be loaded in key order (packed)
  Multiple threads are used and performance is dynamically tuned based on throughput

Load Processing
  (Figure: read input, place tables on the load table queue, multiple threads load tables from queue)
  Records for each table are loaded in ascending key order
  Multiple threads are used and performance is dynamically tuned based on throughput

Improved Database Audit
  Multiple threads allow concurrent processing of different parts of the database (parallel processing)
  Recovery log I/O eliminated
  Enhanced algorithms eliminate redundant checks for referential integrity
  (Figure: parallel processing of backup files, archive files, disk files, space-managed files, and sequential files)

Self Tuning
  Self tuning objectives
    Optimize server performance
    Eliminate manual tuning for key parameters
    Provide intelligent storage management
  Benefits
    TSM automatically tunes itself
    Reduces customer requirement for performance knowledge and skill
    Keeps the external interface to TSM simple
  (Figure: intelligent, adaptive algorithm that loops through derive performance, check whether it is acceptable, adjust and test)

Self Tuning of Buffer Pool Size
  Enable buffer pool self tuning using SELFTUNEBUFPOOLSIZE server option
  How it works
    Buffer pool statistics are reset at the start of expiration processing
    After expiration runs, the buffer pool cache hit ratio (X%) is checked
    If cache ratio < 98%, database buffer pool size will be increased by (98% - X%)/2 %
    Otherwise, database buffer pool size will not be changed
    Amount of real storage available will be taken into account
  (Flowchart: Start, reset buffer pool statistics, run expiration, check buffer pool cache hit (X%); if cache hit < 98%, increase BUFPOOLSIZE by (98% - X%)/2 %, else BUFPOOLSIZE unchanged; End)

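The adjustment rule can be written down as a small function. This is a sketch under stated assumptions: the rounding, the function name, and the optional `max_size` cap (standing in for "real storage available will be taken into account") are hypothetical; only the 98% target and the (98% - X%)/2 increase come from the slide.

```python
def tune_bufpoolsize(current_size, cache_hit_pct, max_size=None, target_pct=98.0):
    """Return the new buffer pool size after one tuning pass.

    current_size: present BUFPOOLSIZE value.
    cache_hit_pct: cache hit ratio X measured over expiration processing.
    max_size: optional upper bound (hypothetical stand-in for available real storage).
    """
    if cache_hit_pct >= target_pct:
        return current_size  # hit ratio is acceptable, leave the size alone
    increase_pct = (target_pct - cache_hit_pct) / 2
    new_size = current_size + round(current_size * increase_pct / 100)
    if max_size is not None:
        new_size = min(new_size, max_size)
    return new_size


print(tune_bufpoolsize(10000, 90.0))  # 8 points below target, so a 4% increase
print(tune_bufpoolsize(10000, 99.0))  # already above target, unchanged
```

Halving the shortfall makes each pass a cautious step, so the size approaches an adequate value over several expiration runs instead of jumping there at once.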
MVS Server Enhancements
  Use of OpenEdition TCP/IP sockets on OS/390 R5 and higher reduces CPU utilization
  Optimized page fixing for VSAM disk I/O
    Page fix high-usage buffers for life of server/session
    Reduces lock contention for page fix and unfix
  Overlapped BSAM I/O (3590 only)
    Specified using TAPEIOBUFS server option
    Reduces start I/O processing and improves throughput

NT Server Pseudo-Kernel Changes
  Pseudo-kernel on each server platform provides common interface for operating-system services
  Development worked with engineers from Intel and IBM Kirkland
    Analyzed NT pseudo-kernel for bottlenecks and identified potential performance improvements
    Concluded that synchronization mechanisms degrade scalability, especially in SMP environment
  Improvements for Tivoli Storage Manager 3.7 include
    Use of NT event objects to eliminate explicit wait queue
    Use of NT critical sections to achieve finer-grained serialization and avoid bottlenecks

Conclusion
  Scalability has been emphasized since the early days of ADSM
  Product scalability has increased through a progression of performance and manageability enhancements
  Tivoli Storage Manager 3.7 continues this trend with the introduction of performance-enhancing features and upgrades