Alternate Methods of TSM Disaster Recovery: Exploiting Export/Import Functionality
Presented by: Eric Gruber, TSM Sr. Architect, TSM Certified Consultant
www.ckaonline.com
Recovery Time Tiers
Best D/R practice is to blend tiers of solutions in order to maximize application coverage at the lowest possible cost. One size, one technology, or one methodology doesn't fit all applications.
- Tier 7 - Highly automated, business-wide, integrated solution (example: GDPS***/PPRC/VTS P2P, AIX PowerHA/XD, OS/400 HABP...)
- Tier 6 - Storage mirroring (example: XRC, PPRC, VTS Peer to Peer)
- Tier 5 - Software two-site, two-phase commit (transaction integrity)
- Tier 4 - Batch/online database shadowing and journaling, point-in-time disk copy (FlashCopy), TSM-DRM
- Tier 3 - Electronic vaulting, TSM**, tape
- Tier 2 - PTAM*, hot site, TSM**
- Tier 1 - PTAM*
[Chart: Value (vertical axis) versus Recovery Time (horizontal axis: 15 min., 1-4 hr., 4-8 hr., 8-12 hr., 12-16 hr., 24 hr., days). Data recreation labels, from the highest tiers down: zero or near zero; minutes to hours; up to 24 hours; 24-48 hours. Application bands, from the highest tiers down: low tolerance to outage; somewhat tolerant to outage; very tolerant to outage.]
Recovery time tiers are based on SHARE definitions.
*PTAM = Pickup Truck Access Method with tape
**TSM = Tivoli Storage Manager
***GDPS = Geographically Dispersed Parallel Sysplex
Customer Requirements
- TSM primary pool replication or transfer to the hot site (target)
- The ability to back up customer data at both source and target
- Immediate access to TSM data at the hot site without the need to restore the TSM database
- The ability to provide backup and restore services simultaneously at the hot site in the event of a disaster
- Disaster recovery testing at the hot site without impacting production
- The ability to have different retention requirements at source and site
- The ability to use different TSM server architectures between source and site
- Minimal personnel required at the hot site to begin recovering customer data
Thinking Cap
Hmmm...
- What about Export Node? This could work!!
- Wait, what about performance? We can solve that.
- This functionality has been around a while.
- Export/Import improved in TSM V5.5. Cool, now we can restart an export.
- Let's take a closer look at Export Node.
Export Node Syntax

>>-EXPort Node--+-----------+----------------------------------->
                '-node_name-'

>--+-------------------------------+---------------------------->
   '-FILESpace--=--file_space_name-'

>--+------------------------+----------------------------------->
   '-FSID--=--file_space_ID-'

>--+----------------------------------+------------------------->
   '-UNIFILESpace--=--file_space_name-'

>--+-------------------------+---------------------------------->
   '-DOmains--=--domain_name-'

   .-FILEData--=--None-------------.
>--+-------------------------------+---------------------------->
   '-FILEData--=--+-ALl----------+-'
                  +-None---------+
                  +-ARchive------+
                  +-Backup-------+
                  +-BACKUPActive-+
                  +-ALLActive----+
                  '-SPacemanaged-'

>--+----------------------------------------------+------------->
   |                    .-FROMTime--=--00:00:00-. |
   '-FROMDate--=--date--+-----------------------+-'
                        '-FROMTime--=--time-----'

>--+------------------------------------------+----------------->
   |                  .-TOTime--=--00:00:00-. |
   '-TODate--=--date--+---------------------+-'
                      '-TOTime--=--time-----'

>--+----------------------------------------+------------------->
   '-EXPORTIDentifier--=--export_identifier-'

                                .-PREVIEWImport--=--No------.
>--+-------------------------+--+---------------------------+--->
   '-TOServer--=--servername-'  '-PREVIEWImport--=--+-No--+-'
                                                    '-Yes-'

   .-MERGEfilespaces--=--No------.
>--+-----------------------------+------------------------------>
   '-MERGEfilespaces--=--+-No--+-'
                         '-Yes-'

   .-Replacedefs--=--No------.
>--+-------------------------+---------------------------------->
   '-Replacedefs--=--+-No--+-'
                     '-Yes-'

   .-PROXynodeassoc--=--No------.
>--+----------------------------+------------------------------->
   '-PROXynodeassoc--=--+-No--+-'
                        '-Yes-'

   .-ENCryptionstrength--=--AES-----.
>--+--------------------------------+--------------------------->
   '-ENCryptionstrength--=--+-AES-+-'
                            '-DES-'

   .-ALLOWSHREDdable--=--No------.
>--+-----------------------------+-----------------------------><
   '-ALLOWSHREDdable--=--+-No--+-'
                         '-Yes-'
Restartable Export
- TSM 5.5 gives us restartable server-to-server export/import
  - Server-to-server export/import only
- Provides the ability to isolate and export data based on file creation time
- Also provides the ability to isolate data based on a range of time
- Several commands modified or added:
  - EXPORT NODE / EXPORT SERVER - new TODATE/TOTIME parameters on the command
  - Q PROCESS command enhanced
  - Q EXPORT command (NEW)
  - SUSPEND EXPORT (NEW)
  - CANCEL EXPORT (NEW)
  - RESTART EXPORT (NEW)
Restartable Export, Cont.
- An export progresses through 3 phases:
  1. Create definitions on the target server and create the object list
  2. Identify and export eligible files
  3. File list complete; export files
- An export must complete phase 1 to be considered restartable
- The CANCEL PROCESS command also makes an export ineligible for restart
- Suspended exports are not affected by restarting the TSM server
- Both source and target TSM servers must be at 5.5 for the function to work correctly
Restartable Export, Cont.
- Export operations are suspended when any of the following occurs:
  - A SUSPEND EXPORT command is issued for the running export operation
  - Segment preemption - the file being read for export is deleted by some other process
  - Communication errors on a server-to-server export
  - No available mount points
  - Necessary volumes are unavailable
  - I/O errors are encountered
- There are 2 ways to preview an export without actually writing any data: PREVIEW=YES and PREVIEWIMPORT=YES
- ALLOWSHREDDABLE=YES is required to export data that is in a shreddable storage pool
Restartable Export Graphic
Export Todate/Totime Options
Feature or Benefit
A solution is, by definition, the resolution of a problem. If you don't have a problem, you don't need a solution.
If you build it, we will Exploit it!
Here's what you need!
- TSM servers at source and site
- Directory management classes (DIRMC)
- Large random-access disk pools
- Migration delay on disk storage pools (MIGDELAY=2 or 3)
- Overflow storage pools (VTL for mount points)
- Long-term tape storage pools
- High bandwidth between source and site
Physical Architecture
REXX or Perl to Create Schedules
- Each day, a REXX script builds the TSM administrative schedules that perform the Export Node
- An administrative schedule is created for each TSM filespace of every node required daily
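As a rough illustration of the daily schedule builder, the sketch below generates one DEFINE SCHEDULE command per node filespace. It is written in Python purely for illustration (the presentation uses REXX or Perl), and several details are assumptions rather than the presenter's method: the schedule naming scheme (`XAS` + day of month + node + FSID), the default start time, the use of `perunits=onetime`, and the way the export window is derived (previous day 00:00:01 through run day 23:59:59, which matches the single example schedule shown in the slides).

```python
from datetime import date, timedelta

SCRIPT = "V55XPRT_FILESPACE"  # server script name taken from the slides


def export_schedule_command(node, fsid, run_date, via1, via2,
                            filedata="ALLACTIVE", start_time="23:00:00",
                            priority=6):
    """Return one DEFINE SCHEDULE command for a single node filespace."""
    fmt = "%m/%d/%Y"
    from_date = (run_date - timedelta(days=1)).strftime(fmt)
    to_date = run_date.strftime(fmt)
    # Hypothetical naming scheme: XAS + day of month + node + FSID.
    sched = f"XAS{run_date:%d}_{node}_{fsid}"
    export_id = f"X{node}_{fsid}"
    # Positional arguments $1..$13, in the order the example schedule uses.
    args = [node, filedata, via1, via2, sched,
            from_date, "00:00:01", to_date, "23:59:59",
            str(fsid), export_id, export_id, f"X{node}"]
    cmd = f"run {SCRIPT} " + " ".join(args)
    return (f'define schedule {sched} type=administrative cmd="{cmd}" '
            f"active=yes startdate={to_date} starttime={start_time} "
            f"priority={priority} perunits=onetime")


def build_daily_schedules(filespaces, run_date, via1, via2):
    """Build one command per (node, fsid) pair that needs a daily export."""
    return [export_schedule_command(node, fsid, run_date, via1, via2)
            for node, fsid in filespaces]


if __name__ == "__main__":
    pairs = [("NODE1", 1), ("NODE1", 2)]
    for line in build_daily_schedules(pairs, date(2011, 1, 13),
                                      "SERVERARVIA1", "SERVERARVIA2"):
        print(line)
```

The generated lines could be fed to the administrative command line as a macro; one schedule per filespace keeps each export small and lets a failed or requeued filespace be retried independently.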
Example of an Export Admin Schedule
Schedule Name: XAS13AR_NODE1_1
Description:
Command: run V55XPRT_FILESPACE NODE1 ALLACTIVE SERVERARVIA1 SERVERARVIA2 XAS13AR_NODE1_1 01/12/2011 00:00:01 01/13/2011 23:59:59 1 XNODE1_1 XNODE1_1 XNODE1
Priority: 6
Start date: 2011-01-13
Start time: 23:14:15
TSM Server Script
Begin:
/* V55XPRT_FILESPACE V5.5.02 */
/* Export Node (Filespace) */
/* Check NODE for Backup Activity */
  select client_name from sessions where upper(client_name)='$1'
  if(rc_ok) goto Requeue
/* Check ADMIN for Export Activity */
  select client_name from sessions where upper(client_name)='$11'
  if(rc_ok) goto Requeue
/* Check XPORTID for Export Activity */
  select client_name from sessions where upper(client_name)='$12'
  if(rc_ok) goto Requeue
/* Check for Migration Activity */
  select * from processes where upper(process)='MIGRATION'
  if(rc_ok) goto Migration
/* Check for Expiration */
  select * from processes where upper(process)='EXPIRATION'
  if(rc_ok) goto Expiration
/* Check for Data Base Backup Activity */
  select * from processes where upper(process)='DATABASE BACKUP'
  if(rc_ok) goto DBBackup
/* (export section: see TSM Server Script, Cont.) */
/* */
Migration:
  issue message i 'MSG999 Migration in Progress'
  goto Requeue
/* */
Expiration:
  issue message i 'MSGAAA Expiration in Progress'
  goto Requeue
/* */
DBBackup:
  issue message i 'MSGDBBB DB Backup in Progress'
  goto Requeue
/* */
Requeue:
  issue message i 'MSG000 ReQueue "$5"'
  update schedule '$5' t=a startt=now+0:30 startd=today
  goto Fini
/* */
Fini:
  Exit
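The pre-flight checks in the script above amount to a simple decision: requeue the export by 30 minutes if the node, the admin, or the export ID has an active session, or if migration, expiration, or a database backup is running. The Python sketch below models only that decision logic for illustration; the function names and the idea of passing session and process names in as lists are assumptions, and nothing here talks to a TSM server.

```python
from datetime import datetime, timedelta

# Process names the script treats as blocking, compared in upper case.
BLOCKING_PROCESSES = ("MIGRATION", "EXPIRATION", "DATABASE BACKUP")


def requeue_reason(session_names, process_names, node, admin, export_id):
    """Return why the export must wait, or None if it may start now."""
    sessions = {s.upper() for s in session_names}
    for name in (node, admin, export_id):
        if name.upper() in sessions:
            return f"session active for {name}"
    processes = {p.upper() for p in process_names}
    for proc in BLOCKING_PROCESSES:
        if proc in processes:
            return f"{proc.title()} in progress"
    return None


def next_attempt(now):
    """Mirror 'startt=now+0:30': retry 30 minutes from now."""
    return now + timedelta(minutes=30)


if __name__ == "__main__":
    reason = requeue_reason([], ["Migration"], "NODE1", "XNODE1_1", "XNODE1_1")
    if reason:
        print(reason, "- retry at", next_attempt(datetime.now()))
```

Comparing in upper case on both sides matters: a check such as `upper(process)='expiration'` can never match, which is why the literals in the server script must be upper case.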
TSM Server Script, Cont.
/* Export the Filespace... */
Try1:
  ping server "$3"
  if(rc_ok) goto Via1
  goto Try2
Via1:
  issue message i 'MSG111 Exporting Node "$1" FSID "$10" Via "$3"'
  /* first try restartables */
  restart export '$1'
  if(rc_ok) goto Requeue
  restart export '$13'
  if(rc_ok) goto Requeue
  restart export '$12'
  if(rc_ok) goto Requeue
  /* Export it */
  Export node '$1' filedata='$2' fromdate='$6' fromtime='$7' todate='$8' totime='$9' exportid='$12' toserver='$3' merge=yes replacedefs=yes FSID='$10' allowshred=yes
  if(rc_ok) goto Completed
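To make the positional parameters concrete, the sketch below substitutes the arguments from the example administrative schedule into the script's Export Node line. It is a Python illustration of the `$n` substitution only; the `substitute` helper is hypothetical and is not how the TSM server itself expands script variables.

```python
import re

# The Export Node line from the server script, with $n placeholders.
EXPORT_TEMPLATE = (
    "export node '$1' filedata='$2' fromdate='$6' fromtime='$7' "
    "todate='$8' totime='$9' exportid='$12' toserver='$3' "
    "merge=yes replacedefs=yes fsid='$10' allowshred=yes"
)

# Arguments from the example schedule's "run V55XPRT_FILESPACE ..." command.
SCHEDULE_ARGS = (
    "NODE1 ALLACTIVE SERVERARVIA1 SERVERARVIA2 XAS13AR_NODE1_1 "
    "01/12/2011 00:00:01 01/13/2011 23:59:59 1 XNODE1_1 XNODE1_1 XNODE1"
).split()


def substitute(template, args):
    """Replace each $n with the n-th positional argument (1-based)."""
    return re.sub(r"\$(\d+)", lambda m: args[int(m.group(1)) - 1], template)


if __name__ == "__main__":
    print(substitute(EXPORT_TEMPLATE, SCHEDULE_ARGS))
```

Read this way, `$1` is the node, `$2` the FILEDATA type, `$3` and `$4` the two target server definitions, `$5` the schedule name used for requeueing, `$6` through `$9` the export time window, `$10` the FSID, and `$11` through `$13` the identifiers checked for existing activity and restartable exports.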
Trials and Tribulations
- The daily housekeeping script suspends exports and restarts them when housekeeping has completed
- Mount points need to be managed, because you don't want an export waiting too long for a volume (VTL and disk pools)
- Possible recovery log pinning
- Directories are sent on each export (DIRMC required)
- Export by FSID for performance
- Suspending an export could take a long time
Did we meet Customer Requirements?
- TSM primary pool replication or transfer to the hot site (target)
  1. Data is exported daily. Administrative schedules start 2 hours after the client node schedule associations.
- The ability to back up customer data at both source and target
  1. The internal TSM server name is identical at source and site
  2. Failover can be handled through a DNS entry or a second optfile/server stanza
- Immediate access to TSM data at the hot site without the need to restore the TSM database
  1. Data is imported into the primary pool daily
  2. The server is up and available at all times for a restore
- The ability to provide backup and restore services simultaneously at the hot site in the event of a disaster
  1. In the event of a disaster, some customers will require restores and others will require a backup
  2. No need to wait for a return to normal operations to back up
Solving the Problem
- Disaster recovery testing at the hot site without impacting production
  1. Data is online and available at all times at the hot site. Just point (dsm.opt) and shoot.
- The ability to have different retention requirements at source and site
  1. Backup copy groups can be completely different at source and site
- The ability to use different TSM server architectures between source and site
  1. We are not bound, due to the nature of export/import ingestion
- Minimal personnel required at the hot site to begin recovering customer data
  1. With the TSM database hot, alive, and waiting for access, costs and time to data have been reduced significantly
What's next?
- The only way to be sure all data has been sent to the site is to send from the beginning of time
- Best effort gets the job done, but...
- Need true node replication between source and site
- Replication should be TSM-centric and hardware independent
- Need a message in the activity log that says node replication is complete
- Since there is a DB2 database at both source and site, replication can be developed
Requirements submitted to IBM
Requirements we submitted to IBM, rather than using Import/Export:
- Replicate data for a node from the TSM server at site A to a TSM server at site B, and from TSMB to TSMA
- Replication should be incremental in nature
- Replication of TSM database data to ensure completeness and consistency
- Keep data in deduplicated form if deduplication is in use
- Enable a hot standby TSM server at the site, plus existing data
Learn more about the future directions of TSM
When: Monday 6pm-7pm
Where: Room 306
Thank You!! Questions?