The Data Integrity Imperative
If it isn't accurate, it isn't available.

Technical White Paper
Vision Solutions, Inc.

Introduction

The fundamental requirement of high availability software is to ensure that critical data and applications are available wherever and whenever they are needed. That typically involves replicating business data, applications and system values to a backup server and providing the ability to switch over quickly, and possibly even automatically, to the backup server when necessary.

However, one factor is often overlooked: the integrity of the backup data. The data on the backup server must be an exact replica of the production data. If errors in the replication process or problems on the backup server introduce data errors, the backup server may not be able to function when called upon or, possibly even worse, the company may suffer considerable loss if it unwittingly runs its business based on faulty data. Validating the integrity of the data on a backup server, and correcting it when necessary, should therefore be core functions in a high availability solution.

In a never-down production environment, the integrity validation and correction functions must exhibit three core characteristics:

(1) They must be capable of confirming that the primary and backup data are bit-for-bit identical. Simply checking file attributes or merely checking at the record level is not sufficient.
(2) Data integrity checking must be completed while active, meaning that it must be done while users are accessing and/or updating the data, without the need to take any systems or data off-line.
(3) The integrity of critical business data must be actively managed and optimized, complete with audit trails, throughout its entire life-cycle.

When questions of integrity are considered, companies often focus primarily on the integrity of business data. Clearly, this is crucial, but it is only one aspect of the integrity issue. Business data is useful only if the applications that access, update, analyze and manipulate it are functioning properly.
That can happen only if, in addition to the business data, the program code and related system data, such as user IDs and passwords, are also uncorrupted. Thus, the integrity of all system objects, not just business data, must be protected in a high availability solution.

The Vision Integrity Solution

Vision established the benchmark for data integrity in the high availability industry when, more than five years ago, it incorporated the use of while-active Cyclic Redundancy Check (CRC) technology into its data synchronization checking facility to ensure bit-for-bit integrity. In 2005, Vision Solutions again led the industry with a complete data life-cycle management solution known as Director Suite that enables businesses to optimize and manage their iSeries data environment, while active, from cradle to grave, with audit trails. Currently, this capability is a Vision exclusive.

CRC is the most widely used technology for ensuring the bit-for-bit integrity of data that is being copied from one location to another. IBM started using CRC in its disk drive technology in 1954 to ensure the reliability of data written from memory to DASD. CRC is still used today by networking technology companies to ensure the integrity of the most sensitive data transmitted across networks around the globe.

A CRC algorithm generates a binary number used to represent a long string of data. This allows processes to compare large data strings on different systems without transmitting the entire string. It provides the most efficient and reliable method for detecting differences in data strings, including differences in the order of the characters. Communication programs commonly use CRC checks to validate the accuracy of data sent and received, but CRC is also applicable when validating the integrity of data on your production and backup systems.

While basic CRC algorithms are relatively simple to implement in a static environment, implementing them in a high availability environment, where the CRC must often be performed on terabyte-sized databases while they are in use, is far from trivial. In fact, Vision is, to date, the only high availability vendor to successfully implement CRC technology to ensure bit-for-bit integrity for a software-based mirroring solution.

Vision continues to lead the iSeries industry in the area of data integrity with the most sophisticated, robust set of capabilities, including:

1. Database Integrity - Vision's ORION uses multiple techniques to automatically detect and repair database integrity issues at the record level, while users and applications are actively using the database.
2. Object Integrity - ORION ensures the validity of all system objects using object-level integrity checking techniques that provide the flexibility needed to re-synchronize at the individual object, library, or link level, including the ability to automatically detect and mirror to the backup system any new objects created on the primary system.
3. Backup System Integrity - If the backup system crashes, leaving it in a corrupted state, ORION can automatically recover and restore complete database and object integrity on the backup system.
4. Data Optimization and Life-Cycle Management - Protecting the integrity of your data starts with having an organized and optimized environment so you can accurately keep track of all business-critical data and manage it appropriately through its entire life-cycle, with appropriate audit trails. In addition, ORION provides facilities to optimize data before even starting the high availability processes.
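
The basic idea described above, comparing two copies of data by exchanging a short checksum instead of the data itself, can be illustrated with a minimal sketch. It uses the CRC-32 routine from Python's standard zlib module purely for illustration; the function names and sample records are hypothetical, and nothing here represents how Vision's products compute or exchange their checksums.

```python
import zlib


def crc_of(data: bytes) -> int:
    """Return the CRC-32 of a byte string as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def replicas_match(source_crc: int, target_data: bytes) -> bool:
    """Compare a checksum received from the source with the local target copy.

    Only the 32-bit checksum crosses the network, not the data itself.
    """
    return source_crc == crc_of(target_data)


# Hypothetical sample records for illustration only.
source = b"ACCOUNT 1001 BALANCE 2500.00"
target_ok = b"ACCOUNT 1001 BALANCE 2500.00"
# Same characters as the source, but two adjacent digits are transposed.
target_swapped = b"ACCOUNT 1001 BALANCE 5200.00"

source_crc = crc_of(source)
print(replicas_match(source_crc, target_ok))       # True
print(replicas_match(source_crc, target_swapped))  # False
```

The transposed copy contains exactly the same characters as the source, so a simple character count or byte sum would not notice the change, whereas the CRC does. A matching CRC is very strong evidence that two copies are identical, but as with any fixed-length checksum it is not a mathematical proof.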

Database Integrity

Vision provides four important capabilities related to ensuring database integrity:

- CRC Synch Check and Repair While Active
- Sample Synch Check
- Support for Database Constraints
- CRC for IFS Stream Files

CRC Synch Check While Active

Not only does Vision's CRC synch check utility provide the highest level of data integrity, ensuring bit-for-bit identical replication on source and target, but it also delivers the following three capabilities that are needed for industrial-strength enterprise-level applications.

CRC While Active - CRC Sync Check and Object Repair is especially useful for extremely large files with millions of records because, using this facility, re-synchronizing does not require a SAVE/RESTORE operation or an electronic send of the entire object. The CRC validation and the object repair processes can occur while the file continues to be updated by user applications on the source system.

Automatic Object Repair While Active - This facility automatically repairs data integrity issues discovered by CRC. And, to ensure optimal performance and allow while-active capabilities, it only re-synchronizes the file segment that is actually in error.

Parallel Mode - ORION can run multiple CRC synch check jobs in parallel to increase performance in enterprise environments where large numbers of objects need to be verified.

When enabled in OMS/400, the CRC Sync Check mode divides files into a configurable number of segments and then performs CRC checks against each segment. If any segment fails the CRC check, the object will remain in a *FIX status.

Sample Synch Check can operate in either block or random mode, as described below:

Block Mode - When in this mode, OMS/400 simultaneously checks the source and target objects for synchronization as follows:

- All significant member-level file attributes of the two objects are checked. These attributes include the following:
    - Number of members
    - Number of active and deleted records
    - File member id
- A configurable number of the last physical records in the source file (by default 10, but you can specify up to 99,999) are checked and compared with the target file. The two sets of records must be identical.

The file is considered to be synchronized if each item at every level is found to be identical in the source and target objects. Conversely, the failure of any single condition causes OMS/400 to report an out-of-sync condition and place the object on hold (status *HLD) unless the SYNCHKHLD data area on the target is configured to log an out-of-sync message for the object rather than place it on hold.
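
The block-mode logic just described (compare the member-level attributes, then compare the last N physical records) can be sketched as follows. This is an illustrative model only: the MemberSnapshot class, its field names and the block_mode_in_sync function are hypothetical and are not part of OMS/400; only the default of 10 records and the 99,999 limit come from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemberSnapshot:
    """Hypothetical view of one file as seen on one system."""
    member_count: int
    active_records: int
    deleted_records: int
    member_id: str
    records: List[bytes] = field(default_factory=list)  # in physical order


def block_mode_in_sync(source: MemberSnapshot, target: MemberSnapshot,
                       sample_size: int = 10) -> bool:
    """Return True only if every attribute and the trailing records match."""
    if not 1 <= sample_size <= 99_999:
        raise ValueError("sample_size must be between 1 and 99,999")
    attributes_match = (
        source.member_count == target.member_count
        and source.active_records == target.active_records
        and source.deleted_records == target.deleted_records
        and source.member_id == target.member_id
    )
    if not attributes_match:
        return False
    # Compare the last `sample_size` physical records of each file.
    return source.records[-sample_size:] == target.records[-sample_size:]


# Hypothetical data for illustration only.
records = [f"row {i}".encode() for i in range(50)]
src = MemberSnapshot(1, 50, 0, "M1", records)
tgt_good = MemberSnapshot(1, 50, 0, "M1", list(records))
tgt_bad = MemberSnapshot(1, 50, 0, "M1", records[:-1] + [b"row 49 (stale)"])

print(block_mode_in_sync(src, tgt_good))  # True
print(block_mode_in_sync(src, tgt_bad))   # False
```

Because only the trailing records are sampled, a difference earlier in the file would pass this check unnoticed as long as the attributes agree. That is the trade-off of a sample check compared with a full CRC check: it is cheaper to run but not exhaustive.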

Random Mode - The random sample read technique uses the same functionality as block mode, but it acts on randomly selected records instead of only a group of records at the end of a file. As with block mode, you can specify that up to 99,999 records are to be checked.

Additionally, a new technique called marking is used to determine if specific file contents are currently being updated, thus removing the need to allocate the complete file on the source system. As a result, marking provides more flexibility in terms of when and across which files sync checks can be run. Based on parameters set for the sync check, the markers are simply journal entries that indicate that the member attributes within the markers are about to be retrieved. Then, another entry is sent containing the member attributes. The following provides more specific information about this technique:

- The marker technique will be used to determine if the member attributes can be validated. If no records have been added or deleted between the marker and the member attributes, then the number of records and deleted records can be validated (i.e., sync checked).
- The facility confirms that records deleted from the source are also deleted on the target. If not, and there are no pending failed relational integrity transactions for that file, the file is placed on hold.
- Files that are allocated exclusively cannot be validated by the sync check. These locks may, for example, occur during the following operations: Save/Restore, CLRPFM, RGZPFM, ALCOBJ *EXCL.
- Files that have records added or deleted between the two markers will not have their record counts validated.

Constraints - Working with related physical files

ORION supports the sending of files with referential constraints from a source system to a target system. This feature provides the following:

- The user can resynchronize files with constraints using an Electronic Send operation. Because the ODS interface and ORION's Production Library Monitor (PLM) use Electronic Send to automatically configure new files to OMS, this functionality is supported for files that have constraints. This is particularly beneficial when you have very dynamic applications with many delete and create actions occurring in files with constraints.
- A sync check operation detects when source and target files with constraints are out-of-sync and places the files on hold (*HLD). These individual held (*HLD) files can be resynchronized by returning them to a *PND (pending) status and then sending them electronically to the target.
- With ORION, using an Electronic Send for files with constraints is no different than for data objects where referential integrity does not apply.
- The Electronic Send operation for files with constraints deletes the target file and replaces it to ensure that constraints on the target object remain identical to those on the source object.

CRC Syncheck Feature for IFS

This feature enables an efficient byte-for-byte comparison between a source and target system (using CRC) for all data in an IFS object, while the object remains active for update.

IFS Authority Repair Feature for *ATTR Sync Check

The IFS attribute sync check repair feature checks the attributes and authorities of *STMF objects and can automatically repair attributes and authorities on the target. By default, this feature is activated, but it is an optional feature that can be deactivated by setting the IFSAUTRPR control value in the MRCTLVP control file. When activated, this functionality will automatically repair attributes and authorities on the target system after completion of an IFS attribute (*ATTR) sync check.

Object Integrity

Production Library Monitor (PLM) - The PLM analyzes the libraries containing the objects you are mirroring and validates the status of mirrored objects. The PLM, which runs daily, searches for objects that are not being mirrored. If it finds any, it reports the nature of the problem. The PLM can also simplify the ongoing maintenance of OMS/400 by detecting, defining, journaling, and synchronizing newly created objects. In addition, it can be configured to automatically re-send held objects. The PLM also provides an easy way to initially identify objects to OMS/400 for mirroring. The PLM is configured on the source, and a synchronized copy of your PLM definitions is automatically maintained on the target system. Thus, in the event of a role swap, your PLM configuration on the new source system will already be in place.

Attribute Synch Check - The ODS/400 synchronization check compares the source and target attributes of an object and can be configured to mirror the source object if the two versions differ or if the object does not exist on the target. An object-level sync check can be executed as a command or as a scheduled job to be run at a time you define.

Database Relations - Object synch check for database relations verifies a logical file's relation to a physical file at the member level.

Reverse Synch Check - The object synch check capability can be run in reverse mode, which compares objects on the target system to the source (as opposed to comparing the source to the target). The benefit of this is that extraneous objects on the target can be identified and dealt with as appropriate.

Electronic Send with Referential Constraints - Depending on the size of the object and the extent of the repairs necessary, it may be more efficient to send the entire object in order to correct an error.
In these circumstances, Vision's Electronic Send utility maintains the object's referential constraints on the target machine.

Target Recovery Mode

Enterprise-level high availability solutions must be able to maintain the integrity of the backup database. This can be a challenge when an unplanned outage on the backup system leaves the databases in a corrupted state. To handle these situations, ORION detects any type of abnormal end of the target system and automatically re-starts in recovery mode so that the software knows where to restart the apply process. It then looks for out-of-synch conditions that need to be repaired.

Optimized Data Life-Cycle Management

To protect data integrity, while ensuring the performance of your systems, your high availability solution must manage and optimize data throughout its entire life-cycle. Vision's unique Director, a highly integrated set of tools for systems management and optimization, is designed to proactively and automatically manage an iSeries system with a minimum of human intervention. Because human error is one of the greatest threats to data integrity, a primary goal of Director is to minimize human intervention while maximizing the detection and resolution of any issues that might arise. To this end, Director was built to an exacting model of autonomics, supporting all eight pillars of autonomic computing: self-aware, self-configuring, self-optimizing, self-healing, self-protecting, self-adapting, self-managing and self-anticipating.

Director, combined with Vision's Data Manager (described below), delivers a self-installing engine for iSeries optimization that configures, monitors, and manages itself to provide measurably improved iSeries performance, resulting in significantly reduced disk I/O and improved DASD and CPU utilization, thereby delivering lower cost of ownership and higher ROI. Using Director, system optimization precedes the initiation of the high availability processes and then continues on an ongoing basis.

Director's Re-Organize in Place feature allows you to free up unused disk space. In addition to reclaiming otherwise lost space, Director also allows you to improve application performance by reducing excessive disk I/O and memory utilization caused by the physical presence of logically deleted records.

Data Manager is a user-based tool within the highly integrated Vision Solutions OS Suite of tools that automates the modeling, testing, purging and archiving activities for IBM iSeries application environments. Built on advanced technology that leverages a highly autonomic architecture, Data Manager delivers a self-installing engine which provides a simple query-like interface to manage the data purge and archive process, from the identification of redundant and unused data through to the deletion and/or archive of data. The toolset enables periodic and incremental archiving and, as with all Vision solutions, performs its functions while users are active on the system.