Caché High Availability Guide


Caché High Availability Guide
Version October 2008
InterSystems Corporation
1 Memorial Drive, Cambridge, MA

Caché High Availability Guide
Caché Version October 2008
Copyright 2008 InterSystems Corporation. All rights reserved.

This book was assembled and formatted in Adobe Page Description Format (PDF) using tools and information from the following sources: Sun Microsystems, RenderX, Inc., Adobe Systems, and the World Wide Web Consortium. The primary document development tools were special-purpose XML-processing applications built by InterSystems using Caché and Java.

Caché WEBLINK, Distributed Cache Protocol, M/SQL, N/NET, and M/PACT are registered trademarks of InterSystems Corporation. InterSystems TrakCare, InterSystems Jalapeño Technology, Enterprise Cache Protocol, ECP, and InterSystems Zen are trademarks of InterSystems Corporation. All other brand or product names used herein are trademarks or registered trademarks of their respective companies or organizations.

This document contains trade secret and confidential information which is the property of InterSystems Corporation, One Memorial Drive, Cambridge, MA 02142, or its affiliates, and is furnished for the sole purpose of the operation and maintenance of the products of InterSystems Corporation. No part of this publication is to be used for any other purpose, and this publication is not to be reproduced, copied, disclosed, transmitted, stored in a retrieval system or translated into any human or computer language, in any form, by any means, in whole or in part, without the express prior written consent of InterSystems Corporation.

The copying, use and disposition of this document and the software programs described herein is prohibited except to the limited extent set forth in the standard software license agreement(s) of InterSystems Corporation covering such programs and related documentation. InterSystems Corporation makes no representations and warranties concerning such software programs other than those set forth in such standard software license agreement(s).
In addition, the liability of InterSystems Corporation for any losses or damages relating to or arising out of the use of such software programs is limited in the manner set forth in such standard software license agreement(s).

THE FOREGOING IS A GENERAL SUMMARY OF THE RESTRICTIONS AND LIMITATIONS IMPOSED BY INTERSYSTEMS CORPORATION ON THE USE OF, AND LIABILITY ARISING FROM, ITS COMPUTER SOFTWARE. FOR COMPLETE INFORMATION REFERENCE SHOULD BE MADE TO THE STANDARD SOFTWARE LICENSE AGREEMENT(S) OF INTERSYSTEMS CORPORATION, COPIES OF WHICH WILL BE MADE AVAILABLE UPON REQUEST.

InterSystems Corporation disclaims responsibility for errors which may appear in this document, and it reserves the right, in its sole discretion and without notice, to make substitutions and modifications in the products and practices described in this document.

For support questions about any InterSystems products, contact:
InterSystems Worldwide Customer Support
Tel:
Fax:
support@intersystems.com

Table of Contents

Introduction
1 Write Image Journaling and Recovery
  Write Image Journaling
    Image Journal
    Two-Phase Write Protocol
  Recovery
    Recovery Procedure
  Error Conditions
    If Recovery Cannot Complete (UNIX and OpenVMS)
    Sample Recovery Errors
    Write Daemon Panic Condition
    Write Daemon Errors and System Crash
    Freeze Writes on Error
    Responding to a Freeze
  Limitations of Write Image Journaling
2 Backup and Restore
  Backup Integrity and Recoverability
  Importance of Journals
  Backup Methods
    External Backup
    Online Backup
  Configuring Caché Backup Settings
    Define Database Backup List
    Configure Backup Tasks
    Schedule Backup Tasks
  Managing Caché Online Backups
    Run Backup Tasks
    View Backup Status
    View Backup History
    Handle Backup Errors
    Back Up Selected Globals and Routines
  Restoring from a Backup
    Using the Backup History to Recreate the Database
    Suspending Database Access During a Restore
    Restoring Database Properties
    Error Handling for Restore

  2.7 Caché Backup Utilities
    Perform Backup and Restore Tasks Using ^BACKUP
    Back Up Databases Using ^DBACK
    Restore Databases Using ^DBREST
    Estimate Backup Size Using ^DBSIZE
  Sample Backup Procedures
    External UNIX Backup Script
    UNIX Backup and Restore
    OpenVMS Backup
3 Journaling
  Journaling Overview
    Differences Between Journaling and Write Image Journaling
    Protecting Database Integrity
    Automatic Journaling of Transactions
    Rolling Back Incomplete Transactions
    Using Temporary Globals and CACHETEMP
    Journal Management Classes and Globals
  Configuring Journaling
    Configure Journal Settings
    Journaling Best Practices
  Journaling Operation Tasks
    Start Journaling
    Stop Journaling
    Switch Journal Files
    View Journal Files
    Purge Journal Files
    Restore Journal Files
  Journaling Utilities
    Perform Journaling Tasks Using ^JOURNAL
    Start Journaling Using ^JRNSTART
    Stop Journaling Using ^JRNSTOP
    Switch Journal Files Using ^JRNSWTCH
    Restore Globals From Journal Files Using ^JRNRESTO
    Filter Journal Records Using ^ZJRNFILT
    Display Journal Records Using ^JRNDUMP
    Update Journal Settings Using ^JRNOPTS
    Recover from Startup Errors Using ^STURECOV
    Convert Journal Files Using ^JCONVERT and ^%JREAD
    Set Journal Markers Using ^JRNMARK
    Manipulate Journal Files Using ^JRNUTIL

    Manage Journaling at the Process Level Using %NOJRN
  Journal I/O Errors
    Freeze System on Journal I/O Error Setting is No
    Freeze System on Journal I/O Error Setting is Yes
  Special Considerations for Journaling
    Performance
    UNIX File System Recommendations
4 Shadow Journaling
  Shadowing Overview
  Configuring Shadowing
    Configuring the Source Database Server
    Configuring the Destination Shadow
    Journaling on the Destination Shadow
  Managing and Monitoring Shadowing
    Shadow Checkpoints
    Shadow Administration Tasks
    Shadow Operations Tasks
  Using the Shadow Destination for Disaster Recovery
5 System Failover Strategies
  No Failover
  Cold Failover
  Warm Failover
  Hot Failover
6 Caché Cluster Management
  Overview of Caché Clusters
    Cluster Master
    Cluster Master as Lock Server
  Configuring a Caché Cluster
    Multiple Network Device Configuration
  Managing Cluster Databases
    Creating Caché Database Files
    Mounting Databases
  Caché Startup
    Write Image Journaling and Clusters
  Cluster Backup
  System Design Issues for Clusters
    Determining Database File Availability
    Cluster Application Development Strategies
    Block Level Contention

  6.9 Caché ObjectScript Language Features
    Remote Caché ObjectScript Locks
    DCP and UDP Networking
7 Cluster Journaling
  Journaling on Clusters
    Cluster Journal Log
    Cluster Journal Sequence Numbers
  Cluster Failover
    Cluster Recovery
    Cluster Restore
    Failover Error Conditions
  Cluster Shadowing
    Configuring a Cluster Shadow
    Cluster Shadowing Limitations
  Tools and Utilities
    Cluster Journal Restore
      Perform a Cluster Journal Restore
      Generate a Common Journal File
      Perform a Cluster Journal Restore after a Backup Restore
      Perform a Cluster Journal Restore Based on Caché Backups
    Journal Dump Utility
    Startup Recovery Routine
    Setting Journal Markers on a Clustered System
    Cluster Journal Information Global
    Shadow Information Global and Utilities
8 Caché Clusters on Tru64 UNIX
  Tru64 UNIX Caché Cluster Overview
    TruCluster File System Architecture
    Caché and CDSLs
    Remastering AdvFS Domains
  Planning a Tru64 Caché Cluster Installation
  Tuning a Tru64 Caché Cluster Member
9 Caché and Windows Clusters
  Single Failover Cluster
    Setting Up a Failover Cluster
  Example Procedures
    Create a Cluster Group
    Create an IP Address Resource
    Create a Physical Disk Resource

    9.2.4 Install Caché
    Create a Caché Cluster Resource
  Multiple Failover Cluster
    Setting Up a Multiple Failover Cluster
10 ECP Failover
  ECP Recovery
  ECP and Caché Clusters
    Application Server Fails
    Data Server Fails
    Network Is Interrupted
  Cluster as an ECP Database Server
    ECP Clusters
Appendix A: Caché ECP Clusters on Red Hat Enterprise Linux
  A.1 Pre-installation Planning
  A.2 Configuring the Cluster Services for Caché
    A.2.1 Define the Caché Cluster Services
    A.2.2 Install Caché
  A.3 Configuring the Second Node
  A.4 Adding Caché to the Cluster Services
    A.4.1 Caché Initialization File for Linux
  A.5 Maintaining the Caché Registry When Upgrading

List of Figures

Shadowing Overview
Relationships of Shadow States and Permissible Actions
Cold Failover Configuration
Warm Failover Configuration
Hot Failover Configuration
Cluster Shadowing Overview
Example of Tru64 Cluster Configuration
Single Failover Cluster
Failover Cluster with Node Failure
IP Address Advanced Properties
IP Address Parameter Properties
Physical Disk Dependency Properties
Cluster Resource General Properties
Cluster Resource Dependencies Properties
Cluster Resource Advanced Properties
Cluster Resource Parameters Properties
Multiple Failover Cluster
Multiple Failover Cluster with Node Failure

List of Tables

Conditions Affecting Write Daemon Errors
Write Daemon Error Conditions
Backup Task Descriptions
Values of backup_type
UNIX Backup Utilities and Commands
Journal Data Record Fields Displayed by ^JRNDUMP
Journal File Command Type Codes
Functions Available in ^JRNUTIL


Introduction

As organizations rely more and more on computer applications, it is vital to safeguard the contents of databases. This guide explains the many mechanisms that Caché provides to maintain a highly available and reliable system. It describes strategies for recovering quickly from system failures while maintaining the integrity of your data.

Caché write image journaling technology protects against internal integrity failures due to system crashes. Caché backup and journaling systems provide rapid recovery from physical integrity failures. Logical database integrity is ensured through transaction processing, locking, and automatic rollback. In addition, other mechanisms are available to maintain high availability, including shadow journaling and various recommended failover strategies involving Caché ECP (Enterprise Cache Protocol) and clustering. The networking capabilities of Caché can be customized to allow cluster failover.

The following topics are addressed:

- Write Image Journaling and Recovery
- Backup and Restore
- Journaling
- Shadow Journaling
- System Failover Strategies
- Caché Cluster Management
- Cluster Journaling
- Caché Clusters on Tru64 UNIX
- Caché and Windows Clusters
- ECP Failover

This guide also contains the following appendix:

- Caché ECP Clusters on Red Hat Enterprise Linux


1 Write Image Journaling and Recovery

Caché uses write image journaling to maintain the internal integrity of your Caché database. It is the foundation of the database recovery process. This chapter discusses the following topics:

- Write Image Journaling
- Recovery
- Error Conditions
- Limitations

1.1 Write Image Journaling

Caché safeguards database updates by using a two-phase technique, write image journaling, in which updates are first written from memory to a transitional journal, CACHE.WIJ, and then to the database. If the system crashes during the second phase, the updates can be reapplied upon recovery. The following topics are covered in greater detail:

- Image Journal
- Two-Phase Write Protocol

Image Journal

The Write daemon is activated at Caché startup and creates an image journal, in which it records database updates before writing them to the Caché database. By default, the write image journal (WIJ) is named CACHE.WIJ and resides in the system manager directory, usually install-dir/mgr, where install-dir is the installation directory.

To specify a different location for this file, use the System Management Portal:

1. Navigate to the [Home] > [Configuration] > [Journal Settings] page.
2. Enter the new location of the image journal file in the Write image journal directory box and click Save. The name must identify an existing directory on the system and may be up to 63 characters long.

If you edit this setting, restart Caché to apply the change.

Important: InterSystems recommends locating the write image journal (WIJ) file on a separate disk from the database disks (those that contain the CACHE.DAT files) to reduce risk and increase performance.

On some Linux and UNIX platforms, using a raw partition may improve performance. A raw partition is a UNIX character mode special file type that allows raw access to a contiguous portion of a physical disk. To place the image journal in a raw partition:

1. Calculate the size of the partition by adding the amount of database cache, the amount of routine buffer space, plus 10 megabytes. The result is the number of bytes you need to assign to the raw partition.
2. Create a raw partition of that size. See your UNIX system documentation for details.
3. Follow the previous procedure for changing the location of the WIJ directory from the [Home] > [Configuration] > [Journal Settings] page of the System Management Portal to specify the raw partition name for the Write image journal directory setting.
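The sizing rule in step 1 is simple arithmetic; the following sketch makes it concrete. The cache and buffer figures used here are hypothetical examples, not recommendations.

```python
# Sketch of the raw-partition sizing rule from step 1 (hypothetical figures):
# partition size = database cache + routine buffer space + 10 MB.

MB = 1024 * 1024

def wij_partition_bytes(db_cache_bytes, routine_buffer_bytes):
    """Return the number of bytes to assign to the raw WIJ partition."""
    return db_cache_bytes + routine_buffer_bytes + 10 * MB

# Example: 256 MB of database cache and 48 MB of routine buffers.
size = wij_partition_bytes(256 * MB, 48 * MB)
print(size // MB)  # 314
```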
CAUTION: The WIJ file should never be put on a networked disk.

Two-Phase Write Protocol

Caché maintains application data in databases whose structure enables fast, efficient searches and updates. A database update occurs when a Set, Kill, ZSave, or ZRemove command is issued. Generally, when an application updates data, Caché must modify a number of blocks in the database structure to reflect the change. Due to the sequential nature of disk access, any sudden, unexpected interruption of disk or computer operation can halt the update of multiple database blocks after the first block has been written but before the last block has been updated. This incomplete update leads to an inconsistent database

structure. The consequences can be as severe as a database that is totally unusable, with all data irretrievable by normal means.

The Caché write image journaling technology uses a two-phase process of writing to the database to protect against such events, as follows:

- In the first phase, Caché records the changes needed to complete the update in the write image journal. Once it enters all updates to the write image journal, it sets a flag in the file and the second phase begins.
- In the second phase, the Write daemon writes the changes recorded in the write image journal to the database on disk. When this second phase completes, the Write daemon sets a flag in the write image journal to indicate it is empty.

When Caché starts, it automatically checks the write image journal and runs a recovery procedure if it detects that an abnormal shutdown occurred. When the procedure completes successfully, the internal integrity of the database is restored. Caché also runs WIJ recovery following a successful shutdown as a safety precaution to ensure that the database can be safely backed up.

Caché write image journaling guarantees the order of updates. The Write daemon records all database modifications in the image journal. For example, assume that modifications A, B, and C normally occur in that order, and that only B is split over multiple blocks. All three modifications are in the image journal and are written to the database together, so following a failure either all three are in the database or none of them are.

1.2 Recovery

When Caché starts, it automatically checks the write image journal and runs a recovery procedure if it detects that an abnormal shutdown occurred. Recovery is necessary if a system crash or other major system malfunction occurs at either of the following points in the two-phase write protocol process:

- Before the Write daemon has completed writing the update to the write image journal. In this case, recovery discards the incomplete entry and updates are lost.
However, the databases are in a consistent and usable state, and the transaction journal file, if it is being used, can be applied to restore any updates that were lost because they had not yet been written to the database. See the Journaling chapter for more information.

- After the update to the write image journal is complete but before the database is updated. In this case, the recovery procedure applies the updates from the write image journal file to the database to restore internal database integrity.
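The flag semantics of the two-phase protocol, and the recovery decision that follows from them, can be modeled in a short sketch. This is an illustrative model only: the class and function names are invented, and the real WIJ format and Write daemon logic are internal to Caché.

```python
# Illustrative model of the two-phase write protocol and WIJ recovery.
# Names and data structures are hypothetical; they are not the Caché API.

class WriteImageJournal:
    def __init__(self):
        self.entries = []
        self.complete = False   # set when phase 1 finishes
        self.empty = True       # set when phase 2 finishes

def two_phase_write(wij, updates, database):
    # Phase 1: record every change in the write image journal.
    wij.empty = False
    wij.entries = list(updates)
    wij.complete = True
    # Phase 2: apply the recorded changes to the database on disk.
    for block, value in wij.entries:
        database[block] = value
    wij.empty = True

def recover(wij, database):
    """Decision made at startup after an abnormal shutdown."""
    if wij.empty:
        return "no recovery needed"
    if not wij.complete:
        # Crash during phase 1: discard the incomplete entry. The
        # databases stay consistent, but those updates are lost.
        wij.entries, wij.empty = [], True
        return "discarded incomplete entry"
    # Crash during phase 2: replay the journal to restore integrity.
    for block, value in wij.entries:
        database[block] = value
    wij.empty = True
    return "replayed journal"
```

For example, a crash between the two phases leaves `complete` set and `empty` clear, so `recover` replays every recorded block; either all of a group of updates reach the database or none do, matching the ordering guarantee described above.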

Recovery Procedure

If the write image journal is marked as complete, the Write daemon completed writing modified disk blocks to the image journal but had not completed writing the blocks back to their respective databases. This indicates that restoration is needed. The recovery program, cwdimj, does the following:

- Informs the system manager in the recovery log file.
- Performs dataset recovery.
- Continues and completes restoration.

Recovery Log File

The recovery procedure records its progress in the cconsole.log file in the Caché system manager directory. This file contains a record of output from all recoveries run in the %SYS namespace. To view the file, open it with a text viewer or editor. You can also view its contents from the [Home] > [System Logs] > [View Console Log] page of the System Management Portal.

Dataset Recovery

The recovery procedure allows you to confirm the recovery on a dataset-by-dataset basis. Normally, you specify all datasets. After each dataset prompt, type either:

- Y to restore that dataset
- N to reject restoration of that dataset

You can also specify a new location for the dataset if the path to it has been lost but you can still access the dataset. Once a dataset has been recovered, it is removed from the list of datasets requiring recovery and is not recovered during subsequent runs of the cwdimj program, should any be necessary. Typically, all recovery is performed in a single run of the cwdimj program.

Completes Restoration

If no operator is present during the recovery procedure, Caché takes default actions in response to prompts: it restores all directories and automatically marks the write image journal as deleted. However, if a problem occurs during recovery, the cwdimj program aborts and the system is not started. Any datasets which were not successfully recovered are still marked as requiring recovery in the write image journal. See the Error Conditions section for more information.
When the recovery procedure is complete, the recovery program asks whether it should mark the contents of the write image journal as deleted. If recovery has successfully written all blocks, answer Yes. However, if an error occurred during writing, or if you chose not to write the blocks, answer No; otherwise, you will most likely cause database degradation.
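The end-of-recovery decision, including the unattended defaults described above, can be summarized in a sketch. The function name and return strings are illustrative, not part of the cwdimj program.

```python
# Sketch of the end-of-recovery decision (illustrative names and strings).

def recovery_epilogue(all_blocks_written, operator_answer=None):
    """Summarize the action taken at the end of WIJ recovery.

    operator_answer is "Y" or "N" at the "mark the WIJ as deleted?"
    prompt, or None when no operator is present (unattended startup).
    """
    if operator_answer is None:
        # Unattended: the default is to mark the WIJ deleted after a
        # clean recovery; on any problem, cwdimj aborts instead.
        return "marked deleted" if all_blocks_written else "aborted"
    if operator_answer == "Y":
        # Answering Yes is safe only if every block was written.
        if all_blocks_written:
            return "marked deleted"
        return "marked deleted (risk of degradation)"
    # Answering No retains the WIJ; Caché cannot start until it is
    # cleared, removed, or renamed.
    return "retained"
```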

Caché cannot run until either the contents of this file have been deleted or the file has been removed or renamed. When recovery completes normally, the write image journal is marked as deleted, and startup continues.

If the Write daemon cannot create the write image journal, it halts all database modifications. The halt continues until the Write daemon can create the image journal, or until you shut down the system. Once the Write daemon is able to create the image journal, it sends the following message to the console log:

Database updates have resumed

1.3 Error Conditions

If an error occurs that causes database degradation, the Write daemon's action depends on the condition under which the error occurs.

Conditions Affecting Write Daemon Errors

Condition: Database freezes on error.
Write Daemon Action: The Write daemon freezes the system and logs to the operator's console an error message of the type shown in the Freeze Writes on Error section.

Condition: Error trapping is enabled with the command SET $ZT="^%ET".
Write Daemon Action: Error trapping halts the process where the error occurred. One of the error conditions listed in the Write Daemon Error Conditions table is stored in the ^ERTRAP global in the Caché database, unless there is a file-full condition in that database. In that case, the halt occurs with no indication as to why.

Condition: Error occurred as a result of a Caché ObjectScript command entered in programmer mode.
Write Daemon Action: One of the errors listed in the Write Daemon Error Conditions table appears on your screen.

Condition: Serious disk write error occurred in a Caché database file.
Write Daemon Action: The Write daemon freezes the system and displays the following message: SERIOUS DISK WRITE ERROR - WILL RETRY. If it cannot recover, it displays a message of the type shown in the Freeze Writes on Error section. If it is able to recover, database updates resume.

Condition: Serious disk read or write error occurred in the write image file.
Write Daemon Action: The Write daemon freezes the system while it attempts to recover, and displays one of the following messages: SERIOUS DISK ERROR WRITING IMAGE FILE - WILL RETRY or SERIOUS DISK ERROR READING IMAGE FILE - WILL RETRY. If it cannot recover, it displays a message of the type shown in the Freeze Writes on Error section. If it is able to recover, database updates resume.

If Recovery Cannot Complete (UNIX and OpenVMS)

If recovery cannot complete, Caché prompts you to choose between the following two options:

- Abort startup, fix the problem that prevented recovery, and try again. This option is preferable if you have time for it.
- Delete or rename the write image journal file and continue startup. Caché will run with one or more databases suffering degradation caused when an update in progress did not complete when the system crashed or while recovery took place. If you delete the write image journal, you must restore those databases from backups or use repair utilities to fix them.

Sample Recovery Errors

Error Opening CACHE.DAT

If you cannot open a cache.dat or cache.ext file that needs to be restored, you see this message during the write phase:

Can't open file: /usr/cache/cache.dat
Its blocks weren't written

Recovery continues trying to write blocks to all other directories to be restored. If this happens, do the following:

1. Do not delete the write image journal.
2. Try to correct the problem with the Caché database on which the error occurred.
3. Restart and let recovery try again. Directories that were restored the first time are not listed as having blocks to be written during this second recovery attempt.

Error Writing to Caché Block

If recovery starts to write to a Caché database file but cannot write a particular block number, you see this message:

Error writing block number xxxx

If this error occurs four times in a single restoration, the restoration aborts, and you see this message:

Error writing block number xxxx
Do you want to delete the Write Image File (Y/N)? Y =>

Enter N to retain the write image journal. Recovery attempts to continue. If it still does not succeed and you receive this message again, contact the InterSystems Worldwide Response Center (WRC). If you must continue immediately, you can delete or rename the write image journal. If you delete it, you lose all changes recorded in it.

Error Reading Write Image Journal

If an error occurs when recovery attempts to read the write image journal file, you see this message:

Do you want to write them now (Y/N)? Y =>Yes
*** WRITING ABORTED***
Can't read Cache Write Image File
Do you want to delete the Write Image File (Y/N)? Y =>

Write Daemon Panic Condition

If the global buffer pool is full of blocks that need to be written to databases, the Write daemon may enter a state where it cannot write to its write image journal. Before this happens, it notifies you on the operator's console and in the cconsole.log file. It then prints a message for each block written to the database that was not written first to the write image journal. This technique allows you to track the cause of any subsequent database degradation.

If the condition clears because global buffers have been freed up, the Write daemon informs you on the operator's console that the panic condition has ended. If the panic condition does not end, the system may hang. If so, running cstop automatically calls cforce, in which case you most likely have database degradation. To avoid this situation, allocate more database cache from the System Management Portal.
If a panic condition message appears on the operator's console, try adding 1 MB to the database cache.

Write Daemon Errors and System Crash

Caché does not allow database modifications in the event of a Write daemon error. Thus, if a Write daemon error occurs while accessing any of the databases, you avoid database degradation because all updates to any database on the system are suspended.

If the system freezes, you must stop Caché and restart the system. Under rare circumstances, database degradation can occur that cannot be rectified by write image journaling. Run Integrity on the global identified in the error message that the Write daemon logged when the freeze occurred.

Freeze Writes on Error

When the Write daemon encounters an error while writing a block, it freezes all processes performing database updates and logs an error message to the operator's console log, cconsole.log, for as long as the freeze continues. It sends the error messages first at thirty-second, one-, two-, and four-minute intervals, and then at regular eight-minute intervals.

If the cause of the freeze is an offline or write-protected disk, an operator can fix the problem and processing can continue. Otherwise, to recover from a freeze, you need to run:

ccontrol force

and then:

ccontrol start

When the system freezes due to an error, the Write daemon generates an operator console error message that reports the exact error that caused the system to freeze as well as the name of the cache.dat file and the global or routine that was involved in causing the error. The following is an example of an error message that would occur when accessing a global:

*** CACHE: AN ERROR OCCURRED WHILE UPDATING A CACHE.DAT FILE THAT COULD CAUSE DATABASE DEGRADATION. TO PREVENT DEGRADATION ALL USER PROCESSES PERFORMING DATABASE UPDATES HAVE BEEN SUSPENDED AND THE WRITE DAEMON WILL NOT RUN.
ERROR: <DATABASE>
FILE: DUA0:[SYSM]
GLOBAL: ^UTILITY

If the error occurs while accessing a routine, the last part of the error message reads:

ROUTINE: TESTING

The following table describes the errors that can occur during a database update, and provides some possible solutions. Not every occurrence of these errors freezes the system; the system freezes only when the error occurs in the middle of a database update.

Write Daemon Error Conditions

<FILEFULL>
Meaning: A block could not be allocated when information was added to a database because no blocks were available.
Solution: Determine whether there is expansion room in the Caché database. If not, increase the maximum size. Otherwise, determine whether there is enough physical space on the disk.

<DISKHARD>
Meaning: During an attempt to access a block in a file, the request to the operating system failed. This failure may have occurred because the disk is offline or because the actual size of the file is less than the expected size.
Solution: Check that the disk is online. If it is, run Integrity on the global where the error occurred.

<DATABASE>
Meaning: A database integrity problem has been encountered.
Solution: Run Integrity on the global where the error occurred.

<SYSTEM>
Meaning: System error during database update.
Solution: Stop then restart Caché. If the problem still exists, contact the WRC.

Once the problem is corrected, database updates are re-enabled.

Responding to a Freeze

If a freeze occurs, follow the procedure below.

1. Check the operator console to see the directory, the global or routine, and the process in which the error occurred.
2. Fix any causes of the error that you can correct easily. For example, put a disk online.
3. If updates do not resume, stop Caché.
4. Restart Caché.
5. Fix any causes of the error you could not correct earlier. For example, if the error was <FILEFULL>, you would need to provide more physical disk space, add a volume set, or increase the maximum size of the affected Caché database.
6. Run Integrity on the global or routine directory in the database where the error occurred to verify that no degradation occurred.
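The escalating console-message cadence described in the Freeze Writes on Error section (messages after thirty seconds, then one-, two-, and four-minute intervals, then every eight minutes) can be expressed as cumulative elapsed times. This helper is illustrative only, not part of Caché.

```python
# The console-message cadence during a freeze, expressed as elapsed seconds
# since the freeze began: intervals of 30s, 1m, 2m, and 4m, then every 8m.

def freeze_message_times(n):
    """Return the elapsed time (seconds) of the first n console messages."""
    intervals = [30, 60, 120, 240]       # initial back-off schedule
    times, elapsed = [], 0
    for i in range(n):
        elapsed += intervals[i] if i < len(intervals) else 480
        times.append(elapsed)
    return times

print(freeze_message_times(6))  # [30, 90, 210, 450, 930, 1410]
```

So an unattended freeze produces its fifth message about 15 minutes in, and one every eight minutes thereafter, which bounds how quickly an operator watching the console log can notice the problem.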

Some error conditions (<DISKHARD> and <DATABASE>) indicate that database degradation may exist. If degradation exists, try the ^REPAIR utility or contact the WRC. Certain error conditions can cause degradation that write image journaling cannot repair; see the Limitations section.

1.4 Limitations of Write Image Journaling

While the two-phase write protocol safeguards structural database integrity, it does not prevent data loss. If the system failure occurs prior to a complete write of an update to the write image journal, Caché does not have all the information it needs to perform a complete update to disk. Hence, that data is lost. In addition, write image journaling cannot eliminate internal database degradation in the following cases:

- A hardware malfunction on the drive that contains the temporary write image journal prevents Caché from reading this file. Note that the Write daemon freezes if the malfunction occurs during an attempt to read or write this temporary file while Caché is operating. In most cases this means that a malfunction of this disk results only in data loss, not database degradation.
- A drive malfunctions and its contents are irretrievably lost or permanently unalterable. You must restore the backup of this database for the directories using the malfunctioning drive. However, write image journaling can still restore directories on other disks.
- A single process disappears while within the global module (for example, due to a segmentation fault). Such a situation could occur if:
  - On Windows NT, the Task Manager is used to halt a single process.
  - On OpenVMS or UNIX, the terminal for that process is disconnected.
  - On OpenVMS, a STOP/ID is issued. See the $ZUTIL(69,24) entry in the Caché ObjectScript Reference for further details.
- If an obscure situation occurs in which drive A contains pointer blocks to drive B, a Kill command deletes those pointers, and after the Garbage Collector begins its work, drive A becomes inoperable before the pointer block is rewritten. In this situation, write image journaling could fail. This condition usually follows another failure that would prevent this situation from being a problem. Furthermore, this situation is also likely to be one in which drive A has malfunctioned to such an extent that you would need to restore the database for that drive anyway.

If you believe that one of these situations has occurred, please contact the WRC.

2 Backup and Restore

This chapter outlines the factors to consider when developing a solid plan for backing up your Caché system. It discusses techniques for ensuring the integrity and recoverability of your backups, as well as suggested backup methodologies. Later sections of the chapter contain details about the procedures used to perform these tasks, either through the System Management Portal or by using Caché and third-party utilities. It discusses the following topics:

- Backup Integrity and Recoverability
- Importance of Journals
- Backup Methods
- Configuring Caché Backup Settings
- Managing Caché Online Backups
- Restoring from a Backup
- Caché Backup Utilities
- Sample Backup Procedures

Backup strategies can differ depending upon your operating system, preferred backup utilities, disk configurations, and backup devices. If you require further information to help you develop a backup strategy tailored for your environment, or to review your current backup practices, please contact the InterSystems Worldwide Response Center (WRC).

2.1 Backup Integrity and Recoverability

Regardless of the backup methods you use, it is critical to restore backups on a regular basis as a way to ensure that your backup strategy is a workable means of disaster recovery. The best practice is to restore every backup of the production environment to an alternate server, and then check the physical structure of the restored databases. This provides the following backup validation functions:

- Validates the recoverability of the backup media.
- Validates the global-level integrity of the databases in the backup.
- Provides a warm copy of the backup, substantially reducing the time required to restore the backup in the event of a disaster. If such an event occurs, you need only restore the updates in the journal files.
- Establishes a last known good backup.

The backup methods described in this document preserve the physical structure of the database; therefore, a clean integrity check of the restored copy implies that the integrity of the production database was sound at the time of the backup. The converse, however, is not true; an integrity error detected on the restored copy of a database does not necessarily imply that there are integrity problems on the production database. There could, for example, be errors in the backup media. If you discover an integrity error in the restored database, immediately run an integrity check on the production database to verify the integrity of the production system.

Note: See the Check Database Integrity section of the Managing Caché chapter of the Caché System Administration Guide for the details of checking database integrity.

To further validate that the application is working correctly on the restored database, you can also perform application-level checks. To perform these checks, you may need to restore journal files to restore transactional integrity. See the Importance of Journals section for more information.
Once you restore the backup and establish that it is a viable source of recovery, it is best to preserve that restored copy until you establish the next good backup. Therefore, the server on which you are validating the backup should ideally have twice the storage space required by production, so that it can hold the last-known good backup as well as the backup you are currently validating. (Depending on your needs, you may have less stringent performance requirements for the storage device used for restoring backups, allowing for a less expensive storage solution.) In this way, the last-known good backup is always available for use in a disaster even if validation of the current backup fails.

To protect the enterprise from a disaster that could destroy the physical plant, regularly ship backup media to a secure off-site location.

You can run backups during transaction processing; as a result, the backup file may contain partial transactions. When restoring from a backup, you first restore the backup file, then restore from the
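The rotation described above, in which the last-known good backup is kept until a newly restored backup passes its integrity check, can be sketched as follows. This is a minimal illustration; the function and the integrity_ok callback are hypothetical, not part of any Caché API:

```python
def validate_and_rotate(last_known_good, candidate, integrity_ok):
    """Return the backup to treat as last known good after validation.

    The restored candidate replaces the previous last-known good copy
    only after it passes an integrity check; otherwise the previous
    copy is kept online for disaster recovery.
    """
    if integrity_ok(candidate):
        return candidate          # candidate becomes the last known good
    return last_known_good        # keep the previous good copy
```

A failed validation therefore never displaces the copy you would rely on in a disaster.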

journal files to complete the partial transactions in the backup file. Retain all journal files corresponding to the last-known good backup until you identify a new backup as the last-known good backup.

2.2 Importance of Journals

The backup of a Caché database alone is not enough to provide a viable restore of production data. In the event of a disaster that requires restoring from backup, you always apply journal files to the restored copy of the database. Applying journal files restores all journaled updates from the time of the backup up to the time of the disaster. Applying journals is also necessary to restore the transactional integrity of your database by rolling back uncommitted transactions (the databases may have contained partial transactions at the time of the backup).

It is critical to ensure that journal files are available for restore in the event of a disaster. Take the following steps to prevent compromising the journal files when disaster recovery requires you to restore databases:

- Verify that you are journaling all databases that require durability and recoverability.
- Do not purge a journal file unless it was closed prior to the last-known good backup, as determined by the backup validation procedure discussed previously. Set the number of days and the number of successful backups after which to keep journal files appropriately.
- Define an alternate journal directory. Place the primary and alternate journal directories on disk devices that are separate from the storage of the databases, separate from the storage of the write image journal (WIJ), and separate from each other (primary and alternate journal directories should reside on different devices). For practical reasons, these different devices may be different logical unit numbers (LUNs) on the same storage area network (SAN), but the general rule is: the more separation the better.
- Insofar as possible, configure the system so that journals are isolated from any failure that may compromise the databases or WIJ, because if the database or WIJ is compromised, restoring from a backup and journal files may be required. Consider using hardware redundancy such as mirroring to help protect the journals. Long-distance replication can also provide a real-time off-site copy of the journal files. The off-site copy of journals allows recovery from a disaster where the physical plant is destroyed (in conjunction with the off-site copy of the backup media).
- Set the journal Freeze on error option to Yes. If a journal failure occurs such that journaling can no longer write to either the primary or the alternate journal device, you can configure the system to freeze. The alternative is to allow the system to continue, which leads to journaling being disabled. This, among other things, compromises the ability to reliably restore from backups and journal files.
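The retention rule above (purge a journal file only if it was closed before the last-known good backup) amounts to a simple filter. The sketch below is illustrative Python, not a Caché utility; the journal names and close times are assumptions:

```python
def purgeable_journals(journals, last_good_backup_time):
    """Journal files eligible for purging.

    journals maps a file name to the time it was closed, or None if it
    is still open. Only files closed strictly before the last-known
    good backup may be purged; everything else must be retained for a
    possible restore.
    """
    return sorted(name for name, closed_at in journals.items()
                  if closed_at is not None
                  and closed_at < last_good_backup_time)
```

An open journal file (closed_at of None) is never purgeable, regardless of its age.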

Important: It is critical to test the entire disaster recovery procedure from start to finish periodically. This includes backup restore, journal restore, and running simulated user activity on the restored environment.

See the Journaling chapter of this guide for more information.

2.3 Backup Methods

The two main methods of backing up Caché data are the external backup and the Caché online backup. Each of these methods has variations on how to implement it; your backup strategy can contain multiple types of backups performed at different times and with different frequency. This section describes the details and variations of the two types of backups:

- External Backup
- Online Backup

External Backup

Use the external backup in conjunction with technology that provides the ability to quickly create a functional snapshot of a logical disk volume. Such technologies exist at various levels, such as simple disk mirrors, volume shadowing at the operating system level, or more modern snapshot technologies provided at the SAN level. This approach is especially attractive for enterprises that have a very large amount of data, where the output of a Caché online backup would be so large as to be unwieldy.

The approach is to freeze writes to all database files for the duration required to create a snapshot, then create a snapshot of the disk using the technology of choice. After you create the snapshot, thaw the system to again allow writes to the database while you copy the snapshot image to the backup media.

Caché provides the Backup.General class with class methods to simplify and enhance this technique. On nonclustered instances of Caché, these class methods pause physical writes to the database during the creation of the snapshot, while allowing user processes to continue performing updates in memory. This allows for a zero-downtime external backup on nonclustered systems.
Use this mechanism with a disk technology that can create the snapshot within several minutes; if you pause the Write daemon for an extended period of time, user processes could hang due to a shortage of free global buffers.

Important: On clustered instances of Caché, this method pauses user processes for the duration of the freeze.

In addition to pausing writes, the freeze method also handles switching journal files and writing a backup marker to the journal. The class methods that perform the database freeze and thaw operations

are Backup.General.ExternalFreeze() and Backup.General.ExternalThaw(), respectively. On nonclustered systems, if you do not journal your databases, you may lose data if the system crashes while it is suspended. There is also a Backup.General.QuiesceUpdates() class method that blocks new database update activity and waits for existing update activity to finish within a certain period of time. See the Backup.General class documentation in the Caché Class Reference for details on the use of these methods and examples.

The following sections discuss the types of external backups and the advantages and disadvantages of each:

- Concurrent External Backup
- Paused External Backup
- Cold Backup

Concurrent External Backup

A concurrent external backup, or dirty backup, is the most common strategy used by large-scale production facilities that have large databases, have limited time to complete a backup, and require uninterrupted processing 24 hours a day. The utility you use to perform the backup depends on your site preference and the operating system. You may choose a native operating system utility, such as the UNIX tar utility, or a third-party utility such as Veritas or ARCserve.

Advantages: Production is not paused (except possibly very briefly during the incremental backup).

Disadvantages: Multiple files need to be restored (Cache.dat database files and incremental backup files), which causes the restore process to take longer.

Procedure Outline: Perform a concurrent external backup using the following steps as a guide:

1. Clear the list of data blocks modified since the last backup.
2. Copy the Cache.dat database files.
3. Perform a Caché incremental backup, which copies any blocks that changed while the Cache.dat files were being copied; this may cause a very brief suspension of user processes in some configurations.

See the Concurrent External Backup script section for an example.
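The three steps above can be simulated in a few lines. This Python sketch models blocks as dictionary entries purely to show the logic; a real concurrent backup copies Cache.dat files and uses Caché's incremental backup, not dictionaries:

```python
def concurrent_external_backup(database, writes_during_copy):
    """Simulate the three steps of a concurrent ('dirty') backup:
    1. clear the modified-block list,
    2. copy the database files while updates continue,
    3. take an incremental backup of the blocks changed during the copy.
    """
    modified = set()                      # step 1: cleared modified list
    image = dict(database)                # step 2: copy of the database files
    for block, value in writes_during_copy:
        database[block] = value           # concurrent update during the copy
        modified.add(block)
    image.update({b: database[b] for b in modified})  # step 3: incremental
    return image
```

Because the incremental pass re-copies every block touched during the file copy, the resulting image is consistent even though step 2 ran while updates continued.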

Paused External Backup

A paused external backup is the second most common strategy used by large-scale production facilities that have large databases and limited time to complete a backup, but can tolerate a brief suspension of writes to databases. Organizations often use this strategy in conjunction with advanced disk technologies, such as disk mirroring. The approach is to freeze the Caché Write daemon long enough to separate a mirror copy of the data and then quickly resume writes to the databases. You back up the mirror and then later rejoin it to production.

Advantages: An incremental pass is not necessary in the restore process.

Disadvantages: Unless mirroring or a similar technology is used, you must freeze writes to the database for a considerable amount of time, which may cause a shortage of free global buffers.

Procedure Outline: Perform a paused external backup using the following steps as a guide:

1. Freeze writes to the database using the ExternalFreeze() method of the Backup.General class.
2. Separate the disk mirror from production (if using advanced disk technologies), or make a copy of the Cache.dat database files.
3. Resume Caché writes using the ExternalThaw() method of the Backup.General class.
4. If you split a mirror from production, back up the mirror copy of the database and rejoin the mirror to production.

See the Paused External Backup script section for an example.

Cold Backup

You generally use the cold backup strategy when your operation tolerates downtime. Often, smaller installations that do not have strict 24/7 access requirements use this strategy. Sometimes this is done only when performing a complete system backup as part of a maintenance effort such as repairing faulty hardware. In this situation, stop Caché during the backup period and restart it when the backup completes.

Advantages: Very simple procedure (stop Caché and copy the cache.dat files).
Disadvantages: You must stop Caché; consequently, of all the backup options, this method involves the longest downtime.

Procedure Outline:

1. Stop Caché using the ccontrol command or through the Caché Cube.
2. Perform the backup.
3. Restart Caché using the ccontrol command or through the Caché Cube.
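The cold-backup procedure reduces to a short command sequence. The sketch below only assembles that sequence as strings for review; the instance name, file paths, and use of cp are illustrative, while ccontrol is the Caché control command named above:

```python
def cold_backup_plan(instance, db_files, dest):
    """Build the cold-backup command sequence: stop Caché, copy the
    database files, restart Caché. Returns the commands as strings so
    they can be inspected before being run by a scheduler or script."""
    plan = [f"ccontrol stop {instance}"]           # stop Caché
    plan += [f"cp {path} {dest}" for path in db_files]  # copy cache.dat files
    plan.append(f"ccontrol start {instance}")      # restart Caché
    return plan
```

Keeping the plan as data rather than executing it directly makes the downtime window easy to review and log before a maintenance run.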

Online Backup

Caché implements a proprietary backup mechanism designed to cause minimal or, in most cases, no downtime for users of the production system. The online backup captures only blocks that are in use by the database. The output goes to a sequential file. The backup file is then copied to the backup media along with any other external files, such as the .cpf file, the CSP files, and external files used by the application.

The Caché backup uses a multipass scan to back up database blocks. Each pass is expected to have a reduced list of modified blocks, and generally three passes are sufficient to complete a backup. During the entire final pass, and for a brief moment during each prior pass, the system pauses writes to the database. If the backup list contains only new-format databases (8-KB block size), only physical writes to the database are paused while user processes are allowed to continue performing updates in memory. If the backup list contains any old-format (2-KB block size) databases, or if it is a clustered Caché environment, then all user activity is paused for these multiple brief periods.

The concurrent Caché online backup strategy is used when the backup must have the least impact on Caché processes. This is a strategy used across all sizes of production facilities. In the case where 8-KB databases are used in a nonclustered environment, it is possible to back up the database without pausing user processes. The backup procedure incorporates multiple passes to copy the data, where each consecutive pass copies any data blocks that changed during the previous pass. During the last pass, writes to the disk are paused, while writes to the buffers are still allowed; thus users are not impacted (provided there are sufficient global buffers). In a clustered environment (or when some 2-KB databases are backed up), user processes are paused briefly during the final pass of the backup.
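The multipass behavior can be illustrated with a small simulation. This Python sketch assumes that writes made during one pass are re-copied in the next and that no writes occur during the frozen final pass, so the finished image matches the database:

```python
def online_backup(database, writes_per_pass):
    """Simulate the multipass scan: the first pass copies every in-use
    block; each later pass re-copies the blocks changed during the
    previous pass. Because the final pass runs with writes paused, the
    image ends up consistent with the database."""
    image = dict(database)                    # pass 1: all in-use blocks
    for writes in writes_per_pass:            # one catch-up pass per batch
        for block, value in writes:
            database[block] = value           # update arrives mid-backup
        image.update({b: database[b] for b, _ in writes})
    return image
```

Each catch-up pass normally touches fewer blocks than the one before, which is why three passes are generally sufficient.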
There are three different types of concurrent online backups: full, cumulative, and incremental. They can be combined to manage a trade-off between the size of the backup output and the time needed to recover from the backup:

Full Backup
Writes an image of all in-use blocks to the backup media.
Advantages: Provides the basis of your database restoration; a requirement for cumulative and incremental backups.
Disadvantages: Time-consuming operation.

Cumulative Backup
Writes all blocks that have been modified since the last full backup. Must be used in conjunction with a previous full backup.
Advantages: Quicker than a full backup; quicker to restore than multiple incremental backups.
Disadvantages: More time-consuming than incremental backups.

Incremental Backup
Writes all blocks that have been modified since the last backup of any type. Must be used in conjunction with a previous full backup and (optionally) subsequent cumulative or incremental backups.
Advantages: Quickest backup; creates the smallest backup files.
Disadvantages: You may have to restore multiple incremental backups, which slows down the restore process.

Caché online backup writes all database blocks to a single file (or set of tapes) in an interleaved fashion. When an extremely large amount of data is backed up using online backup, restores can become somewhat cumbersome. Consider this when planning your backup strategy. The restore validation process discussed above helps resolve limitations in this area by providing an online, restored copy of the databases.

When using incremental or cumulative backup, use the same backup validation method explained earlier in this document. After each incremental or cumulative backup is performed, it can be immediately restored to the alternate server. As an example, a strategy of weekly full backups and daily incremental backups can work well because each daily backup contains only blocks modified that day. Each day, restore that incremental to the alternate server and check its integrity. As discussed previously, avoid overwriting the warm copy of the last known good backup when restoring the backup currently being validated. The same concept applies when restoring an incremental to the existing restored database: after the backup is established as the last known good backup, and before applying the next day's incremental or cumulative backup to it, save a copy so that the last known good backup is always online and ready for use in case the subsequent incremental restore fails. If a restored backup fails an integrity check, it must be discarded and cannot be used as a target of a subsequent incremental restore.
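The block selection performed by each backup type can be stated compactly. This is an illustrative Python sketch of the rule, not Caché code; the block sets are assumptions:

```python
def blocks_to_write(kind, in_use, modified_since_full, modified_since_last):
    """Which blocks each online backup type writes, per the list above."""
    if kind == "full":
        return set(in_use)                    # every in-use block
    if kind == "cumulative":
        return set(modified_since_full)       # changed since last full
    if kind == "incremental":
        return set(modified_since_last)       # changed since last backup
    raise ValueError(f"unknown backup type: {kind}")
```

The trade-off follows directly: incremental backups write the fewest blocks but more of them must be restored; cumulative backups write more blocks so that only one needs restoring after the full.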
When restoring a system from a Caché backup, first restore the most recent full backup, followed by the most recent cumulative backup, and then all incremental backups taken since the cumulative backup.

2.4 Configuring Caché Backup Settings

You can configure the Caché database backup settings from the [Home] > [Configuration] > [Database Backup Settings] and the [Home] > [Task Manager] pages of the System Management Portal. From the System Management Portal you can perform the following configuration tasks:

- Define Database Backup List
- Configure Backup Tasks
- Schedule Backup Tasks

Define Database Backup List

Caché maintains a database list that specifies the databases to be backed up. You can display this list by opening the [Home] > [Configuration] > [Database Backup Settings] > [Backup Database List] page of the System Management Portal. Use the arrow buttons to move the databases you do not want to back up to the Available list and the databases you do want to back up to the Selected list. Click Save.

When you add a new database to your system, Caché automatically adds it to the database list. If you do not need to include the new database in your backup plan, be sure to remove it from the Backup Database List.

This database list is ignored by the FullAllDatabases backup task, which performs a backup of all databases excluding the CACHETEMP, CACHELIB, DOCBOOK, and SAMPLES databases. If you update the Caché-supplied CACHELIB and DOCBOOK databases, you can add them to the database list and run a FullDBList backup as a base for subsequent backup tasks.

You can also maintain the backup database list using the Backup.General.AddDatabaseToList() and Backup.General.RemoveDatabaseFromList() methods. See the Backup.General class description in the Caché Class Reference for details on using these methods.

Configure Backup Tasks

Caché provides four different types of backup tasks; each is listed as an item on the Database Backup Settings menu. The four backup tasks are:

- Configure Full Backup of All Databases
- Configure Full Backup of the Database List
- Configure Incremental Backup of the Database List
- Configure Cumulative Backup of the Database List

These are predefined backup tasks that an operator can run on demand from the [Home] > [Backup] page of the portal. You can also schedule combinations of these backup tasks using the Task Manager. See the Schedule Backup Tasks section later in this chapter for details. The process for configuring each of these tasks is the same.
The Name, Description, and Type fields are read-only and reflect the menu choice as described in the following table.

Backup Task Descriptions

Name: FullAllDatabases
Description: Full backup of all commonly updated databases, whether or not they are in the Backup Database List.
Type: Full

Name: FullDBList
Description: Full backup of the Caché databases listed in the Backup Database List.
Type: Full

Name: IncrementalDBList
Description: Incremental backup of changes made to the data since the last backup, whether full or cumulative. Backup is performed on the databases currently listed in the Backup Database List.
Type: Incremental

Name: CumuIncrDBList
Description: Cumulative and incremental backup of all changes made to the data since the last full backup. Backup is performed on the databases currently listed in the Backup Database List.
Type: Cumulative

You can send backup output to a directory on disk or to magnetic tape. Select one of the two options:

1. To back up to a directory on disk, specify the file pathname in the Device field. Click Browse to select a directory.
2. To back up to magnetic tape, select the Save to Tape check box, and specify a Tape Number from the list of available tape device numbers. See the Identifying Devices section of the Caché I/O Device Guide for detailed information regarding tape numbers.

The Define Database Backup List section describes how to maintain the Backup Database List.

Backup File Names

By default, backup files are stored in CacheSys\Mgr\Backup. The backup log files are stored in the same directory. Backup files have the suffix .cbk. Backup log files have the suffix .log. Backup files and backup log files use the same naming conventions:

- The name of the backup task, followed by an underscore character (_)
- The date of the backup, in yyyymmdd format, followed by an underscore character (_)
- An incremental number, nnn, for that task, for that day
- The .log or .cbk suffix

In these names, nnn is a sequence number incremented for that backup task on that date. Caché creates a log file for every backup attempt: successful, failed, or aborted. Caché creates a backup file only upon successful backup, but its increment number matches the corresponding log file increment number.

For example, suppose you perform three FullDBList backup operations on June 4, 2006: the first successful, the second aborted, the third successful. This generates three .log files, numbered 001, 002, and 003, but only two .cbk files, numbered 001 and 003.

The backup files:
FullDBList_20060604_001.cbk
FullDBList_20060604_003.cbk

The matching log files:
FullDBList_20060604_001.log
FullDBList_20060604_002.log
FullDBList_20060604_003.log

Schedule Backup Tasks

You should ideally set up a schedule for running backups. Backups are best run at a time when the fewest users are active on the system. In addition to the four backup tasks supplied with Caché, you can create additional definitions of these four backup tasks. For example, you could create two full backup tasks, one to save the backup to a disk file, the other to save the backup to a tape. Or, to alternate backups between two disk drives, you could create a backup task for each drive.

Use the Caché Task Manager to schedule these backup tasks:

1. Navigate to the [Home] > [Task Manager] page of the System Management Portal.
2. Click Schedule New Task.
3. Specify the Name, Description, Backup Type, and output location.

You can delete any task you add by clicking Delete on its row on the Task Schedule page.

2.5 Managing Caché Online Backups

You can run Caché database backup tasks and view backup history from the [Home] > [Backup] page of the System Management Portal. If you schedule additional backup tasks using the Task Manager, you can manage those from the [Home] > [Task Manager] page of the System Management Portal.
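The naming convention can be sketched as a small helper. The function below is illustrative (Caché generates these names itself); it shows why a .log file exists for every attempt while a .cbk file exists only for successful ones:

```python
from datetime import date

def backup_file_names(task, day, attempt, success):
    """Names for one backup attempt: <task>_<yyyymmdd>_<nnn>.log always,
    plus the matching .cbk only when the backup succeeds."""
    stem = f"{task}_{day.strftime('%Y%m%d')}_{attempt:03d}"
    names = [stem + ".log"]          # a log is written for every attempt
    if success:
        names.append(stem + ".cbk")  # backup file only on success
    return names
```

Because the sequence number is shared, a gap in the .cbk numbering (001, 003) points directly at the log of the failed attempt (002).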
From the System Management portal you can perform the following backup tasks:

- Run Backup Tasks
- View Backup Status
- View Backup History
- Handle Backup Errors
- Back Up Selected Globals and Routines

When you add a new database to your system, you must perform a full backup; you cannot perform an incremental backup, or restore a database, until a full backup exists. After installing Caché, InterSystems recommends that you perform a FullAllDatabases backup to establish a complete backup for subsequent use by the other backup tasks.

Run Backup Tasks

There are four types of backup tasks you can run from the System Management Portal, each having its own menu item:

- Run Full Backup of All Databases
- Run Full Backup of the Backup Database List
- Run Incremental Backup of the Backup Database List
- Run Cumulative Backup of the Backup Database List

You must have performed a full backup on a database before performing an incremental or cumulative backup on that database. Read the Run Backup Task box to verify that the settings are correct. If the backup options are correct, click OK to start the backup. While running a backup from the [Home] > [Backup] > [Run Backup] page, you can view the status of the running backup by clicking the text next to Backup started. See Monitor Backup Status for details.

Performing Multivolume Backups

A backup, particularly a full backup, may require multiple tape volumes or multiple disk files. Currently, there is no way to perform a multivolume backup using the System Management Portal; if you require a multivolume backup, use the ^BACKUP utility. If a disk full condition occurs, Caché prompts you for the name of another disk file on another disk. In the event of an error during backup, you cannot restart the backup on a second or subsequent volume; you must restart the backup from the beginning.

View Backup Status

Click View on the running backup process to monitor the progress of the backup operation. The same information is recorded in the log file for that backup operation, which you can later view from the View Backup History page.

When Caché begins a backup, it updates the Time and Status columns of the listing. The Time column records the date and time that the backup was initiated, not when it completed. The Status column is updated to Running. Upon completion of a backup, Caché again updates the Status column to indicate the final status of the backup. Completed indicates the backup successfully completed. Failed indicates the backup could not be performed or was aborted by the operator. One cause of backup failure is trying to perform a backup on a dismounted database.

View Backup History

Every backup operation creates a separate backup log file. The logs follow the naming convention described in Backup File Names. From the portal you can view a list of system backup logs from completed backup tasks:

1. Navigate to the [Home] > [Backup] page of the System Management Portal.
2. Click View Backup History in the right-hand column to display the [Home] > [Backup] > [View Backup History] page.
3. To view the contents of a particular file, click View in the right-hand column of the appropriate row.

You can view the contents of a backup log file and search for a string within that file.

Handle Backup Errors

In the event of an error during backup, the backup utility allows you to retry the device on which the error occurred. Alternatively, you can abort the backup. On stand-alone systems (those not clustered), if you abort a backup, regardless of its type, the next backup must be a full backup. This full backup on a stand-alone system does not block access to the database.
If a backup encounters any I/O errors, the backup aborts and logs a system error in the cconsole.log file, which you can view with the Backup utility, the SYSLOG character-based utility, or any text file viewer. The log file allows you to quickly see where the problem occurred so that you can fix it.

Back Up Selected Globals and Routines

You may want to back up only selected globals or routines in a database. The System Management Portal offers options to perform these tasks. The following are a few cases where these options are helpful:

- Restoring selected globals: If you back up your globals using the Export option from the [Home] > [Globals] page of the System Management Portal, you can use the Import option to restore only the globals you require.
- Restoring a database after extensive repairs: When your Caché database suffers degradation, it does not use space as efficiently as it could; some unused blocks are not marked as being available, and pointers become overly indirect. If you backed up your globals using the Export option before the problem occurred, you can recreate the database and then load the globals using the Import option.
- Restoring selected routines: You can use the Export and Import options from the [Home] > [Routines] page of the System Management Portal to back up and restore individual routines.

You can also selectively back up and restore source code (.MAC, .INC, and .INT files), or both source and object code (.OBJ files), using the Export and ImportDir methods of the %SYSTEM.OBJ class. Routines and globals are backed up into standard format files. These files are referred to as RSA (routine save) and GSA (global save) files.

2.6 Restoring from a Backup

If any problem arises that renders your data inaccessible or unusable, you can recreate that data by restoring the affected database(s) from backup files and applying the changes recorded in the journal files.

When you are restoring an incremental or cumulative backup, the target database must be in exactly the same state as when you restored the previous full backup. You must prevent all updates to the database that you restored from the full backup until you restore all subsequent incremental and cumulative backups.
Failing to heed this warning can result in the incremental/cumulative restore producing a degraded database. If the previous full backup was an external backup, and that external backup is restored before Caché is started (that is, started in order to apply the incremental), updates during startup could modify the restored full backup, invalidating the incremental that needs to be restored. Take special care to avoid this. If the external backup is to be restored in place, Caché must be started and

those databases dismounted before the external backup is restored. The databases can then be mounted with switch 10 set for the purpose of restoring the incremental. Alternatively, you can restore the external backup to alternate directory paths, restore the subsequent incremental backups, and then move them into place.

To perform a restore, use the following strategy:

1. Identify which Caché databases require restoration.
2. Restore the last full backup of those Caché databases.
3. If you have done cumulative incremental backups since the full backup, restore the last one.
4. Restore, in the order in which they were performed, all incremental backups done since the last cumulative incremental backup (or since the full backup, if there are no cumulative incremental backups).
5. Apply the changes in the journal file for the restored directories, or for selected directories and globals you specify.
6. Perform a full backup of the restored system.

CAUTION: If you backed up with a UNIX or OpenVMS backup utility, use the same utility to restore.

Using the Backup History to Recreate the Database

The Backup utility maintains a backup history. The Restore utility prompts you for the backup(s) to restore according to their order in the backup history.

Note: On Caché platforms that support access to the same database from multiple computers, you should always back up a given directory from the same computer, so that its complete backup history is available if you need to restore the directory.

When you select one of the three restore options on the BACKUP main menu, the utility asks you to enter the name of the device holding the first backup to be restored. The first time you enter a restore option, the default is the device to which the last full backup was sent, if there was one. Caché helps you restore backups in logical order. After restoring the last full backup, the utility uses the backups in the Backup History to suggest the next logical backup for you to restore.
It cycles through all of the backups in this way. Having already prompted you with the last full backup, it prompts you to restore subsequent backups in the following order:

1. It prompts you for the most recent cumulative incremental backup taken after the last full backup, if one exists.

2. After you restore the most recent cumulative incremental backup, if there was one, it prompts you to restore all incremental backups taken since the last cumulative incremental backup (or, if none exists, since the last full backup), in order from the earliest to the most recent.

You can override the suggested backups in the restore process. Remember, however, that an incremental or cumulative incremental backup does not represent a complete copy of your disk; you can restore an incremental backup only after restoring a full backup.

Suspending Database Access During a Restore

In most cases, the database you are restoring is not fully independent of the other databases on the system. For this reason, it is recommended that all user activity be suspended during the restore. Even if you are the only user on your system, you should still restrict login access if any users can log in remotely. You can, however, restore a database while users are active on other databases. All databases being restored are dismounted during the restore; therefore, if you do not suspend database access, users who try to access the databases being restored receive <PROTECT> errors.

Restoring Database Properties

If the characteristics of a directory have changed by the time you do a restore, the restore utility handles the differences: it creates Caché databases as necessary and modifies their characteristics as appropriate to return them to the state they were in when the backup completed.

Error Handling for Restore

If an error occurs while you are restoring, you are given these options:

- Retry the device
- Skip that block or set of blocks and continue with the restore
- Abort the restore of that directory but otherwise continue with the restore
- Abort the restore

2.7 Caché Backup Utilities

Caché provides utilities to perform backup and restore tasks.
The ^BACKUP routine provides menu choices to run common backup procedures, which you can also run independently and, in some cases,

non-interactively using scripts. The utility names are case-sensitive. Run all these utilities from the %SYS namespace.

- Perform Backup and Restore Tasks Using ^BACKUP
- Back Up Databases Using ^DBACK
- Restore Databases Using ^DBREST
- Estimate Backup Size Using ^DBSIZE

Perform Backup and Restore Tasks Using ^BACKUP

The Caché ^BACKUP utility allows you to perform Caché backup and restore tasks from a central menu, as shown in the following example:

%SYS>Do ^BACKUP

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option?

Enter the appropriate menu number to start the corresponding routine, or press Enter without entering an option number to exit the utility. Subsequent sections in this document describe the utilities started by each option:

1. Back Up Databases Using ^DBACK
2. Restore All Databases Using ^DBREST
3. Restore Selected Databases Using ^DBREST
4. Maintain Database Backup List

Maintain Database Backup List

Note: When editing the database list, use the database name, not the directory name. This is consistent with the way backup configuration works in the System Management Portal.

The Caché ^BACKUP utility allows you to back up Caché databases or to restore a backup you have already created. If a list of databases has not been created, all databases are included in the backup. If a list is created, that list applies to all aspects of the backup system, including calls to LISTDIRS^DBACK and CLRINC^DBACK for scripted backups.

%SYS>Do ^BACKUP

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option? 4

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information
Option? 3

The following 4 databases are included in backups
CACHEAUDIT   C:\Program Files\CacheSys\Mgr\cacheaudit\
CACHESYS     C:\Program Files\CacheSys\Mgr\
DOCBOOK      C:\Program Files\CacheSys\Mgr\Docbook\
USER         C:\Program Files\CacheSys\Mgr\User\

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information
Option? 4

The following is a list of all databases in the configuration.
Databases which are part of the backup are marked with (*)
(*) CACHEAUDIT             c:\program files\cachesys\mgr\cacheaudit\
    CACHELIB (Read Only)   c:\program files\cachesys\mgr\cachelib\
(*) CACHESYS               c:\program files\cachesys\mgr\
(*) DOCBOOK                c:\program files\cachesys\mgr\docbook\
    SAMPLES                c:\program files\cachesys\mgr\samples\
(*) USER                   c:\program files\cachesys\mgr\user\

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information
Option? 5

=====Last Full Backup Information=====
Date: 19 Jan 2007
Description: Full backup of all databases that are in the backup database list.
Device: c:\program files\cachesys\mgr\backup\fulldblist_ _001.cbk

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information

Option? 1

Enter database to add? SAMPLES
Enter database to add?
You've selected SAMPLES to be added to the backups
Are you sure you want to do this (yes/no)? y
Completed.

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information
Option? 3

The following 5 databases are included in backups
CACHEAUDIT   C:\Program Files\CacheSys\Mgr\cacheaudit\
CACHESYS     C:\Program Files\CacheSys\Mgr\
DOCBOOK      C:\Program Files\CacheSys\Mgr\Docbook\
SAMPLES      C:\Program Files\CacheSys\Mgr\Samples\
USER         C:\Program Files\CacheSys\Mgr\User\

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information
Option? 2

Enter database to remove (^ when done)? SAMPLES
Enter database to remove (^ when done)?
You've removed SAMPLES from the backups
Are you sure you want to do this? y
Completed.

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information
Option? 3

The following 4 databases are included in backups
CACHEAUDIT   C:\Program Files\CacheSys\Mgr\cacheaudit\
CACHESYS     C:\Program Files\CacheSys\Mgr\
DOCBOOK      C:\Program Files\CacheSys\Mgr\Docbook\
USER         C:\Program Files\CacheSys\Mgr\User\

1) Add a database to the backup
2) Remove a database from the backup
3) Show current list of databases included in backups
4) Show list of available databases
5) Display last full backup information
Option?

Back Up Databases Using ^DBACK

The Caché ^DBACK utility allows you to back up Caché databases. If a list of databases has not been created, all databases are included in the backup.
If a list is created, that list applies to all aspects of the backup system, including calls to LISTDIRS^DBACK and CLRINC^DBACK for scripted backups.

%SYS>Do ^BACKUP

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option? 1

Cache Backup Utility
What kind of backup:
1. Full backup of all in-use blocks
2. Incremental since last backup
3. Cumulative incremental since last full backup
4. Exit the backup program
1 =>

External Entry Points for ^DBACK

BACKUP^DBACK

The following procedure invokes a backup, using the passed arguments to satisfy the questions that are asked during interactive execution of ^DBACK:

BACKUP^DBACK(argfile,type,desc,outdev,kiljrn,logfile,mode,clrjrn,swjrn,nwjrnfil,quietimeout,taskname)

Starting with this version of Caché, the meanings of some arguments have changed. Because of changes in the journaling mechanism, you can no longer delete the current journal file, clear the current journal file, or specify a new journal file name.

argfile - No longer used; always pass a NULL value.

type - Type of backup. Values: I Incremental, C Cumulative, F Full, E External full backup.

desc - Description stored in the backup label and in the history global; a free-form text string that can be NULL.

The following arguments are ignored (and therefore not required) for backups of type E.

outdev - Output device for the backup; can be a file name or a tape device.

kiljrn - Y: switch the journal file after the backup. N: ignored.

logfile - File name. A copy of all messages that would be sent to the terminal is also sent to this file. Null value: no log file.

mode - NOISY: (default) print all text on the terminal. QUIET: display only text related to abnormal conditions. NOINPUT: no terminal is attached to this process; no output is sent to the terminal and, if a read must be executed, the backup is aborted (it is advisable to have a log file). NOTE: This argument affects only what is displayed at the terminal; the log file is always NOISY.

clrjrn - Y: switch the journal file. N: ignored.

swjrn - Y: switch the journal file after the backup. N: do not switch the journal file. This is overridden if clrjrn or kiljrn is Y.

nwjrnfil - Ignored.

The following two arguments are optional:

quietimeout - Number of seconds to wait for the system to quiesce before aborting the backup. A zero (0) or negative value means wait indefinitely (default = 60).

taskname - Task name passed by Backup.General.StartTask(); used internally by Caché.

Using these external entry points requires read/write access to the SYS global in all modes. A type E backup deletes the .ind, .ine, and .inf files in the directories in the SYS global. The routine also records this backup, with its description, in the history global as the LASTFULL backup.
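To tie the argument list together, the following plain POSIX shell sketch assembles a type E BACKUP^DBACK call in the order given above and parses a return value of the form documented below for this entry point ("1", or "1," followed by warnings separated by ~). The log file path and the status string are made-up samples, not output captured from a real system.

```shell
#!/bin/sh
# Sketch of a scripted-backup driver around BACKUP^DBACK.
# Argument order: argfile,type,desc,outdev,kiljrn,logfile,mode,clrjrn,
# swjrn,nwjrnfil. outdev and the journal arguments are ignored for a
# type E backup, so they are passed as empty strings here.
CMD='s x=$$BACKUP^DBACK("","E","UNIX backup via call_os_backup","","","/tmp/cback.log","QUIET","","","")'
echo "$CMD"

# Parse a (sample) return value of the form "1" or "1,warn1~warn2".
status="1,journal switch skipped~SAMPLES was dismounted"
code=${status%%,*}                 # text before the first comma
warnings=${status#*,}              # text after the first comma
if [ "$code" = "1" ]; then
    echo "backup succeeded"
    if [ "$warnings" != "$status" ]; then
        echo "$warnings" | tr '~' '\n' | sed 's/^/warning: /'
    fi
else
    echo "backup failed; check the log file"
fi
```

In a real installation the command string would be fed to a Caché session (for example through the cuxs heredoc pattern shown in the cbackup script later in this chapter) and the status read back from the session's output file.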

If switch 10 or 13 is set when this routine is called, it remains set throughout the backup. If switch 13 is set, it is converted to switch 10 for the duration of the backup, since the backup needs to perform Sets and Kills; it is restored on exit. Variables defined by this entry point are NEWed, but those defined within the body of the ^DBACK procedure are not.

Return values:

0 - failed; check the log file if one was specified.
1 - success.
1,warning message string - success, but with warnings; the warnings are separated by ~ in the string.

LISTDIRS^DBACK

LISTDIRS^DBACK(file,mode)

This procedure takes a file name and a mode.

CLRINC^DBACK

CLRINC^DBACK(mode) clears the incremental backup bits.

Restore Databases Using ^DBREST

The Caché ^DBREST database restore utility performs the following actions:

- Requires you to choose whether to restore all or selected directories.
- Asks if you want to stop all other Caché processes from running during the restore. In most cases, answer Yes.
- Asks for the name of the file that holds the backup you wish to restore. If your backup is on more than one volume, that information is in the volume header; after restoring the first volume, the utility prompts you for the next.
- Displays the header of the volume. The header contains the following information determined during the backup: the date of the backup on this volume, the date of the previous backup, and the date of the last full backup.
- Asks you to verify that this is the backup you wish to restore.
- Lists the directories on the device to restore.
- Allows you to specify another input device that contains the next backup to restore.

The ^DBREST menu options 1 and 2 are the equivalent of choosing options 2 and 3 from the ^BACKUP menu.

%SYS>Do ^DBREST

Cache DBREST Utility
Restore database directories from a backup archive
Restore:
1. All directories
2. Selected and/or renamed directories
3. Exit the restore program
1 =>

The following sections describe the process of choosing each option:

1. Restore All Databases Using ^DBREST
2. Restore Selected or Renamed Databases Using ^DBREST

You can also perform these functions non-interactively in a script; see the External Entry Points of ^DBREST section for details.

Restore All Databases Using ^DBREST

Choosing 1 from the ^DBREST menu is equivalent to choosing 2 from the ^BACKUP menu:

%SYS>Do ^BACKUP

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option? 2
Proceed with restoring ALL directories Yes => n
Restore:
1. All directories
2. Selected and/or renamed directories
3. Exit the restore program
1 => 3

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option?

The following procedure restores all directories:

1. Change to the %SYS namespace and start the Backup utility:

USER>DO ^%CD
Namespace: %SYS
You're in namespace %SYS
Default directory is c:\cachesys\mgr\
%SYS>DO ^BACKUP

2. Select Restore ALL from the Backup utility options. This option restores all directories that are on the backup medium.

%SYS>Do ^BACKUP

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option?

3. Confirm that you want to restore all directories:

Proceed with restoring ALL directories Yes=>

If you press Enter, the restore proceeds for all databases. If you answer No, you can choose which type of restore to perform:

Restore:
1. All directories
2. Selected and/or renamed directories
3. Exit the restore program
1 =>

4. Indicate whether you want to suspend Caché processes while the restore takes place. InterSystems recommends suspending processes.

Do you want to set switch 10 so that other processes will be prevented from running during the restore? Yes =>

5. Specify the first file from which to restore. You can press Enter to accept the default file, which is the last full backup.

Specify input file for volume 1 of backup 1 (Type STOP to exit)
Device: c:\cachesys\mgr\backup\fullalldatabases_ _001.cbk =>

6. Check that the description of the backup is correct and verify that this is the file you want to restore.

This backup volume was created by:
Cache for Windows (Intel) 5.1
The volume label contains:
Volume number 1
Volume backup MAR :52AM Full
Previous backup MAR :00AM Incremental
Last FULL backup MAR :00AM
Description Full backup of ALL databases, whether or not they are in the backup database list.
Buffer Count 0
Is this the backup you want to start restoring? Yes =>

7. The utility lists the directories it will restore, and the restore proceeds:

The following directories will be restored:
c:\cachesys\mgr\
c:\cachesys\mgr\cacheaudit\
c:\cachesys\mgr\samples\
c:\cachesys\mgr\test\
c:\cachesys\mgr\user\

***Restoring c:\cachesys\mgr\ at 10:46:
blocks restored in seconds for this pass, total restored.
***Restoring c:\cachesys\mgr\cacheaudit\ at 10:50:01
53 blocks restored in 0.0 seconds for this pass, 53 total restored.
***Restoring c:\cachesys\mgr\samples\ at 10:50:
blocks restored in 0.6 seconds for this pass, 914 total restored.
***Restoring c:\cachesys\mgr\test\ at 10:50:02
53 blocks restored in 0.0 seconds for this pass, 53 total restored.
***Restoring c:\cachesys\mgr\user\ at 10:50:
blocks restored in 0.1 seconds for this pass, 124 total restored.
***Restoring c:\cachesys\mgr\ at 10:50:02
5 blocks restored in 0.0 seconds for this pass, total restored.
***Restoring c:\cachesys\mgr\cacheaudit\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 54 total restored.
***Restoring c:\cachesys\mgr\samples\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 915 total restored.
***Restoring c:\cachesys\mgr\test\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 54 total restored.
***Restoring c:\cachesys\mgr\user\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 125 total restored.
***Restoring c:\cachesys\mgr\ at 10:50:02
3 blocks restored in 0.0 seconds for this pass, total restored.
***Restoring c:\cachesys\mgr\cacheaudit\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 55 total restored.
***Restoring c:\cachesys\mgr\samples\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 916 total restored.
***Restoring c:\cachesys\mgr\test\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 55 total restored.
***Restoring c:\cachesys\mgr\user\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 126 total restored.

8. Specify the input file for the next incremental backup to restore, or enter STOP if there are no more input files to restore.

Specify input file for volume 1 of backup following MAR :52AM (Type STOP to exit)
Device: stop

9.
Indicate whether you want to restore other backups. If you answer Yes, the procedure repeats from step 3. When you respond No, Caché mounts the databases you have restored.

Do you have any more backups to restore? Yes => No
Mounting c:\cachesys\mgr\
c:\cachesys\mgr\... (Mounted)
Mounting c:\cachesys\mgr\cacheaudit\
c:\cachesys\mgr\cacheaudit\... (Mounted)
Mounting c:\cachesys\mgr\samples\
c:\cachesys\mgr\samples\... (Mounted)
Mounting c:\cachesys\mgr\test\
c:\cachesys\mgr\test\... (Mounted)
Mounting c:\cachesys\mgr\user\
c:\cachesys\mgr\user\... (Mounted)

10. Specify which journal entries you want to apply to the restored databases, and the name of the journal file you are restoring. Normally, you select option 1 and apply only those changes that affect the directories you have just restored. Restoring a directory restores the globals in it only up to the date of the backup; if you have been journaling, you can apply journal entries to restore any changes that have been made to the globals since the backup was made.

What journal entries do you wish to apply?
1. All entries for the directories that you restored
2. All entries for all directories
3. Selected directories and globals
4. No entries
Apply: 1 =>

11. The restore from the journal files begins:

We know something about where journaling was at the time of the backup:
0: offset in c:\cachesys\mgr\journal\
Use current journal filter (ZJRNFILT)? No
Use journal marker filter (MARKER^ZJRNFILT)? No
Updates will not be replicated
The earliest journal entry since the backup was made is at offset in c:\cachesys\mgr\journal\
Do you want to start from that location? Yes => Yes
Final file to process (name in YYYYMMDD.NNN format): < > [?] =>
Prompt for name of the next file to process?
No => No
Provide or confirm the following configuration settings:
Journal File Prefix: =>
Files to dejournal will be looked for in:
c:\cachesys\mgr\journal\
c:\journal\altdir\
in addition to any directories you are going to specify below, UNLESS you enter a minus sign ('-' without quotes) at the prompt below, in which case ONLY directories given subsequently will be searched
Directory to search: <return when done>
Here is a list of directories in the order they will be searched for files:

c:\cachesys\mgr\journal\
c:\journal\altdir\
The journal restore includes the current journal file. You cannot do that unless you stop journaling or switch journaling to another file.
Do you want to switch journaling? Yes => Yes
Journaling switched to c:\cachesys\mgr\journal\
You may disable journaling the updates for faster restore; on the other hand, you may not want to do so if a database to restore is being shadowed.
Do you want to disable journaling the updates? Yes => yes
Updates will NOT be journaled
c:\cachesys\mgr\journal\
 % 65.03% 68.44% 72.21% 75.86% 79.26% 82.73% 86.08% 89.56% 92.99% 96.07% 98.87%100.00%
***Journal file finished at 11:03:31
c:\cachesys\mgr\journal\
 % 17.10% 17.90% 18.90% 20.05% 21.33% 22.58% 23.81% 25.15% 26.32% 27.65% 28.85% 30.08% 31.37% 32.59% 33.98% 35.16% 36.25% 37.32% 38.41% 39.55% 40.72% 41.81% 42.83% 43.85% 44.89% 46.00% 47.15% 48.24% 49.28% 50.32% 51.41% 52.54% 53.71% 54.76% 55.80% 56.85% 57.97% 59.10% 60.16% 61.17% 62.19% 63.24% 64.32% 65.18% 66.02% 66.87% 67.71% 68.52% 69.34% 70.14% 70.96% 71.76% 72.60% 73.58% 74.51% 75.43% 76.35% 77.26% 78.17% 79.07% 79.69% 80.31% 80.93% 81.56% 82.20% 82.83% 83.47% 84.27% 87.00% 88.57% 91.65% 93.03% 96.09% 97.44% 99.04%100.00%
***Journal file finished at 11:03:32
Journal reads completed. Applying changes to databases
 % 28.57% 42.86% 57.14% 71.43% 85.71% %
[journal operation completed]
Replication Enabled

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option?

Restore Selected or Renamed Databases Using ^DBREST

The Restore Selected or Renamed Directories option lets you select which directories to restore from the backup medium. It also allows you to restore a database to a different directory name. The following example shows how to restore selected or renamed directories.

1. Start the Backup utility:

%SYS>DO ^BACKUP

2. Select Restore Selected or Renamed Directories from the Backup menu.

3.
Indicate whether you want to suspend Caché processes while the restore takes place. InterSystems recommends suspending processes.

Do you want to set switch 10 so that other Cache processes will be prevented from running during the restore? Yes =>

4. Specify the first file from which to restore. You can press Enter to accept the default file, which is the last full backup.

Specify input file for volume 1 of backup 1 (Type STOP to exit)
Device: c:\cachesys\mgr\backup\incrementaldblist_ _001.cbk =>

5. Check that the description of the backup is correct and verify that this is the file you want to restore.

This backup volume was created by:
Cache for Windows (Intel) 5.1
The volume label contains:
Volume number 1
Volume backup MAR :03AM Full
Previous backup MAR :52AM Full
Last FULL backup MAR :52AM
Description Incremental backup of all databases that are in the backup database list.
Buffer Count 0
Is this the backup you want to start restoring? Yes =>

6. As the utility prompts you with directory names, specify which databases you want to restore, and the directories to which you want to restore them:

For each database included in the backup file, you can:
-- press RETURN to restore it to its original directory;
-- type X, then press RETURN, to skip it and not restore it at all;
-- type a different directory name, in which case it is restored to the directory you specify. (If you specify a directory that already contains a database, the data it contains will be lost.)
c:\cachesys\mgr\ =>
c:\cachesys\mgr\cacheaudit\ =>
c:\cachesys\mgr\test\ =>
c:\cachesys\mgr\user\ =>
Do you want to change this list of directories? No =>

7. After responding to each directory prompt, you see the prompt: Do you want to change this list of directories? No=>. Answer Yes if you want to edit your choices, or press Enter to confirm them.

8. Continue the restore from step 8 of the procedure for restoring all directories, described earlier in this chapter.

External Entry Points of ^DBREST

The Caché restore utility, ^DBREST, provides a non-interactive execution option using external entry points. You can write a script to implement unattended restores by calling one of these two entry points.
Both entry points are functions that return the status of the call. All arguments are input only.

EXTALL^DBREST restores all directories present on the backup device. The syntax for the EXTALL entry point of the ^DBREST utility is:

EXTALL^DBREST(quietmode,allowupd,inpdev,dirlist,jrnopt,jrnfile,jdirglo)

EXTSELCT^DBREST restores selected files from the backup device, or restores to a target directory that is different from the source directory. The syntax for the EXTSELCT entry point of the ^DBREST utility is:

EXTSELCT^DBREST(quietmode,allowupd,inpdev,dirlist,jrnopt,jrnfile,jdirglo)

The following describes the input arguments used by both functions:

quietmode - A non-null value indicates the operation is in quiet (non-interactive) mode; must be non-null for this external call, typically 1.

allowupd - Whether to allow updates during the restore process: 1 allow updates; 0 do not allow updates.

inpdev - Input device that contains the backup. If this device is a tape device, the utility prompts you to specify the device for the next volume.

dirlist - Name of a file containing a list of directories to restore; the file must contain one record for each directory to be restored. Ignored by the EXTALL entry point.

jrnopt - Journal restore option: 1 all directories for which you just restored the backup; 2 all directories in the journal; 3 selected directories and globals specified by the jdirglo argument; 4 no directories.

jrnfile - Journal file. If null, the utility uses the current file, which is stored in the ^%SYS("JOURNAL","CURRENT") global.

jdirglo - Used only when jrnopt is 3. Name of the file containing selection criteria for directories and globals for the journal restore.

Requirements of the file indicated by jdirglo:

- The file should contain one record for each directory on which you want the journal restored.

- Separate each field with a comma (,).
- Format of each record: <DirName>,<RestAll>,<Globals>, where:

DirName - Name of the directory on which you want to restore the journal.

RestAll - Whether to restore journal entries for all globals in the directory; not case-sensitive, and required. Y: restore the journal for all globals in this directory. N: specify the globals for journal restore in the globals list.

Globals - A comma-separated list of globals for journal restore; used only if RestAll is N. If the list of globals is long, you can enter the remaining global names on the next line, but you must specify the directory name again, followed by N and then the list of globals.

Examples of different records:

DUA0:[TEST1],Y
DUA1:[TEST2],N,GLO1,GLO2,GLO3
DUA1:[TEST2],N,GLO4,GLO5,GLO6
DUA1:[TEST3],n

The last record could also be omitted entirely if you do not want to restore the journal for that directory.

Requirements of the file indicated by dirlist:

- The file contains one record for each directory to restore.
- Separate each field with a comma (,).
- Format of each record: <SourceDir>,<TargetDir>,<CreateDir>, where:

SourceDir - Name of the directory to restore.

TargetDir - Directory to which to restore; omit if restoring to the same directory as the source.

CreateDir - Whether to create the target directory if it does not exist. Y: create the target directory. N: do not create the target directory.
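As a concrete illustration of the two record layouts, this sketch writes a dirlist/jdirglo pair such as a script might prepare before calling EXTSELCT^DBREST. All directory and global names here are invented for the example.

```shell
#!/bin/sh
# Build the selection files for a scripted EXTSELCT^DBREST call
# (hypothetical UNIX-style paths and global names).

# dirlist: <SourceDir>,<TargetDir>,<CreateDir> -- one record per
# directory; TargetDir/CreateDir omitted when restoring in place.
cat > /tmp/dirlist.txt <<'EOF'
/cachesys/mgr/user/
/cachesys/mgr/samples/,/restore/samples/,Y
EOF

# jdirglo: <DirName>,<RestAll>,<Globals> -- used only when jrnopt is 3.
cat > /tmp/jdirglo.txt <<'EOF'
/cachesys/mgr/user/,Y
/restore/samples/,N,GLO1,GLO2
EOF

echo "dirlist records: $(grep -c . /tmp/dirlist.txt)"
# prints: dirlist records: 2
```

Note that the jdirglo record for the renamed database names the target directory, since that is where the journal entries are to be applied after the restore.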

Examples of different records:

DUA0:[TEST1]
DUA1:[TEST2],DUA1:[TEST2]
DUA2:[TEST3],DUA2:[TEST4],Y
DUA2:[TEST5],DUA2:[TEST6],N

The following return conditions are possible from calling these functions:

- No errors; successful completion
- Cannot open input device (device to restore from)
- Volume label does not match label in backup history
- Backup/Restore already in progress
- Invalid device for reading selection criteria (for selective restore of directories)
- Invalid journal restore option
- Invalid journal file, or file with selection criteria for journal restore cannot be opened

Estimate Backup Size Using ^DBSIZE

Immediately before performing any backup, estimate its size using the ^DBSIZE utility, which estimates the disk space needed for the output created by a Caché backup. It is only an estimate, since there is no way of knowing how many blocks will be modified once the backup has started. You can obtain a more accurate estimate by preventing global updates while running ^DBSIZE, and then performing your backup before allowing global updates to resume.

You can estimate the size of backups in two ways:

- Run ^DBSIZE interactively
- Call ^DBSIZE from a routine

Note: A database must be in the list of databases selected for backup before you can evaluate it with ^DBSIZE.

Run ^DBSIZE Interactively

The following procedure describes the steps necessary to run ^DBSIZE interactively:

1. Run the utility from the %SYS namespace:

Do ^DBSIZE

2. Caché displays the ^DBSIZE main menu:

Incremental Backup Size Estimator
What kind of backup:
1. Full backup of all in-use blocks
2. Incremental since last backup
3. Cumulative incremental since last full backup
4. Exit the backup program
1=>

3. Select the type of backup for which you want an estimate: full, incremental, or cumulative incremental.

4. At the Suspend Updates? Yes=> prompt, either press Enter to suspend updates, so that you get a more accurate estimate, or enter No to continue updates.

5. Examine the results that are displayed. First, ^DBSIZE shows how many Caché blocks are needed for the type of backup you selected, for each directory in the backup list and for all directories in the backup list together:

Suspend Updates? Yes=> n
Directory In-Use Blocks
c:\cachesys\mgr\ 983
c:\cachesys\mgr\cachelib\ 5320
c:\cachesys\mgr\docbook\ 6137
c:\cachesys\mgr\samples\ 687
c:\cachesys\mgr\user\
Total Number of Database Blocks:
For a disk file:
Total size including overhead: byte blocks = bytes
For Magnetic Media:
Total Number of 16KB Blocks including overhead of backup volume and pass labels: 1655

Next, ^DBSIZE provides information about backup to a disk file (for Windows 95/98, Windows NT, and UNIX) or an RMS file (for OpenVMS). If the directories to be backed up include any long strings, you see separate lines for standard and long block sizes.

For a disk file:
Total size including overhead: byte blocks = bytes
For an RMS file:
Total Number of 512 Blocks including overhead of backup volume and pass labels:
Pre Allocation quantity is:

Finally, ^DBSIZE reports the amount of space used if the backup is made to magnetic tape:

For Magnetic Media:
Total Number of 16KB Blocks including overhead of backup volume and pass labels:

Use the ^DBSIZE Function

You can also call ^DBSIZE from a routine, using the following function:

$$INT^DBSIZE(backup_type)

Important: The values of backup_type differ from the option values used when running ^DBSIZE interactively.

Values of backup_type:

1 - Incremental backup
2 - Full backup
3 - Cumulative incremental backup

For example, to estimate a full backup:

%SYS>w $$INT^DBSIZE(2)
13178^5

The returned value is two numbers separated by a caret (^). In this example, 13178 is the total estimated size of the backup, in blocks, and 5 is the number of databases to be backed up.
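In a backup script, the caret-separated return value can be split with ordinary shell parameter expansion. The sketch below hard-codes the sample value from the example above rather than capturing it from a live Caché session, and it assumes the simple size^count form of the result.

```shell
#!/bin/sh
# Split a $$INT^DBSIZE result of the form "<blocks>^<databases>".
result="13178^5"        # sample value from the example above
blocks=${result%%^*}    # text before the first caret
dbcount=${result#*^}    # text after the first caret
echo "estimated $blocks blocks across $dbcount databases"
# prints: estimated 13178 blocks across 5 databases
```

A script could compare $blocks against the free space on the backup device and abort before starting a backup that cannot fit.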

%SYS>w $$INT^DBSIZE(1)
996^5
%SYS>w $$INT^DBSIZE(3)
996^5
%SYS>
%SYS>w $$INT^DBSIZE(1)
95^3^950272^16^ ^1980^1980
%SYS>w $$INT^DBSIZE(2)
2620^3^ ^377^ ^43560^43560
%SYS>w $$INT^DBSIZE(3)
95^3^950272^16^ ^1980^1980
%SYS>

2.8 Sample Backup Procedures

- Caché Backup on UNIX
- Caché Backup on OpenVMS
- Incremental Backup on UNIX
- Schedule Backups with Task Manager
- External UNIX Backup Script

Caché makes it easy to integrate external backup utilities into your backup procedures. The following is an example of a UNIX procedure:

1. Clear the list of database blocks modified since the last backup. This synchronization point later allows you to identify all database blocks modified during the backup. Call the application program interface (API) CLRINC^DBACK("QUIET") in the backup script; this completes instantly.

2. Using your preferred backup utility, copy the CACHE.DAT files, which may be in use.

3. Perform an incremental backup of the blocks modified by users during the backup, writing the output to a sequential file. Since it is likely to be a small set of blocks, this step should complete very quickly. Call the API BACKUP^DBACK() in the backup script.

Important: The journal file should be switched at this time.

4. Copy the incremental file to the backup save-set, using your preferred UNIX command.

The following is an abbreviated example of the cbackup script:

    ../bin/cuxs -s . -U "" -B << cleanup
    s x=\$\$CLRINC^DBACK("NOISY")
    i x s x=\$\$BACKUP^DBACK("","E","UNIX Backup via call_os_backup","","","")
    o "cback_temp":"WAS" u "cback_temp" w x c "cback_temp"
    h
    cleanup

CLRINC and BACKUP("","E") are performed in one Caché session.

UNIX Backup and Restore

You should perform a UNIX-level backup of your system periodically. You can perform a UNIX-level backup either from the UNIX prompt or using the UNIX/Caché backup facility cbackup discussed below.

Using UNIX Backup Utilities

Use a UNIX backup utility in place of Caché Backup in the following situations:

    To back up and restore sequential files or other software that is not part of Caché.
    To move Caché data between systems when you cannot transfer databases directly.

The following table lists UNIX utilities you may find useful for these types of backups.

UNIX Backup Utilities and Commands

    cp and mv (copy and move): Copies or moves the given file or files to a different file, directory, or file system. For example, move the manager's directory (and the binary files in /usr/bin) to another directory with the following command:
        #mv /usr/cache/* /usr/cache.old/*
    tar (tape archiver): Standard UNIX command to copy files and directories, extract files from tape, and list the files on a tape. Produces more portable output than cpio or dump.
    cpio: In conjunction with the find command, performs backups similar to the tar command.
    dump: Performs complete system backups.
    bru: A third-party backup and restore utility. Before using it on your system, verify that it works properly and meets your requirements.
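As the table notes, tar (or cpio) is often paired with find to collect the files to archive. The following is a hedged sketch of that pattern for CACHE.DAT files; the directory and archive paths are illustrative assumptions, not fixed Caché locations, and the -T option assumes GNU tar.

```shell
# Archive every CACHE.DAT found under a directory tree.
# The directory layout below is an example only; substitute your own paths.
backup_cachedat() {
    dbroot=$1; archive=$2
    find "$dbroot" -name CACHE.DAT -print > /tmp/cback_file_list
    tar cf "$archive" -T /tmp/cback_file_list   # GNU tar; -T reads the file list
}

# Demonstration against a scratch tree:
mkdir -p /tmp/cbdemo/db1 /tmp/cbdemo/db2
: > /tmp/cbdemo/db1/CACHE.DAT
: > /tmp/cbdemo/db2/CACHE.DAT
backup_cachedat /tmp/cbdemo /tmp/cbdemo.tar
tar tf /tmp/cbdemo.tar   # lists the archived CACHE.DAT files
```

For a tape device, the archive argument would be the device name (for example, the /dev/rmt0 used in the call_os_backup template later in this section).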

cbackup Utility

A UNIX utility called cbackup allows you to back up your Caché databases using a system-level backup while automatically updating your Caché internal incremental backup. Using cbackup allows you to use the backup tools available on your operating system to back up your Caché databases and synchronize the bitmap information with the Caché internal incremental backup facility. This updates your Caché backup history so that you can use cbackup for full backups and the Caché Backup utility for incremental backups. On Caché for UNIX, you must initiate all restores after backup interactively through the BACKUP utility.

Note: The cbackup utility is called from the shell environment. It halts Caché while it performs the backup and then restarts it.

Before using cbackup, you must first set up the call_os_backup file to choose your UNIX backup utility. The call_os_backup file is automatically installed in your system manager directory. The example below uses tar as the backup utility:

    # InterSystems Corporation
    #
    # File: call_os_backup
    #
    # This is a template for the user-specific backup procedure.
    # It should backup Cache database files from a directory list contained in
    # the cback_dir_list file, which is generated by the cbackup script that
    # calls this one.
    #
    # Do not forget to include database extension files if any!
    #
    # Variable 'backstatus' should be set to 1 for success
    #                                        0 for failure
    #
    # For example:
    #
    # sed -e "s/\/$//" cback_dir_list > cback_tar_list
    # if tar cvff /dev/rmt0 cback_tar_list
    # then backstatus=1
    # else backstatus=0
    # fi

To use tar as your UNIX backup utility, remove the comment marks from the last five lines above. If you want to use a different utility, follow the form presented above to set up the backup utility.

    #
    echo "This message is from the call_os_backup script, which should contain a"
    echo "user-defined backup procedure to backup Cache database files according"
    echo "to a list of directories stored in cback_dir_list."
    echo ""
    echo "!!! Do not forget to include database extension files if any !!!"
    echo ""
    echo " Variable 'backstatus' should be set to 1 for success"
    echo "                                         0 for failure"
    #
    backstatus=1
    #
    exit $backstatus

Performing Both Caché and System-level Backups Using cbackup

1. Be sure Caché is running and that you are in the manager's directory.

2. Enter the Backup Menu to ensure that you have selected all the directories you want to back up.

3. Halt out of Caché to return to the UNIX shell.

4. Type cbackup at the UNIX shell prompt. You are prompted to confirm the directories to be backed up. The cbackup script automatically shuts down Caché and activates the call_os_backup script, which performs the backup.

OpenVMS Backup

To perform an OpenVMS BACKUP you must shut down Caché. The OpenVMS backup copies entire CACHE.DAT and CACHE.EXT files. CACHE.DAT and CACHE.EXT files created with either the Caché Database utility or the character-based MSU utility are RMS files. Thus, they can be backed up and copied using OpenVMS utilities, such as BACKUP.

You can defragment your Caché databases by using the OpenVMS BACKUP utility monthly to back up and restore your CACHE.DAT and CACHE.EXT files. You can use the OpenVMS BACKUP utility to perform your weekly full backup as part of your strategy to ensure the physical integrity of your database. OpenVMS BACKUP provides greater redundancy and error checking than Caché backup. As a result, it is more proficient at recovering from tape errors. The disadvantage of OpenVMS BACKUP is that, normally, Caché must be shut down to run it. However, you can overcome this disadvantage by doing an OpenVMS BACKUP while Caché is running and following it with an incremental backup.
See Using CBACKUP.COM.

Running OpenVMS BACKUP for a Caché Database

If you are using the OpenVMS BACKUP utility by itself, rather than in conjunction with Caché incremental backup as in the CBACKUP.COM example file, use this procedure:

1. Have users log off the system and set OpenVMS interactive logins to zero.

2. Stop Caché using the ccontrol stop command procedure.

3. Use OpenVMS BACKUP to back up the system.

4. If you are doing the backup to defragment your files, use the Restore option of the utility.

5. Start Caché using the ccontrol start command procedure and resume operation.

6. Enable OpenVMS logins.

7. Record this full backup in the Management Information option of the Caché BACKUP utility.

Using CBACKUP.COM

InterSystems supplies a command procedure, CBACKUP.COM, which provides a model of how to use entry points in Caché backup routines to perform a backup while Caché is running. This command procedure gives you examples of various combinations of OpenVMS and Caché backups. CBACKUP.COM is loaded into the CACHESYS directory during Caché installation from the file CBACKUP_PROTO.COM on the distribution tape.

CBACKUP.COM checks that the process which is executing it meets one of the following three criteria:

    Has the system manager's UIC
    Is authorized to hold SYSPRV
    Is authorized to hold CMKRNL

This privilege is required because CBACKUP.COM uses the /IGNORE=INTERLOCK qualifier in the OpenVMS BACKUP command. If the process does not meet one of the criteria, an error message is printed and CBACKUP.COM terminates.

CBACKUP.COM carries out these actions:

1. Performs the OpenVMS backup.

2. Records the date, time, and a brief description of the OpenVMS full backup in the Caché backup history. This information is used later when you request a restore.

3. Runs a Caché incremental backup.

InterSystems recommends that you examine this procedure in detail, modify it as necessary, and use it if you wish to use OpenVMS BACKUP or any entry points to Caché Backup.

3 Journaling

Global journaling preserves changes in the database since the last backup. While a backup is the cornerstone of physical recovery, it is not the complete answer. Restoring the database from a backup does not recover changes made since that backup, and the backup is typically a number of hours older than the point at which physical integrity was lost. What happens to all the database changes that occurred since then? The answer lies with journaling.

This chapter discusses the following topics:

    Journaling Overview
    Configuring Journaling
    Journaling Operation Tasks
    Journaling Utilities
    Journal I/O Errors
    Special Considerations for Journaling, which includes information regarding UNIX file systems and performance

3.1 Journaling Overview

Each instance of Caché keeps a journal. The journal is a set of files that keeps a time-sequenced log of changes that have been made to the databases since the last backup. The journal is both redundant and logical, and it does not use the Caché write daemon. Caché transaction processing works with journaling to maintain the logical integrity of data.

When Caché starts, it reapplies all journal entries since the last write daemon pass. Since user processes update the journal concurrently, rather than through the write daemon, this approach provides added assurance that updates prior to a crash are preserved. The configuration and management of Caché journaling provide a safe and consistent approach to supporting highly available systems.

The journaling state is a property of the database, not of individual globals. A database can have only one of two global journaling states: Yes or No. The journal contains global update operations (primarily Set and Kill operations) for globals in transactions, regardless of the setting of the databases in which the affected globals reside, as well as all update operations for globals in databases whose Global Journal State is Yes. This greatly improves the reliability of the system; it avoids inconsistencies (after crash recovery) due to updates to globals that may or may not be journaled, and that may or may not be involved in transactions.

Journaling of global operations in databases mounted on a cluster depends on the database setting. The local Caché instance does not journal transaction operations to globals on remote nodes. In a network configuration, journaling is the responsibility of the node on which the global actually resides, not the one that requests the Set or Kill. Thus, if node B performs a Set at the request of node A, the journal entry appears in the journal on node B, not node A.

Backups and journaling are daily operations that allow you to recreate your database. If any problem arises that renders your database inaccessible or unusable, you can restore the backups and apply the changes in the journal to recreate your database. This method of recovering from a loss of physical integrity is known as roll-forward recovery. The journal is also used for rolling back incomplete transactions.

The default Global Journal State for a new database is Yes.
New Caché instances have the journaling property set to Yes for the CACHEAUDIT, CACHESYS, and USER databases. The CACHELIB, CACHETEMP, DOCBOOK, and SAMPLES databases have the property set to No. Operations on globals in CACHETEMP are never journaled; map temporary globals to the Caché temporary database, CACHETEMP.

The following topics provide greater detail of how journaling works:

    Differences Between Journaling and Write Image Journaling
    Protecting Database Integrity
    Automatic Journaling of Transactions
    Rolling Back Incomplete Transactions
    Using Temporary Globals and CACHETEMP
    Journal Management Classes and Globals

3.1.1 Differences Between Journaling and Write Image Journaling

In this chapter, the journal refers to the journal file; journaling refers to the writing of global update operations to the journal file. Do not confuse the Caché journal described in this chapter with write image journaling, which is described in the Write Image Journaling and Recovery chapter of this guide.

Journaling provides a complete record of all database changes, as long as you have journaling enabled for the database. In the event of database loss or degradation, you restore the contents of the journal file to the database.

Write image journaling provides a copy of any database modifications that are not actually written to the database when a system crash occurs. In such a case, Caché automatically writes the contents of the write image journal to the database when it restarts.

Protecting Database Integrity

The Caché recovery process is designed to provide maximal protection:

    It uses the roll-forward approach. If a system crash occurs, the recovery mechanism completes the updates that were in progress. By contrast, other systems employ a roll-back approach, undoing updates to recover. While both approaches protect internal integrity, the roll-forward approach used by Caché does so with reduced data loss.

    It protects the sequence of updates; if an update is present in the database following recovery, all preceding updates are also present. Other systems, which may not correctly preserve update sequence, can yield a database that is internally consistent but logically invalid.

    It protects the incremental backup file structures, as well as the database. You can run a valid incremental backup following recovery from a crash.

Automatic Journaling of Transactions

In a Caché application, you can define a unit of work, called a transaction. Caché transaction processing uses the journal to store transactions.
Caché journals any global update that is part of a transaction, regardless of the global journal state setting for the database in which the affected global resides.

You use commands to:

    Indicate the beginning of a transaction.
    Commit the transaction, if the transaction completes normally.
    Roll back the transaction, if an error is encountered during the transaction.

Caché supports many SQL transaction processing commands. See the Transaction Processing chapter of Using Caché ObjectScript for details on these commands.

Rolling Back Incomplete Transactions

If a transaction does not complete, Caché rolls back the transaction using the journal entries, returning the globals involved to their pre-transaction values. As part of updating the database, Caché rolls back incomplete transactions by applying the changes in the journal, that is, by performing a journal restore. This happens in the following situations:

    During recovery, which occurs as part of Caché startup after a system crash.

    When you halt your process while transactions are in progress.

    When you use the Terminate option to terminate a process from the [Home] > [Process Details] page of the System Management Portal. If you terminate a process initiated by the Job command, the system automatically rolls back any incomplete transactions in it. If you terminate a user process, the system sends a message to the user asking whether it should commit or roll back incomplete transactions.

You can write rollback code into your applications. The application itself may detect a problem and request a rollback. Often this is done from an error-handling routine following an application-level error. See the Managing Transactions Within Applications section of the Transaction Processing chapter of Using Caché ObjectScript for more information.

Using Temporary Globals and CACHETEMP

Nothing mapped to the CACHETEMP database is ever journaled. Since the globals in a namespace may be mapped to different databases, some may be journaled and some may not be. It is the journal property of the database to which a global is mapped that determines whether Caché journals the global operation. The difference between CACHETEMP and a database with the journal property set to No is that nothing in CACHETEMP, not even a transactional update, is journaled.

If you need to exclude new z/z* globals from journaling, map the globals to a database with the journal property set to No.
To always exclude z/z* globals from journaling, you must map them in every namespace to the CACHETEMP database. Caché does not journal temporary globals. Some of the system globals designated by Caché as temporary and contained in CACHETEMP are:

    ^%cspsession
    ^CacheTemp*
    ^mtemp

Journal Management Classes and Globals

See the class documentation for %SYS.Journal.System in the Caché Class Reference for information on available journaling methods and queries. It is part of the %SYS.Journal package.

Also, Caché uses the ^%SYS("JOURNAL") global node to store information about the journal file. For example:

    ^%SYS("JOURNAL","ALTDIR") stores the name of the alternate journal directory.
    ^%SYS("JOURNAL","CURDIR") stores the name of the current journal directory.
    ^%SYS("JOURNAL","CURRENT") stores journal status and the journal file name.

You can view this information from the [Home] > [Globals] page of the System Management Portal.

3.2 Configuring Journaling

Caché starts with journaling enabled for the following databases: CACHEAUDIT, CACHESYS, and USER. You can enable or disable journaling on each database from the [Home] > [Configuration] > [Local Databases] page of the System Management Portal. Click Edit on the row corresponding to the database and click Yes or No in the Global Journal State box. The default setting of the journal state for new databases is Yes. When you first mount a database from an earlier release of Caché, the value is set to Yes, regardless of the previous setting for new globals and regardless of the previous settings of individual globals within that database.

You can change the global journal setting for a database on a running system. If you do this, Caché warns you of the potential consequences and audits the change if auditing is enabled.

The journal file name is in current date format (yyyymmdd.nnn). The suffix nnn starts at 001 and increases incrementally. When the journal file fills, Caché automatically switches to a new one. The new file has the same directory name, but a different numeric suffix. If you use a journal file prefix, Caché continues to use that prefix.
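The naming scheme can be sketched in shell terms. This is an illustrative sketch only, not a Caché utility: it derives a yyyymmdd.nnn-style name from the current date and the highest suffix already present in a directory, and the directory path is an assumption.

```shell
# Sketch: compute the next journal file name in a directory, following the
# yyyymmdd.nnn convention described above. For illustration only.
next_journal_name() {
    dir=$1
    today=$(date +%Y%m%d)
    # Highest existing suffix for today's date, with leading zeros stripped:
    last=$(ls "$dir"/"$today".??? 2>/dev/null | sed 's/.*\.0*//' | sort -n | tail -1)
    next=$(printf '%03d' $(( ${last:-0} + 1 )))
    echo "$today.$next"
}

mkdir -p /tmp/jrndemo
touch "/tmp/jrndemo/$(date +%Y%m%d).001"
next_journal_name /tmp/jrndemo   # prints today's date with suffix .002
```

An empty directory yields suffix .001, matching the stated starting value.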
For example, if a journal file with suffix .001 fills, Caché starts a new one with suffix .002; if the date changes while the journal file is filling, the new journal file is named with the new date, and the suffix restarts at .001.

The following sections describe configuration in greater detail:

    Configure Journal Settings
    Journaling Best Practices

Configure Journal Settings

To configure Caché journaling, navigate to the [Home] > [Configuration] > [Journal Settings] page of the System Management Portal. You can edit the following settings:

Journal directory
    Enter the name of a directory in which to store the journal files. The name may be up to 63 characters long. InterSystems recommends that the journal directory be located in a different partition from your databases. The default is the Journal subdirectory in the Caché installation manager's directory.

Alternate journal directory
    Enter the name of an alternate directory that journaling switches to if the current directory's disk is full or becomes unavailable. The same characteristics apply as for the journal directory. Defaults to the journal directory.

Start new journal file every
    Enter the maximum size of the journal file, in megabytes, after which the journal file switches. The default size is 1024 MB.

Journal File Prefix (optional)
    Enter an alphanumeric string to distinguish the journal file name. A change to this setting requires a restart to take effect.

Important: InterSystems recommends isolating journal files from the databases by updating both the current and alternate journal file locations to separate disk partitions before any activity takes place on the system.

You can also update the previous four settings using the ^JRNOPTS routine or by selecting option 6, Edit Journal Properties, from the ^JOURNAL routine menu. See the Update Journal Settings Using ^JRNOPTS section for details.

When to purge journal files
    You can set either or both of the following two options. If you enter nonzero values for both settings, purging occurs when a journal file meets the sooner of the two conditions.

    - After this many days: Enter the number of days after which to purge (valid values: 0-100).
    - After this many successive successful backups: Enter the number of consecutive successful backups after which to purge (valid values: 0-10).
This includes any type of backup, whether a Caché backup or an external backup that calls the $$BACKUP^DBACK("","E") function after successful completion.

Note: No journal file containing currently open transactions is purged, even if it meets the criteria of this setting.
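The age-based half of the purge rule can be approximated outside Caché with find. This is a hedged sketch for inspection only: it lists, rather than deletes, journal files older than a given number of days; the directory path is an assumption; and it does not reproduce the open-transaction check that Caché itself performs before purging.

```shell
# List journal files older than $days days in $jrndir (inspection only;
# Caché's own purge additionally skips files with open transactions).
list_purgeable() {
    jrndir=$1; days=$2
    find "$jrndir" -maxdepth 1 -type f -mtime +"$days" -print
}

mkdir -p /tmp/jrnpurge           # scratch directory for the demonstration
list_purgeable /tmp/jrnpurge 30  # nothing printed: no files older than 30 days
```

-maxdepth is supported by GNU and BSD find; on other systems, restrict the search differently.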

Freeze on error
    Select Yes or No. This setting controls the behavior when an error occurs in writing to the journal. The default is No. See the Journal I/O Errors section for a detailed explanation of this setting.

You are not required to restart Caché after changing most of these settings (except where indicated), but any change causes a new journal file to begin.

There is an additional advanced configuration setting affecting journaling, which you can maintain from the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal. Choose Transactions in the Category list:

SynchronousCommit
    True or false. Every TCOMMIT command requests a flush of the journal data involved in that transaction to disk. When this setting is true, TCOMMIT does not complete until the journal data write operation completes. When it is false, TCOMMIT does not wait for the write operation to complete. The default is false.

Journaling Best Practices

The following are some important points to consider when configuring journaling:

    Journal all globals to ensure close to zero data loss in the event of a crash; Caché updates the journals much more frequently than the physical database. Always know exactly what you are journaling, and always journal what you cannot afford to lose. Understand all of your globals; make a distinction between what is truly temporary (and therefore should be mapped to CACHETEMP if possible) and what merely goes away after a while (which should be journaled, as it would be needed as part of a restore).

    Place journal files on a separate disk from the database (CACHE.DAT) files. InterSystems recommends isolating journal files from the databases to lessen the risk of their being corrupted if there is a crash and the database is corrupted. Journal files never contain database degradation; they can, therefore, function as a useful form of secondary backup.
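One way to sanity-check the separate-disk recommendation from a script is to compare the device IDs of the two directories. A hedged sketch follows: the stat -c invocation assumes GNU coreutils, and the paths are illustrative assumptions, not fixed Caché locations.

```shell
# Warn if the journal directory and the database directory live on the same
# device. Assumes GNU stat (`stat -c %d` prints the device number).
same_device() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

dbdir=${DBDIR:-/usr/cachesys/mgr}
jrndir=${JRNDIR:-/usr/cachesys/mgr/journal}

if [ -d "$dbdir" ] && [ -d "$jrndir" ] && same_device "$dbdir" "$jrndir"; then
    echo "WARNING: journal files share a disk with the databases"
fi
```

Note that two directories on different partitions of the same physical disk still share a failure domain; this check only detects a shared file-system device.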
3.3 Journaling Operation Tasks

Once journaling is configured, there are several tasks you can perform:

    Start journaling
    Stop journaling
    Switch journal files

    View journal files
    Purge journal files
    Restore journal files

Start Journaling

If journaling is stopped, you can start it using the ^JRNSTART routine or by selecting option 1, Begin Journaling, from the ^JOURNAL routine menu. See the Start Journaling Using ^JRNSTART section for details.

Note: You cannot start journaling from the System Management Portal.

Stop Journaling

When you stop journaling, transaction processing ceases. If a transaction is in progress when you stop journaling, the complete transaction may not be entered in the journal. To avoid this problem, it is best to make sure all users are off the system before stopping journaling.

If you stop journaling and Caché crashes, the startup recovery process does not roll back incomplete transactions started before journaling stopped, since a transaction may have been committed but not journaled. In contrast, transactions are not affected in any adverse way by switching journal files. Rollback correctly handles transactions spanning multiple journal files created by journal switching; so, if possible, it is better to switch journal files than to stop journaling.

You can stop journaling using the ^JRNSTOP routine or by selecting option 2, Stop Journaling, from the ^JOURNAL routine menu. See the Stop Journaling Using ^JRNSTOP section for details.

Note: You cannot stop journaling from the System Management Portal.

Switch Journal Files

Caché automatically switches the journal file in the following situations:

    After a successful backup of a Caché database
    When the current journal file grows to the maximum file size allowed (configurable on the Journal Settings page)
    When the journal directory becomes unavailable and you specified an alternate directory
    After updating settings on the [Home] > [Configuration] > [Journal Settings] page of the System Management Portal

Switching the journal file is preferable to stopping and restarting journaling: you do not miss journaling any global activity that occurs after journaling is stopped but before it is restarted.

To manually switch journal files:

1. Navigate to the [Home] > [Journals] page of the System Management Portal.

2. Click Switch Journal above the list of database journal files.

3. Confirm the journal switch by clicking OK.

You can also switch journal files using the ^JRNSWTCH routine or by selecting option 3, Switch Journal File, from the ^JOURNAL routine menu. See the Switch Journal Files Using ^JRNSWTCH section for details.

View Journal Files

You can view the journal files from the [Home] > [Journals] page of the System Management Portal:

1. Click Journals under the Operations column of the [Home] page. Use the Filter box to shorten the list if necessary.

2. To view the journal file information, click View in the row of the appropriate journal file. Use the Match box with the Search button to help find a particular entry. (Text in the Search box is case-sensitive.)

3. To view a journal file entry, click the Offset of the appropriate node in the list to view a dialog box containing journal record details.

You can also use the ^JRNDUMP utility to display the entire journal and the SELECT^JRNDUMP entry point to display selected entries. See the Display Journal Records Using ^JRNDUMP section for details.

Purge Journal Files

You can schedule a task to run regularly that purges obsolete journal files. A new Caché instance contains a pre-scheduled Purge Journal task that runs after the daily Switch Journal task, which runs at midnight. The purge process deletes journal files based on the When to purge journal files setting on the [Home] > [Configuration] > [Journal Settings] page.
Note: No journal file containing currently open transactions is purged, even if it meets the criteria of the purge setting.

Restore Journal Files

After a system crash or disk hardware failure, recreate your database by restoring your backup copies. If you have been journaling and your journal file is still accessible, you can further restore your databases by applying the changes recorded in the journal since the last backup.

To restore the journal files:

1. Confirm that all users have exited Caché.

2. Stop journaling, if it is enabled.

3. Restore the latest backup of your database. See the Backup and Restore chapter of this guide for more information.

4. Run the journal restore utility. See the Restore Globals From Journal Files Using ^JRNRESTO section for details.

Note: You cannot run the journal restore process from the System Management Portal.

3.4 Journaling Utilities

Caché provides several utilities to perform journaling tasks. The ^JOURNAL utility provides menu choices to run some common journaling utilities, which you can also run independently. There are also several other journaling utilities, which you run from the %SYS namespace.

The following sections describe the journaling utilities in detail:

    Perform Journaling Tasks Using ^JOURNAL
    Start Journaling Using ^JRNSTART
    Stop Journaling Using ^JRNSTOP
    Switch Journal Files Using ^JRNSWTCH
    Restore Globals From Journal Files Using ^JRNRESTO
    Filter Journal Records Using ^ZJRNFILT
    Display Journal Records Using ^JRNDUMP
    Update Journal Settings Using ^JRNOPTS
    Recover from Startup Errors Using ^STURECOV
    Convert Journal Files Using ^JCONVERT and ^%JREAD

    Set Journal Markers Using ^JRNMARK
    Manipulate Journal Files Using ^JRNUTIL
    Manage Journaling at the Process Level Using %NOJRN

In the following sections, the sample procedures show C:\MyCache as the Caché installation directory.

Perform Journaling Tasks Using ^JOURNAL

This example shows the menu available by invoking the ^JOURNAL routine:

    %SYS>Do ^JOURNAL

    1) Begin Journaling (^JRNSTART)
    2) Stop Journaling (^JRNSTOP)
    3) Switch Journal File (^JRNSWTCH)
    4) Restore Globals From Journal (^JRNRESTO)
    5) Display Journal File (^JRNDUMP)
    6) Edit Journal Properties (^JRNOPTS)
    7) Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL())
    8) Display Journal status (Status^JOURNAL)

    Option?

Enter the appropriate menu number option to start that particular routine. Press Enter without entering an option number to exit the utility.

Subsequent sections in this document describe the utilities started by choosing options 1-6. For information on option 7, Activate or Deactivate Journal Encryption, see the Configuring Caché Encryption Settings section of the Database Encryption chapter of the Caché Security Administration Guide, which describes journal file encryption in detail.

Display Journal Status Using Status^JOURNAL

Choosing option 8, Display Journal status, displays a concise overview of journal status information, including the following:

    Current journal directory and its remaining space
    Alternate journal directory (if different) and its remaining space
    Current journal file, its maximum size, and space used
    Journaling state, which can be one of the following:
        - Enabled
        - Disabled (stopped)
        - Disabled due to I/O error (suspended)
        - Frozen due to I/O error

        - Journal switch in progress (paused)

    Though suspended and frozen due to I/O error are the same journal state, the system takes different action; when frozen, it discards journal data.

    If applicable, the process IDs of any process running ^JRNSTART, ^JRNSTOP, or ^JRNSWTCH

For example:

    Option? 8
    Current journal directory: C:\MyCache\Mgr\Journal\
    Current journal directory free space (KB):
    Alternate journal directory: C:\MyCache\Mgr\
    Alternate journal directory free space (KB):
    Current journal file: C:\MyCache\mgr\journal\
    Current journal file maximum size:
    Current journal file space used:
    Journaling is enabled

Start Journaling Using ^JRNSTART

To start journaling, run ^JRNSTART or enter 1 at the Option prompt of the ^JOURNAL menu, as shown in the following examples.

Example of running ^JRNSTART directly:

    %SYS>Do ^JRNSTART

Example of starting journaling from the ^JOURNAL menu:

    1) Begin Journaling (^JRNSTART)
    2) Stop Journaling (^JRNSTOP)
    3) Switch Journal File (^JRNSWTCH)
    4) Restore Globals From Journal (^JRNRESTO)
    5) Display Journal File (^JRNDUMP)
    6) Edit Journal Properties (^JRNOPTS)
    7) Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL())
    8) Display Journal status (Status^JOURNAL)

    Option? 1

If journaling is running when you select this option, you see a message similar to the following:

    Option? 1
    Already journaling to C:\MyCache\mgr\journal\

Stop Journaling Using ^JRNSTOP

To stop journaling, run ^JRNSTOP or enter 2 at the Option prompt of the ^JOURNAL menu, as shown in the following examples.

Example of running ^JRNSTOP directly:

    %SYS>Do ^JRNSTOP
    Stop journaling now? Yes => Yes

Example of stopping journaling from the ^JOURNAL menu:

    %SYS>Do ^JOURNAL

    1) Begin Journaling (^JRNSTART)
    2) Stop Journaling (^JRNSTOP)
    3) Switch Journal File (^JRNSWTCH)
    4) Restore Globals From Journal (^JRNRESTO)
    5) Display Journal File (^JRNDUMP)
    6) Edit Journal Properties (^JRNOPTS)
    7) Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL())
    8) Display Journal status (Status^JOURNAL)

    Option? 2
    Stop journaling now? Yes => Yes

If journaling is not running when you select this option, you see a message similar to the following:

    Option? 2
    Not journaling now

Switch Journal Files Using ^JRNSWTCH

To switch the journal file, run ^JRNSWTCH or enter 3 at the Option prompt of the ^JOURNAL menu, as shown in the following example:

    %SYS>Do ^JOURNAL

    1) Begin Journaling (^JRNSTART)
    2) Stop Journaling (^JRNSTOP)
    3) Switch Journal File (^JRNSWTCH)
    4) Restore Globals From Journal (^JRNRESTO)
    5) Display Journal File (^JRNDUMP)
    6) Edit Journal Properties (^JRNOPTS)
    7) Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL())
    8) Display Journal status (Status^JOURNAL)

    Option? 3
    Switching from: C:\MyCache\mgr\journal\
    To: C:\MyCache\mgr\journal\

The utility displays the names of the old and new journal files.

Restore Globals From Journal Files Using ^JRNRESTO

Journal restore respects the current journal settings of the databases. Caché stores nothing in the journal record about the journal state of the database at the time it writes the record; the journal state of the database at the time of the restore determines what action is taken. This means that changes to databases whose journal state is Yes are durable, but changes to other databases may not be. Caché ensures physical consistency, but not necessarily application consistency, if transactions involve databases whose journal state is No.

The Caché ^JRNRESTO routine restores only databases whose journal state is Yes at the time of the journal restore. It checks the journal state the first time it encounters each database and records it; the restore process then skips journal records for databases whose journal state is No. If no databases are marked as journaled, the routine asks if you wish to terminate the restore. You can change the database journal state to Yes on specific databases and restart ^JRNRESTO.

To restore the journal files:

1. Run the routine from the system manager's namespace:

%SYS>Do ^JRNRESTO
This utility uses the contents of journal files
to bring globals up to date from a backup.
Replication is not enabled.
Restore the Journal? Yes =>

2. Press <Enter> to select the default, Yes, to confirm that you want to restore the journal.

3. If you have existing journal filters, specify whether you want to use them:

Use current journal filter (ZJRNFILT)?
Use journal marker filter (MARKER^ZJRNFILT)?

See the Filter Journal Records Using ^ZJRNFILT section for details.

4. Specify whether you want to use existing journal filters and whether to restore all journaled globals:

Use current journal filter (ZJRNFILT)? no
Use journal marker filter (MARKER^ZJRNFILT)? no
Process all journaled globals in all directories?
Yes

Enter Yes at the prompt if you want to apply all global changes to the database. Enter No if you want to restore only selected globals; then, at the Global^ prompts, enter the specific globals you want to restore.

5. Specify whether or not to clear the journal file. If you do not use transaction processing, enter Yes.

If you use transaction processing, you can clear the journal file only if there are no active Caché processes that may be in the middle of a transaction.

Restoring from Multiple Journal Files

If Caché has switched to multiple journal files since the restored backup, you must restore the journal files in order from the oldest to the most recent. For example, if you have three journal files to restore, , , and , you must restore them in that order.

Rolling Back Incomplete Transactions

Restoring the journal also rolls back incomplete transactions. Ensure that users have completed all transactions so that the restore does not attempt to roll back active processes. To ensure that all transactions are complete before you restore your backup and clear the journal file, InterSystems strongly recommends the following:

If you need to roll back transactions for your own process, the process must halt or use the TROLLBACK command.

If you need to roll back transactions system-wide, shut down Caché and restart it to ensure that no users are on the system.

Filter Journal Records Using ^ZJRNFILT

InterSystems provides a journal filter mechanism for manipulating the journal restore. The journal filter is a user-written routine called ^ZJRNFILT, whose format is shown below. The Caché journal restore routine, ^JRNRESTO, calls it to ensure that only selected records are restored.

Create the ^ZJRNFILT routine using the following format:

ZJRNFILT(pid,dir,glo,type,restmode,addr,time)

pid (input): Process ID of the process in the journal record (passed in hex)
dir (input): Directory in the journal record
glo (input): Global in the journal record
type (input): Command type in the journal record (S for Set, K for Kill)

restmode (output): 0 - do not restore the record; 1 - restore the record
addr (input): Address of the journal record
time (input): Time stamp of the record. This is the time the journal buffer is created, not when the Set or Kill operation occurs, so it represents the earliest this particular operation could have happened.

^ZJRNFILT Considerations

Consider the following when using ^ZJRNFILT:

If the startup routine (^STU) calls ^JRNRESTO, it does not call the filter routine under any circumstances.

Journal restore calls the journal filter (^ZJRNFILT) only if it exists. If it does exist, the restore procedure prompts you to confirm the use of the filter in the restore process. If you answer yes, then for every record in the journal file to restore, the restore routine calls ^ZJRNFILT with the indicated input arguments to determine whether to restore the current record.

You can use any logic in your ^ZJRNFILT routine to determine whether or not to restore the record. Return confirmation through the output restmode argument.

If you are using the directory name, dir, in the ^ZJRNFILT routine logic, pass it in canonical form. You can put the directory name in canonical form using the following system function:

Set dir=$zutil(12,dir)

See the $ZUTIL(12) entry in the Caché ObjectScript Reference for details.

The entire global reference is passed to ^ZJRNFILT for use in program logic.

When the journal restore process completes, it prompts you to confirm whether to rename the ^ZJRNFILT routine or delete it. If you choose to rename the filter, the utility renames it ^XJRNFILT and deletes the original ^ZJRNFILT.

The restore process aborts with an appropriate error message if any errors occur in the ^ZJRNFILT routine.

^ZJRNFILT Examples

Two globals, ^ABC and ^XYZ, are journaled.
While journaling is turned on, the following code is executed, and the journal file records the Set and Kill operations for these globals:

For I=1:1:500 Set ^ABC(I)=""
For I=1:1:500 Set ^XYZ(I)=""
For I=1:1:100 Kill ^ABC(I)

1. To restore all records for ^ABC only, the ^ZJRNFILT routine looks like this:

ZJRNFILT(pid,dir,glo,type,restmode,addr,time) /*Filter*/
 Set restmode=1                  /*Return 1 for restore*/
 If glo["XYZ" Set restmode=0     /*except when it is ^XYZ*/
 Quit

2. To restore all records except the kill on ^ABC, the ^ZJRNFILT routine looks like this:

ZJRNFILT(pid,dir,glo,type,restmode,addr,time) /*Filter*/
 Set restmode=1                        /*Return 1 for restore*/
 If glo["^ABC",type="K" Set restmode=0 /*except if a kill on ^ABC*/
 Quit

Display Journal Records Using ^JRNDUMP

To display the records in the journal file, enter 5 at the Option prompt of the ^JOURNAL menu or run ^JRNDUMP, as shown in the following example:

1. %SYS>DO ^JRNDUMP

Journal Directory & prefix
C:\MyCache\Mgr\Journal\ [JRNSTART]
C:\MyCache\mgr\journal\
C:\MyCache\mgr\journal\
C:\MyCache\mgr\journal\
C:\MyCache\mgr\journal\
C:\MyCache\mgr\journal\
C:\MyCache\mgr\journal\
C:\MyCache\mgr\journal\
> C:\MyCache\mgr\journal\

2. The routine displays a list of journal files. A greater-than sign (>) appears to the left of the chosen file, followed by a prompt:

Pg(D)n,Pg(U)p,(N)ext,(P)rev,(G)oto,(E)xamine,(Q)uit =>

Use these options to navigate to the journal file you wish to locate:

Enter D or U to page through the list of journal files.
Enter N or P to move the > to the desired journal file.
Enter G to enter the name of an alternate file whose contents to display.
Enter E to display the contents of the chosen journal file.
Enter Q or <Enter> to quit the routine.

3. After entering G or E, the utility displays the journal file name and begins listing the contents of the file by offset address. For example:

Journal: C:\MyCache\mgr\journal\

Address  Proc ID  Op  Directory  Global & Value
===============================================================================
S C:\MyCache\mgr\ ^SYS("shdwcli","doctest","remend") =
S C:\MyCache\mgr\ ^SYS("shdwcli","doctest","end") =
S C:\MyCache\mgr\ ^SYS("shdwcli","doctest","jrnend") =

4. At the bottom of the current listing page is information about the journal file and another prompt:

Last record: ; Max size:
(N)ext,(P)rev,(G)oto,(F)ind,(E)xamine,(Q)uit =>

Use these options to navigate to the journal record you wish to display:

Enter N or P to display the next or previous page of addresses.
Enter G to bring the list to a particular address.
Enter F to search for a particular string within the journal file.
Enter E to enter the address and display the contents of a chosen journal record.
Enter Q or <Enter> to return to the list of journal files.

5. After entering E or G, enter an address at the prompt. The E option displays the contents of the journal record at or near the address you entered; the G option displays the page of journal records starting at that location. For either option, the utility locates the record closest to the offset address you specify; it does not need to be a valid address of a journal record. You may also enter 0 (zero) or press <Enter> to go to the beginning of the journal file, or enter -1 to go to the end of the journal file.

6. You may browse through the journal records using N or P to display the next or previous journal record contents, respectively. When you are finished displaying records, enter Q at the prompt to return to the list of journal files.

There are different types of journal records:

The journal header is 8192 bytes long. It appears once at the start of every journal file.
The ^JRNDUMP utility does not display the journal header record.

Journal data records.

Journal markers.
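Once you know the letter codes for the record types (listed in the Journal File Command Type Codes table later in this section), the SELECT^JRNDUMP entry point described below can list only records of a given type. As a sketch, the following call passes the default (null string) for every argument except %command, selecting only journal marker (JrnMark, letter code M) records:

%SYS>Do SELECT^JRNDUMP("","","","","","M")

The same approach works with any of the other letter or numeric command-type codes.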

The following is a sample journal file data record as displayed by ^JRNDUMP. The example shows how a Set command is recorded. The new value is recorded, but not the old value, because the Set occurred outside a transaction:

Journal: C:\MyCache\mgr\journal\
Address:
Type: Set
In transaction: No
Process ID: 4836
Remote system ID: 0
Time stamp: 60284,53240
Collation sequence: 5
Prev address:
Next address: 0
Global: ^["^^C:\MyCache\mgr\"]ABC
New Value: 2

(N)ext,(P)rev,(Q)uit =>

In a transaction, the old value is also recorded, to allow transaction rollback, as seen in this second example:

Journal: C:\MyCache\mgr\journal\
Address:
Type: Set
In transaction: Yes
Process ID:
Remote system ID: 0
Time stamp: 60584, /15/ :36:19
Collation sequence: 5
Prev address:
Next address:
Global: ^["^^C:\MyCache\mgr\"]ABC
New Value: 5
Old Value: 2

(N)ext,(P)rev,(Q)uit =>

The following list describes each field in the journal data record.

Journal Data Record Fields Displayed by ^JRNDUMP

Address: Location of this record, in number of bytes from the beginning of the file. This is the only field where you enter a value to select a record.
Type: The type of command recorded in this journal record entry. See the Journal File Command Type Codes table for possible types.
In transaction: Whether or not the update occurred in a transaction.
Process ID: Process ID number of the process issuing the command.

Remote system ID: Remote system ID number (0 if a local process).
Time stamp: Creation time of the journal buffer, in $HOROLOG and human-readable format. This is not the time the Set or Kill operation occurred, so it represents the earliest this particular operation could have happened.
Collation sequence: Collation sequence of the global being updated.
Prev address: Location of the previous record (0 indicates this is the first record).
Next address: Location of the next record (0 indicates this is the last record).
Cluster sequence #: Sequencing for globals in cluster-mounted databases. During cluster failover, journal entries from different nodes are applied in order of this cluster time sequencing.
Global: Extended reference of the global being updated.
New Value: For a Set operation, the value assigned to the global.
Old Value: For a Set or Kill operation in a transaction, the value that was in the global before the operation.

The following table shows both the number and the letter code for the various transaction types.

Journal File Command Type Codes

Type         Number  Letter
BeginTrans           BT
CommitTrans          CT
Set                  S
KillNode             K
KillDesc             k
ZKill                k
NSet                 S
NKill                K
NZKill               k
JrnMark              M
BitSet               b

NetReq       15      N
JOURNAL-END  -1      N

The following is an example of a journal marker record created by an incremental backup:

Journal: C:\MyCache\mgr\journal\
Address:
Type: JrnMark
Marker ID: -1
Marker text: NOV ;03:14PM;Incremental
Marker seq number: 1
Prev marker address: 0
Time stamp: 60584, /15/ :36:19
Prev address:
Next address:

(N)ext,(P)rev,(Q)uit =>

Select Journal Records to Dump

The function SELECT^JRNDUMP lets you display any or all of the records in the journal file. Caché dumps selected records from the journal file, starting from the beginning of the file, based on the arguments passed to the function. The syntax of the SELECT entry point of the ^JRNDUMP utility is as follows:

SELECT^JRNDUMP(%jfile,%pid,%dir,%glo,%gloall,%command,%remsysid)

%jfile: Journal file name. Default is the current journal file.
%pid: Process ID in the journal record. Default is any process.
%dir: Directory in the journal record. Default is any directory.
%glo: Global reference in the journal record. Default is any global.
%gloall: Indicator of whether to list entries related to all global nodes containing the name represented by %glo: 0 selects only an exact match of the global reference with the name specified in %glo; 1 selects a partial match, that is, all records with a global reference that contains the name specified in %glo. Default is 0.
%command: Type of command. Default is any command. Use either the letter or the numeric codes described in the Journal File Command Type Codes table in the previous section.

%remsysid: Remote system ID of the journal record. Default is any system. If %pid is specified, then %remsysid defaults to the local system (0); otherwise, it defaults to any system, the same as if it were specified as 0. That is, you cannot select journal entries only from the local system.

You may pass the null string for any of the other arguments, in which case the routine uses the defaults.

SELECT^JRNDUMP Examples

The following examples show different ways to select specific journal records.

You can use this entry point to send the output of the ^JRNDUMP routine to a device other than the terminal. For example, this sends the output to a file called JRNDUMP.OUT:

%SYS>Do SELECT^JRNDUMP("JOURNAL:POGH "," ","","","","","")
Device: SYS$LOGIN:JRNDUMP.OUT
Parameters: "RW"=>

To select all records in the journal file that contain the global reference ^ABC:

DO SELECT^JRNDUMP(" ","","","^ABC",1,"")

To select only records that have an exact match to the global reference ^ABC:

DO SELECT^JRNDUMP(" ","","","^ABC",0,"")

Records that are not an exact match, such as ^ABC(1) or ^ABC(100), are not selected.

To select only records that exist for the process with PID 1203:

DO SELECT^JRNDUMP(" ","1203","","","","")

Note: On OpenVMS, you must specify the PID in uppercase.

To select only records for Kill operations on global ^ABC:

DO SELECT^JRNDUMP(" ","","","^ABC","","K")

Update Journal Settings Using ^JRNOPTS

As an alternative to using the Journal Settings page of the System Management Portal, you can update the basic journal configuration settings using the ^JRNOPTS routine or by entering 6 at the Option prompt of the ^JOURNAL menu. To change a setting, type the new value at the prompt and press Enter. For example:

%SYS>Do ^JRNOPTS
Current journal directory: <C:\MyCache\Mgr\Journal\> =>
Alternate journal directory: <D:\cachesys\altjournal\> =>
Journal File Prefix: [?] => ?
Enter an alphanumeric string ('_' allowed) or . to reset prefix to null
Journal File Prefix: [?] =>
Max journal file size in MB (range: [1, 4087]): <512> => 1024
Max journal file size in MB (range: [1, 4087]): <1024> =>

Entering a question mark (?) displays help for the journal file prefix. A change to the journal file prefix requires a restart:

*** Journal options updated. Journal file prefix change requires a Cache restart.

If you change any of the other settings, the journal file switches. If you do not change any settings, you see the following message:

*** Nothing changed

Recover from Startup Errors Using ^STURECOV

If the journal or transaction restore process encounters errors during the Caché startup procedure, such as <FILEFULL> or <DATABASE>, the procedure logs the errors in the console log (cconsole.log) and starts the system in single-user mode. Caché provides a utility, ^STURECOV, to help you recover from the errors and start Caché in multiuser mode. The routine has several options, which you can use to retry the failed operation and bring the system up, or to ignore the errors and bring the system up.

The journal restore phase tries to do as much work as possible before it aborts. If a database triggers more than three errors, the phase abandons recovery of that database and leaves it dismounted. During transaction rollback, the first error in a database causes the rollback process to skip that database from then on. The process does not fully replay transactions that reference that database; it stores them for rollback during the recovery process.

When Caché encounters a problem during the dejournaling phase of startup, it generates a series of console log messages similar to the following:

08/10-11:19:47:024 ( 2240) System Initialized.
08/10-11:19:47:054 ( 2256) Write daemon started.
08/10-11:19:48:316 ( 1836) Performing Journal Recovery
08/10-11:19:49:417 ( 1836) Error in JRNRESTB: <DATABASE>restore+49^JRNRESTB

C:\MyCache\mgr\journal\ addr= ^["^^C:\MyCache\mgr\jo1666\"]test(4,3,28)
08/10-11:19:49:427 ( 1836) Error in JRNRESTB: <DATABASE>restore+49^JRNRESTB
C:\MyCache\mgr\journal\ addr= ^["^^C:\MyCache\mgr\test\"]test(4,3,27)
08/10-11:19:49:437 ( 1836) Error in JRNRESTB: <DATABASE>restore+49^JRNRESTB
C:\MyCache\mgr\journal\ addr= ^["^^C:\MyCache\mgr\test\"]test(4,3,26)
08/10-11:19:49:447 ( 1836) Error in JRNRESTB: <DATABASE>restore+42^JRNRESTB
C:\MyCache\mgr\journal\ addr= ^["^^C:\MyCache\mgr\test\"]test(4,2,70)
08/10-11:19:50:459 ( 1836) Too many errors restoring to C:\MyCache\mgr\test\.
Dismounting and skipping subsequent records
08/10-11:19:50:539 ( 1836) 4 errors during journal restore, see console.log
file for details. Startup aborted, entering single user mode.

If the errors are from transaction rollback, the output looks similar to this:

08/11-08:55:08:732 ( 428) System Initialized.
08/11-08:55:08:752 ( 1512) Write daemon started.
08/11-08:55:10:444 ( 2224) Performing Journal Recovery
08/11-08:55:11:165 ( 2224) Performing Transaction Rollback
08/11-08:55:11:736 ( 2224) Max Journal Size:
08/11-08:55:11:746 ( 2224) START: C:\MyCache\mgr\journal\
08/11-08:55:12:487 ( 2224) Journaling selected globals to C:\MyCache\mgr\journal\ started.
08/11-08:55:12:487 ( 2224) Rolling back transactions...
08/11-08:55:12:798 ( 2224) Error in %ROLLBACK: <DATABASE>set+2^%ROLLBACK
C:\MyCache\mgr\journal\ addr= ^["^^C:\MyCache\mgr\test\"]test(4,1,80)
08/11-08:55:12:798 ( 2224) Rollback of transaction for process id #2148 aborted at offset in C:\MyCache\mgr\journal\
08/11-08:55:13:809 ( 2224) C:\MyCache\mgr\test\ dismounted - Subsequent records will not be restored
08/11-08:55:13:809 ( 2224) Rollback of transaction for process id #924 aborted at offset in C:\MyCache\mgr\journal\
08/11-08:55:14:089 ( 2224) STOP: C:\MyCache\mgr\journal\
08/11-08:55:14:180 ( 2224) 1 errors during journal rollback, see console.log
file for details. Startup aborted, entering single user mode.
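The transaction rollback reported in these messages undoes journaled updates from transactions that never committed. The same mechanism is available to applications through the ObjectScript transaction commands; the following minimal sketch (the global name and values are illustrative) shows an update being undone because its old value was journaled inside a transaction:

%SYS>Set ^ABC(1)="before"

%SYS>TSTART

%SYS>Set ^ABC(1)="after"

%SYS>TROLLBACK

%SYS>Write ^ABC(1)
before

TROLLBACK uses the journaled old value to restore ^ABC(1), just as startup rollback does for transactions left open by a failure.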
Both output listings end with the same instructions:

Enter Cache' with
     C:\MyCache\bin\cache -sc:\mycache\mgr -B
and D ^STURECOV for help recovering from the errors.

When Caché cannot start properly, it starts in single-user mode. While in this mode, execute the special commands indicated in these instructions to enter Caché. For example, for a Windows installation, enter the following:

C:\MyCache\bin\>cache -sc:\mycache\mgr -B

UNIX-based and OpenVMS systems have a slightly different syntax.

This runs the Caché executable from the Caché installation bin directory (install-dir\bin), indicating the pathname of the system manager's directory (install-dir\mgr) with the -s argument, and inhibits all logins except one emergency login with the -B argument. You are now in the manager's namespace and can run the startup recovery routine, ^STURECOV:

Do ^STURECOV

The ^STURECOV journal recovery menu appears as follows:

Journal recovery options
1) Display the list of errors from startup
2) Run the journal restore again
3) Bring up the system in multi-user mode (includes journal restore)
4) Dismount a database
5) Mount a database
6) Database Repair Utility
7) Check Database Integrity
8) Reset system so journal is not restored at startup
9) Display instructions on how to shut down the system
10) Display Journaling Menu (^JOURNAL)
H) Display Help
E) Exit this utility

Enter choice (1-10) or [Q]uit/[H]elp?

Only UNIX-based or OpenVMS systems include option 9 on the menu.

Before starting the system in multiuser mode, correct the errors that prevented the journal restore or transaction rollback from completing. You have several options regarding what to do:

Option 1 - The journal restore and transaction rollback procedure tries to save the list of errors in the ^%SYS() global. This is not always possible, depending on what is wrong with the system. If this information is available, this option displays the errors.

Option 2 - This option performs the same journal restore and transaction rollback that were performed when the system was started. The amount of data is small, so it should not be necessary to try to restart from where the error occurred.

Option 3 - When you are satisfied that the system is ready for use, use this option to complete the startup procedure and bring the system up as if startup had completed normally.

Option 4 - This option lets you dismount a database.
Generally, use this option if you want to let users back on the system but prevent them from accessing a database that still has problems (^DISMOUNT utility).

Option 5 - This option lets you mount a database (^MOUNT utility).

Option 6 - This option lets you edit the database structure (^REPAIR utility).

Option 7 - This option lets you validate the database structure (^INTEGRIT utility).

Option 8 - This option updates the system so that it does not attempt journal restore or transaction rollback at startup; it applies only to the next time the startup process runs. Use it in situations where you cannot get journal recovery to complete and you need to allow users back on the system. Consider dismounting the databases that have not been recovered. This operation is not reversible. You can perform journal restore manually later using the ^JRNRESTO utility.

Option 9 - It is not possible to shut down the system from this utility, but this option displays instructions on how to shut the system down from the UNIX or OpenVMS command line.

Option 10 - This option brings up the journaling menu, which lets you browse and restore journal files. The menu also has options that start and stop journaling, but these are not generally of interest when resolving problems with journaling at startup.

Take whatever corrective action is necessary to resolve the problem. This may involve using the ^DATABASE routine to extend the maximum size of the database, freeing space on the file system, or using the ^INTEGRIT and ^REPAIR utilities to find and correct database degradation. As you do this work, you can use Option 2 of the ^STURECOV utility to retry the journal replay and transaction rollback as many times as necessary. You can display any errors you encounter, including those from when the system started, using Option 1. When you have corrected all the problems and Option 2 runs without errors, use Option 3 to bring the system up in multiuser mode.

If you find that you cannot resolve the problem but still want to bring the system up, use Option 8 to clear the information in the Caché image journal (.wij file) that triggers journal restore and transaction rollback at startup. The option also logs the current information in the console log. Once this completes, use Option 3 to start the system. Use this facility with care, as it is not reversible.
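Put together, a typical recovery pass with these options looks something like the following condensed session (the menu output is abridged and the choices shown are illustrative):

%SYS>Do ^STURECOV
...
Enter choice (1-10) or [Q]uit/[H]elp? 1
(review the errors saved from startup, then fix the underlying problem)
Enter choice (1-10) or [Q]uit/[H]elp? 2
(retry the journal restore and transaction rollback)
Enter choice (1-10) or [Q]uit/[H]elp? 3
(no errors remain, so complete startup in multiuser mode)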
If Caché was unable to store the startup errors in the ^%SYS() global for ^STURECOV to display, you may get an initial message before the menu that looks like this:

There is no record of any errors during the prior startup
This could be because there was a problem writing the data
Do you want to continue? No => yes
Enter error type (? for list) [^] => ?
Supported error types are:
     JRN - Journal and transaction rollback
Enter error type (? for list) [^] => JRN

Journaling errors are the one type of error handled by this utility that falls within the scope of this chapter. Other error types are discussed in the appropriate sections of the documentation.

CAUTION: Only use the ^STURECOV utility when the system is in single-user mode following an error during startup. Using it while the system is in any other state (for example, up and running normally) can cause serious damage to your data, because it restores journal information if you ask it to, and this information may not be the most current data. The ^STURECOV utility warns you, but it lets you force it to run.

Convert Journal Files Using ^JCONVERT and ^%JREAD

The ^JCONVERT routine is a utility that reads journal files and converts them to a common-format file so that the ^%JREAD utility can read and apply the journal transactions to databases on a different system. The ^JCONVERT utility exists on older InterSystems database products as well as all versions of Caché. Use these utilities to move journal data between system versions that do not have compatible journal files.

For example, if you are converting to a new version of Caché and need to minimize downtime, perform the following steps:

1. Enable journaling on the old system.
2. Run a backup on the old system; this switches to a new journal file on the old system.
3. Continue journaling on the old system.
4. Restore the backup of the old system on the new system and perform any necessary conversions.
5. Stop the old system and run ^JCONVERT on the journal files created on the old system since the backup.
6. Apply the transactions from the old system to the new system using the file created by ^JCONVERT as input to ^%JREAD on the new system.

The ^JCONVERT utility uses the same process as the journal restore utility to select and filter the journal files for processing. You can include a range of journal files as input and create one output file. See the Restore Globals From Journal Files Using ^JRNRESTO section for details on selecting and filtering journal files.

Before converting the journal files, you must choose the record and character translation for the converted file. The default is Variable/UTF8, which is compatible with the ^%JREAD utility on all platforms. You can move these files among platforms with binary FTP; however, they are not easy to view or edit. Answer No at the record format prompt for an alternative local-format file that you can edit but cannot as easily move to another platform.
CAUTION: If you choose the alternate editable format, you cannot export it as binary data; when you import it, binary data may be misinterpreted as the end of a record.

The alternative format is different for each platform:

Unicode (all platforms): $CHAR(8232) [LSEP] UTF8
UNIX (Linux and OS X): $CHAR(10) [LF] RAW
Windows and OpenVMS: $CHAR(13,10) [CRLF] RAW

Unicode format requires a Unicode editor, which may or may not exist on all Unicode platforms. This format works with Text Edit on the OS X operating system. If you require arbitrary record format and

character translations, this may be possible, but as this should be a very uncommon condition, contact the InterSystems Worldwide Response Center (WRC) for guidance.

Globals in the journal file are stored with a specific directory reference appended to the global reference. You can choose either to include the directory reference in the converted file or to exclude it. If you include it, you can always filter it out or change it later during the ^%JREAD procedure. The directory reference determines where ^%JREAD sets the global on the target system. If you do not include the directory reference, ^%JREAD makes all sets in the current directory. If you do include it, the utility makes sets in the same directory as on the source system, unless translated by a ^%ZJREAD program you supply. If the target system is on a different operating system, or the databases reside in different directories on the target system, you must supply a ^%ZJREAD routine to translate the directory reference.

The ^%JREAD routine reads the common journal file format and applies the journal transactions to the databases on the target system. During the import of records, if a ^%ZJREAD routine exists, the utility calls it for each journal transaction, allowing you to manipulate the journal records. You can reference the following variables in your ^%ZJREAD routine:

type - Transaction type
gref - Global reference
value - Global value
%ZJREAD - 1: Apply transaction, 0: Do not apply transaction

If you decide not to apply a transaction, set the variable %ZJREAD to 0 (zero) to skip the record. You can also modify the other variables; for example, you can change the directory specification by modifying gref.

The following is an example ^%ZJREAD routine. It looks for transactions that contain updates to ^%SYS("JOURNAL") and prevents them from being applied.
You can copy this and modify it to suit your needs:

%ZJREAD ;
 /* The following variables are defined; you can modify them
    before the transaction gets applied:
      type - Transaction type
      gref - Global reference
      value - Global value
      %ZJREAD - 1: Apply transaction, 0: Do not apply transaction */
 If gref["SYS(""JOURNAL""" Set %ZJREAD=0
 Quit

Sample Run of ^JCONVERT

The following is a sample run of the ^JCONVERT utility:

%SYS>Do ^JCONVERT

Journal Conversion Utility [ Cache Format --> Common Format ]

You must choose the record and character translation for the
converted file. The default of Variable/UTF8 is compatible with
current ^%JREAD on all platforms and can be moved among platforms
with binary FTP. However, these files are not easy to view or edit.
Answer No for an alternative local format that can be edited, but
not be as easy to move to another platform.

Use Variable/UTF8 record format? <Yes>

Globals in the journal file are stored with a specific directory
reference appended to the global reference. You can choose either
to include the directory reference in the converted file, or
exclude it. Note that if you include it, you can always filter it
out or change it later during the %JREAD procedure.

The directory reference determines where ^%JREAD sets the global
on the target system. If the directory reference is not included,
all sets are made to the current directory. If the directory
reference is included, sets will be made to the same directory as
on the source system unless translated by a ^%ZJREAD program you
supply. If the target system is on a different operating system or
the databases reside in different directories on the target
system, the ^%ZJREAD program must be used to translate the
directory reference.

Include the directory reference? <Yes>

Enter common journal file name: common.jrn
Common journal file: common.jrn
Record separator: Variable
Character translation: UTF8
Directory reference: Yes
Use current journal filter (ZJRNFILT)? no
Use journal marker filter (MARKER^ZJRNFILT)? no
Process all journaled globals in all directories?
enter Yes or No, please
Process all journaled globals in all directories? yes
Specify range of files to process (names in YYYYMMDD.NNN format)
from: < > [?] =>
through: < > [?] =>
Prompt for name of the next file to process? No => No

Provide or confirm the following configuration settings:

Journal File Prefix: =>

Files to dejournal will be looked for in:
   C:\MyCache\mgr\journal\
   C:\MyCache\mgr\
in addition to any directories you are going to specify below, UNLESS
you enter a minus sign ('-' without quotes) at the prompt below, in
which case ONLY directories given subsequently will be searched

Directory to search: <return when done>
Here is a list of directories in the order they will be searched for files:
   C:\MyCache\mgr\journal\
   C:\MyCache\mgr\

You may tailor the response to errors by choosing between the
alternative actions described below. Otherwise you will be asked to
select an action at the time an error actually occurs.

Either
   Continue despite database-related problems (e.g., a target database
   is not journaled, cannot be mounted, etc.), skipping affected updates
or
   Abort if an update would have to be skipped due to a database-related
   problem (e.g., a target database is not journaled, cannot be
   mounted, etc.)

Either
   Abort if an update would have to be skipped due to a journal-related
   problem (e.g., journal corruption, some cases of missing journal
   files, etc.)
or
   Continue despite journal-related problems (e.g., journal corruption,
   some missing journal files, etc.), skipping affected updates

Either
   Apply sorted updates to databases before aborting
or
   Discard sorted, not-yet-applied updates before aborting (faster)

Would you like to specify error actions now? No => yes

1. Continue despite database-related problems (e.g., a target database
   is not journaled, cannot be mounted, etc.), skipping affected updates
2. Abort if an update would have to be skipped due to a database-related
   problem (e.g., a target database is not journaled, cannot be
   mounted, etc.)
Select option [1 or 2]: 1

1. Abort if an update would have to be skipped due to a journal-related
   problem (e.g., journal corruption, some cases of missing journal
   files, etc.)
2. Continue despite journal-related problems (e.g., journal corruption,
   some missing journal files, etc.), skipping affected updates
Select option [1 or 2]: 2

1. Apply sorted updates to databases before aborting
2. Discard sorted, not-yet-applied updates before aborting (faster)
Select option [1 or 2]: 2

Based on your selection, this restore will

** Continue despite database-related problems (e.g., a target database
   is not journaled, cannot be mounted, etc.), skipping affected updates
** Continue despite journal-related problems (e.g., journal corruption,
   some missing journal files, etc.), skipping affected updates
** Discard sorted, not-yet-applied updates before aborting (faster)

C:\MyCache\mgr\journal\
   14.93% 15.95% 17.14% 18.25% ... 97.12% 98.21% 99.93% 100.00%
***Journal file finished at 11:31:36
C:\MyCache\mgr\journal\
   14.96% 15.98% 17.18% 18.29% ... 97.33% 98.42% 100.00%
***Journal file finished at 11:31:37
C:\MyCache\mgr\journal\
   14.92% 15.93% 17.12% 18.24% ... 97.01% 98.10% 99.81% 100.00%
***Journal file finished at 11:31:38

[journal operation completed]

Converted journal records

Set Journal Markers Using ^JRNMARK

To set a journal marker in a journal file, use the following routine:

SET rc=$$add^jrnmark(id,text)

Argument   Description
id         Marker ID (for example, -1 for backup)

Argument   Description
text       Marker text of any string up to 256 characters (for example,
           a timestamp for a backup)
rc         Journal location of the marker (journal offset and journal
           file name, delimited by a comma) or, if the operation failed,
           a negative error code followed by a comma and a message
           describing the error. Note that a journal offset must be a
           positive number.

Manipulate Journal Files Using ^JRNUTIL

InterSystems provides several functions in the ^JRNUTIL routine. You can use these functions for writing site-specific routines to manipulate journal records and files. The following table lists the functions available in the routine.

Functions Available in ^JRNUTIL

Journaling Task                                        Function Syntax
Close a journal file                                   $$CLOSEJRN^JRNUTIL(jrnfile)
Delete a journal file                                  $$DELFILE^JRNUTIL(jrnfile)
Delete a journal record                                $$DELREC^JRNUTIL(addr)
Read a record from a journal file into a local array   $$GETREC^JRNUTIL(addr,jrnode)
Switch to a different journal file directory           $$JRNSWCH^JRNUTIL(newdir)
Open a journal file                                    $$OPENJRN^JRNUTIL(jrnfile)
Use an opened journal file                             $$USEJRN^JRNUTIL(jrnfile)

The following table describes the arguments used in the utility.

Argument   Description
addr       Address of the journal record.
jrnfile    Name of the journal file.
newdir     New journal file directory.
jrnode     Local variable passed by reference to return journal record
           information.
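The rc value returned by $$add^jrnmark, described above, is a single comma-delimited string that encodes either a journal location or an error. As an illustration of the two shapes a caller must handle, here is a small Python sketch; the helper name is ours, not part of Caché:

```python
def parse_jrnmark_result(rc):
    """Interpret the rc string returned by $$add^jrnmark(id,text).

    Success:  "<offset>,<journal file name>"   (offset is positive)
    Failure:  "<negative error code>,<error message>"
    """
    first, _, rest = rc.partition(",")
    code = int(first)
    if code > 0:
        # A journal offset is always positive, so this is the marker location.
        return {"ok": True, "offset": code, "journal_file": rest}
    return {"ok": False, "error_code": code, "message": rest}
```

The same two-branch check applies however the caller is written: a positive leading number is an offset, a negative one is an error code.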

Manage Journaling at the Process Level Using %NOJRN

If journaling is enabled system-wide, you can stop journaling for Set and Kill operations on globals within a particular process by issuing a call to the ^%NOJRN utility from within an application or from programmer mode as follows:

%SYS>DO DISABLE^%NOJRN

Journaling remains disabled until one of the following events occurs:

- The process halts.
- The process issues the following call to reactivate journaling:

%SYS>DO ENABLE^%NOJRN

3.5 Journal I/O Errors

When Caché encounters a journal file I/O error, the journal daemon retries the failed operation periodically until it succeeds, typically with a one-second interval between attempts. What happens next depends on the Freeze on error journal setting, which indicates whether or not to freeze Caché and is found on the [Home] > [Configuration] > [Journal Settings] page of the System Management Portal. The default setting is No.

Freezing the system on a journal I/O error protects data integrity at the expense of system availability; keeping the system available by disabling journaling threatens recoverability. InterSystems recommends you review your business needs and determine the best approach for your environment. The following sections describe the impact of each choice:

- Freeze System on Journal I/O Error Setting is No
- Freeze System on Journal I/O Error Setting is Yes

Freeze System on Journal I/O Error Setting is No

If you configure Caché not to freeze on a journal file I/O error, Caché disables journaling if it cannot recover from the error in a timely manner. This prevents the system from hanging. Specifically, journaling is disabled after Caché has attempted the failed I/O operation without success and one of the following conditions is met:

- Journal buffers are all filled up.
- It has been retrying for a predetermined number of minutes (internal value).

- The number of available global buffers falls below a predetermined minimum (internal value).

When journaling is disabled, database updates are no longer journaled. As a result, the journal is no longer a reliable source from which to recover databases if the system crashes. The following conditions exist when journaling is disabled:

- Transaction rollback fails and generates <ROLLFAIL> errors.
- Shadowing becomes undependable once updates to the source databases are no longer journaled, because it relies on journaling of the source databases.
- Crash recovery of uncommitted data is nonexistent.
- Full recovery no longer exists; you are only able to recover to the last backup.
- ECP lock and transaction recoverability guarantees are compromised.

If the system crashes, Caché startup recovery does not attempt to roll back incomplete transactions started before it disabled journaling, because the transactions may have been committed but not journaled.

What to do if journaling is disabled

To summarize, if journaling is disabled, perform the following steps:

1. Resolve the problem. As soon as possible, resolve the problem that disabled journaling.

2. Switch the journal file. The journal daemon retries the failed I/O operation periodically in an attempt to preserve the journal data accumulated prior to the disabling. If necessary, you can switch the journal file to a new directory to resolve the error; however, Caché does not re-enable journaling automatically, even if it succeeds with the failed I/O operation and switches journaling to a new file. It also does not re-enable journaling if you switch the journal file manually.

3. Back up the databases on the main server (the backup automatically re-enables journaling if you have not done so). InterSystems strongly recommends backing up your databases as soon as possible after the error to avoid potential data loss.
In fact, performing a Caché online backup when journaling is disabled due to an I/O error restarts journaling automatically, provided that the error condition that resulted in the disabling of journaling has been resolved and you have sufficient privileges to do so. You can also enable journaling by running ^JRNSTART. When a successful backup operation restarts journaling, Caché discards any pending journal I/O, since any database updates covered by the pending journal I/O are included in the backup.

Important: Starting journaling requires higher privileges than running a backup.

4. Restore shadow databases. If using shadowing, restore the backup to the shadow(s) to synchronize the databases, and restart the shadow from the new journal file started since the backup.
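Taken together, the no-freeze behavior described in this section (keep retrying, then disable journaling when recovery is not timely) reduces to a small decision rule. A Python sketch follows; the two threshold defaults are purely illustrative, since Caché's actual limits are internal values:

```python
def should_disable_journaling(journal_buffers_full, minutes_retrying,
                              free_global_buffers,
                              retry_limit_minutes=5, global_buffer_floor=64):
    """Freeze on error = No: after the journal daemon has retried the
    failed I/O without success, journaling is disabled once any one of
    these conditions holds.  retry_limit_minutes and global_buffer_floor
    stand in for Caché's internal (undocumented) thresholds."""
    return (journal_buffers_full
            or minutes_retrying >= retry_limit_minutes
            or free_global_buffers < global_buffer_floor)
```

Any one condition is enough; that is why a full set of journal buffers disables journaling immediately, even seconds after the error first occurs.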

Freeze System on Journal I/O Error Setting is Yes

If you configure Caché to freeze on a journal file I/O error, the journal daemon freezes journaling immediately. This prevents the loss of journal data at the expense of system availability. The journal daemon unfreezes journaling after it succeeds with the failed I/O operation.

As soon as the error occurs, all global activities that are normally journaled are blocked, which causes other jobs to hang. The typical outcome when the Freeze on error setting is Yes is that Caché hangs until you resolve the journaling problem; the system appears to be down to operational end users. While Caché is hung, you can take corrective measures, such as freeing up space on a disk that is full or switching the journal to a new working disk or to one that has free space.

The advantage of this option is that once Caché resolves the problem and resumes, it does not lose any journal information. The disadvantage is that the system is less available while the problem is being solved. Caché posts warning and error messages to the cconsole.log file periodically while the journal daemon is retrying the failed I/O operation.

3.6 Special Considerations for Journaling

Review the following special considerations when using Caché journaling:

- Performance
- UNIX File System Recommendations

Performance

While journaling is crucial to ensuring the integrity of your database, it can consume disk space and slow performance, depending on the number of global updates being journaled. Journaling affects performance because updates result in double processing, as each change is recorded in both the database and the journal file.
Caché uses a flat-file journaling scheme to minimize the adverse effect on performance.

UNIX File System Recommendations

InterSystems has specific journal file recommendations on UNIX-based platforms to achieve optimal journal performance and to ensure journal data integrity if there is a system crash. The following table outlines the recommended file systems and mount options for journaling on each operating system.

For the particular versions supported for each operating system, see the Supported Server Platforms table in the Supported Platforms document.

Recommended File Systems and Options

Operating System                Recommended File System and Options            OS Command
AIX 5L                          JFS2 file system mounted with concurrent       mount -o cio
                                I/O enabled
AIX 5L                          VxFS file system mounted with concurrent       mount -o cio
                                I/O enabled
AIX 5L 5.3 (ML-03 and later)    NetApp NFS3 file system mounted with           mount -o cio
                                concurrent I/O enabled
HP-UX                           VxFS file system mounted with direct I/O       mount -o mincache=direct
                                enabled
Solaris                         UFS (1) or NetApp NFS3 file system mounted     mount -o forcedirectio
                                with direct I/O enabled
Solaris                         VxFS file system mounted with direct I/O       mount -o mincache=direct
                                enabled
Red Hat Enterprise Linux (2),   Caché enables direct I/O by default.           N/A
SUSE Linux Enterprise

(1) Journal I/O performance may not be optimal using UFS.
(2) Red Hat Linux requires a kernel update to address spurious I/O errors on journal files when using direct I/O; this update first appeared in RHSA-2006:0575.

Caché issues a message with severity level 2 (SEVERE) to the cconsole.log file if the device for the current journal file is not of a recommended file system type or is mounted without the necessary option to ensure journal durability. It issues the message only once, when it creates a journal file or if it has switched to a new file system since Caché startup. However, journaling does continue despite the warning, and subsequent journal files in the same file system do not regenerate the warning.

For example, on HP-UX, if you do not use the recommended file system or mount option, you may get one of the following messages:

11/01-11:56:30:570 (3772) 2 The file system for current journal file is NOT one of the recommended ones (vxfs) for the OS and might not support the mount option(s) to ensure journal data integrity in the event of a system crash.

or

11/01-12:09:39:500 (5337) 2 The device for the new journal file was not mounted with a recommended option (mincache=direct) to ensure durability of journal writes. In the event of a system crash, loss of journal data is possible.
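Messages like the two above share a fixed prefix: timestamp, process ID in parentheses, then the severity level. A hypothetical monitoring snippet (Python; the format assumptions are ours, inferred only from the examples shown) that pulls journal-related warnings of severity 2 or higher out of a cconsole.log stream:

```python
import re

# "MM/DD-HH:MM:SS:mmm (pid) severity message...", as in the examples above.
_MSG = re.compile(r"^(\d\d/\d\d-[\d:]+) \((\d+)\) (\d) (.*)$")

def journal_warnings(lines, min_severity=2):
    """Yield (severity, text) for console messages at or above
    min_severity whose text mentions the journal."""
    for line in lines:
        m = _MSG.match(line)
        if m and int(m.group(3)) >= min_severity and "journal" in m.group(4):
            yield int(m.group(3)), m.group(4)
```

A script like this could be pointed at the instance's cconsole.log to alert an operator before a journal device problem turns into disabled journaling or a system freeze.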


4 Shadow Journaling

Shadow journaling, or database shadowing, enables secondary computers to maintain a shadow copy of selected databases as they are updated on a primary machine. By continually transferring journal information from the primary machine to the secondary machines, shadowing enables recovery to a system state that is typically within only a few transactions of the source database.

You can use shadowing for many purposes, each with its own set of important considerations depending on your system environment. Some of the most common objectives satisfied by shadowing include the following:

- Disaster recovery, the most common use; it is simple and inexpensive.
- A read-only report server, where ad hoc reporting tasks can operate on current data without affecting production.
- Low-budget replication, where the databases are replicated on the shadow instance using journaling.
- Failover, in some specific circumstances.

This chapter discusses the following topics:

- Shadowing Overview
- Configuring Shadowing
- Managing and Monitoring Shadowing
- Using the Shadow Destination for Disaster Recovery

4.1 Shadowing Overview

A primary Caché instance may have one or more shadows. Shadow journaling monitors database activity on a primary system, the source, and causes the same activity to occur on a secondary system, the destination. It does this through a shadow client service running on the destination that continually requests journal file details from a shadow service running on the source. The shadow service responds by sending the details of the actual Set, Kill, and $Bit journal record entries to the destination shadow over a TCP connection. The source and destination servers can be of different hardware, operating systems, or CPU chipsets.

All shadowing uses a fast transmission method, which allows more efficient performance by sending the compacted journal file block by block. The shadow applies all transactions to the local databases. The transmission mode requires the data to be written to the journal file, which may introduce a delay of a few seconds.

The shadow establishes a TCP connection to the server and receives the journal file. As the journal file downloads, another shadow process applies the journal entries to the local destination copy of the database. Upon connecting to the data source server, the destination shadow sends the server the name of the journal file and the starting point. The shadow checks for new records periodically. If it does not have

the latest records, the shadow downloads them and updates the databases. During these processes, Caché continually stores checkpoints in a shadow global to facilitate rollback and restart capabilities.

Caché purges the destination shadow copies of source journal files automatically. You can configure how long to keep files that are eligible for purging, that is, ones that have been dejournaled and do not contain any open transactions.

4.2 Configuring Shadowing

This section explains how to configure and set up shadowing in Caché. It describes the following procedures:

- Configuring the Source Database Server
- Configuring the Destination Shadow
- Journaling on the Destination Shadow

Configuring the Source Database Server

To enable shadowing on a source database server, first ensure that the source system can make a TCP connection to the destination system. If you plan to secure this connection using SSL, a %SuperServer SSL/TLS configuration must exist on the source. See Configuring the Caché Superserver to Use SSL/TLS in the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide for details.

Important: A shadow service cannot run on a system with a single-server license.

Use the System Management Portal from the Caché instance running on the source system to enable the shadow service, restrict connections, and enable global journaling for the databases you are shadowing. You also need to synchronize the source and destination databases before shadowing begins. These procedures are described in the following topics:

- Enable the Shadowing Service
- Enable Journaling
- Synchronize Databases

For information on methods and queries available for interfacing with the data source of a shadow without using the System Management Portal, see the SYS.Shadowing.DataSource class documentation in the Caché Class Reference.

Also see the Important Journaling Considerations section for issues that may pertain to your environment.

Enable the Shadowing Service

To use shadowing, you must enable the shadowing service using the Security Management portion of the System Management Portal. You may also restrict shadowing access by entering the IP addresses of allowed connections:

1. Navigate to the [Home] > [Security Management] > [Services] page of the System Management Portal.

2. Click %Service_Shadow in the list of service names to edit the shadow service properties.

3. Select the Service enabled check box. Before clicking Save, you may want to first restrict which IP addresses can connect to this database source. If so, perform the next step, and then click Save.

4. In the Allowed Incoming Connections box, any previously entered server addresses are displayed in the IP Address list. Click Add to add an IP address. Repeat this step until you have entered all permissible addresses. You may delete any of these addresses individually by clicking Delete in the appropriate row, or click Delete All to remove all addresses, thereby allowing connections from any address.

Enable Journaling

Verify that you are journaling each database that you wish to shadow:

1. Navigate to the [Home] > [Configuration] > [Local Databases] page of the System Management Portal and view the Journal column for each database you wish to shadow.

2. To change the journal state from No to Yes, click Edit in the row of the appropriate database to edit the database properties.

3. In the Global Journal State list, click Yes and then click Save.

By default, the CACHELIB, DOCBOOK, and SAMPLES databases are not journaled and, as a result, you cannot shadow them. CACHETEMP is never journaled.

Synchronize Databases

Before you start shadowing, synchronize the databases on the shadow destination with the source databases.
Use an external backup on the source data server and restore the databases on the destination shadow. See the Backup and Restore chapter of the Caché High Availability Guide for more information.

Important Journaling Considerations

Review the following sections for conditions that may affect your system:

- Managing Source Journal File Purging
- Responding to Disabled Source Journaling
- Using the Shadow for Queries
- Shadowing Class Compiles

Managing Source Journal File Purging

Caché does not do any special handling of journal file purging on the source for shadowing; therefore, it is your responsibility to configure the journal settings on the source to ensure that the journal files the shadow requires remain available until they are transmitted to the destination shadow. This is usually only a concern if the shadow falls seriously behind the source; for example, if you suspend the shadow or stop it for a prolonged period of time.

Responding to Disabled Source Journaling

If journaling is disabled on the source, you must determine the best course of action to maintain a valid shadow. Most likely, you will have to resynchronize the shadow with the source after you resolve the condition that caused Caché to disable journaling. See the Journal I/O Errors section of the Journaling chapter of this guide for more details.

Using the Shadow for Queries

Versions of Caché prior to this release turned off journaling for any changes to the cached query definition global (^mcq); however, this version does journal updates to this global. As a result, if you plan to execute SQL queries on the destination shadow, you must segregate updates of the ^mcq global on the source from the activity on the destination. This prevents logical corruption of the ^mcq global on the shadow. You can do this easily with global and routine mapping. For each namespace on the source from which you run dynamic queries, do the following:

1. Create a new database on the source to isolate the source cached queries.

2. Add global mapping for the ^mcq global in the source namespace to the new database.

3.
Add routine mapping for all versions of the cached query routines (the default prefix is CacheSql) in the source namespace to the new database.

4. Exclude the new database from any mapping to the shadow, thereby preventing any updates to the ^mcq global or the CacheSql routines on the source from being transferred to the shadow.

Shadowing Class Compiles

Caché journals the database that contains the globals used for compiling classes. Journaling these globals during a class compile is unnecessary and could take up a significant amount of journal file space; therefore, by default, Caché turns journaling off during class compiles.

If you use shadowing and rely on the compile on the source to update the application code on the shadow, change the default qualifier (compile with /journal=1) so that you do journal the class compile and transfer those updates to the shadow database. Otherwise, you cannot use the shadow for disaster recovery unless you recompile all your classes.

Configuring the Destination Shadow

To configure shadowing on a destination shadow server, first ensure that the destination system can make a TCP connection to the source system. If you plan to use SSL, an SSL/TLS client configuration must exist on the destination. See A Note on Caché Client Applications Using SSL/TLS in the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide for details.

Use the System Management Portal from the Caché instance running on the destination system to configure the destination shadow properties and start shadowing. These procedures are described in the following sections:

- Define the Shadow
- Map the Databases
- Start Shadowing

For information on methods and queries available for interfacing with the shadow destination without using the System Management Portal, see the SYS.Shadowing.Shadow class documentation in the Caché Class Reference.

Define the Shadow

Navigate to the [Home] > [Configuration] page of the System Management Portal and click Shadow Server Settings under the Connectivity column to display the [Home] > [Configuration] > [Shadow Server Settings] page. Perform the following steps to define the shadow properties:

1. Click Add New Shadow to define a shadow on this destination server.
If you have previously defined a shadow and wish to update its information, click Edit in the row of the source settings you wish to update.

2. Enter an identifying name for this shadow in the Name of the shadow box. This value is also referred to as the shadow ID; the system uses it to distinguish between shadow instances that may be running on the same system. Do not use the tilde (~) character in the shadow name; it is used in internal shadow processing.

3. Enter the TCP/IP address or host name (DNS) of the source database server you are shadowing in the DNS name or IP address of the source box.

4. Enter the superserver port number of the source Caché instance you are shadowing in the Port number of the source box.

   Important: If you change the IP address or the port number on a suspended shadow, it is your responsibility to ensure the shadow can resume properly.

5. Click Advanced to enter the following optional fields:

   Journal file directory
   Enter the full name, including the path, of the journal file directory on the destination shadow system. Click Browse for help in finding the proper directory. Consider the following when updating this entry:
   - If you are shadowing more than one source instance on this destination, ensure you use a unique journal file directory for each instance.
   - If you change the journal file directory on a cluster shadow, the change only takes effect for journal files from new cluster nodes until you stop and restart the shadow.

   Filter routine
   Enter the name (omit the leading ^) of an optional filter routine the shadow uses to filter journal records before dejournaling them on the shadow. The routine should be in the %SYS namespace. See the Create a Filter Routine section for details.

   Days of old copied journals to keep
   Enter the number of days to keep the shadow copies of the source journal files. By default, the Caché destination purges its copy of a journal file as soon as it finishes dejournaling it, as long as it does not contain open transactions. You can keep the shadow copies of the journal files on the destination longer by entering a value in this field. For example, if you enter 3, the shadow copy of a source journal file is eligible for purging if the source journal file is at least three days old, is completely dejournaled, and does not contain open transactions. The completion date of the source journal file determines its age.
   Maximum error messages to keep
   Enter the number of shadowing errors, from 0 to 200, that Caché should retain. The default is 10.

   SSL Configuration
   Choose from the list of existing client configurations; leave this entry blank if you do not wish to use SSL for the shadow connection. The shadow connection to the source fails if either of the following two conditions exists:
   - The source does not support SSL, but you choose an SSL configuration.
   - The source requires SSL, but you do not choose an SSL configuration.

6. Click Save to enable the database mapping portion of the page. See the Map the Databases section for details.
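The purge rule behind the Days of old copied journals to keep field combines three independent conditions. It can be restated as a single predicate; a Python sketch with illustrative parameter names of our own choosing:

```python
from datetime import date, timedelta

def purge_eligible(completion_date, fully_dejournaled, has_open_transactions,
                   days_to_keep, today=None):
    """A shadow copy of a source journal file may be purged once it is
    fully dejournaled, holds no transactions still open on the shadow,
    and the source file's completion date is at least days_to_keep days
    in the past (the completion date determines the file's age)."""
    today = today or date.today()
    old_enough = completion_date <= today - timedelta(days=days_to_keep)
    return fully_dejournaled and not has_open_transactions and old_enough
```

With days_to_keep set to 0 this reduces to the default behavior: purge as soon as dejournaling finishes and no transactions remain open.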

Create a Filter Routine

If you indicate a filter routine, the shadow dejournaling process runs it from the %SYS namespace. Your filter routine should take the following format:

MyShadowFilter(pid,dir,glo,type,addr,time)

The following table describes how the shadowing process passes the input values to the filter routine.

Argument   Description
pid        Process ID of the record. If the record has a nontrivial
           remote system ID, pid contains two fields delimited by a
           comma (,): the first field is the process ID and the second
           is the remote system ID.
dir        Source (not shadow) database directory
glo        Global reference in the form global(subscripts), without the
           leading ^
type       Type of the record; valid values are: S (SET), s (BITSET),
           K (KILL), k (ZKILL)
addr       Offset of the record in the journal file
time       Timestamp of the record

In the filter routine logic, return 0 for the dejournaling process to skip the record; otherwise, the shadow dejournals the record.

CAUTION: Perform the New command on any local variable in the filter routine to avoid accidentally overwriting the variables used in the shadow routine.

The following sample filter routine skips all journal records for globals whose names begin with X during the dejournaling process and logs each record that is dejournaled:

MyShadowFilter(pid,dir,glo,type,addr,time) ;shadow filter routine
 If $Extract($QSubscript(glo,0))="X" Quit 0 ;skip X* globals
 Do MSG^%UTIL(pid_","_dir_","_glo_","_type_","_addr_","_time,1,0) ;log
 Quit 1

Note: If you use dot syntax when referring to a global in your filter routine, you must use the leading ^.

You can specify the filter routine in one of the following two ways:

From the [Home] > [Configuration] > [Shadow Server Settings] page, when you choose to Add a New Server or Edit an existing shadow, enter the name in the Filter routine box in the Advanced settings.

Alternatively, you can set the global node ^SYS("shdwcli",<shdw_id>,"filter") to the name of the filter routine (without the leading ^), where <shdw_id> is the name of the shadow. For example:

Set ^SYS("shdwcli","MyShadow","filter")="MyShadowFilter"

In this example, MyShadow is the name of the shadow and MyShadowFilter is the name of the filter routine.

Map the Databases

After you successfully save the configuration settings, you can add or delete database mappings from the source to the shadow:

1. Next to Database mapping for this shadow, click Add to associate the database on the source system with the directory on the destination system using the Add Shadow Mapping dialog box.

2. In the Source database directory box, enter the physical pathname of the source database file (the CACHE.DAT file). Enter the pathname of its corresponding destination shadow database file in the Shadow database directory box, and then click Save.

3. Verify any pre-filled mappings and click Delete next to any invalid or unwanted mappings. Shadowing requires at least one database mapping to start.

4. Click Close to return to the [Home] > [Configuration] > [Shadow Server Settings] page.

See the Shadowing the Source Manager's Directory section for special instructions if you want to shadow the CACHESYS database.

If the source database server is part of a cluster, the configuration settings for the destination shadow differ slightly. For information on shadowing a clustered system, see the Cluster Shadowing section of the Cluster Journaling chapter of this guide.

Shadowing the Source Manager's Directory

In this release you can use the CACHESYS database as a source database of shadowing, provided that the target (shadow database) is not the CACHESYS database on the shadow.
Currently the only way to add a database mapping containing the source manager's directory (CACHESYS) to a shadow configuration is by using the SYS.Shadowing.Shadow class API. For example:

 Set ShadowOref=##class(SYS.Shadowing.Shadow).%OpenId("MyShadow")
 Do ShadowOref.SetDatabaseToShadow("C:\MyCache\Mgr","D:\MyCacheShdw\Shdwsys")
 Set rc=ShadowOref.%Save()

where C:\MyCache\Mgr is the source manager's directory for the CACHESYS database and D:\MyCacheShdw\Shdwsys is the directory for a database that is not the CACHESYS database on the destination. See the SYS.Shadowing.Shadow entry in the Caché Class Reference for details.

Journaling on the Destination Shadow

InterSystems recommends that you journal all databases that are the destination of shadowing. The destination shadow then maintains a journal of applied updates, which provides an additional level of redundancy.

Important: Be careful not to place these journals in the same directory as the journal files coming over from the source database server.

To mitigate the increased demand on the storage capacity of the shadow, Caché purges the destination shadow copy of a source journal file once it is dejournaled, as long as it does not contain any transactions open on the shadow and it meets the configured purge criteria. See the description of the Days of old copied journals to keep field in the Define the Shadow section.

If you decide not to journal the destination shadow databases, you must also disable journaling on the shadow's CACHESYS database. Caché stores the journal address and journal file name of the journal record last processed by shadowing in the ^SYS global in the CACHESYS database. This serves as a checkpoint from which shadowing resumes if shadowing fails.

CAUTION: On the shadow destination, if you journal the CACHESYS database but not the destination shadow databases, and the shadow crashes and is restarted, the checkpoint in CACHESYS could be recovered to a point in time that is later in the journal stream than the last record committed to the shadow databases.
As a result, the resumed shadow may effectively skip processing some journal entries.

4.3 Managing and Monitoring Shadowing

Caché provides an interface to shadow processing through the System Management Portal. You can also configure and manage shadow processing using the ^SHADOW utility or the SYS.Shadowing API classes, SYS.Shadowing.DataSource and SYS.Shadowing.Shadow. This document describes the procedures using the System Management Portal and gives some examples of using the shadowing APIs.

A shadow can be in one of three states; depending on the state, you can perform different actions on the shadow. The following sections describe each state and action, including the interrelationships among them.

Shadow States

A shadow can be in one of these states at any given time:

Stopped: When a shadow is stopped, you can modify its properties. This is the initial state of a newly created shadow.

Processing: When a shadow is running, it applies database updates and you cannot modify its properties.

Suspended: When a shadow is suspended, it does not apply database updates but retains checkpoints. You can modify its properties, though some changes may not take effect immediately. See the Shadow Checkpoints section for details.

Shadow Actions

There are four types of allowable actions you can perform on a shadow, depending on its current state and your user privileges:

Start / Restart: Starts a stopped shadow from the starting point specified using Select Source Event or, in the case of a restart, from the appropriate checkpoint.

Stop (with or without rollback): Stops a processing or suspended shadow. When you stop shadow processing, Caché offers you the choice of whether or not to roll back any open transactions.

Suspend: Suspends a processing shadow. Unlike stopping a shadow, when you suspend a shadow, Caché maintains its open transactions, journal files, and checkpoints.

Resume: Resumes a suspended shadow from where it left off.

When a fatal error occurs, a shadow aborts, entering the suspended state. The following diagram shows the permissible actions on a shadow in each state. It indicates the shadow states with circles and shows the actions you can perform on these states with arrows.

Relationships of Shadow States and Permissible Actions

Shadow Processing Considerations

Keep the following conditions in mind when deciding when and how to change the state of a shadow:

- A stopped shadow does not start or restart automatically with a Caché restart; you must start or restart it explicitly as described in this chapter. Conversely, on Caché startup, a shadow that was not in the stopped state in the previous Caché session resumes automatically.

- You cannot start a suspended shadow; you must either resume processing or stop the shadow and then start it.

- Avoid restarting a shadow after you stop it with rollback. The shadow databases may be in an undeterminable state until the shadow reaches the journal location of the last stop.

There are two places in the System Management Portal where you can perform tasks on a defined shadow:

- The Shadow Administration Tasks require system manager privileges. From the Configuration menu under System Administration tasks (the [Home] > [Configuration] page), click Shadow Server Settings in the Connectivity column.

- The Shadow Operation Tasks require operator privileges. From the Operations menu on the [Home] page, choose Shadow Servers and then click This System as Shadow Server.

See the individual method descriptions in the SYS.Shadowing.Shadow entry of the Caché Class Reference for details on performing these shadowing tasks programmatically.
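For example, the checkpoint information described in the Shadow Checkpoints section can be retrieved with the CheckPointInfo method. The following is a minimal sketch, not a definitive usage: the class name, %OpenId, and the CheckPointInfo method name appear in this guide, but the empty argument list and printable return value are assumptions.

```objectscript
 ; Inspect a shadow's latest checkpoint programmatically (run in %SYS).
 ; ASSUMPTION: CheckPointInfo takes no arguments and returns a value
 ; suitable for display; verify in the Caché Class Reference.
 Set ShadowOref=##class(SYS.Shadowing.Shadow).%OpenId("MyShadow")
 If ShadowOref'="" Write ShadowOref.CheckPointInfo(),!
```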

Shadow Checkpoints

Caché creates checkpoints periodically throughout the shadowing process. A checkpoint for the shadow is a location in the shadow copy of a source journal with the following implications:

1. All records at and prior to it are presumed to have been applied to the shadow databases.

2. It is safe, as far as database integrity is concerned, for the shadow to resume from the checkpoint after being suspended.

You can retrieve checkpoint information using the CheckPointInfo method of the SYS.Shadowing.Shadow class.

Shadow Administration Tasks

The [Home] > [Configuration] > [Shadow Server Settings] page lists each defined shadow with the name, status, source name and port, start point, filter, and choices for performing actions on the shadow configuration. Click the following options to perform the indicated task:

Edit: Allows updates to the fields you entered when you added a new shadow. See Define the Shadow for descriptions of these settings. You cannot save edits if the shadow is processing.

Start: Starts shadow processing from a start point you select; option available if the shadow is stopped. See Start Shadowing for details.

Restart: Starts shadow processing from the appropriate checkpoint, depending on whether or not you rolled back any open transactions when you stopped the shadow. See Restart Shadowing for details.

Stop: Stops shadow processing; option available if the shadow is processing or suspended. Select the Roll back open transactions check box if you want to roll back any open transactions. See Stop Shadowing for details.

Delete: Deletes the entire shadow definition; you must stop the shadow before deleting the shadow definition.

Start Shadowing

Once you add a shadow definition, it appears in the list of shadows on the [Home] > [Configuration] > [Shadow Server Settings] page. You can start a shadow from this page:
1. Before starting the shadowing process, verify that you have synchronized the databases you are shadowing on the source and destination and mapped the source databases to the corresponding destination databases.

2. Click Start in the row for the shadow name you want to start.

3. After verifying the location information for the source instance, click Select Source Event to choose where to begin shadowing. A page displays the available source events from the source journal file directory. You must select a source event before you can start shadowing. See the Select a Source Event section for details.

From the System Management Portal, if you attempt to start a shadow that had been processing, you may see the following warning:

 *** WARNING ***
 There is a checkpoint from previous shadowing session.
 You might want to RESTART the shadow from that checkpoint instead.
 If you do START, the checkpoint and any remaining shadow copies of
 journal files from previous shadowing session will be deleted.
 Are you sure you want to start shadowing?

As this warning states, if you start a previously processing shadow, Caché clears all checkpoints and stores the start point you select as the first checkpoint. This process also purges any remaining shadow copies of the source journal files and fetches new copies from the source, regardless of a possible overlap between the files. Ensure that your new start point coincides with the state of the shadow databases.

If you run multiple shadows on an instance of Caché, see the Generic Memory Heap Considerations section for details on adjustments you may have to make.

Select a Source Event

While starting a destination shadow, you must select a source event from the journal files on the data source server where shadowing of the journaled databases should begin. Click Select Source Event to display a list of journal events on the source database that are valid starting points for shadowing. From this list, click the time to specify at which source event shadowing starts.

Choose the starting point after which you synchronized the databases on the source and destination. For example, Caché automatically switches the journal file after a successful backup.
Before starting the shadowing process, synchronize the databases by restoring the successful backup file from the source on the destination shadow databases. On the shadow, click Select Source Event from the configuration page to see events listed similar to those in the following display.

For this example, to start shadowing at the point when the source backup ended successfully (the point of database synchronization), click the Time ( :30:54) of the Event showing the end of the backup.

Generic Memory Heap Considerations

The journal reader and database update processes on the shadow destination communicate via shared memory allocated from the generic memory heap (also known as gmheap or GenericHeapSize). Several processes in Caché use this memory, and how Caché allocates it involves very complex interactions; therefore, Caché silently increases the gmheap allocation during startup when necessary. In most cases, you should not have to adjust the allocation manually.

If you start multiple shadows at or near the same time while Caché is running, you may receive a gmheap allocation error. You can improve the allocation by starting the shadows as a group. If you start (or resume) multiple shadows one by one consecutively, the first shadow to start uses about half of the free gmheap memory; the second, half of what remains; and so on. In contrast, if you start multiple shadows as a group, every shadow in the group uses 1/(N+1) of the free gmheap memory, where N is the number of shadows in the group. Thus, starting multiple shadows as a group not only avoids the possible error allocating memory from gmheap, but also allocates memory evenly among the shadows. See the StartGroup method in the SYS.Shadowing.Shadow entry of the Caché Class Reference for more information.

You can adjust the gmheap size by changing the setting of GenericHeapSize from the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal.

Restart Shadowing

If you stopped shadowing, in addition to the choice of starting the shadow, you can restart the shadow. Processing starts from the last checkpoint taken before you stopped the shadow if you chose not to roll back open transactions.
If you chose to roll back when you stopped the shadow, shadow processing begins at the checkpoint prior to the earliest open transaction when the shadow stopped. A shadow restart reuses the journal files retained from the last time you stopped the shadow.
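The group start recommended under Generic Memory Heap Considerations can be performed with the StartGroup method of SYS.Shadowing.Shadow. The following is a hedged sketch, not a definitive usage: the method name comes from this guide, but its signature (here assumed to take a $List of shadow names) and return value are assumptions to verify in the Caché Class Reference.

```objectscript
 ; Start several shadows as a group so free gmheap is divided evenly:
 ; each of the N shadows gets about 1/(N+1) of the free gmheap memory.
 ; ASSUMPTION: StartGroup accepts a $List of shadow names and returns
 ; a success code; verify the actual signature in the class reference.
 Set shadows=$ListBuild("ShadowA","ShadowB","ShadowC")
 Set rc=##class(SYS.Shadowing.Shadow).StartGroup(shadows)
 If 'rc Write "Failed to start shadow group",!
```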

Stop Shadowing

When you stop shadowing, you can choose whether or not to roll back any open transactions by selecting or clearing the Roll back open transactions check box.

Stop without Rollback

Stopping without rollback is similar to suspending a shadow, but requires more privileges. You may choose this option if you want to maintain the current checkpoint but do not want Caché to automatically resume the shadow at restart, as it does for a suspended shadow. For example, if you have to make configuration changes that require a Caché restart, plus additional changes after Caché is up but before the shadow should start, use this option.

Stop with Rollback

This option is mainly for disaster recovery. The rollback sets the shadow databases to a logically consistent state, though out of sync with the source. Avoid restarting a shadow that you stopped and rolled back. The option to restart exists so that you can recover if choosing the rollback was a mistake, avoiding the need to resynchronize the shadow with the source.

CAUTION: Restarting a shadow that you stopped with rollback may leave the shadow in an indeterminate state until the shadow has progressed beyond the point of the pre-rollback state.

Shadow Operations Tasks

You can monitor the shadowing operation status from both the source and destination servers of the shadow. From the System Management Portal, navigate to the [Home] > [Shadow Servers] page. On this page you choose the appropriate option depending on whether you are monitoring the shadowing process from the shadow side or the data-source side. The following sections detail the contents of each side:

- Managing the Destination Shadow
- Monitoring the Data Source

Managing the Destination Shadow

You can monitor and manage the shadow process from the destination shadow.
Click This System as Shadow Server to display a list of source servers for this shadow machine and the associated actions you can perform on each item. The [Home] > [Shadow Servers] > [Shadows] page lists each defined shadow with the following information:

Name: Name of the shadow.

Status: One of the three shadowing states described previously in this section: stopped, processing, or suspended. You may see "trying to connect" as the status when you initiate processing.

Checkpoint: Offset location in the shadow copy of the journal where it is safe to resume processing.

Errors: Number of errors reported on the shadow destination.

Open Transactions: Indicates whether or not there are open transactions on the shadow and, if so, how many.

Latency: Estimated time for the shadow to process the journal records that it copied from the source but has not yet applied to the shadow databases.

Click the following options to perform the indicated task:

Details: Displays selected details of this shadowing configuration.

Resume: Resumes shadow processing; option available if you have previously suspended shadow processing.

Suspend: Suspends shadow processing; option available if the shadow is processing.

Errors: Displays a list of errors occurring on the destination shadow. Caché retains the details of the number of errors you indicate in the configuration of the shadow. Click Delete to clear these errors and return to the list of shadows.

Monitoring the Data Source

You can also monitor the shadow process on the source system from the Data Source column of the [Home] > [Shadow Servers] page: Click This System as Data Source to display a list of shadows defined for this data source. The [Home] > [Shadow Servers] > [Data Source] page lists each defined shadow with the following details:

Port: Superserver port number of the Caché source instance (also shows process ID).

Shadow IP: IP address of the shadow destination machine.

Journal: Full directory path and file name of the journal file currently being copied.

PID: Process ID number of the shadow journal copying process.

Latency: Estimated time for the shadow to catch up copying the source journal file. This is not the latency of shadow dejournaling, which is available on the destination side.

Shadowing Rate: Rate in KB per second at which the shadow copies the source journal files.

Click Error Log to display the [Home] > [Shadow Servers] > [Data Source Errors] page, which lists errors reported on this data source.

You can also obtain this information programmatically. See the SYS.Shadowing.DataSource entry of the Caché Class Reference for details.

4.4 Using the Shadow Destination for Disaster Recovery

The makers of operating systems and computer hardware provide several appealing failover strategies for high availability. Caché is compatible with these strategies (see the System Failover Strategies chapter in this guide for more information). In addition, Caché provides capabilities to support disaster recovery strategies.

Caché shadowing provides low-cost logical data replication over heterogeneous network configurations. Shadowing is a good low-cost solution for off-site disaster recovery. The following list outlines some of the major reasons why:

- Data loss is typically small: a few seconds to a few minutes of updates.

- You can locate the shadow far away from the primary location.

- Time to recover is minutes.

- Shadowing is a good mechanism to recover from disk failure, database degradation due to hardware or software failure, and destruction of the primary physical plant. Shadowing, however, cannot recover from malicious deletion of globals.

- A Caché shadow server can apply journals from several dissimilar platforms on a small-scale server over any TCP network.

- Since shadowing conveys only logical updates to the destination, it eliminates the risk of propagating any structural problem.
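As a hypothetical illustration of that programmatic route: only the class name SYS.Shadowing.DataSource is taken from this guide; the %OpenId usage and the Latency property shown below are assumptions, not documented API, and must be checked against the class reference.

```objectscript
 ; HYPOTHETICAL sketch: read the copy latency for one shadow from the
 ; data-source side. The %OpenId key and the Latency property are
 ; assumptions; check the SYS.Shadowing.DataSource class reference.
 Set SourceOref=##class(SYS.Shadowing.DataSource).%OpenId("MyShadow")
 If SourceOref'="" Write "Copy latency: ",SourceOref.Latency,!
```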
You should consider the following limitations before deciding whether Caché shadowing best suits your disaster recovery strategy:

- The shadow server applies production journals asynchronously so as not to affect performance on the production server. This results in possible latency in data applied to the shadow server,

although it is generally seconds behind at most. Consequently, if you want to use the shadow server databases, they might be slightly out of date. This latency could increase if the shadow server's connection with the production server is lost for any sustained period. Caché provides mechanisms to monitor the state and progress of the shadow server to help you determine the risk of using the shadow server databases during disaster recovery.

- Open transactions may remain. You can choose whether or not to roll back any incomplete transactions when you stop a shadow, which may depend on the state of the source journal files at the time of the disaster.

- Enabling the shadow server to replace the production server is not automatic.

The following procedure highlights how you might recover to the shadow server. If your database system functions as an application server, install identical applications on your shadow system to speed recovery. To use your shadow system as a master database:

1. Follow the procedure for stopping shadowing on the shadow server.

2. Stop Caché and change the IP address and fully qualified domain name (FQDN) of the shadow system so that it exactly matches the original database system.

3. Restart Caché.
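Step 1 can also be carried out through the shadowing API rather than the portal. This sketch is hypothetical: the Stop method name and its rollback flag are assumptions, not taken from this guide; verify against the SYS.Shadowing.Shadow entry in the Caché Class Reference.

```objectscript
 ; HYPOTHETICAL: stop the shadow and roll back open transactions so the
 ; shadow databases are left logically consistent before promotion.
 ; The method name Stop and the 1 (rollback) flag are assumptions.
 Set ShadowOref=##class(SYS.Shadowing.Shadow).%OpenId("MyShadow")
 If ShadowOref'="" Do ShadowOref.Stop(1)
```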


5 System Failover Strategies

Caché fits into all common high-availability configurations supplied by operating system providers, including Microsoft, IBM, HP, and EMC. Caché provides easy-to-use, often automatic, mechanisms that integrate easily with the operating system to provide high availability.

There are four general approaches to system failover. In order of increasing availability, they are:

- No Failover
- Cold Failover
- Warm Failover
- Hot Failover

Each strategy has varying recovery time, expense, and user impact, as outlined in the following table:

 Approach        Recovery Time   Expense               User Impact
 No Failover     Unpredictable   No cost to low cost   High
 Cold Failover   Minutes         Moderate              Moderate
 Warm Failover   Seconds         Moderate to high      Low
 Hot Failover    Immediate       Moderate to high      None

There are variations on these strategies; for example, many large enterprise clients have implemented hot failover and also use cold failover for disaster recovery. It is important to differentiate between failover and disaster recovery. Failover is a methodology to resume system availability in an acceptable period of time, while disaster recovery is a methodology to resume system availability when all failover strategies have failed.

If you require further information to help you develop a failover and backup strategy tailored for your environment, or to review your current practices, please contact the InterSystems Worldwide Response Center (WRC).

5.1 No Failover

With no failover in place, your Caché database integrity is still protected from production system failure. Structural database integrity is maintained by Caché write image journal (WIJ) technology; you cannot disable this. Logical integrity is maintained through global journaling and transaction processing. While global journaling can be disabled and transaction processing is optional, InterSystems highly recommends using them.

If a production system failure occurs, such as a hardware failure, the database and application are generally unaffected. Disk degradation, of course, is an exception. Disk redundancy and good backup procedures are vital to mitigate problems arising from disk failure.

With no failover strategy in place, system failures can result in significant downtime, depending on the cause and your ability to isolate and resolve it. If a CPU has failed, you replace it and restart, while application users wait for the system to become available. For many applications that are not business-critical, this risk may be acceptable. Customers that adopt this approach share the following common traits:

- Clear and detailed operational recovery procedures
- Well-trained, responsive staff
- Ability to replace hardware quickly
- Disk redundancy (RAID and/or disk mirroring)
- Enabled global journaling
- 24x7 maintenance contracts with all vendors
- Expectations from application users who tolerate moderate downtime
- Management acceptance of the risk of an extended outage

Some clients cannot afford to purchase adequate redundancy to achieve higher availability. With these clients in mind, InterSystems strives to make Caché 100% reliable.

5.2 Cold Failover

A common and often inexpensive approach to recovery after failure is to maintain a standby system to assume the production workload in the event of a production system failure. A typical configuration has two identical computers with shared access to a disk subsystem. After a failure, the standby system takes over the applications formerly running on the failed system. Microsoft Windows Clusters, HP MC/ServiceGuard, Tru64 UNIX TruClusters, OpenVMS Clusters, and IBM HACMP provide a common approach for implementing cold failover.

In these technologies, the standby system senses a heartbeat from the production system on a frequent and regular basis. If the heartbeat consistently stops for a period of time, the standby system automatically assumes the IP address and the disk formerly associated with the failed system. The standby can then run any applications (Caché, for example) that were on the failed system. In this scenario, when the standby system takes over the application, it executes a pre-configured start script to bring the databases online. Users can then reconnect to the databases that are now running on the standby server. Again, the WIJ, global journaling, and transaction processing are used to maintain structural and data integrity.

Customers generally configure the failover server to mirror the main server, with identical CPU and memory capacity, to sustain production workloads for an extended period of time. The following diagram depicts a common configuration:

Cold Failover Configuration

 State of PROD    IP address in use
 FUNCTIONAL       IP address of PROD
 OUT OF SERVICE   IP address of STDBY

Note: Shadow journaling, where the production journal file is continuously applied to a standby database, includes inherent latency and is therefore not recommended as an approach to high availability. Any use of a shadow system for availability or disaster recovery needs should take these latency issues into consideration.

5.3 Warm Failover

The warm failover approach exploits a standby system that is immediately available to accept user connections after a production system failure. This type of failover requires the concurrent access to disk files provided, for example, by OpenVMS clusters and Tru64 UNIX TruClusters. In this type of failover, two or more servers, each running an instance of Caché and each with access to all disks, concurrently provide access to all data. If one machine fails, users can immediately reconnect to the cluster of servers.

A simple example is a group of OpenVMS servers with cluster-mounted disks. Each server has an instance of Caché running. If one server fails, the users can reconnect to another server and begin working again.

Warm Failover Configuration

 State                  A           B           C
 Normal                 300 users   300 users   300 users
 B fails                300 users   0 users     300 users
 B users log on again   450 users   0 users     450 users

The 600 users on A and C are unaware of B's failure, but the 300 users that were on the failed server are affected.

5.4 Hot Failover

The hot failover approach can be complicated and expensive, but comes closest to ensuring 100% uptime. It requires the same degree of failover as for a cold or warm failover, but also requires that the state of a running user process be preserved to allow the process to resume on a failover server. One approach, for example, uses a three-tier configuration of clients and servers.

Hot Failover Configuration

Thousands of users on terminals or browsers connect through TCP sockets to a bank of application servers. Each application server has a backup server ready to start automatically in case of a server failure. In turn, the application servers are each connected to a bank of data servers, each with its own backup server. If a data server fails, any application server waiting for a response automatically resubmits its request to a different data server while the backup server is started. Similarly, any user terminal that sends a request to an application server that fails automatically reissues its request to an alternate application server.

6 Caché Cluster Management

This chapter contains information about cluster management in Caché. It discusses the following topics:

- Overview of Caché Clusters
- Configuring a Caché Cluster
- Managing Cluster Databases
- Caché Startup
- Write Image Journaling and Clusters
- Cluster Backup
- System Design Issues for Clusters
- Cluster Application Development Strategies
- Caché ObjectScript Language Features
- DCP and UDP Networking

Caché clusters can be configured on both the OpenVMS and Tru64 UNIX platforms. This chapter contains information about cluster management in general for both platforms and some specifics for OpenVMS. For more detailed information on other cluster-related topics, please see:

- Cluster Journaling
- Caché Clusters on Tru64 UNIX
- Caché and Windows Clusters
- ECP Failover

6.1 Overview of Caché Clusters

Caché systems may be configured as a cluster. Cluster configurations provide special benefits to their users:

- Users can invisibly share disk storage and printers or maintain private access to these resources.
- Users can share queues.
- Cluster software can be configured to search for the least used resource, maximizing usage of resources while simultaneously increasing throughput.

In a cluster environment, each computer executes its own copy of software. A Caché cluster is identified by its pre-image journal (PIJ) directory. Nodes that specify the same PIJ directory are all part of a cluster. A cluster session begins when the first cluster node starts and ends when the last cluster node shuts down.

You can customize the networking capabilities of Caché to allow for cluster failover: if one computer in the cluster goes down, the remaining members continue to function without database degradation. Cluster members can share databases; you can connect the computers in a cluster in the following ways:

- Special-purpose hardware, such as Memory Channels and Gigabit Ethernet, for high-speed communication
- SCSI bus-based clusters
- Ethernet cables, for lower cost
- A combination of the above

The functionality provided is the same, regardless of which connection mechanisms are used. The following are system specifications for a cluster configuration:

- Maximum number of cluster nodes in a cluster: 14
- Maximum number of cluster-mounted databases: approximately

Cluster Master

The first node running Caché that joins the cluster by attempting to mount a database in cluster mode becomes the cluster master. The cluster master performs the following functions:

- Acts as a lock server to all cluster-mounted databases
- Coordinates write image journaling cluster-wide

- Manages cluster failover

If the cluster master fails or shuts down, the next node that joined the cluster becomes the cluster master and assumes these functions.

A node joins a cluster when it starts its ENQ daemon system process (ENQDMN). Caché activates this process the first time a node attempts to cluster-mount a database. At the same time, it also creates the Recovery daemon (RECOVERY) on a node to manage cluster failover. Caché creates the ENQDMN and RECOVERY system processes only on systems that join a cluster.

The ENQ daemon uses the cluster-wide PIJ file, which you specify from the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal. Each node in the cluster must specify the same location for this file. Caché uses the PIJ file to support cluster failover, recovery, and write image journaling in a cluster. See Configuring a Caché Cluster for details.

Cluster Master as Lock Server

The cluster master acts as a lock server by managing access to the cluster-mounted database (CACHE.DAT) files. Applications that run in a cluster must have mechanisms to coordinate access from multiple cluster nodes to cluster-mounted databases. Caché accomplishes this at two levels:

Block-level locks: Caché manages block-level access to shared databases on disk for Caché applications running in a cluster environment. It prevents one node from reading or modifying a block from a disk which is simultaneously being changed in the memory of another node. Multiple nodes can read the same block, but only one can update it at a time. Caché manages these simultaneous access requests at the block level with the Distributed Lock Manager (DLM), using the ENQ daemon (ENQDMN).

Caché ObjectScript-level locks: While each cluster member can directly access clustered databases, no member can independently process Caché ObjectScript Lock commands for clustered databases.
The cluster master acts as a lock server by coordinating all Caché ObjectScript Lock requests to maintain the logical integrity of the cluster-mounted databases. Caché servers communicate these requests to the cluster master over network connections, so ECP must be running on each computer that participates in the cluster. Even if an application issues a Lock command on a global using the extended bracket syntax [dir_name,dirset_name], or through a namespace mapped to a cluster-mounted database, the cluster master processes the command.

If you need to coordinate multiple global updates, you must use the Lock command when updating globals in cluster-mounted databases. Caché journaling technology uses lock information to coordinate updates to these databases so that journal restores work correctly in the event of cluster failover or recovery after a cluster crash.
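The coordination role of the lock server can be pictured with a small sketch. This is illustrative pseudocode in Python, not the actual DLM implementation, and all names are hypothetical: the master grants a lock to the first requester and queues later requesters until the owner releases it.

```python
from collections import defaultdict, deque

class ToyLockServer:
    """Hypothetical model of a cluster lock server: one owner per
    lock name; later requesters wait in FIFO order."""
    def __init__(self):
        self.owner = {}                    # lock name -> owning node
        self.waiters = defaultdict(deque)  # lock name -> queued nodes

    def request(self, node, name):
        if name not in self.owner:
            self.owner[name] = node        # granted immediately
            return "granted"
        self.waiters[name].append(node)    # must wait for release
        return "queued"

    def release(self, node, name):
        assert self.owner.get(name) == node, "only the owner may unlock"
        if self.waiters[name]:
            # hand the lock to the next waiter instead of freeing it
            self.owner[name] = self.waiters[name].popleft()
        else:
            del self.owner[name]

master = ToyLockServer()
print(master.request("NODE-A", "^counter"))  # granted
print(master.request("NODE-B", "^counter"))  # queued
master.release("NODE-A", "^counter")
print(master.owner["^counter"])              # NODE-B now holds the lock
```

Centralizing grants this way is what lets every node see one consistent ordering of Lock operations on cluster-mounted globals.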

6.2 Configuring a Caché Cluster

If you are running a Caché cluster, you must set up a network configuration; it is easiest to do so before creating your system configuration. Set up a Caché server configuration that links all the computers in your Caché cluster using ECP. See the Configuring Distributed Systems chapter of the Caché Distributed Data Management Guide for details.

By default, a Caché instance acts as a single system until you set specific cluster-related settings. On each cluster node, perform the following procedure:

1. Navigate to the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal.
2. Select Clusters from the Category list.
3. Enter a value for CommAddr, the IP address to advertise in the PIJ to the other cluster members. If the node you are configuring uses multiple IP addresses, see the Multiple Network Device Configuration section for additional information.
4. Set JoinCluster to true.
5. Enter the PIJDirectory location; this must be the same on each cluster node and is required when JoinCluster is set to true. The directory must exist.
6. Click Apply Changes and restart Caché for this information to take effect.

For detailed information on each of these settings, see the corresponding entries in the Caché Advanced Configuration Settings Reference: CommAddr, JoinCluster, PIJDirectory.

6.2.1 Multiple Network Device Configuration

If your network configuration contains multiple network devices, you must make sure that each cluster node can be identified by every other cluster node. For communication within the cluster, enter the host name or private IP address in the CommAddr field. Caché converts the IP address and stores the machine name in the PIJ file. If you use the node name, it must resolve to the same network segment on all cluster nodes.
For communication with clients that are not part of the cluster, enter the public IP address of the node in the ClientNodeName field using the following procedure:

1. Navigate to the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal.
2. Select Miscellaneous from the Category list.
3. Enter a value for ClientNodeName, the node name for this Caché server. The name you enter here is what is recorded in the PIJ.
4. Click Apply Changes and restart Caché for this information to take effect.

For detailed information on this setting, see the ClientNodeName entry in the Caché Advanced Configuration Settings Reference.

6.3 Managing Cluster Databases

The following sections provide information on database management within a cluster-networked system:

- Creating Caché Database Files
- Mounting Databases

Most examples are for Caché shared-all clusters on the OpenVMS platform.

6.3.1 Creating Caché Database Files

In a cluster, all CACHE.DAT database files must be created on disks that are cluster-accessible at the system level. Enter the name of the OpenVMS directory where you wish to create a new CACHE.DAT file. If you are using a directory on the current disk, enter only the directory name. If you are using a directory on another disk, or a disk connected to another cluster node, enter the device name and the directory name as follows:

   device_name:[dirname]

On an OpenVMS cluster, the device portion contains a controller name, regardless of whether the CACHE.DAT directories are cluster-mounted. For disks that are physically connected to one node in the cluster, the name is the same as the System Communications Services (SCS) node name of the computer serving the disk, and the parts are separated by a dollar sign. For example, DKA100, if physically served by node TEST, is known as TEST$DKA100:; Caché expands DKA100: to TEST$DKA100:.

If the disk is served by an independent controller array, it has a number that is both preceded and followed by dollar signs. For example, DKA100: on cluster controller 1 is $1$DKA100:.

6.3.2 Mounting Databases

When you create a database, you select whether or not to mount it at startup; the default is to mount at startup. Databases must be mounted explicitly via the mount command or at system startup via the database mount list, which is part of the network dataset configuration.

In a Caché cluster, all Caché databases must be on a disk that is mounted as cluster-accessible to all members of the Caché cluster, even if they are privately mounted at the Caché level. Include the following in your database mount list:

- All databases to be cluster-mounted by any cluster node.
- Any database namespace that contains implicitly mapped globals.

Caché mounts newly created databases privately. You can remount private databases as cluster databases using the mount utility: from the [Home] > [Databases] page of the System Management Portal, click Dismount and then Mount.

Mark cluster databases to be mounted at startup using the following procedure:

1. Navigate to the [Home] > [Configuration] > [Local Databases] page of the System Management Portal.
2. Click Edit in the appropriate database row.
3. Select the Mount Required at Startup check box.

6.3.3 Deleting a Cluster-Mounted Database

You cannot delete a cluster-mounted database (CACHE.DAT) file. If you attempt to delete it, you see the following message:

   ## ERROR while Deleting. Cannot delete a cluster-mounted database

You must dismount or privately mount the database before you can delete it.

6.4 Caché Startup

Once the network configuration is determined, the Caché startup procedure does the following:

1. Performs network initialization operations, including activation of the network daemons.
2. Mounts databases configured with the Mount Required at Startup check box selected. Caché displays information about each database it mounts. For example:

   Directory                      Mode
   VMS1$DKA0:[SYSM.V6D1-9206A]    Pvt
   VMS$DKA0:[DIR2]                Clu

If mount error conditions occur, they are reported to the terminal and to the cconsole.log file. If the ENQ daemon fails to start, see the cconsole.log file.

The first node to activate its ENQ daemon by cluster-mounting a Caché database becomes the cluster master for each cluster member. Normally, you include all cluster-mounted databases in the database mount list so that they are mounted at startup. Startup pauses with a message if you attempt to join a cluster during cluster failover.

6.5 Write Image Journaling and Clusters

Caché write image journaling allows the remaining cluster members to continue to function without database degradation or data loss if one cluster member goes down. In the cluster environment, the Write daemon on the first node to cluster-mount a database becomes the master Write daemon for the cluster; it creates the cluster-wide journal file, named CACHE.PIJ. In addition, each node, including the master, has its own image journal file, called CACHE.PIJxxx.

In a cluster environment, writes throughout the entire cluster freeze until the cause of the freeze is fixed. For more information, see the Write Image Journaling and Recovery chapter of this guide.

6.6 Cluster Backup

For privately mounted databases in a cluster, backups and journaling are the daily operations that allow you to recreate your database. In the event of a system failure that renders your database inaccessible, you can restore the backups and apply the changes in the journal to recreate it.

CAUTION: Always run backups of the cluster-mounted databases from the same machine in the cluster so that the backup history is complete.
Caché stores this information in a global in the manager's database.

If you are doing a full backup of a database that is mounted in cluster mode from multiple computers, always perform the backup from the same computer; this maintains an accurate backup history for the database.

The BACKUP utility permits you to back up and restore databases that are shared by multiple CPUs in a cluster environment.

Note: For cluster-mounted databases, InterSystems recommends another backup strategy, such as volume shadowing. Concurrent backup also works with clusters.

All databases must be mounted before you can back them up. The backup utility mounts any databases needed for the backup: it first tries to mount them privately; if that fails, it mounts them for clustered access. If a private mount fails and the system is not part of the cluster, or if the cluster mount fails, then you cannot back up the database. You receive an error message and can choose whether to continue or stop.

When backing up cluster-mounted databases, BACKUP must wait for all activity in the cluster to cease before it continues. For this reason, clustered systems may be suspended slightly longer during the various passes than when you back up a single node. The DBSIZE utility gives you the option of suspending the system while it makes its calculation; it also lets you suspend the cluster if any of the databases in the backup list is cluster-mounted when the calculation takes place.

The incremental backup software uses a Lock to prevent multiple backups from occurring at the same time. This method does not work across a cluster: you must ensure that only one backup at a time runs throughout an entire cluster whose members share the same database. The DBSIZE utility uses the same internal structures as the BACKUP utility, and DBSIZE tests the lock used by BACKUP; however, the same restriction applies: do not run DBSIZE on one cluster member while another cluster member is running a backup.
Otherwise, the backup will not be intact, and database degradation may result when you restore from that backup.

6.7 System Design Issues for Clusters

Be aware of the following design issues when configuring your Caché cluster system. You may need to adjust system parameters for your clustered system; see the appropriate platform appendix of the Caché Installation Guide for recommended calculations.

6.7.1 Determining Database File Availability

To mount the database files so that they function most efficiently in the cluster, determine which CACHE.DAT files need to be available to all users in the cluster, and mount these in cluster mode from within Caché. All WIJ, PIJ, and journal files must be on cluster-mounted disks. Also determine which CACHE.DAT files are needed by users on only one cluster node; mount these privately, or specify that they are automatic so that the system mounts them on reference.

6.8 Cluster Application Development Strategies

The key to performance in a cluster environment is to minimize disk contention among nodes for blocks in cluster-mounted directories.

6.8.1 Block-Level Contention

If a directory is cluster-mounted, all computers can access data in it with a simple reference. More than one computer can access a given block in the database at a time to read its data. However, if a computer wants to update a block, all other computers must first relinquish the block. If another computer wants to access that block before the completion of the Write daemon cycle, the computer that did the update must first write the changed block to disk (in such a way that the action can be reversed if that computer goes down). The other computers can again read that block until one of them wants to modify it.

If a great deal of modification is done to a database from all cluster members, a significant amount of time-consuming I/O processing occurs to make sure each member sees the most recent copy of a block. You can use various strategies to minimize the amount of disk I/O when a particular database is modified by multiple computers.

Mount Directories Privately

If a database is not used frequently by other nodes, mount the database privately from the node that uses it most frequently. When other nodes need to access it, they can use a remote network reference.
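The cost of block-level contention can be illustrated with a toy model. This is illustrative pseudocode in Python, not Caché internals: every time a node updates a block that a different node last wrote, the previous writer must flush the block to disk first.

```python
def forced_writes(updates):
    """Toy model of block ping-pong in a cluster: each update is a
    (node, block) pair. A forced disk write occurs whenever a node
    updates a block last modified by a *different* node."""
    last_writer = {}   # block -> node that last updated it
    forced = 0
    for node, block in updates:
        if block in last_writer and last_writer[block] != node:
            forced += 1                # other node must flush the block
        last_writer[block] = node
    return forced

# Two nodes alternately updating the same block: contention on every update.
print(forced_writes([("A", 1), ("B", 1), ("A", 1), ("B", 1)]))  # 3

# The same updates spread across per-node blocks: no contention.
print(forced_writes([("A", 1), ("B", 2), ("A", 1), ("B", 2)]))  # 0
```

The strategies in this section (private mounting, local counter storage) all work by steering different nodes' updates into different blocks, driving the forced-write count toward zero.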
Use Local Storage of Counters

Contention for disk blocks is most common in the case of updating counters. To minimize this problem, code your applications so that groups of counters (for example, 10) are allocated per remote request to a local private directory. Thereafter, whenever a local process needs a new counter index number, it first checks the private directory to see if one of the ten is available. If not, it then goes to allocate

a new set of ten counters from the clustered directory. You can use $INCREMENT to update a counter and retrieve its value in a single operation.

Note: This is also a good strategy for nonclustered networked systems. In addition to reducing contention when accessing counters, this technique also enhances access to records that use those counters. Because a system obtains contiguous counters, block splitting combined with the Caché collating sequence causes records created by different nodes to be located in different areas of the database. Therefore, processes on different nodes perform their Set and Kill operations on different blocks and are no longer in contention, thus reducing disk I/O.

6.9 Caché ObjectScript Language Features

The following sections provide information about Caché ObjectScript language features with implications for cluster-mounted database systems.

6.9.1 Remote Caché ObjectScript Locks

The following information details remote locks under Caché ObjectScript with respect to a cluster environment.

Remote Lock Handling

Information about remote locks is stored in two places:

- In the lock table on the system requesting the lock (the client)
- In the lock table on the system to which the lock request is directed (the server)

The server for items in a cluster-mounted database is always the lock server (cluster master). When a process on the client system needs to acquire a remote lock, it first checks to see whether an entry is already present in the client lock table, indicating that another process on that same computer already has a lock on the remote item. If a lock already exists for the desired global, the process queues up for that lock, just as it would for a local lock; no network transmissions are required. If the needed remote lock is not present in the client's lock table, the client process creates an entry in the local lock table and sends a network request for it.
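The client-side decision just described can be sketched as follows. This is illustrative pseudocode in Python with hypothetical names, not the actual lock-table implementation: an entry already present locally means the process queues without any network traffic.

```python
def acquire_remote_lock(name, client_table, send_request):
    """Toy sketch of client-side remote lock handling: if the item is
    already in the local (client) lock table, queue locally and avoid
    network traffic; otherwise record it locally and ask the server."""
    if name in client_table:
        client_table[name]["queue"].append("this-process")  # local queue, no network I/O
        return "queued locally"
    client_table[name] = {"queue": []}                      # new local entry
    send_request(name)                                      # network request to lock server
    return "requested from server"

sent = []
table = {}
print(acquire_remote_lock("^inv(42)", table, sent.append))  # requested from server
print(acquire_remote_lock("^inv(42)", table, sent.append))  # queued locally
print(sent)                                                 # one network request: ['^inv(42)']
```

The point of the two-table scheme is visible here: repeated contention on the same item from one machine costs no additional round trips to the lock server.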
If the reference resolves to a careted lock in a cluster-mounted database, the lock request is automatically sent to the Cluster Master.

Once the client process receives the lock acknowledgment from the remote computer, an entry identifying the process making the lock is present in its own (client) lock table, and an entry identifying the remote computer (but not the process) that made the lock exists in the server's lock table.

If any of the network requests fail, the client process must remove all the locks from the local lock table. It must also send network unlock requests for any network locks it actually acquired when locking multiple items at one time.

When a process has completed its update, it issues an UNLOCK command. If it is an incremental unlock, it is handled in the local lock table. If it is the last incremental unlock, or if it is not an incremental unlock, then an unlock request is sent to the server.

Note: If another process on the local machine has queued for the lock, rather than releasing the lock on the server, Caché may grant it to the waiting process. This is called lock conversion.

Remote Lock Commands by Extended Reference

All extended references used in remote lock commands should use the same directory specification. This includes consistency between uppercase and lowercase; for example, VMS2$SYS is not equivalent to vms2$sys. If you use logicals, all processes and applications must use the same logical name, not merely names that resolve to the same physical directory. In addition, logicals must be defined the same way on all cluster members, as well as by all processes running on each member. System managers and application developers need to work together to maintain consistency. This limitation is consistent with the ANSI standard regarding Caché ObjectScript locks and remote reference syntax.

Note: In a cluster, references to remote globals in cluster-mounted databases can be made as simple references.
However, certain techniques you may want to use to minimize disk contention require the use of extended references.

6.10 DCP and UDP Networking

If you are configuring a cluster with the legacy DCP technology, InterSystems recommends that you always specify the local address in the network definition as the IP address that the other cluster members use to talk to the local machine. You can determine the default system IP address using the following commands:

   SET hostname=$system.server.hostname() ;get our host name
   SET ipaddr=$p($system.server.ipaddresses(hostname),",") ;lookup 1st IP address

If UDP networking is being used for the cluster network traffic, a node cannot use the IP address when it joins the cluster. If the network definition for a system specifies as the local IP address, Caché attempts to determine the real IP address for the system when it tries to join a cluster. If the system has only one IP address, this is not an issue. If the system has multiple IP addresses, Caché picks the first one returned by gethostbyname() to use in the cluster. If this is the incorrect IP address, a cluster crash is declared. You can avoid this problem by always specifying the local address in the network definition.
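The multiple-address pitfall above can be illustrated with a sketch. This is a hypothetical helper in Python, not part of Caché: instead of blindly taking the first address a name lookup returns, select the address that belongs to the cluster's private network.

```python
import ipaddress

def pick_cluster_address(candidates, cluster_network):
    """Choose the address on the cluster's network instead of taking
    the first resolved address (the gethostbyname() behavior described
    above, which can pick the wrong interface on multihomed hosts)."""
    net = ipaddress.ip_network(cluster_network)
    for addr in candidates:
        if ipaddress.ip_address(addr) in net:
            return addr
    raise LookupError("no address on the cluster network")

# A host with a public and a private interface; the first-returned
# address would be the wrong one for cluster traffic here.
addrs = ["203.0.113.7", "192.168.10.5"]
print(pick_cluster_address(addrs, "192.168.10.0/24"))  # 192.168.10.5
```

This is the reasoning behind the recommendation: explicitly configuring the local address removes any dependence on resolver ordering.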

7 Cluster Journaling

This chapter contains information about journaling on ECP-based shared-disk clustered systems in Caché. It discusses the following topics:

- Journaling on Clusters
- Cluster Failover
- Cluster Shadowing
- Tools and Utilities

For related information, see the following chapters in this guide:

- Journaling
- Shadow Journaling
- Cluster Management
- Backup and Restore

7.1 Journaling on Clusters

Journaling is necessary for cluster failover, to bring the databases up to date, and to use transaction processing. Each node in a cluster maintains its own journal files, which must be accessible to all other nodes in the cluster to ensure recoverability. A cluster session ID (CSI) is the time when the session begins, that is, the cluster start time; it is stored in the header of every journal file on a clustered system.

In addition to the information journaled on a nonclustered system, the following specifics apply to clustered systems:

- Updates to clustered databases are always journaled (usually on the master node only), except for scratch globals. On a clustered database, even globals whose database journaling attribute is No are journaled, regardless of whether they are updated outside or within a transaction.
- Database updates via the $Increment function are journaled on the master node, as well as on the local node if it is not the master.
- Other updates are journaled locally if so configured.

The journal files on clustered systems are organized using the following:

- Cluster Journal Log
- Cluster Journal Sequence Numbers

The default location of the journal files is the manager's directory of the Caché instance. InterSystems recommends isolating journal files from the database files (CACHE.DAT files) by changing the journal file location to a separate disk before any activity takes place on the system.

CAUTION: Do not stop journaling in a cluster environment, although it is possible to do so with the ^JOURNAL routine. If you do, the recovery procedure is vulnerable until the next backup.

7.1.1 Cluster Journal Log

Journal files used by members of a cluster are logged in a file, CACHEJRN.LOG, located in the cluster pre-image journal (PIJ) directory. It contains a list of journal files maintained by nodes while they are part of the cluster. Journal files maintained by a node when it is not part of the cluster may not appear in the cluster journal log. Here is an example of part of a log:

   0,_$1$DRA1:[TEST.50.MGR.JOURNAL]
   ,_$1$DKA0:[TEST.50.MGR.JOURNAL]
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL]
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL]
   ,_$1$DRA1:[TEST.5A.MGR.JOURNAL]
   ,_$1$DRA1:[TEST.5A.MGR.JOURNAL]

The first value in each comma-delimited row is the cluster system number (CSN) of the node to which the journal file (the second field) belongs.
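A sketch of reading such a log is shown below. This is illustrative Python, not an InterSystems utility, and it assumes only the record shape described above: one comma-delimited record per line, CSN first, journal file specification second.

```python
from collections import defaultdict

def parse_cluster_journal_log(text):
    """Group journal file specifications by cluster system number (CSN).
    Assumes each non-empty line has the form 'CSN,journal-file-spec'."""
    by_csn = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        csn, journal = line.split(",", 1)   # split on the first comma only
        by_csn[int(csn)].append(journal)
    return dict(by_csn)

log = """0,_$1$DRA1:[TEST.50.MGR.JOURNAL]
1,_$1$DKA0:[TEST.50.MGR.JOURNAL]
1,_$1$DRA1:[TEST.5A.MGR.JOURNAL]"""
print(parse_cluster_journal_log(log))
```

Grouping by CSN mirrors what a cluster journal restore needs: the full set of journal files per node, including nodes that have since left the cluster.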
The log is useful for locating the journal files of all the members of a cluster, especially members that have left the cluster. Note that the CSN of a node may change when it restarts.

When a node joins the cluster, its current journal file is added to the journal log. Processes that start journaling or switch journal files also add entries. The log is used in a cluster journal restore, by shadowing, and by the journal dump utility.

7.1.2 Cluster Journal Sequence Numbers

InterSystems recommends locking globals in cluster-mounted databases if you require a record of the sequence of updates; the Lock command is the tool Caché uses to record the time-sequencing of updates in the journal files of the cluster nodes. If cluster failover occurs, the journals of all the nodes can be applied in the proper order. Updates may not be dejournaled in the same order as they originally occurred, but they are valid with respect to the synchronization guaranteed by the Lock command and the $Increment function.

To restore clustered databases properly from the journal files, the updates must be applied in the order in which they occurred. The cluster journal sequence number, which is part of every journal entry of a database update, and the cluster session ID, which is part of the journal header, provide a way to sequence transactions from the journal files of all the cluster members. For each cluster session, the sequence number starts at 1 and can go as high as 2**64-1. The master node of a cluster maintains a master copy of the sequence number, which is incremented during database mounting and with each use of Lock and $Increment. The master value of the sequence propagates to one or all cluster nodes, depending on the type of operation.

The sequence number is used by cluster journal restore and by shadowing, which is a special form of journal restore. Both utilities operate on the assumption that the sequence number increases monotonically during a cluster session.

At the end of a backup, the cluster journal sequence number on all cluster nodes is incremented to be higher than the previous master cluster journal sequence number.
Then a journal marker bearing the new cluster journal sequence number is placed in the current local journal file. In a sense, the journal marker serves as a barrier of cluster journal sequence numbers, separating the journaled database updates that are covered in the backup from those that are not. Following the restore of the backup, cluster journal restore can start from the cluster journal sequence number of the journal marker and move forward. You can also set your own journal markers using the ^JRNMARK utility; see Setting Journal Markers on a Clustered System for details.

7.2 Cluster Failover

The Caché cluster failover process protects the integrity of data on other cluster nodes when one cluster member fails, allowing the remaining cluster members to continue to function. The following conditions must be met for cluster failover to work successfully:

- All directories containing CACHE.DAT files must be accessible to all surviving nodes.
- Journaling must be enabled at all times while Caché is running.
- Networking must be properly configured.

If a cluster member fails, the cluster master executes cluster failover. If the master is the failing node, the cluster member that least recently joined the cluster becomes the new master and executes failover. Cluster failover consists of two phases.

In the first phase, the cluster master does the following:

- Checks the cluster PIJ and the write image journal (CACHE.WIJ) files on each node to determine what recovery is needed from these files.
- Executes recovery from the WIJ files to all databases that had been mounted in cluster mode.

If an error occurs during this phase, the cluster crashes and further cluster recovery must take place.

In the second phase, the cluster master does the following:

- Mounts databases in private mode, as required, to restore the journals of all cluster members; it attempts to mount the databases in cluster mode if it cannot mount them in private mode.
- Restores any Caché journal entries after the current index kept in the CACHE.WIJ file for each cluster member's journal. For details, see Cluster Restore.
- Rolls back incomplete transactions in the failed node's journal file.
- Reforms the lock table if it is the new cluster master; otherwise, discards the locks of the failing node.

During failover, the journals from all cluster members are applied to the databases and any incomplete transactions are rolled back. If cluster failover completes successfully, there is no database degradation or data loss on the surviving nodes; there is only minimal data loss (typically less than the last second, and not visible to other cluster members) from the failing node. If failover is unsuccessful, the cluster crashes and you must shut down all cluster nodes before restarting the cluster.
See the Failover Error Conditions section for more details.

7.2.1 Cluster Recovery

Recovery occurs when Caché stops on any cluster member; the procedure differs depending on how Caché stops. During successful failover, the recovery procedure is fairly straightforward and automatic. If, however, a clustered system crashes, the recovery is more complex. Following a cluster crash, a clustered Caché node cannot be restarted until all of the nodes in the cluster have been stopped. If a cluster member attempts Caché startup or tries to join the cluster by

cluster-mounting a disk before all other cluster members have been stopped, the following message is displayed:

   ENQ daemon failed to start because cluster is crashed.

Once all members are stopped, start each node. The first node that starts runs the Caché recovery procedure if it detects that there was an abnormal system shutdown; this node becomes the new cluster master. Cluster members that are not the cluster master are frequently referred to as slave nodes.

The Recovery daemon (RCVRYDMN) performs recovery on the surviving or new master node, based on whether the crashed node was the master or a slave. To enable recovery, each node's databases and WIJ files must be accessible cluster-wide. The master is responsible for managing recovery cluster-wide based on the WIJ file on each node.

When a slave crashes, the Recovery daemon on the master node does the following:

1. It uses the journal information provided by the WIJ files to apply journal files on all cluster nodes (including the one that crashed) from a common starting point, as with all cluster journal restores.
2. It rolls back all incomplete transactions on the crashed system. Again, the rollbacks are journaled, this time on the host system of the Recovery daemon. For this reason, if you restore journal files, it is safer to do a cluster journal restore than a stand-alone journal restore, as the rollback of an incomplete transaction in one node's journal may be journaled on another node.

When the (former) master crashes, the Recovery daemon does the following:

1. As in step 1 above, it applies the journal files on all the cluster nodes.
2. In between the two other steps, it adjusts the cluster journal sequence number on its host system (the new master) so that it is higher than that of the last journaled entry on the crashed system (the old master). This guarantees the monotonically increasing property of the cluster journal sequence across cluster-wide journal files.
3.
As above, all incomplete transactions are rolled back on the crashed system.

If the last remaining node of a cluster crashes, restarting the first node of the cluster involves cluster journal recovery, which includes rolling back any transactions that were open (uncommitted) at the time of the crash.

7.2.2 Cluster Restore

Typically, journal files are applied after backups are restored to bring databases up to date or up to the point of a crash. If nodes have not left or joined the cluster since the last backup, you can restore the journal files starting from the marker corresponding to the backup. If one or more nodes have joined the cluster since the last backup, the restore is more complex.

A node joins a cluster either when it restarts with a proper configuration or when it cluster-mounts a database after startup (as long as you properly set up other parameters, such as the PIJ directory, at startup). To

make journal restore easier in the latter case, switch the journal file on the node as soon as it joins the cluster.

For each node that has joined the cluster since the last backup of the cluster:

1. Restore the latest backup of the node.
2. If the backup occurred before the node joined the cluster, restore the private journal files from where the backup ends up to the point when the node joined the cluster. (You can make this easier by switching the journal file when the node joins the cluster.)
3. Restore the latest cluster backup.
4. Restore the cluster journal files starting from where the backup ends.

See Cluster Journal Restore for detailed information about running the utility.

This procedure works well for restoring databases that were privately mounted on nodes before they joined the cluster and then cluster-mounted afterward. It is based on the following assumptions:

- A cluster backup covers all cluster-mounted databases, and a system-only backup covers private databases that, by definition, are not accessible to other systems of the cluster.
- The nodes did not leave and rejoin the cluster since the last cluster backup.

In more complicated scenarios, these assumptions may not hold. The first assumption becomes false if, say, rather than centralizing backups of all cluster-mounted databases on one node, you configure each node to back up selected cluster-mounted databases along with its private databases. In this case, you may have to take a decentralized approach by restoring one database at a time. For each database, the restore procedure is essentially the same:

1. Restore the latest backup that covers the database.
2. Restore the private journal files up to the point when the node joined the cluster, if that point postdates the backup.
3. Restore the cluster journal files from that point forward.
CAUTION: Even if a database has always been privately mounted on the same node, it is safer to restore the cluster journal files than to apply only the journal files of that node. If the node crashed or was shut down while it was part of the cluster, open transactions on the node would have been rolled back by, and journaled on, a surviving node of the cluster. Restoring the cluster journal files ensures that you do not miss such rollbacks recorded in the journal files of other nodes.

InterSystems does not recommend or support the scenario in which a node joins and leaves the cluster multiple times.

Failover Error Conditions

When a cluster member fails, the other cluster members notice a short pause while failover occurs. In rare situations, some processes on surviving cluster nodes may receive <CLUSTERFAILED> errors. You can trap these errors with the $ZTRAP error-trapping mechanism.

Cluster failover does not work if one of the following is true:

- One or more cluster members go down during the failover process.
- There is a disk drive failure and the first failover phase encounters an error.
- One of the surviving cluster members does not have a Recovery daemon.

If failover is unsuccessful, the cluster crashes and the following message appears at the operator's console:

****** Caché : CLUSTER CRASH - ALL Caché SYSTEMS ARE SUSPENDED ******

The other cluster members freeze when Caché processes reach a Set or Kill command. Examine the failover log, which is contained in the console log (normally cconsole.log in the manager's directory) of the cluster master, to see the error messages generated during the failover process.

If a cluster member that failed attempts startup, or a node tries to join the cluster by cluster-mounting a database while the cluster is in failover, the following message is displayed:

The cluster appears to be attempting to recover from the failure
of one or more members at this time.
Waiting 45 seconds for failover to complete...

A period (.) appears every five seconds until the active recovery phase completes. The mount or startup then proceeds. If the cluster crashes during this time, the following message is displayed:

ENQ daemon failed to start because cluster is crashed.

See the Cluster Recovery section for an explanation of what happens when a cluster crashes.

7.3 Cluster Shadowing

The use of journaling in a clustered system also makes it possible to shadow a Caché cluster. In a cluster, each node manages its own journal files, which contain data involving private or cluster-mounted databases.
The shadow mirrors the changes to the databases (assuming all changes are journaled) on a Caché system that is connected to the cluster via TCP. The following diagram gives an overview of the cluster shadowing process:

Cluster Shadowing Overview

The destination shadow connects to the specified Caché superserver on the cluster, requesting a list of journal files at or after the specified start location (the combination of cluster start time and cluster journal sequence number), one starting file for each cluster member.

For each node (with a unique CSN) returned from the source cluster, the shadow starts a copier process that copies journal files, starting with the file returned, from the server to the shadow. Each copier acts as a semi-independent shadow itself, similar to a nonclustered block-mode shadow. Once all copiers are up and running, the cluster shadow starts a dejournaling process that applies journal entries from the copied journal files to the databases on the shadow side, respecting the cluster journal sequence numbers of each journal record. The cluster shadow maintains a list of current live members (including port numbers and IP addresses) of the cluster, which it receives from the source cluster.

The following sections describe what information is necessary and the procedures involved in setting up a cluster shadow, as well as the limitations to the completeness and timeliness of the shadow databases:

- Configuring a Cluster Shadow
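The dejournaling pass described above amounts to a k-way merge of the per-member journal streams by cluster journal sequence number. The following Python sketch models that ordering; it is illustrative only, not Caché internals, and the record tuples are hypothetical:

```python
import heapq

def dejournal(streams):
    """Merge per-member journal streams by cluster journal sequence number.

    Each stream is a list of (sequence_number, record) tuples, already in
    order within that member's copied journal files; the merge yields
    records in global cluster sequence order, which is the order the
    shadow's dejournaler must respect when applying them.
    """
    applied = []
    for seq, record in heapq.merge(*streams, key=lambda r: r[0]):
        applied.append((seq, record))  # stand-in for applying to a database
    return applied

merged = dejournal([
    [(1, "SET ^A=1"), (4, "KILL ^A")],   # copier for one cluster member
    [(2, "SET ^B=2"), (3, "SET ^B=3")],  # copier for another member
])
```

Because each copier's stream is already sorted, a streaming merge suffices; no global sort of all journal records is needed.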

- Cluster Shadowing Limitations

Note: The shadow does not have to be a clustered system. The word "cluster" in cluster shadowing refers to the source database server, not the shadow.

Configuring a Cluster Shadow

You must provide several types of information to properly configure a cluster shadow. An overview of the required data items is divided into the following categories:

Establishing a Connection

Although a cluster is identified by the PIJ directory that all its member nodes share, the uniqueness of the identifier does not go beyond the physical cluster that hosts the Caché cluster. The shadow needs a way to make a TCP connection to the source cluster; therefore, on the shadow you must specify the IP address or host name of one member of the Caché cluster and the port number of the superserver running on that member. Also provide the shadow with a unique identity to distinguish it from other shadows, if any, in the same Caché instance.

Identifying the Starting Location

Configure the shadow to identify the journal starting location: a cluster start time (the cluster session ID) and, optionally, a cluster journal sequence number for dejournaling. If you do not specify a cluster journal sequence number, dejournaling starts at the beginning of the cluster session.

Copying Journal Files

As with nonclustered shadowing, specify a directory in which to put the journal files copied over from the cluster. However, a single directory is not adequate; journal files from different members of the cluster must be kept separate. The directory you specify serves as the parent of the directories for the shadow copies of the journal files. In fact, the shadow creates directories on the fly to keep up with the dynamic nature of the cluster components.
At run time, for each journal directory on the server cluster, the shadow sets up a distinct subdirectory under the user-specified parent directory and copies journal files from a journal directory on the server to its corresponding directory on the shadow; this is called redirection of journal files. The subdirectories are named by sequential numbers, starting with 1. You can override a redirection by specifying a different directory on the shadow for a given journal directory on the server.

Redirecting Dejournaled Transactions

As with nonclustered shadowing, specify database mappings, or redirections of dejournaled Set and Kill transactions.

There are two ways to provide the information to set up a cluster destination shadow:

- Using the System Management Portal
- Using Caché Routines
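The default journal-file redirection scheme described above (one numbered subdirectory per source journal directory) can be modeled as follows. This is an illustrative Python sketch, not Caché code; the directory names are hypothetical:

```python
def default_redirections(parent, server_journal_dirs):
    """Assign each server journal directory a numbered shadow subdirectory.

    Subdirectories under the shadow's parent directory are named by
    sequential numbers starting with 1, one per source journal directory,
    keeping each cluster member's journal files separate.
    """
    mapping = {}
    for n, src in enumerate(server_journal_dirs, start=1):
        mapping[src] = f"{parent}/{n}"
    return mapping

redir = default_redirections("/shadow/jrn",
                             ["/clu/a/journal", "/clu/b/journal"])
```

A new source journal directory appearing at run time would simply receive the next sequential number, which is how the shadow keeps up with the cluster's dynamic membership.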

Using the System Management Portal

You can configure a shadow server using the System Management Portal. Perform the following steps:

1. From the [Home] > [Configuration] > [Shadow Server Settings] page of the System Management Portal, follow the procedure described in the Configuring the Destination Shadow section of the Shadow Journaling chapter of this guide. Use the following specifics particular to cluster shadowing:

   a. Database Server: Enter the IP address or host name (DNS) of one member of the source cluster to which the shadow will connect.

   b. Database Server Port #: Enter the superserver port number of the cluster member specified in the previous step.

2. After entering the location information for the source instance, click Select Source Event to choose where to begin shadowing. A page displays the available cluster events from the cluster journal file directory.

3. Click Advanced to enter the following optional field: Journal file directory: Enter the full name, including the path, of the journal file directory on the destination shadow system, which serves as the parent directory of the shadow journal file subdirectories created automatically by the shadow for each journal directory on the source cluster. Click Browse for help in finding the proper directory.

4. After you successfully save the configuration settings, add database mappings from the cluster to the shadow.

5. Next to Database mapping for this shadow, click Add to associate a database on the source system with a directory on the destination system using the Add Shadow Mapping dialog box.

6. In the Source database directory box, enter the physical pathname of the source database file (the CACHE.DAT file). Enter the pathname of its corresponding destination shadow database file in the Shadow database directory box, and then click Save.

7. Verify any pre-filled mappings and click Delete next to any invalid or unwanted mappings. Shadowing requires at least one database mapping to start.

8. Start shadowing.

Using Caché Routines

You can also use the shadowing routines provided by Caché to configure the cluster shadow. Each of the examples in this section sets a return code as the result of executing the routine; after executing it, check the return code, which is 1 for success or an error otherwise.

To initially configure the cluster shadow:

Set rc=$$configclushdw^shdwx(shadow_id,server,jrndir,begloc)

where:

- shadow_id: A string that uniquely identifies the shadow.
- server: The superserver port number and the IP address or host name of one cluster node, delimited by a comma.
- jrndir: The parent directory of the shadow journal file subdirectories, where journal files fetched from the source cluster are stored on the destination shadow, one subdirectory for each journal directory on the source cluster.
- begloc: The beginning location, consisting of a cluster session ID (cluster startup time, in YYYYMMDD HH:MM:SS format) and a cluster journal sequence number, delimited by a comma.

You can run the routine again to change the values of server, jrndir, or begloc. If you specify a new value for jrndir, only subsequent journal files fetched from the source are stored in the new location; journal files already in the old location remain there.

You can also direct the shadow to store journal files fetched from the journal directory remdir on the source in a local repository locdir, instead of the default subdirectory of jrndir. Again, this change affects only journal files to be fetched, not the journal files that have been or are being fetched:

Set rc=$$configjrndir^shdwx(shadow_id,locdir,remdir)

The last mandatory piece of information about a shadow, Set and Kill transaction redirection, can be given as follows:

Set rc=$$configdbmap^shdwx(shadow_id,locdir,remdir)

This specifies that Set and Kill transactions from the source directory remdir be redirected to the shadow directory locdir. Unlike journal files, there is no default redirection for a source database: if it is not explicitly redirected, Set and Kill transactions from that database are ignored by the dejournaling process of the shadow.

Finally, to start and stop shadowing:

Set rc=$$start1^shdwcli("test")
Set rc=$$stop1^shdwcli("test")

See Shadow Information Global and Utilities for more information.

Cluster Shadowing Limitations

There are a few limitations in the cluster shadowing process.
Database updates that are journaled outside a cluster are not shadowed. Here are two examples:

- After a cluster shuts down, if a former member of the cluster starts up as a stand-alone system and issues updates to some (formerly clustered) databases, the updates do not appear on the shadow.

- After a formerly stand-alone system joins a cluster, the new updates made to its private databases appear on the shadow (if they are defined in the database mapping), but none of the updates made before the system joined the cluster appear. For this reason, joining a cluster on the fly (by cluster-mounting a database) should be planned carefully in coordination with any shadow of the cluster.

In cluster shadowing, there is latency that affects the dejournaler. Journal files on the destination shadow side are not necessarily as up to date as what has been journaled on the source cluster. The shadow applies production journals asynchronously so as not to affect performance on the production server. This results in possible latency in the data applied to the shadow.

Only one Caché cluster can be the target of a cluster shadow at any time, although there can be multiple shadows on one machine. There is no guarantee regarding the interactions between multiple shadows; it is the user's responsibility to ensure that they are mutually exclusive.

Note: Exclude Caché databases from RTVScan when using Symantec AntiVirus software to avoid the condition of the cluster shadow hanging on Windows XP. See the Release Notes for Symantec AntiVirus Corporate Edition for detailed information.
7.4 Tools and Utilities

The following tools and utilities are helpful in cluster journaling processes:

- Cluster Journal Restore: ^JRNRESTO
- Journal Dump Utility: ^JRNDUMP
- Startup Recovery Routine: ^STURECOV
- Setting Journal Markers on a Clustered System: ^JRNMARK
- Cluster Journal Information Global: ^%SYS("JRNINFO")
- Shadow Information Global and Utilities: ^SYS("shdwcli")

7.5 Cluster Journal Restore

The cluster journal restore procedure allows you to start or end a restore using the journal markers placed in the journal files by a Caché backup. You can run a cluster journal restore either as part of a backup restore or as a stand-alone procedure.

Caché includes an entry point to the journal restore interface for performing specific cluster journal restore operations. From the %SYS namespace, run the following:

Do CLUMENU^JRNRESTO

This invokes a menu that includes the following options:

1. Perform a cluster journal restore.
2. Generate a common journal file from specific journal files.
3. Perform a cluster journal restore after a backup restore.
4. Perform a cluster journal restore based on Caché backups.

Perform a Cluster Journal Restore

The first option of the cluster journal restore menu allows you to run the general journal restore on a clustered or nonclustered system. It is the equivalent of running ^JRNRESTO and answering Yes to the "Cluster journal restore?" prompt:

%SYS>Do ^JRNRESTO

This utility uses the contents of journal files
to bring globals up to date from a backup.
Replication is not enabled.
Restore the Journal? Yes => Yes
Cluster journal restore? Yes

You are asked to describe the databases to be restored from the journal and the starting and ending points of the restore. The starting and ending points can be based on a backup, on a set of journal markers, at a cluster start, or at any arbitrary point in the journal. The interface prompts for directory information, including any redirection specifics, and whether all databases and globals are to be processed. For example:

Directory: _$1$DKB300:[TEST.CLU.5X]
Redirect to Directory: _$1$DKB300:[TEST.CLU.5X] =>
_$1$DKB300:[TEST.CLU.5X] --> _$1$DKB300:[TEST.CLU.5X]
Restore all globals in _$1$DKB300:[TEST.CLU.5X]? Yes => Yes
Directory:

For each directory you enter, you are asked if you want to redirect: enter the name of the directory to which to restore the dejournaled globals; if it is the same directory, enter a period (.) or press Enter. Also specify for each directory whether you want to restore all journaled globals. Enter Yes or press Enter to apply all global changes to the database and continue with the next directory. Otherwise, enter No to restore only selected globals.

At the Global^ prompt, enter the names of the specific globals you want to restore from the journal. You may select patterns of globals by using the asterisk (*) to match any number of characters and the question mark (?) to match any single character. Enter ?L to list the currently selected list of globals. When you have entered all your selected globals, press Enter at the Global^ prompt and enter the next directory. When you have entered all directories, press Enter at the Directory prompt. Your restore specifications are displayed, as shown in this example:

Restoring globals from the following clustered datasets:
 1. _$1$DKB300:[TEST.CLU.5X]   All Globals
Specifications for Journal Restore Correct? Yes => Yes
Updates will not be replicated

Verify the information you entered before continuing with the cluster journal restore. Answer Yes or press Enter if the settings are correct; answer No to repeat the process of entering directories and globals.

Once you verify the directory and global specifications, the Main Settings menu of the cluster journal restore setup process is displayed with the current default settings, as shown in the following example:

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DKB400:[TEST.5X]CACHEJRN.LOG
   with NO redirections of journal files
2. To START restore at the beginning of cluster session < :35:37>
3. To STOP restore at sequence #319 of cluster session < :35:37>
   ,_$1$DRA2:[TEST.5Y.JOURNAL]
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue):

From this menu you may choose to modify any of the default values of the five settings by entering its menu item number:

1. Change the source of the restore.
2. Change the starting point of the restore.
3. Change the ending point of the restore.
4. Toggle the switching journal file setting.
5. Toggle the disable journaling setting.

After each modification, the Main Settings menu is displayed again, and you are asked to verify the information you entered before the restore begins. The following is an example of how the menu may look after several changes:

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DKB400:[TEST.5Y.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DKB400:[TEST.5X.JOURNAL] -> _$1$DRA2:[TEST.5X.JOURNAL]
   _$1$DKB400:[TEST.5Y.JOURNAL] -> _$1$DRA2:[TEST.5Y.JOURNAL]
   _$1$DKB400:[TEST.5Z.JOURNAL] -> _$1$DRA2:[TEST.5Z.JOURNAL]
2. To START restore at the journal marker located at
   ,_$1$DKB400:[TEST.5X.JOURNAL] -> _$1$DRA2:[TEST.5X.JOURNAL]
3. To STOP restore at the journal marker located at
   ,_$1$DKB400:[TEST.5X.JOURNAL] -> _$1$DRA2:[TEST.5X.JOURNAL]
4. NOT to SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue):
Start journal restore?

Press Enter to accept the settings and continue. If you are using the journal log of the current cluster, you are informed that the restore will stop at the currently marked journal location and asked if you want to start the restore:

Select an item to modify ('Q' to quit or ENTER to accept and continue):
To stop restore at currently marked journal location
   offset of _$1$DRA1:[TEST.50.MGR.JOURNAL]
Start journal restore?

Enter Yes to begin the cluster journal restore. Once the restore finishes, your system is ready for activity. Enter No to go back to the main menu, where you can continue to make changes to the cluster journal restore setup, or enter Q to abort the cluster journal restore. After aborting the cluster journal restore, you can run a private journal restore or abort the restore process entirely:

Select an item to modify ('Q' to quit or ENTER to accept and continue): Q
Run private journal restore instead? No
[Journal restore aborted]
Replication Enabled

Change the Source of the Restore

The first item on the Main Settings menu contains the information required to find the journal files for all the cluster members. The information has two elements:

- Cluster journal log: a list of journal files and their original full paths.
- Redirection of journal files: necessary only if the system where you are running the restore is not part of the cluster associated with the journal log.
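The redirection element pairs original journal locations with current ones; the restore utility matches on leading characters, so a full or partial original directory name redirects everything under it. The following Python sketch models that lookup only; it is not Caché code, and the paths are hypothetical:

```python
def redirect(path, redirections):
    """Map an original journal file path to its current location.

    redirections is a list of (original_prefix, new_prefix) pairs; the
    first pair whose original_prefix matches the leading characters of
    the path wins. Unmatched paths are assumed to still be in place.
    """
    for orig, new in redirections:
        if path.startswith(orig):
            return new + path[len(orig):]
    return path

rules = [("_$1$DRA1:", "_$1$DRA2:")]
moved = redirect("_$1$DRA1:[TEST.50.MGR.JOURNAL]20080101.001", rules)
```

Because matching is by prefix, a single rule such as the one above relocates the journal directories of every cluster member that lived on the original device, which is the many-to-one behavior described later in this section.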

By default, the restore uses the cluster journal log associated with the current clustered system. If you are running the restore on a nonclustered system, you are prompted for a cluster journal log before the main menu is displayed.

Choose this option to restore the journal files on a different Caché cluster from the one that owns the journal files. You can either:

- Identify the cluster journal log used by the original cluster.
- Create a cluster journal log that specifies where to locate the journal files.

Note: The option to redirect the journal files is available only if the specified cluster journal log is not that of the current cluster.

Identify the Cluster Journal Log

The Journal File Information menu displays the cluster journal log file to be used in the cluster journal restore. If the journal files on the original cluster are not accessible to the current cluster, copy them to a location accessible to the current cluster and specify how to locate them by entering redirection information. Enter I to identify the journal log used by the original cluster:

Select an item to modify ('Q' to quit or ENTER to accept and continue): 1

Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50]CACHEJRN.LOG
- OR -
[C]reate a cluster journal log by specifying where journal files are
Selection (ENTER if no change): I

*** WARNING ***
If you specify a cluster journal log different from current one, you may
need to reenter info on journal redirection, restore range, etc.

Enter the name of the cluster journal log (ENTER if no change) => cachejrn.txt

Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
[R]edirect journal files in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
- OR -
[C]reate a cluster journal log by specifying where journal files are

You must redirect journal files if the journal files being restored are not in their original locations, as specified in the cluster journal log. To redirect the journal files listed in the cluster journal log, provide the original and current locations when prompted. You may give a full or partial directory name as an original location. All original locations whose leading characters match the partial name are replaced with the new location. An example of redirecting files follows:

Selection (ENTER if no change): R
Journal directories in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   _$1$DRA1:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL]
Enter the original and current locations of journal files (? for help)
Journal files originally from: _$1$DRA1:
      are currently located in: _$1$DRA2:
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
Journal files originally from:

Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
[R]edirect journal files in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
- OR -
[C]reate a cluster journal log by specifying where journal files are
Selection (ENTER if no change):

This example shows the choice of an alternative cluster journal log, CACHEJRN.TXT, which contains a list of journal files originally located on _$1$DRA1:. These files are redirected to be retrieved from their new location, _$1$DRA2:, during the restore. When you have finished entering the redirection information, press Enter to return to the Main Settings menu.

Journal redirection assumes a one-to-one or many-to-one relationship between source and target directory locations. That is, journal files from one or multiple original directories may be located in one new location, but not in multiple new locations. To restore from journal files that are in multiple new locations, create a cluster journal log that specifies where to locate the journal files.

Create a Cluster Journal Log

If the journal files on the original cluster are not accessible to the current cluster, create a cluster journal log that specifies the locations of the journal files. The files in the specified locations must all be part of the cluster. Copy them to a location accessible to the current cluster and specify how to locate them by entering redirection information:

Selection (ENTER if no change): C

*** WARNING ***
If you specify a cluster journal log different from current one, you may
need to reenter info on journal redirection, restore range, etc.

Enter the name of the cluster journal log to create (ENTER if none) => cachejrn.txt
How many cluster members were involved? (Q to quit) => 3
For each cluster member, enter the location(s) and name prefix (if any)
of the journal files to restore
-- Cluster member #0
Journal File Name Prefix:
Directory: _$1$DRA1:[TEST.50.MGR.JOURNAL]
Directory:
-- Cluster member #1
Journal File Name Prefix:
Directory: _$1$DRA1:[TEST.5A.MGR.JOURNAL]
Directory:

-- Cluster member #2
Journal File Name Prefix:
Directory: _$1$DRA1:[TEST.5B.MGR.JOURNAL]
Directory:

This example shows the creation of a cluster journal log, CACHEJRN.TXT, for a cluster with three members whose journal files were originally located on _$1$DRA1:. The next menu contains the additional option to redirect the journal files in the cluster journal log you created:

Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
[R]edirect journal files in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
- OR -
[C]reate a cluster journal log by specifying where journal files are
Selection (ENTER if no change):

Enter R to redirect the files as described in Identify the Cluster Journal Log. When you have finished entering redirection information, press Enter to return to the Main Settings menu.

Change the Starting Point of the Restore

The second and third items on the Main Settings menu specify the range of the restore: where in the journal files to begin restoring and where to stop. The starting point information contains the starting journal file and sequence number for each cluster member. The default for where to begin is determined in the following order:

1. If a cluster journal restore was performed after any backup restore, restore the journal from the end of the last journal restore.
2. If a backup restore was performed on the current system, restore the journal from the end of the last restored backup.
3. If the current system is associated with the cluster journal log being used, restore the journal from the beginning of the current cluster session.
4. Otherwise, restore the journal from the beginning of the cluster journal log.

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA1:[TEST.50.MGR.JOURNAL]
3. To STOP restore at the end of the cluster journal log
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions

Select an item to modify ('Q' to quit or ENTER to accept and continue): 2

Cluster Journal Restore - Setup - Where to Start Restore
1. At the beginning of a cluster session
2. At a specific journal marker
3. Following the restore of backup _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK (*)
   i.e., at the journal marker located at
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL]
Selection (ENTER if no change): 1
To start journal restore at the beginning of cluster session
   :47: :19: :26: :29: :51: :58: :29: :33: :35:48
=> 5

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the beginning of cluster session < :51:31>
3. To STOP restore at the end of the cluster journal log
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue): 2

Cluster Journal Restore - Setup - Where to Start Restore
1. At the beginning of a cluster session (*): < :51:31>
2. At a specific journal marker
3. Following the restore of backup _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
Selection (ENTER if no change): 2
To start restore at a journal marker location (in original form)
journal file: _$1$DRA1:[TEST.50.MGR.JOURNAL]
offset:
You have chosen to start journal restore at
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL]
the journal location by the end of backup _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK

The submenu varies slightly based on the current settings. For example, if no backup restore was performed, the submenu for specifying the beginning of the restore does not list option 3 to restore from the end of the last backup. In a submenu, the option that is currently chosen is marked with an asterisk (*).
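The precedence rules for the default starting point, listed earlier in this section, form a simple first-match cascade. The Python sketch below is an illustrative model only; the flag names are hypothetical, not actual Caché state:

```python
def default_start(journal_restore_after_backup, backup_restored,
                  log_is_current_clusters):
    """Pick the default restore starting point; the first matching rule wins."""
    if journal_restore_after_backup:
        return "end of last journal restore"
    if backup_restored:
        return "end of last restored backup"
    if log_is_current_clusters:
        return "beginning of current cluster session"
    return "beginning of cluster journal log"

choice = default_start(False, True, True)
```

In the transcripts above, a backup restore had been performed, so the default START setting shown in the menu is the end of the last restored backup, matching the second rule.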

Change the Ending Point of the Restore

By default, the restore ends at either the current journal location, if the current system is associated with the selected cluster journal log, or the end of the journal log. The submenu for option 3 is similar to that for option 2:

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA1:[TEST.50.MGR.JOURNAL]
3. To STOP restore at the end of the cluster journal log
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue): 3

Cluster Journal Restore - Setup - Where to Stop Restore
1. At the end of a cluster session
2. At the end of _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
3. At a specific journal marker

This is the menu you would see if the journal log is the one for the current cluster:

Select an item to modify ('Q' to quit or ENTER to accept and continue): 3

Cluster Journal Restore - Setup - Where to Stop Restore
1. At the end of a cluster session
2. At current journal location (*)
3. At a specific journal marker

The submenu varies slightly based on the current settings. For example, depending on whether or not the journal log is the one for the current cluster, option 2 in the menu for specifying the end of the restore is either the current journal location or the end of the journal log. In a submenu, the option that is currently chosen is marked with an asterisk (*).

Toggle the Switching Journal File Setting

The fourth menu item specifies whether to switch the journal file before the restore. If you select this item number, the value toggles between To SWITCH and NOT to SWITCH the journal file, and the menu is displayed again with the new setting:

Select an item to modify ('Q' to quit or ENTER to accept and continue): 4

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA1:[TEST.50.MGR.JOURNAL]
3. To STOP restore at the end of the cluster journal log
4. NOT to SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions

The default is to switch the journal file before the restore. This provides a clean start, so that updates that occur after the restore are in new journal files.

Toggle the Disable Journaling Setting

The fifth menu item specifies whether to disable journaling of the dejournaled transactions during the restore. If you select this item, the value toggles between DISABLE and NOT to DISABLE journaling the dejournaled transactions; the menu is redisplayed with the new setting:

Select an item to modify ('Q' to quit or ENTER to accept and continue): 5

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   ,_$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA1:[TEST.50.MGR.JOURNAL]
3. To STOP restore at the end of the cluster journal log
4. NOT to SWITCH journal file before journal restore
5.
NOT to DISABLE journaling the dejournaled transactions

For better performance, the default setting is to disable journaling of the dejournaled transactions. However, if you are running a cluster shadow, you may want to choose not to disable journaling.

Note: If you choose not to disable journaling, the dejournaled transactions are journaled only if they otherwise meet the normal criteria for being journaled.

Generate a Common Journal File

The user interface for this option is similar to the first, with additional questions about the contents and format of the output file. However, instead of restoring the journal files, this option produces a common-format journal file that can be read by the ^%JREAD utility on a Caché system that does not support cluster journal restores, or on another platform such as DSM. ^JCONVERT provides the same functionality if you answer Yes to the Cluster Journal Convert? question.

The second option produces a single common-format output file from the cluster journal files. It calls the ^JCONVERT utility, which takes a journal file from a single system and writes it out in a common format to be read by the %JREAD routine. This is useful for restoring journal files across versions of Caché where the journal files are not compatible (for example, as part of an almost rolling upgrade), or as part of failing back to an earlier release. You can also use this option to write the journal file in a format that can be loaded into another platform such as DSM.

Cluster Journal Restore Menu
1) Cluster journal restore
2) Generate common journal file from specific journal files
3) Cluster journal restore after backup restore
4) Cluster journal restore corresponding to Caché backups
H) Display Help
E) Exit this utility
Enter choice (1-4) or [E]xit/[H]elp?

Perform a Cluster Journal Restore after a Backup Restore

Option three restores the journal files after a Caché backup has been restored. This is similar to the restore performed by the incremental backup restore routine, ^DBREST, after a cluster backup restore; with ^DBREST, however, there is no way to run the journal restore independently of restoring a backup (to restart the journal restore, for example). One difference between this option and restoring using ^DBREST is that this option does not start with the list of databases contained in the backup; you must enter the database list. The routine offers to include all currently cluster-mounted databases in the restore, but if it is being run after restoring a backup, the databases restored by the backup are then privately mounted unless you change the mount state.
(The restore mounts them privately and leaves them privately mounted when it is finished.) It starts with the markers recorded in the journal files by the backup and ends with the end of the journal data.

Perform a Cluster Journal Restore Based on Caché Backups

The fourth menu option restores the journal files using journal markers that were added by a Caché backup to specify the starting point and, optionally, the end point. It is similar to option three except that it uses backups that have been performed, rather than backups that have been restored, to designate where to start. Functionally they are the same; both options use a marker which has been placed

into the journal file by a Caché backup as the starting point. The difference is in the list of choices of where to start.

7.6 Journal Dump Utility

On a Caché clustered system, the ^JRNDUMP routine displays the cluster session ID (cluster startup time) of a journal file instead of the word JRNSTART. The ^JRNDUMP routine displays a list of records in a journal file, showing the cluster session ID along with the journal file sizes. The utility lists journal files maintained by the local system, as well as journal files maintained by other systems of the Caché cluster, in the order of cluster startup time (cluster session ID), along with the first and last cluster journal sequence numbers of the journal files. Journal files created by ^JRNSTART are marked with an asterisk (*). Journal files that are no longer available (purged, for example) are marked with D (for deleted). Journal file names are displayed with indentations that correspond to their CSN: no indentation for journal files from system 0, one space for system 1, two spaces for system 2, and so on.

Sample output from the cluster version of ^JRNDUMP follows. By default, the level-1 display on a clustered system is quite different from the nonclustered one:

FirstSeq LastSeq Journal Files
Session :02:
D /bench/test/cache/50a/mgr/journal/
 /bench/test/cache/50b/mgr/journal/
Session :55:
 /bench/test/cache/50b/mgr/journal/
(N)ext,(P)rev,(G)oto,(E)xamine,(Q)uit =>

Besides a list of journal files from every cluster node (even the dead ones), there are cluster session IDs and the first and the last cluster journal sequence numbers of each journal file. A cluster session ID (the date-time string following Session) is the time the first node of the cluster starts. A cluster session ends when the last node of the cluster shuts down. Files from different nodes are shown with different indentation: no indentation for the node with CSN 0, one space for the node with CSN 1, and so on.
The CSN of a node uniquely identifies the node within the cluster at a given time. The files labeled D have most likely been deleted from their host systems. The previous version of ^JRNDUMP for clusters is available as OLD^JRNDUMP, if you prefer that output.
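As an illustration of the display convention only (not the actual ^JRNDUMP code), a sketch that indents each file name by its CSN and flags unavailable files with D might look like the following; the function name and tuple layout are assumptions:

```python
def format_journal_list(files):
    """Render journal file names indented by CSN, as described for
    ^JRNDUMP output: no indentation for system 0, one space for
    system 1, and so on. Deleted files are flagged with 'D'."""
    lines = []
    for csn, name, deleted in files:
        flag = "D " if deleted else ""
        lines.append(" " * csn + flag + name)
    return "\n".join(lines)

# Hypothetical sample: one purged file from system 0, one live file from system 1.
sample = [
    (0, "/bench/test/cache/50a/mgr/journal/20080101.001", True),
    (1, "/bench/test/cache/50b/mgr/journal/20080101.001", False),
]
print(format_journal_list(sample))
```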

7.7 Startup Recovery Routine

The following is the help display of the startup recovery routine, ^STURECOV:

%SYS>Do ^STURECOV
Enter error type (? for list) [^] => ?
Supported error types are:
 JRN    - Journal restore and transaction rollback
 CLUJRN - Cluster journal restore and transaction rollback
Enter error type (? for list) [^] => CLUJRN

Cluster journal recovery options
1) Display the list of errors from startup
2) Run the journal restore again
4) Dismount a database
5) Mount a database
6) Database Repair Utility
7) Check Database Integrity
H) Display Help
E) Exit this utility
Enter choice (1-8) or [E]xit/[H]elp? H

Before running ^STURECOV you should have corrected the errors that prevented the journal restore or transaction rollback from completing. Here you have several options regarding what to do next.

Option 1: The journal restore and transaction rollback procedure tries to save the list of errors in ^%SYS(). This is not always possible, depending on what is wrong with the system. If this information is available, this option displays the errors.

Option 2: This option performs the same journal restore and transaction rollback that were performed when the system was started. The amount of data is small, so it should not be necessary to try to restart from where the error occurred.

Option 3 is not enabled for cluster recovery.

Option 4: This lets you dismount a database. Generally this would be used if you want to let users back on a system but you want to prevent them from accessing a database which still has problems (^DISMOUNT utility).

Option 5: This lets you mount a database (^MOUNT utility).

Option 6: This lets you edit the database structure (^REPAIR utility).

Option 7: This lets you validate the database structure (^INTEGRIT utility).

Option 8 is not enabled for cluster recovery. Shut the system down using the bypass option with ccontrol stop and then start it with ccontrol start.
During startup, answer YES when asked if you want to continue after it displays the message related to errors during recovery.

Press <enter> to continue

Cluster journal recovery options
1) Display the list of errors from startup
2) Run the journal restore again
4) Dismount a database
5) Mount a database
6) Database Repair Utility
7) Check Database Integrity
H) Display Help
E) Exit this utility
Enter choice (1-8) or [E]xit/[H]elp?

7.8 Setting Journal Markers on a Clustered System

To set a journal marker effective cluster-wide, use the following routine:

$$CLUSET^JRNMARK(id,text,swset)

Where:

id - Marker ID (for example, -1 for backup).

text - Marker text (for example, the timestamp for a backup).

swset - 1 if the switch that inhibits database reads and writes (switch 10) has been set cluster-wide (and locally) by the caller; the caller is responsible for clearing it afterwards. 0 if the switch has not been set; in this case the routine takes care of setting and clearing the switch properly.

Note that switch 10 must be set locally and cluster-wide to ensure the integrity of the journal marker. If successful, the routine returns the location of the marker: the offset of the marker in the journal file and the journal file name, delimited by a comma. Otherwise, it returns an error code (<=0) and an error message, also delimited by a comma.

7.9 Cluster Journal Information Global

The global node ^%SYS("JRNINFO") is where cluster journal information is maintained. It is indexed by current cluster session ID and is recreated every time the cluster restarts. This allows you to modify or delete the cluster journal log (presumably after deleting the journal files) between two cluster sessions, as the update algorithm assumes that you do not alter the cluster journal log during a cluster session.

The ^%SYS("JRNINFO") global has three subcomponents:

The jrninfo table is indexed by journal file names, with the value of the top node being the number of entries in the cluster journal log and the value of each subnode being a comma-delimited list of the attributes of that journal file: CSN, line number of the journal file in the cluster journal log, CSI, and the first and last sequence numbers.

The jrninfor (r for reverse) table is a list of journal files, with CSN as the primary key and the line number of the journal file in the cluster journal log as the secondary key.

The seqinfo table contains the following subscripts: CSI, first and last sequence numbers, CSN, and line number of the journal file in the cluster journal log.

Here is a sample of ^%SYS("JRNINFO") contents:

^%SYS("JRNINFO", ,"jrninfo")=16
^%SYS("JRNINFO", ,"jrninfo",
 "_$1$DKA0:[TEST.50.MGR.JOURNAL] ")=1,2, ,160,160
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,1, ,3,3
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,3, ,292,292
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,4, ,3,417
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,7, ,3,422
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,8, ,3,4
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,9, ,3,7
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,10, ,3,10
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,11, ,3,17
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,12, ,3,17
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,13, ,3,27
 "_$1$DRA1:[TEST.50.MGR.JOURNAL] ")=0,15, ,3,133
 "_$1$DRA1:[TEST.5A.MGR.JOURNAL] ")=1,5, ,3,3
 "_$1$DRA1:[TEST.5A.MGR.JOURNAL] ")=1,6, ,131,131
 "_$1$DRA1:[TEST.5A.MGR.JOURNAL] ")=1,14, ,39,39
 "_$1$DRA1:[TEST.5A.MGR.JOURNAL] ")=1,16, ,3,3
^%SYS("JRNINFO", ,"jrninfor",0,
 1)=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.50.MGR.JOURNAL]
^%SYS("JRNINFO", ,"jrninfor",1,
 2)=_$1$DKA0:[TEST.50.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.5A.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.5A.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.5A.MGR.JOURNAL]
 )=_$1$DRA1:[TEST.5A.MGR.JOURNAL]
^%SYS("JRNINFO", ,"seqinfo", ,3,3,0,1)=
^%SYS("JRNINFO", ,"seqinfo", ,160,160,1,2)=
^%SYS("JRNINFO", ,"seqinfo", ,292,292,0,3)=
^%SYS("JRNINFO", ,"seqinfo", ,3,3,1,5)=
^%SYS("JRNINFO", ,"seqinfo", ,3,417,0,4)=
^%SYS("JRNINFO", ,"seqinfo", ,3,422,0,7)=
^%SYS("JRNINFO", ,"seqinfo", ,131,131,1,6)=
^%SYS("JRNINFO", ,"seqinfo", ,3,4,0,8)=

^%SYS("JRNINFO", ,"seqinfo", ,3,7,0,9)=
^%SYS("JRNINFO", ,"seqinfo", ,3,10,0,10)=
^%SYS("JRNINFO", ,"seqinfo", ,3,17,0,11)=
 12)=
^%SYS("JRNINFO", ,"seqinfo", ,3,27,0,13)=
^%SYS("JRNINFO", ,"seqinfo", ,39,39,1,14)=
^%SYS("JRNINFO", ,"seqinfo", ,3,3,1,16)=
^%SYS("JRNINFO", ,"seqinfo", ,3,133,0,15)=

7.10 Shadow Information Global and Utilities

The global node ^SYS("shdwcli") is where shadow client information is maintained. Most of the values are available through the utilities ShowState^SHDWX, ShowError^SHDWX, and ShowWhere^SHDWX. Running ShowState^SHDWX displays most of the data contained in the global:

%SYS>d ShowState^SHDWX("clutest",1)
Shadow ID PrimaryServerIP Port R S Err
clutest rodan
 \_ clutest~
 \_ clutest~
 \_ clutest~2 rodan
Redirection of Global Sets and Kills:
 ^^_$1$DKB300:[TEST.CLU.5X] -> ^^_$1$DKA0:[TEST.CLU.5X]
Redirection of Master Journal Files:
 Base directory for auto-redirection: _$1$DKA0:[TEST.5X.SHADOW]
 _$1$DRA2:[TEST.5X.JOURNAL] -> _$1$DKA0:[TEST.5X.SHADOW.1]
 _$1$DRA2:[TEST.5Y.JOURNAL] -> _$1$DKA0:[TEST.5X.SHADOW.2]
 _$1$DRA2:[TEST.5Z.JOURNAL] -> _$1$DKA0:[TEST.5X.SHADOW.3]
Primary Server Cluster ID: _$1$DKB400:[TEST.5X]CACHE.PIJ
Primary Server Candidates (for failover): rodan
When to purge a shadow journal file: after it's dejournaled

The output displayed from the ^SYS("shdwcli") global has the following components:

Shadow ID: the ID of a copier shadow is partially inherited from the parent shadow. The clu subnode of a copier contains the ID of the parent, and the sys subnode of the parent contains a list of the IDs of the copiers.

PrimaryServerIP and Port: for a copier, these specify the system from which it gets journal files; for the dejournaler, the system from which it gets journal information (the JRNINFO server). The values are stored in the ip and port0 subnodes.

R: has the value 1 if the shadow is running; from the stat subnode.

S: has the value 1 if the shadow has been requested to stop (due to latency, it is possible that both R and S have the value 1, if the shadow has yet to check for the stop request); from the stop subnode.

Err: the number of errors encountered. See details through ShowError^SHDWX, which displays the information from the err subnode.

Redirection of Global Sets and Kills: referred to as database mapping in the System Management Portal; from the dbmap subnode of the cluster shadow.

Redirection of Master Journal Files: discussed in the Using Caché Routines section; stored in the jrndir subnode of the cluster shadow. The value of the jrndir subnode is the number of journal directories that have been automatically redirected (in the preceding example output, the next new journal directory is redirected to a subdirectory [.4]). (jrndir,0) is the base shadow directory, and every other entry indicates a redirection of journal directories, with the server journal directory being the key and the shadow journal directory being the value.

Primary Server Cluster ID: used to prevent the shadow from following a node to a different cluster; from the DBServerClusterID subnode.

Primary Server Candidates (for failover): the list of current live members of the cluster. If one member dies, a shadow (either the dejournaler or a copier) that gets information from that member tries other members on the list until it succeeds. A new member is added to the list as soon as the shadow knows of its presence; from the servers subnode.

When to purge a shadow journal file: works in the same way as the purging of local journal files. The age threshold is set by the lifespan subnode of the cluster shadow. Unlike the purging of local journal files, however, if the value of lifespan is 0, the shadow journal files are purged as soon as they have been dejournaled completely. The purged journal files are listed in the jrndel subnode of the copiers.

The chkpnt subnode stores a list of checkpoints.
A checkpoint is a snapshot of the work queue of the dejournaler, that is, the current progress of dejournaling. The value of the chkpnt subnode indicates which checkpoint to use when the dejournaler resumes; this is the checkpoint displayed by ShowWhere^SHDWX. Updating the value of the chkpnt subnode only after the corresponding checkpoint has been completely updated avoids having a partial checkpoint in the case of a system failover in the middle of an update (in that case, the dejournaler would use the previous checkpoint).

The copiers keep the names of the copied (or being copied) journal files in the jrnfil subnode. This makes it possible to change the redirection of journal files by allowing the dejournaler to find the shadow journal files in the old directory while the copiers copy new journal files to the new location. Once a shadow journal file is purged, it is moved from the jrnfil list to the jrndel list.

Here is a sample of the ^SYS("shdwcli") contents for the nodes for the cluster shadow, clutest, and two of its copier shadows:

^SYS("shdwcli","clutest")=0
^SYS("shdwcli","clutest","DBServerClusterID")=_$1$DKB400:[TEST.5X]CACHE.PIJ
 "at")=0
 "chkpnt")=

^SYS("shdwcli","clutest","chkpnt",1)=1, ,-128
^SYS("shdwcli","clutest","chkpnt",1, ,0,1)=0,,,,
^SYS("shdwcli","clutest","chkpnt",2)=2, ,-128
^SYS("shdwcli","clutest","chkpnt",2, ,-128,2)= -128,_$1$DKA0:[TEST.5X.SHADOW.1] ,0,,0
^SYS("shdwcli","clutest","chkpnt",3)=6, ,5
^SYS("shdwcli","clutest","chkpnt",3, ,11,6)= 5,_$1$DKA0:[TEST.5X.SHADOW.1] , ,,0
^SYS("shdwcli","clutest","chkpnt",4)=35, ,85
^SYS("shdwcli","clutest","chkpnt",4, ,95,35)= 85,_$1$DKA0:[TEST.5X.SHADOW.1] , ,,0
^SYS("shdwcli","clutest","chkpnt",5)=594, ,807
^SYS("shdwcli","clutest","chkpnt",5, ,808,594)= 808,_$1$DKA0:[TEST.5X.SHADOW.1] ,262480,1,0
...
^SYS("shdwcli","clutest","chkpnt",212)=24559, ,5
^SYS("shdwcli","clutest","chkpnt",212, ,37,24559)= 5,_$1$DKA0:[TEST.5X.SHADOW.1] , ,,0
^SYS("shdwcli","clutest","cmd")=
^SYS("shdwcli","clutest","dbmap","^^_$1$DKB300:[TEST.CLU.5X]")= ^^_$1$DKA0:[TEST.CLU.5X]
^SYS("shdwcli","clutest","end")=0
 "err")=1
^SYS("shdwcli","clutest","err",1)= :16: Query+8^SHDWX;-12; reading ans from TCP 42009 timed out,remote server is not responding
^SYS("shdwcli","clutest","err",1,"begin")= :09:09
 "count")=5
^SYS("shdwcli","clutest","errmax")=10
 "intv")=10
 "ip")=rodan
 "jrndir")=3
^SYS("shdwcli","clutest","jrndir",0)=_$1$DKA0:[TEST.5X.SHADOW]
 "_$1$DRA2:[TEST.5X.JOURNAL]")=_$1$DKA0:[TEST.5X.SHADOW.1]
 "_$1$DRA2:[TEST.5Y.JOURNAL]")=_$1$DKA0:[TEST.5X.SHADOW.2]
 "_$1$DRA2:[TEST.5Z.JOURNAL]")=_$1$DKA0:[TEST.5X.SHADOW.3]
^SYS("shdwcli","clutest","jrntran")=0
 "lifespan")=0
 "locdir")=
 "locshd")=
 "pid")=
 "port")=
 "port0")=42009
 "remjrn")=
^SYS("shdwcli","clutest","servers","42009, ")=
 "42009,rodan")=
 "42019, ")=
 "42029, ")=
^SYS("shdwcli","clutest","stat")=0
 "stop")=1
^SYS("shdwcli","clutest","sys",0)=
 1)=
 2)=
^SYS("shdwcli","clutest","tcp")= TCP
 "tpskip")=1
 "type")=21
^SYS("shdwcli","clutest~0")=0
^SYS("shdwcli","clutest~0","at")=0
 "clu")=clutest
 "cmd")=
 "end")=
 "err")=0
 "intv")=10
 "ip")=
^SYS("shdwcli","clutest~0","jrndel",
 "_$1$DKA0:[TEST.5X.SHADOW.1] ")=

 "_$1$DKA0:[TEST.5X.SHADOW.1] ")=
 "_$1$DKA0:[TEST.5X.SHADOW.2] ")=
^SYS("shdwcli","clutest~0","jrnfil")=36
^SYS("shdwcli","clutest~0","jrnfil",35)= _$1$DRA2:[TEST.5X.JOURNAL]
^SYS("shdwcli","clutest~0","jrnfil",35,"shdw")= _$1$DKA0:[TEST.5X.SHADOW.1]
^SYS("shdwcli","clutest~0","jrnfil",36)= _$1$DRA2:[TEST.5X.JOURNAL]
^SYS("shdwcli","clutest~0","jrnfil",36,"shdw")= _$1$DKA0:[TEST.5X.SHADOW.1]
^SYS("shdwcli","clutest~0","jrntran")=0
 "locdir")=
 "locshd")= _$1$DKA0:[TEST.5X.SHADOW.1]
 "pause")=0
 "pid")=
 "port")=42009
 "port0")=42009
 "remend")=
 "remjrn")=
 "stat")=0
 "stop")=1
 "tcp")= TCP
 "tpskip")=1
 "type")=12
^SYS("shdwcli","clutest~1")=0
^SYS("shdwcli","clutest~1","at")=0
 "clu")=clutest
 "cmd")=
 "end")=
 "err")=0
 "intv")=10
 "ip")=
^SYS("shdwcli","clutest~1","jrndel",
 "_$1$DKA0:[TEST.5X.SHADOW.1] ")=
 "_$1$DKA0:[TEST.5X.SHADOW.2] ")=
 "_$1$DKA0:[TEST.5X.SHADOW.2] ")=
^SYS("shdwcli","clutest~1","jrnfil")=18
^SYS("shdwcli","clutest~1","jrnfil",17)= _$1$DRA2:[TEST.5X.JOURNAL]
^SYS("shdwcli","clutest~1","jrnfil",17,"shdw")= _$1$DKA0:[TEST.5X.SHADOW.1]
^SYS("shdwcli","clutest~1","jrnfil",18)= _$1$DRA2:[TEST.5Y.JOURNAL]
^SYS("shdwcli","clutest~1","jrnfil",18,"shdw")= _$1$DKA0:[TEST.5X.SHADOW.2]
^SYS("shdwcli","clutest~1","jrntran")=0
 "locdir")=
 "locshd")= _$1$DKA0:[TEST.5X.SHADOW.2]
 "pid")=
 "port")=42009
 "port0")=42009
 "remend")=
 "remjrn")=
 "stat")=0
 "stop")=1
 "tcp")= TCP
 "tpskip")=1
 "type")=
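The crash-safe ordering described earlier for the chkpnt subnode (fully write the new checkpoint, then advance the pointer) can be sketched with ordinary dictionaries. The structure below is purely illustrative, not the actual global layout:

```python
def write_checkpoint(shadow, n, snapshot):
    """Record checkpoint n, then advance the pointer only once the
    checkpoint data is completely written. If a failover interrupts
    the update between the two steps, the pointer still names the
    previous, complete checkpoint, so the dejournaler never resumes
    from a partial one."""
    shadow.setdefault("chkpnt", {})[n] = snapshot   # step 1: write full snapshot
    shadow["chkpnt_current"] = n                    # step 2: advance the pointer

# Hypothetical shadow state with an initial checkpoint.
shadow = {"chkpnt_current": 0, "chkpnt": {0: "start"}}
write_checkpoint(shadow, 1, "file=SHADOW.1 offset=262480")
resume_from = shadow["chkpnt"][shadow["chkpnt_current"]]
print(resume_from)
```

If the process dies after step 1 but before step 2, `chkpnt_current` still points at the old, complete checkpoint, which mirrors the behavior the text describes.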

8 Caché Clusters on Tru64 UNIX

Alpha Tru64 UNIX version 5 introduced cluster-mounted file systems and a UNIX distributed lock manager. Thus, two or more servers with their own memory, CPU, and I/O channels can simultaneously access data stored on a common set of file systems. There are three methods available for running Caché on one of these clusters.

The first method is to simply run your Caché instances on single servers as stand-alone installations. While this is the simplest case, it fails to utilize the performance and high availability advantages of the cluster.

The second method is to utilize Tru64 UNIX Cluster Available Application (CAA) functionality. This is described in the System Failover Strategies chapter as a cold failover.

The third method, the topic of this chapter, is to utilize two or more of the servers in a Tru64 UNIX cluster as a unit in a warm failover or a hot failover configuration. With this version of Caché, you may run Caché across a Tru64 UNIX cluster where processes running on any member of the cluster have controlled simultaneous access to the same Caché database files. With this configuration you gain performance and high availability benefits. This functionality is the same as for Caché on OpenVMS clusters.

The following topics are discussed:

Tru64 UNIX Caché Cluster Overview
TruCluster File System Architecture
Planning a Tru64 Caché Cluster Installation
Tuning a Tru64 Caché Cluster Member

8.1 Tru64 UNIX Caché Cluster Overview

TruCluster technology allows multiple machines running Tru64 Version 5.0 or later to work together in a scalable, highly available, clustered configuration. All cluster members must be connected to each other via a cluster interconnect. The cluster interconnect carries all communication between members of the cluster: any time a cluster member needs access to a resource that is served by another cluster member, the data flows over the cluster interconnect. Physically, the interconnect can be Ethernet, Gigabit Ethernet, or Compaq's proprietary Memory Channel interconnect.

The cluster interconnect runs an IP stack, so every cluster member has an IP address for its cluster interconnect in addition to its regular address(es) on the outside LAN. The IP address of the cluster interconnect is generally chosen from a non-routed network, as it is only used by members of the cluster and is not visible to the outside world.

General information about a cluster's current configuration can be obtained with the command clu_get_info.

[Figure: Example of Tru64 Cluster Configuration]

8.2 TruCluster File System Architecture

One of the defining features of the TruCluster file system architecture is the concept of a shared root. All cluster members share the same root, /usr, and /var file systems. These correspond to the cluster_root, cluster_usr, and cluster_var partitions, respectively, and are stored on the cluster root disk or disks.

Since the root of the directory structure is shared, all cluster members see an identical view of the directory structure, and all members of the cluster can access anything mounted locally on any cluster member. Of course, if a disk is local to one member and not on the shared SCSI bus, that member becomes a single point of failure for that disk.

The most obvious implication is that all cluster members share the same operating system files. Most administrative tasks, such as user management, take effect cluster-wide. While this simplifies system administration, it makes the cluster root drive a very critical resource, and you should consider making it accessible from more than one shared bus to avoid it becoming a point of failure.

This shared-everything approach raises some questions about how system-specific configuration files and the boot process are handled. The files are mapped from their standard location in the file system to the member-specific root partition via context-dependent symbolic links (CDSLs). CDSLs allow each member to have its own version of a particular file. A CDSL is a special type of link that is created with the mkcdsl command. An example view of a CDSL:

clunode1.kinich.com> ls -l sysconfigtab
lrwxrwxrwx 1 root system 57 Sep sysconfigtab@ -> ../cluster/members/{memb}/boot_partition/etc/sysconfigtab*

The unique part of this link is the {memb} section. This section corresponds to the member number of the node. Thus a process running on member1 accesses its own copy of /etc/sysconfigtab, and a process running on member2 accesses its own separate copy of /etc/sysconfigtab. CDSLs are available on all versions of Tru64 5.1B regardless of whether the Tru64 cluster software is installed.

Each cluster member is assigned a permanent member id. A system that is not a cluster member is assigned the number 0.
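The per-member resolution of a CDSL can be sketched as a one-line path rewrite. `resolve_cdsl` is a hypothetical helper for illustration, not a Tru64 API:

```python
def resolve_cdsl(target: str, member: int) -> str:
    """Expand a CDSL link target for a given cluster member,
    mimicking the substitution of {memb} with member<n>."""
    return target.replace("{memb}", "member%d" % member)

target = "../cluster/members/{memb}/boot_partition/etc/sysconfigtab"
# Each member resolves the same link to its own copy of the file:
print(resolve_cdsl(target, 1))
print(resolve_cdsl(target, 2))
```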
A view of the beginning portion of the CDSL:

clunode1: /etc> ls -l ../cluster/members
total 24
lrwxr-xr-x 1 root system 6 Sep 6 10:45 member@ -> {memb}/
drwxr-xr-x 9 root system 8192 Sep 6 10:45 member0/
drwxr-xr-x 9 root system 8192 Mar member1/
drwxr-xr-x 9 root system 8192 Sep 6 10:45 member2/

When a CDSL is referenced, Tru64 replaces the {memb} part with membern, where n is the member number of the local machine. Thus, for the first member, sysconfigtab refers to the file:

/etc/cluster/members/member1/boot_partition/etc/sysconfigtab

The clu_get_info command displays the list of current cluster members and their member ids.

Caché and CDSLs

The Caché registry, which stores all information about Caché instances on that machine, is stored in /usr/local/etc/cachesys. When Caché is installed in a TruCluster environment, it makes this directory

a CDSL to a member-specific area. Therefore, when Caché is installed on one cluster member, the ccontrol list command on the other cluster members does not display that installation. This provides maximum flexibility, where the current version of Caché can be given the same configuration name on each cluster member. It also prevents one cluster member from accidentally starting or stopping an instance of Caché that is running on another cluster member. If desired, the installations from other cluster members can be added to the local registry file using the create function of the ccontrol command. The syntax is:

clunode1: /> ccontrol create $cfgname directory=$tgtdir versionid=$ver

where $cfgname is the configuration or instance name, $tgtdir is the directory where that instance is installed, and $ver is the version string.

It is possible to simply delete the CDSL (using the rmcdsl command) and use a common registry file for the Caché cluster. However, the upgrade procedure for Caché may convert this back to a CDSL, placing the existing registry file in the member-specific area of the upgraded system.

CAUTION: Do not use /usr/local/etc/cachesys as a mount point for a Caché instance on Tru64. The Caché installation and upgrade process uses the mkcdsl command on this directory; the command fails if the directory is anything but the registry directory.

CDSLs can be fragile: it is easy to delete them or break them. Broken CDSLs can cause trouble with remastering AdvFS domains. The clu_check_config command tests the validity of registered CDSLs and displays any problems. Generally, removing and recreating a broken CDSL may be all that is necessary to resolve problems with lost files.

Remastering AdvFS Domains

In a Tru64 cluster there is a single view of the file system, but this is layered on top of AdvFS file systems.
Even though each cluster member may have direct access to the disk devices that make up the domain, one cluster member is elected to be the server for that domain and all other cluster members are clients.

Caché opens database files using direct I/O on Tru64 cluster members. If there is a direct path from a machine to the disk drive, the database, journal, and WIJ (write image journal) I/O all use that path. Tru64 also supports direct I/O when there is not a direct connection to the disk. In this case, Tru64 redirects the I/O requests over the cluster interconnect to the server for that drive. This is obviously not the optimal configuration from a performance perspective.

Even though direct I/O allows for shared I/O to a file from multiple cluster members, Tru64 restricts file expansion (and all metadata operations) to the server for the fileset. This is generally not a consideration for databases that expand rarely (compared to the I/O rate), but it is a concern for journal files, which are constantly expanding. Caché requires a separate AdvFS domain for each set of journal files that could potentially be part of a Caché instance on separate cluster members. AdvFS domains can be remastered on a running system,

although this tends to fail if the domain is under heavy load at the time. The status of an AdvFS domain can be determined with the cfsmgr command. The command with no arguments displays summary information for each system. The command cfsmgr <mount point> displays information for a single file system. For example:

clunode1: /# cfsmgr

Domain or filesystem name = cluster_root#root
Mounted On = /
Server Name = clunode2
Server Status : OK

Domain or filesystem name = root1_domain#root
Mounted On = /cluster/members/member1/boot_partition
Server Name = clunode1
Server Status : OK

Domain or filesystem name = root2_domain#root
Mounted On = /cluster/members/member2/boot_partition
Server Name = clunode2
Server Status : OK

Domain or filesystem name = cluster_var#var
Mounted On = /var
Server Name = clunode2
Server Status : OK

Domain or filesystem name = test_domain#test
Mounted On = /test
Server Name = clunode2
Server Status : OK

Domain or filesystem name = cluster_usr#usr
Mounted On = /usr
Server Name = clunode1
Server Status : OK

clunode1: /#

To change the machine which is serving a domain, use the -r -a SERVER= option. You can specify either the filesystem or the domain to be remastered; however, keep in mind that either way the entire domain is remastered, and this affects all filesets/file systems in that domain. In the following example the server for /test (and the domain it is part of) is transferred from clunode2 to clunode1:

clunode1: /# cfsmgr /test

Domain or filesystem name = /test
Server Name = clunode2
Server Status : OK

clunode1: /# cfsmgr -r -a SERVER=clunode1 /test

clunode1: /# cfsmgr /test

Domain or filesystem name = /test
Server Name = clunode1
Server Status : OK

clunode1: /#

When a cluster member shuts down (or fails), any domains/filesets it was serving are taken over by one of the surviving members that has a direct connection to the disk drives involved. If no member has such a connection, the domain/fileset becomes unavailable. Caché cluster failover requires that the surviving cluster members have access to all of the databases from the failed system (not just the cluster-mounted ones), the WIJ file, and the journal files. This is why Caché, and its components, should not be installed on a disk that is connected to only a single node.

When a failed cluster member starts back up, it does not automatically take over the filesets it used to be serving; this must be done manually. After planning your configuration, add the necessary cfsmgr commands to the startup scripts for each cluster member so that when it boots, it becomes the server for the domains that contain the Caché journal files for that node.

8.3 Planning a Tru64 Caché Cluster Installation

Keep the following points in mind when planning your Tru64 UNIX cluster:

- Caché clusters require Tru64 UNIX Version 5.1B with the latest 5.1B aggregate patch kit installed.
- HP recommends using Memory Channel hardware for the cluster interconnect, although Gigabit Ethernet is also supported. Slower Ethernet is not recommended for cluster interconnect hardware.
- Each cluster member should have a direct path to the disk subsystem. A disk subsystem may be directly connected to only some or one of the cluster members, but this is neither a highly available nor a high-performance configuration.
- UNIX file systems (UFS) are read-only across Tru64 clusters.
- Caché must be installed on Advanced File Systems (AdvFS) in a Tru64 cluster. You must create at least as many AdvFS domains as there are Tru64 cluster members running Caché. To store Caché journal files, each Tru64 cluster member running Caché needs a file system that it can serve locally.
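The boot-time remastering described above depends on knowing which member currently serves a domain, which means extracting the server name from cfsmgr output. The following is a minimal sketch (Python is used here for illustration; a real Tru64 startup script would typically do the same with awk), assuming only the "Server Name = ..." output format shown in the examples above. Since cfsmgr exists only on Tru64, the sketch parses captured output text rather than invoking the command:

```python
# Sketch: find which cluster member serves a given mount point, based on
# the cfsmgr output format shown above. cfsmgr exists only on Tru64, so
# this parses captured output text rather than running the command.
def server_of(cfsmgr_output, mount_point):
    """Return the serving node for mount_point, or None if not listed."""
    current = None
    for line in cfsmgr_output.splitlines():
        line = line.strip()
        if line.startswith("Mounted On ="):
            current = line.split("=", 1)[1].strip()
        elif line.startswith("Server Name =") and current == mount_point:
            return line.split("=", 1)[1].strip()
    return None

# A boot script would then remaster only when needed, e.g. (hypothetical):
#   if server_of(output, "/journalA") != this_node:
#       run: cfsmgr -r -a SERVER=<this_node> /journalA
```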

Note: It is not acceptable to have one domain and multiple filesets for Caché journal files.

A fileset contains all files and control scripts that make up a product. Filesets are the smallest software object manageable by SysMan Software Manager commands. Filesets must be grouped into one or more products and can be included in one or more different bundles.

The best practice is to build the Tru64 UNIX cluster before installing Caché. Do not use context-dependent symbolic links (CDSLs) for the Caché installation directory. Install Caché into a separate directory on each cluster member; using the same names for these directories makes administration easier. Separate AdvFS domains or filesets are required for the Caché journal files of each member, and the default location of journal files is a subdirectory of the Caché installation directory. Therefore, if you want to use the default location for journal files, install Caché for each cluster member in a separate AdvFS domain. Caché installations that always run on the same cluster member may be installed in the same domain.

8.4 Tuning a Tru64 Caché Cluster Member

The tuning parameters for a Tru64 Caché cluster system are the same as those for an OpenVMS cluster system. InterSystems recommends setting values for two attributes of the dlm subsystem, rhash_size and dlm_locks_cfg. These correspond to the RESHASHTBL and LOCKIDTBL parameters under OpenVMS, and their calculations are similar. They are configured like other Tru64 kernel parameters and are placed in the /etc/sysconfigtab file under the dlm: stanza. For example:

dlm:
    rhash_size =
    dlm_locks_cfg =

The system configuration file contains separate stanzas for kernel subsystems. Each stanza contains a list of attributes and the values those attributes hold. Only those subsystems/attributes to be changed from the default need to be listed.
This file, sysconfigtab, is not shared between cluster members; therefore, it must be updated on each cluster member, and each member must be restarted for the changes to take effect. The settings should be the same for each cluster member.
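As a concrete illustration, a filled-in stanza might look like the following. The numeric values here are placeholders invented for this sketch, not recommendations; the real values must be calculated for your site, as with RESHASHTBL and LOCKIDTBL on OpenVMS:

```
dlm:
    rhash_size = 4096
    dlm_locks_cfg = 50000
```

On Tru64, edits to /etc/sysconfigtab are conventionally made with the sysconfigdb utility rather than a text editor; either way, the member must be rebooted for dlm changes to take effect.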


9 Caché and Windows Clusters

Microsoft Windows operating systems do not support shared-all clusters:

- They do not offer a shared resource cluster model.
- They do not allow simultaneous access to shared drives: you cannot lock, read, or write to a clustered drive from more than one node at a time.
- If a drive fails, the operating system does not swap in a backup drive.

Windows NT and Windows 2000 Advanced Server allow only two nodes to be clustered, and provide only failover of your drives from one node to another. Some versions of Windows Server 2003 and 2008, though, allow up to 16 nodes. Two of the Caché-supported Microsoft platforms, Windows 2000 Advanced Server and Windows Server 2003, do allow you to cluster computers that share the same storage; you must have a RAID or SCSI disk drive system to do so. See Microsoft Knowledge Base article 278007, Available Features in Windows Server 2003 Clusters, for more information on changes for Windows Server 2003.

You can run two basic cluster setups under Windows operating systems in Caché:

- Single Failover Cluster
- Multiple Failover Cluster

Note: You must have a multiserver Caché license for Windows clustering. For a multiple failover cluster, you must also have a separate license for each Caché instance, or an Enterprise license.

The subsections in Example Procedures show the procedure details used in both setups. These cluster setups are described in the sections that follow. For suggestions on other ways to run a large enterprise system, contact the InterSystems Worldwide Response Center (WRC).

9.1 Single Failover Cluster

The following shows a single failover cluster:

[Figure: Single Failover Cluster]

CLUNODE-1 and CLUNODE-2 are clustered together, and Caché is running on one node. During normal operations, the following conditions are true:

- Disk S is online on CLUNODE-1, and CLUNODE-2 has no database disks online.
- The instance CacheA runs on CLUNODE-1; CLUNODE-2 is idle.

In this setup, if CLUNODE-1 fails, your system looks like this:

[Figure: Failover Cluster with Node Failure]

- Disk S is online on CLUNODE-2; CLUNODE-1 has no database disks online.
- The instance CacheA runs on CLUNODE-2; CLUNODE-1 is down.

See the Setting Up a Failover Cluster section for a detailed example.

Setting Up a Failover Cluster

This section gives an example of the steps necessary to set up a failover cluster on Windows Server 2003. The following sections describe the steps on each node:

- Tasks on CLUNODE-1
- Tasks on CLUNODE-2
- Tasks on Both Nodes

Tasks on CLUNODE-1

Perform the following steps on the first cluster node, CLUNODE-1:

1. Open the Cluster Administrator from the Windows Administrative Tools submenu. Verify that all drives that contain Caché files and databases are shared drives and that they are all online on CLUNODE-1.
2. Create a Cluster Group called CacheA Group.
3. Create an IP Address Resource for CacheA Group called CacheA_IP.

You can also create a Network Name resource type if you want applications to connect to Caché by a DNS name as well as an IP address.

4. Create a Physical Disk Resource for the shared disk containing CacheA called Disk S:.
5. Install Caché on CLUNODE-1, naming the instance CacheA and installing it in a new folder, CacheA, on Disk S. The Caché install procedure on a cluster node automatically creates the resource needed to manage the Caché service.
6. Define the instance CacheA and map the database files for the instance on drives that are online on the same cluster node as the Caché instance during normal operations. Do the same for all journal files.
7. If you are using the CSP Gateway (to access the System Management Portal, for example) from within this cluster, see the CSP Gateway Considerations section.
8. Move CacheA Group from CLUNODE-1 to CLUNODE-2 by right-clicking CacheA Group under the Groups branch and then clicking Move Group. You do not need to stop Caché; this is the way you fail over.

Dependencies

After you create the resources on CLUNODE-1, the following dependencies exist:

1. The IP address resource, CacheA_IP, has no dependencies.
2. The physical disk resource, Disk S, depends on CacheA_IP.

CSP Gateway Considerations

Caché protects server passwords in the CSP Gateway configuration file (CSP.ini) using Windows DPAPI encryption. The encryption functions work with either the machine store or the user store. The Web server hosting the CSP Gateway operates within a protected environment where there is no available user profile on which to base the encryption; therefore, it must use the machine store. Consequently, it is not possible to decrypt a CSP Gateway password that was encrypted on another computer. This creates a situation for clustered environments where the CSP.ini file is on a shared drive and shared among multiple participating computers: only the computer that actually performed the password encryption can decrypt it.
It is not possible to move a CSP.ini file containing encrypted passwords to another computer; the password must be reentered and re-encrypted on the new machine. Here are some possible approaches to this issue:

- Use a machine outside of the cluster as the Web server.
- Each time you fail over, reset the same password in the CSP Gateway.
- Configure each computer participating in the cluster so that it has its own copy of the CSP Gateway configuration file (CSP.ini) on a disk that does not belong to the cluster. Caché maintains the file in the directory hosting the CSP Gateway DLLs. Save and encrypt the password on each individual computer before introducing the node to the cluster.

For example, where Disk C from each machine does not belong to the cluster and Caché is installed on Disk S, you may have the following:

CLUNODE-1: C:\Inetpub\CSPGateway\CSP.ini with password XXX encrypted by CLUNODE-1
CLUNODE-2: C:\Inetpub\CSPGateway\CSP.ini with password XXX encrypted by CLUNODE-2

- Disable password encryption by manually adding the following directive to the CSP.ini file before starting the CSP Gateway and adding the passwords:

[SYSTEM]
DPAPI=Disabled

See the CSP Gateway Advanced Configuration Guide for more information.

Tasks on CLUNODE-2

Perform the following steps on the second cluster node, CLUNODE-2:

1. Install Caché on CLUNODE-2, naming the instance CacheA and installing it in the CacheA folder on Disk S. You are installing Caché on top of itself only to get the Caché entry into the registry.
2. Create a Caché Cluster Resource for CacheA Group called CacheA_controller.
3. Bring CacheA_controller online on CLUNODE-2 using the Cluster Administrator.

Tasks on Both Nodes

Verify the failover setup:

1. Move the CacheA Group cluster group from CLUNODE-2 back to CLUNODE-1.
2. When you again have CacheA Group running on CLUNODE-1, run ipconfig on both CLUNODE-1 and CLUNODE-2 to check that each is properly advertising the alias IP addresses.

You now have a failover cluster running Caché.

Important: Do not start and stop Caché from the Caché Cube. Instead, using the Cluster Administrator, take the CacheA_controller offline to shut down Caché, and bring the CacheA_controller online to start Caché.
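The DPAPI=Disabled directive described above is a plain INI-style setting, so it can also be applied by a small provisioning script when preparing each node's private copy of CSP.ini. The helper below is a hypothetical sketch (not part of Caché or the CSP Gateway), assuming the file parses as ordinary INI text; back up CSP.ini before editing it:

```python
# Hypothetical helper (not part of Caché): ensure the [SYSTEM] section of a
# CSP Gateway configuration file carries DPAPI=Disabled, as described above.
# Assumes CSP.ini parses as plain INI text.
import configparser

def disable_dpapi(ini_path):
    cfg = configparser.ConfigParser()
    cfg.optionxform = str          # preserve the case of option names
    cfg.read(ini_path)
    if not cfg.has_section("SYSTEM"):
        cfg.add_section("SYSTEM")
    cfg.set("SYSTEM", "DPAPI", "Disabled")
    with open(ini_path, "w") as f:
        cfg.write(f)
```

Run once per node, against that node's local copy of the file, before starting the CSP Gateway.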

9.2 Example Procedures

The following are instructions for performing common procedures in the cluster building process. They apply to the single failover example, but you can adapt them to the additional steps in the multiple failover setup by replacing the CacheA names with the appropriate CacheB names.

Create a Cluster Group

To create a cluster group, do the following:

1. From the Cluster Administrator, right-click Groups, point to New, and click Group.
2. In the New Group dialog box, enter the group name (in this example, CacheA Group) in the Name box and click Next.
3. List and arrange the preferred owners (for example, CLUNODE-1 and CLUNODE-2 for CacheA Group), and click Finish.

Create an IP Address Resource

To create an IP address resource, do the following:

1. From the Cluster Administrator, right-click the group name (CacheA Group), point to New, and click Resource.
2. Enter CacheA_IP as the name, select IP Address as the Resource type, and click Next.
3. Assign the alias IP address your users use to connect to the instance (CacheA), and put this resource in the corresponding cluster group (CacheA Group). This is not the cluster or node IP address, but a new and unique IP address specific to the instance (CacheA). For this example, CacheA_IP is set to the alias IP address chosen for the instance.

Once finished, the CacheA_IP resource has the properties shown in its property pages.

[Figure: IP Address Advanced Properties]
[Figure: IP Address Parameter Properties]

CacheA_IP has no dependencies.

Create a Physical Disk Resource

To create a physical disk resource for the shared disk containing CacheA, do the following:

1. From the Cluster Administrator, right-click the group name (CacheA Group), point to New, and click Resource.
2. Enter Disk S: as the name, select Physical Disk as the Resource type, and click Next.
3. Verify, and update as necessary, the settings on the Dependencies properties page: click Modify to enter a dependency, then enter CacheA_IP in the Name column and IP Address in the Resource Type column.

[Figure: Physical Disk Dependency Properties]

Install Caché

Follow the procedure described in the Installing Caché on Microsoft Windows chapter of the Caché Installation Guide to install Caché on the Windows cluster node. Each time you install an instance on a new node that is part of a Windows cluster, you must change the default automatic startup setting. Navigate to the [Home] > [Configuration] > [Memory and Startup]

page of the System Management Portal and clear the Start Caché on System Boot check box to prevent automatic startup; this allows the cluster manager to start Caché.

Following the installation, you can remove the shortcuts from the Windows Startup folder (C:\Documents and Settings\All Users\Start Menu\Programs\Startup) that start the Caché Cube on Windows login. The shortcut has the name you gave the instance when you installed it (CACHE, for example). The recommended best practice is to manage the cluster remotely from the Cube on a workstation connecting to the cluster IP address. If you choose to use the Cube locally from the desktop of one of the cluster nodes, be aware that certain configuration changes require a Caché restart, and if you restart Caché outside the context of the cluster administrator, the cluster declares the group failed and attempts failover.

Create a Caché Cluster Resource

On Windows Server 2003 and later, Caché automatically adds a new resource type, ISCCres2003, to the Cluster Administrator when you install on an active Windows cluster node. To add a Caché cluster resource of this type, perform the following steps:

1. From the Cluster Administrator, right-click the group name, CacheA Group, point to New, and click Resource.
2. Select ISCCres2003 as the Resource type and click Next.
3. Enter the resource name, CacheA_controller in this example, and the Caché instance name you entered at installation, CacheA in this example.
4. Update as necessary the settings on the Dependencies properties page: click Modify to enter a dependency, then enter Disk S: in the Name column and Physical Disk in the Resource Type column, and click OK.
5. Verify, and update as necessary, the following settings on the controller Advanced properties page:
   - Clear the Affect the group check box.
   - Leave the default, 3, in the Threshold box.
   - Click Use value from resource type for both poll intervals.
6.
Verify that the instance name you entered in step 3 appears in the Instance box on the Parameters properties tab.

Once finished, the CacheA_controller cluster resource has the properties shown in the following property pages.
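The dependencies built in these procedures form a chain (CacheA_controller depends on Disk S:, which depends on CacheA_IP) that determines the order in which the cluster service brings resources online. A small model makes the ordering explicit; this is an illustration using the example names above, not cluster API code:

```python
# Illustrative model of the resource dependencies created above: the cluster
# service brings a resource online only after all of its dependencies are up.
deps = {
    "CacheA_IP": [],                   # IP address resource: no dependencies
    "Disk S:": ["CacheA_IP"],          # physical disk depends on the IP
    "CacheA_controller": ["Disk S:"],  # Caché resource depends on the disk
}

def online_order(deps):
    """Return a valid bring-online order (dependencies first)."""
    order, seen = [], set()
    def visit(r):
        if r in seen:
            return
        seen.add(r)
        for d in deps[r]:
            visit(d)
        order.append(r)
    for r in deps:
        visit(r)
    return order

print(online_order(deps))
# With Python 3.7+ dict ordering this prints:
# ['CacheA_IP', 'Disk S:', 'CacheA_controller']
```

The same chain explains the shutdown behavior: taking CacheA_controller offline stops Caché first, before the disk or IP address is released.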

[Figure: Cluster Resource General Properties]
[Figure: Cluster Resource Dependencies Properties]

[Figure: Cluster Resource Advanced Properties]
[Figure: Cluster Resource Parameters Properties]

9.3 Multiple Failover Cluster

You may also set up a cluster with multiple failover nodes using the same procedures described in the previous sections for the single failover cluster. The following shows a failover cluster on multiple nodes:

[Figure: Multiple Failover Cluster]

CLUNODE-1 and CLUNODE-2 are clustered together, and Caché is running on both nodes. During normal operations, the following conditions are true:

- Disk S is online on CLUNODE-1, and Disk T is online on CLUNODE-2.
- The CacheA instance runs on CLUNODE-1; the CacheB instance runs on CLUNODE-2.
- Instances CacheA and CacheB cannot directly access each other's cache.dat files; they can directly access only their own mounted cache.dat files.

With this type of setup, if CLUNODE-2 fails, your system looks like this:

[Figure: Multiple Failover Cluster with Node Failure]

Both CacheA and CacheB run on CLUNODE-1. Once you repair or replace CLUNODE-2, you can move your CacheB instance back to CLUNODE-2. If CLUNODE-1 were to fail, both CacheA and CacheB would run on CLUNODE-2. See the Setting Up a Multiple Failover Cluster section for a detailed example.

Setting Up a Multiple Failover Cluster

This section gives a simple example of the steps necessary to set up a cluster with more than one failover node. The steps are performed on the following:

- Tasks on CLUNODE-1
- Tasks on CLUNODE-2
- Tasks on Both Nodes

Tasks on CLUNODE-1

Perform the following steps on the first cluster node, CLUNODE-1:

1. Open the Cluster Administrator from the Windows Administrative Tools submenu. Verify that all drives that contain Caché files and databases are shared drives and that they are all online on CLUNODE-1.
2. Create a Cluster Group called CacheA Group.
3. Create an IP Address Resource for CacheA Group called CacheA_IP.
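The placement behavior described above follows directly from each group's preferred-owner list: a group runs on its first preferred owner that is up, so when a node fails, its group moves to the survivor, and when both nodes are up, each instance returns to its own node. A minimal sketch of that rule, using the example names (an illustration, not cluster service code):

```python
# Illustrative sketch of multiple-failover placement: each group (disk +
# instance) runs on the first of its preferred owners that is still up.
groups = {
    "CacheA Group": ["CLUNODE-1", "CLUNODE-2"],  # preferred owners, in order
    "CacheB Group": ["CLUNODE-2", "CLUNODE-1"],
}

def placement(groups, up_nodes):
    """Map each group to the first preferred owner that is up, else None."""
    return {g: next((n for n in owners if n in up_nodes), None)
            for g, owners in groups.items()}

# Normal operations: each instance runs on its own node.
print(placement(groups, {"CLUNODE-1", "CLUNODE-2"}))
# CLUNODE-2 fails: both groups run on CLUNODE-1.
print(placement(groups, {"CLUNODE-1"}))
```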


More information

CA XOsoft Replication for Windows

CA XOsoft Replication for Windows CA XOsoft Replication for Windows Microsoft SQL Server Operation Guide r12.5 This documentation and any related computer software help programs (hereinafter referred to as the Documentation ) is for the

More information

Maximum Availability Architecture. Oracle Best Practices For High Availability. Backup and Recovery Scenarios for Oracle WebLogic Server: 10.

Maximum Availability Architecture. Oracle Best Practices For High Availability. Backup and Recovery Scenarios for Oracle WebLogic Server: 10. Backup and Recovery Scenarios for Oracle WebLogic Server: 10.3 An Oracle White Paper January, 2009 Maximum Availability Architecture Oracle Best Practices For High Availability Backup and Recovery Scenarios

More information

HyperFS PC Client Tools

HyperFS PC Client Tools SAN Management Software HyperFS PC Client Tools This guide provides step-by-step instructions for setup, configuration, and maintenance of the Rorke Data HyperFS SAN Management Software Ver 2.1 May 11,

More information

Outline. Failure Types

Outline. Failure Types Outline Database Management and Tuning Johann Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 11 1 2 Conclusion Acknowledgements: The slides are provided by Nikolaus Augsten

More information

About Backing Up a Cisco Unity System

About Backing Up a Cisco Unity System CHAPTER 4 Introduction This chapter describes in general terms backing up a Cisco Unity system. When you back up a Cisco Unity server (and one or more Exchange servers) you need to consider the same issues

More information

Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide

Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide Windows 2000, Windows Server 2003 5.0 11293743 Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide Copyright

More information

ABSTRACT. February, 2014 EMC WHITE PAPER

ABSTRACT. February, 2014 EMC WHITE PAPER EMC APPSYNC SOLUTION FOR MANAGING PROTECTION OF MICROSOFT SQL SERVER SLA-DRIVEN, SELF-SERVICE CAPABILITIES FOR MAXIMIZING AND SIMPLIFYING DATA PROTECTION AND RECOVERABILITY ABSTRACT With Microsoft SQL

More information

CA ARCserve Replication and High Availability for Windows

CA ARCserve Replication and High Availability for Windows CA ARCserve Replication and High Availability for Windows Microsoft SQL Server Operation Guide r15 This documentation and any related computer software help programs (hereinafter referred to as the "Documentation")

More information

SQL Server Database Administrator s Guide

SQL Server Database Administrator s Guide SQL Server Database Administrator s Guide Copyright 2011 Sophos Limited. All rights reserved. No part of this publication may be reproduced, stored in retrieval system, or transmitted, in any form or by

More information

Backup and Redundancy

Backup and Redundancy Backup and Redundancy White Paper NEC s UC for Business Backup and Redundancy allow businesses to operate with confidence, providing security for themselves and their customers. When a server goes down

More information

Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008

Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008 Best Practices Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008 Installation and Configuration Guide 2010 LSI Corporation August 13, 2010

More information

Disaster Recovery for Oracle Database

Disaster Recovery for Oracle Database Disaster Recovery for Oracle Database Zero Data Loss Recovery Appliance, Active Data Guard and Oracle GoldenGate ORACLE WHITE PAPER APRIL 2015 Overview Oracle Database provides three different approaches

More information

Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide

Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide Windows Server 2003, Windows Server 2008 5.1 Veritas Cluster Server Database Agent for Microsoft SQL Configuration Guide Copyright

More information

System i and System p. Customer service, support, and troubleshooting

System i and System p. Customer service, support, and troubleshooting System i and System p Customer service, support, and troubleshooting System i and System p Customer service, support, and troubleshooting Note Before using this information and the product it supports,

More information

MapGuide Open Source Repository Management Back up, restore, and recover your resource repository.

MapGuide Open Source Repository Management Back up, restore, and recover your resource repository. MapGuide Open Source Repository Management Back up, restore, and recover your resource repository. Page 1 of 5 Table of Contents 1. Introduction...3 2. Supporting Utility...3 3. Backup...4 3.1 Offline

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

CA ARCserve Backup for Windows

CA ARCserve Backup for Windows CA ARCserve Backup for Windows Enterprise Option for SAP R/3 for Oracle Guide r15 This documentation and any related computer software help programs (hereinafter referred to as the "Documentation") are

More information

PN 00651. Connect:Enterprise Secure FTP Client Release Notes Version 1.2.00

PN 00651. Connect:Enterprise Secure FTP Client Release Notes Version 1.2.00 PN 00651 Connect:Enterprise Secure FTP Client Release Notes Version 1.2.00 Connect:Enterprise Secure FTP Client Release Notes Version 1.2.00 First Edition This documentation was prepared to assist licensed

More information

CA ARCserve Backup for Windows

CA ARCserve Backup for Windows CA ARCserve Backup for Windows Agent for Sybase Guide r16.5 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation

More information

IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE

IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE White Paper IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE Abstract This white paper focuses on recovery of an IBM Tivoli Storage Manager (TSM) server and explores

More information

High Availability Essentials

High Availability Essentials High Availability Essentials Introduction Ascent Capture s High Availability Support feature consists of a number of independent components that, when deployed in a highly available computer system, result

More information

Chapter 14: Recovery System

Chapter 14: Recovery System Chapter 14: Recovery System Chapter 14: Recovery System Failure Classification Storage Structure Recovery and Atomicity Log-Based Recovery Remote Backup Systems Failure Classification Transaction failure

More information

Monitoring Ensemble. Version 2014.1 25 March 2014. InterSystems Corporation 1 Memorial Drive Cambridge MA 02142 www.intersystems.

Monitoring Ensemble. Version 2014.1 25 March 2014. InterSystems Corporation 1 Memorial Drive Cambridge MA 02142 www.intersystems. Monitoring Ensemble Version 2014.1 25 March 2014 InterSystems Corporation 1 Memorial Drive Cambridge MA 02142 www.intersystems.com Monitoring Ensemble Ensemble Version 2014.1 25 March 2014 Copyright 2014

More information

Cloud Server. Parallels. Key Features and Benefits. White Paper. www.parallels.com

Cloud Server. Parallels. Key Features and Benefits. White Paper. www.parallels.com Parallels Cloud Server White Paper Key Features and Benefits www.parallels.com Table of Contents Introduction... 3 Key Features... 3 Distributed Cloud Storage (Containers and Hypervisors)... 3 Rebootless

More information

Disaster Recovery Configuration Guide for CiscoWorks Network Compliance Manager 1.8

Disaster Recovery Configuration Guide for CiscoWorks Network Compliance Manager 1.8 Disaster Recovery Configuration Guide for CiscoWorks Network Compliance Manager 1.8 Americas Headquarters Cisco Systems, Inc. 170 West Tasman Drive San Jose, CA 95134-1706 USA http://www.cisco.com Tel:

More information

Backup and Recovery. What Backup, Recovery, and Disaster Recovery Mean to Your SQL Anywhere Databases

Backup and Recovery. What Backup, Recovery, and Disaster Recovery Mean to Your SQL Anywhere Databases Backup and Recovery What Backup, Recovery, and Disaster Recovery Mean to Your SQL Anywhere Databases CONTENTS Introduction 3 Terminology and concepts 3 Database files that make up a database 3 Client-side

More information

Using RAID Admin and Disk Utility

Using RAID Admin and Disk Utility Using RAID Admin and Disk Utility Xserve RAID Includes instructions for creating RAID arrays and monitoring Xserve RAID systems K Apple Computer, Inc. 2003 Apple Computer, Inc. All rights reserved. Under

More information

BEST PRACTICES FOR PROTECTING MICROSOFT EXCHANGE DATA

BEST PRACTICES FOR PROTECTING MICROSOFT EXCHANGE DATA BEST PRACTICES FOR PROTECTING MICROSOFT EXCHANGE DATA Bill Webster September 25, 2003 VERITAS ARCHITECT NETWORK TABLE OF CONTENTS Introduction... 3 Exchange Data Protection Best Practices... 3 Application

More information

ALTIRIS Software Delivery Solution for Windows 6.1 SP3 Product Guide

ALTIRIS Software Delivery Solution for Windows 6.1 SP3 Product Guide ALTIRIS Software Delivery Solution for Windows 6.1 SP3 Product Guide Notice Altiris Software Delivery Solution for Windows 6.1 SP3 Product Guide 2007 Altiris, Inc. All rights reserved. Document Date: February

More information

25 Backup and Restoring of the Database

25 Backup and Restoring of the Database 25 Backup and Restoring of the Database Introduction 4D includes a full database backup and restore module. This module allows backing up a database currently in use without having to exit it. Each backup

More information

Backup Strategies for Integrity Virtual Machines

Backup Strategies for Integrity Virtual Machines Backup Strategies for Integrity Virtual Machines Introduction...2 Basic Aspects of Data Protection for Virtual Environments...2 Backup and Recovery from the VM Host System...3 Backup and Recovery of Individual

More information

EMC MID-RANGE STORAGE AND THE MICROSOFT SQL SERVER I/O RELIABILITY PROGRAM

EMC MID-RANGE STORAGE AND THE MICROSOFT SQL SERVER I/O RELIABILITY PROGRAM White Paper EMC MID-RANGE STORAGE AND THE MICROSOFT SQL SERVER I/O RELIABILITY PROGRAM Abstract This white paper explains the integration of EMC Mid-range Storage arrays with the Microsoft SQL Server I/O

More information

Data Recovery and High Availability Guide and Reference

Data Recovery and High Availability Guide and Reference IBM DB2 Universal Database Data Recovery and High Availability Guide and Reference Version 8 SC09-4831-00 IBM DB2 Universal Database Data Recovery and High Availability Guide and Reference Version 8 SC09-4831-00

More information

Business Continuity: Choosing the Right Technology Solution

Business Continuity: Choosing the Right Technology Solution Business Continuity: Choosing the Right Technology Solution Table of Contents Introduction 3 What are the Options? 3 How to Assess Solutions 6 What to Look for in a Solution 8 Final Thoughts 9 About Neverfail

More information

CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server

CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server CA RECOVERY MANAGEMENT R12.5 BEST PRACTICE CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server Overview Benefits The CA Advantage The CA ARCserve Backup Support and Engineering

More information

Getting Started with Endurance FTvirtual Server

Getting Started with Endurance FTvirtual Server Getting Started with Endurance FTvirtual Server Marathon Technologies Corporation Fault and Disaster Tolerant Solutions for Windows Environments Release 6.1.1 June 2005 NOTICE Marathon Technologies Corporation

More information

How to Configure Double-Take on Microsoft Exchange Server

How to Configure Double-Take on Microsoft Exchange Server High Availability for Exchange Server 5.0 and 5.5 Using Double-Take 4.x High Availability for Exchange Server 5.0 and 5.5 Using Double-Take 4.x published August 2002 NSI and Double-Take are registered

More information

Areas Covered. Chapter 1 Features (Overview/Note) Chapter 2 How to Use WebBIOS. Chapter 3 Installing Global Array Manager (GAM)

Areas Covered. Chapter 1 Features (Overview/Note) Chapter 2 How to Use WebBIOS. Chapter 3 Installing Global Array Manager (GAM) PRIMERGY RX300 S2 Onboard SCSI RAID User s Guide Areas Covered Chapter 1 Features (Overview/Note) This chapter explains the overview of the disk array and features of the SCSI array controller. Chapter

More information

Destiny system backups white paper

Destiny system backups white paper Destiny system backups white paper Establishing a backup and restore plan for Destiny Overview It is important to establish a backup and restore plan for your Destiny installation. The plan must be validated

More information

How To Backup A Database In Navision

How To Backup A Database In Navision Making Database Backups in Microsoft Business Solutions Navision MAKING DATABASE BACKUPS IN MICROSOFT BUSINESS SOLUTIONS NAVISION DISCLAIMER This material is for informational purposes only. Microsoft

More information

Ckpdb and Rollforwarddb commands

Ckpdb and Rollforwarddb commands Ckpdb and Rollforwarddb commands Backup and Restoration of Ingres databases Created: November 2008 Category: Ingres Sandyxsystems.co.uk Copyright 2008 Page 1 of 5 Introduction All Ingres database administrators

More information

Real-time Protection for Hyper-V

Real-time Protection for Hyper-V 1-888-674-9495 www.doubletake.com Real-time Protection for Hyper-V Real-Time Protection for Hyper-V Computer virtualization has come a long way in a very short time, triggered primarily by the rapid rate

More information

SQL-BackTrack the Smart DBA s Power Tool for Backup and Recovery

SQL-BackTrack the Smart DBA s Power Tool for Backup and Recovery SQL-BackTrack the Smart DBA s Power Tool for Backup and Recovery by Diane Beeler, Consulting Product Marketing Manager, BMC Software and Mati Pitkanen, SQL-BackTrack for Oracle Product Manager, BMC Software

More information

Hitachi NAS Blade for TagmaStore Universal Storage Platform and Network Storage Controller NAS Blade Error Codes User s Guide

Hitachi NAS Blade for TagmaStore Universal Storage Platform and Network Storage Controller NAS Blade Error Codes User s Guide Hitachi NAS Blade for TagmaStore Universal Storage Platform and Network Storage Controller NAS Blade Error Codes User s Guide MK-95RD280-03 2006 Hitachi, Ltd., Hitachi Data Systems Corporation, ALL RIGHTS

More information

Open Systems SnapVault (OSSV) Best Practices Guide

Open Systems SnapVault (OSSV) Best Practices Guide Technical Report Open Systems SnapVault (OSSV) Best Practices Guide TR-3466 Revised for OSSV 3.0 ABSTRACT This document is a guide to help aid in the understanding and deployment of Open Systems SnapVault

More information

High Availability for Exchange Server 5.5 Using Double-Take

High Availability for Exchange Server 5.5 Using Double-Take High Availability for Exchange Server 5.5 Using Double-Take High Availability for Exchange Server 5.5 Using Double-Take Revision 3.2.0 published August 2004 Double-Take, GeoCluster and NSI are registered

More information

Ensemble X12 Development Guide

Ensemble X12 Development Guide Ensemble X12 Development Guide Version 2013.1 24 April 2013 InterSystems Corporation 1 Memorial Drive Cambridge MA 02142 www.intersystems.com Ensemble X12 Development Guide Ensemble Version 2013.1 24 April

More information

HP Array Configuration Utility User Guide

HP Array Configuration Utility User Guide HP Array Configuration Utility User Guide January 2006 (First Edition) Part Number 416146-001 Copyright 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change

More information

IBM Software Information Management. Scaling strategies for mission-critical discovery and navigation applications

IBM Software Information Management. Scaling strategies for mission-critical discovery and navigation applications IBM Software Information Management Scaling strategies for mission-critical discovery and navigation applications Scaling strategies for mission-critical discovery and navigation applications Contents

More information

EonStor DS remote replication feature guide

EonStor DS remote replication feature guide EonStor DS remote replication feature guide White paper Version: 1.0 Updated: Abstract: Remote replication on select EonStor DS storage systems offers strong defense against major disruption to IT continuity,

More information

Using Continuous Operations Mode for Proper Backups

Using Continuous Operations Mode for Proper Backups Using Continuous Operations Mode for Proper Backups A White Paper From Goldstar Software Inc. For more information, see our web site at Using Continuous Operations Mode for Proper Backups Last Updated:

More information

CA Nimsoft Monitor. Probe Guide for Apache HTTP Server Monitoring. apache v1.5 series

CA Nimsoft Monitor. Probe Guide for Apache HTTP Server Monitoring. apache v1.5 series CA Nimsoft Monitor Probe Guide for Apache HTTP Server Monitoring apache v1.5 series Legal Notices This online help system (the "System") is for your informational purposes only and is subject to change

More information