Caché High Availability Guide


Caché High Availability Guide
August 2012
InterSystems Corporation, 1 Memorial Drive, Cambridge MA

Caché High Availability Guide
Caché Version, August 2012
Copyright 2012 InterSystems Corporation. All rights reserved.

This book was assembled and formatted in Adobe Page Description Format (PDF) using tools and information from the following sources: Sun Microsystems, RenderX, Inc., Adobe Systems, and the World Wide Web Consortium. The primary document development tools were special-purpose XML-processing applications built by InterSystems using Caché and Java.

Caché WEBLINK, Distributed Cache Protocol, M/SQL, M/NET, and M/PACT are registered trademarks of InterSystems Corporation. InterSystems Jalapeño Technology, Enterprise Cache Protocol, ECP, and InterSystems Zen are trademarks of InterSystems Corporation. All other brand or product names used herein are trademarks or registered trademarks of their respective companies or organizations.

This document contains trade secret and confidential information which is the property of InterSystems Corporation, One Memorial Drive, Cambridge, MA 02142, or its affiliates, and is furnished for the sole purpose of the operation and maintenance of the products of InterSystems Corporation. No part of this publication is to be used for any other purpose, and this publication is not to be reproduced, copied, disclosed, transmitted, stored in a retrieval system or translated into any human or computer language, in any form, by any means, in whole or in part, without the express prior written consent of InterSystems Corporation.

The copying, use and disposition of this document and the software programs described herein is prohibited except to the limited extent set forth in the standard software license agreement(s) of InterSystems Corporation covering such programs and related documentation. InterSystems Corporation makes no representations and warranties concerning such software programs other than those set forth in such standard software license agreement(s). In addition, the liability of InterSystems Corporation for any losses or damages relating to or arising out of the use of such software programs is limited in the manner set forth in such standard software license agreement(s).

THE FOREGOING IS A GENERAL SUMMARY OF THE RESTRICTIONS AND LIMITATIONS IMPOSED BY INTERSYSTEMS CORPORATION ON THE USE OF, AND LIABILITY ARISING FROM, ITS COMPUTER SOFTWARE. FOR COMPLETE INFORMATION REFERENCE SHOULD BE MADE TO THE STANDARD SOFTWARE LICENSE AGREEMENT(S) OF INTERSYSTEMS CORPORATION, COPIES OF WHICH WILL BE MADE AVAILABLE UPON REQUEST.

InterSystems Corporation disclaims responsibility for errors which may appear in this document, and it reserves the right, in its sole discretion and without notice, to make substitutions and modifications in the products and practices described in this document.

For support questions about any InterSystems products, contact:
InterSystems Worldwide Customer Support
support@intersystems.com

Table of Contents

About This Book
1 System Failover Strategies
   1.1 No Failover Strategy
   1.2 Failover Cluster
   1.3 Concurrent Cluster
   1.4 ECP Cluster
2 ECP Failover
   2.1 ECP Recovery
   2.2 ECP and Caché Clusters
      Application Server Fails
      Data Server Fails
      Network Is Interrupted
      Cluster as an ECP Database Server
   2.3 ECP Clusters
3 Caché and Windows Clusters
   3.1 Setting up Failover Clusters
      Single Failover Cluster
      Multiple Failover Cluster
      CSP Gateway Considerations
   3.2 Example Procedures
      Create a Cluster Service
      Create a Client Access Point
      Create a Physical Disk Resource
      Install Caché
      Create a Caché Cluster Resource
4 Mirroring
   4.1 Configuring Mirroring
      Starting/Stopping ISCAgent
      Creating a Mirror
      Editing Mirror Configurations
      Adding Async Members to a Mirror
      Adding Databases to a Mirror
      Removing Mirror Configurations
      Disconnecting/Connecting Mirror Members
      Configuring an ECP Application Server to Connect to a Mirror
      Configuring a Mirror Virtual IP (VIP)
      Customizing the ISCAgent Port
      Customizing the ISCAgent Interface
      Customizing the ISCAgent User/Group on UNIX/Linux Systems
      Mirror Tunable Parameters
      ^ZMIRROR User-defined Routine
      Forms
   4.2 Caché Mirroring Concepts
      4.2.1 ISCAgent
      4.2.2 Async Mirror Member
      4.2.3 Mirror Synchronization
      Communication Channels
      The Failover Process
      Sample Configurations
   4.3 Mirroring Special Considerations
      General Mirroring Considerations
      Database Considerations
      Hardware Considerations
      Network Considerations
      Network Interface Considerations
      Ensemble Considerations
   4.4 Disaster Recovery
      Switch Production to an Async Mirror Member
      Restore the Databases and Reestablish the Mirror
Appendix A: Using Red Hat Enterprise Linux Clusters with Caché
   A.1 Hardware Configuration
   A.2 Red Hat Linux and Cluster Suite Software
      A.2.1 Red Hat Linux
      A.2.2 Red Hat Cluster Suite
   A.3 Installing and Using the <cache /> Resource
      A.3.1 Installing the cache.sh Script
      A.3.2 Patching the cluster.rng Script
      A.3.3 Using the <cache /> Resource
   A.4 Installing Caché in the Cluster
      A.4.1 Installing a Single Instance of Caché
      A.4.2 Installing Multiple Instances of Caché
   A.5 Application Considerations
   A.6 Testing and Maintenance
      A.6.1 Failure Testing
      A.6.2 Software and Firmware Updates
      A.6.3 Monitor Logs
Appendix B: Using IBM PowerHA SystemMirror with Caché
   B.1 Hardware Configuration
   B.2 IBM PowerHA SystemMirror Configuration
   B.3 Install Caché in the Cluster
      B.3.1 Installing a Single Instance of Caché in the Cluster
      B.3.2 Installing Multiple Instances of Caché in the Cluster
      B.3.3 Application Controllers and Monitors
      B.3.4 Application Considerations
   B.4 Test and Maintenance
Appendix C: Using HP Serviceguard with Caché
   C.1 Hardware Configuration
   C.2 HP-UX and HP Serviceguard Configuration
   C.3 Install Caché in the Cluster
      C.3.1 Installing a Single Instance of Caché in the Cluster
      C.3.2 Installing Multiple Instances of Caché in the Cluster
      C.3.3 Special Considerations
   C.4 Test and Maintenance

List of Figures

Figure 1-1: Failover Cluster Configuration
Figure 1-2: Concurrent Cluster Configuration
Figure 1-3: ECP Cluster Configuration
Figure 3-1: Single Failover Cluster
Figure 3-2: Failover Cluster with Node Failure
Figure 3-3: Multiple Failover Cluster
Figure 3-4: Multiple Failover Cluster with Node Failure
Figure 3-5: Physical Disk Dependency Properties
Figure 3-6: Cluster Resource General Properties
Figure 3-7: Cluster Resource Dependencies Properties
Figure 3-8: Cluster Resource Policies Properties
Figure 3-9: Cluster Resource Advanced Policies Properties
Figure 3-10: Cluster Resource Parameters Properties
Figure 4-1: Mirror
Figure 4-2: Async Member Connected to Multiple Mirrors
Figure 4-3: Multiple Async Members Connected to Single Mirror
Figure 4-4: Mirror Communication Channels
Figure 4-5: Status of Systems
Figure 4-6: Status of Systems
Figure 4-7: Sample Configuration: Direct Connect
Figure 4-8: Sample Configuration: Networked Through External Ethernet Switch

List of Tables

Table 4-1: Mirror System Tunable Options
Table 4-2: Mirror Configuration Details Form (Part 1): Mirror and First Failover Member (Configuration Items)
Table 4-3: Mirror Configuration Details Form (Part 2): Second Failover Member (Configuration Items)
Table 4-4: Mirroring Processes on Primary Failover Member
Table 4-5: Mirroring Processes on Backup Failover/Async Member

About This Book

As organizations rely more and more on computer applications, it is vital to safeguard the contents of databases. This guide explains the mechanisms that Caché provides to maintain a highly available and reliable system. It describes strategies for recovering quickly from system failures while maintaining the integrity of your data.

Caché offers several mechanisms for maintaining high availability, including shadow journaling and a set of recommended failover strategies based on Caché ECP (Enterprise Cache Protocol) and clustering. The networking capabilities of Caché can be customized to allow cluster failover.

The following topics are addressed:
- System Failover Strategies
- ECP Failover
- Caché and Windows Clusters
- Mirroring

This guide also contains the following platform-specific appendixes:
- Using Red Hat Enterprise Linux Clusters with Caché
- Using IBM PowerHA SystemMirror with Caché
- Using HP Serviceguard with Caché

For detailed information, see the Table of Contents. For general information, see Using InterSystems Documentation.


1 System Failover Strategies

Caché fits into all common high-availability configurations supplied by operating system providers including Microsoft, IBM, HP, and EMC. Caché provides easy-to-use, often automatic, mechanisms that integrate easily with the operating system to provide high availability.

There are four general approaches to system failover. In order of increasing availability they are:
- No Failover Strategy
- Failover Cluster
- Concurrent Cluster
- ECP Cluster

Each strategy has varying recovery time, expense, and user impact, as outlined in the following table.

Approach             | Recovery Time | Expense             | User Impact
No Failover Strategy | Unpredictable | No cost to low cost | High
Failover Cluster     | Minutes       | Moderate            | Moderate
Concurrent Cluster   | Seconds       | Moderate to high    | Low
ECP Cluster          | Immediate     | Moderate to high    | None

There are variations on these strategies; for example, many large enterprise clients have implemented an ECP cluster and also use a failover cluster for disaster recovery. It is important to differentiate between failover and disaster recovery: failover is a methodology to resume system availability in an acceptable period of time, while disaster recovery is a methodology to resume system availability when all failover strategies have failed.

If you require further information to help you develop a failover and backup strategy tailored for your environment, or to review your current practices, please contact the InterSystems Worldwide Response Center (WRC).

1.1 No Failover Strategy

Even with no failover strategy in place, your Caché database integrity is still protected from production system failure. Structural database integrity is maintained by Caché write image journal (WIJ) technology; you cannot disable this. Logical integrity is maintained through global journaling and transaction processing. While global journaling can be disabled and transaction processing is optional, InterSystems highly recommends using them.

If a production system failure occurs, such as a hardware failure, the database and application are generally unaffected. Disk degradation, of course, is an exception; disk redundancy and good backup procedures are vital to mitigate problems arising from disk failure. With no failover strategy in place, system failures can result in significant downtime, depending on the cause and your ability to isolate and resolve it. If a CPU has failed, you replace it and restart, while application users wait for the system to become available. For many applications that are not business-critical this risk may be acceptable.

Customers that adopt this approach share the following common traits:
- Clear and detailed operational recovery procedures
- Well-trained, responsive staff
- Ability to replace hardware quickly
- Disk redundancy (RAID and/or disk mirroring)
- Enabled global journaling
- 24x7 maintenance contracts with all vendors
- Application users who tolerate moderate downtime
- Management acceptance of the risk of an extended outage

Some clients cannot afford to purchase adequate redundancy to achieve higher availability. With these clients in mind, InterSystems strives to make Caché 100% reliable.

1.2 Failover Cluster

A common and often inexpensive approach to recovery after failure is to maintain a standby system that assumes the production workload in the event of a production system failure. A typical configuration has two identical computers with shared access to a disk subsystem. After a failure, the standby system takes over the applications formerly running on the failed system.

Microsoft Windows Clusters, HP MC/ServiceGuard, OpenVMS Clusters, and IBM HACMP provide a common approach for implementing a failover cluster. In these technologies, the standby system senses a heartbeat from the production system on a frequent and regular basis. If the heartbeat consistently stops for a period of time, the standby system automatically assumes the IP address and the disk formerly associated with the failed system. The standby can then run any applications (Caché, for example) that were on the failed system.

In this scenario, when the standby system takes over the application, it executes a pre-configured start script to bring the databases online. Users can then reconnect to the databases that are now running on the standby server. Again, the WIJ, global journaling, and transaction processing are used to maintain structural and data integrity. Customers generally configure the failover server to mirror the main server, with identical CPU and memory capacity, so that it can sustain production workloads for an extended period of time. The following diagram depicts a common configuration:

Figure 1-1: Failover Cluster Configuration

Shadow journaling, where the production journal file is continuously applied to a standby database, includes inherent latency and is therefore not recommended as an approach to high availability. Any use of a shadow system for availability or disaster recovery needs should take these latency issues into consideration.

1.3 Concurrent Cluster

The concurrent cluster approach uses a standby system that is immediately available to accept user connections after a production system failure. This type of failover requires the concurrent access to disk files provided, for example, by OpenVMS clusters. In this type of failover, two or more servers, each running an instance of Caché and each with access to all disks, concurrently provide access to all data. If one machine fails, users can immediately reconnect to the cluster of servers.

A simple example is a group of OpenVMS servers with cluster-mounted disks. Each server has an instance of Caché running. If one server fails, the users can reconnect to another server and begin working again.

Figure 1-2: Concurrent Cluster Configuration

State                | A         | B         | C
Normal               | 300 users | 300 users | 300 users
B fails              | 300 users | 0 users   | 300 users
B users log on again | 450 users | 0 users   | 450 users

The 600 users on A and C are unaware of B's failure, but the 300 users that were on the failed server are affected.

1.4 ECP Cluster

The ECP cluster approach can be complicated and expensive, but comes closest to ensuring 100% uptime. It requires the same degree of failover as for a failover cluster or concurrent cluster, but also requires that the state of a running user process be preserved to allow the process to resume on a failover server. One approach, for example, uses a three-tier configuration of clients and servers.

Figure 1-3: ECP Cluster Configuration

Users connect directly, or via Web servers, to a bank of ECP application servers. If an application server fails, either new user traffic is routed to the surviving application servers, or failover cluster techniques automatically start a standby server that can then accept the traffic. In turn, the ECP application servers are connected to ECP data servers clustered in a failover cluster or concurrent cluster. If a data server fails, any application server waiting for a response has its request answered by a surviving member of the cluster, based on the guarantees of a failover cluster or concurrent cluster.
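None of this tiering is visible to application code: a routine running on an ECP application server reads, writes, and locks remote data with ordinary global references, while ECP handles the caching, distribution, and recovery described in the next chapter. The following Caché ObjectScript fragment is only an illustration; the ^Sales global, and its mapping to a namespace served by a remote ECP data server, are hypothetical:

    NewSale(id,amount) ; runs unchanged on any application server in the bank
        lock +^Sales(id)                ; lock request is coordinated by the ECP data server
        tstart                          ; transaction protects logical integrity
        set ^Sales(id,"amount")=amount
        set ^Sales(id,"when")=$horolog
        tcommit
        lock -^Sales(id)
        quit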


2 ECP Failover

One of the most powerful and unique features of Caché is the ability to efficiently distribute data and application logic among a number of server systems. The underlying technology behind this feature is the Enterprise Cache Protocol (ECP), a distributed data caching architecture that manages the distribution of data and locks among a heterogeneous network of server systems. ECP is an important part of an application failover strategy for high-availability systems.

This chapter describes how the architecture works to maintain high availability:
- ECP Recovery
- ECP and Caché Clusters
- ECP Clusters

For more detailed information about ECP, see the Caché Distributed Data Management Guide.

2.1 ECP Recovery

The simplest case of ECP recovery is a temporary network interruption that is long enough to be noticed, but short enough that the underlying TCP connection stays active during the outage. During the outage, the application server (or client) notices that the connection is nonresponsive and blocks new network requests for that connection. Once the connection resumes, processes that were blocked are able to send their pending requests. If the underlying TCP connection is reset, the data server waits for a reconnection for a configurable timeout (one minute by default). If the client does not succeed in reconnecting during that interval, all the work done by the previous connection is rolled back and the connection request is converted into a request for a brand new connection.

A more complex case is where the network outage is severe enough to reset the underlying TCP connection, but both client and data server stay up throughout the outage, and the client reconnects within the data server's reconnection window. On reconnection, the main action that must be performed is to flush (or, eventually, re-validate) the client's cache of downloaded blocks and the client's cache of downloaded routines. In addition, the client keeps a queue of locks to remove and transactions to roll back once the connection is reestablished. Because of this queue, a process can always halt right away when it wants to, whether or not the servers on which it has pending transactions and locks are currently available. Connection recovery is careful to complete any pending Set and Kill operations that had been queued for the data server before the network outage was detected, before it completes the delayed release of locks.

Finally, there is the case where the data server shut down, either gracefully or as a result of a crash. In this case, recovery involves several more steps on the data server, some of which involve the data server journal file in very important ways. The result of these steps is that:

- The data server's view of the current active transactions from each application server has been restored from the data server's journal file.
- The data server's view of the current active Lock operations from each application server has been restored, by having the application server upload those locks to the data server.
- The application server and the data server both agree on exactly which requests from the application server can be ignored (because it is certain they completed before the crash) and which ones should be replayed. Hence, the last recovery step is to simply let the pending network requests complete, but only those network requests that are safe to replay.
- Finally, the application server delivers to the data server any pending unlock or rollback indications that it saved from jobs that halted while the data server was restarting.

All guarantees are maintained, even in the face of sudden and unanticipated data server crashes, as long as the integrity of the databases (including the WIJ file and the journal files) is maintained. During the recovery of an ECP-configured system, Caché guarantees a number of recoverable semantics, which are described in detail in the ECP Recovery Guarantees section of the ECP Recovery Guarantees and Limitations appendix of the Caché Distributed Data Management Guide. There are limitations to these guarantees, which are described in detail in the ECP Recovery Limitations section of the same appendix.

2.2 ECP and Caché Clusters

Adding cluster nodes that utilize ECP is simple and straightforward. On each cluster node, configure the system as an ECP data server by enabling the ECP service from the [System] > [Security Management] > [Services] page: click %Service_ECP and select the Service enabled check box. This is the only configuration setting required to use the node as an ECP data server (a scripted alternative is sketched below). If you add a new member to the cluster, Caché does not need to change the network configuration on every running member.

Only lock and increment requests are delivered over the internal cluster ECP connection; no data blocks are sent from the master to the other cluster members, as they would be in ECP without clusters. When cluster failover happens, the cluster member asks the new master to do ECP recovery. A cluster member creates ECP connections to the other cluster members so it can access the privately-mounted databases on the other members. If you configure a node to be an ECP application server, that connection is not used as the cluster connection; it works as a regular ECP connection. Though this connection can also be used for accessing cluster-mounted databases, cluster failover does not recover the connection.

If this is the first member to join the cluster, it is the master. Each node that joins the cluster does the following:
- Retrieves connection information (IP address and port) for each cluster member from the PIJ file.
- Validates the connection to each existing member to ensure cluster failover success.
- Allocates a system number (an index into the netnode array) and sets up a null system name for that system number.
- Initializes the netnode structure with an ECP connection using the IP address and port from the master entry in the PIJ file, and puts the connections in the Not Connected state.

Caché declares any ECP connection from a failing cluster member Disabled and releases all resources for that connection, including the locks in the lock table owned by the failed system.
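For administrators who prefer to script the service-enabling step rather than use the Portal, the ECP service can also be enabled from a terminal session in the %SYS namespace. The following is a minimal sketch, assuming the Security.Services API behaves as shown; verify the class and property names against your Caché version before relying on it:

    EnableECP() ; enable the %Service_ECP service (run in the %SYS namespace)
        new props,sc
        set props("Enabled")=1
        set sc=##class(Security.Services).Modify("%Service_ECP",.props)
        if 'sc do $system.Status.DisplayError(sc)   ; report any error status
        quit sc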
The following sections outline what happens on a clustered system using the ECP architecture under each of the following conditions:
- Application Server Fails

- Data Server Fails
- Network Is Interrupted
- Cluster as an ECP Database Server

Application Server Fails

If the data server becomes aware that the application server has halted, crashed, disconnected, or otherwise declared the connection dead, the data server declares the connection dead. When a data server declares a connection dead, it rolls back any open transactions for that application server and releases any locks held for that application server. It is then available for a new connection from that application server.

During application server recovery, the application server retransmits any requests it had previously sent for which it had not yet received a response, and, in the case of a data server crash, it also transmits any locks it owned on the data server. Following this phase, the users on the application server resume operations without any noticeable effect other than the pause: no data is lost or rolled back, and no application server user processes get errors. However, a few processes that were waiting for a $Increment function, or a $Bit function that sets or clears a bit and returns the former value, may receive errors and have their open transactions rolled back.

Data Server Fails

If a data server crashes while application server connections are open, the data server does the following at restart, prior to allowing general system usage:
- Attempts to reestablish a connection with the application servers that were active.
- Allows them to reestablish locks they had on the data server.
- Reprocesses earlier requests for which the answers were never received by the application servers.
- Declares the connection dead and rolls back open transactions for any application server not heard from during the startup phase.

Following a data server crash, an application server waits for 20 minutes (a configurable time limit) before declaring the connection dead. If the data server restarts during that time, the application server goes through a recovery phase and resumes operation.

Network Is Interrupted

If either the application server or the data server detects a network outage, the data server waits up to one minute (a configurable time limit) for the network to start working before declaring the connection dead. When the application server is not receiving responses to requests, and cannot determine whether there is a network outage or a problem with the data server, it waits for up to 20 minutes (configurable), which gives the system manager time to discover that the data server has crashed and restart it.

Once either the application server or the data server has declared a connection dead, there is no longer any ability to recover from the failure. In that case the data server rolls back any open transactions for that application server and releases its application locks. The application server is expected to issue <NETWORK> errors to any application processes that are still waiting for data. New attempts by the application server to use the network result in an attempt to create a new connection, so processing can resume when the problem is ultimately resolved. If the connection cannot be made, the application server process that made the request receives an error.
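Application code that runs on an ECP application server can anticipate these <NETWORK> errors and retry once a new connection has been established. The routine below is only an illustrative sketch (the ^Order global and the single-retry policy are hypothetical); the error-trapping pattern itself is ordinary Caché ObjectScript:

    ReadOrder(id) ; read a record from a global served by an ECP data server
        new value
        set value=""
        try {
            set value=$get(^Order(id))          ; ^Order assumed to be mapped over ECP
        } catch ex {
            if ex.Name'["NETWORK" throw ex      ; only handle <NETWORK> errors here
            hang 10                             ; wait, then retry once; the new request
            set value=$get(^Order(id))          ; triggers an attempt to create a new connection
        }
        quit value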

Cluster as an ECP Database Server

In the Caché-supported genuine cluster environments, namely OpenVMS, using your Caché cluster as an ECP database server has the following limitations:
- You must run ECP to particular members of a cluster, not to the cluster as a whole. Do not use the IP address of the cluster.
- If a cluster master fails, you must reconnect to the new master IP address.
- In this type of cluster, ECP is used as a lock transport mechanism, not to transfer data.

Master Fails

Caché does the following on a cluster member after its cluster master fails:
- Sets the state of any ECP connection from the failed master to Disabled and releases all resources for that connection, including all the locks in the lock table owned by the failed system.
- Sets the state of the cluster ECP connection to Trouble.
- Locates the IP address and port of the new master through the PIJ file.
- Creates a connection to the new master and performs the recovery procedure.

New Cluster Master

Caché does the following on a new cluster master after the previous cluster master fails:
- Sets the state of any ECP connections from the failed master to Disabled and releases all resources for that connection, including all locks in the lock table owned by the failed system.
- Stops all ECP daemons for the cluster ECP connection.
- Converts the remote locks granted to the old master into local locks.
- Removes all the pending cancel-lock entries to the old master in the lock table. Caché does not remove the pending unlock entries, because the process that issued the unlock request removes them eventually.
- Waits for all cluster members to upload their locks.
- Reissues the pending lock entries in the lock table by waking up the requesting processes.
- Sets the state of the cluster ECP connection to Disabled.

2.3 ECP Clusters

It is also possible to use ECP on third-party clustering platforms. Caché ECP Clusters is a high-availability feature that enables failover from one ECP data server to another, using operating-system-level clustering to detect a failed server. The following appendix describes the specifics on the indicated platform:
- Using Red Hat Enterprise Linux Clusters with Caché

3 Caché and Windows Clusters

Microsoft Windows operating systems do not support shared-all clusters:
- They do not offer a shared-resource cluster model.
- They do not allow simultaneous access to shared drives; you cannot lock, read, or write to a clustered drive from multiple nodes at once.
- If a drive fails, the operating system does not swap in a backup drive.

However, Microsoft Windows Server 2003 and Windows Server 2008 platforms allow you to cluster computers that share the same storage. You must have a RAID or SCSI disk drive system to do so.

This chapter contains the following subsections:
- Setting up Failover Clusters
- Example Procedures

3.1 Setting up Failover Clusters

This section provides an overview of the steps required to set up a cluster. For suggestions on other ways to run a large enterprise system, contact the InterSystems Worldwide Response Center (WRC).

To set up a failover cluster on a Windows Server platform, perform the following steps:

1. Configure the Microsoft cluster group on either cluster node and verify that it works. Depending on the Windows Server platform you are using, refer to the following Microsoft Support documentation for more information:
   - Available Features in Windows Server 2003 Clusters
   - Introducing Windows Server 2008 Failover Clustering

2. Install Caché on the shared disks on both cluster nodes, as follows:

   Important: If you are upgrading Caché, the installer stops Caché if it is running. In the Windows cluster, stopping Caché initiates a cluster group failover, which causes the physical disk resource to become unavailable and the upgrade to fail. Therefore, before installing a Caché upgrade in the Windows cluster, use Failover Cluster Management to take the Caché cluster resource offline (to stop Caché), then bring the physical disk resource online (to install the upgrade). Then continue with this procedure.

   a. Install Caché on the first cluster node on which the Microsoft cluster group is running.

      If you define two or more Caché instances in a single cluster resource group, they fail over simultaneously with the group.

   b. Configure Caché.

      Important: You must ensure that the default automatic startup setting is disabled; for information, see Memory and Startup Settings in the Configuring Caché chapter of the Caché System Administration Guide.

   c. Stop Caché.

      Important: Do not start and stop Caché from the Caché Launcher. Instead, using Failover Cluster Management, take the Caché cluster resource offline to stop Caché, and bring the Caché cluster resource online to start Caché.

   d. Move the cluster group to the second cluster node.

   e. Install Caché on the second cluster node. Caché instances (members of the same Caché failover cluster representing the same cluster group resource) must be installed in the same directories and use the same install options. Caché instances from the same group, but representing different cluster group resources, must use the shared disks and IP addresses of the same cluster group.

3. On either node, configure Caché clustering via the Failover Cluster Management option (in the Microsoft Windows Administrative Tools menu).

The supported cluster configurations are described in the following subsections:
- Single Failover Cluster
- Multiple Failover Cluster

Single Failover Cluster

The following illustration shows a single failover cluster:

Figure 3-1: Single Failover Cluster

CLUNODE-1 and CLUNODE-2 are clustered together, and Caché is running on one node. During normal operations, the following conditions are true:
- Disk S is online on CLUNODE-1, and CLUNODE-2 has no database disks online.
- The instance CacheA runs on CLUNODE-1; CLUNODE-2 is idle.

In this setup, if CLUNODE-1 fails, your system looks like this:

Figure 3-2: Failover Cluster with Node Failure

- Disk S is online on CLUNODE-2; CLUNODE-1 has no database disks online.
- The instance CacheA runs on CLUNODE-2; CLUNODE-1 is down.

See the Example Procedures section of this chapter for a detailed example.

Multiple Failover Cluster

You may also set up a cluster with multiple failover nodes using the same procedures described in the previous sections for the single failover cluster. The following shows a failover cluster on multiple nodes:

Figure 3-3: Multiple Failover Cluster

CLUNODE-1 and CLUNODE-2 are clustered together, and Caché is running on both nodes. During normal operations, the following conditions are true:
- Disk S is online on CLUNODE-1, and Disk T is online on CLUNODE-2.
- The CacheA instance runs on CLUNODE-1; the CacheB instance runs on CLUNODE-2.
- Instances CacheA and CacheB cannot directly access each other's cache.dat files; they can directly access only their own mounted cache.dat files.

With this type of setup, if CLUNODE-2 fails, your system looks like this:

Figure 3-4: Multiple Failover Cluster with Node Failure

Both CacheA and CacheB run on CLUNODE-1. Once you repair or replace CLUNODE-2, you can move your CacheB instance back to CLUNODE-2. If CLUNODE-1 were to fail, both CacheA and CacheB would run on CLUNODE-2.

See the Example Procedures section of this chapter for a detailed example.

CSP Gateway Considerations

For high-availability solutions running over CSP, InterSystems recommends that you use a hardware load balancer for load balancing and failover. InterSystems requires that you enable sticky session support in the load balancer (see your load balancer documentation for directions); this guarantees that once a session has been established between a given instance of the gateway and a given application server, all subsequent requests from that user run on the same pair. This configuration ensures that the session ID and server-side session context are always in sync; otherwise, a session may be created on one server while the next request from that user runs on a different system where the session is not present, which results in runtime errors (especially with hyperevents, which require the session key to decrypt the request). It is possible to configure a system to work without sticky sessions, but this requires that the CSP session global be mapped across all systems in the enterprise and can result in significant lock contention, so it is not recommended.

Caché protects server passwords in the CSP Gateway configuration file (CSP.ini) using Windows DPAPI encryption. The encryption functions work with either the machine store or the user store. The web server hosting the CSP Gateway operates within a protected environment where there is no available user profile on which to base the encryption; therefore, it must use the machine store. Consequently, it is not possible to decrypt a CSP Gateway password that was encrypted on another computer.

This creates a situation for clustered environments in which the CSP.ini file is on a shared drive and shared among multiple participating computers. Only the computer that actually performs the password encryption can decrypt it. It is not possible to move a CSP.ini file containing encrypted passwords to another computer; the password must be reentered and re-encrypted on the new machine. Here are some possible approaches to this issue:

- Use a machine outside of the cluster as the web server.
- Each time you fail over, reset the same password in the CSP Gateway.
- Configure each computer participating in the cluster so that it has its own copy of the CSP Gateway configuration file (CSP.ini) on a disk that does not belong to the cluster. Caché maintains the file in the directory hosting the CSP Gateway DLLs. Save and encrypt the password on each individual computer before introducing the node to the cluster. For example, where Disk C from each machine does not belong to the cluster and Caché is installed on Disk S, you may have the following:

  CLUNODE-1: C:\INSTANCEDIR\CSP\bin\CSP.ini with password XXX encrypted by CLUNODE-1
  CLUNODE-2: C:\INSTANCEDIR\CSP\bin\CSP.ini with password XXX encrypted by CLUNODE-2

- Disable password encryption by manually adding the following directive to the CSP.ini file before starting the CSP Gateway and adding the passwords:

  [SYSTEM]
  DPAPI=Disabled

See the CSP Gateway Configuration Guide for more information.

3.2 Example Procedures

This section describes common procedures in the cluster-building process. They apply to the single failover example, but you can adapt them to the additional steps in the multiple failover setup by replacing the CacheA names with the appropriate CacheB names.

Create a Cluster Service

To create a cluster service, do the following:

1. From Failover Cluster Management, right-click Services and Applications, then click Create Empty Service or Application.

2. In the New service or application Properties dialog box, on the General tab, enter the group name (in this example, CacheA Group) in the Name box.

3. Verify the preferred owners, for example, CLUNODE-1 and CLUNODE-2, as shown in the preceding graphic for CacheA Group, and click OK.

Create a Client Access Point

To create a client access point, do the following:

1. From Failover Cluster Management, right-click the group name (CacheA Group), click Add a resource, then add the IP address.

2. Right-click Resource Name, select Properties, then enter CacheA_IP as the name. If the initial (default) State is Offline, select Bring the resource online.

3. Assign the alias IP address that your users use to connect to the instance (CacheA). This is not the cluster or node IP, but a new and unique IP specific to the instance (CacheA); in this example, that address is the value of CacheA_IP.

Once finished, the CacheA_IP resource has the following properties:

CacheA_IP has no dependencies.

Create a Physical Disk Resource

To create a physical disk resource for the shared disk containing CacheA, do the following:

1. From Failover Cluster Management, right-click the group name (CacheA Group), and click Add Storage.

2. From the list of available disks, select the disk on the Windows cluster node on which Caché will be installed, then click OK.

3. On the Dependencies tab, verify, and update as necessary, the IP Address specified for the Resource as shown in the following figure, then click OK:

Figure 3-5: Physical Disk Dependency Properties

Install Caché

For information about installing Caché on the Windows cluster node, see Setting up Failover Clusters in this chapter and the Installing Caché on Microsoft Windows chapter of the Caché Installation Guide.

Each time you install an instance on a new node that is part of a Windows cluster, you must change the default automatic startup setting. Navigate to the [System] > [Configuration] > [Memory and Startup] page of the Management Portal and clear the Start Caché on System Boot check box to prevent automatic startup; this allows the cluster manager to start Caché. For more information, see Memory and Startup Settings in the Configuring Caché chapter of the Caché System Administration Guide.

Following the installation you can remove the shortcuts from the Windows Startup folder (C:\Documents and Settings\All Users\Start Menu\Programs\Startup) that start the Caché Launcher on Windows login. The shortcut has the name you gave the instance when you installed it (CACHE, for example). The recommended best practice is to manage the cluster remotely from the launcher on a workstation connecting to the cluster IP address. If you choose to use the launcher locally from the desktop of one of the cluster nodes, be aware that certain configuration changes require a Caché restart; if you restart Caché outside the context of the cluster administrator, the cluster declares the group failed and attempts failover.

Create a Caché Cluster Resource

On Windows Server 2003 and later, Caché automatically adds a new resource type, ISCCres2003, to Failover Cluster Management when you install on an active Windows cluster node. To add a Caché cluster resource of this type, perform the following steps:

1. From Failover Cluster Management, right-click the group name, CacheA Group, point to Add resource of type ISCCres2003, and then select Properties.

2. On the General tab, enter the resource name (CacheA_controller in this example), which is the name that is displayed by Failover Cluster Management.

3. On the Dependencies tab, update, as necessary, the following settings:
   - Click Insert to enter a dependency.
   - Enter Disk S: in the Name and Physical Disk in the Resource Type, then click OK.

4. On the Policies tab, update, as necessary, the following settings:
   - Clear the If restart is unsuccessful, fail over all resources in this service or application check box.
   - Adjust Pending timeout to allow for any extra time required for user shutdown procedures. You should consider increasing the default value by the amount of the ShutdownTimeout configured for the Caché instance.

5. On the Advanced Policies tab, verify and update, as necessary, the following settings for the controller:
   - From the Possible Owners list box, select cluster members on which Caché should be permitted to run.
   - In both the Basic resource health check interval and Thorough resource health check interval sections, select Use standard time period for the resource type.

6. On the Parameters tab, verify and update, as necessary, the following settings:
   - In the Instance text box, enter the name of the Caché instance controlled by the cluster resource (specified in step 3).
   - Enter a description of the Caché instance in the optional Comments field.

When you are finished, the CacheA_controller cluster resource has the following properties:

Figure 3-6: Cluster Resource General Properties

Figure 3-7: Cluster Resource Dependencies Properties

Figure 3-8: Cluster Resource Policies Properties

Figure 3-9: Cluster Resource Advanced Policies Properties

Figure 3-10: Cluster Resource Parameters Properties


4 Mirroring

Traditional availability and replication solutions often require substantial capital investments in infrastructure, deployment, configuration, software licensing, and planning. Caché Database Mirroring (Mirroring) is designed to provide an economical solution for rapid, reliable, robust, automatic failover between two Caché systems, making mirroring the ideal automatic-failover high-availability solution for the enterprise.

In addition to providing an availability solution for unplanned downtime, mirroring offers the flexibility to incorporate planned downtime (for example, Caché configuration changes, hardware or operating system upgrades, etc.) on a particular Caché system without impacting the overall Service Level Agreements (SLAs) for the organization. Combining InterSystems Enterprise Cache Protocol (ECP) application servers with mirroring provides an additional level of availability; application servers treat a failover as an ECP data server restart and allow processing to continue seamlessly on the new system once the failover is complete, thus greatly minimizing workflow and user disruption. Configuring the two failover mirror members in separate data centers offers additional redundancy and protection from catastrophic events.

Traditional availability solutions that rely on shared resources (such as shared disk) are often susceptible to a single point of failure with respect to that shared resource. Mirroring reduces that risk by maintaining independent components on the primary and backup mirror systems. Further, by utilizing logical data replication, mirroring reduces the potential risks associated with physical replication, such as out-of-order updates and carry-forward corruption, which are possible with other replication technologies such as SAN-based replication.

Finally, mirroring allows for special async members, which can be configured to receive updates from multiple mirrors across the enterprise. This allows a single system to act as a comprehensive Enterprise Data Warehouse, enabling enterprise-wide data mining and Business Intelligence using InterSystems DeepSee. The async member can also be deployed in a Disaster Recovery model in which a single mirror can update up to six geographically dispersed async members; this model provides a robust framework for distributed data replication, thus ensuring business continuity benefits to the organization. For more information, see Disaster Recovery in this chapter.

This chapter is divided into the following topics:
- Configuring Mirroring
- Caché Mirroring Concepts
- Mirroring Special Considerations
- Disaster Recovery

4.1 Configuring Mirroring

One of the primary goals of mirroring is to provide a robust, economical replication solution. In keeping with this goal, mirroring has been designed to be adaptable to various system configurations and architectures. However, to ensure an optimal experience (that is, to maximize the likelihood of automated failover in the event of a failure of the primary, and to minimize the time taken for the failover to complete and for users to be brought back on the new primary), you should adhere to the following general configuration guidelines:

- Caché Versions: When creating a mirror or adding members to a mirror, all Caché instances involved must be of the same version. However, instances that belong to an existing mirror can be upgraded separately and at different times.

- ICMP: Do not disable Internet Control Message Protocol (ICMP) on any system that is configured as a mirror member, because mirroring relies on ICMP to detect whether or not members are reachable.

- Network: InterSystems recommends that you use a high-bandwidth, low-latency, reliable network between the two failover members. If possible, it is desirable to create a private subnet for the two failover members so that the data- and control-channel traffic can be routed exclusively on this private network. A slow network could impact the performance of both the primary and the backup failover members, and could directly impact the ability of the backup failover member to take over as primary in the event of a failover.

- Disk Subsystem: In order for the backup failover member to keep up with the primary system, the disk subsystems on both failover members should be comparable; for example, if you configure a storage array on the first failover member, it is recommended that you configure a similar storage array on the second failover member. In addition, if network-attached storage (NAS) is used on one or both systems, it is highly recommended that separate network links be configured for the disk I/O and for the network load from the mirror data, to minimize the chances of overwhelming the network.

- Journal Throughput: As journaling is the core of mirror synchronization, it is essential to monitor and optimize the performance of journaling on the failover members. For more information, see Journaling Best Practices in the Journaling chapter of the Caché Data Integrity Guide.

- Virtualization: While it is possible to configure one or all of the mirror components in a virtualized environment, it is highly recommended that you configure these systems with appropriate redundancy; for example, the two failover members should not reside on the same physical host.

- User-defined Startup Routines: If you are migrating to mirroring and have ^%ZSTART or ^ZSTU routines, you should instead use the ^ZMIRROR routine (see ^ZMIRROR User-defined Routine in this chapter). In addition, if you want to run a task on a specific failover member, use the Task Manager (see Using the Task Manager in the Managing Caché chapter of the Caché System Administration Guide) to specify the mirror member on which the task should run.

Important: After you set up a mirror, data in mirrored CACHE.DAT files is kept in sync automatically; rather than mirroring the CACHE.DAT file itself (as when copying a CACHE.DAT file from one system to another), only the data contained in the CACHE.DAT file on the primary mirror member is mirrored on other members.
Other entities, including (but not limited to) users, roles, namespaces, non-mirrored databases, and mappings (especially global mappings and package mappings), are not kept in sync by the mirror on both failover members. Therefore, when you configure or update (add, modify, or delete) these entities on one failover member, you must manually configure or update them consistently on the other failover member to keep them in sync. In addition, you must perform the same manual operations on any async members on which you want these entities to be available.

For more information, see the following topics:
- Starting/Stopping ISCAgent

- Creating a Mirror
- Editing Mirror Configurations
- Adding Async Members to a Mirror
- Adding Databases to a Mirror
- Removing Mirror Configurations
- Disconnecting/Connecting Mirror Members
- Configuring an ECP Application Server to Connect to a Mirror
- Configuring a Mirror Virtual IP (VIP)
- Customizing the ISCAgent Port
- Mirror Tunable Parameters
- ^ZMIRROR User-defined Routine
- Forms

Starting/Stopping ISCAgent

The ISCAgent process must be installed and started on every system hosting a failover or async member before you can create or use a mirror. The component is installed when you install or upgrade Caché (see Mirror Upgrade Tasks in the Upgrading Caché chapter of the Caché Installation Guide). Depending on the platform on which you are running, you can start or stop the ISCAgent process as follows:

On Windows, start or stop the ISCAgent process as follows:

1. In the Microsoft Windows Control Panel, select Services from the Administrative Tools drop-down list, and double-click ISCAgent to display the ISCAgent Properties window.

2. On the Extended tab, click Start to start, or Stop to stop, ISCAgent.

3. On the Extended tab, select Automatic from the Startup type drop-down list.

On non-Windows platforms, run the ISCAgent start/stop script, which is installed in the following location, depending on the operating system:

- AIX: /etc/rc.d/init.d/iscagent
- OpenVMS: the scripts RunAgent and StopAgent are located in the instance [.BIN] subdirectory
- HP-UX: /sbin/init.d/iscagent
- Linux: /etc/init.d/iscagent
- Mac OS X: /Library/StartupItems/ISCAgent/ISCAgent
- Solaris: /etc/init.d/iscagent

In addition, you must configure ISCAgent to start automatically when the system starts; for non-Windows platforms, consult the operating system documentation. For example, to start ISCAgent on the IBM AIX platform, run the following command as root: /etc/rc.d/init.d/iscagent start; to stop it, run: /etc/rc.d/init.d/iscagent stop.

On UNIX/Linux platforms, mirroring uses the general-purpose logging facility, syslog, to log messages: informational messages are logged as priority LOG_INFO; error messages are logged as priority LOG_ERR (for information about configuring the syslog facility, see the documentation for your platform). In addition, mirroring requires that the bash shell is loaded and that the env executable script is installed in the /usr/bin directory.

Creating a Mirror

Creating a mirror involves configuring two failover members and, optionally, one or more async members. After the mirror is created, you can add databases to be mirrored. In addition, you can configure members to use SSL/TLS and to encrypt data; for detailed information, see Creating and Editing SSL/TLS Configurations for a Mirror in the Using SSL/TLS with Caché chapter, and the Managed Key Encryption chapter, respectively, of the Caché Security Administration Guide.

Important: Before you can create a mirror, you must ensure that the ISCAgent process has been started, as described in the Starting/Stopping ISCAgent section in this chapter.

When you are creating a new mirror, configure the mirror members in the following order:
1. Create Mirror and Configure First Failover Member
2. Configure Second Failover Member
3. Add Second Failover Member to Mirror
4. Add Async Members to Mirror
5. Adding Databases to a Mirror

To simplify the configuration task, you can use the form provided at the end of this chapter to record essential system information; see Mirror Configuration Details Form.

Create Mirror and Configure First Failover Member

The following procedure describes how to create a mirror and configure the first failover member.

1. Navigate to the [System] > [Configuration] > [Create Mirror] page of the Management Portal on the first failover member (for example, SystemA), and click Create a Mirror; if the link is not active, click Enable Mirror Service and select the Service Enabled check box.

2. On the [System] > [Configuration] > [Create Mirror] page, enter the following information in the Mirror Information section:

   a. Mirror Name: Enter a name for the mirror. Valid names must be 1-15 uppercase alphanumeric characters.

   b. Use SSL/TLS: Specify whether or not you want to use SSL/TLS security by selecting Yes or No from the drop-down list. If you select Yes, click Set up SSL/TLS and follow the instructions in the Creating and Editing SSL/TLS Configurations for a Mirror section of the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide.

   c. Use Virtual IP: Specify whether or not you want to use a Virtual IP address by selecting Yes or No from the drop-down list. If you select Yes, you are prompted for the following information:
      - IP Address: Enter an IP address in the text box (for more information, see the Configuring a Mirror Virtual IP (VIP) section of this chapter).
      - Mask (CIDR format): Enter a Classless Inter-Domain Routing (CIDR) mask in the text box.

      - Network Interface: Select a network interface from the drop-down list (for more information, see the Network Interface Considerations section of this chapter).

3. All other fields on this page are pre-populated with information specific to this node. Click Advanced Settings to display and edit additional pre-populated information about the mirror and the first failover member.

   In the Mirror Settings section (for more information, see the Mirror Tunable Parameters section in this chapter):
   - Quality of Service Timeout (msec): The maximum time, in milliseconds, that processes on this failover member wait for data to be acknowledged by the other member; the default is 2000 msec.
   - Acknowledgment Mode: Select, from the drop-down list, the acknowledgement mode that the primary failover member uses during a Mirror Synchronization process; the default is Received.
   - Agent Contact Required for Failover: Select whether or not the active backup failover member should take over as the primary failover member if it is not able to communicate with the primary system; the default is Yes. If you select No, you must provide the ^ZMIRROR user-defined routine (see ^ZMIRROR User-defined Routine in this chapter).

   In the Mirror Failover Member Information section:
   - Mirror Member Name: The name of the failover member you are configuring on this node (for example, SystemA); it defaults to a unique system name.
   - Superserver Address: The IP address or host name that external systems can use to communicate with this failover member.
   - Mirror Agent Address: The IP address or host name of the ISCAgent on this failover member.
   - Mirror Agent Port: The port number of the ISCAgent on this failover member. For information, see the ISCAgent section in this chapter.

   In the This Failover Member section:
   - Mirror Private Address: The IP address or host name that the failover members use to communicate with each other for mirroring information and data.

4. Click Save.

Configure Second Failover Member

This procedure describes how to configure the second failover member in the mirror.

1. Navigate to the [System] > [Configuration] > [Join Mirror as Failover] page of the Management Portal on the second failover member, and click Join as Failover; if the link is not active, click Enable Mirror Service and select the Service Enabled check box.

2. On the [System] > [Configuration] > [Join Mirror as Failover] page, enter the following information in the Mirror Information section:
   - Mirror Name: The name of the mirror you want to join; this should be the same mirror name you specified when you configured the first failover member (see the Create Mirror and Configure First Failover Member section of this chapter).

3. Enter the following information in the Other Mirror Failover Member's Info section:

a. Agent Address on Other System - Enter the superserver IP address or host name you specified when you created the mirror on the first failover member.
b. Mirror Agent Port - Enter the port of the ISCAgent you specified when you created the mirror on the first failover member.
c. Caché Instance Name - Enter the Caché instance name of the first failover member.
4. Click Connect to retrieve and display information about the mirror and the first failover member.
5. Enter or edit, if necessary, the following information in the Mirror Failover Member Information section:
a. In the This System column:
Mirror Member Name - The name for the failover member you are configuring on the current node (for example, SystemB).
Superserver Address - The IP address or host name that other systems can use to communicate with this failover member.
Mirror Agent Port - The port number of the ISCAgent on this failover member. For information, see the ISCAgent section in this chapter.
Network Interface for Virtual IP - If the mirror is configured to use virtual IP, select a network interface from the drop-down list (for more information, see the Network Interface Considerations section of this chapter).
SSL/TLS Config - The link lets you add or edit the SSL/TLS security; for information, see the Creating and Editing SSL/TLS Configurations for a Mirror section of the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide.
Important: If you configured the mirror to use SSL/TLS, this failover member must also be configured to use SSL/TLS to be able to join the mirror.
Mirror Private Address - The IP address or host name that the failover members use to communicate with each other for mirroring information and data.
b. The Mirror Information for <mirror name> section displays information about the mirror that this member is joining:
Use SSL/TLS - Whether or not the mirror is using SSL/TLS security.
Quality of Service Timeout (msec) - The maximum time, in milliseconds, that processes on this failover member wait for data to be acknowledged by the other member; the default is 2000 msec.
Acknowledgment Mode - The acknowledgment mode that the primary failover member uses during a Mirror Synchronization process.
Agent Contact Required for Failover - Whether or not the active backup failover member should take over as the primary failover member if it is not able to communicate with the primary system; the default is Yes.
Mirror Virtual IP - The virtual IP address, if the mirror uses a virtual IP address.
6. Click Save.
To complete the process of joining the mirror, follow the instructions in the Add Second Failover Member to Mirror section of this chapter.
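After you save the second failover member, you can do a quick sanity check from the Caché Terminal on each member before continuing. The sketch below is illustrative only: $SYSTEM.Mirror.MirrorName() is the same interface used by the sample ^ZMIRROR routine later in this chapter, while the IsPrimary() call and the MIRRORA value shown are assumptions for this example; the Mirror Monitor remains the authoritative view of member status.

   %SYS>write $SYSTEM.Mirror.MirrorName()   ; name of the mirror this instance is configured for ("" if none)
   MIRRORA
   %SYS>write $SYSTEM.Mirror.IsPrimary()    ; assumed call: returns 1 only on the current primary
   0

A result of 0 from IsPrimary() on the newly configured second member (for example, SystemB) is expected at this point, because it has not yet been added to the mirror as the backup.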

37 Configuring Mirroring Add Second Failover Member to Mirror After you have created the mirror and configured the first failover member, and configured the second failover member, on the system on which you created the mirror and configured the first failover member (SystemA), you must add the second failover member to the mirror, as follows: 1. Navigate to the [System] > [Configuration] > [Edit Mirror] page of the Management Portal, and click Edit Mirror. 2. On the [System] > [Configuration] > [Edit Mirror] page of the Management Portal, click the Add Failover Member link in the Mirror Failover Member Information section. 3. Enter the following in the Mirror Information section in the Add Failover Member to Mirror window: a. Mirror Private Address The IP address or host name that the other failover members (for example, SystemB) use to communicate with each other for mirroring information and data. b. Mirror Agent Port Enter the port for the ISCAgent on the second failover member (SystemB). c. Caché Instance Name Enter the Caché instance name of the other failover member. 4. Click Connect to display information in the Failover Member Information section about both failover members in this mirror: Mirror Member Name The name of each failover member. Superserver Address The IP address or host name that external systems can use to communicate with this failover member. Mirror Agent Port The port for the ISCAgent for each failover member. SSL/TLS Config The link lets you add or edit the SSL/TLS security; for information, see the Creating and Editing SSL/TLS Configurations for a Mirror section of the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide. Important: If you configured the mirror to use SSL/TLS, this failover member must also be configured to use SSL/TLS to be able to join the mirror. Mirror Private Address The IP address or host name that the failover members use to communicate with each other. 5. Click Save to save the displayed information and open the Edit Mirror page Editing Mirror Configurations If the Add Failover Member link is displayed in the Mirror Failover Member Information section, go to Add Second Failover Member to First Failover Member in this section. The following procedure describes how to use this page to edit information about the mirror, as follows: 1. Navigate to the [System] > [Configuration] > [Edit Mirror] page of the Management Portal, and click Edit Mirror. 2. On the [System] > [Configuration] > [Edit Mirror] page of the Management Portal, the following information about the mirror is displayed (but cannot be edited) for both members: Mirror Member Name The name of each failover member. Superserver Address The IP address or host name that external systems use to communicate with this failover member. Caché High Availability Guide 31

38 Mirroring Mirror Private Address The IP address or host name that the failover members use to communicate with each other. 3. The Mirror Information section displays the following information: Mirror Name The name of the mirror; you cannot edit this field. Use SSL/TLS If you initially configured the mirror to use SSL/TLS security (that is, the drop-down box displays Yes), you can click Set up SSL/TLS and follow the instructions in the Creating and Editing SSL/TLS Configurations for a Mirror section of the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide to modify the certificate information.. Use Virtual IP You can change whether or not you want to use a Virtual IP address by selecting Yes or No from the drop-down list. If you select Yes, you can add or modify the following information: IP Address Enter an IP address in the text box (for more information, see the Configuring a Mirror Virtual IP (VIP) section of this chapter). Mask (CIDR format) Enter a Classless Inter-Domain Routing (CIDR) mask in the text box. Network Interface Select a network interface from the drop-down list (for more information, see the Network Interface Considerations section of this chapter). 4. Click Advanced Settings to display the Mirror Settings and This Failover Member subsections, and edit pre-populated information about the mirror: In the Mirror Settings subsection (for more information, see the Mirror Tunable Parameters section in this chapter): Quality of Service Timeout (msec) The maximum time, in milliseconds, that processes on this failover member wait for data to be acknowledged by the other member; the default is 2000 msec. Acknowledgment Mode Select the acknowledgement mode that the primary failover member uses during a Mirror Synchronization process from the drop-down list; the default is Received. Agent Contact Required for Failover Select whether or not the active backup failover member should take over as the primary failover member if it not able to communicate with the primary system; the default is Yes. If you select No, you must provide the ^ZMIRROR user-defined routine (see ^ZMIRROR Userdefined Routine in this chapter). In the This Failover Member subsection, the following values, which were specified when the mirror was created: Mirror Agent Address The IP address or host name of the ISCAgent on this failover member. Mirror Agent Port The port for the ISCAgent for this failover member. Mirror Private Address The IP address/host name used by external systems, such as ECP application servers, to communicate with the mirror. 5. Click Save Adding Async Members to a Mirror The following procedure describes how to configure async members: 1. Navigate to the [System] > [Configuration] > [Join Mirror as Async] page of the Management Portal and click Join as Async; if the link is not active, click Enable Mirror Service and select the Service Enabled check box. 32 Caché High Availability Guide

39 Configuring Mirroring 2. On the [System] > [Configuration] > [Join Mirror as Async] page, enter the following information in the Mirror Information section: Mirror Name Enter the same mirror name you specified when you created the mirror (see Creating a Mirror in this chapter). 3. Enter the following information in the Other Mirror Failover Member s Info section: a. Agent Address on Other System Enter the superserver IP address or host name of either failover member. b. Mirror Agent Port Enter the port of the ISCAgent you specified when you created the mirror on the specified failover member. c. Caché Instance Name Enter the Caché instance name of the other failover member whose IP address/host name you specified in Agent Address on Other System. 4. Click Connect to retrieve and display information about the async member. 5. Enter the following information in the Async Member System Info section: a. Async Member Name Enter a name for the async member you are configuring (for example, ASYNC1). b. Async Member System Type Select one of the following types from the drop-down list: Disaster Recovery This option is for systems configured as disaster recovery systems, where all mirrored databases are read-only. Important: A disaster recovery async member must have direct TCP/IP connectivity to the failover members and be on the same VLAN for proper virtual IP address reassignment. When the async member is not in the same data center as the failover members, a VLAN subnet can be extended across the two data centers to continue supporting the same virtual IP address. This requires proper Layer 2 connectivity between the two sites, the use of appropriate protocols such as IEEE 802.1Q for efficient operations, and support for Inter-Switch Links (ISL) for high availability. Another possible solution involves the use of hardware-based site selectors (for example, Cisco Global Site Selectors or f5 BIG-IP Global Traffic Manager), which can direct incoming traffic to a single fully qualified domain name to separate IP addresses within multiple data centers. Consult your organization s network administrators for information about available technical options. Read-Only Reporting This option is for systems where new mirrored databases are set to read-only by default. Read-Write Reporting This option is for systems where new mirrored databases are set to read-write by default. It is for systems configured as reporting/business intelligence systems, where it is possible to modify the data during analysis. c. Use SSL/TLS Specify whether or not you want to use SSL/TLS security by selecting Yes or No from the dropdown list; if you select Yes, click SSL/TLS Config and follow the instructions in the Creating and Editing SSL/TLS Configurations for a Mirror section of the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide. d. SSL/TLS Config The link lets you add or edit the SSL/TLS security; for information, see the Creating and Editing SSL/TLS Configurations for a Mirror section of the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide. Important: If you configured the mirror to use SSL/TLS, async members must also be configured to use SSL/TLS to be able to join the mirror. Caché High Availability Guide 33

40 Mirroring e. X.509 Distinguished Name The X.509 Distinguished Name (DN) is displayed only if the async member uses SSL/TLS security. If it is displayed, copy the DN, and follow the procedure described in Adding Authorized Async Members to Mirrors, below. 6. Click Save. 7. Restore databases to the async member, as described in the Adding Databases to a Mirror section of this chapter Adding Authorized Async Members to Mirrors This procedure is required only if the X.509 Distinguished Name (DN) was displayed during the Configuring Async Members procedure; it assumes you copied the DN, as instructed in that procedure. To add the secure async member, do the following on both the primary and backup failover members: 1. Navigate to the [System] > [Mirror Monitor] page, and click Add under the async members table. 2. On the Add a New Authorized Async Member page, enter the name of the async member in the Async Member Name text box (see Configuring Async Members in this chapter). 3. Paste the DN in the Async Member DN text box. 4. Click Save Deleting Authorized Async Members from Mirrors To remove a secure async member, do the following on both the primary and backup failover members: 1. Navigate to the [System] > [Mirror Monitor] page. 2. In the Authorized Async Members section, click Delete in the row that specifies the secure async member you want to remove. 3. In the message box that is displayed, click OK Adding Databases to a Mirror Only local databases on the currently active primary failover system can be added to a mirror. Only data in CACHE.DAT files can be mirrored. Data that is external (that is, stored on the file system) cannot be mirrored by Caché. In addition, journaling must be enabled on the database before it can be added to the mirror; if journaling is not enabled, you cannot add the database to the mirror. Mirrored databases must be present on both the primary and backup failover members as well as on async members that use them. In addition, namespaces and global/routine/package mappings associated with the mirrored databases must be the same on all mirror members,. The mirrored database on the backup failover member must be mounted, active and caught up (for more information, see Database Considerations and Activating/Catching up Mirrored Databases in this chapter) to be able to take over as the primary in the event of a failover. After configuring a database for mirroring, add it to the backup failover member, as well as to all async member(s) that need to access it. To add databases to the mirror, do the following on the currently active primary failover system: Important: To ensure that the database is mirrored, you must perform steps 1 through 4 (on the primary failover member), in the specified order, before you back up the database. 34 Caché High Availability Guide

1. Navigate to the [System] > [Configuration] > [Local Databases] page of the Management Portal, and click Edit next to the database you want to add to the mirror.
2. On the Database Properties page, click Add to Mirror. If journaling is not enabled on the database, Databases must be journaled to be mirrored is displayed in place of this link; to enable it, select Yes from the Global Journal State drop-down list.
3. In the Add database to mirror window, either accept the default name or specify a unique name to identify the mirrored database. By default, the local name of the database is displayed as the name of the database in the mirror. However, because multiple mirrored databases cannot have the same mirror database name, you must ensure that the name is unique; if the database has already been configured as a mirror database, you must change the name.
4. Click Add to mirror.
5. Back up the mirrored database on the primary failover member. Depending on the database backup strategy you use (see Backup Strategies in the Backup and Restore chapter of the Caché Data Integrity Guide), consider the following:
Caché Online Backup - With this strategy, you can back up mirrored databases on any mirror member. InterSystems recommends that you run backups on all members nightly to ensure that non-mirrored databases on the backup failover member and async members are backed up, and both mirrored and non-mirrored databases on the primary failover member are backed up.
External Backup or Cold Backup - InterSystems recommends that you back up mirrored databases on the primary failover member nightly. In addition, it is recommended that you back up non-mirrored databases on both the primary and backup failover members, as well as on all connected async members, nightly to ensure consistency between databases on all members. For more information, see Database Considerations in this chapter.
6. Restore the mirrored database on the backup failover member and all connected async members. Depending on the backup restore method you use (see Restoring from a Backup in the Backup and Restore chapter of the Caché Data Integrity Guide), consider the following:
Caché Online Backup Restore (^DBREST) Routine - The routine automatically resynchronizes mirrored databases when they are restored on the backup failover member or any async mirror member. In addition, the restore process on the backup failover member automatically negotiates with the running primary to ensure that the restored mirrored database(s) are activated and caught up. For more information, see Mirrored Database Considerations in the Backup and Restore chapter of the Caché Data Integrity Guide.
Ensuring that the mirrored databases are synchronized requires that the journal files from the time of the backup are available and online; for information about restoring mirror journal files, see Restoring Mirror Journal Files in the Journaling chapter of the Caché Data Integrity Guide. If the relevant journal files on the primary failover member have been purged, you must restore a more up-to-date backup; for information about purging mirror journal files, see Purge Journal Files in the Journaling chapter of the Caché Data Integrity Guide.
External Backup Restore or Cold (Offline) Backup Restore - Both of these methods require that you manually activate and catch up the mirrored databases after they are restored and mounted on the backup failover member.
In the event a manual journal restore is required, see Restore Globals From Journal Files Using ^JRNRESTO in the Journaling chapter of the Caché Data Integrity Guide. Caché High Availability Guide 35

42 Mirroring 7. If required, activate and/or catch up the mirrored databases on the backup failover member and async member(s) as described in Activating/Catching up Mirrored Databases in this chapter. If the backup failover member and/or async member(s) have a different endianness than the primary failover member, see Member Endianness Considerations in this chapter. When subsequent copies of a mirrored database are created on non-primary mirror members, the data to begin the catchup process may not have been sent yet. If the required data does not arrive within 60 seconds, the catch-up process begins anyway; those databases may not catch up if the data does not arrive before it is required, however, in which case a message regarding the database(s) that had the problem is logged in the cconsole.log file. During database creation there would be only one database; however this also applies to catching up in other situations where multiple databases are specified Activating/Catching up Mirrored Databases Activate and/or catch up mirrored databases on the backup failover member and async member(s) through the Mirror Monitor (see the Monitoring Caché Mirroring Performance chapter of the Caché Monitoring Guide). Mirrored databases must be activated on the backup failover member and async member(s). To activate mirrored databases, do the following on the backup failover member and async member(s): 1. Navigate to the [System] > [Mirror Monitor] page. 2. Click Activate in the row of the database you wish to activate. Similarly, mirrored databases must be caught up on the backup failover member and async member(s). To catch up mirrored databases, do the following on the backup failover member and async member(s): 1. Navigate to the [System] > [Mirror Monitor] page. 2. Click Catch up in the row of the database you wish to catch up. In addition, you can remove databases from the mirror as described in Removing Mirrored Databases from Mirrors in this chapter. To perform these actions simultaneously for one or more mirrored databases, click More Actions, then click the action you want to perform Removing Mirror Configurations You can remove mirrored databases, failover members, and async member(s) from mirrors. To do this, you must remove elements in the following order: 1. Remove Async Members from Mirrors 2. Remove Backup Failover Members from Mirrors 3. Remove Primary Failover Members from Mirrors 4. Remove Mirrored Databases from Mirrors Remove Async Member(s) from Mirrors In addition, if you want to shadow both mirrored and non-mirrored databases from an async mirror member, you must configure separate shadows for the databases. Removing an async member from mirrors requires that you run the ^MIRROR routine, as follows: 36 Caché High Availability Guide

1. On an async member that you are removing from the mirror, run the ^MIRROR routine in the %SYS namespace in the Caché Terminal.
2. Select Mirror Configuration from the main menu to display the following submenu:
1) Stop tracking a mirror
2) Remove Mirror Configuration
3) Display Mirror Configuration
4) Manage Async Journal File Retention
5) Manage Async Mirror Member Type
6) Clear FailoverDB Flag of Mirrored Databases
7) Display Mirrored Databases
3. From the submenu, select Stop tracking a mirror and follow the instructions.
4. From the submenu, select Remove Mirror Configuration and follow the instructions.
5. Restart the Caché instance.
You can select Display Mirror Configuration at any time during the procedure to view the mirror configuration information in the cpf file on the member you are updating.
Remove Backup Failover Members from Mirrors
Removing the backup failover member from mirrors requires that you run the ^MIRROR routine, as follows:
1. On the backup member that you are removing from the mirror, run the ^MIRROR routine in the %SYS namespace in the Caché Terminal.
2. Select Mirror Configuration from the main menu to display the following submenu:
1) Edit IP Address
2) Remove Other Failover Member
3) Remove This failover member
4) Remove Authorized ID for Async member
5) Display Mirror Configuration
6) Adjust Trouble Timeout parameter
7) Modify Network Addresses
8) Refresh other failover member's data via agent
3. If the mirror uses SSL/TLS, select Remove Authorized ID for Async member from the submenu and follow the instructions.
4. Select Remove This Failover Member from the submenu and follow the instructions.
5. Restart the Caché instance.
You can select Display Mirror Configuration at any time during the procedure to view the mirror configuration information in the cpf file on the member you are updating.
Remove Primary Failover Members from Mirrors
Removing the primary failover member from mirrors requires that you run the ^MIRROR routine in the %SYS namespace in the Caché Terminal.
1. On the primary member that you are removing from the mirror, run the ^MIRROR routine in the %SYS namespace in the Caché Terminal.
2. Select Mirror Configuration from the main menu to display the following submenu:
1) Edit IP Address
2) Remove Other Failover Member
3) Remove This Failover Member
4) Remove Authorized ID for Async Member
5) Display Mirror Configuration
6) Adjust Trouble Timeout parameter
7) Modify Network Addresses
8) Refresh other failover member's data via agent

44 Mirroring 3. If the mirror uses SSL/TLS, select Remove Authorized ID for Async member from the submenu and follow the instructions. 4. Select Remove This Failover Member from the submenu and follow the instructions. 5. Restart the Caché instance. 6. On the primary member that you are removing from the mirror, select Mirror Configuration from the main menu. 7. Select Remove This Failover Member again from the submenu and follow the instructions. 8. Restart the Caché instance. You can select Display Mirror Configuration at any time during the procedure to view the mirror configuration information in the cpf file on the member you are updating Remove Mirrored Databases from Mirrors You can convert mirrored databases from mirrored to non-mirrored local use by removing them from the mirror, which you do through the Mirror Monitor (see the Monitoring Caché Mirroring Performance chapter of the Caché Monitoring Guide). To remove databases from mirrors, do the following on either failover system: 1. Navigate to the [System] > [Mirror Monitor] page on the primary failover member. 2. Click Remove in the row of the database you wish to remove from the mirror. To perform this action on multiple mirrored databases simultaneously, click More Actions, then click the action you want to perform. Alternatively, you can remove one or all mirrored databases by selecting the Remove mirrored database option from the Mirror Management main menu list of the ^MIRROR routine Disconnecting/Connecting Mirror Members You can disconnect (and reconnect) a backup failover member and async member(s) from mirrors. See the following subsections for procedures to perform these and similar tasks: Disconnecting/Connecting Backup Failover Members from Mirrors Disconnecting/Connecting Async Members from Mirrors Disconnecting/Connecting Backup Failover Members from Mirrors You can disconnect (then reconnect) the backup failover member from the mirror as follows: 1. Navigate to the [System] > [Mirror Monitor] page on the backup failover member. 2. Click the Disconnect button. 3. Restart the Caché instance. After you restart the instance, the Disconnect button is replaced by the Connect button. To reconnect the backup member: 1. Navigate to the [System] > [Mirror Monitor] page on the backup failover member. 2. Click the Connect button. 3. Restart the Caché instance Disconnecting/Connecting Async Member(s) from Mirrors You can disconnect (reconnect) an async member from the mirror as follows: 38 Caché High Availability Guide

1. Navigate to the [System] > [Mirror Monitor] page on the async member.
2. Click Disconnect in the row that specifies the async member you want to disconnect.
3. Restart the Caché instance.
After you restart the instance, the Disconnect link is replaced by the Connect link. To reconnect the async member:
1. Navigate to the [System] > [Mirror Monitor] page on the async member.
2. Click Connect in the row that specifies the async member you want to reconnect.
3. Restart the Caché instance.
In addition, you can use the mirroring SYS.Mirror.StopMirror() and SYS.Mirror.StartMirror() API methods or the ^MIRROR routine to perform these tasks.
Configuring an ECP Application Server to Connect to a Mirror
Before configuring an ECP application server to connect to a mirror, you must ensure that both failover members are configured to act as ECP data servers, as follows:
Important: An ECP application server that is connected to a mirror must be running Caché 2010.2 or later. After configuring the ECP application server to connect to the mirror, perform failover tests (see Operator-initiated Failover in this chapter) to ensure that it connects to the mirror regardless of which failover member is the primary member.
1. On each failover member, navigate to the [System] > [Configuration] > [ECP Settings] page of the Management Portal, and configure them as ECP data server systems; for information, see Configuring an ECP Data Server in the Configuring Distributed Systems chapter of the Caché Distributed Data Management Guide.
2. On the ECP application server, navigate to the [System] > [Configuration] > [ECP Settings] > [ECP Data Server] page, and enter the following information:
a. Server Name - Enter the instance name of the primary failover member.
b. Host DNS Name or IP Address - Enter the IP address or host DNS name of the primary failover member.
c. IP Port - Enter the superserver port number of the primary failover member whose IP address or host DNS name you specified in the Host DNS Name or IP Address text box.
3. Select the Mirror Connection check box.
A failover mirror member does not accept any ECP connections that are not marked as mirror connections. When a running system is configured as a mirror member, all existing ECP connections are reset and the clients cannot reconnect until the connections are redefined as mirror connections. Non-mirror ECP connections, however, may be made to async members.
4. Click Save.
5. Navigate to [System] > [Configuration] > [Remote Databases] and select Create New Remote Database to launch the Database Wizard.
6. In the Database Wizard, select the ECP data server from the Remote server drop-down list, then click Next.
7. Select the database you want to access over this ECP channel from the list of remote databases.

You can select both non-mirrored databases (databases listed as :ds:<database_name>) and mirrored databases (databases listed as :mirror:<mirror_name>:<mirror_database_name>):
Non-mirrored, journaled databases are available in read-only mode.
Non-mirrored, non-journaled databases are available in read-write mode.
Mirrored databases are available in read-write mode.
Configuring a Mirror Virtual IP (VIP)
As described in the Caché Mirroring Concepts section of this chapter, you can configure a mirror virtual address that allows external applications, such as the following, to interact with the mirror using a single address:
Relational Access - ODBC, JDBC, Relational Gateway
Web Access - Caché Management Portal, Caché Server Pages, ZEN
Direct Access - Multidimensional Access, Direct-Connect Users, Caché Studio
Object Access - SOAP, XML, Java, EJB, COM, VB, .NET, C++, etc.
The mirror virtual address must be a virtual IP (VIP), which requires that both failover members in the mirror are part of the same subnet. The primary failover member binds the VIP to a configured interface during startup. When a failover occurs, the VIP is reassigned to the new primary, which allows all external clients and connections to interact with one static IP regardless of which failover member is currently serving as primary. During the failover process, connected clients that experience a network disconnect are able to reconnect once the other system has been elected primary and completed the failover tasks (as described in The Failover Process section in this chapter). If a VIP is configured, the other system completes the failover only if it is successfully able to assign the VIP; otherwise, the automated failover process is aborted and requires manual operator intervention.
To ensure that the Management Portal and Caché Studio can seamlessly access the primary failover member, regardless of which failover member is currently the primary, it is recommended that both failover members be configured to use the same superserver and web server port numbers.
To configure a mirror VIP, you must first obtain the following information:
An available IP address to be used as the Mirror VIP.
An available network interface on each of the failover members.
Important: To use a mirror VIP, both failover members must be configured in the same subnet and the VIP must belong to the same subnet as the network interface that is selected on each system; therefore, the interfaces selected on both systems should be on the same subnet. In addition, it is important to reserve this VIP so that other systems cannot use it; for example, in a Dynamic Host Configuration Protocol (DHCP) network configuration, this VIP should be reserved and removed from the DNS tables so that it is not allocated dynamically to a host joining the network.
The VIP must be configured with an appropriate network mask, which you must specify in Classless Inter-Domain Routing (CIDR) notation. The format for CIDR notation is <ip_address>/<CIDR_mask>, where <ip_address> is the base IP address of the system, and <CIDR_mask> is platform-dependent, as follows:
On Mac OS X, it must be /32.
On all other platforms, it must match the mask of the IP address assigned to the base interface.
For example, based on the following system:

bash-2.05b# uname -a
AIX apis C0B33E4C00
bash-2.05b# ifconfig en1
en1: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
    inet netmask 0xffffff00 broadcast
    tcp_sendspace tcp_recvspace rfc1323
In this example, the en1 interface has a base address with a netmask of 0xffffff00, which translates to /24. Therefore, to assign the VIP to the en1 interface, you specify the network mask as /24 (in CIDR notation).
Customizing the ISCAgent Port
As described in the ISCAgent section of this chapter, the default ISCAgent port is 2188. However, you can change the port number as described in the following subsections:
Customizing the ISCAgent Port Number on UNIX/Linux Systems
Customizing the ISCAgent Port Number on Microsoft Windows Systems
Customizing the ISCAgent Port Number on HP OpenVMS Systems
Customizing the ISCAgent Port Number on UNIX/Linux Systems
The ISCAgent process, by default, starts on port 2188. To customize the port on a UNIX/Linux system, do the following:
1. Create (or edit) the file named /etc/iscagent/iscagent.conf.
2. Add (or edit) the following line, replacing <port> with the desired port number:
application_server.port=<port>
Customizing the ISCAgent Port Number on Microsoft Windows Systems
The ISCAgent process, by default, starts on port 2188. To customize the port on a Windows system, do the following:
1. Create (or edit) the file named <windir>\system32\iscagent.conf.
2. Add (or edit) the following line, replacing <port> with the desired port number:
application_server.port=<port>
Customizing the ISCAgent Port Number on HP OpenVMS Systems
The ISCAgent process, by default, starts on port 2188. To customize the port on an HP OpenVMS system, do the following:
1. Create (or edit) the file named iscagent.conf in the instance [.BIN] subdirectory.
2. Add (or edit) the following line, replacing <port> with the desired port number:
application_server.port=<port>
Customizing the ISCAgent Interface
As described in the Network Interface Considerations section of this chapter, ISCAgent binds to the default (or configured) port on all available interfaces. However, you can change the ISCAgent to bind to the interface serving a specific address as described in the following subsections:

Customizing the ISCAgent Interface on UNIX/Linux Systems
Customizing the ISCAgent Interface on Microsoft Windows Systems
Customizing the ISCAgent Interface on HP OpenVMS Systems
Customizing the ISCAgent Interface on UNIX/Linux Systems
The ISCAgent process, by default, binds to the specified port on all available interfaces. To customize the ISCAgent to bind to the interface serving a specific address on a UNIX/Linux system, do the following:
1. Create (or edit) the file named /etc/iscagent/iscagent.conf.
2. Add (or edit) the following line, replacing <ip_address> with the address served by the desired interface:
application_server.interface_address=<ip_address>
To explicitly bind to all available interfaces (i.e., the default), specify * as follows:
application_server.interface_address=*
Customizing the ISCAgent Interface on Microsoft Windows Systems
The ISCAgent process, by default, binds to the specified port on all available interfaces. To customize the ISCAgent to bind to the interface serving a specific address on a Windows system, do the following:
1. Create (or edit) the file named <windir>\system32\iscagent.conf.
2. Add (or edit) the following line, replacing <ip_address> with the address served by the desired interface:
application_server.interface_address=<ip_address>
To explicitly bind to all available interfaces (i.e., the default), specify * as follows:
application_server.interface_address=*
Customizing the ISCAgent Interface on HP OpenVMS Systems
The ISCAgent process, by default, binds to the specified port on all available interfaces. To customize the ISCAgent to bind to the interface serving a specific address on an HP OpenVMS system, do the following:
1. Create (or edit) the file named iscagent.conf in the instance [.BIN] subdirectory.
2. Add (or edit) the following line, replacing <ip_address> with the address served by the desired interface:
application_server.interface_address=<ip_address>
To explicitly bind to all available interfaces (i.e., the default), specify * as follows:
application_server.interface_address=*
Customizing the ISCAgent User/Group on UNIX/Linux Systems
When installing ISCAgent on UNIX/Linux, an OS username and group named iscagent is created to serve as the default user and group for the agent process (similar to the way cacheusr serves as the user and group name for Caché processes); the main difference is that iscagent is explicitly a nobody user/group, with no specific permissions. Access to the protected resources that iscagent needs is established when the agent is started by root before dropping privileges, or granted by the Caché instances it uses via cuxagent. To change the user/group that the agent runs as on UNIX/Linux, do the following:
1. Create (or edit) the file named /etc/iscagent/iscagent.conf.

2. Add (or edit) the following lines, replacing <username> with a valid username and <groupname> with a valid group name:
privileges.user=<username>
privileges.group=<groupname>[,<groupname>[,...]]
You can specify multiple, comma-separated <groupname>s in the privileges.group parameter. This is useful, for example, for sites where an instance requires multiple group permissions to execute cuxagent.
Mirror Tunable Parameters
In addition to the typical configuration options, such as mirror name, IP address, etc., mirroring lets you modify the system tunable parameters via the Advanced Settings section of the [System] > [Configuration] > [Edit Mirror] page and/or the ^MIRROR routine. The mirror system tunable parameters are listed in the following table:
Table 4-1: Mirror System Tunable Options
Tunable Parameter                       Default Value
Quality of Service Timeout              2000 milliseconds
Acknowledgment mode                     Ack (Received)
Agent Contact Required for Failover     Yes
Trouble Timeout Limit *                 The larger of: 5000 milliseconds or 3 * QoS Timeout
* This parameter is adjusted via the Adjust Trouble Timeout parameter option from the Mirror Configuration main menu list of the ^MIRROR routine on the running primary failover member.
For more information, see the following subsections:
Quality of Service (QoS) Timeout
Acknowledgment (Ack) Mode
Agent Contact Required for Failover
Trouble Timeout Limit
Quality of Service (QoS) Timeout
The QoS Timeout parameter indicates the maximum time, in milliseconds, that processes on the primary failover member wait for data to be acknowledged by the backup failover member. If the backup does not respond within the QoS Timeout, it is demoted from active status to catchup mode, which indicates the backup must perform additional work to catch up with the primary failover member. The backup failover member automatically re-synchronizes with the primary failover member; no manual intervention is required.
This parameter should be adjusted based on the maximum round-trip latency of the network between the two failover members. For example, if the two systems are side-by-side and connected by a private network, then the default value (2000ms) should be sufficient; however, if the two failover members are configured in geographically-dispersed locations

(across town, across the state, across the country, etc.), then the QoS Timeout should be adjusted based on the average round-trip latency between the two locations. In addition, if the acknowledgment mode is set to Committed, the QoS Timeout parameter should include the average amount of time required by the dejournal process on the backup to write the data to disk.
If the QoS Timeout period expires, the primary failover member enters a trouble state, where it remains until it can determine that the backup failover member knows that it is no longer active, or the Trouble Timeout Limit expires (see Trouble Timeout Limit in this section). Therefore, the maximum interruption to service on the primary failover member is the sum of the QoS Timeout and the Trouble Timeout Limit. With the default settings, for example, the Trouble Timeout Limit is the larger of 5000 milliseconds and 3 * 2000 milliseconds (that is, 6000 milliseconds), so the maximum interruption is approximately 2000 + 6000 = 8000 milliseconds.
Acknowledgment (Ack) Mode
While the backup failover member (and each connected async member) always acknowledges receipt of data (journal updates or keep-alive messages) from the primary, the Acknowledgment mode tunable parameter controls the behavior of the primary during a Mirror Synchronization process (see the Mirror Synchronization section in this chapter). The acceptable values are:
Received (default): An acknowledgment is expected by the primary failover member; the acknowledgment should be generated by the backup failover member upon receipt of data.
Committed: An acknowledgment is expected by the primary failover member; the acknowledgment should be generated by the backup failover member upon writing the update to the mirror journal file on disk. This is analogous to a synchronous commit across the mirror.
If the acknowledgment mode is set to Committed, the QoS Timeout parameter should be adjusted as described in Quality of Service (QoS) Timeout in this section.
Agent Contact Required for Failover
The Agent Contact Required for Failover tunable parameter controls the behavior of the failover process on the backup failover member in relation to the availability of the ISCAgent on the primary failover member. Possible values include:
Yes (default) - The backup failover member does not attempt to take over as primary if it is unable to communicate with the ISCAgent process on the failed primary system.
No - The backup failover member continues the takeover process even if the ISCAgent process on the primary system is unavailable, subject to the following conditions:
The ^ZMIRROR user-defined routine (see ^ZMIRROR User-defined Routine in this chapter) must be run in the %SYS namespace.
The $$IsOtherNodeDown^ZMIRROR() procedure must exist. If the procedure returns 0 (False) or if the procedure does not exist, the backup failover member aborts the takeover process.
The backup failover member must verify that the primary failover member is down within a specified period of time (see the Backup Failover Member bullet item in the Trouble Timeout Limit section of this chapter).
Trouble Timeout Limit
The Trouble Timeout Limit parameter is used by both failover members, as follows:
Primary Failover Member - The maximum time, in milliseconds, the primary failover member waits for the backup failover member to recognize that it has entered an inactive state. During this time, the primary failover member enters a trouble state and no data is written to the journal, thus ensuring that the failover members remain synchronized

if the primary goes down. The primary remains in this state until it can determine that the backup failover member knows that it is no longer active, or the Trouble Timeout Limit expires. If the backup failover member is stopped gracefully, it notifies the primary that it is exiting and the primary does not enter the trouble state. However, if the backup crashes or loses connection with the primary, the primary failover member enters the trouble state, where it remains until the trouble timeout limit expires or the connection is re-established; the maximum interruption to service on the primary failover member is the sum of the QoS Timeout and the Trouble Timeout Limit (see Quality of Service (QoS) Timeout in this section).
Backup Failover Member - The maximum time, in milliseconds, the backup failover member has to determine that the primary failover member is not active. It represents the elapsed period of time between the last message from the primary failover member being received and the backup being able to determine that the primary failover member is down. The backup failover member cannot become the primary until it verifies that the primary failover member is down.
The value specified for the Trouble Timeout Limit should allow enough time for the backup failover member to determine whether or not the primary is running:
For systems configured with Agent Contact Required for Failover = Yes, this is the time to contact the ISCAgent on the primary failover member (typically two or three seconds).
For systems configured with Agent Contact Required for Failover = No, the time should reflect the time it takes to obtain the result, including the time to execute the code in $$IsOtherNodeDown^ZMIRROR() (see Agent Contact Required for Failover in this section).
^ZMIRROR User-defined Routine
As previously noted, setting the Agent Contact Required for Failover tunable parameter to No causes the backup failover member to take over even if the ISCAgent process on the primary system cannot be contacted. This enables continuous database operation even when the node hosting the primary failover member is down, the operating system has crashed, or some other such factor has taken the primary system entirely out of operation. When this option is in use, however, it is necessary to verify that the primary node is actually down to avoid dual primaries. Therefore, failover cannot occur when Agent Contact Required for Failover=No unless:
the ^ZMIRROR user-defined routine exists and can be run in the %SYS namespace
the $$IsOtherNodeDown^ZMIRROR() procedure exists and returns 1 (True) if the primary system is unavailable
When using Agent Contact Required for Failover=No, InterSystems recommends that the ^ZMIRROR routine be implemented on both failover members before mirroring is enabled. Otherwise, a restart may be required after ^ZMIRROR is implemented.
^ZMIRROR Entry Points
The ^ZMIRROR user-defined routine contains the following entry points. All provide appropriate defaults if they are omitted; the IsOtherNodeDown entry point cannot be omitted, however, if Agent Contact Required for Failover=No.
$$IsOtherNodeDown^ZMIRROR()
This procedure is called only when Agent Contact Required for Failover=No to externally validate that the primary failover system is down. If this procedure exists and returns 0 (False), failover is aborted.
Acceptable return values: 1 (True); 0 (False)

InterSystems does not provide technology to determine whether or not the other node is actually down; contact the InterSystems Worldwide Response Center (WRC) for assistance regarding third-party technology that can be used to make this determination.
$$IsOtherNodeDown^ZMIRROR() can be called multiple times during a single event and in different circumstances (for example, when the backup loses its connection to the primary, when an instance is starting up, as part of a retry loop when something has gone wrong, and so on). If called when the agent on the other node can be contacted, the result from IsOtherNodeDown is ignored in favor of the answer returned from the ISCAgent; if called when Agent Contact Required for Failover=Yes and the agent on the other node cannot be contacted, the result is ignored because failover is aborted. Therefore, IsOtherNodeDown is significant only in cases in which the node is actually down (powered off) and the agent cannot be contacted.
The commented sample ^ZMIRROR routine provided later in this chapter shows one possible implementation of IsOtherNodeDown using a "ping strategy" to determine whether the primary system is down. There are a number of possible implementation strategies based on elements such as SCSI reservations, shared quorum disk-based file-locking, and so on. In cases in which the node is up but the agent cannot respond (for example, the agent process was killed), the result returned by IsOtherNodeDown prevents failover and manual intervention is necessary for the backup member to become the primary.
$$CanNodeStartToBecomePrimary^ZMIRROR()
This procedure is called when an instance is about to begin the process of becoming the primary member. The instance has determined that the other instance is not currently the primary and that it is eligible to become the primary. As a general rule, this entry point is not used in a ^ZMIRROR routine. However, sites that wish to block failover members from automatically becoming the primary, either at startup or when connected as the backup, can include logic here to do so. If this entry point returns 0 (False), then the instance enters a retry loop where it continues to call $$CanNodeStartToBecomePrimary^ZMIRROR() every 30 seconds until it either returns 1 (True) or detects that the other node has become the primary (at which point the local node will become the backup).
$$CheckBecomePrimaryOK^ZMIRROR()
This procedure is called immediately before a system becomes the primary failover member, but before any work/updating is done on that system. If this procedure exists and returns 0 (False), the startup sequence is aborted and this node does not become the primary failover member.
Acceptable return values: 1 (True); 0 (False)
$$CheckBecomePrimaryOK^ZMIRROR() is called after the instance is fully initialized as the primary failover member, all mirrored databases are read/write, ECP sessions have been recovered or rolled back, and local transactions (if any) from the former primary have been rolled back. No new work has been done because users are not allowed to log in, superserver connections are blocked, and ECP is still in a recovery state. This is where you can start any local processes or do any initialization required to prepare the application environment for users. If CheckBecomePrimaryOK returns False (0), the instance aborts the process of becoming the primary member and returns to an idle state. If CheckBecomePrimaryOK returns False, ECP sessions are reset.
When a node succeeds in becoming the primary, the ECP client reconnects and ECP transactions are rolled back (rather than preserved). Client jobs receive <NETWORK> errors until a TRollback command is explicitly executed (see the ECP Rollback Only Guarantee section in the ECP Recovery Guarantees and Limitations appendix of the Caché Distributed Data Management Guide). In general CheckBecomePrimaryOK is successful; however, if there are common cases in which a node does not become the primary member, they should be handled in CanNodeStartToBecomePrimary rather than CheckBecomePrimaryOK. 46 Caché High Availability Guide
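Before looking at the two remaining notification entry points (described next) and the full commented sample at the end of this chapter, it can help to see the overall shape of the routine. The following minimal skeleton is illustrative only: it simply returns the documented defaults for each entry point and does none of the validation a production ^ZMIRROR must perform, in particular the IsOtherNodeDown check required when Agent Contact Required for Failover=No.

ZMIRROR ; minimal illustrative skeleton of the user-defined mirror routine
        quit  ; do not enter at the top
IsOtherNodeDown() PUBLIC {
        ; Called only when Agent Contact Required for Failover=No.
        ; Return 1 only if the other node is definitively down;
        ; returning 0, as here, aborts automatic failover (the safe default).
        quit 0
}
CanNodeStartToBecomePrimary() PUBLIC {
        ; Return 0 to block takeover; the instance retries every 30 seconds.
        quit 1
}
CheckBecomePrimaryOK() PUBLIC {
        ; Final check before users are allowed on; return 0 to abort the takeover.
        quit 1
}
NotifyBecomePrimary() PUBLIC {
        ; Informational hook called after this member has become the primary.
        quit
}
NotifyBecomePrimaryFailed() PUBLIC {
        ; Informational hook called when this member fails to become the primary.
        quit
}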

53 Configuring Mirroring NotifyBecomePrimary^ZMIRROR() This procedure is executed for informational purposes after a system has successfully assumed the role of primary failover member. Acceptable return values: N/A NotifyBecomePrimary^ZMIRROR() is called at the very end of the process of becoming the primary failover member (that is, after users have been allowed on and ECP sessions, if any, have become active). This entry point does not return a value. You can include code to generate any notifications or enable application logins if desired. NotifyBecomePrimaryFailed^ZMIRROR() This procedure is executed for informational purposes when a system fails to assume the role of primary failover member. Acceptable return values: N/A NotifyBecomePrimaryFailed^ZMIRROR() is called: When a failover member starts up and fails to become the primary or backup member. When the backup detects that the primary has failed and the backup fails to take over for the primary. This entry point is called only once per incident. If the node becomes the backup again (that is, connects to the primary), then NotifyBecomePrimaryFailed is called again if the backup fails to take over as the primary; however, once it is called, it is not called again until the node either becomes the primary or the primary failover member is detected Sample ^ZMIRROR Routine A commented sample implementation of ^ZMIRROR is provided in the following: ZMIRROR ; quit ;don't enter at the top #include %occstatus #include %symirror #ifndef FailoverMemberType #define FailoverMemberType 0 #endif /* */ /* THIS ROUTINE IS PROVIDED AS-IS, WITHOUT WARRANTIES. IT IS MEANT TO BE AN EXAMPLE/SAMPLE ZMIRROR ROUTINE. YOU SHOULD TAILOR THE ZMIRROR ROUTINE BASED ON YOUR INFRASTRUCTURE AND CONFIGURATION. FOR EXAMPLE, IF THE 2 FAILOVER MEMBERS AREN'T CONNECTED VIA A RELIABLE, REDUNDANT NETWORK, THEN YOU MUST *NOT* SET AGENTCONTACTREQUIRED TO FALSE (I.E., ALWAYS RUN WITH THE DEFAULT AGENTCONTACTREQUIRED=TRUE) - THIS IS BECAUSE YOU CANNOT DEFINITIVELY TELL WHETHER THE NETWORK BETWEEN THE 2 SYSTEMS IS DOWN, OR WHETHER THE OTHER SYSTEM ITSELF IS DOWN. IF, HOWEVER, YOU HAVE A RESILIANT, RELIABLE, REDUNDANT NETWORK BETWEEN THE 2 SYSTEMS, THEN YOU MAY EXPLORE THE POSSIBILITY OF SETTING AGENTCONTACTREQUIRED TO FALSE. IN THIS CASE, YOU MUST PROVIDE AN ADEQUATE IMPLEMENTATION IN THE ^ZMIRROR ROUTINE. NAMELY, YOU MUST IMPLEMENT AN APPROPRIATE IsOtherNodeDown^ZMIRROR() FUNCTION WHICH WILL DEFINITIVELY BE ABLE TO TELL WHETHER OR NOT THE OTHER NODE IS DOWN (IF YOU CANNOT PROGRAMATICALLY DEFINITIVELY TELL WHETHER THE OTHER NODE IS DOWN, YOU *MUST* ASSUME IT IS UP AND SUBSEQUENTLY PERFORM A MANUAL FAILOVER). ALSO, YOUR IMPLEMENTATION OF ^ZMIRROR (SHOULD YOU CHOSE TO IMPLEMENT ONE) SHOULD HAVE ADEQUATE ERROR TRAPPING. The sample implementation of IsOtherNodeDown^ZMIRROR below makes these assumptions: 1. The two nodes are connected directly to each other - that is, no switches/routers/etc between them, at least for the mirror communication channel 2. There should be at least 2 IPs configured - one for the Mirror Communication channel (private), and the other for the SuperServer (public) Caché High Availability Guide 47

54 Mirroring Here's the general algorithm for the code in the example: 1. Try to determine if this node (the one trying to become primary) is isolated (i.e., if it can or cannot access other machines). Do this by: a. If an IP address or FQDN is set in the ^ZMIRROR("CONFIG","KNOWN-IP") global, try to ping it. This could be set to a machine that's highly availble inside the network - i.e., a machine that is trusted to be up. b. If no IP is set in ^ZMIRROR("CONFIG","KNOWN-IP"), try to ping This assumes that the system can talk to the internet and PING is allowed through the firewall. These mechanisms are trying to determine whether the local system has become isolated from the network or not. If the local machine is no longer "on the network" then it cannot use a ping to determine whether the other node is up or down so it is very important to set the ^ZMIRROR("CONFIG","KNOWN-IP") node to an appropriate internal machine's FQDN (or IP). 3. The sample then accesses the Mirror configuration via the Config.Mirror* classes, and extracts all of the IP addresses which belong to the node being tested. These are the: Mirror Private address, ECP address Public (SuperServer) address It is best if there are multiple addresses which are assigned to different NICs so that the failure of a single network card does not make the node appear to be down. If less than 2 configured, we pretend like the other system is up. 4. Each IP address is then PINGed and if the ping returns, the other node is considered to be UP. 5. If all the IPs have been exhausted (and the system wasn't reachable on any of the configured IPs) a) If only 1 IP was configured, return "Other node is UP" b) If more than 1 IP was configured, return "OTHER NODE IS DOWN" The sample logs the results of various stages in the ^ZMIRRORINFO global. This can be viewed using the Global Explorer in the System Management Portal or by issuing 'zw ^ZMIRRORINFO' in the %SYS namespace. */ IsOtherNodeDown() PUBLIC { quit 0 ;Remove this after you have tailored the routine to the specific configuration. set $zt="isothernodedownerr" do logmsg("isothernodedown^zmirror() invoked") set isdown=0 #;Check if the ^ZMIRROR("CONFIG","KNOWN-IP") node contains an #;address of a node on this network #;if none specified, try google.com. This will only work if this #;system can talk to the internet #;set anothersystemip to an IP of a node in the network: set anothersystemip=$get(^zmirror("config","known-ip")," #;Check if anothersystemip is reachable - if not, we're probably #;isolated, so we should assume that the other node is up #;(to prevent split brain) and ABORT if '##class(%system.inetinfo).checkaddressexist(anothersystemip) { set msg="isothernodedown^zmirror() could not reach external IP " _anothersystemip_" - we *may* be ISOLATED. Abort takeover" do logmsg(msg) goto IsOtherNodeDownDone } set msg="isothernodedown^zmirror() was able to reach to external IP " _anothersystemip_" - we're not isolated, so continuing" do logmsg(msg) #;Next, pull out all the info from the various configuration pieces that #;are needed for us to talk to the other node. #;This is version specific (since the implementation for storage of #;mirror information in the CPF file changed in ). set majorvers=##class(%system.version).getmajor() set mirmemberconfig=##class(config.mirrormember).open() if '$IsObject(mirMemberConfig) { set msg="isothernodedown^zmirror() couldn't open Mirror Member config." _"Abort takeover" do logmsg(msg) goto IsOtherNodeDownDone } 48 Caché High Availability Guide

    set ourmirname=mirMemberConfig.SystemName
    if majorvers<2012 {
        set rs=##class(%Library.ResultSet).%New("Config.MirrorSetMembers:List")
        set rc=rs.Execute()
    } else {
        set rs=##class(%Library.ResultSet).%New("Config.MapMirrors:List")
        set rc=rs.Execute($SYSTEM.Mirror.MirrorName())
    }
    if $$$ISERR(rc) {
        set msg="IsOtherNodeDown^ZMIRROR() couldn't open Mirror Set Member config. Abort takeover"
        do logmsg(msg)
        goto IsOtherNodeDownDone
    }
    set found=0
    while (rs.Next()) {
        set name=rs.Data("Name")
        quit:name=""  ;out of mirror members
        if (name'=ourmirname) {
            #; Prior to 2012 only the failover members were listed in the MirrorSetMember
            #; list. Starting in 2012 all mirror members are listed and there is a member
            #; type field. In both cases there can only be two failover members and we've
            #; already filtered ourself out above so the next failover member we find
            #; is the system we're looking for.
            if majorvers<2012 {
                set MemberType=$$$FailoverMemberType    ;see %syMirror.inc
            } else {
                set MemberType=rs.Data("MemberType")    ;type values are in %syMirror.inc
            }
            if MemberType=$$$FailoverMemberType {
                set found=1
                set agentip=rs.Data("AgentAddress")
                set ecpip=rs.Data("ECPAddress")
                set mirrorip=rs.Data("MirrorAddress")
                if agentip'="" { set mirinfo("targetip",agentip)="" }
                if ecpip'="" { set mirinfo("targetip",ecpip)="" }
                if mirrorip'="" { set mirinfo("targetip",mirrorip)="" }
            }
        }
        quit:found  ; exit loop, we found the 2nd failover member
    }
    set rs=""
    if 'found {
        #;special case - this means that we're the only mirror member configured,
        #;so go ahead and declare the "other node" as down
        set isdown=1
        set msg="IsOtherNodeDown^ZMIRROR detected that this is the only configured mirror member, responding with YES"
        do logmsg(msg)
        goto IsOtherNodeDownDone
    }
    if $data(mirinfo("targetip"))=0 {
        set isdown=0
        set msg="IsOtherNodeDown^ZMIRROR failed to locate IP information for other mirror member, responding with NO"
        do logmsg(msg)
        goto IsOtherNodeDownDone
    }
    #;At this point we have at least 1 IP for the other system,
    #;and at most 3 IPs to try
    #;Next, we try to ping the other system to see if we can even reach it,
    #;and we keep track of how many addresses we find if we fail
    set isdown=0
    set targetip=""
    set ipcount=0
    for {
        set targetip=$order(mirinfo("targetip",targetip))
        quit:targetip=""
        set ipcount=ipcount+1
        #;Ping the IP address - is it reachable?

        if ##class(%SYSTEM.INetInfo).CheckAddressExist(targetip) {
            set msg="IsOtherNodeDown^ZMIRROR was able to ping the other system at IP address "_targetip_" so exiting NO"
            do logmsg(msg)
            goto IsOtherNodeDownDone
        }
    }
    #;If only 1 unique IP is configured, quit (isdown=0)
    if ipcount=1 {
        #;We only have 1 IP address configured - log error and return other node is UP
        set msg="IsOtherNodeDown^ZMIRROR only detected 1 unique IP in the configuration, responding with NO. Need at least 2 IPs configured..."
        do logmsg(msg)
        goto IsOtherNodeDownDone
    }
    #;If we're here, we couldn't make contact AT ALL with the other system, and
    #;we tried more than 1 ip address so now we proceed with takeover
    #;
    #;it's OK to proceed with takeover in this case because we're confident
    #;that the network between the systems is extremely robust, and that failure
    #;to talk to the other system is NOT due to a failure of the network between
    #;the two systems.
    #;If you aren't sure about the network between the 2 systems, you should
    #;take the safe course and actually let the system go into a waiting state
    #;till you can manually verify that the other system is indeed down.
    set isdown=1
    set msg="IsOtherNodeDown^ZMIRROR has exhausted all possible connectivity tests - proceeding to exit with YES"
    do logmsg(msg)
IsOtherNodeDownDone
    set msg="IsOtherNodeDown^ZMIRROR detected that the other node is "_$select(isdown:"down",1:"not (definitively) down")
    do logmsg(msg)
    #; If isdown=1, it means that the other node was definitively down
    #; ASSUMING that the network between the 2 systems is robust
    quit isdown
IsOtherNodeDownErr
    do logmsg("IsOtherNodeDown^ZMIRROR encountered an error: "_$ze)
    quit 0
}

CanNodeStartToBecomePrimary() PUBLIC {
    #;Put code here to determine if this system can start the process of becoming the Primary.
    #;For example, you could check to see if there are some required file system mounts
    #;or services that need to be running for this system to become Primary.
    #;Return 1 if this system can start to become Primary; 0 otherwise
    #;Note that this function is only called on versions that include JO2544.
    do logmsg("CanNodeStartToBecomePrimary^ZMIRROR() invoked")
    quit 1
}

CheckBecomePrimaryOK() PUBLIC {
    #;Put code here to determine if this system can become Primary.
    #;For example, you could check to see if there are some required file system mounts
    #;or services that need to be running for this system to become Primary.
    #;Return 1 if this system can become Primary; 0 otherwise
    do logmsg("CheckBecomePrimaryOK^ZMIRROR() invoked")
    quit 1
}

NotifyBecomePrimary() PUBLIC {
    #;This procedure is called as a notification when this system becomes Primary
    #;It does not return any value
    do logmsg("NotifyBecomePrimary^ZMIRROR() invoked")
    quit
}

 /*
 NotifyBecomePrimaryFailed^ZMIRROR:
 This procedure is called as a notification when this system failed to become Primary.
 It does not return any value. You can add your own custom notification, etc.
 */
NotifyBecomePrimaryFailed() PUBLIC {
    #;This procedure is called as a notification when this system fails to become Primary
    #;It does not return any value

    do logmsg("NotifyBecomePrimaryFailed^ZMIRROR() invoked")
    quit
}

 /*
 logmsg^ZMIRROR: Helper sub
 Log the passed message into the next node in ^ZMIRRORINFO
 */
logmsg(msg) PRIVATE {
    set ^ZMIRRORINFO($i(^ZMIRRORINFO))=$zdatetime($now(),1,1,3)_" "_msg
    quit
}

Forms

This section includes forms that are intended to help you configure mirrors:

- Mirror Configuration Details Form

Mirror Configuration Details

Table 4-2: Mirror Configuration Details Form (Part 1): Mirror and First Failover Member (Configuration Items)

Config | Description | Default | Configured Value
Mirror Name | Required. A descriptive name that identifies this mirror. | Blank |
Mirror Member Name | Required. A descriptive name that identifies the first failover member. | <System/Instance> |
Use SSL | Optional. | Yes |
Use Virtual IP | Optional. | No |
IP Address | Optional. Displayed only if Use Virtual IP is Yes. | Blank |
CIDR Mask | Optional. Displayed only if Use Virtual IP is Yes. | Blank |
Network Interface | Optional. Displayed only if Use Virtual IP is Yes. An interface on this machine/host that can be used for the Virtual IP in the future. | Blank |
Acknowledgment Mode | Required. Acknowledgment mode when data is sent to the second failover member. | Receive |
Agent Contact Required for Failover | Required. Controls behavior of the second failover member in relation to availability of the first failover member during an attempted takeover. | Yes |

Table 4-2 (continued)

Config | Description | Default | Configured Value
Mirror Private Address | Required. The IP address or host name used by the failover members to communicate with each other. | Same as the superserver address of this instance. |
Mirror Agent Port | Required. The port on this machine/host that the ISCAgent runs on. | 2188 |
Superserver Address | Optional. The IP address or host name that external systems use to communicate with this instance. | Same as the superserver address of this instance. |
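Once Part 1 of the form has been completed and the mirror created on the first failover member, the recorded values can be spot-checked from a Caché terminal. The following is a minimal verification sketch, not part of the original form; it uses only the Config.MirrorMember class and the $SYSTEM.Mirror.MirrorName() call that also appear in the sample ^ZMIRROR routine earlier in this chapter, and assumes it is run in the %SYS namespace on a configured failover member:

    ; Minimal verification sketch (assumes the %SYS namespace on a configured failover member)
    set mm=##class(Config.MirrorMember).Open()
    if $IsObject(mm) {
        write "Mirror member name: ",mm.SystemName,!
    } else {
        write "No mirror member configuration found on this instance",!
    }
    write "Mirror name: ",$SYSTEM.Mirror.MirrorName(),!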

Table 4-3: Mirror Configuration Details Form (Part 2): Second Failover Member (Configuration Items)

Config | Description | Default | Configured Value
Mirror Name | Required. Same name as specified in Mirror Configuration Details Form (Part 1). | Blank |
Agent Address on Other System | Required. Same superserver IP address or host name that was specified for the first failover member in Mirror Configuration Details Form (Part 1). | Blank |
Mirror Agent Port | Required. Same port as specified for the first failover member in Mirror Configuration Details Form (Part 1). | 2188 |
Caché Instance Name | Required. Instance name of the first failover member. | Blank |
Mirror Member Name | Required. A descriptive name that identifies this failover member. | <System/Instance> |
Mirror Private Address | Required. The IP address or host name used by the failover members to communicate with each other. | Same as the superserver address of this instance. |
Mirror Agent Port | Required. The port on this machine/host that the ISCAgent runs on. | 2188 |
Superserver Address | Optional. The IP address or host name that external systems use to communicate with this instance. | Same as the superserver address of this instance. |

4.2 Caché Mirroring Concepts

As shown in the following illustration, a mirror is a logical grouping of two physically independent Caché systems, called failover members. The mirror automatically assigns the role of primary to one of the two failover members after arbitrating between the two systems; the other system automatically becomes the backup failover member. If the primary failover member fails, the backup failover member takes over as the primary, and the failed primary becomes the backup; for more information, see The Failover Process, including its subsections, in this section.

Figure 4-1: Mirror
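Because the primary and backup roles can change after a failover, application code occasionally needs to determine at run time which role the local instance currently holds. The short sketch below is an illustration only and is not part of the original guide; $SYSTEM.Mirror.MirrorName() is used by the sample ^ZMIRROR routine earlier in this chapter, while the availability of $SYSTEM.Mirror.IsPrimary() and the empty-string return of MirrorName() on a non-mirrored instance are assumptions to verify against the %SYSTEM.Mirror class documentation for your version:

    ; Report the mirror role of the local instance (sketch; see assumptions above)
    if $SYSTEM.Mirror.MirrorName()="" {
        write "This instance is not configured as a mirror failover member",!
    } elseif $SYSTEM.Mirror.IsPrimary() {
        write "This instance is currently the primary failover member of mirror ",$SYSTEM.Mirror.MirrorName(),!
    } else {
        write "This instance is currently not the primary (backup, or not yet connected)",!
    }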

In addition to the two failover members, mirroring provides a special async member, which can be configured to receive updates from one or more mirrors; for information, see the Async Mirror Member subsection in this chapter.

All mirrored databases on the primary failover member are journaled, regardless of any system configuration or user code that may attempt to bypass journaling for those databases. In addition, all mirrored databases on the elected backup failover member are mounted as read-only to prevent accidental updates to the databases.

While it is possible for the two failover members to run on different operating systems and to be of different endianness (byte order), both members must have the same Caché character width (that is, both systems must be either 8-bit or 16-bit Unicode systems); it is not possible to mirror across failover members that are mixed character-width systems.

All external clients (language bindings, ODBC/JDBC/SQL clients, direct-connect users, etc.) connect to the mirror through an optional mirror virtual IP (VIP) address, which is automatically bound to an interface on the member that was elected primary by the mirror; for more information, see the Configuring a Mirror Virtual IP (VIP) section of this chapter. Enterprise Cache Protocol (ECP) application servers have direct built-in knowledge of the members of the mirror, including the current primary. The application servers, therefore, do not rely on the mirror VIP, but instead connect directly to the elected primary failover member.

This section covers the following topics:

- ISCAgent
- Async Mirror Member
- Mirror Synchronization
- Communication Channels
- The Failover Process
- Sample Configurations

ISCAgent

Mirroring includes an executable program, ISCAgent, that runs on all failover members in the mirror (and the async member, if configured). It is used for inter-system communication during failover/takeover. By default the Agent Service (ISCAgent) uses port 2188; however, you can change the address and/or port number as described in the Customizing the ISCAgent Port section of this chapter.
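As a concrete illustration of the port customization mentioned above, the ISCAgent on UNIX/Linux systems typically reads a small plain-text configuration file. The file path and key name shown below are assumptions based on typical installations, not a definitive reference; verify them against the Customizing the ISCAgent Port and Customizing the ISCAgent Interface sections of this chapter before changing anything:

    # Assumed location: /etc/iscagent/iscagent.conf (verify for your platform)
    # Default agent port shown; change the value to move the ISCAgent to another
    # port, then restart the ISCAgent service.
    application_server.port=2188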

The agent runs securely on a dedicated, configurable port on each member, and responds to requests from the other failover member. The role and interaction of the mirror agent are described in detail in the Failover (Takeover) Rules subsection of this chapter.

Async Mirror Member

As shown in the following illustration, mirroring also allows for a special member called an async member, which can be configured to receive updates from one or more mirrors across the enterprise, thus allowing a single node to act as a comprehensive enterprise-wide data warehouse. The async member provides additional flexibility in that it is possible to choose which mirrored databases from a mirror should be replicated; alternatively, all mirrored databases from a mirror could be replicated. Async members do not belong to a mirror and, therefore, are not candidates for failover.

Figure 4-2: Async Member Connected to Multiple Mirrors

An async member can be configured as an enterprise data warehouse, or one or more mirrors from the enterprise can update a single async member acting as a centralized repository. This configuration allows for rich reporting, business intelligence (BI), and data mining capabilities against data from across the enterprise. For example, InterSystems DeepSee can be easily deployed on the async member to provide embedded real-time BI such that key performance indicators from across the enterprise can be analyzed quickly and efficiently from a centralized location. Since the async member remains synchronized with the mirror(s) to which it is connected, this architecture provides a platform for distributed real-time operational reporting.

Since the data on the async member is continually updated from changes occurring on the mirrors to which it is connected, there is no guarantee of synchronization of updates and synchronization of results across queries on the async member. It is up to the application running against the async member to guarantee consistent results for queries that span changing data.

Finally, as shown in the following illustration, it is possible to connect up to six (6) async members to a single mirror, further enhancing the business continuity and disaster recovery plans of an organization by providing a framework for reliable replication across multiple, potentially geographically-dispersed, sites.

Figure 4-3: Multiple Async Members Connected to Single Mirror
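As noted above, it is up to the application to guarantee consistent results for reports whose queries span data that is still being dejournaled on the async member. One generic application-level approach is to bracket the report with a marker that the application itself maintains and to re-run the report if the marker moved. The sketch below is purely illustrative and not from the original guide; ^ReportBatch and the Report stub are hypothetical application names:

ASYNCRPT ; illustrative consistency check for reporting on an async member
    ; ^ReportBatch is a hypothetical counter that the application increments on the
    ; primary with every batch of updates; it reaches the async member through
    ; dejournaling along with the data it describes.
    for attempt=1:1:5 {
        set before=$get(^ReportBatch,0)
        do Report           ; run the report queries (stub below)
        set after=$get(^ReportBatch,0)
        quit:before=after   ; nothing changed underneath the report; accept the results
    }
    quit
Report ; stub: replace with the actual report queries
    quit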

Mirror Synchronization

The synchronization process between the primary failover member and any connected members (for example, the backup failover member or the async member) differs based on the status of the connected member. Mirroring uses the journal write cycle on the primary failover member to synchronize data across the mirror. A journal write operation can be triggered:

- Once every two (2) seconds if the system is idle.
- By the data server when responding to specific requests (for example, $Increment) from the application servers in an ECP configuration to guarantee ECP semantics.
- By a TCOMMIT (in synchronous commit mode, which causes the data involved in that transaction to be flushed to disk) if you are using Caché transactions (see the transaction sketch following the communication channel list below).
- As part of every database write cycle by the Write daemon.
- When the journal buffer is full.

In addition, mirroring provides system tunable parameters that can be adjusted; for information, see Mirror Tunable Parameters in this chapter.

Communication Channels

As shown in the following illustration, mirroring communication between the primary and backup failover members occurs over the following dedicated TCP channels:

- The data channel sends data from the primary to the backup; consequently, it is the most heavily used communication channel.
- The ack (acknowledgment) channel transports acknowledgments from the backup to the primary.
- The agent channel is used to connect to the ISCAgent on the other system.
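To make the transaction-related trigger in the journal write list above concrete, the fragment below shows an ordinary Caché transaction. When the instance runs with synchronous commit mode enabled (as described in that bullet), the TCOMMIT forces the journal data for the transaction to disk, and that journal write cycle is what the mirror uses to ship the updates to the backup and any async members. This is a minimal sketch; the global ^Account is hypothetical:

    ; Hypothetical transfer between two nodes of an application global.
    ; Under synchronous commit mode, the TCOMMIT flushes the journal buffer,
    ; which in turn drives a mirror synchronization cycle.
    tstart
    set ^Account("A")=$get(^Account("A"),0)-100
    set ^Account("B")=$get(^Account("B"),0)+100
    tcommit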

Figure 4-4: Mirror Communication Channels

The primary failover member also has corresponding data and ack channels for each connected async member.

While it is optional to configure SSL/TLS for these communication channels, it is highly recommended, because sensitive data passes between the failover members and SSL/TLS provides authentication for the ISCAgent, which provides remote access to journal files and can force down the system or manipulate its virtual IP address. This SSL/TLS configuration can be specified during the mirror configuration process; for more information about the implementation of SSL/TLS across the mirror, see the About Mirroring and SSL/TLS section in the Using SSL/TLS with Caché chapter of the Caché Security Administration Guide.

InterSystems recommends that you use a high-bandwidth, low-latency network between the two failover members to minimize the latency of updates to the backup failover member. A slow network may require the primary failover member to be throttled in order to allow the backup to keep up, which could impact performance on the primary failover member; also, a slow (or congested) network could interfere with the backup's ability to rapidly take over in the event of a failover, because the backup failover member may take a long time to perform all the necessary synchronization tasks that allow it to become active.

Mirroring Communication Processes

There are processes that run on each system (primary and backup failover members, and each connected async member) that are responsible for mirror communication and synchronization. For more information, see the following topics:

- Mirroring Processes on the Primary Failover Member
- Mirroring Processes on the Backup Failover Member/Async Member

Mirroring Processes on the Primary Failover Member

Running the System Snapshot routine (^%SS) on the primary failover member reveals the processes listed in the following table. The CPU, Glob, and Pr columns have been intentionally omitted from the ^%SS output in this section.

Table 4-4: Mirroring Processes on Primary Failover Member

Device | Namespace | Routine | User/Location
/dev/null | %SYS | MIRRORMGR | Mirror Master
MDB2 | %SYS | MIRRORCOMM | Mirror Primary*
 | %SYS | MIRRORCOMM | Mirror Svr:Rd*

The processes are defined as follows:

- Mirror Master: This process, which is launched at system startup, is responsible for various mirror control and management tasks.
- Mirror Primary: This is the outbound side of the data channel; it is a one-way channel. There is one job per connected system (backup failover or async member).
- Mirror Svr:Rd*: This is the inbound acknowledgment channel; it is a one-way channel. There is one job per connected system (backup failover or async member).

Each connected async member results in a new set of Mirror Master, Mirror Primary, and Mirror Svr:Rd* processes on the primary failover member.

Mirroring Processes on the Backup Failover/Async Member

Running the System Snapshot routine (^%SS) on the backup failover/async member reveals the processes listed in the following table.

Table 4-5: Mirroring Processes on Backup Failover/Async Member

Device | Namespace | Routine | User/Location
/dev/null | %SYS | MIRRORMGR | Mirror Master
/dev/null | %SYS | MIRRORMGR | Mirror Dejour
/dev/null | %SYS | MIRRORMGR | Mirror Prefet*
/dev/null | %SYS | MIRRORMGR | Mirror Prefet*
MDB1 | %SYS | MIRRORMGR | Mirror Backup
/dev/null | %SYS | MIRRORMGR | Mirror JrnRead

The processes identified in this table also appear on each connected async member:

- Mirror Master: This process, which is launched at system startup, is responsible for various mirror control and management tasks.
- Mirror JrnRead (Mirror Journal Read): This process reads the journal data being generated on the backup into memory and queues up these changes to be dejournaled by the dejournal job.
- Mirror Dejour (Mirror Dejournal): This is the dejournal job on the backup failover member; it issues the sets and kills from the received journal data to the mirrored databases.
- Mirror Prefet* (Mirror Prefetch): These processes are responsible for pre-fetching the disk blocks needed by the dejournal job into memory before the dejournal job actually attempts to use them. This is done to speed up the dejournaling process. There are typically multiple Mirror Prefetch jobs configured on the system.

- Mirror Backup: This is the inbound side of the data channel, which is also used to send the acknowledgment to the primary; it is a two-way channel.

The Failover Process

Mirroring provides rapid, automatic, unattended failover. There are several events that could trigger a failover, such as:

- The data channel is closed by the primary failover member, which could occur if Caché on the primary member becomes unresponsive due to an application hang or error.
- An operator manually issues a takeover command.
- A takeover command is issued on the backup via the SYS.Mirror API.

There are many predefined rules that influence the behavior of a takeover. These rules are mostly configurable and are designed to provide a customized failover configuration that is appropriate to your deployment. This section covers the following topics:

- The Failover Process: A System Perspective
- The Failover Process: An Application Perspective
- Failover (Takeover) Rules
- Failover Scenarios

The Failover Process: A System Perspective

Whenever possible, mirroring provides rapid, automatic, unattended failover. There are several events that could trigger a failover, such as:

- The backup does not hear from the primary within a required interval, which could occur in the case of network problems.
- An application or host problem causes Caché to become unresponsive on the primary.
- A takeover is initiated by an operator or script.

The backup system ensures it is fully up-to-date before marking itself as the new primary system. The default mirroring configuration prevents errors during takeover, such as split-brain syndrome (a condition whereby both systems concurrently run as active primaries), which could lead to logical database degradation and loss of integrity.

In addition, it is possible for an operator to temporarily bring the primary system down without causing a failover to occur. This mode can be useful, for example, in the event the primary system needs to be brought down for a very short period of time for maintenance. After bringing the primary system back up, the default behavior of automatic failover is restored.

The Failover Process: An Application Perspective

On a successful failover, the mirror VIP (if configured) is automatically bound to a local interface on the new primary. This allows external clients to reconnect to the same mirror VIP address as before, which greatly simplifies the management of external client programs because they do not need to be aware of multiple database systems and IP addresses. If, however, a mirror VIP is not configured, external clients will need to maintain knowledge of the two failover members and appropriately connect to the currently running primary.

The application and connection contexts are reset because these clients are reconnecting to new systems; any open transactions are appropriately rolled back. If there are pending or in-progress transaction rollbacks during the primary mirror member startup, Caché prompts you to use the Manage^JRNROLL procedure to manage the rollbacks; for more information, see Manage Transaction Rollback Using Manage^JRNROLL in the Journaling chapter of the Caché Data Integrity Guide.
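If Caché reports pending or in-progress rollbacks at primary startup, the procedure named above is run from the %SYS namespace. The fragment below shows only the invocation; the interactive prompts it presents are documented in the Caché Data Integrity Guide rather than reproduced here, and the argumentless calling convention is an assumption beyond the routine name given above:

    ; Run on the new primary failover member
    znspace "%SYS"        ; Manage^JRNROLL is run from the %SYS namespace
    do Manage^JRNROLL     ; interactively review and manage pending mirror transaction rollbacks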

In an ECP deployment, application servers view a failover as a server restart condition. By design, ECP application servers reestablish their connections to the new primary failover member and continue processing their in-progress workload; during the failover process, users connected to the application servers may experience a momentary pause before they are able to resume work. For this to occur, the failover between the two failover members must occur within the configured ECP recovery timeout. If, however, the failover takes longer than this timeout, ECP recovery is initiated (that is, open transactions are rolled back, locks are released, etc.), and new connections to the new primary system are established by the ECP application servers.

Failover (Takeover) Rules

The main goals of the backup failover member are to determine definitively that the:

- Primary failover member is down (either because there has been a failure or it has been forced down).
- Backup failover member has all of the journal data that is present in the databases on the primary failover member.

Once a failover is triggered, the backup attempts to automatically take over as the primary. The default mirror configuration attempts to balance the convenience of an automated takeover with the security of a failsafe takeover. This section discusses the logic, rules, and result of an automatic takeover.

A central element of the automated failover functionality is the ability of the backup to determine whether or not it is active (that is, caught up with the primary) when the primary failed, and to subsequently become caught up if needed. The backup does this by attempting to validate the end of the last mirror journal file across the two systems to ensure that it has received the most recent updates from the primary. If the backup is able to establish communication with the ISCAgent on the primary failover member, it not only validates the end of the last/active mirror journal file, but also does the following:

- If the primary is currently running, the backup aborts the automatic failover process and attempts to re-link with the primary.
- If the primary is currently hung, the backup requests the ISCAgent to force the primary down. This is done to minimize the possibility of split-brain syndrome, whereby two systems continue to run in the role of primary.
- If it is determined that the primary was more current than the backup at the time of the failure, then the backup asks the ISCAgent to send the additional data so that it can get fully caught up.

Failover Scenarios

This section outlines several failover scenarios, along with the expected result of automatic failover in each scenario. The configuration options that impact failover are also considered as part of this discussion. All cases in which the primary failover member is brought down with the nofailover option are ignored because no failover would occur in those cases. Additionally, if the Agent Contact Required for Failover tuning parameter is configured as No, a user-defined external function must exist so that the status of the failed primary can be externally determined (for information about the Agent Contact Required for Failover tuning parameter, see Mirror Tunable Parameters in this chapter).

The following scenarios are described:

- Primary Failover Member Fails and Backup Failover Member is Running
- Primary Failover Member and ISCAgent Fails and Backup Failover Member is Running and Active
- Operator-initiated Failover

Primary Failover Member Fails and Backup Failover Member is Running

The following table shows the state of the components/systems for this scenario:

Component/System | State
Caché on Primary | DOWN
ISCAgent on Primary | Running
Caché on Backup | Running
ISCAgent on Backup | N/A

As shown in the following illustration, the backup successfully takes over as the primary regardless of the status of the Agent Contact Required for Failover tuning parameter. The backup ensures that it is fully caught up prior to taking over as the primary.

Figure 4-5: Status of Systems

Primary Failover Member and ISCAgent Fails and Backup Failover Member is Running and Active

The following table shows the state of the components/systems for this scenario:

Component/System | State
Caché on Primary | DOWN
ISCAgent on Primary | DOWN
Caché on Backup | Running
ISCAgent on Backup | N/A

As shown in the following illustration, this scenario could occur in the event of a catastrophic system (host operating system or hardware) failure for the primary, or in the event of a network interruption between the primary and the backup. The success or failure of the failover depends on the following:

- Agent Contact Required for Failover tuning parameter set to No, the backup is active, and the $$IsOtherNodeDown^ZMIRROR() function returns True: In this situation, the backup successfully takes over as primary because it is not required to contact the ISCAgent on the failed primary, and it is able to determine that it is active at the time of the failure of the primary. A user-defined pre-primary function must have been configured and must return True for this failover to succeed.

- Agent Contact Required for Failover tuning parameter set to Yes, the backup status is either active or not active: In this situation, the backup aborts the attempt to take over as primary because it is not able to communicate with the ISCAgent on the primary failover member. This is the default configuration of the mirror; it is intended to prevent the backup failover member from taking over in the event of a network interruption between the two systems.

Figure 4-6: Status of Systems

Operator-initiated Failover

An operator has the ability to shut down Caché on the primary failover member in the following ways:

- Graceful shutdown
- Forced shutdown
- Graceful shutdown with the nofailover option, which can be passed as an argument to the platform-specific shutdown commands

The recommended method for triggering a planned failover is to perform a graceful shutdown on the primary failover member.

If the Agent Contact Required for Failover tuning parameter is set to Yes (see Agent Contact Required for Failover in this chapter), you can determine whether or not the backup failover member was caught up when the primary member failed by checking the cconsole.log file for messages similar to the following:

- (mirror_name) Failed to contact agent on former primary, can't take over indicates the backup failover member was active when the primary member failed, but could not take over because it could not reach the ISCAgent on the primary.
- (mirror_name) Non-active backup is down indicates the backup failover member was not active when the primary member failed and is resetting.

Graceful Shutdown

A graceful shutdown is the preferred method to trigger a failover during planned activities; to accomplish this, you shut down the running primary failover member, in which case the backup failover member takes over as primary member when it detects that the primary failover member has stopped running.

Forced Shutdown

To trigger a failover by forcing a shutdown, on the running backup mirror member, select the Try to make this primary option from the Mirror Management main menu list of the ^MIRROR routine. In this case, the backup only succeeds if the

ISCAgent process is running on the primary, and the backup member is able to communicate successfully with the agent process; this communication ensures the backup does not assume the role of primary while the old primary is still active and running.

Alternatively, as a last resort, you could attempt to force a failover by selecting the Force this node to become the primary option from the Mirror Management main menu list of the ^MIRROR routine on the running backup failover member. In this case, the backup tries to take over as the primary, regardless of whether it is able to connect to the old primary system, but does not take over if it fails to perform some basic required tasks (for example, reading a mirror journal file or mirror log file, etc.).

CAUTION: Forcing the backup failover member to become the primary may result in a loss of data; therefore, this option should be used only when you are absolutely certain that the old primary system is no longer running. Before using this option, InterSystems recommends that you contact the InterSystems Worldwide Response Center (WRC) for assistance.

Graceful Shutdown with nofailover Option

You can gracefully shut down the primary failover member without the backup member taking over as the primary by setting the nofailover option. In this case, a running backup failover member does not attempt to take over as primary member because the nofailover option is specified for the primary member. However, on the backup failover member, you can force the backup member to take over as the new primary failover member (while the old primary member continues to be offline) by selecting the Change No Failover State option from the Mirror Management main menu list of the ^MIRROR routine. This option clears the nofailover state, thus allowing the backup to take over as the new primary member. The state is cleared the next time the primary failover member is brought online.

Sample Configurations

The following configurations are intended as general samples to illustrate the various configurations that are possible with mirroring:

- Sample 1: Direct Connect
- Sample 2: Networked Through External Ethernet Switch

Although there are other combinations and configurations that are possible, the goal of this subsection is to provide a few popular and highly tolerant options. There is significant simplification in the samples because the purpose is to illustrate simple networking configurations that are possible.

Important: You should ensure that your systems are built in a highly redundant manner.

In general, the samples assume that the following types of network traffic exist:

- General (public) network: This network is also used for tasks such as administration, etc.; it is the public IP address for a given host.
- Mirror (private) network: Provides a specific isolated (private) network between the failover members; the private network is used solely for mirror communication.
- ECP network: If using ECP, you can provide a special network just for ECP communication or use the public network for all ECP traffic.

Sample 1: Direct Connect

The following figure illustrates a very simple configuration whereby the failover members are directly connected to each other with crossover cables for mirror communications. The systems are also connected to an external switch for public communication.

In this illustration:

- The Ethernet HBAs (or NICs) are illustrated as dual-ported NICs. Each system is depicted as having two dual-ported NICs. This is for reliability and redundancy purposes, and is a recommended configuration.
- It illustrates a common technique of failover bonding or teaming of ports across NICs (that is, port1 from NIC1 and port1 from NIC2 are teamed together for a specific network). This, too, is for reliability and redundancy purposes.
- While the figure specifies internal storage (SAS), it is not a requirement; the two systems could have external storage (that is, SAN, NAS, etc.). However, it is recommended that the physical spindles used by the two systems are independent.

This configuration is highly economical and simple in design. It takes advantage of a direct communication channel for mirror communication between the two failover members, entirely avoiding an external switch or hub, thereby reducing the likelihood of a network-related failure between the two systems.

Figure 4-7: Sample Configuration: Direct Connect

Sample 2: Networked Through External Ethernet Switch

The following figure illustrates a configuration where all network communication between the two failover members flows through an external Ethernet switch.

Once again, while the configuration illustrates internal SAS storage, it is not necessary to use internal storage. The systems could easily be configured to use external (SAN, NAS, etc.) storage as long as it is done in a redundant fashion and the two members don't share spindles. Also, as in the preceding figure, the HBAs are configured in a redundant fashion and the ports are teamed/bonded.

This configuration is flexible in that it allows some latitude in the geographic placement of the two failover members (that is, they don't need to be co-located in the same building or on the same floor). The major drawback with this sort of configuration is the dependence on the external switch for mirror (private) communications. Specifically:

- The external Ethernet switch must be highly redundant and available to ensure that communication is not compromised.
- Using a single external switch can compromise performance if the switch gets overloaded.
- In the event of a switch failure (or network failure), the two members may be isolated from each other (network segmentation), which can prevent automatic failover.

Figure 4-8: Sample Configuration: Networked Through External Ethernet Switch


More information

SanDisk ION Accelerator High Availability

SanDisk ION Accelerator High Availability WHITE PAPER SanDisk ION Accelerator High Availability 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Introduction 3 Basics of SanDisk ION Accelerator High Availability 3 ALUA Multipathing

More information

Backup & Disaster Recovery Appliance User Guide

Backup & Disaster Recovery Appliance User Guide Built on the Intel Hybrid Cloud Platform Backup & Disaster Recovery Appliance User Guide Order Number: G68664-001 Rev 1.0 June 22, 2012 Contents Registering the BDR Appliance... 4 Step 1: Register the

More information

CA ARCserve Replication and High Availability for Windows

CA ARCserve Replication and High Availability for Windows CA ARCserve Replication and High Availability for Windows Microsoft SQL Server Operation Guide r15 This documentation and any related computer software help programs (hereinafter referred to as the "Documentation")

More information

Red Hat Enterprise linux 5 Continuous Availability

Red Hat Enterprise linux 5 Continuous Availability Red Hat Enterprise linux 5 Continuous Availability Businesses continuity needs to be at the heart of any enterprise IT deployment. Even a modest disruption in service is costly in terms of lost revenue

More information

InfoPrint 4247 Serial Matrix Printers. Remote Printer Management Utility For InfoPrint Serial Matrix Printers

InfoPrint 4247 Serial Matrix Printers. Remote Printer Management Utility For InfoPrint Serial Matrix Printers InfoPrint 4247 Serial Matrix Printers Remote Printer Management Utility For InfoPrint Serial Matrix Printers Note: Before using this information and the product it supports, read the information in Notices

More information

Dell One Identity Cloud Access Manager 8.0.1- How to Configure for High Availability

Dell One Identity Cloud Access Manager 8.0.1- How to Configure for High Availability Dell One Identity Cloud Access Manager 8.0.1- How to Configure for High Availability May 2015 Cloning the database Cloning the STS host Cloning the proxy host This guide describes how to extend a typical

More information

VirtualCenter Database Maintenance VirtualCenter 2.0.x and Microsoft SQL Server

VirtualCenter Database Maintenance VirtualCenter 2.0.x and Microsoft SQL Server Technical Note VirtualCenter Database Maintenance VirtualCenter 2.0.x and Microsoft SQL Server This document discusses ways to maintain the VirtualCenter database for increased performance and manageability.

More information

Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008

Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008 Best Practices Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008 Installation and Configuration Guide 2010 LSI Corporation August 13, 2010

More information

MailMarshal SMTP in a Load Balanced Array of Servers Technical White Paper September 29, 2003

MailMarshal SMTP in a Load Balanced Array of Servers Technical White Paper September 29, 2003 Contents Introduction... 1 Network Load Balancing... 2 Example Environment... 5 Microsoft Network Load Balancing (Configuration)... 6 Validating your NLB configuration... 13 MailMarshal Specific Configuration...

More information

Preface... 1. Introduction... 1 High Availability... 2 Users... 4 Other Resources... 5 Conventions... 5

Preface... 1. Introduction... 1 High Availability... 2 Users... 4 Other Resources... 5 Conventions... 5 Table of Contents Preface.................................................... 1 Introduction............................................................. 1 High Availability.........................................................

More information

High Availability Guide for Distributed Systems

High Availability Guide for Distributed Systems Tivoli IBM Tivoli Monitoring Version 6.2.2 Fix Pack 2 (Revised May 2010) High Availability Guide for Distributed Systems SC23-9768-01 Tivoli IBM Tivoli Monitoring Version 6.2.2 Fix Pack 2 (Revised May

More information

Caché System Administration Guide

Caché System Administration Guide Caché System Administration Guide Version 2012.2 31 August 2012 InterSystems Corporation 1 Memorial Drive Cambridge MA 02142 www.intersystems.com Caché System Administration Guide Caché Version 2012.2

More information

Network Attached Storage. Jinfeng Yang Oct/19/2015

Network Attached Storage. Jinfeng Yang Oct/19/2015 Network Attached Storage Jinfeng Yang Oct/19/2015 Outline Part A 1. What is the Network Attached Storage (NAS)? 2. What are the applications of NAS? 3. The benefits of NAS. 4. NAS s performance (Reliability

More information

CTERA Agent for Linux

CTERA Agent for Linux User Guide CTERA Agent for Linux September 2013 Version 4.0 Copyright 2009-2013 CTERA Networks Ltd. All rights reserved. No part of this document may be reproduced in any form or by any means without written

More information

vsphere Replication for Disaster Recovery to Cloud

vsphere Replication for Disaster Recovery to Cloud vsphere Replication for Disaster Recovery to Cloud vsphere Replication 6.0 This document supports the version of each product listed and supports all subsequent versions until the document is replaced

More information

Stretching A Wolfpack Cluster Of Servers For Disaster Tolerance. Dick Wilkins Program Manager Hewlett-Packard Co. Redmond, WA dick_wilkins@hp.

Stretching A Wolfpack Cluster Of Servers For Disaster Tolerance. Dick Wilkins Program Manager Hewlett-Packard Co. Redmond, WA dick_wilkins@hp. Stretching A Wolfpack Cluster Of Servers For Disaster Tolerance Dick Wilkins Program Manager Hewlett-Packard Co. Redmond, WA dick_wilkins@hp.com Motivation WWW access has made many businesses 24 by 7 operations.

More information

Enterprise Manager. Version 6.2. Administrator s Guide

Enterprise Manager. Version 6.2. Administrator s Guide Enterprise Manager Version 6.2 Administrator s Guide Enterprise Manager 6.2 Administrator s Guide Document Number 680-017-017 Revision Date Description A August 2012 Initial release to support version

More information

How To Live Migrate In Hyperv On Windows Server 22 (Windows) (Windows V) (Hyperv) (Powerpoint) (For A Hyperv Virtual Machine) (Virtual Machine) And (Hyper V) Vhd (Virtual Hard Disk

How To Live Migrate In Hyperv On Windows Server 22 (Windows) (Windows V) (Hyperv) (Powerpoint) (For A Hyperv Virtual Machine) (Virtual Machine) And (Hyper V) Vhd (Virtual Hard Disk Poster Companion Reference: Hyper-V Virtual Machine Mobility Live Migration Without Shared Storage Storage Migration Live Migration with SMB Shared Storage Live Migration with Failover Clusters Copyright

More information

StarWind iscsi SAN & NAS: Configuring HA File Server on Windows Server 2012 for SMB NAS January 2013

StarWind iscsi SAN & NAS: Configuring HA File Server on Windows Server 2012 for SMB NAS January 2013 StarWind iscsi SAN & NAS: Configuring HA File Server on Windows Server 2012 for SMB NAS January 2013 TRADEMARKS StarWind, StarWind Software and the StarWind and the StarWind Software logos are trademarks

More information

CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server

CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server CA RECOVERY MANAGEMENT R12.5 BEST PRACTICE CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server Overview Benefits The CA Advantage The CA ARCserve Backup Support and Engineering

More information

Windows Server 2008 R2 Hyper-V Live Migration

Windows Server 2008 R2 Hyper-V Live Migration Windows Server 2008 R2 Hyper-V Live Migration Table of Contents Overview of Windows Server 2008 R2 Hyper-V Features... 3 Dynamic VM storage... 3 Enhanced Processor Support... 3 Enhanced Networking Support...

More information

NetIQ AppManager for Self Monitoring UNIX and Linux Servers (AMHealthUNIX) Management Guide

NetIQ AppManager for Self Monitoring UNIX and Linux Servers (AMHealthUNIX) Management Guide NetIQ AppManager for Self Monitoring UNIX and Linux Servers (AMHealthUNIX) Management Guide September 2014 Legal Notice THIS DOCUMENT AND THE SOFTWARE DESCRIBED IN THIS DOCUMENT ARE FURNISHED UNDER AND

More information

High Availability for Citrix XenApp

High Availability for Citrix XenApp WHITE PAPER Citrix XenApp High Availability for Citrix XenApp Enhancing XenApp Availability with NetScaler Reference Architecture www.citrix.com Contents Contents... 2 Introduction... 3 Desktop Availability...

More information

Studio 5.0 User s Guide

Studio 5.0 User s Guide Studio 5.0 User s Guide wls-ug-administrator-20060728-05 Revised 8/8/06 ii Copyright 2006 by Wavelink Corporation All rights reserved. Wavelink Corporation 6985 South Union Park Avenue, Suite 335 Midvale,

More information

Administering and Managing Failover Clustering

Administering and Managing Failover Clustering 24_0672329565_ch18.qxd 9/7/07 8:37 AM Page 647 CHAPTER 18 Administering and Managing Failover Clustering Failover clustering is one of four SQL Server 2005 highavailability alternatives. Other SQL Server

More information

Online Transaction Processing in SQL Server 2008

Online Transaction Processing in SQL Server 2008 Online Transaction Processing in SQL Server 2008 White Paper Published: August 2007 Updated: July 2008 Summary: Microsoft SQL Server 2008 provides a database platform that is optimized for today s applications,

More information

Microsoft SharePoint 2010 on VMware Availability and Recovery Options. Microsoft SharePoint 2010 on VMware Availability and Recovery Options

Microsoft SharePoint 2010 on VMware Availability and Recovery Options. Microsoft SharePoint 2010 on VMware Availability and Recovery Options This product is protected by U.S. and international copyright and intellectual property laws. This product is covered by one or more patents listed at http://www.vmware.com/download/patents.html. VMware

More information

McAfee SMC Installation Guide 5.7. Security Management Center

McAfee SMC Installation Guide 5.7. Security Management Center McAfee SMC Installation Guide 5.7 Security Management Center Legal Information The use of the products described in these materials is subject to the then current end-user license agreement, which can

More information

Module 7: System Component Failure Contingencies

Module 7: System Component Failure Contingencies Module 7: System Component Failure Contingencies Introduction The purpose of this module is to describe procedures and standards for recovery plans to be implemented in the event of system component failures.

More information