About using Microsoft SQL failover clustering with ITMS 7.1 SP2 or later

This section describes Microsoft SQL Server 2008 R2 failover clustering, a method of providing high availability for your Symantec Management Platform (SMP). Failover clustering is fully featured in SQL Server 2008 R2 Enterprise, with a limited feature set in the Standard Edition.

Failover clustering is a process in which the operating system and SQL Server 2008 R2 work together to provide availability in the event of an application failure, hardware failure, or operating-system error. Failover clustering provides hardware redundancy through a configuration in which mission-critical resources are transferred automatically from a failing computer to an equally configured database server. However, failover clustering is not a load-balancing solution. Failover is not intended to improve database performance or reduce the required maintenance, and it does not protect your system against external threats, unrecoverable software failures, single points of failure (such as non-redundant storage and hardware), or natural disasters.

Symantec recommends that you have sufficient knowledge of and experience with Microsoft SQL failover methodologies before you attempt them in an ITMS or other suite or solution implementation. For more information about SQL Server 2008 R2 high availability, see the "High availability with SQL Server 2008 R2" documentation at:
http://msdn.microsoft.com/en-us/library/ff658546(v=sql.100).aspx

You should also ensure that your Microsoft SQL Servers are properly sized, with enough hardware capacity to run adequately under the expected failover workload. Further considerations can be found in the "Installing a SQL Server 2008 Failover Cluster" documentation at:
http://msdn.microsoft.com/en-us/library/ms179410.aspx

Failover clustering minimizes ITMS downtime by making the CMDB highly available. Failover clustering provides benefits in the following scenarios:

- Hardware failures: Easily run your SQL Server instance from another node while you resolve issues on the failed node.
- Server patching and maintenance: The SQL Server is offline only while you wait for the server to restart, so you can apply patches with only brief downtime for ITMS.
- Troubleshooting: When a problem arises on one of the nodes, failing over to another node lets you troubleshoot the problem node's components while ITMS continues to run.

Common SQL Server failover methods available to ITMS

This section describes the most common failover cluster configurations and how they apply to ITMS.

Single instance failover cluster

A single instance failover cluster consists of two or more single-node SQL Servers that share a common storage source. Only one SQL Server node is active at any given time; all other nodes of the cluster sit in a waiting state. In our example, Notification Server is attached to the SQL cluster and is able to access its CMDB. Notification Server is not aware of Node 1 or Node 2; it considers the SQL cluster itself to be its connection point.

Figure 1 shows a two-node cluster, which is the most common configuration for a SQL failover that is used in ITMS. This arrangement provides a dedicated computer to protect against the failure of a single node in the cluster. Symantec recommends that all nodes in this configuration have the same SQL Server hardware, which ensures consistent performance in the event of a failover. If you need additional failover redundancy, you can increase the number of nodes within the cluster. SQL Server 2008 R2 Standard Edition supports failover between two nodes, whereas the Enterprise edition allows up to 16 nodes. Microsoft SQL Server licenses are required only on the active node within the SQL cluster.

Figure 2 shows a critical failure of the active SQL node. When a critical failure of the active SQL Server node occurs, the second node takes over the duties of the first node.
Notification Server is disconnected from the CMDB until Node 2 connects to the CMDB. Once Node 2 is active, Notification Server can reconnect and operations continue. During the disconnect, services, the console, tasks, and other management items are unavailable. The failover delay depends on factors such as connectivity, database size, and the number of SQL databases in the cluster. Ideally, failover should occur in less than 30 minutes.
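During the failover window described above, any client of the clustered CMDB sees failed connections and must retry until the surviving node mounts the database. The source does not prescribe client behavior, but the pattern can be sketched as a retry loop with exponential backoff; the `connect` callable and timing values here are illustrative assumptions, not part of ITMS or SQL Server:

```python
import time

FAILOVER_BUDGET_SECONDS = 30 * 60  # the "less than 30 minutes" target above

def connect_with_retry(connect, first_delay=2.0, max_delay=60.0,
                       budget=FAILOVER_BUDGET_SECONDS, sleep=time.sleep):
    """Retry `connect` with exponential backoff until it succeeds or the
    failover budget is exhausted. `connect` is any zero-argument callable
    (hypothetical) that returns a connection or raises on failure."""
    deadline = time.monotonic() + budget
    delay = first_delay
    while True:
        try:
            return connect()
        except Exception:
            if time.monotonic() + delay > deadline:
                raise  # failover took longer than the budget; give up
            sleep(delay)
            delay = min(delay * 2, max_delay)  # back off: 2s, 4s, 8s, ...
```

A Notification Server-like client would pass a callable that opens a connection to the cluster's virtual network name rather than to Node 1 or Node 2 directly, so reconnecting after a failover requires no configuration change.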
Figure 1 Two-node SQL cluster configuration

Figure 2 Failover of the active SQL Server node in a two-node SQL cluster

Note: When using SQL for IT Analytics, Analysis Services is cluster-aware but Reporting Services is not. If the node with Reporting Services goes down, IT Analytics is down until the primary node is restored.
Multi-instance failover cluster

The most common type of multi-instance cluster is one in which all nodes run one or more failover cluster instances. Figure 3 shows a single clustered SQL environment that hosts two separate instances of the CMDB, one for Notification Server 1 and one for Notification Server 2, with each CMDB running on a separate physical node. While this configuration is operational, both Notification Servers access their CMDBs with the full performance of their respective nodes.

Figure 4 shows how, when a node fails, all the failover cluster instances that the node hosts fail over to another node. Node 1 has failed, and CMDB 1 fails over to Node 2. During the failover, Notification Server 1 is disconnected from the CMDB until Node 2 completes the failover operations and mounts CMDB 1. Notification Server 1 eventually reconnects and operates with reduced performance until Node 1 is returned to an operable state. The performance of Notification Server 2 is also affected because its node hosts two CMDBs at this time.

With proper SQL Server sizing, this method provides a high level of database responsiveness and performance, and productive use of your hardware dollars. Microsoft SQL Server licenses are required on all the active nodes in this implementation (in this case, all nodes in the Microsoft SQL cluster).

Figure 3 Single clustered SQL configuration hosting two instances of CMDB
Figure 4 Failover cluster instances that Node 1 hosts fail over to Node 2
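The instance movement that Figure 4 depicts can be sketched as a small placement model. The node and database names are taken from the figures; the placement logic itself is a simplified stand-in for the cluster's actual owner-selection behavior, not a Symantec or Microsoft API:

```python
# A toy model of failover cluster instance placement: each node hosts a
# list of CMDB instances; when a node fails, its instances move to a
# surviving node, which then carries the combined load.

def fail_over(placement, failed_node):
    """Return a new placement with `failed_node`'s instances moved to the
    surviving node that hosts the fewest instances (a simple stand-in for
    the cluster's preferred-owner logic)."""
    survivors = {n: list(insts) for n, insts in placement.items()
                 if n != failed_node}
    if not survivors:
        raise RuntimeError("no surviving node; cluster is down")
    target = min(survivors, key=lambda n: len(survivors[n]))
    survivors[target].extend(placement[failed_node])
    return survivors

# Figure 3: each CMDB runs on its own physical node.
placement = {"Node 1": ["CMDB 1"], "Node 2": ["CMDB 2"]}

# Figure 4: Node 1 fails, so CMDB 1 fails over to Node 2, which now
# hosts both databases and serves both Notification Servers.
after = fail_over(placement, "Node 1")
```

The resulting placement, with Node 2 hosting both CMDBs, is exactly the reduced-performance state the text describes until Node 1 is returned to service.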