EMC Business Continuity for Microsoft SQL Server 2008 Enabled by EMC Symmetrix V-Max with SRDF/CE, EMC Replication Manager, and Enterprise Flash Drives Reference Architecture
Copyright 2009 EMC Corporation. All rights reserved.

Published November 2009

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

Benchmark results are highly dependent upon workload, specific application requirements, and system design and implementation. Relative system performance will vary as a result of these and other factors. Therefore, this workload should not be used as a substitute for a specific customer application benchmark when critical capacity planning and/or product evaluation decisions are contemplated.

All performance data contained in this report was obtained in a rigorously controlled environment. Results obtained in other operating environments may vary significantly. EMC Corporation does not warrant or represent that a user can or will achieve similar performance expressed in transactions per minute. No warranty of system performance or price/performance is expressed or implied in this document.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

All other trademarks used herein are the property of their respective owners.

Part number: H6573
Table of Contents

Reference architecture overview
Key components
Physical architecture
Validated environment profile
Hardware and software resources
Optimize storage resources in the SQL Server environment with storage tiering
Replication functionality: local and remote data protection
SRDF/CE functionality during planned/unplanned failover scenarios
Benefits of EFDs for SQL Server OLTP workloads versus FC disk drives
Conclusion
Reference architecture overview

Document purpose
This document details the reference architecture of the EMC Business Continuity for Microsoft SQL Server 2008 solution enabled by EMC Symmetrix V-Max with EMC SRDF/CE, EMC Replication Manager, and Enterprise Flash Drives (EFDs), tested and validated by the EMC Global Solutions Center (GSC). The GSC labs reflect realistic deployments in which solutions are developed, designed, tested, and documented to address customer challenges. Customers can reduce the complexity, costs, and risks of deploying new technology with EMC Proven Solutions.

This reference architecture combines:
- Well-documented technology options
- Recommended technology products

Solution purpose
The key purpose of this solution is to validate an efficient, remote storage replication process for disaster recovery (DR) and business continuity in a high-volume SQL Server online transaction processing (OLTP) environment. This is accomplished by using SRDF/CE in synchronous mode to automate site failover of Microsoft Failover Clusters across the SRDF link.

In this solution, the Symmetrix V-Max array is used for storage consolidation, while SQL Server serves as the relational database management system supporting a multifaceted OLTP environment.

This solution takes advantage of EMC TimeFinder replication technology within the Symmetrix V-Max to protect data by creating consistent snapshots at various points throughout the day. Neither of the asynchronous TimeFinder technologies (snap and clone) requires downtime to perform backups and protect data, which is a significant advantage. In addition, unlike host-based replication, this array-based approach has a negligible performance impact on the host.

Additionally, this solution employs a tiered storage infrastructure, which uses EFDs to accelerate access to critical data and low-cost, high-capacity SATA drives to store historical information.
The purpose of this solution is to:
- Demonstrate simplified application protection using Replication Manager to rapidly back up and recover SQL Server databases in a very active OLTP environment.
- Validate the benefit of EFD performance for SQL Server OLTP workloads in comparison to traditional FC drive performance.
- Demonstrate how to migrate SQL Server table partitions between storage tiers, including EFDs, Fibre Channel (FC), and SATA drives.
- Show the impact of storage tiering on the SQL Server application.
- Demonstrate the effectiveness of SRDF/CE during planned failovers and unplanned site outages (as simulated in testing).
The business challenge
With its rich feature set and its ability to store data from structured, semi-structured, and unstructured documents, SQL Server often forms the foundation for today's most demanding enterprise-level, transaction-based companies. Online transaction processing (OLTP) systems running on a SQL Server platform are among the most common data processing systems in today's enterprises.

The availability requirements of OLTP systems are very demanding. Downtime can mean the failure of critical business processes, effectively halting business operations and leading to lost revenue, regulatory fines, and potentially lost customers. It is vital that OLTP systems remain online during backups so that customers can continue to access the system. SQL Server administrators need a backup plan that does not introduce major performance degradation to the environment. A business whose very existence relies on 24x7 availability can succeed or fail depending on the database recovery infrastructure in place.

SQL Server database administrators want to design and deploy a SQL Server-based OLTP infrastructure that:
- Reduces the cost of storing vast amounts of data
- Provides redundancy and high availability throughout the entire system
- Reduces I/O and locking contention for better application performance
- Ensures 24x7 access to critical business data
- Achieves enterprise-level performance for transactional latency and user concurrency (the key success criteria for OLTP database systems)
- Provides nondisruptive storage tiering to enable cost-effective information lifecycle management (ILM)
The technology solution
To meet both the performance and cost-efficiency demands placed on critical SQL Server databases, this proven solution combines the benefits of tiered storage with the benefits of advanced storage protection.

Tiered storage
This solution uses the three levels of storage media available on the Symmetrix V-Max platform:
- EFDs
- FC disk drives
- SATA drives

The environment also uses both RAID 1 mirroring and RAID 5 striping so that the most active areas (tables) receive the storage tier best suited to their performance requirements. Tiering storage reduces costs significantly compared to provisioning large amounts of any one tier of disk for the entire environment. While results will vary with each customer environment, EFDs have shown performance improvements of up to 30x in typical OLTP workloads. By helping to eliminate performance bottlenecks, EFDs represent a crucial element in this solution.

Advanced storage protection
In addition to using the latest storage technology for database files, this solution uses advanced array-based technology for both local and remote protection. Array-based technology has the advantage of being host-agnostic: replication of SQL Server databases is unaffected by any concerns over existing host-based SQL Server protection technologies.

Local protection provided by mirroring technology (which integrates with Microsoft's Virtual Device Interface (VDI)) is a significant enhancement for most OLTP-based environments, as it:
- Reduces database backup windows to seconds at an application level
- Requires negligible host CPU cycles, for a very short period of time

Remote protection in this solution is provided by EMC's Symmetrix Remote Data Facility (SRDF) in conjunction with EMC Cluster Enabler (CE), a failover cluster extension utility that stretches a typical active/passive cluster across geographically dispersed sites.
The combination of SRDF and CE (SRDF/CE) not only handles unplanned site outages with quick, automated failover, but also helps with planned site or host-level outages. SRDF/CE makes DR failover repeatable and predictable, while significantly reducing the management effort and procedures that DR failover requires.
Key components

Introduction
This section briefly describes the key solution components.

EMC Symmetrix V-Max
The Symmetrix V-Max is built on the strategy of simple, intelligent, modular storage and incorporates a new Virtual Matrix interface that connects and shares resources across all nodes. This allows the storage array to grow seamlessly from an entry-level configuration into the world's largest storage system. The storage array provides the highest levels of performance and availability, featuring:
- Up to 2 PB usable capacity
- Up to 128 FC front-end (FE) ports
- Up to 64 FICON FE ports
- Up to 64 GbE/iSCSI FE ports
- Up to 1 TB global memory (512 GB usable)
- 48 to 2,400 drives:
  - EFDs (200/400 GB)
  - FC drives (146/300/450 GB, 15k rpm; 300/400/450/600 GB, 10k rpm)
  - SATA drives (1 TB, 7.2k rpm)

Symmetrix V-Max provides the ultimate scale-out platform. It includes the ability to incrementally scale front-end and back-end performance by adding processing modules (nodes) and storage bays. Each processing module provides additional front-end, memory, and back-end connectivity.

Enterprise Flash Drives
EMC integrates Enterprise Flash Drive (EFD) technology into the Symmetrix V-Max, providing unprecedented performance and energy efficiency. Because EFDs contain no moving parts, much of the latency associated with traditional magnetic disk drives no longer exists. A Symmetrix V-Max with integrated EFDs can deliver single-millisecond application response times and up to 30 times more I/O operations per second (IOPS) than traditional Fibre Channel disk drives.

Additionally, because there are no mechanical components, EFDs consume significantly less energy than hard disk drives. Energy consumption can be reduced by up to 98 percent for a given IOPS workload by replacing disk drives with fewer EFDs. For example, in some workload scenarios, it would take 30 or more 15k rpm Fibre Channel disk drives to deliver the same performance as a single Flash drive.
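The drive-count comparison above can be turned into a quick sizing calculation. The sketch below is illustrative only: the figure of 180 IOPS for a single 15k rpm FC drive and the 20,000 IOPS target are assumptions chosen for the example, not measurements from this solution, and the 30x multiplier is the upper bound quoted above rather than a guaranteed result.

```python
import math

# Assumed, illustrative figure: IOPS that one 15k rpm FC drive can sustain.
ASSUMED_FC_IOPS_PER_DRIVE = 180
# Upper-bound EFD-to-FC IOPS ratio quoted in this document ("up to 30 times").
EFD_TO_FC_RATIO = 30


def drives_needed(target_iops: int, iops_per_drive: int) -> int:
    """Return the smallest drive count whose combined IOPS meets the target."""
    return math.ceil(target_iops / iops_per_drive)


target_iops = 20_000  # hypothetical OLTP workload, not a tested value
fc_drives = drives_needed(target_iops, ASSUMED_FC_IOPS_PER_DRIVE)
efd_drives = drives_needed(target_iops, ASSUMED_FC_IOPS_PER_DRIVE * EFD_TO_FC_RATIO)

print(f"FC drives needed: {fc_drives}")
print(f"EFDs needed:      {efd_drives}")
```

The calculation covers I/O rate only. Capacity, RAID overhead, and response-time targets would also need to be considered in a real sizing exercise.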
Microsoft SQL Server 2008
SQL Server 2008 delivers on Microsoft's Data Platform vision by helping organizations to manage any data, any place, and any time. It can store data from structured, semi-structured, and unstructured documents, such as images and rich media, directly within the database. SQL Server 2008 delivers a rich set of integrated services that enables organizations to do more with their data, such as search, query, synchronize, report, and analyze.

EMC Replication Manager
Replication Manager automates and simplifies management of disk-based replicas. It orchestrates critical business applications, middleware, and underlying EMC replication technologies to create and manage replicas at the application level for a variety of purposes, including operational recovery, backup, restore, development, simulation, and repurposing. Customers interested in reducing manual scripting efforts, improving recovery, and creating parallel access to information can implement Replication Manager to put the right data in the right place at the right time.

Replication Manager has deep application integration with SQL Server. Replicas are created by coordinating with the Microsoft Virtual Device Interface (VDI) to ensure a complete copy of SQL Server databases (even active databases) without disturbing the production SQL Server environment.

EMC SRDF/CE
EMC SRDF is the most powerful suite of remote storage replication solutions available for disaster recovery and business continuity. The field-proven SRDF family is the most widely deployed set of high-end replication solutions, with tens of thousands of installations in the most demanding environments. The technology provides cross-volume and storage system consistency, tight integration with industry-leading applications, and simplified usage through automated management.
More specifically, this reference architecture uses SRDF/CE in synchronous mode to enable automated site failover with Microsoft Failover Clusters. SRDF/CE allows Windows Server 2008 Enterprise and Datacenter editions running Microsoft Failover Clusters to operate across a single pair of SRDF-connected Symmetrix arrays as geographically dispersed clusters.
Physical architecture

Architecture diagram
The following illustration depicts the overall physical architecture of the solution.

[Figure: overall physical architecture of the solution]
Validated environment profile

Profile characteristics
The solution was validated with the following environment profile.

| Profile characteristic | Value |
|---|---|
| OLTP database | Supporting 75,000 users with 1% concurrency rate |
| OLTP database size | 1.7 TB |
| SQL Storage Tier 0 | RAID 5 (7+1), 400 GB EFDs |
| SQL Storage Tier 1 | RAID 1, 450 GB, 15k rpm FC drives |
| SQL Storage Tier 2 | RAID 5 (3+1), 1,000 GB, 7.2k rpm SATA drives |
| SQL TimeFinder storage | RAID 5 (3+1), 400 GB, 10k rpm FC drives |

Site link characteristics
The solution was validated using the following site link configuration.

| Site link characteristic | Configuration |
|---|---|
| Link type | OC-3 (155 Mb/s); 1 Gigabit Ethernet (stretched VLAN) |
| Distances tested for synchronous replication | 10 km; 200 km |
| Data transmission mechanism | FCIP |

Hardware and software resources

Production site hardware
The production site hardware used to validate the solution is listed below.

| Equipment at the production site | Quantity | Configuration |
|---|---|---|
| Storage array | 1 | EMC Symmetrix V-Max: 4 V-Max Engines; 9 x 400 GB EFDs; 213 x 450 GB, 15k rpm FC disks; 18 x 1 TB, 7.2k rpm SATA drives |
| Fibre Channel switch | 1 | 4 Gb/s enterprise-class Fibre Channel switch (requires a minimum of 48 ports) |
| Ethernet network switch | 1 | Gigabit Ethernet network switch (requires a minimum of 32 ports) |
| SQL Server active node | 1 | 4 CPU quad-core, 64 GB RAM |
| SQL Server local passive node | 1 | 4 CPU quad-core, 64 GB RAM |
| Replication Manager server | 1 | 2 CPU quad-core, 32 GB RAM |
| EMC SMC server | 1 | 2 CPU quad-core, 32 GB RAM |
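The figures in the environment profile lend themselves to a quick arithmetic check. The following sketch derives the concurrent user count from the profile and the data capacity of a single RAID group in each tier. It assumes one group of each type and ignores formatting overhead and hot spares, neither of which this document specifies. The function names are illustrative.

```python
def concurrent_users(total_users: int, concurrency_percent: int) -> int:
    """Users active at the same time, computed with integer arithmetic."""
    return total_users * concurrency_percent // 100


def raid5_data_gb(data_drives: int, drive_gb: int) -> int:
    """Data capacity of one RAID 5 (N+1) group; parity uses one drive's worth."""
    return data_drives * drive_gb


def raid1_data_gb(drive_gb: int) -> int:
    """Data capacity of one RAID 1 mirrored pair; one drive's worth."""
    return drive_gb


users = concurrent_users(75_000, 1)  # 75,000 users at a 1% concurrency rate
tier0_gb = raid5_data_gb(7, 400)     # Tier 0: RAID 5 (7+1), 400 GB EFDs
tier1_gb = raid1_data_gb(450)        # Tier 1: RAID 1, 450 GB FC drives
tier2_gb = raid5_data_gb(3, 1_000)   # Tier 2: RAID 5 (3+1), 1,000 GB SATA drives

print(f"Concurrent users: {users}")
print(f"Data capacity per group (GB): tier 0={tier0_gb}, tier 1={tier1_gb}, tier 2={tier2_gb}")
```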
Disaster Recovery site hardware
The Disaster Recovery site hardware used to validate the solution is listed below.

| Equipment at the Disaster Recovery site | Quantity | Configuration |
|---|---|---|
| Storage array | 1 | EMC Symmetrix V-Max: 4 V-Max Engines; 221 x 450 GB, 15k rpm FC disks; 18 x 1 TB, 7.2k rpm SATA drives |
| Fibre Channel switch | 1 | 4 Gb/s enterprise-class Fibre Channel switch (requires a minimum of 48 ports) |
| Ethernet network switch | 1 | Gigabit Ethernet network switch (requires a minimum of 32 ports) |
| SQL Server remote passive node | 1 | 4 CPU quad-core, 64 GB RAM |
| Replication Manager server | 1 | 2 CPU quad-core, 32 GB RAM |

Software
The software used to validate the solution is listed below.

| Software | Version |
|---|---|
| Windows Server 2008, x64 Enterprise Edition | SP2 |
| Microsoft SQL Server 2008, x64 Enterprise Edition | SP1 |
| EMC Enginuity | 5874.157.129 |
| EMC Solutions Enabler | 7.0 |
| EMC SRDF/CE | 3.1 |
| EMC Replication Manager | 5.2, SP1 |
| EMC Symmetrix Management Console (SMC) | 7.0.0.5 |
Optimize storage resources in the SQL Server environment with storage tiering

What is storage tiering?
The EMC Symmetrix V-Max storage array integrates a built-in storage tiering mechanism that enables customers to optimize storage resources. The technology allows organizations to migrate critical, frequently accessed database segments to maximum-performance storage, and less frequently accessed segments to lower-cost, high-capacity storage.

This solution uses multiple storage types:
- Frequently accessed table partitions: RAID 5 EFDs
- Infrequently accessed table partitions: RAID 10 15k FC
- Rarely accessed table partitions: RAID 5 SATA

In this solution, the SQL Server database partitions are migrated to the appropriate storage tier using Enhanced Virtual LUN Technology, a feature introduced in Symmetrix V-Max. The migration is an online, nondisruptive operation that is transparent to SQL Server and the Windows operating system. Enhanced Virtual LUN Technology puts this tiered storage strategy into practice by moving information within the Symmetrix V-Max storage system as its value changes over time.

For complete details on how to improve efficiency by creating a storage tiering plan, see the companion document to this reference architecture: EMC Business Continuity for Microsoft SQL Server 2008 Enabled by EMC Symmetrix V-Max with SRDF/CE, EMC Replication Manager, and Enterprise Flash Drives Proven Solution Guide.

Impact of storage tiering on the SQL environment
The test environment validates the impact of storage tiering on the SQL Server application by demonstrating how to:
- Allocate the appropriate storage resources to the segments of the database (partitions) that require them.
- Move partitions across the storage tiers with minimal impact, without shutting down the application.
- Eliminate I/O bottlenecks and improve response times by migrating high-priority data to EFD tiers.
- Reduce the cost of storing data by moving historical data to lower-cost SATA drives.
- Simplify database management by reducing the steps required to support growth.
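The tiering policy described in this section can be sketched as a simple classification. In the example below, the `reads_per_day` metric, the two thresholds, and the partition names are all hypothetical, chosen only to illustrate the idea. The solution itself does not define numeric thresholds, and the actual data movement is performed on the array by Enhanced Virtual LUN Technology rather than by host code.

```python
from enum import Enum


class Tier(Enum):
    """Storage types listed for this solution."""
    EFD = "RAID 5 EFDs"
    FC = "RAID 10 15k FC"
    SATA = "RAID 5 SATA"


# Hypothetical thresholds; in practice they would come from observed activity.
FREQUENT_READS_PER_DAY = 100_000
RARE_READS_PER_DAY = 1_000


def choose_tier(reads_per_day: int) -> Tier:
    """Map a partition's access rate to a storage tier."""
    if reads_per_day >= FREQUENT_READS_PER_DAY:
        return Tier.EFD   # frequently accessed
    if reads_per_day > RARE_READS_PER_DAY:
        return Tier.FC    # infrequently accessed
    return Tier.SATA      # rarely accessed


# Hypothetical table partitions and their daily read counts.
partitions = {"orders_2009_q4": 450_000, "orders_2009_q1": 20_000, "orders_2007": 300}
plan = {name: choose_tier(reads) for name, reads in partitions.items()}

for name, tier in plan.items():
    print(f"{name}: {tier.value}")
```

A plan produced this way would only identify candidate partitions for each tier. Deciding when to move them remains an administrative choice.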
Replication functionality: local and remote data protection

Local replication
Replication Manager can auto-discover SQL Server from an application perspective. In addition, Replication Manager can determine the production host's defined storage, enabling database administrators to quickly devise a backup strategy.

The replication process used in this solution involves the following:
1. The locations of all the data files and logs are discovered and mapped within Replication Manager.
2. The TimeFinder technology (clone or snapshot) is initiated.
3. Replication Manager monitors the progress of the replication until the clone is fully synchronized (or the snapshot session is established).
4. When replication completes, Replication Manager quiesces the database using Microsoft VDI.
5. VDI is used again to resume database write operations.
6. The metadata from the VDI session is saved for use during restore operations.

For more detailed information on how Replication Manager interoperates with SQL Server, see the EMC Replication Manager and Microsoft SQL Server A Detailed Review white paper.

SRDF/CE functionality during planned/unplanned failover scenarios

SRDF/CE for Microsoft Failover clusters
EMC SRDF/CE technology allows Windows Server 2008 hosts in a Microsoft failover cluster to operate across a single pair of SRDF-connected Symmetrix arrays as a geographically dispersed configuration.

Solution testing validated that:
- By employing SRDF/CE, SQL Server databases are returned to full functionality within minutes of a failover.
- During a planned failover, Microsoft Failover Cluster triggers SRDF/CE to logically shut down the environment and move the instance to the Disaster Recovery site.
- During an unplanned outage (as simulated during testing), SRDF/CE automates the failover from the Production site to the Disaster Recovery site, ensuring write-order fidelity across all application volumes.

Failover cluster quorum mode
The quorum mode used for this failover cluster is node majority. This configuration contains three nodes at two sites; therefore, the quorum configuration wizard recommends node majority as the best choice. For more information on quorum modes, see: http://technet.microsoft.com/en-us/library/cc770830(ws.10).aspx
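The node majority rule can be expressed in a few lines. Under node majority, the cluster keeps running only while more than half of its voting nodes remain available. The sketch below computes that threshold for the three-node cluster used in this solution. The function names are illustrative and are not part of any Microsoft or EMC API.

```python
def votes_required(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, nodes_online: int) -> bool:
    """True while the online nodes still form a majority."""
    return nodes_online >= votes_required(total_nodes)


def tolerated_failures(total_nodes: int) -> int:
    """Number of nodes that can fail before quorum is lost."""
    return total_nodes - votes_required(total_nodes)


cluster_nodes = 3  # three nodes across two sites, as in this configuration

print(f"Votes required:     {votes_required(cluster_nodes)}")
print(f"Tolerated failures: {tolerated_failures(cluster_nodes)}")
print(f"Quorum with 2 of 3: {has_quorum(cluster_nodes, 2)}")
```

An odd node count is what makes node majority a natural fit: with three voters there is never a tie between two equal halves of the cluster.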
Benefits of EFDs for SQL Server OLTP workloads versus FC disk drives

SQL Server database performance can be highly dependent on the I/O performance capability of the storage subsystem. Traditional Fibre Channel disk drives have been limited by the delays introduced by head seek and rotational latency. In recent years, performance gains of disk drives have been achieved primarily through increases in rotational speed. To meet the demands of increasing I/O rates, smaller amounts of data are spread out across many physical drives.

Because EFDs have no moving parts, they are capable of sustaining high I/O rates and providing low response times. Fewer EFDs are required to meet the I/O rate that a typical OLTP workload needs, while allowing the full storage capacity of each drive to be used.

For more detailed information on the benefits of integrating SQL Server databases on EMC Symmetrix V-Max EFDs, see the EMC Symmetrix DMX-4 Enterprise Flash Drives with Microsoft SQL Server Databases Applied Technology white paper.

Findings
This solution's interoperability and performance testing confirms that:
- The SQL Server cluster in this configuration produced over 2,500 transactions per second (TPS) while maintaining processor utilization under 80 percent.
- Using EFDs as part of the storage tiering plan increases the average IOPS for the database partitions located on the EFDs by approximately 10x. Host utilization was driven to its maximum while the EFDs still had additional performance capacity available.
- The SRDF link operated across an OC-3 link and was tested synchronously, replicating at distances of 10 km and 200 km.
- The Microsoft Failover Cluster was extended across the SRDF link with EMC SRDF/Cluster Enabler. This provided recovery of the database within minutes.
- Replication Manager provides an easy-to-use interface for automating application-consistent TimeFinder replicas on the Symmetrix V-Max array. Full-copy clones and point-in-time snapshots were used to provide replicas for database recovery with little impact on SQL Server performance. Full-copy clones could be mounted to utility servers for consistency checks without affecting the production environment.
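The findings above report synchronous replication at distances of 10 km and 200 km. Because a synchronously replicated write is not acknowledged until the remote array has received it, the link round trip sets a floor on the latency added to each write. The following back-of-the-envelope sketch assumes a propagation delay of roughly 5 microseconds per kilometer of optical fiber, a common rule of thumb, and ignores switch, FCIP, and array processing overhead, so real figures would be higher.

```python
# Assumed rule of thumb: light travels through fiber at about 5 us per km.
FIBER_DELAY_US_PER_KM = 5


def round_trip_us(distance_km: int) -> int:
    """Propagation delay, in microseconds, for one round trip over the link."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM


# Distances tested for synchronous replication in this solution.
for distance_km in (10, 200):
    rtt_us = round_trip_us(distance_km)
    print(f"{distance_km} km: at least {rtt_us} us per round trip")
```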
Conclusion

Summary
Sizing and configuring a SQL Server failover cluster can be a complex activity, as many requirements and aspects must be considered during the planning phase. This SQL Server environment uses Symmetrix V-Max with EFDs, SRDF/Cluster Enabler, and Replication Manager to produce a highly available, multisite failover cluster. This well-performing solution is capable of withstanding server failures as well as site failures.

This EMC solution can improve OLTP database and SQL Server application availability, performance, and sustainability by providing the following benefits:
- Symmetrix V-Max provides a highly scalable, well-performing storage platform for OLTP databases, with mechanisms in place that protect data against site disasters.
- Advanced functionality such as storage tiering with EFDs allows for maximum performance for mission-critical data and the movement of the data to less expensive resources as its value decreases.
- Using EFDs keeps the storage footprint and power consumption to a minimum, reducing total cost of ownership (TCO).
- SRDF/Cluster Enabler allows for creating multisite failover clusters that can recover from site failures within minutes.
- Replication Manager automates application-consistent TimeFinder replicas to provide data recovery points and allows data repurposing without impacting the production database.

Next steps
EMC can help to accelerate assessment, design, implementation, and management while lowering the implementation risks and costs of a backup/disaster recovery solution for a Microsoft SQL Server environment.

To learn more about this and other solutions, contact an EMC representative or visit www.emc.com/solutions/microsoft.