Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V
Copyright © 2011 EMC Corporation. All rights reserved. Published March 2011.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided "as is." EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.

Part number: h8193
Table of Contents

Chapter 1: About this Document ... 4
    Overview ... 4
    Audience and purpose ... 5
    Scope ... 5
    Reference architecture ... 6
    Validation environment profile ... 8
    Hardware and software resources ... 8
    Prerequisites and supporting documentation ... 9
    Terminology ... 10
Chapter 2: Application Design ... 11
    Overview ... 11
    Microsoft SQL Server ... 12
    Microsoft SQL Server best practices ... 12
Chapter 3: Virtualization ... 13
    Overview ... 13
    Concepts ... 14
    Advantages of virtualization ... 14
Chapter 4: Network Design ... 15
    Overview ... 15
    Considerations ... 16
    Implementation ... 16
    Best practices ... 16
Chapter 5: Storage Design ... 17
    Overview ... 17
    Design considerations ... 18
    Storage design implementation ... 19
    Best practices ... 19
Chapter 6: Virtualization Performance - Testing and Validation ... 20
    Overview ... 20
    Testing tools ... 21
    Methodology ... 21
    Testing results summary ... 22
    Result analysis ... 22
Chapter 7: FAST Cache Performance Testing and Validation ... 26
    Overview ... 26
    Testing tools ... 27
    Methodology ... 27
    Testing results summary ... 27
    Result analysis ... 28
Chapter 8: LUN Provisioning Options Testing and Validation ... 32
    Overview ... 32
    Testing tools ... 33
    Methodology ... 33
    Testing results summary ... 33
    Result analysis ... 34
Chapter 1: About this Document

Overview

Introduction
EMC's commitment to consistently maintain and improve quality is led by the Total Customer Experience (TCE) program, which is driven by Six Sigma methodologies. As a result, EMC has built Customer Integration Labs in its Global Solutions Centers to reflect real-world deployments in which TCE use cases are developed and executed. These use cases provide EMC with insight into the challenges currently facing its customers.

This document summarizes a series of best practices that were discovered, validated, or otherwise encountered during the validation of the Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V solution, which uses the following products:
- EMC VNX series
- EMC PowerPath
- Microsoft Windows Server 2008 R2 Hyper-V
- VNX Fully Automated Storage Tiering (FAST) Cache

Use case definition
A use case reflects a defined set of tests that validates the reference architecture for a customer environment. This validated architecture can then be used as a reference point for a proven solution.

Contents
This chapter contains the following topics:

Topic ... See Page
Overview ... 4
Audience and purpose ... 5
Scope ... 5
Reference architecture ... 6
Validation environment profile ... 8
Hardware and software resources ... 8
Prerequisites and supporting documentation ... 9
Terminology ... 10
Audience and purpose

Audience
The intended audience for this document is:
- Customers
- EMC partners
- Internal EMC personnel

Purpose
The purpose of this document is to present a unified storage solution using the EMC VNX series and Microsoft Windows Server 2008 R2 Hyper-V, and to demonstrate the benefits of VNX FAST Cache for SQL Server in an online transaction processing (OLTP) environment. This document validates the performance of all aspects of the solution and provides guidelines for building similar solutions.

Information in this document can be used as the basis for a solution build, white paper, best practices document, or training. It can also be used by other EMC organizations (for example, the technical services or sales organization) as the basis for producing documentation for a technical services or sales kit.

Scope

Scope
This document contains the results of testing Microsoft SQL Server 2008 using the EMC VNX5700 and VNX FAST Cache. The objectives of this testing are as follows:
- Demonstrate the performance of Microsoft SQL Server 2008 using the EMC VNX5700 in a Microsoft Hyper-V virtual environment.
- Demonstrate the benefits of using VNX FAST Cache for Microsoft SQL Server 2008 databases in OLTP environments.

Not in scope
Implementation instructions and sizing guidelines are beyond the scope of this document. Information on how to install and configure Microsoft SQL Server 2008 and the required EMC products is also out of scope. However, links to all required software for this solution are provided in this document.
Reference architecture

Corresponding reference architecture
This solution has a corresponding Reference Architecture document that is available on EMC Powerlink and EMC.com. The Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V Reference Architecture provides more details. If you do not have access to the document, contact your EMC representative.

Reference architecture logical diagram
The following diagram depicts the logical architecture of the solution.
Introduction to the EMC VNX series

EMC VNX series
The VNX series delivers uncompromising scalability and flexibility for the midtier while providing market-leading simplicity and efficiency to minimize total cost of ownership. Customers can benefit from new VNX features such as:
- Next-generation unified storage, optimized for virtualized applications.
- Extended cache using Flash drives with FAST Cache and Fully Automated Storage Tiering for Virtual Pools (FAST VP), which can be optimized for the highest system performance and lowest storage cost simultaneously on both block and file.
- Multiprotocol support for file, block, and object, with object access through Atmos Virtual Edition (Atmos VE).
- Simplified management with EMC Unisphere, a single management framework for all NAS, SAN, and replication needs.
- Up to three times improvement in performance with the latest Intel multi-core CPUs, optimized for Flash.
- 6 Gb/s SAS back end with the latest drive technologies supported:
  - 3.5" 100 GB and 200 GB Flash, 3.5" 300 GB and 600 GB 15k or 10k rpm SAS, and 3.5" 2 TB 7.2k rpm NL-SAS
  - 2.5" 300 GB and 600 GB 10k rpm SAS
- Expanded EMC UltraFlex I/O connectivity: Fibre Channel (FC), Internet Small Computer System Interface (iSCSI), Common Internet File System (CIFS), Network File System (NFS) including parallel NFS (pNFS), Multi-Path File System (MPFS), and Fibre Channel over Ethernet (FCoE) connectivity for converged networking over Ethernet.

The VNX series includes five new software suites and three new software packs, making it easier and simpler to attain the maximum overall benefits.

Software suites available
- VNX FAST Suite: Automatically optimizes for the highest system performance and the lowest storage cost simultaneously (FAST VP is not part of the FAST Suite for the VNX5100).
- VNX Local Protection Suite: Practices safe data protection and repurposing.
- VNX Remote Protection Suite: Protects data against localized failures, outages, and disasters.
- VNX Application Protection Suite: Automates application copies and proves compliance.
- VNX Security and Compliance Suite: Keeps data safe from changes, deletions, and malicious activity.

Software packs available
- VNX Total Efficiency Pack: Includes all five software suites (not available for the VNX5100).
- VNX Total Protection Pack: Includes the local, remote, and application protection suites.
- VNX Total Value Pack: Includes all three protection software suites and the Security and Compliance Suite (the VNX5100 exclusively supports this package).
Validation environment profile

Profile characteristics
The solution was validated with the following environment profile:

Profile characteristic ... Value
SQL database size ... 400 GB
Instances and databases ... Single instance and single database
Number of database files ... Four files, each on a different LUN
Workload ... OLTP (TPC-C-like)
Storage for SQL database ... FC storage
RAID type, physical drive size, and speed of the production SQL 2008 databases ... RAID 10, 300 GB SAS drives (15k rpm)
VNX FAST Cache configuration ... 4 x 100 GB EFDs, RAID 1 configuration, Read/Write mode, and 183 GB of usable size

Hardware and software resources

Hardware
The following table lists the hardware used to validate the solution:

Equipment ... Quantity ... Configuration ... Notes
Storage ... 1 ... EMC VNX5700, 300 GB SAS drives (15k rpm), 4 x 100 GB EFDs ... Primary database storage
Enterprise-class FC switch ... 1 ... 4 Gb/s FC switch ... Production systems may require additional hardware for high-availability purposes
Enterprise network switch ... 1 ... Gigabit Ethernet (GbE) switch ... Production systems may require additional hardware for high-availability purposes
Dell PowerEdge R710 ... 1 ... Two quad-core Intel Xeon X5550 2.67 GHz 64-bit processors, 64 GB RAM ... Primary database server
Dell PowerEdge 2950 ... 4 ... Two quad-core Intel Xeon E5440 2.83 GHz 64-bit processors ... Load generation for test workload
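The 183 GB usable FAST Cache figure in the profile above is consistent with four 100 GB EFDs mirrored in RAID 1 pairs: 200 GB of raw mirrored capacity, less unit conversion and system overhead. A quick sanity check (the exact overhead breakdown is an assumption; the document does not state it):

```python
# Rough check of FAST Cache usable capacity. Assumption: drive sizes are
# decimal gigabytes and the usable figure is reported after mirroring
# plus some metadata/conversion overhead.
efd_count = 4
efd_size_gb = 100            # decimal GB per drive
raw_gb = efd_count * efd_size_gb
mirrored_gb = raw_gb / 2     # RAID 1: half the raw capacity is usable
print(mirrored_gb)           # 200.0 GB before overhead

# The document reports 183 GB usable, roughly 8-9% below the mirrored
# capacity, which is plausibly unit conversion plus system overhead.
overhead_fraction = 1 - 183 / mirrored_gb
print(round(overhead_fraction, 3))
```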
Software
The following table lists the software used to validate the solution:

Software ... Version
Microsoft Windows Server ... Windows Server 2008 R2 x64 Enterprise Edition; Windows Server 2003 R2 SP2 x86 Enterprise Edition
Microsoft SQL Server ... SQL Server 2008 SP1 Enterprise Edition
EMC VNX Operating Environment for block ... 05.31.000.3.071
EMC PowerPath ... 5.3 SP1

Prerequisites and supporting documentation

Technology
It is assumed that the reader has a general knowledge of the following products:
- Microsoft SQL Server 2008
- EMC VNX series

Supporting documents
The following documents, located on EMC Powerlink, provide additional, relevant information. Access to these documents is based on your login credentials. If you do not have access to the following content, contact your EMC representative.
- EMC Tiered Storage for Microsoft SQL Server 2008 - Enabled by EMC CLARiiON CX4 and Enterprise Flash Drives: A Detailed Review
- EMC Solutions for Microsoft SQL Server - Enabled by EMC Celerra Unified Storage Platforms: Applied Best Practices

Other documents
The following document is available on the Microsoft website (http://msdn.microsoft.com):
- SQL Server Books Online
Terminology

Introduction
This section defines the terms used in this document.

Enterprise Flash Drive (EFD): The enterprise-class EMC EFDs supported by VNX systems are constructed with nonvolatile semiconductor NAND Flash memory. Due to their solid state design, these drives are especially well suited for latency-sensitive applications that require consistently low read/write response times. EFDs are packaged in a standard 3.5-inch disk drive form factor used by existing VNX disk drive array enclosures, making for simple integration with existing infrastructure. Since there are no moving parts, EFDs use far less energy than drives with rotating media, and weigh less as well.

Logical unit (LUN): A logical disk object presented from the storage array to a host. It is identified by a LUN number, which uniquely identifies the LUN for that host. The LUN number is not globally unique.

RAID group: In a VNX storage system, a set of physical disks with a RAID type on which one or more LUNs are bound. Each RAID group supports only the RAID type of the first LUN bound on it; any other LUNs bound on it have that same RAID type. LUNs are distributed equally across all the disks in the RAID group.

RAID 1: A RAID method that provides data integrity by mirroring (copying) data onto another disk. This RAID type provides the greatest assurance of data integrity at the greatest cost in disk space.

RAID 10: A RAID method that mirrors and stripes, with data being written to two disks simultaneously. Data transfer rates are higher than with RAID 5, but RAID 1/0 uses more disk space for mirroring.

Redundant Array of Inexpensive Disks (RAID): A method for storing information where the data is stored on multiple disk drives to increase performance and storage capacity and to provide redundancy and fault tolerance.

Storage processor (SP): On VNX storage, a circuit board with memory modules and control logic that manages the storage system I/O between the host's Fibre Channel adapter and the disk modules.

System database: A database that is installed as part of the installation of Microsoft SQL Server. The system databases include master, model, msdb, tempdb, and others.

User database: A non-system database that is put on the server after the installation of Microsoft SQL Server. Examples include an OLTP application database or a data warehouse.

VNX FAST Cache: VNX FAST Cache technology allows users to specify a set of EFD devices to be used as a secondary caching layer.
Chapter 2: Application Design

Overview

Introduction
The primary application in this solution is Microsoft SQL Server 2008. In addition, this solution contains the following key supporting components:
- EMC VNX series
- VNX FAST Cache

Scope
The application design layout instructions presented in this chapter apply to the specific components used during the development of this solution.

Contents
This chapter contains the following topics:

Topic ... See Page
Overview ... 11
Microsoft SQL Server ... 12
Microsoft SQL Server best practices ... 12
Microsoft SQL Server

Considerations
Microsoft SQL Server is a relational database management system (RDBMS) that is commonly used in many business-critical applications to store, retrieve, and manage application data. It is sometimes considered an application, but is more appropriately considered an application environment. Each individual business application has a dedicated set of database tables and associated query patterns that, as seen from the SQL Server, comprise the workload on the system.

Implementation
A single instance and single database of Microsoft SQL Server 2008 was used to test this solution. An OLTP workload similar to a TPC-C workload was used. This workload was designed to simulate a typical transaction processing system characterized by many small transactions from a large number of users, which in aggregate appear random.

Microsoft SQL Server best practices

Plan for storage performance, not for capacity
The most common mistake made while planning storage for Microsoft SQL Server is to design for storage capacity rather than for performance, measured in I/Os per second (IOPS). With advances in disk technology, the increase in storage capacity of a disk drive has outpaced the increase in IOPS by almost 1,000:1 for spinning media. As a result, it is rare to find a system that, when planned for performance, does not also meet the storage capacity requirements of the workload. Hence, IOPS capacity is the standard to use when planning Microsoft SQL Server storage configurations. Plan the storage capacity (GB) only after considering the IOPS capacity of a configuration.

Enable SQL Server to keep pages in memory
Microsoft SQL Server dynamically allocates and de-allocates memory based on the current state of the server to prevent memory pressure and swapping.
However, if a process suddenly attempts to grab a substantial amount of memory, SQL Server may not be able to react quickly enough, and the operating system (OS) may swap some of SQL Server's memory to disk. Unfortunately, there is a good probability that the memory swapped to disk contains part of what SQL Server will soon deallocate to reduce its memory use in response to the newly created memory pressure. It is recommended that SQL Server be prevented from having its memory swapped, a practice known as locking pages in memory. To do this, grant the "Lock pages in memory" user right to the account under which the SQL Server service runs.
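The IOPS-first planning rule above can be made concrete with a small sizing sketch. The per-drive IOPS figure and the workload numbers below are illustrative assumptions, not values from this document:

```python
import math

# Illustrative IOPS-first sizing sketch (all inputs are assumptions).
# Rule of thumb: size the spindle count from the IOPS requirement first,
# then verify the resulting capacity covers the data set.
workload_iops = 8000          # peak random IOPS the database must sustain
read_fraction = 0.7           # 70% reads, 30% writes
drive_iops = 180              # small random IOPS per 15k rpm SAS drive (assumed)
drive_capacity_gb = 300

# RAID 10 write penalty: each host write costs two disk writes.
disk_iops = (workload_iops * read_fraction
             + workload_iops * (1 - read_fraction) * 2)
drives_for_iops = math.ceil(disk_iops / drive_iops)

database_gb = 400
# RAID 10 mirroring halves the usable capacity of each drive.
drives_for_capacity = math.ceil(database_gb / (drive_capacity_gb / 2))

print(f"drives needed for IOPS:     {drives_for_iops}")
print(f"drives needed for capacity: {drives_for_capacity}")
# Performance, not capacity, dictates the drive count here.
```

With these assumed numbers, the IOPS requirement calls for far more spindles than the capacity requirement, which is exactly the asymmetry the best practice warns about.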
Chapter 3: Virtualization

Overview

Introduction
This chapter provides procedures and guidelines for configuring the virtualization components that can be used in this solution. Virtualization enables users to turn their infrastructure into an efficient and flexible internal cloud, thereby enabling them to:
- Decrease their capital and operating costs by over 50 percent.
- Run a greener data center and reduce energy costs.
- Control application service levels with advanced availability and security features.
- Streamline IT operations and improve flexibility.

Microsoft Windows Server 2008 R2 Hyper-V is the virtualization platform used in this solution. Hyper-V is a major feature of Windows Server 2008 R2. In Hyper-V virtualized environments, storage can be provisioned to SQL Server in two different ways: virtual hard disk (VHD) and pass-through disk. Both are examined for this solution.

Scope
The virtualization guidelines presented in this chapter apply to the specific components used during the development of this solution. Virtualization is not considered a required component of the solution.

Contents
This chapter contains the following topics:

Topic ... See Page
Overview ... 13
Concepts ... 14
Advantages of virtualization ... 14
Concepts

Virtualization layer
The virtualization layer abstracts the processor, memory, storage, and network resources of a physical server into multiple virtual machines. This allows multiple operating systems to run simultaneously and independently on a single physical server.

VHD
Storage array LUNs presented to the hypervisor are formatted with a file system on which virtual hard disk (VHD) files are placed. When a new disk is required for a guest virtual machine, a VHD file is created on the host file system and exposed to the guest as a disk device.

Pass-through disk
With a pass-through disk, a Hyper-V host disk can be exposed to the guest virtual machine without placing a volume on it. Using this mechanism, Hyper-V bypasses the file system of the host and accesses the disk directly.

Advantages of virtualization

Reduced costs
One of the main challenges customers face is reducing costs by using infrastructure effectively. Virtualization enables a reduction in the number of servers and related IT hardware in the data center.

Reduced downtime
A running virtual machine hosting a production database can be moved from one physical server to another without downtime.

Performance and scalability
In a scale-out context, virtualization can provide superior performance and scalability compared to physically booted configurations, even when identical hardware is used.

Ease of use
A single user interface enables administrators to manage and monitor multiple virtual machines from one console. With virtualization, administrators can therefore manage virtual machines more easily and conveniently than physical servers.
Chapter 4: Network Design

Overview

Introduction
This chapter describes the network architecture of the solution.

Scope
The network design guidelines presented in this chapter apply to the specific components used during the development of this solution.

Contents
This chapter contains the following topics:

Topic ... See Page
Overview ... 15
Considerations ... 16
Implementation ... 16
Best practices ... 16
Considerations

Physical design considerations
To ensure uninterrupted communication between systems and storage in the environment, plan the networks for high availability. This includes having redundant switches and paths, and redundant network interface card (NIC) ports, HBA ports, or HBA cards. These physical design considerations apply to both the Ethernet network used by clients to access the server and the Fibre Channel (FC) network used by the server to access the storage.

Logical design considerations
For the Ethernet network, administrators must ensure that traffic is load balanced across multiple links so that a single saturated link cannot affect the application. For FC networks, zoning partitions the FC fabric into smaller subsets to improve throughput, manageability, and application separation, and to add security. For path failover, high availability, and load balancing, consider path failover software that seamlessly handles any single point of failure in the network.

Implementation

Physical design implementation
A Brocade DS-5100B switch was used in the testing of the solution for fabric connections, while Cisco Catalyst 6509 switches were used to support gigabit Ethernet (GbE) connections. Redundant NICs/HBAs were used in the servers and the VNX storage arrays, and redundant paths were used between the storage and the servers.

Logical design implementation
World Wide Name (soft) zones were created on the FC fabric such that each HBA on the server was able to view each port on the storage array. EMC PowerPath was installed on the SQL Server host for high availability and path failover.

Best practices

Plan for network high availability
A common oversight in network design is to provide for high availability at the server level using clusters and at the storage level using RAID, while ignoring the network connecting them. To ensure uninterrupted communication between systems in the environment, plan the networks for high availability. This includes having redundant switches and paths, and redundant network interface card (NIC) ports, HBA ports, or HBA cards.
Chapter 5: Storage Design

Overview

Introduction
Storage design is an important element in ensuring the successful deployment of the solution.

Scope
The storage design layout instructions presented in this chapter apply to the specific components used during the development of this solution.

Contents
This chapter contains the following topics:

Topic ... See Page
Overview ... 17
Design considerations ... 18
Storage design implementation ... 19
Best practices ... 19
Design considerations

Overview
Most people think of disks in terms of the data that can be stored on them; this is the storage capacity of the disk, and the growth of data in gigabytes (GB) and terabytes (TB) is a measurement of the required storage capacity of the underlying disk system. However, another constraint in the design of storage systems is how quickly the data can be accessed; this is the performance capacity of the disk. Because the performance capacity of the disk directly translates into user experience, this constraint often presents challenges in planning and maintaining a production system.

Storage for databases
Traditionally, storage administrators add both performance and storage capacity to a system by adding disk spindles and using various volume management techniques to harness the limits of the system. This method, while functional, is inefficient. It often leads to short-stroking, where only a small portion of the storage capacity of a drive is used while its entire performance capacity is drawn. This leaves the additional storage capacity on the drive essentially inaccessible without negatively affecting the performance of the database residing on it.

EFDs: changing the rules
The introduction of EFDs has changed the relationship between performance capacity and storage capacity. A single EFD can replace many short-stroked drives through its ability to provide very high transaction rates. This reduces the total number of drives needed for the application and decreases power consumption because fewer disks are spinning. EFDs also provide very low latency, which benefits database applications where response times are critical and where all the data cannot be kept in the host or storage processor (SP) cache.

VNX FAST Cache
Placing the entire SQL database on EFDs would dramatically improve performance, but it would be a very costly solution.
Alternatively, the parts of the database (such as tables and indexes) that are accessed frequently can be moved to EFDs. This exercise requires careful analysis, expertise, and continuous monitoring to determine whether the selected parts of the database should still be stored on the EFDs, because access patterns may change over time. If the storage system is shared by multiple SQL servers or databases, this task becomes even more complex.

The dynamic RAM (DRAM) cache that is traditionally available in storage arrays is too small to maintain large amounts of frequently accessed (hot) data for long periods. VNX FAST Cache technology uses EMC EFDs as a secondary cache layer, promoting frequently used data to the EFDs and serving it from there rather than from the slower, larger drives where it is stored long term. As data access patterns change, FAST Cache automatically demotes data that is no longer frequently used and replaces it with data that is more relevant at the present time. In this way, the cache constantly adapts to changing workloads.
Storage design implementation

Overview
This document presents performance studies using FAST Cache with a SQL OLTP workload. Each of these studies requires a different disk configuration; for this reason, most of the configuration information is located with the test data it produces. Some elements of the design are common and are described in this section.

Common elements
In all cases, the primary storage for one or more databases was provided by a RAID 10 group of 15k rpm SAS disk spindles on a VNX5700 storage array running the VNX Operating Environment for block. This is a standard configuration for OLTP databases, which require a high number of random I/Os to be serviced as quickly as possible. In all cases, multiple LUNs were created on the RAID group consistent with best practices and set up in the database file group on SQL Server. No non-default configuration settings were used unless explicitly stated in the test results section. The transaction log had a dedicated LUN on dedicated spindles.

Database working set
The database size was selected to be larger than the FAST Cache. For non-FAST Cache scenarios, the same database working set size was used to maintain consistency. The database size in all cases was more than 400 GB.

Best practices

Do not allow database and log files to share physical spindles
It is highly recommended to ensure that database data files and log files do not share the same physical spindles. This improves performance and also helps prevent the loss of data due to the loss of multiple drives.

Enable FAST Cache on appropriate LUNs
FAST Cache can show significant benefits for some workloads and almost none for others. Evaluate each workload separately before enabling FAST Cache. In the test cases for this solution, it was determined that there was no benefit to enabling FAST Cache on the transaction log.
Configuring FAST Cache for the log actually hampers overall system performance, because FAST Cache pages are used to cache log records that are not needed again.

Allow sufficient time for the cache to stabilize
All types of cache require some form of warm-up period before their full effect is realized. EMC FAST Cache promotes heavily used blocks from primary storage into the FAST Cache when they receive a certain number of hits within a time period. This behavior must be taken into account when measuring a FAST Cache system: the full benefit of the cache does not present itself until the workload has run long enough to promote the working set.
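The promote-on-repeated-hits behavior described above can be sketched with a toy simulation. The promotion threshold, block granularity, and LRU eviction policy below are simplifying assumptions for illustration; they are not the actual FAST Cache internals:

```python
from collections import Counter, OrderedDict

# Toy sketch of a promote-on-N-hits secondary cache (assumed parameters;
# real FAST Cache granularity, thresholds, and eviction differ).
PROMOTE_HITS = 3      # promote a block after this many accesses
CACHE_BLOCKS = 4      # capacity of the "EFD" tier, in blocks

hits = Counter()                 # access counts per block
cache = OrderedDict()            # promoted blocks, in LRU order

def access(block):
    """Serve one I/O; return 'cache' or 'disk', promoting/demoting as needed."""
    if block in cache:
        cache.move_to_end(block)       # refresh LRU position
        return "cache"
    hits[block] += 1
    if hits[block] >= PROMOTE_HITS:
        if len(cache) >= CACHE_BLOCKS:
            cache.popitem(last=False)  # demote the least recently used block
        cache[block] = True
    return "disk"

# A skewed workload: block 7 is hot, the others are touched once.
workload = [7, 1, 7, 2, 7, 3, 7, 4, 7]
served = [access(b) for b in workload]
print(served)
# The hot block is served from disk until it crosses the promotion
# threshold, then from cache: this is the warm-up effect the text
# says must be waited out before measuring.
```

The same dynamic explains why a measurement window that starts too early understates the benefit: early samples land in the "served from disk" phase before the working set has been promoted.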
Chapter 6: Virtualization Performance - Testing and Validation

Overview

Introduction
This chapter examines the performance of Microsoft SQL Server 2008 using EMC VNX in a Microsoft Hyper-V virtual environment as compared to a physical environment.

Contents
This chapter contains the following topics:

Topic ... See Page
Overview ... 20
Testing tools ... 21
Methodology ... 21
Testing results summary ... 22
Result analysis ... 22
Testing tools

Introduction
To test the SQL Server 2008 environment, the Quest Benchmark Factory tool was used.

Quest Benchmark Factory
Quest Benchmark Factory was used to conduct benchmark testing by simulating users performing TPC-C OLTP database transactions. These transactions are considered TPC-C-like because TPC and Microsoft did not audit the results. All SQL Server parameters were set to the default values, and no benchmark-specific tunings were applied. The following table shows the transaction types and distribution used:

Transaction type ... Percentage
Stock level transaction ... 4
Delivery transaction ... 4
Order-status transaction ... 4
Payment transaction ... 43
New order transaction ... 45

The TPC-C workload simulates an OLTP environment suitable for many types of database applications. However, the workload is artificial, and the user counts and transactions per second (TPS) metrics reported are only a representation of the system's response to this workload. By default, Quest Benchmark Factory TPC-C users generate approximately 0.05 TPS per user. By altering the timings for various test scenarios, 0.1 TPS per user was simulated. This ability is important because SQL Server cannot accept more than 32,767 parallel user connections. For simplicity, all results are expressed in terms of TPS regardless of the user profile that was simulated.

Methodology

Introduction
This section explains the testing methodology used.

Testing methodology
This testing used a single database with 5,000 TPC-C warehouses, which resulted in approximately 400 GB of active data. Virtual users were added to this environment in periodic increments, and the testing was stopped after the average transaction response time breached the gating metric of 2 seconds. For each load level, a series of virtual users was started in the test environment. Based on the user parameters, each user was expected to generate a certain TPS load under ideal circumstances.
In the graphs, this is denoted on the horizontal axis as the Projected TPS for the load. The achieved TPS is expected to be below this level for any significant load on the system. Unless otherwise noted, the scaling run was repeated at least twice, with only minor differences observed between the runs.

Testing results summary

Objective
Determine the overall performance of SQL Server 2008 in a Hyper-V setup for an OLTP environment.

Setup
In this study, the performance of EMC VNX is ascertained. The configuration used in this testing was 20 SAS drives (16 in RAID 10 for the database and four in RAID 10 for the log).

Result analysis

Introduction
This section explains the result analysis for the different test scenarios.

Setup
The server configuration, shown in the following figure, is common across all the listed scenarios.
The following figure shows the storage configuration that is specific to this test scenario. The configuration uses 300 GB 15k rpm SAS spindles.

NOTE: The shelf positioning in the diagram is for illustration purposes only. Components in this solution do not require a specific location within the disk-array enclosure (DAE) or cabinet. The drives must be positioned based on the published best practices as they apply to your environment.

Testing results
The Hyper-V pass-through connection method achieved a maximum of around 1,368 TPS before becoming saturated based on the gating metric. When the Hyper-V VHD connection method was used, performance dropped by roughly 260 TPS, achieving a maximum of 1,105 TPS before saturation.
When these two results are compared with the results from a non-virtualized environment, it is clear that the design decisions made in a virtualization process can significantly affect the result. In this test scenario, virtualizing the system using pass-through disks produced only a 5.5 percent decline in performance, while virtualizing the system using a host file system caused a 23.7 percent decline. The supported TPS and average response times are given in the following table:

Connection method ... TPS ... Average response time (s) ... Percent decline in max TPS
Physical ... 1,448 ... 1.8 ... --
Hyper-V pass-through ... 1,368 ... 1.6 ... 5.5%
Hyper-V VHD ... 1,105 ... 1.8 ... 23.7%
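The percent-decline column follows directly from the maximum TPS figures in the table; a quick check:

```python
# Reproduce the "Percent decline in max TPS" column from the measured maxima.
physical_tps = 1448
results = {"Hyper-V pass-through": 1368, "Hyper-V VHD": 1105}

for method, tps in results.items():
    decline = (physical_tps - tps) / physical_tps * 100
    print(f"{method}: {decline:.1f}% decline")
# Matches the table: 5.5% for pass-through, 23.7% for VHD.
```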
Conclusion
Virtualization provides many benefits in a database environment. These tests show that when virtualization is implemented with pass-through disks, those benefits come with only a small performance penalty. When it is implemented with a host file system, the degradation is more significant.
Chapter 7: FAST Cache Performance Testing and Validation

Overview

Introduction
This chapter examines the impact of adding FAST Cache to an existing SQL Server workload.

Contents
This chapter contains the following topics: Overview, Testing tools, Methodology, Test results summary, and Results analysis.
Testing tools

Introduction
To test the SQL Server 2008 environment, the Quest Benchmark Factory tool was used.

Quest Benchmark Factory
The Quest Benchmark Factory section on page 21 provides more information on Quest Benchmark Factory.

Methodology

Introduction
This section explains the testing methodology used.

Testing methodology
This testing used a single database with 5,000 TPC-C warehouses, which resulted in approximately 400 GB of active data. Virtual users were added to this environment in periodic increments, allowing sufficient time for the FAST Cache to stabilize before sampling at each load level. The testing was stopped after the average transaction response time breached the gating metric of 2 seconds.

For each load level, a series of virtual users was started in the test environment. Based on the user parameters, each user was expected to generate a certain TPS load under ideal circumstances. In the graphs, this is denoted on the horizontal axis as the Projected TPS for the load. The achieved TPS was expected to fall below this level for any significant load on the system. In most cases, 20 minutes was deemed sufficient for the cache to warm up for a specific workload. Unless otherwise noted, each scaling run was repeated at least twice, with only minor differences observed between the runs.

Testing results summary

Objective
Determine the potential benefit of adding FAST Cache to an OLTP environment.

Setup
In this study, the performance of the system before and after adding FAST Cache was compared. The configuration used in this testing was 20 SAS drives (16 in RAID 10 for the database and four in RAID 10 for the log). The FAST Cache added to the system was composed of four 100 GB EFD devices.
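The scaling methodology above amounts to a simple loop: raise the user count, let the system stabilize, sample, and stop once the gating metric is breached. The following sketch illustrates that loop; run_load() is a synthetic stand-in for a real Benchmark Factory run, not its API, and the user step and response-time model are illustrative assumptions:

```python
# Sketch of the load-scaling loop described above. run_load() is a
# synthetic stand-in for a real Benchmark Factory run, not its API.
GATING_METRIC_S = 2.0   # stop once avg transaction response time breaches this
USER_STEP = 50          # virtual users added per increment (illustrative)

def run_load(users):
    """Synthetic model: throughput plateaus and response time grows with load."""
    tps = min(users * 10, 1450)
    resp = 0.5 + 0.1 * (users / 100) ** 2
    return tps, resp

def scale_until_saturated():
    users, results = 0, []
    while True:
        users += USER_STEP
        tps, resp = run_load(users)
        results.append((users, tps, resp))
        if resp > GATING_METRIC_S:  # gating metric breached: saturation
            return results

levels = scale_until_saturated()
for users, tps, resp in levels:
    print(f"{users:4d} users -> {tps:5.0f} TPS, {resp:.2f} s")
```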
Result analysis

Introduction
This section analyzes the results of the different test scenarios.

Setup
The server configuration used for this testing is shown in the following figure.
The following diagram shows the storage configuration specific to this test scenario. The storage configuration used 300 GB 15k rpm SAS spindles and 100 GB EFD devices.

NOTE: The shelf positioning in the diagram is for illustration purposes only. Components in this solution do not require a specific location within the DAE or cabinet. Position the drives based on the published best practices as they apply to your environment.

Testing results
The configuration without FAST Cache achieved a maximum of around 1,448 TPS before becoming saturated based on the gating metric, as shown in the following graph.
After FAST Cache was added, performance improved dramatically, reaching a maximum of 5,838 TPS before becoming saturated. A comparison of these two results shows that including FAST Cache yields a dramatic increase in the number of TPS that the system can process. The supported TPS and response times are given in the following table:

Configuration             TPS     Average response time (s)
20 SAS drives             1,448   1.8
20 SAS drives + 4 EFDs    5,838   1.9

Higher TPS was achieved in the FAST Cache configuration because the underlying database LUNs were able to service more I/O operations per second (IOPS). The following graph shows the average IOPS for the database LUNs across both configurations.
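The "four times" improvement cited in the conclusion of this chapter follows directly from the two saturation points:

```python
# Speedup from adding FAST Cache, computed from the two saturation TPS values.
baseline_tps = 1448    # 20 SAS drives, no FAST Cache
fast_cache_tps = 5838  # 20 SAS drives + 4 EFDs as FAST Cache

speedup = fast_cache_tps / baseline_tps
print(f"FAST Cache speedup: {speedup:.1f}x")
```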
As the number of IOPS increases, the change in the latency of those operations is worth observing. With FAST Cache, the database disk latency remained low: the latency of I/O operations to the database disks actually decreased in the FAST Cache scenario, the opposite of the behavior observed in the baseline scenario.

It is generally accepted that database LUN latency drives user response time; in this case, however, the two metrics do not track each other. This raises the question of what does drive the increase in response time. The answer depends largely on the environment and requires a thorough analysis of it. That analysis is not covered in this document because it was not complete at the time of publication, and the bottleneck may vary from one environment to another.

Conclusion
The addition of FAST Cache improved the performance of the system by four times, as measured by the number of supported TPS. However, FAST Cache is not a magic solution. In this test scenario, FAST Cache eliminated database disk latency as a driver of user response time.
Chapter 8: LUN Provisioning Options Testing and Validation

Overview

Introduction
The EMC VNX series storage platform provides the following LUN provisioning options:
Pool LUNs
  o Thick LUNs (fully provisioned pool LUNs)
  o Thin LUNs
  o Compressed LUNs
RAID group LUNs

There are several ways to build an FC LUN on an EMC VNX series array. The traditional way, used on the CX4 series of storage arrays, required a user to specify a RAID group and then allocate all the space required for a LUN immediately. This is called a RAID group LUN. The CX4 series combined with FLARE 30 introduced the concept of a storage pool and, by extension, a pool LUN. Pool LUNs can be fully provisioned, like a RAID group LUN, with all the space reserved immediately; they can also provision space as needed, which is called a thin pool LUN. A fully provisioned pool LUN is commonly referred to as a thick pool LUN. Pool LUNs enable you to take advantage of several features introduced with FLARE 30, such as compression and FAST. For the VNX series, the default recommendation is to provision LUNs as thick pool LUNs.

Compression is an attribute that can be enabled or disabled on each LUN. It can reduce storage costs significantly and is ideal for datasets that have low performance requirements. Compression is also flexible because it is implemented at the LUN level and can therefore be applied to any data type that resides on the array.

This chapter compares the performance of Microsoft SQL Server 2008 when using each of these LUN types. The chapter also examines the snapshot performance for pool LUNs and RAID group LUNs.

Contents
This chapter contains the following topics: Overview, Testing tools, Methodology, Test results summary, and Results analysis.
Testing tools

Introduction
To test the SQL Server 2008 environment, the Quest Benchmark Factory tool was used.

Quest Benchmark Factory
The Quest Benchmark Factory section on page 21 provides more information on Quest Benchmark Factory.

Methodology

Introduction
This section explains the testing methodology used. This testing used a single database with 5,000 TPC-C warehouses, which resulted in approximately 400 GB of active data.

Testing methodology 1
This methodology was used for the performance scaling test. Virtual users were added to the environment in periodic increments. The test was stopped after the average transaction response time breached the gating metric of 2 seconds. For each load level, a series of virtual users was started in the test environment. Based on the user parameters, each user was expected to generate a certain TPS load under ideal circumstances. In the graphs, this is denoted on the horizontal axis as the Projected TPS for the load. The achieved TPS was expected to fall below this level for any significant load on the system. Unless otherwise noted, each scaling run was repeated at least twice, with only minor differences observed between the runs.

Testing methodology 2
This methodology was used for the snapshot performance test. A constant user load (approximately 25 percent of the saturation user load) was run. During the test run, snapshot operations were triggered so that their impact on the production workload could be observed. In this solution, four snaps were taken at one-hour intervals. The duration of the constant-load test was chosen so that the snapshot operations could complete within its window.

Testing results summary

Results summary
The following table summarizes the test results. The performance of each LUN type was compared with the performance of thick pool LUNs.
Method                  Impact
Thick pool LUNs         baseline
Thin pool LUNs          <1%
RAID group LUNs         (+) 16.12%
Compressed pool LUNs    (-) 4.57%
NOTE: CX4-480 RAID group LUNs were almost 8 percent slower than the VNX5700 RAID group LUNs.

Result analysis

Introduction
This section provides the result analysis for the following test scenarios:
Thick pool LUNs
Thin pool LUNs
RAID group LUNs
Compressed pool LUNs

Configuration
The configuration for all the scenarios in this test was 20 SAS drives (16 in RAID 10 for the database and four in RAID 10 for the log).

Setup
The Result analysis section on page 22 provides the server configuration. The following figure shows the storage configuration that is specific to this test scenario. The configuration used 300 GB 15k rpm SAS spindles.

NOTE: The shelf positioning in the diagram is for illustration purposes only. Components in the presented solution do not require a specific location within the DAE or cabinet. Position the drives based on the published best practices as they apply to your individual environment.

Testing results

Performance study: LUN provisioning options
This section presents the performance results achieved with each type of LUN, along with details of the actual storage space consumption on the array. In the validated solution, four LUNs were provisioned from the database pool to SQL Server, each with a user capacity of 200 GB and containing 131 GB of data. The total amount of consumed space on the database pool varies depending on the type of LUN provisioned from that pool.
Thick pool LUNs
When thick pool LUNs were provisioned, the consumed capacity of the database pool was nearly 800 GB. The consumed capacity changes with the provisioning option, as discussed in the following sections. Thick pool LUNs achieved a maximum of 1,247 TPS before becoming saturated based on the gating metric.

Thin pool LUNs
When thin LUNs were provisioned, the consumed capacity of the database pool dropped to 552 GB, returning 248 GB of space to the pool. However, the host still saw 200 GB of user capacity on each of the four LUNs. Thin pool LUNs achieved a maximum of 1,243 TPS before becoming saturated based on the gating metric. These results indicate that thin LUNs perform almost on par with thick LUNs while providing additional space savings.
Compressed LUNs
When compression was enabled on the thin LUNs, the consumed capacity of the database pool dropped from 552 GB to 492 GB, saving 60 GB of space at a compression ratio of 0.89. Compressed pool LUNs achieved a maximum of 1,190 TPS before becoming saturated based on the gating metric. The degradation was less than 5 percent compared to thick pool LUNs, which is modest considering the compression overhead.

RAID group LUNs
RAID group LUNs consume the same amount of space as the user capacity seen by the host server. RAID group LUNs achieved a maximum of 1,448 TPS before becoming saturated based on the gating metric, performing nearly 16 percent better than thick pool LUNs. The following two graphs show the performance comparison between the thick pool LUNs, thin pool LUNs, RAID group LUNs, and compressed LUNs.
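The space savings and compression ratio reported for the compressed LUNs are simple arithmetic on the two consumed-capacity figures:

```python
# Space-accounting arithmetic for the compression results above.
thin_consumed_gb = 552        # pool capacity consumed by thin LUNs
compressed_consumed_gb = 492  # pool capacity after enabling compression

saved_gb = thin_consumed_gb - compressed_consumed_gb
ratio = compressed_consumed_gb / thin_consumed_gb
print(f"Saved {saved_gb} GB, compression ratio {ratio:.2f}")
```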
The supported TPS and response times are given in the following table:

Scenario                    Max TPS   Avg response time (s)   Percent TPS improvement/decline vs. thick pool LUNs
Thick pool LUNs             1,247     0.9                     --
Thin pool LUNs              1,243     0.9                     (-) 0.32
RAID group LUNs (VNX5700)   1,448     1.8                     (+) 16.12
RAID group LUNs (CX4-480)   1,331     1.7                     (+) 6.73
Compressed LUNs             1,190     1.7                     (-) 4.57

In the tested environment, thin pool LUNs showed a very minor performance decrement compared to thick pool LUNs, while compressed LUNs showed a marginal 4.57 percent decline. The performance of VNX5700 RAID group LUNs was nearly 16 percent better, and that of CX4-480 RAID group LUNs was 6.7 percent better, than VNX thick pool LUNs.
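The percent figures are computed against the thick pool LUN baseline and can be reproduced from the Max TPS values (minor rounding differences aside):

```python
# Percent TPS improvement/decline relative to the thick pool LUN baseline.
baseline = 1247  # thick pool LUNs

scenarios = {
    "Thin pool LUNs": 1243,
    "RAID group LUNs (VNX5700)": 1448,
    "RAID group LUNs (CX4-480)": 1331,
    "Compressed LUNs": 1190,
}

for name, tps in scenarios.items():
    delta = (tps - baseline) / baseline * 100
    sign = "+" if delta >= 0 else "-"
    print(f"{name}: ({sign}) {abs(delta):.2f}%")
```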
Conclusion
Thick pool LUNs perform well in SQL OLTP environments and are a good starting choice. They can be converted to thin LUNs at a later point in time with minimal impact if more efficient space management is required. If further space savings are required, compressing the LUNs is a good option, although this comes at the cost of reduced performance. However, if there are stringent performance requirements that outweigh the functional benefits of pool LUNs, it is recommended to provision RAID group LUNs in the environment.

Snapshot performance study: pool LUNs (thick, thin, and compressed) compared to RAID group LUNs

Testing results
This section examines the snapshot performance of the different LUN types. The impact of the copy on first write (COFW) operation was observed, and the time taken for each LUN type to settle back to the baseline value was noted.

Thick pool LUNs
There was no significant degradation in the TPS value due to the snapshot operation. After a snapshot was initiated, the database disk latency shot up to almost 10 times the normal value and then gradually came down. It took almost 2 hours and 30 minutes to return to the baseline value after the last snapshot.
Thin pool LUNs
Thin pool LUNs performed similarly to thick pool LUNs. After a snapshot was initiated, the database disk latency shot up to almost 10 times the normal value and then gradually came down. It took almost 3 hours and 30 minutes to return to the baseline value after the last snapshot.
Compressed LUNs
There was no significant degradation in the TPS value in this case either. After a snapshot was initiated, the database disk latency shot up to almost 15 times the normal value and then gradually came down. It took almost 3 hours and 30 minutes to return to the baseline value after the last snapshot.
VNX5700 RAID group LUNs
After a snapshot was initiated, the database disk latency shot up to almost 10 times the normal value and then gradually came down. It took around 1 hour and 15 minutes to return to the baseline value after the last snapshot.
CX4-480 RAID group LUNs
After a snapshot was initiated, the database disk latency shot up to almost 10 times the normal value and then gradually came down. It took around 4 hours and 30 minutes to return to the baseline value after the last snapshot.

The following table compares the snapshot performance of the different LUN types. Settling time is the time taken for the database disk latency to return to the baseline value after a snapshot is taken.

LUN type                    Avg DB disk latency before the snap   Impact of snapshot on baseline (approx)   Settling time (approx)
Thick pool LUNs             7 ms                                  10x                                       150 min
Thin pool LUNs              10 ms                                 10x                                       210 min
RAID group LUNs (VNX5700)   11.5 ms                               10x                                       75 min
RAID group LUNs (CX4-480)   12 ms                                 10x                                       270 min
Compressed LUNs             30.5 ms                               15x                                       210 min

In terms of settling time, thick pool LUNs performed better than thin pool LUNs.
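Combining the baseline latency with the impact multiplier gives a rough estimate of the peak database disk latency during a snapshot. This is only an approximation derived from the table; the measured peaks appear in the graphs:

```python
# Rough estimate of peak database disk latency during a snapshot,
# derived from the baseline latency and impact multiplier above.
snapshot_data = {
    # LUN type: (baseline latency ms, impact multiplier, settling time min)
    "Thick pool LUNs":           (7.0, 10, 150),
    "Thin pool LUNs":            (10.0, 10, 210),
    "RAID group LUNs (VNX5700)": (11.5, 10, 75),
    "RAID group LUNs (CX4-480)": (12.0, 10, 270),
    "Compressed LUNs":           (30.5, 15, 210),
}

for lun, (base_ms, impact, settle_min) in snapshot_data.items():
    print(f"{lun}: peak ~{base_ms * impact:.0f} ms, settles in ~{settle_min} min")
```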
The settling times of all the VNX5700 LUN types are better than that of the CX4-480 RAID group LUNs. The impact of the snapshot due to the COFW operation is almost the same in all scenarios except for the compressed LUNs. In the compressed LUN scenario, there are large spikes in disk latency, which is expected because of the extra processing required to compress and uncompress the data.

Conclusion
Thick pool LUNs perform well with snapshots in a SQL OLTP environment and are a good starting choice. They can be converted to thin LUNs at a later point in time with minimal impact if there is a need for more efficient space management. If further space savings are required, compressing the LUNs is a good option, although this comes at the cost of reduced performance. However, if there are stringent performance requirements that outweigh the functional benefits of pool LUNs, it is recommended to provision RAID group LUNs in the environment.