MANAGING MICROSOFT SQL SERVER WORKLOADS BY SERVICE LEVELS ON EMC VMAX3

EMC VMAX3, Microsoft SQL Server 2014, EMC TimeFinder SnapVX

- Powerful mission-critical enterprise storage
- Simplified storage provisioning
- Consistent SQL Server 2014 transactional and analytical performance
- Efficient data protection and management

EMC Solutions

Abstract

This guide describes an EMC infrastructure offering service-level storage performance for Microsoft SQL Server 2014 on the EMC VMAX3 storage platform, using EMC TimeFinder SnapVX to provide enterprise-class data protection.

April 2015
Copyright © 2015 EMC Corporation. All rights reserved. Published in the USA. Published April 2015.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided "as is." EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

Part Number H13826
Contents

Chapter 1  Executive Summary ... 7
  Executive Summary ... 8
    Document purpose ... 8
    Audience ... 8
    Business case ... 8
    Solution overview ... 8
    Key benefits ... 9
    Terminology ... 9

Chapter 2  Technology Overview ... 11
  Technology overview ... 12
    Overview ... 12
    EMC VMAX3 ... 12
    EMC VMAX3 Service Level Objective ... 13
    EMC FAST ... 14
    EMC Unisphere ... 14
    EMC TimeFinder SnapVX ... 14
    VMware vSphere ... 14
    Microsoft SQL Server 2014 ... 15

Chapter 3  Solution Architecture and Configuration ... 16
  Overview ... 17
  Solution architecture ... 17
  Workload profile ... 18
  Hardware resources ... 19
  Software resources ... 19

Chapter 4  Storage Design and Configuration ... 20
  Overview ... 21
  Storage design considerations ... 21
  Front-end design considerations ... 21
  VMAX3 SLO design ... 23
  VMAX3 storage provisioning with SLO ... 26
    Creating SLOs ... 26
    Changing the SLO ... 29

Chapter 5  VMware Design and Configuration ... 31
  Overview ... 32
  Virtual machine configuration ... 32
  VMware vSphere cluster configuration ... 33
  Virtual network design ... 33
  Storage I/O queue optimization for VMware vSphere ... 34

Chapter 6  SQL Server Design and Configuration ... 36
  Overview ... 37
  SQL Server 2014 configuration ... 37
  SQL Server 2014 database design ... 37

Chapter 7  Solution Validation and Testing ... 39
  Overview ... 40
  Performance criteria and methodologies ... 40
    Overview ... 40
    Performance criteria ... 40
    Test scenarios ... 41
    Test methodology ... 41
    Test result notes ... 42
  Mixed workload test results ... 42
    Overview ... 42
    Performance metrics ... 42
    Test results overview ... 43
    Test results for OLTP validation ... 44
    Test results for DSS validation ... 46
    VMAX3 system performance ... 47
  TimeFinder SnapVX test results ... 49
    Overview ... 49
    Performance testing with snapshot creation ... 49
    Recovery testing ... 50

Chapter 8  Conclusion ... 52
  Summary ... 53
  Findings ... 53

Chapter 9  References ... 54
  References ... 55
    EMC documentation ... 55
    Other documentation ... 55

Appendix A  Optimizing the Storage I/O Queue ... 56
  Optimizing the storage I/O queue for VMware vSphere 5.5 ... 57
    Optimizing the World Queue ... 57
    Optimizing the Adapter Queue ... 57
    Optimizing the Device Queue ... 58

Figures

Figure 1.  EMC VMAX3 family of storage arrays ... 12
Figure 2.  Solution architecture ... 17
Figure 3.  Front-end port connectivity ... 22
Figure 4.  VMAX3 service levels ... 23
Figure 5.  Opening the SLO creation window ... 26
Figure 6.  Creating the storage group ... 27
Figure 7.  Select Host/Host Group ... 28
Figure 8.  Select Port Group ... 28
Figure 9.  Reviewing and finishing storage provision ... 29
Figure 10. Example of changing an SLO ... 30
Figure 11. Example of modifying the selected SLO ... 30
Figure 12. Anti-affinity rule configuration ... 33
Figure 13. Storage I/O queue flow in VMware vSphere 5.5 ... 34
Figure 14. IOPS and TPS results for OLTP ... 44
Figure 15. SLO response time on VMAX3 ... 45
Figure 16. Gold SLO level DSS response times ... 46
Figure 17. Disk heat map ... 47
Figure 18. SAS disk utilization ... 48
Figure 19. Flash disk utilization ... 48
Figure 20. Snapshot creation test results ... 49
Figure 21. SLO response times for test period ... 50
Figure 22. Modifying the HBA I/O throttle count ... 58

Tables

Table 1.  Terminology ... 9
Table 2.  Available service levels ... 13
Table 3.  Workload profile ... 18
Table 4.  Hardware resources ... 19
Table 5.  Software resources ... 19
Table 6.  SLOs used in this solution ... 24
Table 7.  SQL Server virtual machine configuration ... 32
Table 8.  Database design of SQL Server 2014 ... 38
Table 9.  Pre-test response-time criteria for SLOs ... 40
Table 10. Test scenarios ... 41
Table 11. CPU and memory reservation ... 42
Table 12. Summary of mixed workload test results ... 43
Table 13. OLTP instance processor time ... 45
Table 14. Test results for DSS workload ... 46
Table 15. Test results for snapshot recovery ... 51
Chapter 1  Executive Summary

This chapter presents the following topics:

Executive Summary ... 8
Executive Summary

Document purpose

The purpose of this guide is to describe how EMC VMAX3 capabilities can be applied to deploy storage and manage Microsoft SQL Server-based workloads with different service levels by using the VMAX3 Service Level Objective (SLO) feature. This guide also explains new features and enhancements to existing VMAX3 local replication technologies, including EMC TimeFinder SnapVX features for instant repurposing and rapid data recovery.

This guide provides an overview of the key capabilities that VMAX3 delivers when you provision storage through EMC Unisphere with the VMAX3 SLO feature, which is an easy way to deploy a database on the VMAX3 system. This solution also validates the use of VMAX3 to achieve the ideal performance operating range for various sets of workloads in a SQL Server deployment.

Audience

The primary audience of this guide is database and system administrators, storage administrators, and system architects who are responsible for implementing, maintaining, and protecting robust databases and storage systems, and who are interested in achieving higher database availability and protection. Readers of this guide should have some familiarity with SQL Server performance, database backup, and VMAX3 technology.

Business case

SQL Server database administrators and IT/storage administrators face the following challenges:

- Meeting the needs of mixed workloads in an increasingly demanding business environment
- Consolidating databases onto a shared storage system, traditionally a complex process with many steps, including detailed tasks such as thin data device design, thin pool creation, and RAID level definition
- Difficulty estimating storage performance (response time) based on the EMC Fully Automated Storage Tiering (FAST) policy alone and across different tiers
- Complicated, time-consuming, and largely manual performance tuning of SQL Server storage environments
- Providing rapid database-as-a-service (DaaS) offerings for an enterprise's internal IT department, while maintaining performance levels for existing and new deployments
- Repurposing datasets for developers, reporting, and testing

Solution overview

This solution enables EMC customers and partners to take advantage of their SQL Server 2014 investment by providing a detailed reference architecture that uses the VMAX3 storage platform for both mission-critical and application-development SQL Server 2014 instances, consolidated in a virtualized server environment.
This solution relies on the latest EMC VMAX3 hardware and software to better utilize system resources and optimize storage using SLO and FAST for specific SQL Server workload profiles, including:

- Workload management using VMAX3 service levels with SQL Server 2014
- Deployment of SQL Server databases on the VMAX3 system
- Assignment of a VMAX3 service level to SQL Server to adjust resources and improve performance automatically

Key benefits

The key benefits of this solution are:

- VMAX3 provides a platform for SQL Server 2014 databases that is easy to deploy, provision, manage, and operate for different performance needs.
- FAST and SLO simplify performance predictability and tuning, which reduces effort for SQL Server administrators.
- VMAX3 services both mission-critical online transaction processing (OLTP) and heavy decision support system (DSS) workloads in SQL Server 2014 database environments while maintaining low response times.
- The VMAX3 configuration used in this solution, with an efficient disk configuration, surpassed the solution performance objectives while maintaining the requested response times for each SLO level used.
- EMC TimeFinder SnapVX provided efficient snapshot management, application protection with zero performance impact on the SQL Server OLTP workload in this solution, and quick recovery of a 1 TB database in less than three minutes.

Terminology

Table 1 lists some of the terminology used in this guide.

Table 1. Terminology

Delta Set Extension: A Delta Set Extension solves the problem of accurately predicting cache and bandwidth requirements for production workloads as customer environments grow and change over time.

Decision support system: A decision support system (DSS) supports business or organizational decision-making activities. This solution uses an application derived from an industry-standard DSS or OLAP benchmark to mimic DSS workloads.

Masking view: The masking view in VMAX3 ensures that the target initiators (in a host group) can access the target storage resources (in a storage group) by means of the target ports (in a port group).

Port group: Port groups in VMAX3 aggregate multiple ports under a common configuration and provide a stable anchor point for an array connecting to labeled networks.
Storage group: Storage groups in VMAX3 are logical groupings of devices for the purposes of common management.

Service-level objective: A service-level objective (SLO) is the response time target for a storage group.

Storage resource pool: Storage resource pools are storage-containing objects in EMC Symmetrix that contain real, physical storage.
Chapter 2  Technology Overview

This chapter presents the following topics:

Technology overview ... 12
Technology overview

Overview

The key technology components used in this solution are:

- EMC VMAX3
- EMC VMAX3 Service Level Objective
- EMC FAST
- EMC Unisphere
- EMC TimeFinder SnapVX
- VMware vSphere
- Microsoft SQL Server 2014

EMC VMAX3

The EMC VMAX3 family of storage arrays delivers the latest in Tier-1 scale-out multi-controller architecture with consolidation and efficiency for the enterprise. With completely redesigned hardware and software, the EMC VMAX 100K, EMC VMAX 200K, and EMC VMAX 400K arrays provide unprecedented performance and scale. Ranging from the single- or dual-engine VMAX 100K up to the eight-engine VMAX 400K, these arrays offer dramatic increases in floor-tile density, with engines and high-capacity disk enclosures for both 2.5-inch and 3.5-inch drives consolidated in the same system bay. In addition, VMAX 100K, VMAX 200K, and VMAX 400K can be configured as either hybrid or all-flash arrays. This revolutionary VMAX3 architecture delivers a virtual matrix bandwidth of 175 GB/s per engine and up to 1,400 GB/s across an eight-engine array. These VMAX3 models come fully preconfigured from the factory to significantly shorten the time from installation to first I/O.

Figure 1 shows the VMAX3 family.

Figure 1. EMC VMAX3 family of storage arrays

Dynamic Virtual Matrix

The VMAX3 Dynamic Virtual Matrix architecture enables IT departments to build storage systems that transcend the physical constraints of competing array architectures. The architecture enables scaling of system resources through common and fully redundant building blocks called VMAX3 engines.
VMAX3 engines provide the complete foundation for high-availability storage arrays. Each engine contains two VMAX3 directors and redundant interfaces to the Dynamic Virtual Matrix dual InfiniBand fabric interconnect. Each director consolidates front-end, global memory, and back-end functions, enabling direct memory access to data for optimized I/O operations. Depending on the array chosen, up to eight VMAX3 engines can be interconnected via a set of active fabrics that provide scalable performance and high availability.

EMC HYPERMAX OS

VMAX3 arrays introduce the industry's first open-storage and hypervisor-converged operating system, EMC HYPERMAX OS. This OS combines high availability, I/O management, quality of service, data integrity validation, storage tiering, and data security with an open application platform. HYPERMAX OS features the first real-time, non-disruptive storage hypervisor, which manages and protects embedded services by extending VMAX3 high availability to services that traditionally would run external to the array. HYPERMAX OS also provides direct access to hardware resources to maximize performance. The hypervisor can be non-disruptively upgraded.

HYPERMAX OS runs on top of the Dynamic Virtual Matrix and uses its scale-out flexibility of cores, cache, and host interfaces. The embedded storage hypervisor reduces external hardware and networking requirements, while it delivers higher levels of availability and dramatically lowers latency.

EMC VMAX3 Service Level Objective

A VMAX3 Service Level Objective (SLO) is the response time target for a storage group. The SLO includes an optional workload type, so you can further tune expectations for the workload storage group, providing just enough flash and serial-attached SCSI (SAS) capacity to meet your performance objective. The SLO enables you to build different levels of service and assign them to specific workloads to meet performance and availability requirements. Table 2 lists the available service levels.

Table 2. Available service levels

SLO                  Performance type     Use case
Diamond              Ultra high           High-performance computing, latency sensitive
Platinum             Very high            Mission-critical, high-rate OLTP
Gold                 High                 Very heavy I/O, database logs, data sets
Silver               Price/performance    Database datasets, virtual applications
Bronze               Cost optimized       Backup, archive, file
Optimized (default)  N/A                  Places the most active dataset on the highest-performing storage, and the least active on the most cost-effective storage
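The service levels in Table 2 can also be confirmed from a Solutions Enabler management host before any storage groups reference them. This is a minimal sketch, assuming Solutions Enabler 8.x command syntax; the array ID 0123 is a placeholder for your VMAX3 SID:

    # List the storage resource pools preconfigured on the array
    symcfg list -srp -sid 0123

    # List the available service levels (SLOs) and their characteristics
    symcfg list -slo -sid 0123 -detail

The output is a quick way to verify the exact SLO names (Diamond, Platinum, Gold, and so on) before you use them during provisioning.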
VMAX3 with SLO provides you with the ability to quickly provision storage resources to ensure that the right data is placed on the right storage level in real time. Provisioning storage with the SLO feature provides the following benefits for customer deployments:

- All devices are thin
- Thin data devices and thin pools are preconfigured according to best practices
- The SLO can be changed for a storage group to meet different workload needs
- There is no need to plan for or work with meta data, meta groups, meta sizes, and so on

EMC FAST

EMC FAST provides automated management of VMAX3 disk resources on behalf of thin devices. FAST automatically configures disk groups to form a storage resource pool, creating thin pools according to each individual disk technology, capacity, and RAID type. VMAX3 service levels are tightly integrated with FAST software to optimize agility and array performance across all drive types in the system. FAST monitors the storage group performance relative to the SLO and automatically provisions the appropriate disk resources to maintain a consistent performance level.

EMC Unisphere

EMC Unisphere for VMAX3 is an intuitive management interface that enables IT managers to maximize productivity by dramatically reducing the time required to provision, manage, and monitor VMAX3 storage assets. Unisphere delivers key requirements such as simplification, flexibility, and automation. The Unisphere Performance Viewer facilitates detailed VMAX3 system performance analysis without the need for a live array connection. REST APIs simplify programmatic performance monitoring from cloud management and data center orchestration tools.

EMC TimeFinder SnapVX

EMC TimeFinder SnapVX is a local replication solution designed to non-disruptively create point-in-time copies (snapshots) of critical data. SnapVX creates snapshots, which are not Symmetrix devices in VMAX3, by storing changed tracks (deltas) directly in the storage resource pool of the source device. With SnapVX, you do not need to specify a target device and source/target pairs when you create a snapshot. If you need an application to use the point-in-time data, you can create links from the snapshot to one or more target devices. If there are multiple snapshots and the application needs a particular point-in-time copy for host access, you can link and relink until the correct snapshot is located. In VMAX3, a snapshot of an entire storage group can be taken with a single command.

Note: TimeFinder also supports legacy local replication solutions, including EMC TimeFinder/Clone, EMC TimeFinder VP Snap, and EMC TimeFinder/Mirror.

VMware vSphere

VMware vSphere is a robust, high-performance server virtualization platform that enables you to virtualize business-critical applications. It transforms the physical hardware resources of a computer by virtualizing the CPU, RAM, hard disk, and
network controller with flexibility and reliability. This transformation creates fully functional virtual machines that run isolated and encapsulated operating systems and applications.

VMware High Availability (HA) provides easy-to-use, cost-effective high availability for all applications running on virtual machines. If a server fails, affected virtual machines are automatically restarted on other host machines in the cluster that have spare capacity. HA minimizes downtime and IT service disruption while eliminating the need for dedicated standby hardware or installation of additional software. Together with VMware vSphere Distributed Resource Scheduler (DRS) and VMware vSphere Storage DRS, virtual machines have access to the appropriate resources at any point in time through load balancing of compute and storage resources.

Microsoft SQL Server 2014

Microsoft SQL Server 2014 is the next generation of Microsoft's information platform, with features that deliver faster performance, expand capabilities both on premises and in the cloud, and provide powerful business insights. SQL Server 2014 offers organizations the opportunity to efficiently protect, unlock, and scale data across desktops, mobile devices, data centers, and a private, public, or hybrid cloud. The SQL Server product groups have made sizable investments to improve the scalability and performance of the SQL Server database engine.

SQL Server 2014 is used to build mission-critical applications using high-performance, in-memory technology across OLTP and data warehousing for decision support systems, business intelligence, analytics services, and so on. You must fully understand your workload characteristics and plan accordingly when deploying SQL Server.

Note: While SQL Server 2014 was the version used in this solution, the technology supports all versions of SQL Server from 2008 R2 onward.
Chapter 3  Solution Architecture and Configuration

This chapter presents the following topics:

Overview ... 17
Solution architecture ... 17
Workload profile ... 18
Hardware resources ... 19
Software resources ... 19
Overview

This chapter describes the validated reference architecture for mission-critical OLTP and DSS workloads of enterprise-class SQL Server 2014 in a virtualized VMware environment on a VMAX3 array.

Solution architecture

Figure 2 shows the validated solution architecture.

Figure 2. Solution architecture
In this solution, we[1] designed the reference architecture for both OLTP and DSS workloads of SQL Server 2014 on VMAX3. Both workloads were validated using a Fibre Channel (FC) connection to the VMAX3 array.

For the OLTP workload, we deployed three virtual machines in the VMware virtualized environment and mapped multiple databases, created from one unique SLO level, to each SQL Server OLTP instance:

- Diamond OLTP: Diamond SLO level with four databases of different sizes for an OLTP workload with ultra-high performance requirements
- Platinum OLTP: Platinum SLO level with four databases of different sizes for a mission-critical OLTP workload
- Gold OLTP: Gold SLO level with four databases of different sizes for a heavy I/O OLTP workload

For the DSS workload, we deployed two standalone SQL Server 2014 instances residing on separate hosts. Each instance was mapped to a unique Gold SLO level for the data warehousing system. Each Gold SLO level comprised one data warehousing database for a DSS workload with heavy throughput performance requirements.

Workload profile

Table 3 details the workload profile for both the OLTP and DSS workloads used in this solution.

Table 3. Workload profile

Workload type   Quantity   Scale and size                          Description
OLTP            3          100,000 users, 1 TB database;           OLTP workload derived from an industry-standard,
                           50,000 users, 500 GB database;          modern OLTP benchmark, with a 90:10
                           25,000 users, 250 GB database;          read/write ratio
                           5,000 users, 50 GB database
DSS             2          1,000 scale factor, 1.5 TB database     DSS workload derived from an industry-standard
                                                                   DSS or OLAP benchmark, 100% sequential read

[1] In this guide, "we" refers to the EMC Solutions engineering team that validated the solution.
Hardware resources

Table 4 lists the hardware components used in this solution.

Table 4. Hardware resources

EMC VMAX 200K array (quantity: 1), storage:
- HYPERMAX OS
- Four engines
- 3 TB cache
- 168 x 10K RPM 600 GB SAS drives (including eight hot spares)
- 68 x 200 GB flash drives (including four hot spares)

ESXi server (quantity: 5):
- Three servers with 20 physical cores (40 logical cores) and 160 GB memory each, for OLTP
- Two servers with 40 physical cores (80 logical cores) and 380 GB memory each, for DSS

FC switch (quantity: 2): 8 Gb/s FC SAN connection between the servers and the storage array

Ethernet switch (quantity: 2): 10 GbE IP connection between the servers

Software resources

Table 5 lists the software components used in this solution.

Table 5. Software resources

Software                          Version             Description
EMC HYPERMAX OS                   5977.497.471        Operating environment for VMAX3
EMC PowerPath                     5.9                 Multipathing and load balancing with I/O path optimization
EMC Unisphere for VMAX            8.0.1.143           VMAX3 management interface
EMC Solutions Enabler             8.0.1.214           API between storage and other components
VMware vSphere ESXi               5.5                 Hypervisor
VMware vCenter Server             5.5                 vSphere management server
Microsoft SQL Server 2014         Enterprise Edition  SQL Server database software
Microsoft Windows Server 2012 R2  Datacenter Edition  Operating system for database servers
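Before deploying the solution components, it is worth confirming that the management host can see the array and that each ESXi host is at the expected build. This is a hedged sketch using standard Solutions Enabler and ESXi commands:

    # On the Solutions Enabler management host: discover and list attached arrays
    symcfg discover
    symcfg list

    # On each ESXi host (SSH session): confirm the hypervisor version and build
    vmware -v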
Chapter 4  Storage Design and Configuration

This chapter presents the following topics:

Overview ... 21
Storage design considerations ... 21
Front-end design considerations ... 21
VMAX3 SLO design ... 23
VMAX3 storage provisioning with SLO ... 26
Overview

This chapter describes the VMAX3 storage configuration used in this solution. Design considerations, including front-end connectivity for the VMAX3 storage array, must be understood to ensure the best availability and performance for your hosts. EMC recommends that you check the latest best practices and design considerations before building your solution.

Storage design considerations

To achieve optimized performance for your deployment of SQL Server 2014 on VMAX3, follow these general storage design best practices:

- Configure enough resources to handle the workload, and use those resources as uniformly as possible.
- Connect storage ports across different directors instead of using all the ports on a single director.
- Balance the load as evenly as possible across VMAX3 resources, including front-end directors, front-end ports, and so on. The VMAX3 back end is preconfigured in the bin file; therefore, SLO-based provisioning determines which disks the data resides on.
- In a storage area network (SAN) environment, use redundant host bus adapters (HBAs) to connect to redundant fabrics for load balancing and resilience across HBA and switching paths. Install PowerPath for optimal path management and maximum I/O performance.
- Avoid multiple storage resource pools on VMAX3 and do not overcomplicate the configuration.

Front-end design considerations

The front-end configuration and connectivity of the VMAX3 array must follow the latest best practices and considerations:

- Spread front-end ports evenly across all available directors. Connect each host to ports on different directors before using additional ports on the same director.
- Use the same logical port number on each director for ease of administration and troubleshooting.
- Use more front-end ports to get maximum throughput for large-block, high-throughput workloads.

As shown in Figure 3, we configured 32 front-end ports evenly across the eight directors (four engines) in this solution. Each host HBA port is mapped to two front-end ports on different directors for redundancy and load balancing. To balance the
load across VMAX3 storage array resources, all ESXi hosts are segregated to use different VMAX3 front-end ports, using SAN zoning and the masking view feature in VMAX3.

In this solution, two HBA ports were enabled on each of the three hosts used for OLTP testing. To achieve optimized throughput, the other two hosts, used for DSS testing, each had four HBA ports enabled.

Figure 3. Front-end port connectivity
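With PowerPath installed, you can confirm that each LUN is reachable through the intended number of front-end director paths before running any performance tests. A minimal sketch; because this solution uses PowerPath/VE with vSphere, the query is run remotely with rpowermt, and the host name is a placeholder:

    # Show every device with its paths across HBAs and VMAX3 front-end ports
    rpowermt display dev=all host=esxi-host-01

If a device reports fewer live paths than the zoning design intends, revisit the SAN zones and the masking view before proceeding.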
VMAX3 SLO design

A VMAX3 SLO defines the ideal performance operating range of an application. The SLO offers different service levels, as shown in Figure 4. All devices not explicitly associated with an SLO are managed by the system Optimized SLO by default.

Figure 4. VMAX3 service levels

We defined four different types of resource requirements for the databases to reflect real-world customer needs:

- Diamond: To host databases that require the lowest response times to support business-critical operations
- Platinum: To host databases that support business applications that also require low response times
- Gold: To host databases that are appropriate for test and development requirements
- Silver: To host roles that are appropriate for lower performance and latency expectations, such as virtual machine boot images, disk-based backup targets, file shares, and so on

We designed four databases of different sizes for each of the OLTP SQL Server instances that reside on the Diamond, Platinum, and Gold service levels, and two more databases for the DSS SQL Server instances that reside on the Gold service level.
Table 6 describes the SLOs used in this solution.

Table 6. SLOs used in this solution

Storage group name       SLO        Workload type   Capacity (GB)
SQL_DB1_Dia              Diamond    OLTP            1,536
SQL_LOG1_Dia             Diamond    N/A             300
SQL_DB2_Dia              Diamond    OLTP            750
SQL_LOG2_Dia             Diamond    N/A             150
SQL_DB3_Dia              Diamond    OLTP            400
SQL_LOG3_Dia             Diamond    N/A             100
SQL_DB4_Dia              Diamond    OLTP            100
SQL_LOG4_Dia             Diamond    N/A             50
SQL_Tempdb_OLTP_Dia      Diamond    N/A             100
SQL_Templog_OLTP_Dia     Diamond    N/A             100
SQL_DB1_Plat             Platinum   OLTP            1,536
SQL_LOG1_Plat            Platinum   N/A             300
SQL_DB2_Plat             Platinum   OLTP            750
SQL_LOG2_Plat            Platinum   N/A             150
SQL_DB3_Plat             Platinum   OLTP            400
SQL_LOG3_Plat            Platinum   N/A             100
SQL_DB4_Plat             Platinum   OLTP            100
SQL_LOG4_Plat            Platinum   N/A             50
SQL_Tempdb_OLTP_Plat     Platinum   N/A             100
SQL_Templog_OLTP_Plat    Platinum   N/A             100
SQL_DB1_Gold             Gold       OLTP            1,536
SQL_LOG1_Gold            Gold       N/A             300
SQL_DB2_Gold             Gold       OLTP            750
SQL_LOG2_Gold            Gold       N/A             150
SQL_DB3_Gold             Gold       OLTP            400
SQL_LOG3_Gold            Gold       N/A             100
SQL_DB4_Gold             Gold       OLTP            100
SQL_LOG4_Gold            Gold       N/A             50
SQL_Tempdb_OLTP_Gold     Gold       N/A             100
SQL_Templog_OLTP_Gold    Gold       N/A             100
SQL_DB5_DSS              Gold       DSS             2,048
SQL_LOG5_DSS             Gold       N/A             400
SQL_Tempdb5_DSS          Gold       N/A             250
SQL_Templog5_DSS         Gold       N/A             200
SQL_DB6_DSS              Gold       DSS             2,048
SQL_LOG6_DSS             Gold       N/A             400
SQL_Tempdb6_DSS          Gold       N/A             250
SQL_Templog6_DSS         Gold       N/A             200
SQL_Sys                  Silver     N/A             100
SQL_DB_Backup            Silver     N/A             10,240
VM_OS                    Silver     N/A             2,048
VMAX3 storage provisioning with SLO

Creating SLOs

To create SLOs for SQL Server 2014:

1. Log in to Unisphere. Under Storage > Storage Group Management > Manage, click Provision Storage to Host, as shown in Figure 5, to access the SLO creation window.

Figure 5. Opening the SLO creation window

2. Create a cascaded storage group with parent and child storage groups, each with its own SLO and optional Workload Type, as shown in Figure 6.

Figure 6. Creating the storage group

   a. Type the Storage Group Name. Storage group names must be unique on the storage system and cannot exceed 64 characters. Only alphanumeric characters, underscores (_), and hyphens (-) are allowed. Storage group names are not case sensitive.
   b. To select a Storage Resource Pool other than the default, click Edit and select the pool. In this solution, we used SRP_1.
   c. Type the child Storage Group Name as defined in Table 6 on page 24.
   d. Select the service level for each storage group. The SLO specifies the characteristics of the provisioned storage, including maximum response time, workload type, and priority.
   e. Refine the SLO by selecting the Workload Type to assign to it.
   f. Type the number of Volumes and select the Capacity of each volume.
   g. Click Add Service Level to create another child storage group SLO set.

3. Select the host group, as shown in Figure 7, and click Next.

Figure 7. Select Host/Host Group

4. Select either a new or existing port group, as shown in Figure 8, and click Next.

Figure 8. Select Port Group

5. Type the name for the Masking View and verify the rest of your selections. Click Add to Job List, as shown in Figure 9, and choose Run Now to perform the operation immediately.

Figure 9. Reviewing and finishing storage provision

Changing the SLO

The SLO feature provides the ability to change the service level when the storage performance of the SQL Server databases cannot meet business needs or an adjustment needs to be made at the resource level. With SLOs, you can complete the data migration directly by changing the SLO, without interrupting the system or application.

Figure 10 shows an example of how to change the SLO. In this example, the current SLO of SQL_DB1_Gold is Gold. We changed it to Platinum.
Figure 10. Example of changing an SLO

To change the SLO, we clicked Modify and chose Platinum for the Service Level, as shown in Figure 11.

Figure 11. Example of modifying the selected SLO
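The same provisioning and SLO changes can also be scripted with Solutions Enabler instead of Unisphere. This is a hedged sketch, assuming Solutions Enabler 8.x syntax; the SID is a placeholder, and the storage group names are taken from Table 6:

    # Create a storage group bound to the Diamond SLO with an OLTP workload type on SRP_1
    symsg -sid 0123 create SQL_DB1_Dia -srp SRP_1 -slo Diamond -wl OLTP

    # Change the service level of an existing storage group without disruption
    symsg -sid 0123 -sg SQL_DB1_Gold set -slo Platinum

Because FAST acts on the storage group as soon as the SLO changes, no host-side rescan or application outage is needed for the move.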
Chapter 5  VMware Design and Configuration

This chapter presents the following topics:

Overview ... 32
Virtual machine configuration ... 32
VMware vSphere cluster configuration ... 33
Virtual network design ... 33
Storage I/O queue optimization for VMware vSphere ... 34
Overview

In this solution, we deployed virtualized SQL Server instances for both the OLTP and DSS environments on the VMware vSphere 5.5 platform. This chapter describes the virtual design and configuration for the solution.

Virtual machine configuration

We deployed virtualized SQL Server instances across three physical machines for the OLTP environment, to fully utilize the storage resources of the VMAX3 storage array. We also deployed two virtualized standalone SQL Server instances on separate physical machines for the DSS environment. Table 7 shows the detailed configuration of the SQL Server virtual machines deployed in this solution.

Table 7. SQL Server virtual machine configuration

Virtual machine role   Quantity   vCPU   Memory    OS VMDK size   Mapped SLO from VMAX3
OLTP                   3          24     32 GB     100 GB         Diamond, Platinum, Gold
DSS                    2          32     128 GB    100 GB         Gold x 2

When allocating CPU and memory resources for the SQL Server virtual machines in vSphere, EMC recommends that you implement the following best practices to achieve better performance:

- Enable non-uniform memory access (NUMA) on each ESXi server to fully utilize the scalability of the compute and memory resources. Do not exceed the maximum NUMA node count of the physical server when sizing each virtual machine. Run the following command to check the NUMA node settings on your ESXi server:

  esxcli hardware memory get | grep NUMA

- Allocate enough memory reservation on each virtual machine for both SQL Server and OS overhead.
- Install the latest version of VMware Tools in the guest OS to improve the manageability of the virtual machine.

VMAX3 provides unprecedented performance and scale for a wide range of SQL Server workloads, including typical OLTP and DSS workloads. To support very high levels of I/O throughput, configure multiple VMware Paravirtual SCSI (PVSCSI) controllers inside the virtual machine to drive parallel I/O operations. The PVSCSI controllers benefit overall performance and lower CPU utilization in the guest OS. EMC recommends that you evenly distribute database LUNs across the PVSCSI controllers to achieve optimal performance.
VMware vSphere cluster configuration

In this solution, we deployed a vSphere cluster for high availability and flexible administration of both the OLTP and DSS systems. EMC recommends that you configure both VMware vSphere High Availability (HA) and VMware vSphere Distributed Resource Scheduler (DRS) for each SQL Server virtual machine.

Figure 12 shows the DRS anti-affinity rule created for this solution. We enabled the anti-affinity rule for all SQL Server virtual machines. This rule places each virtual machine on a different physical host, to achieve both performance isolation and efficient resource utilization.

Figure 12. Anti-affinity rule configuration

Virtual network design

We created two standard virtual switches for each ESXi server:

- vSwitch0: 1 Gb Ethernet for the management network
- vSwitch1: 10 Gb Ethernet for virtual machine connectivity

To ensure optimal performance and stability, EMC recommends selecting VMXNET3 as the virtual network adapter type when connecting a virtual machine to vSwitch1.
Storage I/O queue optimization for VMware vSphere

In a virtualized environment, the default settings for storage I/O queue depth are not necessarily optimal for the I/O-intensive workloads that VMAX3 can support. By using storage I/O queues, vSphere enables multiple virtual machines to share a single resource. Figure 13 shows the main types of queues in vSphere:

- World Queue: per-virtual-machine queue
- Adapter Queue: per-HBA queue
- Device Queue: per-LUN queue

Figure 13. Storage I/O queue flow in VMware vSphere 5.5

An I/O request flows into the World Queue, then into the Adapter Queue and, finally, into the Device Queue for the LUN that the I/O targets.
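You can inspect the current queue limits directly on the ESXi host before deciding whether the changes in Appendix A are needed. A minimal sketch using standard ESXi 5.5 commands:

    # Show each device with its maximum queue depth as seen by the host
    esxcli storage core device list | grep -E "Display Name|Device Max Queue Depth"

    # For live queue statistics, run esxtop and press 'd' (adapter), 'u' (device), or 'v' (virtual machine)
    esxtop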
To achieve extremely high performance for SQL Server 2014 on VMAX3, EMC recommends that you follow the instructions in Appendix A: Optimizing the Storage I/O Queue to optimize the storage queues in your vSphere environment. For more details, refer to the VMware blog post Troubleshooting Storage Performance in vSphere: Storage Queues.
Chapter 6  SQL Server Design and Configuration

This chapter presents the following topics:

Overview ... 37
SQL Server 2014 configuration ... 37
SQL Server 2014 database design ... 37
Overview

This chapter provides the design and configuration of SQL Server 2014 for the mission-critical OLTP and DSS workloads used in this solution.

SQL Server 2014 configuration

The following list shows the Windows Server 2012 R2 and SQL Server 2014 configuration of each SQL Server instance. We used the default values for all other settings:

- Enable trace flag T834 to enable large pages for each SQL Server instance.
- Grant the Lock Pages in Memory right to the SQL Server service account.
- Enable trace flag T1118 to allocate full extents for tempdb objects on instances deployed for DSS workloads.
- Enable the -E startup parameter for both DSS instances. This increases the number of contiguous extents in each file allocated to a database table as it grows, and improves sequential disk access.
- Enable instant file initialization for the SQL Server startup service account to accelerate the initialization of database files.
- Pre-allocate data and log files for both the SQL Server OLTP/DSS and tempdb databases to avoid auto-growth during peak times.
- Use multiple files for data and tempdb, and make SQL Server data files of equal size within the same filegroup. For DSS workloads that heavily utilize tempdb, place the tempdb data files and log files on separate LUNs.
- Use a 64 KB allocation unit size when formatting all data and log LUNs.
- Set Max Server Memory to limit the memory available to SQL Server, so that some memory remains reserved for OS operations.

For detailed information about best practices for your SQL Server configuration, refer to the Microsoft SQL Server Best Practices and Design Guidelines for EMC Storage White Paper. A command-line sketch for verifying some of these settings appears after Table 8 below.

SQL Server 2014 database design

We designed four databases of different sizes for each OLTP SQL Server instance residing on the different VMAX3 SLO levels: Diamond, Platinum, and Gold. In most cases, tempdb under an OLTP workload may not be very I/O demanding, but you still need to follow basic design principles. We separated tempdb into four data files and placed them on different LUNs.

We designed one 1.5 TB data warehouse database for each of the two DSS instances that were hosted on the Gold SLO level. Tempdb under data warehousing or DSS
workloads is generally placed under intensive I/O demands and warrants special attention in those environments. We designed enough throughput and capacity for both workloads. We also separated tempdb into eight data files for each DSS instance and placed them on different LUNs to fully utilize disk performance.

Table 8 lists the detailed database design for SQL Server 2014.

Table 8. Database design of SQL Server 2014

Workload profile: Diamond OLTP, Platinum OLTP, Gold OLTP

Database          Data LUN capacity   Data files   Log LUN capacity   Log files
1 TB database     1.5 TB              4            300 GB             1
500 GB database   750 GB              4            150 GB             1
250 GB database   400 GB              4            100 GB             1
50 GB database    100 GB              4            50 GB              1
Tempdb            100 GB              4            100 GB             1

Workload profile: Gold DSS 1, Gold DSS 2

Database          Data LUN capacity   Data files   Log LUN capacity   Log files
1.5 TB database   2 TB                16           400 GB             1
Tempdb            100 GB              8            100 GB             1

Note: The database design in this solution is based on the test workload. In a production environment, the database size, especially the log and tempdb file sizes, varies depending on the type of transactions and queries running on those databases.
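As referenced earlier, some of the instance-level settings in this chapter can be verified or applied from the command line. This is a hedged sketch; the server name SQLVM1, the file name tempdev, and the sizes are illustrative placeholders, not values from this solution:

    :: Verify which global trace flags (for example, 834 and 1118) are active
    sqlcmd -S SQLVM1 -E -Q "DBCC TRACESTATUS(-1)"

    :: Pre-allocate a tempdb data file with auto-growth disabled, to avoid growth during peak load
    sqlcmd -S SQLVM1 -E -Q "ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 25600MB, FILEGROWTH = 0)"

Note that trace flag T834 and the -E option take effect only when added as SQL Server startup parameters (for example, through SQL Server Configuration Manager), not at runtime.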
Chapter 7  Solution Validation and Testing

This chapter presents the following topics:

Overview ... 40
Performance criteria and methodologies ... 40
Mixed workload test results ... 42
TimeFinder SnapVX test results ... 49
Overview

This chapter validates the performance of SQL Server 2014 in a virtualized environment on the VMAX3 storage array with mission-critical mixed workloads (OLTP/DSS). The chapter also describes how we implemented TimeFinder SnapVX and validated both the performance impact during snapshot creation and the functionality of snapshot recovery.

Performance criteria and methodologies

Overview

This section describes the performance criteria and methodologies used to validate the solution.

Performance criteria

Because an SLO is a desired level of performance required by the storage workload, we defined the expected response-time criteria listed in Table 9 before running the tests. The OLTP columns apply to small I/O; the DSS columns apply to large I/O.

Table 9. Pre-test response-time criteria for SLOs

SLO        Behavior                                         OLTP: expected avg. RT / compliance range   DSS: expected avg. RT / compliance range
Diamond    Emulates flash drive performance                 0.8 ms / 0-3 ms                             2.3 ms / 1-6 ms
Platinum   Emulates performance between flash and 15K RPM   3.0 ms / 2-7 ms                             4.4 ms / 3-9 ms
Gold       Emulates 15K RPM performance                     5.0 ms / 3-10 ms                            6.5 ms / 4-12 ms
Silver     Emulates 10K RPM performance                     8.0 ms / 6-15 ms                            9.5 ms / 7-17 ms

Note: These results do not fully demonstrate the overall performance capabilities of VMAX3, which can achieve results higher than those required for this solution. This solution was designed to meet very specific customer-driven requirements using a subset of the available VMAX3 hardware configuration.
Test scenarios

Table 10 lists the test scenarios we used to validate this solution.

Table 10. Test scenarios

1. Performance test with OLTP workload: Three hosts, each containing one active SQL Server instance. Ran OLTP workloads on the four databases of 1 TB/500 GB/250 GB/50 GB. Each instance covered one of the SLO levels (Diamond, Platinum, Gold).

2. Performance test with DSS workload: Two hosts, each containing one standalone SQL Server instance, with one 1.5 TB database on each node. Ran DSS workloads on the databases at the Gold SLO level.

3. Snapshot protection test with SnapVX: Ran the OLTP workload against the four-database set at the Gold SLO level and created snapshots every hour through Unisphere for VMAX. A maximum of eight snapshots was created during the test. Compared the performance during each snapshot period.

4. Recovery test through snapshot: Mounted one of the database snapshots to a second host for repurposing through Unisphere for VMAX3 to measure the recovery time.

Test methodology

To simulate workloads in a real-world OLTP and DSS environment, we used the following tools:

- OLTP workload tool: Derived from an industry-standard, modern OLTP benchmark. It simulated a stockbroker trading system: managing customer accounts, executing customer trade orders, and performing other transactions within the financial markets. The majority of the I/O was 8 KB in size, with a 90:10 read/write ratio.
- OLAP workload tool: Derived from an industry-standard DSS or OLAP benchmark. It simulated system functionality representative of complex business analysis applications for a wholesale supplier, through a 22-query set given a realistic context. The majority of the I/O was between 64 KB and 512 KB in size, with a 100 percent read ratio.

The detailed test methodology was:

1. Ran the performance test against the three OLTP instances on the Gold, Platinum, and Diamond SLO levels respectively, and reached a steady state.
2. With the OLTP test running, ran the performance test against the two DSS instances on the Gold SLO level and reached a steady state. Monitored each SLO and host behavior for both OLTP and DSS. Measured and recorded the performance.
3. Ran the baseline performance test again against the Gold OLTP instance and reached a steady state. Created SnapVX snapshots every hour against the
storage group for the Gold OLTP databases. Measured and recorded the performance during the snapshot period.
4. Performed the recovery test against one of the Gold OLTP database snapshots and mounted the snapshot on another host. Measured and monitored the recovery time.

Test result notes

The validation test results are highly dependent on workload, specific application requirements, and design and implementation. Relative system performance varies as a result of these and other factors. Therefore, the workloads used to validate this solution should not be used as a substitute for a specific customer application benchmark when critical capacity planning and product evaluation decisions are contemplated.

All performance data in this guide was obtained in a rigorously controlled environment. Results obtained in other operating environments may vary. EMC Corporation does not warrant or represent that a user can or will achieve similar performance expressed in transactions per minute.

Note: The database metric transactions per second (TPS) is described and used within the test results. Because transactions differ greatly between database environments, these values should be used only as a reference and for comparative purposes within the test results.

Mixed workload test results

Overview

This section describes the test results of the mixed workload validation for both the OLTP and DSS environments. Table 11 shows the CPU and memory configuration for the test environment.

Table 11. CPU and memory reservation

Item                 OLTP                                       DSS
CPU reservation      24 vCPUs                                   32 vCPUs
Memory reservation   32 GB (30 GB reserved for SQL Server)      128 GB (120 GB reserved for SQL Server)

Performance metrics

To determine the performance of the SQL Server mixed workload on VMAX3, we used performance monitors, including Windows Perfmon, VMware ESXTOP, and EMC Unisphere for VMAX3, to measure and record the statistics. The key metrics for OLTP in the mixed workload test are:

- Throughput in IOPS (transfers per second)
- Throughput in TPS
- Processor time percentage
The key metrics for DSS in the mixed workload test are:

- Bandwidth
- Processor time percentage

The key metrics on VMAX3 are:

- SLO response time in milliseconds
- VMAX3 front-end and back-end utilization as a percentage
- Utilization of each disk technology as a percentage

Test results overview

We ran a traditional OLTP workload continuously against the three OLTP SQL Server instances on the Gold, Platinum, and Diamond SLO levels. At the same time, we applied an industry-standard DSS workload to simulate a typical data warehousing environment and drive high bandwidth. The DSS workload was generated by eight consecutive query sets, each containing 22 T-SQL queries. The total test duration was about 10 hours.

Table 12 summarizes the high-level performance results of the mixed workload test.

Table 12. Summary of mixed workload test results

OLTP workload:
- Host IOPS: 107,310
- TPS: 8,574
- SLO response time: Diamond < 2.1 ms; Platinum < 2.8 ms; Gold < 4.0 ms

DSS workload:
- Host bandwidth: 3,406 MB/s
- Avg. DSS query-set execution time: 1 hour and 18 minutes
- SLO response time: < 12 ms

As shown in Table 12, we achieved over 107,000 IOPS and 8,500 TPS in total for the three OLTP instances running on different SLO levels. The SLO response times were kept within the expected compliance ranges, at very low levels. We also achieved over 3,400 MB/s of bandwidth with the standard DSS workloads on the Gold SLO level, with each 22-query set of the DSS database completed within 1 hour and 18 minutes. The SLO response time for each DSS instance was kept within 12 ms.

Note: The test results were based on a specific number of disk resources in the VMAX3 system, which demonstrated both efficient and balanced utilization of the VMAX3 hardware resources.
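The host-side counters reported in the following sections can be captured with standard Windows performance tools. A minimal sketch using logman; the counter set name, sampling interval, and output path are illustrative placeholders, not values taken from this solution:

    :: Create a counter collection for TPS, disk transfers (IOPS), and CPU, sampled every 15 seconds
    logman create counter SQLMixed -c "\SQLServer:Databases(*)\Transactions/sec" "\LogicalDisk(*)\Disk Transfers/sec" "\Processor(_Total)\% Processor Time" -si 15 -o C:\PerfLogs\SQLMixed

    :: Start the collection before the test window and stop it afterward
    logman start SQLMixed
    logman stop SQLMixed

On a named SQL Server instance, the counter object is MSSQL$InstanceName:Databases rather than SQLServer:Databases.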
Test results for OLTP validation

Figure 14 shows the detailed IOPS and TPS results for the OLTP validation. We applied a similar workload profile to the three OLTP instances, and the results showed highly scalable performance at each SLO level.

Figure 14. IOPS and TPS results for OLTP

- For the Gold OLTP, which simulates a heavy I/O transactional environment, the corresponding SLO level rivals the performance of 15K SAS disks. We achieved over 27,000 IOPS and 2,322 TPS with host latency kept within 5 ms.
- For the Platinum OLTP, which simulates a mission-critical OLTP workload, the corresponding SLO level rivals performance between 15K SAS and flash disks. We achieved over 33,000 IOPS and 2,759 TPS with host latency kept within 3 ms.
- For the Diamond OLTP, which pursues extreme performance on VMAX3, the corresponding SLO level rivals the performance of pure flash disks. We achieved over 44,000 IOPS and 3,493 TPS with host latency kept at about 2 ms.
Figure 15 shows the SLO response times on VMAX3 during the performance test. The workloads were started before hour zero. After nearly two hours, the performance entered a steady state.

Figure 15. SLO response time on VMAX3

We kept all SLO response times within the corresponding SLO compliance ranges:

- For the Gold OLTP, the average SLO response time was less than 4 ms, compliant within the range of 3 to 10 ms.
- For the Platinum OLTP, the average SLO response time was slightly above 2.5 ms, compliant within the range of 2 to 7 ms.
- For the Diamond OLTP, the average SLO response time was about 2 ms, compliant within the range of 0 to 3 ms.

Table 13 shows the processor time on each OLTP instance. The CPU utilization on each OLTP instance was kept within 75 percent.

Table 13. OLTP instance processor time

SQL Server instance   Processor time   Target
Gold OLTP             31%              Less than 75%
Platinum OLTP         41%              Less than 75%
Diamond OLTP          49%              Less than 75%

The results demonstrate that VMAX3 can easily handle over 107,000 IOPS and 8,500 TPS (depending on the transaction types), even under mixed OLTP and DSS workloads. Because the VMAX3 SLO feature sets the response time target, each storage
group is serviced within the specified compliance range to ensure the performance level preset by the user.

Test results for DSS validation

On top of the OLTP workloads, we applied DSS workloads, which consisted of eight identical query sets, throughout the validation test, for a total test duration of about 10 hours. We monitored the bandwidth achieved on each SQL Server data warehousing instance and recorded the completion time of each query set.

The workload applied to the two Gold DSS instances is derived from an industry-standard OLAP benchmark to simulate a real-world data warehousing system. The actual bandwidth achieved in the solution was highly dependent on the queries derived from the DSS benchmark.

Table 14 shows the detailed test results for the DSS part of the mixed workload validation. The average bandwidth results were 1,670 MB/s and 1,736 MB/s for the two DSS instances. CPU utilization was over 60 percent, still within the target range.

Table 14. Test results for DSS workload

SQL Server instance   Avg. bandwidth   Processor time   Target          Avg. duration for each query set
Gold DSS 1            1,670 MB/s       61.2%            Less than 75%   1 hour and 18 minutes
Gold DSS 2            1,736 MB/s       63.1%            Less than 75%   1 hour and 17 minutes

Figure 16 shows the response time for each storage group of the Gold SLO level DSS instances on the VMAX3 array. The SLO target was to keep the response time within a range of 3 to 12 ms for the data warehousing databases.

Figure 16. Gold SLO level DSS response times
As Figure 16 shows, for most of the test period the response time of each storage group was kept at around 5 to 8 milliseconds, fully monitored via the new VMAX3 SLO feature. At the eighth hour of the validation test, the response time for Gold DSS 1 hit 12.3 ms, which exceeded the 12 ms threshold. VMAX3 automatically discovered this occurrence and rescheduled resources to keep the response time within the SLO target range. The response time dropped immediately in the next hour, without affecting the overall performance of the OLTP or DSS workloads.

VMAX3 system performance

We monitored the status of both the front end and back end of the VMAX3 array throughout the mixed workload test. We designed the entire system to provide full and balanced utilization of VMAX3. The front-end CPU utilization was less than 15 percent, while the back-end CPU utilization was close to 20 percent. This demonstrates that VMAX3 is capable of handling an even heavier workload.

In this solution, we equipped the VMAX3 array with a usable set of disks: 64 flash and 160 10K SAS. The average disk utilization was over 90 percent for the flash disks and 85 percent for the SAS disks, as shown in Figure 17. This confirms that the available disks were used at close to maximum efficiency while the SLOs were maintained.

Figure 17. Disk heat map

Note: In Figure 17, the disks shown in black are hot spares and NL-SAS drives, which were not used.
Figure 18 and Figure 19 show the detailed disk utilizations.

Figure 18. SAS disk utilization

Figure 19. Flash disk utilization
Based on the results shown in this section, we confirmed the following:

- The front-end and back-end directors achieved balanced performance.
- The VMAX3 configuration used in this solution was not stressed by our workload and still had enough headroom to support a heavier workload.
- We achieved favorable performance with an efficient disk configuration that was almost fully utilized.

TimeFinder SnapVX test results

Overview

This section describes the results of the performance testing with snapshot creation, and of the recovery testing.

Performance testing with snapshot creation

In this solution, we selected one OLTP storage group based on the Gold SLO level, which contained four databases (1 TB, 500 GB, 250 GB, and 50 GB). We applied a standard modern OLTP workload to each database as the baseline. When the baseline performance entered a stable state, we enabled a scheduled job in Unisphere for VMAX3 to create snapshots of this storage group every hour. This simulated a multiple-database repurposing scenario (an equivalent command-line sketch follows Figure 20). We set the maximum number of snapshots to eight. We measured the performance throughout the test.

During snapshot creation, we measured the performance impact using the following metrics:

- Throughput in IOPS
- Throughput in TPS
- SLO response time (ms)

Figure 20 shows the results of the snapshot creation test.

Figure 20. Snapshot creation test results
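As noted above, the hourly snapshots in this test were scheduled through Unisphere for VMAX3. An equivalent Solutions Enabler sketch is shown below, assuming SE 8.x symsnapvx syntax; the SID and the parent storage group name SQL_OLTP_Gold are placeholders:

    # Create a point-in-time snapshot of every device in the storage group
    symsnapvx -sid 0123 -sg SQL_OLTP_Gold -name HourlySnap establish

    # List the snapshots and their generations for the storage group
    symsnapvx -sid 0123 -sg SQL_OLTP_Gold list

Each establish creates a new generation under the same snapshot name, so a scheduled job can reuse one name for the hourly cycle.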
For the baseline, we achieved 29,898 IOPS and 2,861 TPS while keeping host latency below 4 to 5 ms for all four databases. With hourly snapshots being created, the performance was maintained at about 30,000 IOPS and 2,900 TPS for the duration of the entire test, while host latency was still kept within 4 to 5 ms. As the number of snapshots increased, performance remained unaffected. The average snapshot creation time was within five seconds for the four databases, 1.8 TB in total size.

Figure 21 shows the SLO response time results throughout the eight hours of testing. The latency was kept within 2.0 to 2.2 ms, without performance degradation.

Figure 21. SLO response times for test period

From these test results, we came to the following conclusions for the snapshot validation:

- SnapVX has zero performance impact on SQL Server OLTP workloads.
- The number of snapshots does not affect the performance of the SQL Server databases.
- A SnapVX snapshot can be created instantly and is ready to use in seconds.

Recovery testing

The recovery test was performed from the previously created snapshots. We mounted the oldest snapshot of the 1 TB database, which had the largest amount of data changes, to a second host and monitored the time consumed. We opted to link the target in NoCopy mode to be as space-efficient as possible.

Table 15 shows the test results for creating snapshot links to target devices and mounting them to the second host.
Table 15. Test results for snapshot recovery

Task                                                         Recovery time          Total database size   Data changes
Snapshot link to target in NoCopy mode and mount to host     2 minutes 54 seconds   1 TB                  397 GB

Notes:

- NoCopy mode creates a temporary, space-saving snapshot of only the changed data in the snapshot's storage resource pool. Target volumes linked in this mode do not retain data after the links are removed. This is the default mode.
- Copy mode creates a permanent, full-volume copy of the data in the target volume's storage resource pool. Target volumes linked in this mode retain data after the links are removed.

During testing, the VMAX3 snapshot was linked to the target devices in NoCopy mode and mounted to a host with a SQL Server instance installed. The results show that the database, with 1 TB of total database size and 397 GB of data changes, can be recovered and mounted to the host in under three minutes.
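The link-and-mount step summarized in Table 15 can also be scripted. This is a hedged Solutions Enabler sketch, assuming SE 8.x syntax; the SID, the snapshot name, and the target storage group SQL_OLTP_Mount (masked to the second host) are placeholders:

    # Link the snapshot to target devices in the default NoCopy mode; add -copy for a permanent full copy
    symsnapvx -sid 0123 -sg SQL_OLTP_Gold -lnsg SQL_OLTP_Mount -snapshot_name HourlySnap link

    # If supported by your Solutions Enabler version, point the same targets at a different generation later
    symsnapvx -sid 0123 -sg SQL_OLTP_Gold -lnsg SQL_OLTP_Mount -snapshot_name HourlySnap relink -generation 2

After the link completes, rescan storage on the second host, bring the disks online, and attach the SQL Server databases.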
Chapter 8 Conclusion

This chapter presents the following topics:
Summary... 53
Findings... 53
Summary
This EMC solution enables you to provision Microsoft SQL Server 2014 database storage on the EMC VMAX3 array simply and efficiently through the VMAX3 SLO feature. With EMC FAST and SLO management, you can provision and modify storage resources as business priorities, performance needs, and workload characteristics change over time.

This solution was validated to show that you can use VMAX3 to service both OLTP and DSS workloads while maintaining the desired service level of the predefined SLO. At the same time, the SLO feature lets you monitor and automate quality of service for both workload types, maintaining application response times.

You can use EMC TimeFinder SnapVX local replication technology to create snapshots of existing SQL Server systems for point-in-time database backup and fast recovery, meeting database repurposing needs for test, development, reporting, analytics, and so on.

Findings
The key findings of this solution are:
Through the ease of SLO-based provisioning, you can deploy SQL Server 2014 on VMAX3 without extensive planning and configuration at the storage level.
VMAX3 service levels combined with FAST technology can dynamically allocate resources to the SQL Server database to meet performance demands and keep application response times low.
The VMAX3 configuration used in this solution, with an efficient disk configuration, surpassed the solution performance objectives while maintaining the requested response times for each SLO level used.
SnapVX supports point-in-time snapshot copies and fast recovery, and provides efficient snapshot management and application protection with minimal performance impact.
Chapter 9 References

This chapter presents the following topics:
References... 55
References

EMC documentation
The following documents, available on the EMC.com website, provide additional and relevant information. If you do not have access to a document, contact your EMC representative.
EMC VMAX3 Family VMAX 100K, 200K, 400K Specification Sheet
EMC VMAX3 Family Data Sheet
EMC VMAX3 Family Software Suite Data Sheet
EMC VMAX3 Local Replication Technical Notes
Optimize Microsoft SQL Server on EMC VMAX3 White Paper
Microsoft SQL Server Best Practices and Design Guidelines for EMC Storage White Paper

Other documentation
For additional information, see the following documents and topics:
Books Online for SQL Server 2014 (Microsoft TechNet)
Changing the queue depth for QLogic, Emulex and Brocade HBAs (VMware KB topic 1267)
Large-scale workloads with intensive I/O patterns might require queue depths significantly greater than Paravirtual SCSI default values (VMware KB topic 2053145)
Setting the Maximum Outstanding Disk Requests for virtual machines (VMware KB topic 1268)
Appendix A Optimizing the Storage I/O Queue

This appendix presents the following topics:
Optimizing the storage I/O queue for VMware vSphere 5.5... 57
Optimizing the storage I/O queue for VMware vSphere 5.5
This appendix describes the method for optimizing the storage I/O queue in the VMware vSphere 5.5 environments used in this solution.

Optimizing the World Queue
The World Queue depth can be optimized by increasing the SCSI controller queue depth inside the Windows virtual machine. In this solution, we set the number of pages used by the controller for the request ring to 32 and increased the queue depth to 254.

To modify the number of pages and the queue depth:
1. On the Windows virtual machine, run the following command from the command line:
REG ADD HKLM\SYSTEM\CurrentControlSet\services\pvscsi\Parameters\Device /v DriverParameter /t REG_SZ /d "RequestRingPages=32,MaxQueueDepth=254"
Note: The changes apply to the VMware Paravirtual SCSI controller to service I/O patterns that require queue depths significantly greater than the default values (8 request ring pages and a queue depth of 64). For more details, refer to the VMware KB topic Large-scale workloads with intensive I/O patterns might require queue depths significantly greater than Paravirtual SCSI default values (2053145).
2. Reboot the virtual machine.
3. Confirm that the registry entry for the parameters was created by navigating to the following path in the registry editor:
HKLM\SYSTEM\CurrentControlSet\services\pvscsi\Parameters\Device

Optimizing the Adapter Queue
The HBA adapter queue configuration varies by manufacturer, and the manufacturer's default values may not be adequate for mission-critical OLTP and DSS workloads. In this solution, we adjusted the queue depth for the HBAs on the ESXi host on which the Windows virtual machines run. We also modified the default HBA throttle settings to improve handling of I/O spikes. For all other parameters, we used the default settings.

To change the HBA queue depth on the ESXi 5.5 host:
1. Create an SSH session to connect to the ESXi host.
2. Verify which HBA module is in use:
# esxcli system module list | grep qln
3. Set the ql2xmaxqdepth parameter to 256:
# esxcli system module parameters set -p ql2xmaxqdepth=256 -m qlnativefc
4. Reboot the ESXi host for the changes to take effect.
5. Run the following command to confirm your settings, where driver is the name of the HBA driver module (qlnativefc in this solution):
# esxcli system module parameters list -m driver
Note: These steps apply to QLogic native HBA drivers. For information about configuring the HBA queue depth for HBA drivers from other manufacturers, refer to the VMware KB topic Changing the queue depth for QLogic, Emulex and Brocade HBAs (1267).

The following steps are an example of how to modify the HBA I/O throttle settings. For specific hosts, contact the server or HBA manufacturer for more details.
1. In the server management console, under Server, select Inventory, and then select Cisco VIC adapters.
2. Navigate to vHBA Properties.
3. Set I/O Throttle Count to 1024, as shown in Figure 22.
Figure 22. Modifying the HBA I/O throttle count
4. Click Save Changes.

Optimizing the Device Queue
For the devices presented to the virtual machine, increase the maximum number of outstanding disk requests to improve performance for disk-intensive workloads.

To change the device queue depth:
1. Create an SSH session to the ESXi host.
2. Validate the current value for the device by running the following command, where naa.id is the device identifier:
# esxcli storage core device list -d naa.id
Note: The value is displayed under No. of outstanding I/Os with competing worlds; the default value is 32.
3. To set the maximum number of outstanding disk requests to 256, modify the Disk.SchedNumReqOutstanding parameter by running the following command:
# esxcli storage core device set -d naa.id -O 256
Note: EMC recommends that you apply these steps to devices in I/O-intensive environments, especially disks that service very heavy IOPS or bandwidth. For more information, refer to the VMware KB topic Setting the Maximum Outstanding Disk Requests for virtual machines (1268).
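If the same limit must be applied to many devices, the per-device command can be scripted in the ESXi shell. The following loop is a sketch only: it assumes that every naa-named device reported by the host should receive the new limit, so in practice you would filter the list down to the VMAX3 devices that back SQL Server:
# Set the outstanding-request limit to 256 on each naa device.
# Sketch: restrict the device list to your VMAX3 devices before running.
for dev in $(esxcli storage core device list | grep -E '^naa\.'); do
    esxcli storage core device set -d "$dev" -O 256
done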