White Paper
October 2014

Introducing memory extensions from Microsoft's* newest database product, and Intel SSD Data Center Family for PCIe*.

Order Number: 331409-001US
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such IOMeter*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance when combined with other products. According to Intel test methodologies and hardware: SQL TPC-e tests run on Dell server R720 system with Intel Xeon CPU E5-2690 v2 @ 1.90 GHz (2 processors), 64-bit Operating System, x64-based processor, 4 sockets, 24 cores, 48 logical processors, Hyper-V support, 128 GB RAM, Microsoft* Windows* Server 2012 R2 64 bit O/S and SQL Server 2014. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." 
Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Material in this presentation is intended as product positioning and not approved end user messaging. Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate. *Other names and brands may be claimed as the property of others. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries. Copyright 2014 Intel Corporation. All rights reserved.
Contents

Executive Summary
Microsoft's New Memory Caching Tier: Buffer Pool Extension
Benefits of Buffer Pool Extension (BPE)
New Capabilities with Intel SSD Family for PCIe*
Configurations
Objectives
Test Definitions
Cost for Performance
Continuous Availability
User Experience
Conclusion
Tables and Resources
Intel SSD Data Center Total Cost of Ownership (TCO) Calculator
Summary Data of Test
Revision History

Revision | Description | Revision Date
1.0 | Initial release | October 2014
Executive Summary

We all know the saying "You get what you pay for." What we don't often see is the hidden cost of an infrastructure solution: how much of what you pay for you actually use. The premise of virtualization is to use more of your powerful Intel Solid State Drives (SSDs).

For many years, database platforms have been limited by millisecond storage systems. In-memory databases have been developed to help bridge this gap, but cost and capacity are clear limiters of an all-DRAM approach. The near-DRAM performance (often in the low microseconds) of high-performance data center SSDs can help satisfy current user experience pressures and alleviate the nuisance of constrained DRAM.

Inconsistent database experiences have been exacerbated by data growth, expensive queries, and maintenance routines, all of which add stress to the slow storage of the widely implemented Storage Area Networks (SANs) that many enterprises rely on today. These inefficiencies have fostered the advent of in-memory options, or pure in-memory databases, that use direct-attached storage instead of network storage (no SAN). New storage options have been a tipping point in today's evolving data architectures.

Augmenting DRAM with Buffer Pool Extension (BPE) is easy and cost effective, especially if your environment is locked into a Hard Disk Drive (HDD) storage solution. However, you can use your entire investment more wisely, and to the user's benefit, by complementing DRAM with storage close in capability to high-performance SSDs, where there are hidden benefits of space, heat, reliability, and power.

Intel SSD benefits for your Data Center costs are:
- Better utilization of server systems
- Consistency when peak workload events occur
- Improved user experience

This white paper focuses on the new breed of storage opportunities for the SQL Server market: all-flash Non-Volatile Memory (NVM) and DRAM augmentation, which are so widespread in Microsoft's* enterprise data center ecosystem.
This covers the DRAM augmentation (BPE) capabilities of the Intel SSD Data Center Family for PCIe, to be certified by all Intel OEMs in early 2015. This white paper also outlines test results and economic benefits of SQL Server 2014 running on Intel SSDs using SATA-based connections.
Microsoft's New Memory Caching Tier: Buffer Pool Extension

With the release of Microsoft* SQL Server 2014, the core Buffer Manager includes a new tier of memory that handles the SQL Server memory runtime and the interaction between storage and memory. This feature is called Buffer Pool Extension (BPE). BPE is tightly coupled into the SQL Buffer Manager and focuses on using an SSD to augment the DRAM utilized for SQL Memory.

Over time, database usage tends to grow. As the database grows, the active working set can overflow beyond DRAM and onto the disks, which becomes a nagging performance and scalability concern. Microsoft* designed BPE as a way to extend a system's DRAM onto NVM as a cost-effective memory extension. As shown in Figure 1, there are two tiers of memory, denoted L1 and L2, from which queries are serviced.

Figure 1: Buffer Pool Extension within SQL Server Architecture

Updated (or dirty) SQL pages that need to be flushed to the storage layer are NOT added to BPE. The design goal is to improve read performance and add scale-up capability when greater query pressure is exerted on the database, such as during a periodic usage spike. Most database workloads are read oriented, hence the focus on a read extension to the Buffer Pool. The Buffer Pool and the BPE are always rebuilt on a database restart, so there is no need for a RAID setup to protect the data. Your storage system is still the persistence layer of your database, and that is where data protection must be designed.

A common question regarding extending the Buffer Pool is "What ratio of DRAM to NVM should I use?" Microsoft* recommends a BPE size in the range of 4x to 16x the amount of SQL Memory Buffer Pool configured. This is a guideline. Intel decided to use 14x of 100GB, which translates to 1.4TB utilized of a 1.6TB Intel SSD P3700 Series, PCIe-based SSD.
One approach is to consolidate several databases onto one server and service many databases from a single PCIe* SSD featuring NVMe*. In Intel's testing, we observed no bottleneck on the PCIe* SSD's total space, as we were testing only one large database.
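The sizing arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a Microsoft or Intel tool: the function name `bpe_size_gb` and its parameters are hypothetical, the 4x to 16x bounds are the guideline quoted in this paper, and capacities are treated as decimal gigabytes (so 1.6TB is taken as 1,600GB).

```python
def bpe_size_gb(buffer_pool_gb, ratio, device_capacity_gb,
                min_ratio=4, max_ratio=16):
    """Return the Buffer Pool Extension size in GB.

    buffer_pool_gb     -- DRAM configured for the SQL Memory Buffer Pool
    ratio              -- BPE size as a multiple of the Buffer Pool
    device_capacity_gb -- capacity of the SSD that will hold the BPE file
    min_ratio/max_ratio -- guideline range quoted in this paper (assumed bounds)
    """
    if not min_ratio <= ratio <= max_ratio:
        raise ValueError(
            f"ratio {ratio}x is outside the {min_ratio}x-{max_ratio}x guideline")
    size_gb = buffer_pool_gb * ratio
    if size_gb > device_capacity_gb:
        raise ValueError(
            f"BPE of {size_gb} GB does not fit on a {device_capacity_gb} GB device")
    return size_gb


# Configuration described in this paper: 100GB Buffer Pool, 14x, 1.6TB SSD.
size = bpe_size_gb(buffer_pool_gb=100, ratio=14, device_capacity_gb=1600)
print(f"BPE size: {size} GB; unused device capacity: {1600 - size} GB")
```

With the values from this paper the sketch reports a 1,400GB extension, leaving 200GB of the 1.6TB device unused. The same check can be repeated per database when several databases share one device.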
Benefits of Buffer Pool Extension (BPE)

The benefits of BPE are described in more depth within the Microsoft documentation. Go to: http://msdn.microsoft.com/en-us/library/dn133176.aspx

In summary, these benefits are:
- Increased random I/O throughput
- Reduced I/O latency
- Increased transaction throughput
- Improved read performance with a larger hybrid buffer pool
- Caching architecture that can take advantage of present and future low-cost NVM drives

New Capabilities with Intel SSD Family for PCIe*

Intel SSD P3700 Series featuring NVMe* is a strong candidate for Buffer Pool Extension (BPE) for several reasons. The following describes Microsoft's summary of benefits and how PCIe* SSDs are architected to deliver on these goals:

Random I/O Performance

Randomizing for SSDs is easy, as the SSD controller runs on the advanced, high-bandwidth PCIe* bus controlled within the CPU, where PCIe* has its core. SSD controllers have become multi-core and offer many channels into the memory. Intel offers a multi-queuing NVMe* device driver for Microsoft Windows platforms, along with tools for administration of the drive, making SSDs not only more parallel but also easier to manage. Drivers and standards have been built for Flash-based NVM and for the NVM of tomorrow. For access to Intel's tools and drivers for NVMe* featured SSDs, see the Tables and Resources section.

Reduced I/O Latency

Latency is key. When bursts or high production activity occur, queuing to a hard drive data volume makes I/O latency much worse. Intel SSDs featuring NVMe* typically provide a latency of under 200 microseconds (0.2 milliseconds) across a wide range of queue depths, because the device is designed to scale well at a variety of queue depths. Low latency, even at queue depths up to 128, is where the random performance of the Intel SSD Data Center P3700 Series stands out. Most vendors claim very large specification numbers, such as IOPS, on their SSDs.
However, performance at low queue depths is what matters in your production environment. Intel achieves both; there are no compromises. (Reference, Intel SSD DC P3700 Series review by Anandtech.com: http://www.anandtech.com/show/8104/intel-ssd-dc-p3700-review-the-pcie-ssd-transition-begins-with-nvme)

Increased Transactional Throughput

Without greater scale and more transactions, why focus on adding to DRAM? This white paper showcases how transaction rates went up by adding a BPE on an SSD to the Buffer Pool.

Improved Read Performance

In this study, we focused on ensuring that the BPE feature raises the Buffer cache hit ratio and keeps user transaction performance as good as, or close to, having all DRAM for the Buffer Pool.

Caching Architecture

Most of today's modern enterprise database platforms, such as Microsoft SQL Server, provide a multi-tier caching architecture. The tipping point is the combination of modern SSDs and tightly integrated software tiering features, delivered together at a favorable cost-to-performance ratio. In addition to SQL Memory tiering, a better storage system is still important in achieving greater scale, even after you have achieved a larger Buffer Pool and are servicing more transactions. Your benchmark goal is to be bottlenecked on the CPU, as this is your fastest and most expensive resource. For many years, only better storage tiers based on SSDs have been able to maximize the CPU investment in an Intel Xeon processor. As Intel works on dramatic NVM improvements towards very low microsecond operations, this trend towards tiering I/O operations for both server runtime (application or SQL memory) and storage performance is critical for a consistent user experience and for maximizing investments.
Configurations

The following tables detail system, storage, and client configurations for servers under test.

System Configuration (Server under Test)

System Details | Dell R720 Server
System | Dual socket, rack mounted server system
CPU Model used | 2 each - Intel Xeon CPU E5-2690 v2 @ 3.00GHz, 10 Core(s), 20 Logical Processor(s)
DDR3 DRAM Memory | 128GB installed / 122GB available for applications
BIOS Version | Dell* 2.2.2, 1/16/2014
Network Adapters | Intel Ethernet 1G X520
Storage Adapters | PERC H810 RAID Adapter for external JBOD, 1GB NV Cache; PERC H710P RAID Adapter for internal storage, 1GB NV Cache
Internal Drives and Volumes | C:\ OS Volume: 2 each 15k HDD (RAID 1), PERC 710 Mini attached; LOGS Volume: 6 each 15k HDD (RAID 10), PERC 710 Mini attached

Storage Configuration (Server under Test)

System Details | Dell MD1220
Storage | 2.5 inch SFF, 24 bay Storage Enclosure
Connections | Dual 6Gb/sec SAS connections
HDD | 24 each 300GB, 15k RPM SAS Hard Disk Drives (HDD)
SSD | 4 each Intel Solid-State Drive Data Center S3500 Series - 800GB
External Volumes | 4 each SATA 800GB Intel SSD DC S3500 Series (RAID 5), 2.23TB usable, PERC 810 attached; 24 each SAS 300GB 15k Enterprise HDDs (RAID 5), 6.4TB usable, PERC 810 attached

Client Configuration (Units Driving Test)

System Details | Dell R620 Server
System | Two each - dual socket, rack mounted server system
CPU Model used | 2 each - Intel Xeon CPU E5-2690 v2 @ 3.00GHz, 10 Core(s), 20 Logical Processor(s)
DDR3 DRAM Memory | 128GB installed / 122GB available for applications
BIOS Version | Dell* 2.2.2, 1/16/2014
Network Adapters | Intel (R) Ethernet 1G X520
Storage Adapters | PERC H710P with RAID 1 on two 300GB 15K HDD (C: Volume)
Objectives

Test Definitions

Intel ran standardized benchmark testing on user interactions with a retail securities trading application. The workload was approximately 90% reads and 10% writes, with scenarios that involve making and updating trades. This test design is common, in that most high-volume websites serve a large amount of read activity from a broad user base. Three tests were built for comparison.

1. Baseline Test
The baseline test uses 24 units of 300GB 2.5 inch small form factor HDDs to build and test a 1.2TB database. The goal was to get the active portion of this database to exceed the 100GB of DRAM available for the Buffer Pool, making the database I/O centric, and so to test the value of different storage options for augmenting a system running low on DRAM, which can become a common occurrence over time.

2. Test A
The second test, Test A, does not change the core system hardware configuration, except for the use of one Intel SSD for the Data Center, which houses the extra database pages that would not fit in 100GB of DRAM. Essentially, we turned on Buffer Pool Extension in Microsoft's SQL Server 2014 to test how effectively it could improve performance at a fraction of the DRAM cost.

3. Test B
The third test, Test B, moves the exact baseline configuration from the HDDs onto a set of four 800GB Intel SSD Data Center Family products, freeing up slots for other work such as data integrations, high availability, or added databases.

Cost for Performance

As an application owner, you should expect a system that can become processor constrained. This will only limit you after the CPU threads are all saturated doing real user-facing work. The following table shows that SSDs provide the most transactional output for the dual socket Dell R720 server. In going from an I/O device constrained test to one that was comfortably CPU constrained at approximately 80% utilization, SSDs made the storage of the database a non-constraining component.
Transactions Per Second (TPS) Achieved by Test Results

Test | TPS | CPU%
Baseline (HDD) | 795 | 8%
Buffer Pool Extension (Test-A) | 1367 | 14%
Intel SSD DC S3500 Series (Test-B) | 10,590 | 78%

The Baseline test is an HDD-only system: the database is approximately 1.2TB in total size and the memory provided to the database Buffer Manager is 100GB. In the Baseline (HDD) test, the main detractor from scaling is that the 24 HDDs supporting the database data files are maxing out at high queue depths, which limits the number of random I/O operations per second (IOPS). HDDs start maxing out at approximately 200 IOPS per device. This is the common limitation of performing randomized reads across a hard disk drive spindle. Even the limits of a high performance 15K HDD are quite low.
To compensate for this limit, designers have used an excessive number of HDDs, with ever greater power, heat and space requirements. In our test, we filled the enclosure with 24 Hard Disk Drives (HDDs) in total. Our average I/O rate with the 24 HDDs was 9,005 IOPS (reads plus writes, per Summary Data of Test) at a combined queue depth of 316. We are not too concerned with the database logging and the write I/O characteristics, as this is more a test of the scalability of a read (query) intensive system. This workload sets in motion a set of screen transitions on a securities trading company's customer website, involving activities such as account lookups and trading securities.

Even at 9,000 IOPS, we see a Disk Queue Length of over 300 requests at this load level. Most likely a production system would never run at this level. In fact, very low queue depths must be maintained on HDDs to maintain consistent performance. As we stress the system more, the IOPS issues only become more apparent. In the Intel SSD DC S3500 Series test, where we can utilize the CPUs up to around 80% with excellent transaction times, the read IOPS of the 4 SSDs is 104,716. Measured against the 8,269 reads per second of the baseline in Summary Data of Test, this is roughly 12.7x the read IOPS from only 4 drives in a RAID 5 configuration, compared to 24 HDDs running at their maximum. SSDs are not 6x the cost of HDDs, and they are getting closer to HDD costs every day. Plus, with the SSDs, you are able to utilize your entire investment. You should always consider performance on a total cost basis, and ask how far you can take your system with well-balanced devices and components working together harmoniously, such as Intel Xeon Processors and Intel SSDs.

System cost may vary for each system. However, the primary differences are the single Intel SSD added in System #2, and the differential of $533 (suggested OEM cost) for swapping 24 HDDs with 4 SSDs between System #1 and #3.
Cost of Systems

System | Total System | Hardware Details
#1-Baseline (HDD) | $28,311 | System + JBOD + 24-15K HDD
#2-Buffer Pool Extension (Test-A) | $32,151 | Adding one $3840 Intel SSD DC P3700 Series 1.6TB
#3-Intel SSD DC S3500 Series (Test-B) | $28,844 | Swapping HDD for SSD, costs $533.

Cost of Performance

Test System | Cost Per Transaction | CPU Percentage Leveraged
Baseline (HDD) | $35.61 | 8%
Buffer Pool Extension (Test-A) | $23.52 | 14%
Intel SSD DC S3500 Series (Test-B) | $2.72 | 78%
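The Cost Per Transaction figures above match total system cost divided by the achieved transactions per second. The short Python sketch below reproduces that arithmetic from the Cost of Systems and TPS tables; the dictionary layout and the function name `cost_per_tps` are illustrative choices, not part of the original test tooling.

```python
# (total system cost in USD, achieved TPS), taken from the tables in this paper
SYSTEMS = {
    "Baseline (HDD)": (28311, 795),
    "Buffer Pool Extension (Test-A)": (32151, 1367),
    "Intel SSD DC S3500 Series (Test-B)": (28844, 10590),
}


def cost_per_tps(system_cost_usd, tps):
    """Hardware cost divided by sustained transactions per second."""
    return system_cost_usd / tps


for name, (cost, tps) in SYSTEMS.items():
    print(f"{name}: ${cost_per_tps(cost, tps):.2f} per transaction per second")
```

Rounded to cents, the three results agree with the Cost of Performance table, which makes it easy to rerun the comparison with your own system prices.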
Power Specifications per Drive

Drive | Idle Power | Typical Operating Power
Intel SSD DC S3500 Series (SATA) | 0.6 watts | 5 watts
Intel SSD DC P3700 Series (PCIe) | 7 watts | 25 watts
Dell* 15,000 RPM 2.5 inch SFF HDD (1) | NA | 8.68 watts

(1) Reference: http://www.seagate.com/internal-hard-drives/enterprise-hard-drives/hdd/enterprise-performance-15k-hdd/#specs

With Intel SSDs as your storage solution, you can create a scalable system that supports expansion for databases or additional virtual machine instances and delivers a more consistent user experience. Or you can eliminate the 2U of rack space and the significant cost of the external enclosure, cabling and extra HBA cards, and build a more compact, single-enclosure database system. Use the Intel Database Total Cost of Ownership (TCO) calculator for a complete comparison of SSD cost savings against HDDs.

Continuous Availability

Providing continuous or high availability is outside the scope of this paper. However, BPE is agnostic to the choice you make with SQL Server 2014. Microsoft Failover Clustering solutions on shared storage, or AlwaysOn Availability Groups with system-to-system replication, can integrate a BPE SSD. Because BPE is a cache and not a storage tier, you do not need to protect it with RAID as you would a storage tier. For more information on Continuous Availability from Microsoft*, see Tables and Resources.

User Experience

The advent of the web-oriented internet has been challenging. The user experience is often a serial chain of events that load resources into a web browser. At some point, the database transaction time is a component of the user experience. As a general rule, you want to keep transactions below 400 milliseconds. Some groups have even tougher restrictions; with SSDs it is possible to set a limit of 100 milliseconds for database queries, even when the system is strained (as shown in the tests).
Intel's test team chose 78% CPU utilization to give a more realistic view of what could be done with Intel SSDs in production. The only configuration in which all transactions ran well below 500 milliseconds (0.50 seconds) is the Intel SSD DC S3500 Series based test. The BPE option not only improved the cost of the system on a per-transaction basis, it also improved the user experience numbers for the more expensive transactions. For your business, understanding what SSDs running at gigabyte transfer rates are capable of is important to your users as well as to your bottom-line system costs.

Transaction Performance in Seconds

Measure | Baseline (HDD) | Buffer Pool Extension | Intel SSD DC S3500 Series
Time ranges in seconds | 0.01 - 1.55 | 0.01 - 0.85 | 0.01 - 0.12
All averages below 0.500 seconds | No | No | Yes

The following graph plots all 10 transactions of the test. It shows the bad-case situation at the 90th percentile rather than the average, which often hides this tier of the slowest transactions measured by Intel in the test.
[Figure: User Experience - 90th Percentile by Time of Transaction (lower is better). Series: HDD, BPE, SSD. Horizontal axis: transactions 1 to 10. Vertical axis: 0.50 to 2.50.]

User experience is much better with an all Intel SSD configuration, and will be well below 2 seconds. These results are without global network latency or client (browser) processing time.

Conclusion

A more consistent user experience is the one thing your customers can truly feel from all your Data Center work. Depending on its underlying complexity, a SQL query can sometimes take up to 20 seconds to update. Users will feel that as the slowest portion of your site. Latency is why the industry is working on better cost equations and on a new layer of memory, built with NVM and industry standard SSDs, to improve the overall user experience. Intel and Microsoft now give you industry standard, stable, mature and lower cost components that can achieve more work in less time and provide a completely integrated approach.
Tables and Resources

- Intel SSDs website: http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-ssd.html
- P3700 review by Anandtech.com: http://www.anandtech.com/show/8104/intel-ssd-dc-p3700-review-the-pcie-ssd-transition-begins-with-nvme
- Microsoft Buffer Pool Extension resource page: http://msdn.microsoft.com/en-us/library/dn133176.aspx
- Microsoft SQL Server continuous availability options: http://msdn.microsoft.com/en-us/library/ms190202.aspx
- Caching a database file system utilizing Intel Cache Acceleration software: http://www.principledtechnologies.com/intel/cache_acceleration_software_0214.pdf

Intel SSD Data Center Total Cost of Ownership (TCO) Calculator

Discover the potential cost savings of deploying Intel SSDs in a data center environment. Determine the number of SSDs your environment would require to match the performance and capacity of your current hard disk drive (HDD) configuration. Simply input your information into the calculator and see the results to evaluate your savings and suggested Intel SSD requirements. Go to Intel.com to access the tool: http://www.intel.com/content/www/us/en/solid-state-drives/ssd-data-center-tco-calculator-tool.html
Summary Data of Test

Counter | BPE Off (HDD only) | BPE On | DC S3500 (No BPE) | Delta (BPE On/Off) | Delta (S3500/HDD)
Transaction Results (txn/sec) | 795.12 | 1366.85 | 10,589.80 | 71.9% | 1231.8%
CPU Total % Processor Time | 8.09 | 13.81 | 78.31 | 70.8% | 868.4%
Network Bytes Total/sec | 4,090,565 | 7,115,499 | 55,764,027 | 73.9% | 1263.2%

LOG
Avg. Disk Write Queue Length | 0.01 | 0.02 | 0.09 | 44.2% | 554.2%
Avg. Disk sec/write | 0.00010 | 0.00009 | 0.00007 | -7.4% | -33.1%
Disk Writes/sec | 135.76 | 212.07 | 1,339.59 | 56.2% | 886.7%

DATA (24xHDDs or 4xS3500 SSDs)
Avg. Disk Read Queue Length | 240.72 | 235.24 | 147.91 | -2.3% | -38.6%
Avg. Disk Write Queue Length | 75.41 | 172.88 | 32.86 | 129.3% | -56.4%
Avg. Disk sec/read | 0.0291 | 0.0417 | 0.0014 | 43.0% | -95.1%
Avg. Disk sec/write | 0.1024 | 0.1360 | 0.0035 | 32.8% | -96.6%
Disk Reads/sec | 8,268.55 | 5,810.40 | 104,716.35 | -29.7% | 1166.4%
Disk Writes/sec | 736.52 | 1,270.23 | 9,358.51 | 72.5% | 1170.6%

BPE (FD)
Avg. Disk Read Queue Length | - | 1.45 | - | n/a | n/a
Avg. Disk Write Queue Length | - | 0.02 | - | n/a | n/a
Avg. Disk sec/read | - | 0.00014 | - | n/a | n/a
Avg. Disk sec/write | - | 0.00003 | - | n/a | n/a
Disk Reads/sec | - | 10,209.87 | - | n/a | n/a
Disk Writes/sec | - | 828.49 | - | n/a | n/a

SQL Buffer Manager
Buffer cache hit ratio | 97 | 99 | 98 | 1.5% | 0.5%
Database pages | 14,823,710 | 78,045,330 | 14,817,532 | 426.5% | 0.0%
Page reads/sec | 8,266 | 5,810 | 104,714 | -29.7% | 1166.9%
Page writes/sec | 735 | 1,268 | 9,330 | 72.6% | 1170.0%
Extension page writes/sec | - | 1,089 | | n/a | n/a
Extension page reads/sec | - | 10,223 | | n/a | n/a
Extension page evictions/sec | - | - | | n/a | n/a
Extension allocated pages | - | 72,618,863 | | n/a | n/a
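One way to sanity-check the read counters above is Little's Law: the average number of outstanding requests is approximately the request rate multiplied by the average time per request. The original test report does not invoke Little's Law; it is applied here as an assumption, and the names `READ_COUNTERS`, `expected_queue_length` and `relative_error` are hypothetical. The sketch multiplies Disk Reads/sec by Avg. Disk sec/read and compares the product with the reported Avg. Disk Read Queue Length.

```python
# name: (Disk Reads/sec, Avg. Disk sec/read, reported Avg. Disk Read Queue Length)
# All values are copied from the Summary Data of Test table.
READ_COUNTERS = {
    "DATA, BPE Off (HDD only)": (8268.55, 0.0291, 240.72),
    "DATA, BPE On": (5810.40, 0.0417, 235.24),
    "DATA, DC S3500 (No BPE)": (104716.35, 0.0014, 147.91),
    "BPE device, BPE On": (10209.87, 0.00014, 1.45),
}


def expected_queue_length(ops_per_sec, sec_per_op):
    """Little's Law estimate: outstanding requests = rate x time per request."""
    return ops_per_sec * sec_per_op


def relative_error(estimate, reported):
    """Absolute difference between estimate and report, relative to the report."""
    return abs(estimate - reported) / reported


for name, (rate, latency, reported) in READ_COUNTERS.items():
    estimate = expected_queue_length(rate, latency)
    print(f"{name}: estimated {estimate:.2f}, reported {reported:.2f}, "
          f"error {relative_error(estimate, reported):.1%}")
```

The estimates land within a few percent of the reported queue lengths for all four rows, which suggests the latency, throughput, and queue-depth columns are internally consistent. The latency values are rounded in the table, so small differences are expected.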