Storage Performance Testing Woody Hutsell, Texas Memory Systems
SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies and individuals may use this material in presentations and literature under the following conditions: any slide or slides used must be reproduced without modification, and the SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations. This presentation is a project of the SNIA Education Committee. 2
Abstract Storage Performance Testing Conducting storage performance tests is essential to selecting storage for tiered storage environments. Some applications require endless hours of constant data acquisition, while others experience peak bursts of small block I/O. The best storage device for one application is almost never the right storage device for another. This session will provide an in-depth technical discussion of storage performance testing. 3
Agenda Topics Why test storage? Types of storage performance testing: Benchmarking. Application simulation. Application testing. Production testing. Common mistakes. 4
Why Test Storage? Because it matters to your users and customers: slow storage performance means slow response times and long-running queries. Because it affects your batch window: slow storage can mean longer batch or backup windows, causing lower application availability or longer maintenance windows. Because it matters to your company's profitability: slow storage can frustrate your customers and waste the investment you have made in your server infrastructure, while inappropriate use of fast storage means dollars wasted on performance you do not need. Because storage vendors do not publicize every metric relevant to your application and environment. 5
Types of Storage Performance Testing Benchmarking. Review published and audited industry benchmarks. Conduct tests with industry standard software. Conduct tests for data corruption. Application simulation. Use industry standard software to test a program with conditions similar to a target application. Application testing. Test an application with sample queries or scripts in a production-like environment. Production testing. 6
Published Benchmarks Storage Performance Council www.storageperformance.org SPC-1 simulates an on-line transaction processing (OLTP) environment; SPC-2 simulates large block sequential processing. SPEC SFS www.spec.org/sfs97r1 A good test for measuring performance of file servers and network attached storage. TPC www.tpc.org TPC-C for OLTP testing, TPC-H for decision support and TPC-W for web e-commerce. 7
Sample SPC-1 Result (chart): shows peak SPC-1 IOPS and the response time curve. 8
Benchmarking Software IOMeter The most popular tool among storage vendors; available free from www.iometer.org; primarily a Windows-based tool. IOZone Broad OS support; available free from www.iozone.org. Benchmark Factory for Databases, by Quest Software: TPC-B, TPC-C and TPC-D workloads (not for publishing results). Vendor tools. 9
IOMeter Disk Targets Tab Heuristics: one manager per server; one worker per processor. Note: if you leave the maximum disk size field at 0, IOMeter will use all available disk space, which can play a significant role in observed performance. 10
IOMeter Example Effect of Varied Outstanding I/Os (chart: IOPS vs. block size of transfer, '2k' through '256K'): with 25 outstanding I/Os the device peaks at 339,944 IOPS; with 1 outstanding I/O it peaks at 125,538 IOPS. 11
IOMeter Setting Access Specifications Test storage with small and large block transfer request sizes. Try different read/write mixtures. Try different sequential vs. random tests. Usually leave at default but can be changed to match application behavior. 12
IOMeter Example Effect of Varied Block Sizes (chart: IOPS vs. block size of transfer, '2k' through '256K'): small block size = high IOPS but relatively low bandwidth; big block size = low IOPS but relatively high bandwidth. 13
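The trade-off in this chart is just arithmetic: bandwidth equals IOPS times transfer size. A minimal sketch (the IOPS figures below are illustrative, not measurements):

```python
# bandwidth (MB/s) = IOPS * block size (KB) / 1024
def bandwidth_mb_s(iops, block_kb):
    return iops * block_kb / 1024

# illustrative numbers: small blocks win on IOPS, big blocks on MB/s
small = bandwidth_mb_s(350_000, 2)    # many small transfers
large = bandwidth_mb_s(6_000, 256)    # few big transfers
print(f"2 KB @ 350k IOPS -> {small:.0f} MB/s")
print(f"256 KB @ 6k IOPS -> {large:.0f} MB/s")
```

The same device can therefore top both an IOPS chart and a bandwidth chart, depending on which block size you test with.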
IOMeter Example Effect of Random vs. Sequential (chart: IOPS vs. block size of transfer, '2k' through '256K'): sequential transfers peak at 4,626 IOPS; random transfers peak at 1,413 IOPS. 14
IOMeter - Scripting IOMeter can be used to generate scripts. Scripts can be set up to run through a long series of data patterns and record the output to a log file. Good for overnight test runs of new products. Collect these reports and use them to compare products, or to monitor a device over time or after major configuration changes to verify its performance baseline. 15
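A scripted sweep of this kind can be sketched in a few lines; the harness below is hypothetical (`run_test` is a placeholder, not part of IOMeter), but it shows the shape of an overnight run that logs one row per parameter combination:

```python
import csv
import io
import itertools

# Test matrix: block size x read percentage x outstanding I/Os.
BLOCK_KB = [2, 4, 16, 64, 256]
READ_PCT = [100, 67, 0]
OUTSTANDING = [1, 25]

def run_test(block_kb, read_pct, outstanding):
    # Placeholder result so the sketch runs end to end; a real harness
    # would invoke the benchmark tool here and parse its output.
    return {"block_kb": block_kb, "read_pct": read_pct,
            "outstanding": outstanding, "iops": 0}

def run_sweep(out_file):
    writer = csv.DictWriter(
        out_file, fieldnames=["block_kb", "read_pct", "outstanding", "iops"])
    writer.writeheader()
    for blk, rd, q in itertools.product(BLOCK_KB, READ_PCT, OUTSTANDING):
        writer.writerow(run_test(blk, rd, q))

log = io.StringIO()          # a real run would write to a log file
run_sweep(log)
print(len(log.getvalue().splitlines()), "lines logged")
```

Rerunning the same sweep after a configuration change gives you two directly comparable logs.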
IOMeter Scripting Example (screenshot): name your script and set the duration for each test iteration. This script increments through outstanding-I/O levels. 16
Testing for Data Corruption Storage devices and storage network components are almost always reliable in predictable performance ranges; the question is how they handle extreme conditions. Most benchmarking tools do not automatically check data. Testing for data corruption usually means testing with data patterns that challenge components. You need to test both extremes of performance and extreme data patterns. 17
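A minimal sketch of pattern-based corruption checking, in the spirit of the tools discussed here (the pattern list and file-based approach are illustrative assumptions; real tools add direct I/O, timing and queue-depth control):

```python
import os
import tempfile

# Bit patterns chosen to stress signal paths: all zeros, all ones,
# and alternating patterns that toggle adjacent bits.
PATTERNS = [b"\x00", b"\xff", b"\xaa", b"\x55", b"\x6d", b"\xb6"]
BLOCK = 64 * 1024

def verify_patterns(path):
    """Write each pattern, read it back, count miscompares."""
    errors = 0
    for pat in PATTERNS:
        block = pat * BLOCK
        with open(path, "wb") as f:
            f.write(block)
            f.flush()
            os.fsync(f.fileno())   # push the write toward the device
        with open(path, "rb") as f:
            if f.read() != block:
                errors += 1
    return errors

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    target = tmp.name
print("miscompares:", verify_patterns(target))
os.unlink(target)
```

The key point is the read-back compare: a pure performance tool never looks at the data it gets back.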
Case Study: Server NMIs Problem: a storage device was causing server NMIs (crashes) that other companies' storage did not. Weeks were spent testing; progress was slow because the problem did not repeat frequently. Key test tool: Medusa Labs software was deployed to generate difficult data patterns for server and storage. Its challenging data patterns produced nearly instant NMIs (a good thing in this case, because it helped diagnose the problem faster). Conclusion: the Brand X HBA in PCI slot 1 caused NMIs while the same card in PCI slot 2 did not; the Brand Y HBA worked fine in either PCI slot. The problem was only observed under extreme loads. 18
Application Simulation Testing One type of test does not represent all applications, and one type of application does not represent all uses for a storage product. Common types of application simulation testing: test storage latency for messaging or other single-threaded applications; test peak storage bandwidth for data acquisition or data streaming environments; test peak storage IOPS for databases. 19
IOMeter Simulating Single Threaded Applications (screenshot): setting 1 outstanding I/O simulates a single-threaded application. Note: single-threaded applications are extremely sensitive to latency (server, HBA, switch and storage device). 20
IOMeter Simulating Multithreaded Applications (screenshot): setting 25 outstanding I/Os simulates a multi-threaded application. 21
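Little's Law ties outstanding I/Os, latency and IOPS together, which is why the outstanding-I/O setting matters so much; a back-of-the-envelope sketch (the 0.2 ms latency is an assumed figure):

```python
# Little's Law: outstanding I/Os = IOPS x average latency (seconds),
# so the throughput ceiling is outstanding / latency.
def iops(outstanding, latency_s):
    return outstanding / latency_s

# At 0.2 ms per I/O, one outstanding request caps throughput;
# 25 outstanding requests raise the ceiling 25x, but only if the
# device can actually service them in parallel.
print(iops(1, 0.0002))
print(iops(25, 0.0002))
```

This is why a single-threaded test can make a fast device look slow: it measures latency, not the device's parallelism.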
IOMeter Simulating Database Environments (screenshot): a small transfer request size (8 in the example) simulates database transfers. Match the application's read/write distribution. Database activity is mostly random. 22
IOMeter Testing Mixed Access Patterns (screenshot): name your access pattern and set % Access to resemble your application. 23
IOMeter Simulating Data Streaming (screenshot): a big transfer request size (512 in the example) tests peak bandwidth. Match the application's read/write distribution. A mostly sequential setting is best. 24
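The measurement arithmetic for a streaming test can be sketched as follows (this goes through the file cache and runs far too briefly to be a real benchmark; it only illustrates the MB/s calculation):

```python
import os
import tempfile
import time

BLOCK = 512 * 1024          # big transfers for peak bandwidth
TOTAL = 64 * 1024 * 1024    # kept small so the sketch finishes quickly

def stream_write(path):
    """Stream large sequential blocks and return apparent MB/s."""
    buf = os.urandom(BLOCK)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(TOTAL // BLOCK):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())   # don't stop the clock before the flush
    elapsed = time.perf_counter() - start
    return (TOTAL / (1024 * 1024)) / elapsed

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
print(f"{stream_write(path):.0f} MB/s")
os.unlink(path)
```

A real streaming test would use direct I/O, a data set far larger than any cache, and a sustained run time.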
Application Testing Testing with the actual application is the best way to measure storage performance. A production-like environment that can stress storage limits is desirable. Measure the performance of different solutions: compare OLTP response times, compare batch run times, compare sustained streaming rates. Operating system and application tools can help monitor storage performance. 25
Case Study: Windows Storage Performance Windows Performance Monitor can be used to monitor storage performance. Capture the following key variables over the duration of a peak processing period or test run: Processor: % processor time (total and by processor). Physical disk: average disk queue length (total, read and write, by disk/array). Physical disk: disk bytes/second (total, read and write, by disk/array). 26
Case Study: Windows Storage Performance Tips for analyzing Windows Performance Monitor results: Use the following scaling to ease visual analysis: disk queues at a 1:1 ratio (default is 100:1); processor utilization at a 1:1 ratio (the default); disk bytes at a 0.000001:1 ratio (default is 0.0001:1). Start with the total fields and then drill down into the read/write/by-disk/by-processor variables. Increase the line thickness to see your results more easily. Use the slider bars to zoom into trouble spots. 27
Case Study: Windows Storage Performance More tips for analyzing Windows Performance Monitor results: Disk bytes per second should be divided by 1024 to get disk KB/s and by 1024 again to get disk MB/s. The point where physical disk queues increase is likely the point where you have hit a storage performance limitation. A system with consistently high processor utilization generally does not have a storage performance bottleneck. Microsoft guidance holds that a physical disk queue greater than 3 (per disk) indicates an I/O bottleneck. If processor utilization levels off where physical disk queues build, that is an indication that faster storage will improve application performance. 28
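The unit conversion and queue-depth rule of thumb above reduce to simple arithmetic (`disk_bottleneck` and its threshold are a sketch of the per-disk rule, not a Microsoft API):

```python
# Perfmon reports raw "Disk Bytes/sec"; divide by 1024 twice for MB/s.
def bytes_to_mb(disk_bytes_per_sec):
    return disk_bytes_per_sec / 1024 / 1024

# Rule of thumb from the slide: a sustained queue deeper than 3 per
# physical disk suggests an I/O bottleneck.
def disk_bottleneck(avg_queue, spindles):
    return avg_queue / spindles > 3

print(round(bytes_to_mb(85_570_910), 1))          # MB/s
print(disk_bottleneck(avg_queue=28, spindles=8))  # queue 3.5 per disk
```

Note the per-disk division: a queue of 28 against an 8-disk array is only 3.5 per spindle, just over the threshold, while the same queue against a single disk would be severe.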
Case Study: Windows Storage Performance (screenshot): Processor Time shows the percent of processor utilized; Disk Queues show pending requests to storage; Disk Bytes Per Second helps reveal storage limitations. 29
Case Study: Windows Storage Performance (screenshot): no processor bottleneck, low disk activity, no disk queues. 30
Monitoring Storage Performance With UNIX iostat results show per-device throughput (rkb/s, wkb/s), queue depth (avgqu-sz) and latency (await, svctm): Device: r/s w/s rkb/s wkb/s avgrq-sz avgqu-sz await svctm %util /dev/sdb 0.00 10619.39 0.00 85570.91 16.12 4636.79 43.52 0.10 101.21 /dev/sdc 0.00 10678.79 0.00 85570.91 16.07 2438.06 22.75 0.10 107.27 avg-cpu: %user %nice %sys %idle 13.04 0.33 68.15 18.48 top shows CPU utilization including I/O wait. load averages: 0.09, 0.04, 0.03 16:31:09 66 processes: 65 sleeping, 1 on cpu CPU states: 69.2% idle, 18.9% user, 11.9% kernel, 0.0% iowait, 0.0% swap Memory: 128M real, 4976K free, 53M swap in use, 542M swap free 31
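Pulling the interesting columns out of iostat output is straightforward to script; below is a minimal parser for the device lines above (column order follows this slide's header, and iostat flavors differ):

```python
# Column names matching the extended-device header shown on the slide.
COLUMNS = ["r/s", "w/s", "rkb/s", "wkb/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

def parse_iostat(line):
    """Split one iostat device line into (device, {column: value})."""
    fields = line.split()
    device = fields[0]
    values = [float(v) for v in fields[1:]]
    return device, dict(zip(COLUMNS, values))

dev, stats = parse_iostat(
    "/dev/sdb 0.00 10619.39 0.00 85570.91 16.12 4636.79 43.52 0.10 101.21")
print(dev, stats["wkb/s"], stats["await"])
```

Logging a few of these columns per interval makes it easy to graph queue depth and latency over a test run instead of eyeballing terminal output.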
Monitoring Storage Performance With Oracle Elapsed: 68.87 (mins) Top 5 Wait Events ~~~~~~~~~~~~~~~~~ Wait % Total Event Waits Time (cs) Wt Time -------------------------------------------- ------------ ------------ ------- db file sequential read 18,073,422 581,168 59.36 db file scattered read 933,001 267,364 27.31 db file parallel write 25,990 35,898 3.67 SQL*Net message from dblink 181,872 20,372 2.08 latch free 11,936 17,879 1.83 ------------------------------------------------------------- These wait events are heavily influenced by storage. Tablespace ------------------------------ Av Av Av Av Buffer Av Buf Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms) -------------- ------- ------ ------- ------------ -------- ---------- ------ SESSION_DATA 61 0 20.2 1.2 190,606 94 128,753 56.8 UNDOTBS1 32 0 14.1 1.0 16,517 8 6,083 2.3 Tablespace metrics are a good way to monitor storage performance. 32
Production Testing Risk vs. Reward. Risk: taking an unsupported, well-traveled evaluation unit and putting it in a production environment could compromise application availability and expose unexpected system problems. Reward: sometimes this is the only way to know for certain that storage performance is acceptable for an application. 33
Typical Mistakes Testing storage performance with file copy commands. Comparing storage devices back-to-back without clearing the server cache. Testing with a data set so small that the benchmark rarely goes beyond server or storage cache. Forgetting to monitor processor utilization during testing. Monitoring the wrong server's performance. 34
Continue Your SNIA Education Experience At SNW Attend Hands-On Labs in: Data Classification Key to Service Level Management; Data Security and Protection Data Assurance Solutions to Meet Corporate Requirements; IP Storage iSCSI, Your IP SAN; Storage Management Manage Storage or Be Managed By It; Storage Virtualization Increasing Productivity; Zero to SAN Fibre Channel Connectivity in No Time. Sessions begin Monday afternoon, April 16 and continue through Wednesday, April 18. All sessions are in Emma/Maggie/Annie, 3rd Floor of the Hyatt Manchester. Registration is at the SNW Registration area. 35
Q&A / Feedback Please send any questions or comments on this presentation to SNIA: trackstorage@snia.org Many thanks to the following individuals for their contributions to this tutorial. SNIA Education Committee Sarah Worthy Storage Performance Council Chris Lionetti Jamon Bowen Elaine Silber Rob Peglar 36