PERFORMANCE BENCHMARKS PAPER

Marvell DragonFly Virtual Storage Accelerator Performance Benchmarks

Arvind Pruthi
Senior Staff Manager, Marvell
April 2011
www.marvell.com

Overview

In today's virtualized data centers, application server responsiveness and scalability are limited by storage I/O performance. The traditional remedy is high-end shared storage, which is expensive and does not scale well. The Marvell DragonFly™ Virtual Storage Accelerator (VSA), powered by Marvell HyperScale™ embedded technology, eliminates this pain by creating a performance tier on the host side. This paper describes a series of tests that evaluate and demonstrate the ability of the DragonFly VSA to eliminate the I/O pain in a scalable manner.

For full details of the Marvell DragonFly Virtual Storage Accelerator product, see the solutions white paper entitled "BREAKING THROUGH THE STORAGE I/O BARRIER FOR CLOUD COMPUTING: How Marvell DragonFly Virtual Storage Accelerator Enables Massively Scalable Virtualization While Reducing Storage Capital Costs by 50 Percent or More" by Shawn Kung, Director of Product Marketing, Marvell.

High-Level Summary

The goal of this paper is to provide detailed test results and information that help quantify and validate the benefits of Marvell's DragonFly VSA, and to demonstrate how DragonFly can:

• Help improve I/O performance in virtualized environments running multiple guest operating systems simultaneously.
• Help connect many more virtual machines (VMs) to existing shared storage without compromising I/O performance (scaling).
• Help sustain I/O performance during I/O bursts from one or more guest operating systems.

Test Setup and Methodology

Setup

As shown in Figure 1 below, the test setup involves three primary components:

• Two host machines.
  o Each host machine runs Ubuntu 10.04 Server edition with the Linux KVM hypervisor and Linux kernel version 2.6.33.5.
  o The host machines are used for workload generation and for calibrating I/O performance. They use one or more tiles (explained below) for load generation.
  o One host machine runs without Marvell DragonFly and the other runs with it.
• Two IBM N3300 A20 iSCSI shared storage (iSCSI SAN storage) servers.
  o One storage server is dedicated to testing without DragonFly and the other is dedicated to testing with DragonFly.
• A network switch.

Figure 1: Test Scenario

Virtual Machines

The test involves running two virtual machines on each host in parallel with the following configuration and workload:

VM Configuration (Microsoft Exchange Mail Server Simulation)

  CPU                          1 virtual CPU
  Memory                       2 GB
  OS                           Windows 2003 Server R2
  OS Virtual Disk Image Size   12 GB
  Workload Generator           Microsoft Exchange JetStress
  Number of Mailboxes          500
  Size of Each Mailbox         35 MB
  Target IOPS per Mailbox      1
  Inserts                      20%
  Delete                       10%
  Replace                      40%
  Read                         30%
  Lazy Commits                 80%

fio Profiles

As described in Test Methodology, the Linux fio tool is used to generate the workload for the burst test. The fio profiles for the two tests are as follows:

Random Read burst test

  rw          randread
  size        1G
  filesize    5G
  iodepth     32
  ioengine    libaio
  direct      1
  blocksize   4KB, 8KB, 16KB, 32KB, 64KB

Random Write burst test

  rw          randwrite
  size        1G
  filesize    5G
  iodepth     32
  ioengine    libaio
  direct      1
  blocksize   4KB, 8KB, 16KB, 32KB, 64KB

Host Machine Configuration

  CPU          2x Intel Xeon 5530 CPUs with 8MB L2 cache
  Memory       24GB DDR3
  OS           Ubuntu 10.04 Server edition with Linux Kernel 2.6.33
  Hypervisor   Linux KVM

SAN Storage

  Manufacturer   IBM
  Model          N3300 A20

Test Methodology

Two types of tests are performed to illustrate scenarios in which Marvell DragonFly helps alleviate I/O performance pain in virtualized environments:

Without/With DragonFly Virtual Storage Accelerator

1. On a host without a DragonFly card, generate workload to an iSCSI SAN storage server from two virtual machines in parallel, each running the Microsoft Exchange benchmark.
2. Measure the load on the SAN storage.
3. Measure the load on the host machine.
4. Re-run the same configuration from the other host, which has the DragonFly card.
5. Compare results.

Burst I/O Performance

1. On a host without a DragonFly card, generate a random write workload using the Linux fio tool. Use the fio profile described in the Test Setup section.
2. Re-run the test in step 1 with a random read workload, using the fio profile described in the Test Setup section.
3. Re-run steps 1 and 2 on the host with DragonFly.
4. Record and compare results.
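
The fio profiles from the Test Setup section map directly onto fio job-file options. The following is a minimal sketch of how such job files might be rendered, not the job files actually used in these tests. The section names, the `filename` target path, and the use of `stonewall` to run one block size at a time are assumptions; the paper does not state how the block sizes were sequenced or which file or device was targeted.

```python
# Sketch: render a fio job file for one of the burst profiles.
# The option values come from the profile tables; everything else
# (section names, target path, stonewall sequencing) is assumed.

BLOCK_SIZES = ["4k", "8k", "16k", "32k", "64k"]

# Options shared by the random read and random write profiles.
COMMON = {
    "size": "1G",
    "filesize": "5G",
    "iodepth": 32,
    "ioengine": "libaio",
    "direct": 1,
}


def render_job(rw, target_file):
    """Return fio job-file text with one section per block size."""
    lines = ["[global]"]
    lines += [f"{key}={value}" for key, value in COMMON.items()]
    lines.append(f"rw={rw}")
    lines.append(f"filename={target_file}")
    for bs in BLOCK_SIZES:
        lines.append("")
        lines.append(f"[{rw}-{bs}]")
        lines.append(f"bs={bs}")
        # stonewall makes fio wait for earlier sections to finish first.
        lines.append("stonewall")
    return "\n".join(lines) + "\n"


# Hypothetical target path, for illustration only.
print(render_job("randread", "/mnt/san/fio-test.dat"))
```

Calling `render_job("randwrite", ...)` yields the random write variant; running the same job file on both hosts keeps the two sides of the comparison identical.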

Test Results

Without/With DragonFly

Host Stats

The above graph shows a 3.9X improvement in JetStress IOPS with DragonFly.

The above graph shows an 18X reduction in read latency and a 3.8X reduction in write latency with DragonFly.

Stats on SAN Storage

The above graph shows a 14.5X reduction in iSCSI ops reaching the SAN storage in the test with DragonFly.

The above graph shows a 13X reduction in CPU utilization and an 18X reduction in disk utilization on the SAN storage in the test with DragonFly.
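
The SAN-side reductions above give a rough way to estimate scaling headroom. The sketch below assumes that load on the storage server grows linearly with the number of virtual machines and that the resource with the smallest reduction factor saturates first; both are simplifying assumptions, not measurements from these tests. The function and dictionary names are illustrative.

```python
# Reduction factors reported for the SAN storage in the DragonFly run.
REDUCTION = {
    "iscsi_ops": 14.5,
    "cpu_utilization": 13.0,
    "disk_utilization": 18.0,
}


def vm_headroom(reduction):
    """Return (limiting resource, factor) under linear load scaling.

    The resource whose load dropped the least is assumed to be the
    first to reach its original level as more VMs are added.
    """
    name = min(reduction, key=reduction.get)
    return name, reduction[name]


resource, factor = vm_headroom(REDUCTION)
print(f"{resource} limits scaling; roughly {factor:g}x the VMs at equal SAN load")
```

Under these assumptions the estimate is bounded by the smallest of the three factors, with the other two indicating how much margin remains on those resources.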

Burst IO Testing

The above graph shows an 11.3X to 3.4X improvement in random read throughput for block sizes 4K to 256K in a burst IO test with DragonFly.

The above graph shows a 4.5X to 13X improvement in random write throughput for block sizes 4K to 256K in a burst IO test with DragonFly.
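
Improvement factors like those above are ratios of throughput with and without DragonFly at a given block size. As a sketch, assuming each run was captured with fio's JSON output (`--output-format=json`), the ratio could be computed as follows. The function names are illustrative, and the sample reports contain placeholder values, not measurements from this paper.

```python
import json


def bandwidth_from_fio_json(text, direction):
    """Sum bandwidth (fio's "bw" field, KiB/s) across all jobs in a report.

    direction is "read" or "write", matching fio's per-job keys.
    """
    report = json.loads(text)
    return sum(job[direction]["bw"] for job in report["jobs"])


def improvement(baseline_json, accelerated_json, direction):
    """Return accelerated throughput divided by baseline throughput."""
    base = bandwidth_from_fio_json(baseline_json, direction)
    accel = bandwidth_from_fio_json(accelerated_json, direction)
    if base <= 0:
        raise ValueError("baseline run reported no I/O")
    return accel / base


# Placeholder reports with made-up numbers, for illustration only.
baseline = json.dumps({"jobs": [{"read": {"bw": 1000}}]})
accelerated = json.dumps({"jobs": [{"read": {"bw": 4000}}]})
print(improvement(baseline, accelerated, "read"))
```

Computing the ratio separately for each block size keeps the comparison like for like, since throughput at 4K and at 64K is not directly comparable.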

Conclusion

• Storage I/O performance is greatly enhanced with Marvell DragonFly (3X to 13X benefit).
• I/O latency is greatly reduced with Marvell DragonFly (up to 18X).
• Load on shared storage is greatly reduced with Marvell DragonFly (14.5X, 13X and 18X for iSCSI ops, CPU utilization and disk utilization, respectively).
• Burst random I/O performance improves substantially with Marvell DragonFly (3X to 11X for reads and 4.5X to 13X for writes, depending on the I/O size).
• Marvell DragonFly brings large scaling benefits. With Marvell DragonFly, 10X to 20X more virtual machines can be connected to existing shared storage without any significant drop in performance (based on the reduction in load on the shared storage described in the results). This leads directly to savings in capital costs, TCO, power and cooling.

Marvell DragonFly VSA offers a cost-effective, turn-key solution that fits seamlessly into existing storage architectures. By dramatically changing the way you view the storage NAS/SAN architecture, the Marvell DragonFly, powered by Marvell HyperScale embedded cache technology, has the potential to revolutionize virtual data centers with a significant reduction in cost and I/O pain.

For more information go to http://www.marvell.com/dragonfly/

About the Author

Arvind Pruthi
Senior Staff Manager, Marvell

Arvind Pruthi is senior staff manager for the Enterprise Storage Solutions (ESS) group at Marvell. In this position, Mr. Pruthi works closely with customers to understand their needs and help ensure a seamless and painless integration of Marvell's ESS solutions into customers' environments. Arvind has directly contributed to over a dozen patents filed by the ESS group. He has over 13 years of in-depth experience in the storage industry, and previously held technical leadership positions at NetApp and at a storage startup company. He has an M.S. in Computer Science from Kurukshetra University, India, and a B.S. degree in Computer Science from St. Stephen's College, Delhi University, India.

Marvell Semiconductor, Inc.
5488 Marvell Lane
Santa Clara, CA 95054, USA
Tel: 1.408.222.2500
www.marvell.com

Copyright 2011. Marvell International Ltd. All rights reserved. Marvell and the Marvell logo are registered trademarks of Marvell or its affiliates. Marvell Smart, DragonFly and HyperScale are trademarks of Marvell or its affiliates. Other names and brands may be claimed as the property of others.

DragonFly_Performance_Benchmark_paper-001 4/2011