Optimizing Cloud Performance Using Veloxum

Testing Report on experiments run to show the effects of Veloxum's optimization software on Terremark's vCloud infrastructure
Contents

Introduction
Veloxum Overview
Executive Summary
Key Findings
Scenario
Network Performance
Disk I/O Performance
Summary
APPENDIX A Test Environment
REFERENCES
Introduction

This document is intended to provide advanced technical personnel (architects, engineers, and consultants) with data on the performance improvement to Terremark's vCloud infrastructure achieved using Veloxum's server infrastructure optimization software with Active, Continuous Optimization (ACO). It presents the findings of internal Veloxum testing that simulated varying workloads on a Terremark vCloud Windows 2003 SP2 client both before and after the introduction of Veloxum.

While Veloxum normally works to optimize all virtual infrastructure holistically, including virtual hosts, guest OS, and clients, these tests demonstrated Veloxum's optimization efficacy while operating only on a client accessing a Terremark vCloud instance. Additional performance and utilization gains are possible if Veloxum is installed on additional virtual infrastructure.

Veloxum Overview

Veloxum's founders started the company with the vision of applying active, continuous optimization to manage complex cloud environments for optimum performance and utilization. Veloxum actively and continuously optimizes cloud infrastructure operating system (OS) and application settings, using its ACO process, to increase guest performance and maximize workload density. It leverages existing systems and infrastructure by tuning the various components within their manufacturer-supported settings. The solution enables cloud providers to maximize performance, increase workload density, and minimize virtualization costs, dramatically reducing cloud infrastructure costs.

Executive Summary

Veloxum internally tested a Terremark cloud-based host with simulated real-world workloads. The tests ran first with no optimization and then with Veloxum's optimization software with Active, Continuous Optimization (ACO). The end-to-end environment included a single Terremark Windows 2003 SP2 virtual host that testers could exercise with varying traffic loads.
In networking tests, download and upload speeds with Veloxum increased to 9x (900%) and ~46x (4554%) of their pre-optimization values, respectively. In disk I/O speed tests, Veloxum showed improvements ranging from 26% to 54%.

Key Findings

Tests run on a Terremark cloud-based Windows 2003 SP2 instance with Veloxum showed the following benefits versus the Terremark instance alone:

- Download and upload speeds to the instance increased to 9x (900%) and ~46x (4554%) of their pre-optimization values
- Disk I/O within the client increased by 26% to 54%
Scenario

The test setup included a Veloxum for Enterprise intelligent Performance Test Engine (iPTE) server running in a local data center and providing optimization parameters to a Veloxum client connector running on a local client (see Appendix A). The client machine was virtualized on top of VMware's vSphere hypervisor. No additional access was provided by Terremark beyond normal access privileges. No changes were made to the server. All optimizations were performed on the client.

The following sections present the results of two tests run on the Terremark vCloud host:

1) Network Performance: Upload and download throughput tests were conducted by the Veloxum iPTE-controlled client.
2) Disk I/O Performance: Artificial disk I/O loads were introduced on the client, both before and after the machine was optimized, using IOMeter, a commonly used load tester originally developed by Intel.

Network Performance

The graph below is a screenshot from the Veloxum for Enterprise iPTE management console. In this graph, two sets of data have been highlighted: upload and download throughput measurements from before the machine was optimized and from after it was optimized.

Figure 1: Network performance before-and-after Veloxum (regions marked Before and After)

The orange line represents the upload measurement, while the green line represents download. Variations in performance are expected because data is being transferred over public networks. Even so, network bandwidth saturation is achieved consistently after optimization, with the lowest upload measurement demonstrating a 20x improvement over the peak measurement prior to optimization.

Data samples were collected every two hours over a one-week testing period. Fixed-size payloads were transferred between the Veloxum ACO appliance and the client machine in the cloud. Each data sample represents a bi-directional transfer of information.
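The sampling schedule and the idea of timing a fixed-size payload can be illustrated with a minimal sketch. It is not Veloxum's measurement code: the function names, the payload size, and the elapsed time are hypothetical values chosen only for illustration, and only the two-hour interval and one-week period come from the report.

```python
# Illustrative sketch only; not taken from Veloxum's software.
SAMPLE_INTERVAL_HOURS = 2   # from the report: one sample every two hours
TEST_PERIOD_DAYS = 7        # from the report: one-week testing period


def expected_sample_count(period_days: int, interval_hours: int) -> int:
    """Number of samples collected at a fixed interval over the test period."""
    return (period_days * 24) // interval_hours


def throughput_mbit_per_s(payload_bytes: int, elapsed_seconds: float) -> float:
    """Throughput of one fixed-size transfer, in megabits per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return payload_bytes * 8 / elapsed_seconds / 1_000_000


if __name__ == "__main__":
    samples = expected_sample_count(TEST_PERIOD_DAYS, SAMPLE_INTERVAL_HOURS)
    print(f"expected samples per direction: {samples}")

    # Hypothetical payload and timing (assumed, not from the report).
    payload_bytes = 12_500_000
    elapsed_seconds = 1.0
    rate = throughput_mbit_per_s(payload_bytes, elapsed_seconds)
    print(f"throughput: {rate:.1f} Mbit/s")
```

Under these assumptions, a two-hour interval over seven days yields 84 samples, each of which would contribute one upload and one download measurement.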
Before optimization of the client machine, several important metrics were reported:

1) Calculated ACK speed of 100 Mbit/s.
2) Average percentage of out-of-order packets of 20%.
3) Average packet loss of 0.01%.
4) High connection-limited state.
5) High receiver-limited state.
6) Average upload speed of 2.2 Mbit/s.
7) Average download speed of 10 Mbit/s.

After optimization of the client machine, these metrics changed dramatically:

1) Calculated ACK speed of 100 Mbit/s.
2) Average percentage of out-of-order packets of 0%.
3) Average packet loss of 0.002%.
4) Low connection-limited state.
5) Low receiver-limited state.
6) Average upload speed of 100 Mbit/s.
7) Average download speed of 90 Mbit/s.

In short, the upload speed improved by 46x and the download speed improved by 9x. These improvements saturated the 100 Mbit/s connection between client and server.
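The speedup factors can be checked against the averages listed above. The sketch below uses only the rounded averages from the two lists, and the function names are illustrative. With those rounded inputs the upload ratio comes out near 45.5x, which is consistent with the reported ~46x; the slightly different 4554% figure quoted earlier presumably reflects unrounded measurements, although the report does not say so.

```python
def speedup(before: float, after: float) -> float:
    """Ratio of the post-optimization value to the pre-optimization value."""
    return after / before


def percent_increase(before: float, after: float) -> float:
    """Relative increase over the pre-optimization value, in percent."""
    return 100.0 * (after - before) / before


# Rounded averages from the report, in Mbit/s.
before = {"upload": 2.2, "download": 10.0}
after = {"upload": 100.0, "download": 90.0}

for direction in ("upload", "download"):
    ratio = speedup(before[direction], after[direction])
    increase = percent_increase(before[direction], after[direction])
    print(f"{direction}: {ratio:.1f}x of baseline ({increase:.0f}% increase)")
```

Note the distinction the sketch makes explicit: a value that is 9x its baseline is 900% of the baseline, which corresponds to an 800% increase.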
Disk I/O Performance

The graph below shows the throughput of the Windows client under various artificial loads. The optimization event took place at the midpoint of the graph, as indicated.

Figure 2: Disk I/O optimization before-and-after Veloxum (regions marked Before and After)

Prior to optimization, there are two peaks, which represent clusters of IOMeter loads (see the References section), hereafter referred to as Load 1 and Load 2. The same loads are repeated after optimization, in the same order and on the same timescale. The table below displays the peak throughput numbers in logical bytes/second, as reported by the Windows 2003 WMI reporting engine.

Load Number   Before Optimization   After Optimization   Percentage
Load 1        217 Mbytes/sec        336 Mbytes/sec       54% improvement
Load 2        300 Mbytes/sec        387 Mbytes/sec       29% improvement

Summary

Both networking and disk performance of the remote client system improved dramatically and immediately after optimization was performed. These changes did not require any special access to the vCloud infrastructure, nor were any special accommodations made for the tests to be performed by Veloxum. Most importantly, these improvements were demonstrated without any modification to the underlying VMware infrastructure. If access were provided to the underlying ESX host, as well as to the statistics generated by that host, further and more dramatic improvements may be possible.
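The percentage column of the disk I/O table (Load 1 and Load 2) can be reproduced from the peak throughput figures. This is a minimal sketch using only the numbers in that table; the function name is illustrative.

```python
def percent_improvement(before: float, after: float) -> float:
    """Relative improvement of the after value over the before value, in percent."""
    return 100.0 * (after - before) / before


# Peak throughput in Mbytes/sec, taken from the disk I/O table.
loads = {
    "Load 1": (217.0, 336.0),
    "Load 2": (300.0, 387.0),
}

for name, (before, after) in loads.items():
    print(f"{name}: {percent_improvement(before, after):.1f}% improvement")
```

Load 1 works out to about 54.8% and Load 2 to 29.0%, matching the table if the Load 1 figure was truncated rather than rounded.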
APPENDIX A Test Environment

Figure 3: Terremark / Veloxum Test Environment

The test setup for the experiments described in this document attempted to simulate a client accessing a Terremark cloud environment. The Veloxum software included a Veloxum for Enterprise intelligent Performance Test Engine (iPTE) server running in a local data center and providing optimization parameters to a Veloxum client connector running on a local client. No additional access was provided by Terremark beyond normal access privileges. No changes were made to the server. All optimizations were performed on the client.
REFERENCES

About Iometer

Iometer is an I/O subsystem measurement and characterization tool for single and clustered systems. Iometer is pronounced "eye-om-i-ter", to rhyme with "thermometer". Iometer does for a computer's I/O subsystem what a dynamometer does for an engine: it measures performance under a controlled load. Iometer was formerly known as "Galileo".

It was originally developed by the Intel Corporation and announced at the Intel Developers Forum (IDF) on February 17, 1998; since then it has gained widespread use within the industry. Intel has since discontinued work on Iometer, and it was given to the Open Source Development Lab (OSDL). In November 2001, a project was registered at SourceForge.net and an initial drop was provided. Since the re-launch in February 2003, the project has been driven by an international group of individuals who are continuously improving, porting, and extending the product.

The tool (Iometer and Dynamo executable) is distributed under the terms of the Intel Open Source License. The iomtr_kstat kernel module, as well as other future independent components, is distributed under the terms of the GNU Public License.

Iometer is both a workload generator (that is, it performs I/O operations in order to stress the system) and a measurement tool (that is, it examines and records the performance of its I/O operations and their impact on the system). It can be configured to emulate the disk or network I/O load of any program or benchmark, or can be used to generate entirely synthetic I/O loads. It can generate and measure loads on single or multiple (networked) systems.

Iometer can be used for measurement and characterization of:

- Performance of disk and network controllers.
- Bandwidth and latency capabilities of buses.
- Network throughput to attached drives.
- Shared bus performance.
- System-level hard drive performance.
- System-level network performance.