Scaling Microsoft Exchange in a Red Hat Enterprise Virtualization Environment




LoadGen Workload
Microsoft Exchange Server 2007
Microsoft Windows Server 2008
Red Hat Enterprise Linux 5.4 (with integrated KVM hypervisor)
Dell PowerEdge R710 (Intel Xeon E5540 - Nehalem)

Version 1.0
August 2009

1801 Varsity Drive
Raleigh NC 27606-2072 USA
Phone: +1 919 754 3700
Phone: 888 733 4281
Fax: +1 919 754 3701
PO Box 13588
Research Triangle Park NC 27709 USA

Linux is a registered trademark of Linus Torvalds. Red Hat, Red Hat Enterprise Linux and the Red Hat "Shadowman" logo are registered trademarks of Red Hat, Inc. in the United States and other countries. All other trademarks referenced herein are the property of their respective owners.

© 2009 by Red Hat, Inc. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, V1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/). The information contained herein is subject to change without notice. Red Hat, Inc. shall not be liable for technical or editorial errors or omissions contained herein. Distribution of modified versions of this document is prohibited without the explicit permission of Red Hat, Inc. Distribution of this work or any derivative of this work in any standard (paper) book form for commercial purposes is prohibited unless prior permission is obtained from Red Hat, Inc.

The GPG fingerprint of the security@redhat.com key is:
CA 20 86 86 2B D6 9D FC 65 F6 EC C4 21 91 80 CD DB 42 A6 0E

Table of Contents

1 Executive Summary
2 Red Hat Enterprise Virtualization (RHEV) - Overview
  2.1 Red Hat Enterprise Virtualization (RHEV) - Portfolio
  2.2 Kernel-based Virtual Machine (KVM)
    2.2.1 Traditional Hypervisor Model
    2.2.2 Linux as a Hypervisor
    2.2.3 A Minimal System
    2.2.4 KVM Summary
3 Test Configuration
  3.1 Hardware
  3.2 Software
  3.3 Storage Layout
4 Test Methodology
  4.1 Workload
  4.2 Configuration & Workload
  4.3 Performance Test Plan
  4.4 Tuning & Optimizations
5 Test Results
  5.1 Scaling Multiple 2-vCPU Guests
  5.2 Scaling Multiple 4-vCPU Guests
  5.3 Scaling Multiple 8-vCPU Guests
  5.4 Scaling-Up by Increasing the Number of vCPUs in a Single Guest
  5.5 Virtualization Efficiency in Consolidation Scenarios
6 Conclusions
7 References

1 Executive Summary

This paper describes the performance and scaling of an industry-standard Exchange workload, generated with Microsoft Exchange Load Generator (LoadGen), running in Microsoft Windows Server 2008 guests under Red Hat Enterprise Linux 5.4 using the KVM hypervisor. The host system was a Dell PowerEdge R710 G6 server equipped with 72 GB of RAM and two sockets, each holding a 2.53 GHz Intel Xeon E5540 (Nehalem) processor with hyper-threading, for a total of 8 cores and 16 threads.

This paper illustrates the ability of Red Hat virtualization to handle virtualized disk and network I/O in both scale-up and scale-out scenarios, and demonstrates that for this particular application and workload, Red Hat virtualization is more efficient at scaling out than at scaling up. In addition, the paper gives some general guidelines for optimizing Exchange in such an environment.

Scaling-Up a Virtual Machine

First, the performance of the LoadGen workload was measured by loading a single guest on the server and assigning it 2, 4, or 8 vCPUs. Network latency increased by less than 87% as the number of users was increased by a factor of 4 and the guest expanded from 2 threads to a complete 4-core/8-thread configuration.

Scaling-Out Virtual Machines

A second series of tests scaled out multiple independent virtual machines, each with 2 or 4 vCPUs, until the 8-core/16-thread Nehalem server was fully subscribed. As an example, performance was tested with 1, 2, 4, and 8 concurrent 2-vCPU guests running LoadGen. Network latency for the 8-guest configuration increased by less than 20 percent over the 1-guest setup, while the number of concurrent Exchange users increased by a factor of 8.

The data presented in this paper clearly establishes that Red Hat Enterprise Linux 5.4 guests using the KVM hypervisor on a Dell PowerEdge R710 provide a stable platform for hosting multiple virtualized Exchange applications. The ability to scale out contributes to the effectiveness of KVM for Exchange applications. The number of users and the latency achievable in any specific customer situation will, of course, depend on the specifics of the application used and the intensity of user activity. However, the results demonstrate that in a heavily virtualized environment, acceptable latency was retained even as the number and size of guests was increased until the physical server was fully subscribed.

2 Red Hat Enterprise Virtualization (RHEV) - Overview

2.1 Red Hat Enterprise Virtualization (RHEV) - Portfolio

Server virtualization offers tremendous benefits for enterprise IT organizations: server consolidation, hardware abstraction, and internal clouds deliver a high degree of operational efficiency. However, server virtualization is not yet used pervasively in the production enterprise datacenter. Some of the barriers preventing widespread adoption of existing proprietary virtualization solutions are performance, scalability, security, cost, and ecosystem challenges.

The Red Hat Enterprise Virtualization portfolio is an end-to-end virtualization solution, with use cases for both servers and desktops, that is designed to overcome these challenges, enable pervasive datacenter virtualization, and unlock unprecedented capital and operational efficiency. The portfolio builds upon the Red Hat Enterprise Linux platform that is trusted by millions of organizations around the world for their most mission-critical workloads. Combined with KVM, the latest generation of virtualization technology, Red Hat Enterprise Virtualization delivers a secure, robust virtualization platform with unmatched performance and scalability for Red Hat Enterprise Linux and Windows guests.

Red Hat Enterprise Virtualization consists of the following server-focused products:

1. Red Hat Enterprise Virtualization Manager (RHEV-M) for Servers: a feature-rich server virtualization management system that provides advanced management capabilities for hosts and guests, including high availability, live migration, storage management, system scheduler, and more.

2. A modern hypervisor based on KVM (Kernel-based Virtual Machine), which can be deployed either as:

   Red Hat Enterprise Virtualization Hypervisor (RHEV-H): a standalone, small-footprint, high-performance, secure hypervisor based on the Red Hat Enterprise Linux kernel; or

   Red Hat Enterprise Linux 5.4: the latest Red Hat Enterprise Linux platform release, which integrates KVM hypervisor technology and allows customers to increase their operational and capital efficiency by using the same hosts to run both native Red Hat Enterprise Linux applications and virtual machines running supported guest operating systems.

Figure 1: Red Hat Enterprise Virtualization Hypervisor

Figure 2: Red Hat Enterprise Virtualization Manager for Servers

2.2 Kernel-based Virtual Machine (KVM)

A hypervisor, also called a virtual machine monitor (VMM), is a software platform that allows multiple ("guest") operating systems to run concurrently on a host computer. The guest virtual machines interact with the hypervisor, which translates guest I/O and memory requests into corresponding requests for resources on the host computer. Running fully virtualized guests, i.e., guests with unmodified operating systems, requires complex hypervisors and incurs a performance penalty for emulation and translation of I/O and memory requests.

Over the last couple of years, chip vendors (Intel and AMD) have been steadily adding CPU features that provide hardware support for virtualization. Most notable are:

1. First-generation hardware-assisted virtualization: removes the need for the hypervisor to scan and rewrite privileged kernel instructions, using Intel VT (Virtualization Technology) and AMD's SVM (Secure Virtual Machine) technology.

2. Second-generation hardware-assisted virtualization: offloads virtual-to-physical memory address translation to the CPU/chipset, using Intel EPT (Extended Page Tables) and AMD RVI (Rapid Virtualization Indexing) technology. This provides a significant reduction in memory address translation overhead in virtualized environments.

3. Third-generation hardware-assisted virtualization: allows PCI I/O devices to be attached directly to virtual machines, using Intel VT-d (Virtualization Technology for directed I/O) and AMD IOMMU. SR-IOV (Single Root I/O Virtualization) additionally allows special PCI devices to be split into multiple virtual devices. This provides a significant improvement in guest I/O performance.

The great interest in virtualization has led to the creation of several different hypervisors. However, many of these predate hardware-assisted virtualization and are therefore somewhat complex pieces of software. With the advent of the above hardware extensions, writing a hypervisor has become significantly easier, and it is now possible to enjoy the benefits of virtualization while leveraging existing open source achievements to date. Kernel-based Virtual Machine (KVM) turns a Linux kernel into a hypervisor. Red Hat Enterprise Linux 5.4 provides the first commercial-strength implementation of KVM, which is developed as part of the upstream Linux kernel.

2.2.1 Traditional Hypervisor Model

The traditional hypervisor model consists of a software layer that multiplexes the hardware among several guest operating systems. The hypervisor performs basic scheduling and memory management, and typically delegates management and I/O functions to a special, privileged guest.

Today's hardware, however, is becoming increasingly complex. The so-called basic scheduling operations have to take into account multiple hardware threads on a core, multiple cores on a socket, and multiple sockets on a system. Similarly, on-chip memory controllers require that memory management take into account the Non-Uniform Memory Access (NUMA) characteristics of a system. While great effort is invested into adding these capabilities to hypervisors, there is already a mature scheduler and memory management system that handles these issues very well: the Linux kernel.

2.2.2 Linux as a Hypervisor

By adding virtualization capabilities to a standard Linux kernel, we can enjoy all the fine-tuning work that has gone (and is going) into the kernel, and bring that benefit into a virtualized environment. Under this model, every virtual machine is a regular Linux process scheduled by the standard Linux scheduler. Its memory is allocated by the Linux memory allocator, with its knowledge of NUMA and its integration with the scheduler. By integrating into the kernel, the KVM "hypervisor" automatically tracks the latest hardware and scalability features without additional effort.
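KVM depends on the first-generation hardware assist described above (Intel VT or AMD SVM). As an illustrative check, not part of this paper's documented procedure, the relevant CPU flags can be inspected on a Red Hat Enterprise Linux host before deploying KVM:

# Count logical CPUs that advertise Intel VT (vmx) or AMD SVM (svm).
# A result of 0 means hardware-assisted virtualization is unavailable
# or has been disabled in the system BIOS.
egrep -c '(vmx|svm)' /proc/cpuinfo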

2.2.3 A Minimal System

One of the advantages of the traditional hypervisor model is that it is a minimal system, consisting of only a few hundred thousand lines of code. However, this view does not take into account the privileged guest. This guest has access to all system memory, either through hypercalls or by programming the DMA hardware. A failure of the privileged guest is not recoverable, as the hypervisor is not able to restart it. A KVM-based system's privilege footprint is truly minimal: only the host kernel plus a few thousand lines of the kernel-mode driver have unlimited hardware access.

2.2.4 KVM Summary

Leveraging new silicon capabilities, the KVM model introduces an approach to virtualization that is fully aligned with the Linux architecture and all of its latest achievements. Furthermore, integrating the hypervisor capabilities into a host Linux kernel as a loadable module simplifies management and improves performance in virtualized environments, while minimizing impact on existing systems.

Red Hat Enterprise Linux 5.4 incorporates KVM-based virtualization in addition to the existing Xen-based virtualization. Xen-based virtualization remains fully supported for the life of the Red Hat Enterprise Linux 5 family. An important feature of any Red Hat Enterprise Linux update is that kernel and user APIs are unchanged, so that Red Hat Enterprise Linux 5 applications do not need to be rebuilt or recertified. This extends to virtualized environments: with a fully integrated hypervisor, the application binary interface (ABI) consistency offered by Red Hat Enterprise Linux means that applications certified to run on Red Hat Enterprise Linux on physical machines are also certified when run in virtual machines. So the portfolio of thousands of certified applications for Red Hat Enterprise Linux applies to both environments.
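Because KVM is delivered as loadable kernel modules, its presence on a Red Hat Enterprise Linux 5.4 host can be verified directly. The following commands are an illustrative sketch rather than part of the documented test setup:

# The kvm module provides the core infrastructure; kvm_intel (or kvm_amd)
# supplies the vendor-specific hardware support.
lsmod | grep kvm

# Guests are created through the /dev/kvm character device.
ls -l /dev/kvm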

3 Test Configuration

Figure 3: Test Configuration Diagram

3.1 Hardware

Table 1: Hardware
Dell PowerEdge R710 Server: dual socket, quad core, hyper-threading (16 processing threads total); Intel Xeon CPU E5540 @ 2.53 GHz; 18 x 4 GB DIMMs (72 GB total); 2 x 146 GB 15K SAS drives
Switches: Dell PowerConnect 5448; Dell PowerConnect 6248
Storage: 2 x Dell EqualLogic PS5000XV drive arrays

The KVM hypervisor solution was run on a Dell PowerEdge R710 server and two shelves of Dell EqualLogic PS5000XV storage, connected via a Dell PowerConnect 6248 switch with a 10GbE uplink module. The Dell PowerEdge R710 server had two 2.53 GHz Intel Xeon E5540 processors and 18 x 4 GB sticks of RAM, and was connected to the Dell EqualLogic storage via a 10Gb iSCSI connection. Red Hat Enterprise Linux 5.4 Beta (kernel-2.6.18-159.el5) was installed as the host operating system, and Windows Server 2008 was installed as the guest operating system.

3.2 Software

Table 2: Software
Red Hat Enterprise Linux 5.4 Beta: 2.6.18-159.el5
KVM: kvm-83-80.el5
Microsoft Windows Server 2008
Microsoft Exchange 2007 SP1: 08.01.0240.006

3.3 Storage Layout

Two trays of Dell EqualLogic PS5000XV storage, connected to each other and to the server via a Dell PowerConnect 6248 switch and 10Gb uplink module, were used. Each guest was allotted two volumes at the SAN level: one for the operating system, and one for the Exchange database and backup files. The operating system iSCSI LUN was exposed at the host level using the bundled Red Hat Enterprise Linux iSCSI drivers. After installing the guest OS on .img files residing on this LUN, the guest was booted and the software iSCSI initiator inside each guest was used to access the LUN reserved for Exchange data. Inside the Windows Server 2008 guest, this LUN was divided into two partitions, one for Exchange data and one for backup files.
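For reference, host-side iSCSI targets on Red Hat Enterprise Linux are typically discovered and logged into with the iscsiadm utility from the iscsi-initiator-utils package. The commands below are an illustrative sketch using placeholder values for the portal address and target IQN, not values captured from the test environment:

# Discover the targets exported by the EqualLogic group (placeholder portal IP)
iscsiadm -m discovery -t sendtargets -p 192.168.1.10

# Log in to a discovered target (placeholder IQN)
iscsiadm -m node -T iqn.2001-05.com.equallogic:example-os-lun -p 192.168.1.10 --login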

4 Test Methodology

4.1 Workload

The goal of testing was to measure network response time for different numbers of users running this workload against Microsoft Exchange 2007 on Microsoft Windows Server 2008. Acceptable latency was defined as less than 750 ms. To reach that goal, a custom workload reflecting existing standards was created using Microsoft Exchange Load Generator (LoadGen) 2007, an industry-standard tool for benchmarking an Exchange mail server. The workload was based on the LoadGen settings from an industry-standard virtualized mail benchmark that uses the Microsoft Exchange Server 2003 MAPI Messaging Benchmark 3 (MMB3). MMB3 is the previous-generation Exchange benchmark, which Microsoft has since replaced with LoadGen 2007. The MMB3 workload was adapted to the latest version of LoadGen 2007 using that tool's Custom feature, and was tuned to stress CPU and memory.

The workload consisted of 1,000 users per vCPU. For example, a 2-vCPU guest ran with 2,000 users and an 8-vCPU guest with 8,000. During the tests, LoadGen performs tasks that simulate a standard user generating mail activity. When the workload finishes, LoadGen reports response times: the average number of milliseconds necessary for the 95th percentile of users to complete several tasks. The workload was executed on all virtual machines simultaneously for 30 minutes.

4.2 Configuration & Workload

The following configuration was used to get maximum performance out of the guests while staying below the 750 ms latency threshold:

RAM per vCPU = 2.5 GB
Load per vCPU = 1,000 users

For example, when executing four 2-vCPU guests, each used 5 GB of RAM, so the total RAM used in the system was 20 GB.

4.3 Performance Test Plan

Scale-out: To demonstrate Red Hat KVM's ability to scale out, the following test plan was executed:

2 vCPUs:
1 guest (2 vCPUs / 5 GB RAM / 2,000 users)
2 guests (2 vCPUs / 5 GB RAM / 2,000 users each)
4 guests (2 vCPUs / 5 GB RAM / 2,000 users each)
8 guests (2 vCPUs / 5 GB RAM / 2,000 users each)

4 vCPUs:
1 guest (4 vCPUs / 10 GB RAM / 4,000 users)
2 guests (4 vCPUs / 10 GB RAM / 4,000 users each)
4 guests (4 vCPUs / 10 GB RAM / 4,000 users each)

8 vCPUs:
1 guest (8 vCPUs / 20 GB RAM / 8,000 users)
2 guests (8 vCPUs / 20 GB RAM / 8,000 users each)

Scale-up: To demonstrate Red Hat KVM's ability to scale up, the following test plan was executed:

1 guest (2 vCPUs / 5 GB RAM / 2,000 users)
1 guest (4 vCPUs / 10 GB RAM / 4,000 users)
1 guest (8 vCPUs / 20 GB RAM / 8,000 users)

Virtualization efficiency: To demonstrate Red Hat KVM's ability to consolidate, the following test plan was executed:

8 guests (2 vCPUs / 5 GB RAM / 2,000 users each)
4 guests (4 vCPUs / 10 GB RAM / 4,000 users each)
2 guests (8 vCPUs / 20 GB RAM / 8,000 users each)

4.4 Tuning & Optimizations

The host was installed with the Red Hat Enterprise Linux 5.4 Beta. The main role of this server is to act as a KVM hypervisor for guest virtual machines. To optimize performance, the VMs were bound to NUMA nodes. Odd-numbered VMs were bound to NUMA node 1 with:

--cpunodebind=1 --membind=1

Even-numbered VMs were bound to NUMA node 0 with:

--cpunodebind=0 --membind=0

The host allocated memory in the form of huge pages (2048 kB), which was then made available to the guests. With guest memory backed by huge pages, pressure on the translation lookaside buffer (TLB) should be reduced, thereby improving performance. To allocate huge pages, the following files were modified. First, the /etc/sysctl.conf file was modified to indicate the number of huge pages to be made available to the system:

# enable hugepages
vm.nr_hugepages = 26624

Next, the /etc/sysconfig/rc.local file was modified to mount the hugepages filesystem on each reboot:

mkdir -p /mnt/libhugetlbfs
mount -t hugetlbfs hugetlbfs /mnt/libhugetlbfs
chmod 777 /mnt/libhugetlbfs

The guests were started using the qemu-kvm command line. This allowed the use of numactl to specify CPU and memory locality, the use of huge pages, and the choice of disk cache mechanism. The example that follows starts a guest with:

two vCPUs (-smp 2)
binding to a NUMA node (numactl --cpunodebind=1 --membind=1)
5 GB of memory (-m 5120)
huge-page-backed memory (--mem-path /mnt/libhugetlbfs)
one drive using the virtual IDE controller
three networks using the Red Hat virtio driver (one for the iSCSI database LUNs, one for network connections to clients, and an alternate client network NIC that was not used)

numactl --cpunodebind=1 --membind=1 /usr/libexec/qemu-kvm -rtc-td-hack -no-hpet \
  --mem-path /mnt/libhugetlbfs -m 5120 -smp 2 -name Mailserver1 -cpu qemu64,+sse2 \
  -uuid 0380a714-6650-9248-a7d0-af3ea4204398 -monitor pty -boot c \
  -drive file=/mnt/vm_01/os.img,if=ide,index=0,boot=on \
  -drive file=,if=ide,media=cdrom,index=2 \
  -net nic,macaddr=54:52:00:47:0f:7a,vlan=0,model=virtio \
  -net tap,script=/etc/qemu-ifup-br0,vlan=0,ifname=vnet0 \
  -net nic,macaddr=54:52:00:43:f4:d4,vlan=1,model=virtio \
  -net tap,script=/etc/qemu-ifup-br2,vlan=1,ifname=vnet1 \
  -net nic,macaddr=54:52:00:29:c9:67,vlan=2,model=virtio \
  -net tap,script=/etc/qemu-ifup-br3,vlan=2,ifname=vnet2 \
  -serial none -parallel none -usb -usbdevice tablet -vnc 127.0.0.1:5901 -k en-us

A few adjustments were also made inside the guests, along with the guest start scripts. For multipath support inside the guest, the Dell EqualLogic Host Integration Toolkit was installed. The database LUNs used for the testing were accessed through each guest's iSCSI initiator. To remove some overhead from testing, logging was turned off on the guest NICs. As part of the test configuration, the Microsoft Exchange Search Indexer service was disabled to avoid file indexing during test runs.
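Once the sysctl change and the hugetlbfs mount are in place, the huge-page pool can be confirmed before any guests are started. This is an illustrative verification step, not one documented in the original procedure:

# Confirm the number of 2048 kB huge pages reserved and still free
grep Huge /proc/meminfo

# The directory passed to --mem-path must be backed by a hugetlbfs mount
mount | grep hugetlbfs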

5 Test Results

Multiple factors can affect scaling. Among these are hardware characteristics, application characteristics, and virtualization overhead.

Hardware: The most important hardware characteristics for the tests in this paper are storage and network throughput and the system architecture. Exchange performance is largely dependent on the disk and network I/O capabilities of the system. This becomes especially important as the number of Exchange mailboxes (and users) increases. The system uses a Non-Uniform Memory Access (NUMA) design, which allows quicker access to nearby memory but, conversely, slower access to remote memory. This architecture has two NUMA nodes, one for each processor. Keeping a process within a NUMA node allows cache sharing and boosts memory access performance.

Application: The type of scaling, up (increased amounts of memory and increased CPU count per guest) or out (multiple instances of similar-sized guests), can affect various applications in different ways. The added memory and CPU power of scaling up typically helps applications that do not contend for a limited resource, whereas scaling out provides multiple instances of any limited resource. However, scaling out may not suit applications that require a high degree of coordination, which in a scale-up configuration can occur in memory. Additionally, virtualization can be used to consolidate multiple independent homogeneous or heterogeneous workloads onto a single server.

Virtualization: Because a guest does not operate directly on the hardware and requires a hypervisor layer, which consumes some processing cycles, any type of virtualization introduces some overhead. The amount of virtualization overhead varies depending on the efficiency of the hypervisor and of the drivers used. In this paper, the results demonstrate the quality of the virtualization of the network and disk I/O subsystems.
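The two-node NUMA topology referenced above can be inspected directly on the host. As an illustrative command, not part of the paper's procedure, numactl reports the node count, the CPUs in each node, and the per-node memory sizes:

# Show NUMA nodes, their CPUs, and local memory sizes
numactl --hardware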

5.1 Scaling Multiple 2-vCPU Guests

This section presents the results obtained when executing multiple 2-vCPU Windows guests/virtual machines in a single physical host. Figure 4 is a schematic illustrating the configuration as multiple 2-vCPU guests are added.

Figure 4: Scaling Multiple 2-vCPU Guests

Figure 5 presents the scalability achieved by increasing the number of 2-vCPU Windows guests from one to eight. These tests demonstrate very good scalability in the number of users supported, up to four guests, with acceptable increases in response time. Beyond four guests, the latency increases become somewhat larger as the configuration relies increasingly on hyper-threading on this two-socket, quad-core Dell PowerEdge R710 server, which has eight cores and 16 hyper-threads. (Note: 1 vCPU = 1 hyper-thread.)

Figure 5: Results of Scaling Multiple 2-vCPU Guests

5.2 Scaling Multiple 4-vCPU Guests

This section presents the results obtained when executing multiple 4-vCPU Windows guests/virtual machines in a single physical host. Figure 6 is a schematic illustrating the configuration as multiple 4-vCPU guests are added.

Figure 6: Scaling Multiple 4-vCPU Guests

Figure 7 presents the scalability achieved by increasing the number of 4-vCPU Windows guests from one to four. These tests demonstrate good scalability in the number of users supported, up to four guests, with acceptable increases in response time on this two-socket, quad-core Dell PowerEdge R710 server, which has eight cores and 16 hyper-threads. (Note: 1 vCPU = 1 hyper-thread.)

Figure 7: Results of Scaling Multiple 4-vCPU Guests

5.3 Scaling Multiple 8-vCPU Guests

This section presents the results obtained when executing multiple 8-vCPU Windows guests/virtual machines in a single physical host. Figure 8 is a schematic illustrating the configuration as multiple 8-vCPU guests are added.

Figure 8: Scaling Multiple 8-vCPU Guests

Figure 9 presents the scalability achieved by increasing the number of 8-vCPU Windows guests from one to two. These tests demonstrate almost identical response times for one and two guests on this two-socket, quad-core Dell PowerEdge R710 server, which has eight cores and 16 hyper-threads. (Note: 1 vCPU = 1 hyper-thread.)

Figure 9: Results of One & Two 8-vCPU Guests

5.4 Scaling-Up by Increasing the Number of vCPUs in a Single Guest

This section presents the results obtained when executing a single Windows guest with an increasing number of vCPUs (from two to eight). Figure 10 is a schematic illustrating the configuration as additional vCPUs are added to the guest.

Figure 10: Scale-Up of RHEV Guest

Figure 11 plots the results when the Exchange workload was run on a single guest with 2, 4, and 8 vCPUs and a corresponding 2.5 GB of memory per vCPU. Latency increases as the guest size (and the corresponding number of users supported) increases. This illustrates that Exchange workloads tend to perform better when spread among several guests.

Figure 11: Results of Scaling Memory and Number of vCPUs in a Guest

5.5 Virtualization Efficiency in Consolidation Scenarios

Figure 12 compares the average latency of various virtual machine configurations totaling 16 vCPUs, fully subscribing the server's 16 hyper-threads. The test was run with eight 2-vCPU guests, four 4-vCPU guests, and two 8-vCPU guests. Latency was best when the load was spread across eight smaller guests rather than across fewer, larger guests. These results indicate that Exchange performs best when scaling out as opposed to scaling up.

Figure 12: Results of Various Exchange Consolidation Scenarios which Fully Subscribe Available CPUs

6 Conclusions

This paper describes the performance and scaling of an industry-standard Exchange workload, generated with Microsoft Exchange Load Generator (LoadGen), running in Microsoft Windows Server 2008 guests under Red Hat Enterprise Linux 5.4 using the KVM hypervisor. The host system was a Dell PowerEdge R710 G6 server equipped with 72 GB of RAM and two sockets, each holding a 2.53 GHz Intel Xeon E5540 (Nehalem) processor with hyper-threading, for a total of 8 cores and 16 threads.

The data presented in this paper clearly establishes that Red Hat Enterprise Linux 5.4 guests using the KVM hypervisor on a Dell PowerEdge R710 provide a stable platform for hosting multiple virtualized Exchange applications. The ability to scale out contributes to the effectiveness of KVM for Exchange applications. The number of users and the latency achievable in any specific customer situation will, of course, depend on the specifics of the application used and the intensity of user activity. However, the results demonstrate that in a heavily virtualized environment, acceptable latency was retained even as the number and size of guests was increased until the physical server was fully subscribed.

7 References

1. Qumranet white paper: KVM: Kernel-based Virtualization Machine