Microsoft Exchange Server 2007 and Hyper-V high availability configuration on HP ProLiant BL680c G5 server blades

Contents

Executive summary
Introduction
Exchange 2007 Hyper-V high availability configuration
   Exchange 2007 topology
   Storage configuration
   Hyper-V configuration
Software
Test harness
Performance testing and analysis
   Workload validation
   Hyper-V physical servers
   Mailbox server virtual machines
      Processor utilization
      Storage subsystem performance
      Additional Exchange counters
   Hub transport virtual machines
      Processor utilization
      Message queuing
Summary
For more information

Executive summary

Over the past several years, a number of technology advancements have caused administrators to rethink the traditional model of Microsoft Exchange Server architecture design: deploying Exchange servers on hardware separate from other Exchange servers (and other applications) to provide isolation, regardless of whether the underlying server hardware is being efficiently utilized. With processor advancements increasing the number of cores on an individual processor die, improvements in virtualization technology, and the move to a role-based topology, an alternative design methodology is gaining traction for Microsoft Exchange Server 2007 (Exchange): leverage virtualization to move away from the scale-out model and consolidate onto fewer, larger systems.

This white paper describes a configuration using this design concept. The solution is architected to support 4,000 users in a highly available, fault-tolerant configuration using two HP ProLiant BL680c G5 server blades and Microsoft Hyper-V virtualization technology. The paper covers the specifics of the configuration and design goals, as well as the performance results of running a simulated workload against the environment. The high-level results of the performance tests are as follows:

- Performance is well within acceptable thresholds under both normal and peak Exchange workloads for normal operating conditions, when both servers are online.
- Performance is well within acceptable thresholds under both normal and peak workloads for the scenario in which one of the servers has failed and gone offline.

Target audience: The information contained in this white paper is intended for solutions architects, engineers, and project managers involved in the planning and design of Microsoft Exchange Server 2007 solutions. The expectation is that readers are familiar with both Exchange 2007 and Hyper-V technology.

Introduction

Over the past several years, processor architectures have undergone significant advancements. Among other changes, processors have gone from a single-core design to supporting as many as six cores on a single processor die. This has allowed HP to pack more and more processing power into an individual server. For example, the HP ProLiant DL580 G5 server and ProLiant BL680c G5 server blade can scale up to as many as 24 cores with four 6-core processors. As this trend continues, fewer and fewer applications can fully utilize the processing power available on systems with 16 or 24 cores. Specifically, Exchange does not see significant performance benefits when scaling beyond 8 cores, and certainly not beyond 16 cores, in a single server.

This leaves system architects with a critical design decision to make. One approach is the scale-out model, in which a greater number of smaller servers are deployed. However, this methodology can clash with ongoing datacenter initiatives around consolidation and lowering power, cooling, and real estate costs. The alternative approach is to utilize virtualization technologies and stack multiple workloads on fewer, larger servers. Both designs provide unique advantages and challenges, and the model can often be dictated by the applications being deployed, as well as the server models that a company has standardized on.

With the release of Microsoft Exchange Server 2007, Microsoft formalized the concept of a server role with five distinct Exchange roles. And with the release of the Microsoft Hyper-V hypervisor, there is now official support for deploying four of the five Exchange roles in a virtualized environment [1]. Virtualization can provide significant benefits in helping to reduce costs in the datacenter. With virtualization, companies can reduce the total server count needed to support Exchange, thus reducing the real estate footprint and helping to reduce power and cooling requirements.

This white paper provides configuration details and the performance results of deploying Exchange 2007 in a virtualized environment on larger 16-core servers. The purpose of the solution is to provide a building-block approach to deploying Exchange using larger multi-core systems. High availability and fault tolerance were key considerations when designing this solution due to the risk of having multiple virtual machines (VMs) impacted by the loss of a single server. Thus, two servers make up the building block design, coupled with Exchange 2007 cluster continuous replication (CCR) technology.

Exchange 2007 Hyper-V high availability configuration

The approach outlined in this white paper is to utilize a building block methodology for the Exchange server infrastructure. The benefit of using a building block approach is that each discrete block is a known and tested configuration. Then, depending on the size of the Exchange organization, multiple building blocks can be rolled out in increments to satisfy the required number of mailboxes in the environment.

The building block configuration described in this white paper had several key design goals. One goal was to minimize the number of physical servers in the environment. This drove the design towards higher-end, multi-core systems using Hyper-V virtualization to run multiple VMs on one server. The server choice for this building block was the HP ProLiant BL680c G5 server blade. Another design criterion was to provide high availability for the Exchange service and to minimize the impact of a single server or storage array failure. Thus, the base building block configuration was two BL680c server blades with redundant Exchange infrastructure role VMs and Exchange CCR clustered mailbox servers.

[1] The Unified Messaging server role is not supported in a virtualized environment. Please see http://technet.microsoft.com/en-us/library/cc794548.aspx for more information on virtualization with Microsoft Exchange Server.

Exchange 2007 topology

The diagram below (Figure 1) shows the hardware topology for the Exchange environment, which includes the server and storage configuration that is described in further detail in this section of the white paper.

Figure 1. Exchange 2007 hardware architecture diagram

Exchange 2007 does not generally benefit from scaling up beyond 8 processor cores in a single server for a typical corporate mailbox server role configuration (4,000 to 5,000 users per server). The BL680c server blades, however, are four-socket, quad-core systems supporting a total of 16 processor cores. Thus, to maximize the utilization of the resources on the BL680c and decrease the number of physical servers required for the Exchange deployment, virtualization technology was required. For this topology, Microsoft Hyper-V is installed as the hypervisor layer on the BL680c server blades.

Two HP ProLiant BL680c server blades make up the Exchange building block unit. Each server blade has an identical hardware configuration, as described in the following table (Table 1).

Table 1. HP ProLiant BL680c hardware configuration

Processor:        4 Quad-Core Intel Xeon E7340 processors (2.40 GHz, 2x4 MB L2 cache)
Memory:           40 GB PC2-5300 Fully Buffered DIMMs (667 MHz)
Internal storage: 2 x 146 GB 10K SFF SAS drives (RAID 1); HP Smart Array P400i Controller with battery-backed write cache (BBWC)
Networking:       4 x 1 Gb NIC ports total: 2 NC373i Multifunction Gigabit Server Adapters, 1 NC326i Dual Port Gigabit Server Adapter
Management:       Integrated Lights-Out 2 Standard Blade Edition (iLO 2)
Density:          Full height

Each BL680c server blade runs five Hyper-V virtual machines (VMs), for a total of ten VMs supporting the Exchange requirements. This provides the following server roles in the Exchange environment:

- Two Active Directory (AD) servers
- Two Client Access Servers (CAS)
- Two Hub Transport (HT) servers
- Two Exchange 2007 Cluster Continuous Replication (CCR) clusters (4 VMs total)

Each Exchange 2007 CCR cluster is designed to support 2,000 Exchange users with 500 MB mailboxes, for a total of 4,000 Exchange users in the solution. This configuration provides fault tolerance in the event of a single VM failure or the loss of an entire physical server. And by using larger 16-core servers, the infrastructure in this solution can be reduced from ten physical servers to two physical servers. The VM layout is depicted in Figure 2 and described in more detail in the Hyper-V configuration section below.

Note: There are various ways a discrete building block unit can be built, and the approach to take should be evaluated on a customer-by-customer basis.

The BL680c server blades are installed in an HP BladeSystem c7000 enclosure. Details of the c7000 enclosure are listed in Table 2.

Table 2. HP BladeSystem c7000 enclosure information

Network:    2 x HP GbE2c Ethernet Blade Switch for c-Class BladeSystem
SAN:        2 x Brocade 4Gb SAN Switch for HP c-Class BladeSystem
Management: HP Onboard Administrator
Height:     10U

Storage configuration

To support the external storage requirements, including the Hyper-V VM OS LUNs along with the external storage requirements for the Exchange environment, an HP StorageWorks 8100 Enterprise Virtual Array (EVA8100) was utilized. The EVA8100 was installed with 168 300 GB 10K rpm drives. However, as shown in Figure 1, only 88 drives were used in this performance testing, with 80 drives remaining unused.

To calculate the Exchange database and transaction log storage requirements and the number of disks necessary to support the workload, the latest version of the HP Sizing Tool for Microsoft Exchange Server 2007 was used. This tool provides recommended server and storage hardware configurations based on the Exchange user profiles. To download and install this free tool, please go to www.hp.com/solutions/activeanswers/exchange.

Based on the results of the tool, five disk groups were configured on the EVA array. Four disk groups are dedicated to supporting the Exchange transaction logs and database files. The fifth disk group hosts ten LUNs used as the boot disks for the ten virtual machines in this configuration. Each disk group used a failure protection level of single, and all LUNs were configured as VRAID1.

The decision was made to use 16 storage groups per mailbox server for this Exchange solution. This resulted in 125 users per storage group, yielding an estimated database size of less than 100 GB per storage group when factoring in the 500 MB mailbox capacity along with various overhead factors. Since this is a CCR configuration, each storage group can only contain a single mailbox database.

Note: The decision to use 16 storage groups was made in order to provide a consistent number of users per storage group while maintaining a database size of less than 100 GB. This number can vary depending on the specifics of the Exchange environment, and it does not necessarily reflect a best practice recommendation.
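
The per-storage-group math above can be reproduced with a short calculation. The following Python sketch is illustrative only; the 40% overhead factor is an assumed placeholder for the "various overhead factors" mentioned above (deleted-item retention, database white space, and growth headroom), not a figure taken from the HP sizing tool.

    # Illustrative sizing sketch for the 16-storage-group layout described above.
    # The overhead factor is an assumption for illustration; use the HP Sizing Tool
    # for Microsoft Exchange Server 2007 for real designs.

    USERS_PER_MAILBOX_SERVER = 2000
    STORAGE_GROUPS_PER_SERVER = 16
    MAILBOX_QUOTA_GB = 0.5          # 500 MB mailbox capacity
    OVERHEAD_FACTOR = 1.4           # assumed allowance for dumpster, white space, growth

    users_per_sg = USERS_PER_MAILBOX_SERVER / STORAGE_GROUPS_PER_SERVER
    db_size_gb = users_per_sg * MAILBOX_QUOTA_GB * OVERHEAD_FACTOR

    print(f"Users per storage group: {users_per_sg:.0f}")      # 125
    print(f"Estimated database size: {db_size_gb:.1f} GB")     # ~87.5 GB, below the 100 GB target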

The following table (Table 3) describes the EVA disk group configuration, LUN layout, and VRAID levels.

Table 3. EVA storage configuration

EVA1 Transaction Logs
   Drive count: 8
   LUN layout: 16 LUNs, transaction logs CCR1; 16 LUNs, transaction logs CCR2
   LUN size: 50 GB
   VRAID: 1
   Failure protection: Single

EVA2 Transaction Logs
   Drive count: 8
   LUN layout: 16 LUNs, transaction logs CCR2; 16 LUNs, transaction logs CCR1
   LUN size: 50 GB
   VRAID: 1
   Failure protection: Single

EVA1 Databases
   Drive count: 32
   LUN layout: 16 LUNs, databases CCR1; 16 LUNs, databases CCR2; 1 LUN, hub transport queue
   LUN size: 150 GB (database), 100 GB (HT queue)
   VRAID: 1
   Failure protection: Single

EVA2 Databases
   Drive count: 32
   LUN layout: 16 LUNs, databases CCR1; 16 LUNs, databases CCR2; 1 LUN, hub transport queue
   LUN size: 150 GB (database), 100 GB (HT queue)
   VRAID: 1
   Failure protection: Single

Hyper-V VM Boot LUNs
   Drive count: 8
   LUN layout: 10 LUNs, VM boot LUNs
   LUN size: 50 GB
   VRAID: 1
   Failure protection: Single

In order to simplify the setup for this testing, a single array was configured. Normally, in a production environment, separate disk arrays would be used to host the two copies of data in order to eliminate the storage array as a single point of failure. In this example, the disk groups marked EVA2, which are presented to the second server, would be hosted on a second storage array. This drops the total storage requirement for a single array down to 48 drives: 32 drives for the databases, 8 drives for the logs, and 8 drives for the OS boot LUNs. Thus, if the production Exchange environment is made up of a single 4,000-user building block, two EVA4100s (or EVA4400s) could be used to host the Exchange storage requirements. If additional building blocks are required, there is the option to deploy either the EVA6100 or EVA8100 arrays as the capacity and I/O demands increase to support larger Exchange topologies. For more information on the HP StorageWorks EVA family, please visit www.hp.com/go/eva.

One other consideration to point out is that in this configuration, a single disk group (EVA1 Databases, for example) contains database LUNs for both CCR configurations. Prior to Exchange 2007 SP1, this would have been a potential performance problem, as the passive LUN I/O activity was often two to three times higher than that of the active LUNs. Thus, the I/O activity on the passive LUNs could negatively impact the response time of the active LUNs in that disk group. With Exchange 2007 SP1, this condition has been resolved by maintaining a persistent ESE (Extensible Storage Engine) cache on the passive node. This helps reduce the passive I/O activity to a level equal to or lower than that of the active LUN I/O activity. And the test results described below show that performance is not impacted when mixing the active and passive LUNs in a single disk group that is sized appropriately, both under normal operating conditions and after a failure scenario. If there is a desire to further isolate the workloads, the alternative approach would be to break the 32-disk disk groups into two 16-disk disk groups.

Hyper-V configuration

For this Exchange configuration, ten VMs are required to support the 4,000 Exchange users in a highly available, fault-tolerant configuration. To support the infrastructure and non-mailbox Exchange server roles, redundant AD, CAS, and HT server VMs were configured. To support the mailbox server requirements, two Exchange 2007 CCR clusters supporting 2,000 users each were created. To maintain high availability in the event of the loss of one of the BL680c blades, the CCR clusters were configured such that the active node of one cluster is paired with the passive node of the second cluster on the same physical machine. Figure 2 below shows the layout of the individual VMs across the two physical servers.

Figure 2. Hyper-V virtual machine layout

Note that CCR1NODE1VM (active) and CCR2NODE2VM (passive) are configured on the physical server HyperVOne, and CCR1NODE2 (passive) and CCR2NODE1 (active) are configured on the physical server HyperVTwo. Thus, in a failure scenario where an entire node is lost, the passive node for the cluster that has failed becomes the active node and the Exchange mailbox services come back online.
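
The active/passive pairing in Figure 2 can be expressed as a small data structure. The sketch below is illustrative only (role labels are simplified versions of the names in the figure, and it is not Hyper-V or Exchange configuration code); it simply shows why losing either physical server still leaves one node of each CCR cluster online.

    # Illustrative sketch of the Figure 2 layout: each host runs the active node of
    # one CCR cluster and the passive node of the other, so losing either host
    # still leaves one node of every cluster online.

    layout = {
        "HyperVOne": {"AD", "CAS", "HT", "CCR1 (active)", "CCR2 (passive)"},
        "HyperVTwo": {"AD", "CAS", "HT", "CCR1 (passive)", "CCR2 (active)"},
    }

    def surviving_roles(failed_host):
        """Return the roles still online after one physical server is lost."""
        remaining = set()
        for host, roles in layout.items():
            if host != failed_host:
                remaining |= roles
        return remaining

    for failed in layout:
        remaining = surviving_roles(failed)
        both_clusters_up = (any(r.startswith("CCR1") for r in remaining)
                            and any(r.startswith("CCR2") for r in remaining))
        print(f"{failed} offline -> {sorted(remaining)}; "
              f"one node of each cluster still online: {both_clusters_up}")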

To support the Microsoft Hyper-V hypervisor, the BIOS settings in each BL680c server blade were modified. The following two BIOS settings (disabled by default) were enabled:

- No-Execute Memory Protection
- Intel Virtualization Technology

After the BIOS settings were properly configured, the Hyper-V role was installed on each BL680c blade. Specific, detailed information on installing and configuring Microsoft Hyper-V is outside the scope of this document. For more information on installing Hyper-V on the HP BladeSystem, please read the "Implementing Microsoft Windows Server 2008 Hyper-V on HP ProLiant servers" white paper available at http://h20000.www2.hp.com/bc/docs/support/supportmanual/c01516156/c01516156.pdf.

Once the Hyper-V role was installed, the first things to configure were the virtual networks. For this configuration, three virtual networks were configured identically on both BL680c server blades, as shown in Table 4.

Table 4. Hyper-V virtual network configurations

Network Name         Speed    Connection Type
Production           1 Gbps   External
Production Cluster   1 Gbps   External
Private Cluster      1 Gbps   External

The next step in setting up the environment was to configure the disks for the VMs to use as boot disks. As described in the Storage configuration section above, five LUNs were created and presented to each physical server to serve as the boot LUNs for the virtual machines. These LUNs were initialized and a primary partition was created within the parent partition. Then, new fixed virtual hard disks were created to be used during the virtual machine creation process. Once this step was complete, the virtual machines themselves were created. The following table (Table 5) details the configuration specifics for each server role. The VMs were identically configured on both physical servers, so this table depicts a single server configuration.

Table 5. Hyper-V VM configuration details

AD:                    2 processor cores; 4 GB memory; virtual network: Production; storage: IDE (boot)
HT:                    2 processor cores; 4 GB memory; virtual network: Production; storage: IDE (boot), SCSI 1 pass-through LUN for the HT queue
CAS:                   2 processor cores; 4 GB memory; virtual network: Production; storage: IDE (boot)
CCR Mailbox (active):  4 processor cores; 12 GB memory; virtual networks: Production, Production Cluster, Private Cluster; storage: IDE (boot), SCSI 32 pass-through LUNs for Exchange logs and DBs
CCR Mailbox (passive): 4 processor cores; 12 GB memory; virtual networks: Production, Production Cluster, Private Cluster; storage: IDE (boot), SCSI 32 pass-through LUNs for Exchange logs and DBs
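
As a quick cross-check of the allocations in Table 5 against the physical resources in Table 1, the short sketch below sums the per-VM assignments for one host. It is illustrative only, and simply confirms that the five VMs on a blade fit within the 16 cores and 40 GB of RAM installed.

    # Illustrative cross-check of Table 5 allocations against one BL680c host (Table 1).

    host_cores, host_memory_gb = 16, 40

    vms = {
        "AD":                    {"cores": 2, "memory_gb": 4},
        "HT":                    {"cores": 2, "memory_gb": 4},
        "CAS":                   {"cores": 2, "memory_gb": 4},
        "CCR mailbox (active)":  {"cores": 4, "memory_gb": 12},
        "CCR mailbox (passive)": {"cores": 4, "memory_gb": 12},
    }

    used_cores = sum(vm["cores"] for vm in vms.values())       # 14 of 16 logical processors
    used_memory = sum(vm["memory_gb"] for vm in vms.values())  # 36 of 40 GB, leaving room for the parent partition

    print(f"vCPUs assigned: {used_cores}/{host_cores}")
    print(f"Memory assigned: {used_memory}/{host_memory_gb} GB")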

The AD, HT, and CAS VMs were each allocated 2 CPU cores and 4 GB of memory. Each of these VMs was also assigned the Production virtual network to communicate with the other VMs and the rest of the network. The only difference between these virtual machines was that the HT VMs had an additional 100 GB pass-through SCSI LUN. This LUN was used as a dedicated LUN for the HT queue database and transaction logs to ensure sufficient disk performance.

Note: Each BL680c server blade has 16 DIMM slots and supports up to 128 GB of RAM. Additional physical memory can be added to the server if the AD and/or HT virtual machines require additional memory to be allocated.

Both the active and passive VMs for the CCR mailbox servers were allocated 4 CPU cores and 12 GB of memory. The mailbox servers had two additional virtual networks assigned along with the Production network: the Production Cluster and Private Cluster networks, which are dedicated to cluster communications. In this case, the separate Production Cluster network was configured to provide additional bandwidth and help balance traffic across the physical NIC ports in the parent server. However, this is not a requirement, and a single production network would be sufficient to support both the cluster and client network traffic for this Exchange environment. The Private Cluster network, on the other hand, is a requirement, as this network is utilized as the heartbeat network for the cluster.

Along with the additional networking requirements, each of the mailbox VMs has 32 pass-through LUNs for Exchange transaction log and database storage: 16 transaction log LUNs and 16 database LUNs per server. The specific EVA storage configuration was described in greater detail above in the Storage configuration section.

Software

The configuration described in this white paper utilized the following software components:

- Microsoft Windows Server 2008 Enterprise x64 Edition (SP1)
- Microsoft Exchange Server 2007 Enterprise Edition (SP1 and Update Rollup 7)
- Hyper-V Update for Windows Server 2008 x64 Edition (KB950050)
- Microsoft Forefront Security for Exchange Server

Microsoft Forefront Security software was installed on each of the hub transport servers, along with the mailbox servers, with the default option of allowing the installer to select the scanning engines. This provides a closer approximation of a real-world deployment in the test environment, with the overhead of multiple scanning engines working at both the hub transport and mailbox server layers.

Test harness

To simulate a production Exchange workload, the Microsoft Exchange Load Generator (LoadGen) tool was utilized. LoadGen can be used to simulate the key messaging functions of a MAPI workload representative of clients using Office Outlook 2007. The latest version of LoadGen (Build 8.02.0045), downloadable from the web, was used in this testing. To download LoadGen and for more information, please visit http://www.microsoft.com/downloads/details.aspx?familyid=0fdb6f14-1e42-4165-bb17-96c83916c3ec&displaylang=en.

In this environment, three BL460c server blades were used to drive load against the Exchange configuration (as shown in Figure 1 above). One of the BL460c server blades is used as a LoadGen master control client, which allows multiple clients to be controlled from a single test machine. The other two BL460c blades were used to simulate the load.

The default Heavy Outlook 2007 cached mode LoadGen profile was used for client simulation. This generates 80 messages received per user/day and 20 messages sent per user/day for an 8-hour workday simulation, which would be considered a normal load against the servers. Testing was also performed to simulate a peak workload with a 4-hour simulation day. This effectively doubles the amount of work performed in an 8-hour period, to 160 messages received per user/day and 40 messages sent per user/day.

Performance testing and analysis

In order to validate the solution beyond a hypothetical level, a number of performance tests were conducted under both normal and failure conditions. These tests were run to show that, under a simulated messaging workload, the solution could deliver the necessary performance to support the two 2,000-user CCR clusters. As mentioned above, the Microsoft LoadGen simulation tool was used to generate a MAPI workload that emulates a production environment. MAPI traffic was the primary focus of the testing, and other workloads (OWA, for example) were not emulated. In situations where there is a heavy percentage of OWA (or other Exchange protocol) traffic, it is highly recommended to use LoadGen to simulate the additional workloads, as this could impact the design and configuration of the Exchange environment.

The validation testing was composed of the six test points shown in Table 6.

Table 6. Validation test points

Test  Scenario  Physical Servers  Users                  Workload Simulation
1     Base      2                 2000 (single cluster)  Normal 8-hr
2     Base      2                 2000 (single cluster)  Peak 4-hr
3     Normal    2                 4000 (two clusters)    Normal 8-hr
4     Normal    2                 4000 (two clusters)    Peak 4-hr
5     Failure   1                 4000 (two clusters)    Normal 8-hr
6     Failure   1                 4000 (two clusters)    Peak 4-hr

Of the six tests, half were conducted using the normal 8-hour workday simulated load (tests 1, 3, and 5), and the other half used the peak 4-hour workday load (tests 2, 4, and 6). This provided a normal and a peak performance result for each of the three tested scenarios: base, normal, and failure. The first test state, tests 1 and 2, is the base scenario in which just a single CCR cluster (2,000 users) is running. This testing was conducted to ensure that no performance anomalies existed in the environment prior to running the full 4,000-user load. Tests 3 and 4 represent the normal operating environment, in which both physical servers are online and both CCR clusters are active. Tests 5 and 6 represent the failure condition, in which one of the physical servers (HyperVTwo) goes offline and both CCR cluster active nodes are running on one physical server (HyperVOne).

The following sections detail the results of the performance testing, with a primary focus on the performance of the mailbox servers and a secondary look at the performance of the hub transport servers. A number of performance counters were analyzed to ensure that acceptable performance criteria were being met; these are discussed in greater detail below.

Workload validation

In any comparative performance analysis, the first thing to validate is that the workload between successive test runs is consistent and that the correct amount of work is being generated. In this testing, LoadGen was used to simulate two workloads. One workload was designed to simulate a typical 8-hr workday. The second workload was designed to simulate peak conditions by doubling the amount of work and using a 4-hr workday simulation. Note that the peak workload test still runs for the same duration (10 hours) as the normal workload test; the difference is that the workload generates twice as much activity over that time period. The table below shows the amount of work per user that each run should be simulating.

Table 7. LoadGen simulated workload details

Workload                          8-hr   4-hr
Messages received per user/day    80     160
Messages sent per user/day        20     40

The graphs in Figures 3 and 4 below show the derived messages received per user/day and messages sent per user/day for the 8-hr and 4-hr workday simulations, respectively. These values are derived from the MSExchange IS Mailbox\Message Recipients Delivered and MSExchange IS Mailbox\Messages Sent counters.

Figure 3. 8-hour simulation day workload

Figure 4. 4-hour simulation day workload

The results depicted in the graphs show that in all cases the correct amount of work is being performed and that the work is consistent between test runs.
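
As a sketch of one way the per-user/day values plotted in Figures 3 and 4 can be derived from the raw Exchange counters, the following Python fragment converts an average per-second delivery rate into a messages-per-user/day figure. The sample rate is a placeholder, not a measured result from this testing.

    # Illustrative derivation of messages received per user/day from the
    # MSExchange IS Mailbox\Message Recipients Delivered counter.
    # The avg_delivered_per_sec value below is a placeholder, not test data.

    USERS = 2000                  # users per CCR cluster
    SIMULATED_DAY_HOURS = 8       # 8-hr normal profile (4 for the peak profile)

    avg_delivered_per_sec = 5.6   # example average of the counter over the run

    messages_per_user_per_day = (avg_delivered_per_sec * SIMULATED_DAY_HOURS * 3600) / USERS
    print(f"Derived messages received per user/day: {messages_per_user_per_day:.0f}")  # ~81 in this example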

Hyper-V physical servers

The next thing to evaluate in the testing was the overall performance of the Hyper-V servers as viewed from the parent partition, specifically relating to processor performance. The traditional method of measuring processor performance is to analyze the \Processor\% Processor Time counter. However, within a virtual machine guest this counter is no longer considered accurate [2]. This is due to a combination of the fact that clock skewing can occur and that the CPU time slices for a VM are spread round-robin across all the physical processors in the server. To solve this problem, Microsoft has introduced a number of new counter sets specifically related to measuring virtualization performance. The two that are most interesting with regard to measuring processor performance are the Hyper-V Hypervisor Logical Processor [3] and Hyper-V Hypervisor Virtual Processor [4] counter sets.

The Hyper-V Hypervisor Logical Processor counter set provides information on all the logical processors in the server as seen from the hypervisor. In the BL680c there are four quad-core processors, which provide a total of 16 logical processors that are managed by the hypervisor. The Hyper-V Hypervisor Virtual Processor counter set provides performance data on how each virtual processor allocated to a VM is performing relative to that VM.

To get an overall understanding of the CPU utilization on the system, the % Total Run Time counter in both the Hyper-V Hypervisor Logical Processor and Hyper-V Hypervisor Virtual Processor sets was analyzed. The % Total Run Time value is the sum of the guest and hypervisor run time on that logical or virtual processor. As the Measuring Performance on Hyper-V white paper [5] describes, the goal is to achieve a balance between the two values. Figures 5 and 6 below display the average % Total Run Time values across the logical and virtual processors for the two physical servers, HyperVOne and HyperVTwo, respectively.

[2] See the following Microsoft blog entry for more details: http://blogs.msdn.com/tvoellm/archive/2008/03/20/hyper-v-clocks-lie.aspx
[3] For more information on the Hyper-V Hypervisor Logical Processor counter set, go to http://blogs.msdn.com/tvoellm/archive/2008/05/09/hyper-v-performance-counters-part-three-of-many-hyper-v-logical-processors-counter-set.aspx
[4] For more information on the Hyper-V Hypervisor Virtual Processor counter set, go to http://blogs.msdn.com/tvoellm/archive/2008/05/12/hyper-v-performance-counters-part-four-of-many-hyper-v-hypervisor-virtual-processor-and-hyper-v-hypervisor-root-virtual-processor-counter-set.aspx
[5] This white paper is available at the following link: http://msdn.microsoft.com/en-us/library/cc768535.aspx

Figure 5. HyperVOne % Total Run Time values for virtual and logical processors

The most important data points in the graph above are the results from test points 5 and 6. This is the failure scenario, in which both active mailbox server VMs are running on the same physical server. Even in this case there is no indication that performance is a concern on HyperVOne, with average % Total Run Time values around 25% for both the logical and virtual processors during the peak 4-hr duration testing.

Figure 6. HyperVTwo % Total Run Time values for virtual and logical processors

In all test scenarios, the difference between the % Total Run Time values for the logical and virtual processor counter sets is between 2 and 3 percentage points, which shows a good balance between the two counter sets. Taking this into account, along with the fact that there is sufficient headroom on the server, indicates that processor performance on the servers is acceptable for these workloads and that there is headroom to support additional peak activity or growth.
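
The logical-versus-virtual comparison above can be automated against a performance counter export. The sketch below assumes the % Total Run Time samples have already been pulled out of a perfmon (or relog-generated) CSV into two lists; the sample values shown are placeholders, not measurements from this testing.

    # Illustrative comparison of Hyper-V Hypervisor Logical Processor and
    # Hyper-V Hypervisor Virtual Processor "% Total Run Time" averages.
    # The sample lists are placeholders; in practice they would come from a
    # counter-log export of the test run.

    logical_pct_total_run_time = [24.1, 25.3, 26.0, 24.8]   # per-interval averages across the 16 LPs
    virtual_pct_total_run_time = [21.9, 22.7, 23.5, 22.4]   # per-interval averages across all vCPUs

    avg_logical = sum(logical_pct_total_run_time) / len(logical_pct_total_run_time)
    avg_virtual = sum(virtual_pct_total_run_time) / len(virtual_pct_total_run_time)

    print(f"Logical processors avg: {avg_logical:.1f}%")
    print(f"Virtual processors avg: {avg_virtual:.1f}%")
    print(f"Difference: {abs(avg_logical - avg_virtual):.1f} points")  # a small gap indicates the balance described above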

Mailbox server virtual machines

There are several areas to evaluate when looking at the performance results from the mailbox server VMs. This includes analysis of processor utilization, disk performance, and various Exchange counters to ensure that acceptable levels of performance were being maintained while under load.

Processor utilization

As shown above, overall CPU performance across the servers was well within the acceptable criteria range. Digging into the data further, the graphs in Figures 7 and 8 below depict the average value of the Hyper-V Hypervisor Virtual Processor\% Total Run Time counter for the Exchange mailbox VM virtual processors for both the normal and peak workload simulations.

Figure 7. Mailbox server VMs virtual processor % Total Run Time values for normal workload

Figure 8. Mailbox server VMs virtual processor % Total Run Time values for peak workload

The first test points on both graphs (tests 1 and 2, representing the 2000-user 8-hr and 2000-user 4-hr data points) show a single cluster under load for the baseline test points. The results show that the % Total Run Time on the active cluster node (blue line, CCR2NODE1) is two to three times that of the passive cluster node (red line, CCR2NODE2). This trend continues in test points 3 and 4 on the two graphs as well (the 4000-user 8-hr and 4000-user 4-hr test points). Across the board, there is also a noticeable increase in % Total Run Time when comparing the 8-hr to the 4-hr simulation data points, with the highest peaks in the upper-twenty to low-thirty percent range. As the overall processor data showed, there are sufficient processor resources and headroom on the mailbox server VMs, and the processor resource is not a source of performance concern for this solution.

Storage subsystem performance

Another key component to evaluate is the performance of the storage subsystem. While Exchange 2007 is not as I/O intensive as previous Exchange versions, the application is still sensitive to disk latencies and requires a properly designed storage infrastructure.

Table 8 below details the storage performance for test points 3 through 6, covering both the normal and failure scenarios for the two workloads. The first two data points are not shown, as only one cluster was under load and disk performance was well within the acceptable performance thresholds. To simplify the presentation, the disk transfers per second value is the sum across all the database or log LUNs, while the latency values are an average across the respective LUNs. And since performance across both CCR clusters was consistent, the value shown for each test run is the average of the two clusters.

Table 8. Storage I/O and latency results

Counter                                            Test 3         Test 4         Test 5          Test 6
                                                   (8-hr normal)  (4-hr normal)  (8-hr failure)  (4-hr failure)
Database disk transfers/sec, active node           214.05         400.5          244             413
Database disk transfers/sec, passive node          134.7          261            NA              NA
Transaction log disk transfers/sec, active node    127.1          244.5          120.5           232.5
Transaction log disk transfers/sec, passive node   33.9           63.4           NA              NA
Database avg. disk sec/read, active node           10 ms          10.7 ms        9.4 ms          10.8 ms
Database avg. disk sec/write, active node          4.2 ms         4.3 ms         4.1 ms          4 ms
Transaction log avg. disk sec/write, active node   1 ms           1 ms           1 ms            1 ms
Transaction log avg. disk sec/read, active node    5.75 ms        11 ms          2.3 ms          5.4 ms

The important thing to analyze is the latency on the database and log LUNs. The typical goal is to maintain average latencies below 20 ms for database reads and writes, and below 10 ms for log writes. The test results show that the latency values are well below these thresholds for all test points: database read latency is around 10 ms, while database write latency is below 5 ms. Also note the increased read latency on the transaction log LUNs due to the ongoing log shipping replication mechanism.

It is important to point out that for the failure scenario (test points 5 and 6) there is negligible impact on disk latency. In these tests, both CCR clusters' active database and log LUNs are in the same disk group. This shows that placing multiple CCR cluster LUNs in the same disk group is acceptable when the disk group is sized properly. However, in an extreme event in which the activity on one CCR cluster spikes, there could still be an impact on the LUNs of the other CCR cluster sharing the same spindles. This is a design decision that must be considered when building out the storage infrastructure for Exchange.
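
A simple way to apply the latency targets above (and the counter thresholds in the next section) to a counter log is to compare the measured averages against the stated limits. The sketch below is illustrative; note that the perfmon Avg. Disk sec/Read and Avg. Disk sec/Write counters report seconds, so the values are converted to milliseconds before comparison, and the sample numbers are placeholders rather than results from this testing.

    # Illustrative pass/fail check of disk latency averages against the targets
    # described above (database reads/writes < 20 ms, log writes < 10 ms).
    # Sample values are placeholders; perfmon reports Avg. Disk sec/* in seconds.

    thresholds_ms = {
        "database read": 20.0,
        "database write": 20.0,
        "log write": 10.0,
    }

    # Average of the relevant Avg. Disk sec/* counters across the LUNs, in seconds.
    measured_sec = {
        "database read": 0.0100,
        "database write": 0.0042,
        "log write": 0.0010,
    }

    for name, limit_ms in thresholds_ms.items():
        value_ms = measured_sec[name] * 1000.0
        status = "PASS" if value_ms < limit_ms else "FAIL"
        print(f"{name}: {value_ms:.1f} ms (target < {limit_ms:.0f} ms) -> {status}")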

Additional Exchange counters

When evaluating the passing and failing criteria of a performance test, there are also several additional Exchange-related counters to analyze to make sure no other issues have occurred that could impact performance. Tables 9 and 10 contain the additional Exchange counters evaluated in this testing for the two clusters (Ex2K7CCR1 and Ex2K7CCR2, respectively). The following three counters were analyzed:

- MSExchange IS\RPC Requests: average value < 50 and maximum < 100
- MSExchange IS\RPC Average Latency (ms): average value < 50 ms and maximum < 100 ms
- MSExchange IS Mailbox\Messages Queued for Submission: average value < 250 and maximum < 1000

Table 9. Ex2K7CCR1 Exchange performance counters

Counter                                               Test 1  Test 2  Test 3  Test 4  Test 5  Test 6
MSExchange IS\RPC Requests                            1       2       1       2       2       3
MSExchange IS\RPC Average Latency (ms)                5 ms    4 ms    6 ms    5 ms    7 ms    7 ms
MSExchange IS Mailbox\Messages Queued for Submission  1       3       1       2       3       3

Table 10. Ex2K7CCR2 Exchange performance counters

Counter                                               Test 1  Test 2  Test 3  Test 4  Test 5  Test 6
MSExchange IS\RPC Requests                            NA      NA      1       2       1       2
MSExchange IS\RPC Average Latency (ms)                NA      NA      5 ms    6 ms    7 ms    7 ms
MSExchange IS Mailbox\Messages Queued for Submission  NA      NA      2       2       1       3

Exchange performance is predicated on the performance of the core components of the infrastructure (CPU, memory, disk, and network). If performance on the key subsystems is acceptable, Exchange will typically deliver good levels of response time and performance. In this set of tests, processor and storage subsystem performance is well within acceptable levels and, as expected, the additional Exchange counters are also below their threshold levels. The RPC requests and latency values are well below an average of 50, and there is very little message queuing, with values below 5 messages. This indicates that the hub transport servers have no issues picking up and routing mail from the mailbox servers in this Exchange environment.

Hub transport virtual machines

While the focus of the testing and performance analysis has been on the CCR clustered mailbox servers, another important component to understand is the hub transport servers. An architectural change in Exchange 2007 is that every message is routed through a hub transport (HT) server. Thus, ensuring that mail flow is not impacted by HT server performance is important, especially in the failure scenario in which only a single HT virtual machine remains online.

Processor utilization

The HT servers are allocated two processor cores per VM. Figure 9 shows the average Hyper-V Hypervisor Virtual Processor\% Total Run Time value for the two virtual processors on each of the HT VMs.

Figure 9. Hub Transport server VMs virtual processor % Total Run Time values

The first thing to notice is that for the first four test points, load is equally balanced between the two HT servers. Also note the fairly linear pattern in processor utilization as the load is increased, both in going from the normal to the peak workload and from 2,000 users to 4,000 users. From the 2000-user 8-hr test point (test 1) to the 2000-user 4-hr test point (test 2), the % Total Run Time value jumps from 6% to 14%. From the 4000-user 8-hr test (test 3) to the 4000-user 4-hr test (test 4), the jump is from 14% to 27%. At the final test point, 4000 users with the 4-hr workload and only a single HT VM (test 6), the % Total Run Time increases to 51%. Thus, even with a single HT VM, there is still headroom for additional processor consumption on the VM. However, processor utilization is only one component of VM performance. To truly understand whether there is a performance impact when running with only a single HT VM in the environment, the active mailbox delivery queue length needs to be analyzed.

Note: The virtual processor utilization values include the overhead of running Microsoft Forefront Security for Exchange 2007 on the HT VMs.
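
The roughly linear scaling above also explains the failure-scenario result: when one host is offline, the surviving HT VM absorbs the traffic that was previously split between two VMs, so its utilization is expected to land near the sum of the two balanced values. The sketch below simply restates that reasoning with the figures quoted above; it is an illustration, not a sizing model.

    # Illustrative check of the hub transport utilization figures quoted above.
    # With two HT VMs sharing the 4000-user peak load, each ran at about 27%;
    # consolidating onto one VM should land near the sum of the two.

    per_ht_vm_peak_pct = 27          # test 4: two HT VMs online
    expected_single_ht_pct = 2 * per_ht_vm_peak_pct
    measured_single_ht_pct = 51      # test 6: one HT VM online

    print(f"Expected single-HT utilization: ~{expected_single_ht_pct}%")
    print(f"Measured single-HT utilization: {measured_single_ht_pct}%")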

Message queuing

Figure 10 below shows the maximum HT server MSExchange Transport\Active Mailbox Delivery Queue Length values for each test. This counter measures how many messages are waiting to be routed to a specific mailbox server. The threshold for acceptable performance of this counter is that the maximum value should remain below 1,000 for the duration of the test.

Figure 10. Hub Transport Server Active Mailbox Delivery Queue Lengths

The first thing that is obvious is that for all test points, the maximum value for the active mailbox delivery queue length is far below the 1,000-message threshold. During the test periods there is very little sustained queuing, and the spikes are well within the acceptable threshold. For the first four test points, the maximum queue lengths on the two HT VMs are close and, like the processor utilization values above, indicate equal load balancing between the servers. When the failure scenario is simulated, the queue lengths on the remaining HT VM roughly double in size. However, these are still well below the 1,000-message threshold, and all indications are that the remaining HT VM is able to satisfy the routing requirements for both CCR clusters while the second physical server is offline.

Summary

This white paper documents the configuration details and performance results of a two-server building block based on HP ProLiant BL680c G5 server blades. Each BL680c server blade in the building block runs five Hyper-V virtual machines (VMs), for a total of ten VMs. This provides the following server roles in the Exchange environment:

- Two Active Directory (AD) servers
- Two Client Access Servers (CAS)
- Two Hub Transport (HT) servers
- Two Exchange 2007 Cluster Continuous Replication (CCR) clusters (4 VMs total)

Each Exchange 2007 CCR cluster is designed to support 2,000 Exchange users with 500 MB mailboxes, for a total of 4,000 Exchange users in the solution. This configuration provides fault tolerance in the event of a single VM failure or the loss of an entire physical server. And by using larger multi-processor, multi-core BL680c server blades, the infrastructure in this solution can be reduced from ten physical servers to two physical servers.

Performance testing was conducted using the Microsoft LoadGen utility to simulate a production Exchange workload. Under both normal and peak workloads, performance of the building block configuration was well within acceptable performance thresholds, both under normal operating conditions and during a server failure event.

For more information

For more information on planning, deploying, or managing Microsoft Exchange Server on HP ProLiant servers and HP storage, see www.hp.com/solutions/exchange or www.hp.com/solutions/activeanswers/exchange

For more information on planning, deploying, or managing a Hyper-V based virtual infrastructure on ProLiant servers, see http://h18004.www1.hp.com/products/servers/software/microsoft/virtualization/

For more information on HP BladeSystem, see www.hp.com/go/bladesystem

For more information on HP ProLiant servers, see www.hp.com/go/proliant

For more information on HP Storage solutions, see www.hp.com/go/storage

To help us improve our documents, please provide feedback at http://h20219.www2.hp.com/activeanswers/us/en/solutions/technical_tools_feedback.html.

Technology for better business outcomes

Copyright 2009 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. Intel and Xeon are trademarks of Intel Corporation in the U.S. and other countries.

4AA2-7270ENW, June 2009