Enabling VMware Enhanced VMotion Compatibility on HP ProLiant servers

Technical white paper

Table of contents

Executive summary
VMware VMotion overview
VMotion CPU compatibility
Enhanced VMotion Compatibility
VMware Enhanced VMotion Compatibility requirements
Intel-based HP ProLiant servers allowed in an EVC cluster
AMD Opteron-based HP ProLiant servers allowed in an EVC cluster
Summary
Appendix A: Enabling processor virtualization options
Appendix B: Migration logs
For more information
Executive summary

VMware VMotion technology allows running virtual machines to move from one physical machine to another with no impact on the virtual machines. VMotion offers improved system utilization with load balancing, increased serviceability and manageability, as well as enhanced flexibility. Administrators can reduce unplanned downtime and can eliminate planned downtime for hardware maintenance, such as disruptive firmware updates. Successful VMotion migration requires CPU compatibility between source and destination ESX hosts. This document explains the use of HP ProLiant servers with VMware Enhanced VMotion Compatibility (EVC) to ensure all hosts in a cluster are VMotion compatible.

Target audience: The intended audience for this document is VMware administrators who intend to deploy HP ProLiant servers in EVC clusters, and purchasing managers who wish to add new HP ProLiant servers to an EVC cluster. It is assumed that you have a working knowledge of VMware Infrastructure 3 (VI3) and/or VMware vSphere 4.

VMware VMotion overview

The VMotion process starts with VMware vCenter performing several checks to verify that the virtual machine to be migrated is in a stable state on the source host and that the destination host is compatible. Next, vCenter begins an iterative pre-copy of the memory state of the source guest to the destination host. The memory pre-copy completes when the amount of changed memory falls below a given threshold or no forward progress is made. The time needed for the memory pre-copy depends on the workload, the amount of memory, and the type of network used for VMotion. This step can complete in seconds, or it can take minutes. Next, the virtual machine is quiesced, and the remaining state is sent to the destination host. At this point, control is transferred from the source host to the destination host. This step typically takes under one second.
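The iterative pre-copy described above can be illustrated with a small, self-contained sketch. This is not VMware code: the memory size, dirtying rate, and cutover threshold below are invented for illustration, and real convergence depends on the guest's page-dirtying rate and the VMotion network bandwidth.

```shell
#!/bin/sh
# Simplified model of VMotion's iterative memory pre-copy.
# Hypothetical numbers: 4096 MB of guest memory; each pass re-sends
# the memory dirtied while the previous pass was in flight (modeled
# here as 1/8 of what was just copied), until the remainder falls
# below the cutover threshold.
remaining=4096   # MB still to transfer (hypothetical)
threshold=16     # MB; below this the guest is quiesced and switched over
pass=0
while [ "$remaining" -gt "$threshold" ]; do
    pass=$((pass + 1))
    echo "pre-copy pass $pass: sending ${remaining} MB"
    # memory dirtied during this pass (modeled, not measured)
    remaining=$((remaining / 8))
done
echo "quiescing guest; final ${remaining} MB sent with the device state"
```

If the workload dirties memory faster than the network can copy it, the loop makes no forward progress; this is why the real algorithm also stops after a bounded number of passes.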
The last step is to send the remaining modified memory and start the virtual machine on the destination host. The amount of time needed for this step depends on workload and memory size. In the event that the VMotion operation fails, the virtual machine continues to run on the source host. A VMotion operation can fail for several reasons, such as network latency or unresponsive storage. In many cases, vCenter will display an error message detailing the cause of the failed migration. The ESX hosts also log migration information. See Appendix B: Migration logs for information on which logs to check to help diagnose a failed VMotion migration.

VMotion CPU compatibility

Successful VMotion migration requires that the processors of the destination host be able to execute the same instructions that the processors of the source host were using when the virtual machine was migrated off the source host. Processor clock speeds, cache sizes, and the number of processor cores can vary, but processors must come from the same vendor (Intel or AMD) and present an identical CPU feature set. VMotion compatibility rules prevent unsafe migrations that can make a virtual machine unstable. By default, vCenter only allows live migration with VMotion between source and destination processors with a compatible feature set. If processors do not have a compatible feature set, a CPU mask must be used to make the CPU feature set on the destination host appear identical to the CPU feature set on the source host. For information on VMotion CPU compatibility requirements for Intel processors, see VMware KB Article 1991, http://kb.vmware.com/kb/1991. For information on VMotion CPU compatibility requirements for AMD processors, see VMware KB Article 1992, http://kb.vmware.com/kb/1992.
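The feature-set requirement can be approximated from the Service Console by comparing the CPU flags two hosts advertise. The sketch below uses shortened, hypothetical flag lists; on a real Linux-based host they would come from /proc/cpuinfo (for example, `awk -F: '/^flags/ {print $2; exit}' /proc/cpuinfo`). This illustrates the principle only; it is not the actual test vCenter performs.

```shell
#!/bin/sh
# Hypothetical flag lists for two hosts (shortened for illustration).
host_a="fpu vme sse sse2 ssse3 lm"          # e.g. an older host
host_b="fpu vme sse sse2 ssse3 sse4_1 lm"   # e.g. a newer host

# Features present on host_b but not host_a: a guest started on
# host_b could already be using these, so migrating it to host_a
# would be unsafe.
extra=$(for f in $host_b; do
    case " $host_a " in
        *" $f "*) ;;            # flag present on both hosts
        *) echo "$f" ;;         # flag only on host_b
    esac
done)
if [ -n "$extra" ]; then
    echo "hosts differ; features only on host_b: $extra"
else
    echo "feature sets match; VMotion is compatible in both directions"
fi
```

In this example the newer host exposes sse4_1, so the two hosts are not VMotion compatible without masking; this is exactly the gap that a CPU mask, or EVC, closes.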
Note: VMware does not recommend or support the use of CPU compatibility masks in production environments. For more information on CPU compatibility masks, see http://kb.vmware.com/kb/1993.

Enhanced VMotion Compatibility

Enhanced VMotion Compatibility (EVC) removes the need to set CPU masks manually. EVC is a cluster setting that automatically configures all hosts in the cluster to be VMotion compatible with each other. All guests in the cluster can migrate live to any host in the cluster because guests always see an identical CPU feature set from all hosts in the EVC cluster. Figures 1 and 2 show the EVC options available in VI3 vCenter cluster settings. VI3 has one EVC mode for Intel hosts and one EVC mode for AMD hosts.

Note: Before enabling EVC, or adding a new host to an EVC cluster, it is recommended that the host be updated to the latest HP ProLiant system ROM version.

Figure 1. VI3 EVC cluster setting for AMD hosts
Figure 2. VI3 EVC cluster setting for Intel hosts

Figure 3 shows the VMware EVC modes available in vSphere 4 for AMD hosts. vSphere 4 has four EVC modes for AMD hosts: AMD Opteron Generation 1, AMD Opteron Generation 2, AMD Opteron Generation 3 (no 3DNow!), and AMD Opteron Generation 3.

Figure 3. vSphere EVC cluster setting for AMD hosts
Figure 4 shows the EVC modes available in vSphere 4 for Intel hosts. vSphere 4 has four EVC modes for Intel hosts: Intel Xeon Core 2, Intel Xeon 45nm Core 2, Intel Xeon Core i7, and Intel Xeon 32nm Core i7.

Figure 4. vSphere EVC cluster setting for Intel hosts

EVC uses Intel VT FlexMigration and AMD-V Extended Migration, technologies jointly developed by VMware and the CPU manufacturers, to dynamically turn off selected CPUID feature bits. Intel VT FlexMigration is available in Intel processors with the Intel Core 2 microarchitecture and newer. AMD-V Extended Migration is available in Second-Generation AMD Opteron processors and newer. In VI3, with vCenter 2.5 U2 and hosts running ESX 3.5 U2 or later, there is one EVC baseline for each CPU vendor. For Intel Xeon processor-based EVC clusters, the baseline is the set of CPU features supported by Intel Core 2 (Merom) processors. For AMD processor-based EVC clusters, the baseline is the set of CPU features supported by First- and Second-Generation (Revision E/F) AMD Opteron processors. vSphere 4 includes support for multiple baselines, such as Penryn and Nehalem baselines for Intel-based EVC clusters and a Greyhound baseline for AMD-based EVC clusters. This allows more control over which CPU features are exposed to the guest. There is a tradeoff between compatibility and capability. The most capable baseline exposes the largest subset of CPU features supported by the CPUs in the cluster, but older hardware may not be added to the cluster. The most compatible baseline exposes a minimal set of CPU features to the guest, so older hardware may be added to the cluster. Table 1 lists HP ProLiant server processors supported in EVC clusters.
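The capability side of the tradeoff can be stated simply: the most capable baseline a cluster can run is bounded by its least capable host. A minimal sketch of that rule, using hypothetical rank numbers for the Intel baselines (1 = Merom, 2 = Penryn, 3 = Nehalem, 4 = Westmere):

```shell
#!/bin/sh
# Hypothetical cluster: one Nehalem host (3), one Penryn host (2),
# one Westmere host (4). The most capable baseline all three can
# support is the minimum rank present.
hosts="3 2 4"
baseline=99
for g in $hosts; do
    if [ "$g" -lt "$baseline" ]; then
        baseline=$g
    fi
done
echo "most capable baseline rank for this cluster: $baseline"
```

Here the Penryn host caps the cluster at the Penryn baseline; the Nehalem and Westmere hosts run with their newer features masked off.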
Table 1. Processors in HP ProLiant servers supported in EVC clusters

Baseline: Intel Xeon Core 2
Processors supported:
- Intel Xeon Core 2 (Merom): 51xx, 53xx, 72xx, 73xx series
- Intel Xeon 45nm Core 2 (Penryn): 33xx, 52xx, 54xx, 74xx series
- Intel Xeon Core i7 (Nehalem): 35xx, 55xx series
- Intel Xeon 32nm Core i7 (Westmere): 56xx series

Baseline: Intel Xeon 45nm Core 2
Processors supported:
- Intel Xeon 45nm Core 2 (Penryn): 33xx, 52xx, 54xx, 74xx series
- Intel Xeon Core i7 (Nehalem): 35xx, 55xx series
- Intel Xeon 32nm Core i7 (Westmere): 56xx series

Baseline: Intel Xeon Core i7
Processors supported:
- Intel Xeon Core i7 (Nehalem): 35xx, 55xx series
- Intel Xeon 32nm Core i7 (Westmere): 56xx series

Baseline: Intel Xeon 32nm Core i7
Processors supported:
- Intel Xeon 32nm Core i7 (Westmere): 56xx series

Baseline: First Generation AMD Opteron (vSphere) / Second Generation AMD Opteron (VI3)
Processors supported:
- First Generation AMD Opteron (Rev. E based CPUs): 2xx, 8xx
- Second Generation AMD Opteron (Rev. F based CPUs): 22xx, 82xx
- Third Generation AMD Opteron (Greyhound based CPUs): 23xx, 83xx, 24xx, 84xx

Baseline: Second Generation AMD Opteron
Processors supported:
- Second Generation AMD Opteron (Rev. F based CPUs): 22xx, 82xx
- Third Generation AMD Opteron (Greyhound based CPUs): 23xx, 83xx, 24xx, 84xx

Baseline: Third Generation AMD Opteron
Processors supported:
- Third Generation AMD Opteron (Greyhound based CPUs): 23xx, 83xx, 24xx, 84xx
VMware Enhanced VMotion Compatibility requirements

- All hosts in the cluster must be running ESX Server 3.5 Update 2 or later, and be connected to vCenter Server 2.5 U2 or later. Hosts with Nehalem processors must be running ESX Server 3.5 Update 4 or later.
- All hosts must be licensed for VMotion.
- All hosts in the cluster must use shared storage for guests. Shared storage can be implemented using a Fibre Channel (FC) storage area network (SAN), iSCSI, or network attached storage (NAS).
- All hosts must have access to the same subnets, and network labels for each virtual machine port group should match.
- All hosts require a private gigabit Ethernet network for VMotion.
- All hosts in the cluster must have CPUs from a single vendor, either Intel or AMD.
- All hosts in the cluster must either have hardware live migration support (Intel VT FlexMigration or AMD-V Extended Migration) or have the CPU feature set you intend to enable as the EVC cluster baseline.
- All hosts in the cluster must have hardware virtualization enabled in the BIOS if it is available (Intel Virtualization Technology or AMD Virtualization). See Appendix A: Enabling processor virtualization options for more information on enabling these features on HP ProLiant servers.
- All hosts in the cluster must have execute protection enabled in the BIOS (No-Execute Memory Protection on Intel processors and No-Execute Page-Protection on AMD processors). See Appendix A: Enabling processor virtualization options for more information on enabling these features on HP ProLiant servers.
- All virtual machines in the cluster must be powered off or migrated out of the cluster when EVC is enabled. If the virtual machines are migrated to a host with the same processor type that will be enabled as the baseline for the EVC cluster, VMotion can be used to migrate these virtual machines into the EVC cluster after it is configured.
- Any new hosts added to an existing EVC cluster must have their virtual machines powered off or evacuated before the hosts are added to the EVC cluster.

Note: EVC clusters are supported only if the applications running in the guest are well-behaved, meaning an application uses the relevant CPUID feature flags to detect the existence of a feature. See VMware KB 1005763, http://kb.vmware.com/kb/1005763, for details.
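From a script's point of view, "well-behaved" means testing the advertised feature flags rather than inferring capabilities from the CPU model string, because EVC may hide features that the silicon physically supports. A minimal sketch of that pattern, with a hypothetical EVC-masked flags line standing in for the output of `grep ^flags /proc/cpuinfo`:

```shell
#!/bin/sh
# Hypothetical flags line from a guest whose EVC baseline masks SSE4.1.
flags="fpu vme de pse tsc msr sse sse2 ssse3"
feature="sse4_1"

# Well-behaved detection: test the advertised flag, not the model name.
case " $flags " in
    *" $feature "*) path="optimized" ;;
    *)              path="generic"   ;;
esac
echo "$feature -> using the $path code path"
```

An application that instead assumed "this model always has SSE4.1" could execute a masked instruction after a migration and fault, which is precisely the instability the EVC support statement rules out.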
Intel-based HP ProLiant servers allowed in an EVC cluster

Table 2 lists HP ProLiant server models that have been certified for VMware ESX 3.5 U2 or later, and contain processors compatible with an EVC baseline.

Table 2. HP ProLiant servers allowed in an EVC cluster for Intel hosts

Baseline columns and processor series:
- Merom: Intel Xeon 51xx, 53xx, 72xx, 73xx series
- Penryn: Intel Xeon 33xx, 52xx, 54xx, 74xx series
- Nehalem: Intel Xeon 35xx, 55xx series
- Westmere: Intel Xeon 56xx series

Server models: BL20p G4, BL2x220c G6, BL260c G5, BL280c G6, BL460c G1, BL460c G5, BL460c G6, BL480c G1, BL490c G6, BL680c G5, DL160 G6, DL170h G6, DL180 G6, DL360 G5, DL360 G6, DL360 G7, DL370 G6, DL380 G5, DL380 G6, DL380 G7, DL580 G5, ML330 G6, ML350 G5, ML350 G6, ML370 G5, ML370 G6, SL160z G6, SL170z G6

AMD Opteron-based HP ProLiant servers allowed in an EVC cluster

Table 3. HP ProLiant servers allowed in an EVC cluster for AMD hosts

Baseline columns and processor series:
- Generation 1: AMD Opteron 2xx, 8xx series
- Generation 2: AMD Opteron 22xx, 82xx series
- Generation 3: AMD Opteron 23xx, 83xx, 24xx, 84xx series

Server models: BL25p G1, BL25p G2, BL35p G1, BL45p G1, BL45p G2, BL465c G1, BL465c G5, BL465c G6, BL495c G5, BL495c G6, BL685c G1, BL685c G5, BL685c G6, DL365 G1, DL365 G5, DL385 G1, DL385 G2, DL385 G5, DL385 G6, DL585 G1, DL585 G2, DL585 G5, DL585 G6, DL785 G5, DL785 G6

Summary

VMware VMotion allows virtual machines to migrate from one physical server to another with no downtime. A live migration is undetectable to end users of the virtual machine. Successful migration requires CPU compatibility between source and destination hosts. Maintaining VMotion compatibility between hosts in a cluster can become challenging, especially as new hardware enters the environment. Enhanced VMotion Compatibility simplifies maintaining compatibility between hosts by enabling a baseline set of features for all hosts in an EVC cluster. In order to use an HP ProLiant server in an EVC cluster, several conditions beyond the VMotion requirements must be met. As outlined in Tables 2 and 3, the server model and processor combination must be certified for ESX 3.5 U2 or later, and the processor must be compatible with an EVC baseline.
Appendix A: Enabling processor virtualization options

To enable an HP ProLiant server for VMotion or for use in an EVC cluster, the BIOS settings for the processors must enable Hardware Virtualization (if available) and Execute Protection. On HP ProLiant servers, Hardware Virtualization is labeled Intel Virtualization Technology on Intel-based ProLiant servers and AMD Virtualization on AMD-based ProLiant servers. Execute Protection is labeled No-Execute Memory Protection on Intel-based ProLiant servers and No-Execute Page-Protection on AMD-based ProLiant servers. Figures A-1a through A-3c below show the paths to enable these features using the ROM-Based Setup Utility (RBSU). The RBSU is accessed by pressing F9 during POST. The path to the relevant BIOS options depends on the server model. Figure A-1a shows a common step for both Intel- and AMD-based ProLiant G5 (or earlier) servers. Advanced Options is selected first.

Figure A-1a. Advanced Options selection
Figure A-1b shows the path to RBSU Processor Options on an Intel-based HP ProLiant BL460c G6 server. System Options is selected first.

Figure A-1b. System Options selection on an HP ProLiant BL460c G6 server
Some ProLiant servers list processor options such as Hardware Virtualization in the Advanced Options section of RBSU, as shown in Figure A-2a. On those servers, Hardware Virtualization can be enabled in the Advanced Options section. The RBSU of most ProLiant servers has a Processor Options selection, as shown in Figures A-2b and A-2c.

Figure A-2a. RBSU Advanced Options list with Hardware Virtualization options for an Intel-based ProLiant server
Figure A-2b. RBSU Processor Options selection
Figure A-2c shows RBSU Processor Options on an Intel-based HP ProLiant BL460c G6 server.

Figure A-2c. Processor Options selection on an HP ProLiant BL460c G6 server

Figure A-3a shows the RBSU processor options for many AMD-based HP ProLiant servers. In Figure A-3a, No-Execute Page-Protection and AMD Virtualization (if available) must be enabled for AMD-based HP ProLiant servers. Some AMD-based HP ProLiant servers, such as the HP ProLiant DL585 G1, do not have the AMD Virtualization option available.

Figure A-3a. RBSU AMD processor options
Figure A-3b shows RBSU Processor Options on many Intel-based HP ProLiant servers. No-Execute Memory Protection and Intel Virtualization Technology (if available) must be enabled.

Figure A-3b. Intel Virtualization Technology processor option

Figure A-3c shows RBSU Processor Options on an Intel-based HP ProLiant BL460c G6 server. No-Execute Memory Protection and Intel Virtualization Technology must be enabled.

Figure A-3c. No-Execute Memory Protection processor option on an HP ProLiant BL460c G6 server

Once the Hardware Virtualization options have been enabled, press <Esc> three times, then F10, to exit the RBSU and reboot the server.
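After rebooting, a quick sanity check from a Linux-based Service Console is to look for the relevant flags in /proc/cpuinfo (nx for execute protection; vmx or svm for Intel VT or AMD-V). Note the caveat: a present vmx/svm flag shows that the processor supports the feature; whether the BIOS actually enabled it is reported by ESX itself, so use the RBSU screens above as the authoritative check. The flags line below is a hypothetical sample standing in for `awk -F: '/^flags/ {print $2; exit}' /proc/cpuinfo`.

```shell
#!/bin/sh
# Hypothetical flags line from an Intel-based host (padded with spaces
# so whole-word matching works).
flags=" fpu vme nx lm vmx ssse3 "
for f in nx vmx svm; do
    case "$flags" in
        *" $f "*) echo "$f: advertised by the processor" ;;
        *)        echo "$f: not advertised" ;;
    esac
done
```

On this sample an Intel host advertises nx and vmx but not svm, which is the expected pattern; an AMD host would show svm instead of vmx.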
Appendix B: Migration logs

Several logs can be used to troubleshoot a failed migration on an ESX host. The first step is to obtain the migration ID, which is listed on the left of each row of /proc/vmware/migration/failed. The files /proc/vmware/migration/active and /proc/vmware/migration/history can also be used to obtain the migration ID. The following command can be used in the ESX Service Console to retrieve the migration ID in the case of a failed VMotion operation:

cat /proc/vmware/migration/failed

Once the migration ID of a failed VMotion operation is obtained, search for it in several of the other ESX logs, such as the /var/log/vmkernel logs. The following command returns the log entries related to the migration:

grep <migration id> /var/log/vmkernel*

The vmware-*.log logs are located in the guest's home directory on the source host. The location of these logs depends on the datastore that holds the virtual machine files and the name of the virtual machine. For a virtual machine on a VMFS datastore, the path to the logs would be similar to /vmfs/volumes/<datastore>/<vm name>/vmware-*.log. The following command returns the log entries related to the migration:

grep <migration id> /vmfs/volumes/<datastore>/<vm name>/vmware-*.log

It may also be useful to grep for the migration ID in the hostd logs, /var/log/vmware/hostd*.log, on both the source and destination hosts:

grep <migration id> /var/log/vmware/hostd*.log

Examination of these logs may give you clues as to what caused the migration to fail.
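The individual grep commands above can be wrapped in a small helper that takes the migration ID once and searches every log in turn. This is a convenience sketch, not a VMware-provided tool; the log paths are passed as arguments so the defaults shown in the comment (which mirror the locations in this appendix) are only an assumption about a particular host's layout.

```shell
#!/bin/sh
# search_migration_logs <migration id> <log path>...
# Greps each existing log for the migration ID; silently skips paths
# that do not exist (for example, an unexpanded glob with no matches).
search_migration_logs() {
    id="$1"
    shift
    for log in "$@"; do
        if [ -e "$log" ]; then
            grep "$id" "$log" || true   # no match in one log is not an error
        fi
    done
}

# On an ESX host the call would resemble (paths from this appendix;
# <datastore> and <vm name> left as placeholders):
#   search_migration_logs <migration id> /var/log/vmkernel* \
#       /var/log/vmware/hostd*.log \
#       /vmfs/volumes/<datastore>/<vm name>/vmware-*.log
```

Because the shell expands the globs at the call site, each matching log file arrives as its own argument, and hosts with rotated logs (vmkernel.1, vmkernel.2, and so on) are searched automatically.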
For more information

HP Virtualization with VMware: http://www.hp.com/go/vmware
HP ProLiant servers: http://www.hp.com/go/proliant
HP ProLiant servers VMware support matrix: http://h20219.www2.hp.com/enterprise/cache/505363-0-0-0-121.html
VMware SiteSurvey and CPU Identification Utility: http://www.vmware.com/download/shared_utilities.html
VMware Hardware Compatibility Guide search: http://www.vmware.com/resources/compatibility/search.php?
VMotion Compatibility Info Guide: http://www.vmware.com/files/pdf/vmotion_info_guide.pdf
Basic System Administration Guide for ESX 3.5 U2: http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_admin_guide.pdf
vSphere Resource Management Guide: http://www.vmware.com/pdf/vsphere4/r40/vsp_40_resource_mgmt.pdf
VMotion and CPU Compatibility FAQ: http://kb.vmware.com/kb/1005764
Enhanced VMotion Compatibility (EVC) processor support: http://kb.vmware.com/kb/1003212
Detecting and Using CPU Features in Applications: http://kb.vmware.com/kb/1005763
General VMotion Intel processor compatibility information: http://kb.vmware.com/kb/1991
General VMotion AMD processor compatibility information: http://kb.vmware.com/kb/1992

To help us improve our documents, please provide feedback at http://h20219.www2.hp.com/activeanswers/us/en/solutions/technical_tools_feedback.html.

Copyright 2009-2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

AMD and AMD Opteron are trademarks of Advanced Micro Devices, Inc. Intel, Core and Xeon are trademarks of Intel Corporation in the U.S. and other countries.

4AA2-6016ENW, created April 2009; updated May 2010, Rev. 1