Intel Virtualization and Server Technology Update
Petar Torre, Lead Architect, Service Provider Group
29 March 2012

Legal Disclaimer
Intel may make changes to specifications and product descriptions at any time, without notice.
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations.
Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See www.intel.com/products/processor_number for details.
Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel Virtualization Technology requires a computer system with a processor, chipset, BIOS, virtual machine monitor (VMM) and applications enabled for virtualization technology. Functionality, performance or other virtualization technology benefits will vary depending on hardware and software configurations. Virtualization technology-enabled BIOS and VMM applications are currently in development.
64-bit computing on Intel architecture requires a computer system with a processor, chipset, BIOS, operating system, device drivers and applications enabled for Intel 64 architecture. Performance will vary depending on your hardware and software configurations. Consult with your system vendor for more information.
Intel, Intel Xeon, Intel Core microarchitecture, and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Agenda
  Virtualization review
  Invisible technology
  Intel Xeon processors in Cisco UCS: E5-2600, E7-2800, E7-4800

Virtualization: Evolving Towards the Enterprise Cloud
  Virtualization 1.0, Consolidation: operational expense efficiency
  Virtualization 2.0, Flexible Resource Management: dynamic resource allocation
  Virtualization 3.0, Enterprise Cloud: automation and resource scalability

Virtualization Data Center Foundation
Data center requirements
  Infrastructure scale: massive scaling, balanced platform, RAS is critical
  Lower TCO: network consolidation
  Multi-tenancy: resource sharing, security isolation
Data center foundation
  Unified network (10Gb Ethernet for storage and networking)
  Intel Xeon processor based platform (CPU and I/O performance, new technologies)

A Framework for Optimizing Virtualization
Two goals: reduce overhead within each VM, and scale out across VMs.
[Diagram: several guest OSes running on a VMM on one host]
Ways to reduce overheads
  VT-x latency reductions
  Virtual Processor IDs
  Extended Page Tables
  APIC virtualization (FlexPriority)
  I/O assignment via DMA remapping (VT-d)
  VMDq
  1GB pages
Technologies for scaling
  Scaling from EP to EX
  Hyper-Threading
  PAUSE-loop exiting
  SR-IOV
  RAS capabilities
Intel is working with VMware and other VMM vendors to reduce virtualization overhead.
VT-x = Intel Virtualization Technology for IA-32, Intel 64 and Intel Architecture

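As a rough illustration of how an administrator might check for some of the features listed on this slide, the sketch below parses the `flags` line of `/proc/cpuinfo` on a Linux host. The mapping from feature names to flag names (`vmx`, `ept`, `vpid`, `flexpriority`, `aes`) is an assumption based on common Linux kernels and can differ between kernel versions; the function names are illustrative only.

```python
# Assumed mapping from slide feature names to Linux /proc/cpuinfo flag names.
# Flag names can vary by kernel version; treat this table as illustrative.
FEATURE_FLAGS = {
    "VT-x": "vmx",
    "Extended Page Tables": "ept",
    "Virtual Processor IDs": "vpid",
    "FlexPriority": "flexpriority",
    "AES-NI": "aes",
}


def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def detect_features(cpuinfo_text):
    """Map each feature name to True/False depending on whether its flag is present."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FEATURE_FLAGS.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not a Linux host, or file not readable
    for name, present in detect_features(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

A missing flag does not prove the CPU lacks the feature: the BIOS can disable VT-x, and a guest VM may not see the host's flags.
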
Optimizing VT Transition Latencies
Virtual Machine Control Structure (VMCS)
  VMCS holds all essential guest and host register state
  Backed by host physical memory
  Accessed via an architectural VMREAD / VMWRITE interface
  Enables CPU implementations to cache VMCS state on-die
Virtual Processor IDs (VPIDs)
  VMM-specified values used to tag microarchitectural structures (TLBs)
  Used to avoid TLB flushes on VT transitions
[Diagram: the VMM on a physical CPU accesses the VMCS backing page in memory through VMREAD / VMWRITE]
[Chart: round-trip VM exit/entry latency in cycles (axis 0 to 1800), by year, 2007 to 2010]
Significant VT latency reductions over time

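The effect of VPID tagging on TLB flushes can be illustrated with a toy model. This is a deliberately simplified sketch, not a description of the actual hardware: the class `ToyTLB`, the miss counter, and the workload of two guests touching four pages each are all assumptions made for illustration.

```python
class ToyTLB:
    """Toy TLB: entries are optionally tagged with the current VPID."""

    def __init__(self, use_vpid):
        self.use_vpid = use_vpid
        self.entries = {}      # (tag, virtual page) -> placeholder frame number
        self.current = 0
        self.misses = 0

    def switch_to(self, vpid):
        """Model a VM transition: without VPIDs the whole TLB is flushed."""
        if not self.use_vpid:
            self.entries.clear()
        self.current = vpid

    def translate(self, vpage):
        """Look up a page; a miss stands in for a page-table walk."""
        tag = self.current if self.use_vpid else 0
        key = (tag, vpage)
        if key not in self.entries:
            self.misses += 1
            self.entries[key] = len(self.entries)  # placeholder translation
        return self.entries[key]


def run(use_vpid, rounds=3, pages=4):
    """Alternate between two guests and count TLB misses."""
    tlb = ToyTLB(use_vpid)
    for _ in range(rounds):
        for vpid in (1, 2):
            tlb.switch_to(vpid)
            for page in range(pages):
                tlb.translate(page)
    return tlb.misses


if __name__ == "__main__":
    print("misses without VPID:", run(use_vpid=False))
    print("misses with VPID:   ", run(use_vpid=True))
```

In this model the untagged TLB misses on every page after each switch (24 misses), while the tagged TLB only misses on first touch (8 misses), which is the intuition behind avoiding flushes on VT transitions.
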
Making Resource Sharing Safer
[Diagram: four VMs running on a virtual machine monitor]
  Isolate: Intel Virtualization Technology and Intel Trusted Execution Technology (Intel TXT) work together to better isolate VMs
  Measure: Intel TXT measures the VMM for launch protection
  Encrypt: Intel AES New Instructions in the Intel Xeon processor E5 and E7 families quickly encrypt data in flight and at rest
Intel TXT and AES New Instructions in Intel Xeon processors make multi-tenancy more secure

Intel Xeon Processor Families in Cisco UCS
E5 Family: top performance per dollar, energy efficiency, and flexibility for infrastructure apps
  Xeon E5-2600
E7 Family (7000 sequence): scalable performance, flexibility, and advanced RAS for demanding apps and consolidation
  Xeon E7-4800 and E7-2800
  Xeon 7500 and 6500

Typical Workloads
Larger workloads (more/larger instances/users): E7 UCS servers B440 M2, B230 M2, C460 M2, C260 M2
Mid/smaller workloads (fewer/smaller instances/users): 2S volume servers B200 M3, C240 M3, C220 M3
Workload / usage
  Business processing (DB, ERP, CRM, batch)*
  Decision support (data warehouse, business intelligence)
  Large-scale consolidation (including virtualization and multi-tier)
  Application development
  High performance computing
  Collaboration
  Web infrastructure
  IT infrastructure
  Development/quality assurance
* For directional guidance only. This is not a server selection guide. Actual server sizing is a relatively complex effort involving workload characterization including such considerations as type of application, size of workload, number of users, type of transaction, SLA response times, targeted utilization, level and estimation accuracy of workload baseline/peak/growth, physical or managerial constraints, need to maintain a single state at all times to ensure all users see the same results at any given time, cost to migrate to a scale-out alternative, and system availability requirements.

Introducing the Intel Xeon Processor E5 Family
The Heart of a Flexible, Efficient Data Center Built to Scale
  Leadership performance
  Breakthrough I/O innovation
  Trusted security
  Exceptional energy efficiency
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Copyright 2012, Intel Corporation.

Intel Xeon 2S Platform Comparison
Xeon 5500 / 5600 based platform (B200 M2)
  Memory up to 1333 MHz, up to 18 DIMMs per 2S platform
  QPI up to 6.4 GT/s
  Up to 36 PCIe2 lanes, provided by the Intel 5500 Series IOH
  Two-chip IOH / ICH (Intel 5500 Series IOH and Intel ICH 10)
E5-2600 product family based platform (B200 M3)
  Up to 8 cores per socket (Sandy Bridge)
  Memory up to 1600 MHz, up to 24 DIMMs
  Two QPI links at 8.0 GT/s between CPUs
  Integrated PCIe Gen 3: up to 40 lanes per socket, up to 80 lanes per platform
  One-chip Platform Controller Hub (PCH): Intel C600 Series
  Serial Attached SCSI (SAS): 4 ports, 3 Gb/s

Romley Memory New Features
Higher speed memory: 1600 MHz DDR3
  Intel Xeon processor E5-2600/4600 product families (EP)
  1600 MHz support: SR/DR RDIMM, 3SPC/2DPC, or 1DPC LR-DIMMs
  Max 16GB DIMMs, total RAM 256GB @ 1600 MHz
Load Reduced DIMMs
  3 x 32GB DIMMs per channel supported
  Total RAM 768GB @ 1066 MHz

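The two capacity figures on this slide can be checked with simple arithmetic. The sketch below assumes a two-socket platform with four memory channels per socket (consistent with the 24 DIMM slots quoted for the E5-2600 platform at three slots per channel); the function name is illustrative.

```python
def platform_capacity_gb(sockets, channels_per_socket, dimms_per_channel, dimm_gb):
    """Total memory capacity in GB for a fully populated configuration."""
    return sockets * channels_per_socket * dimms_per_channel * dimm_gb


# Assumed topology: 2 sockets, 4 channels per socket.
SOCKETS = 2
CHANNELS = 4

# 1600 MHz case from the slide: 16GB RDIMMs at 2 DIMMs per channel.
rdimm_1600 = platform_capacity_gb(SOCKETS, CHANNELS, dimms_per_channel=2, dimm_gb=16)

# LR-DIMM case from the slide: 32GB LR-DIMMs at 3 DIMMs per channel (1066 MHz).
lrdimm_1066 = platform_capacity_gb(SOCKETS, CHANNELS, dimms_per_channel=3, dimm_gb=32)

if __name__ == "__main__":
    print("RDIMM @ 1600 MHz:", rdimm_1600, "GB")
    print("LR-DIMM @ 1066 MHz:", lrdimm_1066, "GB")
```

Under these assumptions the results match the slide: 256GB at 1600 MHz and 768GB at 1066 MHz.
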
DIMM Types
  UDIMM: address, control and data signals are unbuffered
  RDIMM: address and control signals are buffered in a register
  LR-DIMM: address, control and data signals are all buffered
[Diagram: a UDIMM connects address/control and the data bus directly; an RDIMM adds a register on address/control; an LR-DIMM adds a register with rank multiplication logic on address/control plus data buffers on the data bus]

What is an LR-DIMM?
LR-DIMM = Load Reduced DIMM
An LR-DIMM has additional components on the DIMM which:
  a) Buffer the data bus pins, to reduce electrical loading on the data bus
  b) Provide rank multiplication logic, to logically reduce the number of ranks visible to the memory controller, even though there are likely four ranks physically on the DIMM
Depending on the DIMM, rank multiplication can be disabled, set to 2:1 or set to 4:1 (4:1 makes a quad-rank DIMM logically look like a single-rank DIMM)
More info about LR-DIMMs available here: http://www.edn.com/article/519386-basics_of_lrdimm.php

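Rank multiplication as described on this slide reduces to a small calculation. The sketch below is illustrative only: the function name and the validation rules are assumptions, and real memory controllers impose additional constraints not modeled here.

```python
def logical_ranks(physical_ranks, multiplication=1):
    """Ranks visible to the memory controller for one LR-DIMM.

    multiplication: 1 (disabled), 2 (2:1) or 4 (4:1), as listed on the slide.
    """
    if multiplication not in (1, 2, 4):
        raise ValueError("rank multiplication must be 1, 2 or 4")
    if physical_ranks % multiplication != 0:
        raise ValueError("physical ranks must be divisible by the multiplication factor")
    return physical_ranks // multiplication


if __name__ == "__main__":
    for factor in (1, 2, 4):
        print(f"quad-rank DIMM at {factor}:1 ->", logical_ranks(4, factor), "logical rank(s)")
```

With 4:1 multiplication a quad-rank DIMM presents a single logical rank, so three such DIMMs on one channel present three logical ranks to the controller instead of twelve.
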
Trusted Security
Intel Trusted Execution Technology
  Average organizational cost of a data breach: over $7M per incident [2]
Intel Advanced Encryption Standard New Instructions
  Use of AES encryption [1] has nearly tripled in the last 10 quarters
[1] AES Instructions refers to AES128-SHA1. Source: Akamai Second Quarter 2011 'State of the Internet' Report. See details and report at: http://www.akamai.com/html/about/press/releases/2011/press_102411.html
[2] Source: Ponemon Institute - 2010 Annual Study: Cost of a Data Breach. March 2011
* Other names and brands may be claimed as the property of others.

Xeon Processor Energy Efficiency
Best Data Center Performance per Watt [1]
[Chart: relative performance and system power (peak power under load) for X5675 and E5-2660, showing ~50% higher performance at the same power]
[1] Performance comparison using SPEC_Power results published as of March 6th, 2012. See back up for configuration details. For more information go to intel.com/performance

Intel Xeon Processor E5-2600 Product Family
Virtualized Consolidation 2.0 Performance on VMmark* 2
VMmark* v2.1 performance (score @ number of tiles, higher is better)
  Opteron* 6282 SE (2.6GHz, 16C, 140W): no publication available
  X5690 (3.46GHz, 6C, 130W): 7.59 @ 7 tiles, best published
  E5-2690 (2.9GHz, 8C, 135W): 10.88 @ 10 tiles, best published (1.43X)
Flexible datacenter, multi-host benchmark models application throughput with minimum QoS requirements and virtual infrastructure operations
Measures consolidated application throughput and operations completed
Grow your consolidation capabilities with the Intel Xeon processor E5-2600 product family
Source: Best available publications/submissions as of 6 March 2012. Configuration details: please reference slide speaker notes and back up slides. For more information go to http://www.intel.com/performance

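The 1.43X figure on this slide follows directly from the two published VMmark scores. The short calculation below reproduces it and also shows the score per tile; the variable names are illustrative, and only the numbers quoted on the slide are used.

```python
# (score, tiles) pairs as quoted on the slide.
x5690 = (7.59, 7)
e5_2690 = (10.88, 10)

score_ratio = e5_2690[0] / x5690[0]   # overall VMmark score improvement
tile_ratio = e5_2690[1] / x5690[1]    # growth in the number of tiles sustained

# Score per tile indicates how much throughput each tile achieved.
per_tile_x5690 = x5690[0] / x5690[1]
per_tile_e5 = e5_2690[0] / e5_2690[1]

if __name__ == "__main__":
    print(f"score ratio: {score_ratio:.2f}x")
    print(f"tile ratio:  {tile_ratio:.2f}x")
    print(f"score per tile: X5690 {per_tile_x5690:.3f}, E5-2690 {per_tile_e5:.3f}")
```

The score ratio and the tile ratio are nearly identical, which suggests that the gain comes from running more tiles at roughly the same per-tile throughput.
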
Intel Xeon Processor E5-2600 Product Family
Source: Public white paper published at: http://www.principledtechnologies.com/clients/reports/intel/xeon_e5-2690_consolidation_0312.pdf
For more information go to http://www.intel.com/performance

The Intel Xeon Processor E5 Family
The Heart of a Flexible, Efficient Data Center that's Built to Scale
  Leadership performance: 15 new x86 world records
  Breakthrough I/O innovation: up to 3X I/O performance
  Trusted security: trusted hardware security
  Exceptional energy efficiency: best performance per watt
Cisco servers: B200 M3, C220 M3 and C240 M3
Learn more at: www.intel.com/datacenter

Intel Xeon Processor E7 Family (Westmere-EX)

Intel Xeon Processor E7 Platform (Westmere-EX)
[Diagram: four WSM-EX sockets]
More performance
  10 cores / 20 threads
  30MB of last level cache
More expandable
  Supports 32GB DIMMs (2TB per 4-socket system) [1]
More efficient
  More performance within same max CPU TDP as Xeon 7500
  Lower partial active and idle power via Intel Intelligent Power Technology [2]
  Support for low voltage DIMMs [3]
  Reduced power memory buffers [4]
More security and RAS
  Security: Intel Advanced Encryption Standard New Instructions; Intel Trusted Execution Technology (TXT)
  Reliability, availability, serviceability: Double Device Data Correction (DDDC); Partial Memory Mirroring
Delivers more performance, expandability and RAS while improving energy efficiency
[1] Up to 64 slots per standard 4-socket system x 32GB/DIMM = 2TB
[2] Uses similar core and package C6 power states enabled on Intel Xeon 5500/5600 series processors. Requires OS support.
[3] Savings dependent on workload and configuration. Example: at 100% SPECpower load it can save ~0.8W for a 4GB DIMM DRx8, based on early Intel internal estimates
[4] Memory buffer power savings of up to 1.3W active and 3W idle per buffer, based on internal Intel estimates. Slightly more savings when used with LV DIMMs

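The 2TB figure in the first footnote can be reproduced as a quick calculation. The sketch below uses only the slot count and DIMM size from the footnote; the per-socket figure assumes the 64 slots are spread evenly over the four sockets, which is an assumption rather than something the slide states.

```python
SLOTS_PER_SYSTEM = 64   # from the footnote: standard 4-socket system
DIMM_GB = 32            # largest supported DIMM
SOCKETS = 4

total_gb = SLOTS_PER_SYSTEM * DIMM_GB
total_tb = total_gb / 1024

# Assumes slots are divided evenly across sockets.
per_socket_gb = total_gb // SOCKETS

if __name__ == "__main__":
    print(f"total: {total_gb} GB ({total_tb:.0f} TB)")
    print(f"per socket: {per_socket_gb} GB")
```

Under that assumption each socket addresses 512GB, which agrees with the "512GB memory per skt" figure quoted elsewhere in the deck for the E7 family.
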
2-Socket Platform Comparison
Efficient Performance class: Xeon E5-2600 (two 8-core sockets linked by QPI); servers B200 M3, C220 M3, C240 M3
Expandable class: Xeon E7-2800 (two sockets with I/O hub); servers B230 M2, C260 M2
Comparison scales shown on the slide
  Performance: frequency sensitive to thread sensitive
  Power efficiency: low to high
  Memory capacity: low to high
  Reliability: good to best
Virtualization driving need for compute, memory and reliability

Xeon E7 Family: Meeting the Highest Virtualization Needs
Xeon E7
  10 cores / 20 threads per socket
  2 to 256 socket scaling
  512GB memory per socket
  2X I/O capacity
  Mission critical RAS
Large scale, mission critical virtualization (>8GB)
Infrastructure consolidation (of multi-tier applications)
Headroom for peak and unpredictable demand
Live migration of big workloads
Intel platform virtualization technologies
  Processor: Intel VT-x, Intel VT FlexMigration
  Chipset: Intel VT for Directed I/O
  Network: Intel VT for Connectivity
Optimized for the most demanding virtualization workloads

Machine Check Architecture Recovery
Previously seen only in RISC, mainframe, and Itanium-based systems
MCA Recovery: the system works in conjunction with the OS or VMM to recover or restart processes and continue normal operation
[Diagram: error handling flow]
  Normal status with error prevention
  Error detected*
  HW correctable errors: error corrected
  HW un-correctable errors: error contained, then system recovery with OS
    Error information passed to OS / VMM
    Bad memory location flagged so data will not be used by OS or applications
Allows recovery from otherwise fatal system errors
* Errors detected using Patrol Scrub or Explicit Write-back from cache

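The decision flow on this slide can be summarized as a small piece of logic. The following is a conceptual sketch only: the `Outcome` names, the `handle_memory_error` function and the `retired_pages` set are hypothetical and do not correspond to any real OS, VMM or firmware interface.

```python
from enum import Enum


class Outcome(Enum):
    CORRECTED = "corrected by hardware, execution continues"
    RECOVERED = "error contained, affected process or VM restarted"
    FATAL = "system halt and reboot"


def handle_memory_error(correctable, mca_recovery, page, retired_pages):
    """Decide what happens after a memory error is detected on `page`.

    correctable:   True if hardware can correct the error
    mca_recovery:  True if platform and OS/VMM support MCA Recovery
    retired_pages: set of page addresses the OS/VMM must no longer use
    """
    if correctable:
        return Outcome.CORRECTED      # handled transparently by hardware
    if not mca_recovery:
        return Outcome.FATAL          # uncorrectable error with no recovery path
    retired_pages.add(page)           # flag the bad location so it is not reused
    return Outcome.RECOVERED          # only the affected consumer is restarted


if __name__ == "__main__":
    retired = set()
    print(handle_memory_error(False, True, 0x2000, retired).value)
    print("retired pages:", sorted(retired))
```

The point of the sketch is the third branch: with MCA Recovery an uncorrectable error becomes a contained event affecting one consumer of the bad page rather than the whole system.
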
MCA-R Fault Isolation with vSphere 5.0
Only supported on Intel Xeon processor E7 family and VMware vSphere 5.0
[Diagram: two hosts, each running ten VMs on ESXi 5.0]
Non-MCA system
  Status: good
  Uncorrectable memory error
  Purple screen of death
  Reboot server
  Error logged; report to ESXi for PFA
System with MCA Recovery
  Status: good
  Uncorrectable memory error
  Isolate error; mark memory location bad
  Restart failed VM in pool; VM gets restarted
  Error logged; report to ESXi for PFA
Cisco servers: C460 M2, C260 M2, B230 M2, B440 M2
MCA Recovery provides isolation from otherwise fatal system errors

Intel Xeon Processor E7 Family and VMware* vSphere* 5.0
VMware* vSphere* 5.0 scales to exploit the Intel Xeon processor E7 platform capabilities:
  80-core (160-thread) physical servers
  2 TB physical RAM
  32-vCPU VMs with 1TB virtual RAM
  512 VMs, up to 2048 vCPUs per physical host
VM consolidation on this scale requires hypervisor focus on:
  Scalability, energy efficiency, and uptime (reliability)
  Especially memory reliability
New VMware* vSphere* 5.0 memory reliability features require the Intel Xeon processor E7 family

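The scaling limits on this slide imply a host topology and a vCPU overcommit ratio that can be worked out directly. The sketch below assumes 10 cores and 2 threads per core per E7 socket, as stated on the E7 platform slides; the socket count is inferred from those numbers rather than stated on this slide.

```python
CORES_PER_SOCKET = 10    # E7: 10 cores / 20 threads per socket
THREADS_PER_CORE = 2     # Hyper-Threading


def host_topology(total_cores):
    """Return (sockets, hardware threads) for a host with the given core count."""
    sockets = total_cores // CORES_PER_SOCKET
    threads = total_cores * THREADS_PER_CORE
    return sockets, threads


sockets, threads = host_topology(80)

# Maximum vCPUs per host divided by hardware threads gives the overcommit ratio.
MAX_VCPUS_PER_HOST = 2048
vcpus_per_thread = MAX_VCPUS_PER_HOST / threads

# Average vCPUs per VM if both limits (512 VMs, 2048 vCPUs) are reached together.
avg_vcpus_per_vm = MAX_VCPUS_PER_HOST / 512

if __name__ == "__main__":
    print(f"{sockets} sockets, {threads} threads")
    print(f"up to {vcpus_per_thread} vCPUs per hardware thread")
    print(f"average {avg_vcpus_per_vm} vCPUs per VM at both limits")
```

An 80-core host therefore corresponds to an eight-socket E7 system, and the 2048-vCPU limit allows roughly 12.8 vCPUs per hardware thread.
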
The Intel Xeon Processor E5 & E7 Families For Cisco UCS Servers
The Heart of a Flexible, Efficient Data Center that's Built to Scale
  Leadership performance
  Breakthrough I/O innovation: with E5
  Trusted security: trusted hardware security, E5 & E7
  Exceptional energy efficiency: best performance per watt, E5
  Best RAS with MCA: E7
Learn more at: www.intel.com/datacenter

Thank You for Your Attention