Intel Virtualization Technology: Examining VT-x and VT-d
August 2007, v1.0
Peter Carlston, Platform Architect, Embedded & Communications Processor Division
Intel, the Intel logo, Pentium, and VTune are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Agenda
- Software Virtualization Challenges
- Silicon Enhancements for Virtualization
- New Processor Hardware Architecture Extensions
- New Memory and I/O Controller Extensions
- New 1/10Gb Ethernet NIC Extensions
- Intel New Product Cadence & Penryn
- Summary
Copyright 2007, Intel Corporation. All rights reserved.

Non-Virtualized System
- The OS runs in Ring 0 and uses time slicing to run multiple applications.
- OS drivers control all access to the platform hardware.
Figure: applications in Rings 1-3, the operating system in Ring 0, and the platform hardware underneath.

Software-Based VMM Challenges
- OS Ring-0 code now runs in Rings 1-3: the OS is de-privileged.
- The VMM must resolve potential conflicts between OSes through binary patching, paravirtualization, etc.
- This can lead to performance and stability issues.
Figure: guest OSes in Rings 1-3; the Virtual Machine Monitor operates in Ring 0, the traditional OS domain, on top of the platform hardware.

Intel Processor Virtualization Technology
- Applications run in Ring 3 as expected; applications remain unchanged.
- The OS runs at privilege level 0 as expected:
  - No excessive faulting
  - No expensive SW virtualization hacks
  - Improved performance and stability
- The VMM now runs in a new CPU execution mode:
  - HW-based mode transitions
  - Memory protection in HW
  - VMM is independent of HW
  - VMM controls memory paging state and exceptions
Figure: non-root mode (applications at 3D, OSes at 0D) above root mode (light-weight Virtual Machine Monitor at 0P), above the platform hardware (memory, processors, graphics, network, storage, keyboard/mouse).

VMM Types
- Hosted Virtual Machine Monitor: the VMM sits on top of a host OS and uses the host OS device drivers.
- Hypervisor VMM: the primary software layer directly on top of the hardware; it has built-in device drivers.
Figure: left, VM 1 through VM N (apps on guest OSes) alongside host apps, on a hosted VM monitor, on the host OS, on the physical host hardware; right, VM 1 through VM N (apps on guest OSes) on a hypervisor VM monitor, on the physical host hardware.

Virtualization Event Frequencies with VT-x
Figure: bar chart of events per million instructions (axis 0 to 250), Base vs. VT-x, for SYSmark Internet and SYSmark Office. Event categories: interrupt handling, control-register accesses, I/O operations, other.
The VT-x architecture substantially reduces the frequency of virtualization events a VMM needs to process.

Evolution of Intel Hardware Virtualization Technology
- Vector 1: Processor Focus (VT-x). Establish a foundation for virtualization in the IA-32 and Itanium architectures, followed by ongoing evolution of support:
  - Micro-architectural (e.g., lower VM switch times)
  - Architectural (e.g., extended page tables (EPT))
- Vector 2: Platform Focus (VT-d, public specs). Hardware support for I/O-device virtualization:
  - Device DMA remapping
  - Direct assignment of I/O devices to VMs
  - Device-independent control over DMA
- Vector 3: I/O Assisted Virtualization (IOV, NICs). Standards for I/O-device sharing:
  - Multi-context I/O devices
  - Endpoint device translation caching
  - Under definition in the PCI-SIG*
- Vector 4: Trust (TXT, Trusted Execution Technology):
  - Secure launch
  - Memory protection, hardened keys
- VMM software evolution, from past to today:
  - Past, no hardware support: software-only VMMs, binary translation, paravirtualization
  - Simpler and more secure VMMs through a foundation of virtualizable ISAs
  - Increasingly better CPU and I/O virtualization performance and functionality as I/O devices and VMMs exploit the infrastructure provided by VT-x and VT-d

I/O Virtualization Without VT-d
Figure: emulation-based virtual I/O. Labels: (1) a VM partition with an application and Driver A (with buffer) on an OS, running on a virtualized HW instance exported by the VMM; (2) applications in Ring 3 on an unmodified OS (e.g., Linux*) in Ring 0, running on a virtualized HW instance backed by VMM virtual device emulation, with Driver A and Driver B (each with a buffer) in the VMM; I/O device hardware underneath (Device A, Device B, hardware NIC).

VT-d Features
- DMA remapping
  - Multi-level page tables allow SW to manage host physical memory and set up a hierarchy of page directories and tables
  - SW controllability for page walk snooping
  - Super page support (i.e., >4KB)
- DMA fault logging
  - Fault recording registers
  - Advanced fault logging uses a memory-resident fault log
- Interrupt remapping
  - Routes based on originator ID
  - Applicable to all interrupt sources

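The DMA remapping bullet can be made concrete with a small software model of a page walk. This is only a sketch under stated assumptions: it uses two levels, 9 index bits per level, 4 KB pages, and 2 MB super pages, and the names `dir_slot`, `page_table`, and `dma_translate` are hypothetical. It does not reproduce the actual VT-d table or entry formats.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry for illustration only (not the VT-d entry layout). */
#define LEVEL_BITS  9u
#define ENTRIES     (1u << LEVEL_BITS)
#define PAGE_SHIFT  12u                        /* 4 KB base page */
#define SUPER_SHIFT (PAGE_SHIFT + LEVEL_BITS)  /* 2 MB super page */
#define PTE_PRESENT 1ull

enum slot_kind { SLOT_NONE = 0, SLOT_TABLE, SLOT_SUPER };

/* Leaf table: each entry holds a page-aligned host address plus a present bit. */
typedef struct { uint64_t pte[ENTRIES]; } page_table;

/* Directory slot: absent, a pointer to a leaf table, or a super-page mapping. */
typedef struct {
    enum slot_kind kind;
    page_table    *table;       /* valid when kind == SLOT_TABLE */
    uint64_t       super_base;  /* valid when kind == SLOT_SUPER */
} dir_slot;

/* Translate a device (DMA) address to a host address.
 * Returns 0 on success, -1 on a fault that a real unit would log. */
int dma_translate(const dir_slot dir[ENTRIES], uint64_t dma, uint64_t *host)
{
    uint64_t di = dma >> SUPER_SHIFT;
    if (di >= ENTRIES)
        return -1;                              /* outside the mapped range */

    const dir_slot *s = &dir[di];
    if (s->kind == SLOT_SUPER) {                /* walk ends one level early */
        *host = s->super_base + (dma & ((1ull << SUPER_SHIFT) - 1));
        return 0;
    }
    if (s->kind != SLOT_TABLE || s->table == NULL)
        return -1;                              /* directory entry not present */

    uint64_t pte = s->table->pte[(dma >> PAGE_SHIFT) & (ENTRIES - 1)];
    if (!(pte & PTE_PRESENT))
        return -1;                              /* page not present */

    *host = (pte & ~0xFFFull) | (dma & 0xFFFull);
    return 0;
}
```

A caller would fill one directory per assigned device (or per VM), then treat every -1 return as a DMA fault to record, which is the role the fault recording registers play in hardware.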
Intel Hardware Virtualization Technology for Directed I/O (VT-d)
Figure: Virtual Machine 1 with applications in Ring 3 and unmodified operating systems in Ring 0, above the VMM and its memory map; CPU #1 (VMX, VMCS, I$/D$, APIC); MCH with PCI Express root ports and the VT-d remapping scheme, which provides device assignment and address translation for the NIC (VMD(q)).

Direct Assigned I/O via VT-d
Figure: direct assigned I/O. Labels: a VM partition with an application and Driver A (with buffer) on an OS; applications in Ring 3 on an operating system in Ring 0, running on a virtualized HW instance exported by the VMM; DMA re-map between the VMs and the I/O device hardware (Device A, Device B, hardware NIC).

VT-x and VT-d Summary
- The outstanding performance/watt ratios of modern multi-core processors are enabling new use models, including virtualization
- Intel Virtualization Technology:
  - Increases system performance
  - Improves system robustness
  - Allows creation of simpler/smaller hypervisors
- Part of industry momentum towards improved system performance, security, and trust

Receive Side Scaling (RSS)
- The Receive Side Scaling feature is now designed into Intel NICs
- The NIC driver configures a redirection table in the NIC
- The NICs provide queues; each queue is assigned a MAC address
- The NIC hardware performs a five-tuple hash of the arriving packet's IP address and returns a queue #
- The NICs are MSI-X enabled, so an interrupt can be generated per queue
- Interrupts can be assigned to individual cores (interrupt affinity): the IRQ handler lives in the core's L2 cache, which greatly increases performance
- NICs can also be configured to load balance IRQs across cores
Figure: Cores 0-3, each with I$ and D$, sharing L2 caches in pairs; MSI-X interrupts through the MCH; NIC receive queues rx_que0 through rx_que3.
Hash sketch shown on the slide:
    hash = (tcp->th_sport) ^ (tcp->th_dport) ^ (ip->ip_src.s_addr) ^ (ip->ip_dst.s_addr);
    hash = hash % PRIME_NUMBER;
    return lookup_table[hash];

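The hash fragment on the RSS slide can be fleshed out into a small model of queue selection. The sketch below is illustrative only: the `flow_key` and `redirection_table` types, the value 127 for `PRIME_NUMBER`, and the round-robin table fill are assumptions, and the XOR hash simply follows the slide. Production RSS implementations typically use a keyed Toeplitz hash instead.

```c
#include <stdint.h>

#define PRIME_NUMBER 127u  /* table size; the value is an assumption */
#define NUM_QUEUES   4u    /* rx_que0 .. rx_que3, as drawn on the slide */

/* Header fields used by the slide's hash (hypothetical struct). */
typedef struct {
    uint32_t src_addr, dst_addr;
    uint16_t src_port, dst_port;
} flow_key;

/* Redirection table that the driver would program into the NIC. */
typedef struct { uint8_t queue[PRIME_NUMBER]; } redirection_table;

/* Spread table entries evenly over the queues (one possible policy). */
void rss_table_init(redirection_table *t)
{
    for (uint32_t i = 0; i < PRIME_NUMBER; i++)
        t->queue[i] = (uint8_t)(i % NUM_QUEUES);
}

/* XOR the ports and addresses, then reduce to a table index. */
uint32_t rss_hash(const flow_key *k)
{
    uint32_t h = (uint32_t)k->src_port ^ (uint32_t)k->dst_port
               ^ k->src_addr ^ k->dst_addr;
    return h % PRIME_NUMBER;
}

/* Map a packet's flow to a receive queue number. */
unsigned rss_select_queue(const redirection_table *t, const flow_key *k)
{
    return t->queue[rss_hash(k)];
}
```

Because every packet of a flow carries the same addresses and ports, the flow always lands on the same queue, and therefore on the same core once interrupt affinity is set. The XOR form also maps both directions of a connection to the same queue, since swapping source and destination leaves the result unchanged.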
Receive Side Scaling + Direct Cache Access
Receive:
- Packet received by the network controller
- Arriving packet data is DMAed into system memory
- Packet header marked with DCA tag
- MCH issues a cache hint to the target core
- Core pre-fetches tagged DCA data into cache
Transmit:
- Core posts data to be transmitted into memory
- NIC reads data from memory and sends it out on the wire
- NIC places a completion notice with DCA tag in memory
- Completion is pre-fetched into the core's cache
Figure: Core 0 and Core 1 (I$ and D$ each, shared L2 cache), the MCH (Blackford), memory, and the NIC with its hash, DMA, and DCA paths; numbered arrows 1-3 mark the receive and transmit steps.

Intel Product Cadence
Two years per process generation, alternating TICK and TOCK:
- 65nm: TICK (2005) Pentium D, Xeon, Core Processor; TOCK (2006) Core 2 Processor, Xeon Processor
- 45nm, High-k dielectric: TICK (2007) Penryn family; TOCK (2008) Nehalem
- 32nm: TICK Westmere; TOCK Sandy Bridge
All product information and dates are preliminary and subject to change without notice.

Extended Life-Cycle Silicon
Products shown: Silverthorne/Poulsbo, Intel Core 2 Duo (dual core), Intel Xeon 5100 (dual core), Intel Xeon 5300 (quad-core).
Max thermal design power points shown: 1W, 8W, 35W, 40W.
- Industry Leading Performance Per Watt (from Ultra Mobile to HPC): ultra fine-grained power management; enhanced sleep states; SOC; large L2 caches; threading tools; performance analyzers; C++ & Fortran* compilers
- Integrated Graphics / Floating Point Processing (integrated and external): rich Internet experience; Gen 4 graphics core; new graphics/FP instructions; Larrabee; Intel Performance Primitives (media & signal processing); Intel Math Kernel Library
- I/O Optimization: Greater Bandwidth; Less Latency (1 Gb, 10 Gb, > FPGA): interrupt coalescing; per-core queues; Direct Cache Access; Front Side Bus-attached FPGAs
- Information Assurance (MILS): Intel Virtualization Technology; Trusted Execution Technology; I/O device virtualization

Next Generation Intel 45 nm High-k Process Technology (Penryn Family)
- ~2x larger transistor budget provides freedom to add new features and higher performance with cost-effective die sizes
- >20% faster transistor switching speed delivers higher core speeds and increased instructions per clock
- Lower leakage current reduces power consumption, or enables more capability and performance within a given power envelope, compared to 65nm processors

Improved OS Synchronization Primitive Performance (Penryn Family)
- Faster locked instruction performance
  - Key primitive for multiple thread synchronization
  - Faster locks enable more concurrency between threads
  - Up to 55-80% faster
- Example spin lock sequence
  - Used for controlling access to shared resources (e.g., I/O, kernel state)
  - Applicable to MT/MP OS environments
    spin_lock:
        lock dec [edi]        ; atomic decrement
        jns  lock_acquired    ; exit if lock was 1
    spin:
        pause                 ; otherwise loop until
        cmp  [edi], 0         ;   lock is released
        jle  spin
        jmp  spin_lock        ; try to reacquire lock
    lock_acquired:
        ret
- Faster interrupt masking control
  - Execution time is critical to the OS for shared resource control
  - Uarch improvements eliminate pipeline stalls in the common case
  - CLI/STI instructions as much as 100% faster
- Faster access to the Time Stamp Counter
  - RDTSC instruction as much as 3x faster
  - Key functionality for database servers; OS time-of-day services are frequent in transaction processing
- Improves performance scalability

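For readers more comfortable with C than x86 assembly, the spin lock sequence on this slide can be approximated with C11 atomics. This is a sketch rather than the slide's code: the `spinlock` type and function names are hypothetical, the lock word uses the same convention (1 means free, 0 or below means held), and the `pause` hint relies on a GCC/Clang builtin that is assumed to be available on x86 builds.

```c
#include <stdatomic.h>

/* Lock word convention from the slide: 1 = free, <= 0 = held. */
typedef struct { atomic_int value; } spinlock;

/* Spin-wait hint; maps to PAUSE on x86 with GCC/Clang, otherwise a no-op. */
static inline void cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_ia32_pause();
#endif
}

void spin_init(spinlock *l)
{
    atomic_init(&l->value, 1);
}

void spin_lock(spinlock *l)
{
    for (;;) {
        /* Counterpart of "lock dec [edi]; jns lock_acquired":
         * a previous value of 1 means the decrement took the lock. */
        if (atomic_fetch_sub_explicit(&l->value, 1,
                                      memory_order_acquire) >= 1)
            return;

        /* Counterpart of the "spin:" loop: read without writing
         * until the holder releases, then retry the decrement. */
        while (atomic_load_explicit(&l->value, memory_order_relaxed) <= 0)
            cpu_relax();
    }
}

void spin_unlock(spinlock *l)
{
    /* Release by restoring the "free" value. */
    atomic_store_explicit(&l->value, 1, memory_order_release);
}
```

The inner read-only loop matters for the same reason the slide cares about locked instruction cost: waiters only issue the expensive atomic decrement when the lock looks free, so the cache line is not bounced between cores while the lock is held.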
Penryn Family Virtualization Performance Improvements
Penryn improves VT-x instruction context switch times by ~25-75%.
Figure: VM 0 and VM 1 (apps on Guest OS 0 and Guest OS 1) above the VM monitor and physical host hardware, connected by VM exit and VM entry transitions.
Figure: VM transition latency* in clocks (lower is better), Merom family vs. Penryn family, for VMlaunch, VMresume, VMexit (vm call), and VMexit (#PF). Source: Intel.
* Comparisons based on RTL simulations via micro-kernel workloads with cache hits.