The Microsoft Windows Hypervisor High Level Architecture



Similar documents
Full and Para Virtualization

Hyper-V Server 2008 Getting Started Guide

Microsoft Windows Common Criteria Evaluation

Thomas Fahrig Senior Developer Hypervisor Team. Hypervisor Architecture Terminology Goals Basics Details

Hybrid Virtualization The Next Generation of XenLinux

Server Consolidation with SQL Server 2008

Security Overview of the Integrity Virtual Machines Architecture

Hyper-V Server 2008 Setup and Configuration Tool Guide

Microsoft Hyper-V Server 2008 R2 Getting Started Guide

Windows Server Virtualization & The Windows Hypervisor. Background and Architecture Reference

Windows Small Business Server 2003 Upgrade Best Practices

Windows Server Virtualization & The Windows Hypervisor

Pipeliner CRM Phaenomena Guide Sales Pipeline Management Pipelinersales Inc.

Microsoft Windows Common Criteria Evaluation

Virtual Machines. COMP 3361: Operating Systems I Winter

Virtual machines and operating systems

Kronos Workforce Central on VMware Virtual Infrastructure

Virtualization and the U2 Databases

Pipeliner CRM Phaenomena Guide Administration & Setup Pipelinersales Inc.

Windows Server 2008 R2 Hyper-V Live Migration

Windows Server Virtualization An Overview

COS 318: Operating Systems. Virtual Machine Monitors

Hypervisors. Introduction. Introduction. Introduction. Introduction. Introduction. Credits:

Pipeliner CRM Phaenomena Guide Sales Target Tracking Pipelinersales Inc.

Abstract. Microsoft Corporation Published: August 2009

Pipeliner CRM Phaenomena Guide Getting Started with Pipeliner Pipelinersales Inc.

A Hypervisor IPS based on Hardware assisted Virtualization Technology

UPGRADE. Upgrading Microsoft Dynamics Entrepreneur to Microsoft Dynamics NAV. Microsoft Dynamics Entrepreneur Solution.

FRONT FLYLEAF PAGE. This page has been intentionally left blank

VMware and CPU Virtualization Technology. Jack Lo Sr. Director, R&D

Monitoring Microsoft Hyper-V. eg Enterprise v6.0

KVM: Kernel-based Virtualization Driver

Pipeliner CRM Phaenomena Guide Opportunity Management Pipelinersales Inc.

Virtualization. Pradipta De

Pipeliner CRM Phaenomena Guide Add-In for MS Outlook Pipelinersales Inc.

The 2007 R2 Version of Microsoft Office Communicator Mobile for Windows Mobile: Frequently Asked Questions

kvm: Kernel-based Virtual Machine for Linux

Virtualization Technology. Zhiming Shen

Microkernels, virtualization, exokernels. Tutorial 1 CSC469

CS5460: Operating Systems. Lecture: Virtualization 2. Anton Burtsev March, 2013

Virtualization in the ARMv7 Architecture Lecture for the Embedded Systems Course CSD, University of Crete (May 20, 2014)

Hardware & Software Requirements for BID2WIN Estimating & Bidding, the BUILD2WIN Product Suite, and BID2WIN Management Reporting

Hardware Based Virtualization Technologies. Elsie Wahlig Platform Software Architect

Deploying Microsoft RemoteFX on a Single Remote Desktop Virtualization Host Server Step-by-Step Guide

Windows BitLocker Drive Encryption Step-by-Step Guide

How to Test Out Backup & Replication 6.5 for Hyper-V

Microsoft SQL Server 2008 R2 Enterprise Edition and Microsoft SharePoint Server 2010

Virtualization. Types of Interfaces

Abstract. Microsoft Corporation Published: November 2011

How To Understand The Power Of A Virtual Machine Monitor (Vm) In A Linux Computer System (Or A Virtualized Computer)

Intel Ethernet and Configuring Single Root I/O Virtualization (SR-IOV) on Microsoft* Windows* Server 2012 Hyper-V. Technical Brief v1.

SUSE Linux Enterprise 10 SP2: Virtualization Technology Support

Developing a dynamic, real-time IT infrastructure with Red Hat integrated virtualization

Hypervisors and Virtual Machines

PCI-SIG SR-IOV Primer. An Introduction to SR-IOV Technology Intel LAN Access Division

Knut Omang Ifi/Oracle 19 Oct, 2015

Hands-On Lab: WSUS. Lab Manual Expediting WSUS Service for XP Embedded OS

Virtualization. Jukka K. Nurminen

Virtual Machine Monitors. Dr. Marc E. Fiuczynski Research Scholar Princeton University

Basics of Virtualisation

SmoothWall Virtual Appliance

Basics in Energy Information (& Communication) Systems Virtualization / Virtual Machines

Chapter 5 Cloud Resource Virtualization

Virtualization. Jia Rao Assistant Professor in CS

Nested Virtualization

OSes. Arvind Seshadri Mark Luk Ning Qu Adrian Perrig SOSP2007. CyLab of CMU. SecVisor: A Tiny Hypervisor to Provide

HyperV_Mon 3.0. Hyper-V Overhead. Introduction. A Free tool from TMurgent Technologies. Version 3.0

COS 318: Operating Systems. Virtual Machine Monitors

evm Virtualization Platform for Windows

VMware Server 2.0 Essentials. Virtualization Deployment and Management

System Requirements for Microsoft Dynamics NAV 2013 R2

Credit Card Extension White Paper

BHyVe. BSD Hypervisor. Neel Natu Peter Grehan

Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0

How To Set Up A Load Balancer With Windows 2010 Outlook 2010 On A Server With A Webmux On A Windows Vista V (Windows V2) On A Network With A Server (Windows) On

Product Guide for Windows Home Server

IOS110. Virtualization 5/27/2014 1

Parallels Desktop 4 for Windows and Linux Read Me

Virtual Machines. Virtual Machine (VM) Examples of Virtual Systems. Types of Virtual Machine

Virtualization in Linux KVM + QEMU

MODULE 3 VIRTUALIZED DATA CENTER COMPUTE

KVM Security Comparison

Windows Embedded Security and Surveillance Solutions

Kernel Virtual Machine

Cooperation of Operating Systems with Hyper-V. Bartek Nowierski Software Development Engineer, Hyper-V Microsoft Corporation

Solution Recipe: Improve PC Security and Reliability with Intel Virtualization Technology

Technical Brief for Windows Home Server Remote Access

HBA Virtualization Technologies for Windows OS Environments

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

An Oracle Technical White Paper June Oracle VM Windows Paravirtual (PV) Drivers 2.0: New Features

Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies

Expanded Frequency Capping

WHITE PAPER. AMD-V Nested Paging. AMD-V Nested Paging. Issue Date: July, 2008 Revision: 1.0. Advanced Micro Devices, Inc.

How to Secure a Groove Manager Web Site

Overview of Microsoft Office 365 Development

Understanding Full Virtualization, Paravirtualization, and Hardware Assist. Introduction...1 Overview of x86 Virtualization...2 CPU Virtualization...

Intel Virtualization Technology for Directed I/O

Transcription:

The Microsoft Windows Hypervisor High Level Architecture September 21, 2007 Abstract The Microsoft Windows hypervisor brings new virtualization capabilities to the Windows Server operating system. Its design introduces several new concepts related to partitioning, CPUs, memory, address translation, interrupts, and virtualization intercepts. This paper provides a high level overview of the architecture of the Windows hypervisor and how these functions support the virtualization of guest operating systems. Disclaimer This is a preliminary document and may be changed substantially prior to final commercial release of the software described herein. The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication. This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, email address, logo, person, place or event is intended or should be inferred. 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

1 Hypervisor Overview & Definitions Hypervisor and Partitions A hypervisor is a layer of software that sits just above the hardware and beneath one or more operating systems. Its primary job is to provide isolated execution environments called partitions. Each partition is provided with its own set of hardware resources (memory, devices, CPU cycles). The hypervisor is responsible for controlling and arbitrating access to the underlying hardware. Guests Software running within a partition is referred to as a guest. A guest might consist of a fullfeatured operating system like Windows Vista or a small, special-purpose kernel. The hypervisor is guest-agnostic. Specialized Partitions In most respects, all partitions are equal. However, some partitions may be granted special privileges or assigned specialized functions. The root partition acts as the default owner of all hardware resources. It is also typically in charge of power management, plug and play, and hardware failure events. The root partition is also responsible for creating and managing other partitions and assigning hardware resources. In some respects, it acts like the service processor on a machine with hardware partitioning facilities. The root partition is also responsible for loading and booting the hypervisor. Virtualization and Emulation The hypervisor provides support for hardware virtualization. Virtualization provides multiple logical instances of CPUs and other hardware resources. These logical instances are mapped onto physical hardware resources using a variety of techniques. One such technique is emulation, the simulation of a processor or device using software. While the hypervisor facilitates processor and device emulation, the architecture attempts to provide or facilitate alternatives to emulation for performance reasons. Legacy Guests, External Monitors and Enlightenment The hypervisor provides support for legacy guests. A legacy guest is an operating system that was written to run on a physical machine and has no knowledge of the fact that it is running within a virtualized environment. Legacy guests require substantial infrastructure including a system BIOS and a wide variety of emulated devices. This infrastructure is not provided directly by the hypervisor. A separate piece of code called an external monitor is required for legacy guests. External monitors can request specific intercepts (i.e. notifications that specific events have occurred within a guest). This support for guest-mode intercept handlers provides added flexibility and reduced complexity within the hypervisor. Device emulation provides broad compatibility, but it results in poor performance. Other operational assumptions made by legacy guests also add to virtualization overhead. This virtualization performance overhead can be mitigated by enlightening a legacy operating system. An enlightened guest has knowledge of the fact that it is running within a virtualized environment and changes its behavior accordingly. Various degrees of enlightenment are possible. For example, a guest might use a specialized block device driver to talk to an idealized virtual disk. More extensive enlightenment might involve modifying a guest HAL so it talks to an idealized synthetic interrupt controller. 2007 Microsoft Corporation. All rights reserved. 2

2 Hardware Requirements The Microsoft hypervisor requires the following hardware: Processor virtualization extensions: The hypervisor runs out-of-context. This means the hypervisor s code and data are not mapped into the address space of the guest. The hypervisor relies on processor support for various intercepts (i.e. notifications of specified processor events). For example, the hypervisor may request to intercept execution of the HLT instruction by the guest. Intercepts must cause a context switch to a separate address space that maps the hypervisor s code and data. The intercept mechanism must also provide the hypervisor with enough information to allow it to identify the reason for the intercept and respond accordingly. Intel s VT extensions and AMD s AMD-V extensions both provide this required support. 3 Design Goals Isolation & Security: The hypervisor must provide a high degree of isolation between partitions. Data owned by one partition should not be allowed to leak to another partition. Communication between partitions should be restricted to well-defined communication paths. Efficient Virtualization: The hypervisor must support efficient virtualization of processors, memory and devices. Scalability: The hypervisor must provide scalability to large numbers of processors. 4 High-level Description By design, the hypervisor is a small and relatively simple piece of code. It will run on 64-bit x64 systems and will support both 32-bit and 64-bit guests The hypervisor runs in its own context and runs at a privilege level higher than that of guests. From this perspective, it makes sense to view the hypervisor as running at ring -1 (where rings 0 through 3 are used for guests). The hypervisor is designed with concurrency in mind, and it is a design goal to provide scalability to large numbers of processors. In an effort to minimize complexity, the hypervisor is not pre-emptable. External interrupts and inter-processor interrupts are disabled while code is running within the hypervisor. (SMIs and NMIs will remain enabled.) Partitions A partition is the basic unit of isolation supported by the hypervisor. A partition is made up of a physical address space and one or more virtual processors. Furthermore, a partition can be assigned specific hardware resources (memory, devices and CPU cycles) and certain permissions and access rights. Address Translation The hypervisor provides a physical memory virtualization facility. This allows each partition to have a zero-based contiguous physical address space. Virtual processors support all x86 paging features and memory access modes (including large pages, global pages, write protection control, PAE, four-level page tables, paging disabled, etc.). The hypervisor implements this behavior by means of two levels of address translation. The first level is controlled by the guest, allowing it to define a virtual address (VA) to guest physical address (GPA) translation. This is done via standard page tables maintained by the guest. 2007 Microsoft Corporation. All rights reserved. 3

Second-level address translation is provided by the hypervisor without knowledge of the guest. This allows the hypervisor to virtualize physical memory, mapping guest physical addresses (GPAs) to system physical addresses (SPAs). The layout of a guest s physical address space is defined at partition creation time and can be modified at runtime. Today s processors only implement one level of address translation in hardware, so initial implementations of the hypervisor use shadow page tables to effect a two-level translation. If future processors support two levels of translation in hardware, subsequent versions of the hypervisor may take advantage of the facility. Virtual Processors Each partition has one or more virtual processors associated with it. A virtual processor is a virtualized instance of an x86 processor complete with user-level and system-level registers. A virtual processor is scheduled on a physical processor (more specifically, on a hardware thread or core). The hypervisor does not use hard affinities, so a virtual processor may move from one physical processor to another. The hypervisor schedules virtual processors according to specified scheduling policies and constraints. Note that while virtual processors are similar to threads within a traditional kernel, they differ from threads in the sense that they don t have an assigned address space. Rather, the guest operating system is able to switch the address space of a virtual processor by modifying its page table base (i.e. reload CR3 on x86 processors). The hypervisor virtualizes all modes of an x86 processor including real mode, 16-bit and 32-bit protected mode, v86, long mode, etc. The hypervisor will utilize built-in virtualization support in every possible situation. If virtualization support is missing for one or more processor modes (e.g. real mode), the hypervisor will provide the necessary emulation layer. Invoking the hypervisor If guest code is actively executing, there are three ways code can enter the hypervisor: 1. Interrupts: Asynchronous events including external interrupts, IPIs, NMIs, SMIs, SIPIs, etc. (see section below on Interrupt Routing and Delivery) 2. Intercepts: Synchronous events the hypervisor has requested to intercept within the guest 3. Hypercalls: A programmatic interface to the hypervisor (analogous to kernel calls ) Interrupt Routing and Delivery The hypervisor is responsible for routing interrupts to individual guests. All real hardware interrupts are delivered to the hypervisor first (regardless of whether interrupts are currently masked by the guest). The hypervisor can then decide whether to handle the interrupt itself or reflect it to a partition. Interrupts are used to signal a variety of asynchronous events from assigned hardware devices, VSPs, other virtual processors, etc. While some interrupts are targeted at specific virtual processors (e.g. the virtual equivalent of IPIs), most interrupts are targeted at a partition. The hypervisor is responsible for routing the interrupt to an appropriate virtual processor. From the perspective of a virtual processor, all interrupts delivered by the hypervisor are dispatched in the same way as traditional interrupts. On x86 processors, the behavior is defined by the guest s interrupt descriptor table (IDT). Most interrupts delivered by the hypervisor are maskable, so the guest can choose to hold off interrupts. Intercept Handling, Redirection and Reflection 2007 Microsoft Corporation. All rights reserved. 4

When the hypervisor is invoked due to some action performed by a guest (e.g. an access to an unmapped page, execution of a HLT instruction, etc.), the hypervisor can choose to do one of three things: 1. Silently handle the intercept and return to the guest. For example, the hypervisor may receive a page fault intercept because the page table entry in its shadow page table has not yet been populated despite the fact that the guest has already mapped the page in its page tables. In this case, the hypervisor would fill in the corresponding shadow page table entry and return back to the guest in such a way that the guest was unaware that the intercept occurred. 2. Redirect the intercept to an external monitor. An external monitor is a piece of code running in another partition that is notified when an intercept occurs. This allows, for example, emulation of legacy devices. The hypervisor allows an external monitor to specify which events it would like to intercept. Examples on the x86 include: I/O port accesses, MSR accesses, CPUID, HLT, specific faults and exceptions, accesses to specified guest physical page ranges, and triple faults. 3. Reflect the intercept back to the guest. The hypervisor can simulate an exception within the guest. For example, if the hypervisor receives a page fault intercept and determines that the accessed page is unmapped within the guest s page tables, it can reflect the page fault back to the guest. Device Ownership and Assignment The hypervisor effectively owns the following hardware facilities: CPUs, memory, and APICs. All other hardware facilities and devices are assigned to the root partition. Hypercalls and Inter-partition Communication (IPC) The hypervisor exposes a direct call mechanism to guests in the form of a reserved instruction. Execution of this instruction in a guest generates an intercept into the hypervisor. The hypervisor treats this intercept as a hypercall. Hypercall arguments are passed through registers, with defined semantics and return values. The hypervisor supports a simple IPC mechanism for sending fixed-sized messages between partitions. Message sends are initiated through the use of a hypercall. Message receipt is signaled through the use of an interrupt. In addition, the hypervisor supports shared memory as a general IPC mechanism. Guests can build more complex communication mechanisms and protocols using these primitives. Memory Intercepts Intercepts can be requested for specific GPAs. The hypervisor supports read, write and execute intercepts. When a memory intercept is detected by the hypervisor, it sends an intercept message to the external monitor. The external monitor typically responds by removing the intercept (e.g. paging in memory to back the GPA) and allowing the intercepted instruction to be re-executed. Alternatively, the external monitor can choose to complete the intercepted instruction in software. This is required for certain emulation scenarios (e.g. MMIO, VGA frame buffer accesses, etc.). 2007 Microsoft Corporation. All rights reserved. 5