Cloud Computing
Lecture 11: Virtualization
2011-2012

Up until now
  Introduction. Definition of Cloud Computing
  Grid Computing
  Content Distribution Networks
  Map Reduce
  Cycle-Sharing
Process Virtual Machines
  Capable of supporting an individual process.
  Virtualization located at the ABI (application binary interface).
  On top of OS and hardware.
  Emulates the user-level ISA and OS system calls.
  VM support ends when the process terminates.
  Virtualizing software: runtime.

System Virtual Machines
  Provides a complete system environment.
  Virtualization located at the ISA (instruction set architecture) interface.
  On top of hardware; allows access to I/O, networking, display.
  Emulates the user and system ISA of the guest hardware.
  While alive, the VM supports an OS with its users and processes.
  Virtualizing software: virtual machine monitor (VMM).
  Defined in the early VM concept (1960s).
Examples of Process VMs
  High-level language virtual machines:
    Usually do not match an existing platform: Java, .NET, Python.
  Designed for portability:
    Follow the high-level language closely.
    Implement very few, if any, HW- and OS-specific operations.
  Application installation: portable intermediate code (virtual ISA).
    Bytecodes instead of binary objects.
    Calls to standard libraries.
  VM executes the portable intermediate code:
    Using interpretation, standard compilation or JIT (just-in-time) compilation.
  In general, the strong execution control and encapsulation are used to create languages that are strongly typed, dynamic, with garbage collection...

Examples of System VMs
  Hosted VMs
    Virtual machine monitor over the OS: VMware, VirtualPC.
  Pros and cons:
    + Installed as an application; uses the native OS and its low-level services (I/O, drivers).
    - Loss of efficiency due to the additional level of indirection.
  Challenges:
    Heterogeneity: provide the same low-level OS features on different platforms (timing, data formats, arithmetic precision, semantics).
    Emulate the ISA of the hosted system:
      The emulated system is just an application of the host system.
      How to correctly intercept system calls and convert them into native calls.
      Dealing with interrupts, memory management...
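As a concrete look at the "bytecodes instead of binary objects" idea, the sketch below uses Python (one of the process VMs named above) and its standard dis module to list the virtual-ISA instructions compiled for a small function. The function add is a made-up example, and the exact opcode names differ between CPython versions, so no listing is shown here.

```python
import dis

def add(a, b):
    # A trivial function; CPython compiles it to bytecode, not native code.
    return a + b

# The raw bytecode is stored on the code object as a bytes string.
raw_bytecode = add.__code__.co_code

# dis.get_instructions decodes that bytes string into named instructions.
opnames = [ins.opname for ins in dis.get_instructions(add)]

print(len(raw_bytecode), "bytes of bytecode")
print(opnames)
```

The same bytecode runs unchanged on any host where the VM is installed, which is the portability argument made on the slide.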
Applications of System VMs
  Multiprogramming:
    Each user can run a simpler single-user OS.
    Multiple virtual applications.
  Multiple secure environments (e.g. virtual hosting).
  Environments with mixed OSs: vintage OSs, OS development.
  Running old SW on new HW.
  Multi-platform development.
  Safe development (without crashing the development machine).
  Emulating client environments for software maintenance.
  Instrumentation: measurements, monitoring.
  Checkpointing, recovery, migration.

[Figure: The Versatility of Virtualization]
Process VMs

Process VMs: Java <-> .NET
  Java Virtual Machine Architecture <-> CLI
    Similar to an ISA.
  Java Virtual Machine Implementation <-> CLR
    Equivalent to the implementation of a computer.
  Java bytecodes <-> Microsoft Intermediate Language (MSIL)
    ISA instructions.
  Java Platform <-> .NET Framework
    ISA + libraries; a more abstract ABI.
  Key features of a high-level VM:
    Security and protection.
    Network access.
    Instruction set model.
    Performance.
Process VMs Key Issues: Protection
  Allowing the loading and execution of programs from unreliable sources.
  Sandbox:
    Access to remote files is protected by the remote system.
    Access to local files goes through reliable libraries and a security manager.
  Protecting the data and the VM code from hosted applications:
    While they share the same process.
    Static verification by a reliable compiler.
    Dynamic verification by a reliable emulator.

Process VMs Key Issues: Protection (continued)
  Jump instructions:
    All jumps have to be offsets within the code segment.
    The loader checks whether all jumps are within the process bounds.
    Indirections in execution flow are only method calls and method returns.
  Reading and writing instructions (load/store):
    Checked statically by the loader: e.g., out-of-bounds accesses to local variables or to object fields.
    Checked dynamically by the execution environment: e.g., array accesses or dereferencing null pointers.
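The split between loader-time and run-time checks can be illustrated with a toy example. This is only a sketch: the tuple-based instruction format, the opcode names JMP and JZ, and both function names are assumptions made for illustration, and jump targets are absolute indices rather than the offsets a real class file uses.

```python
def verify_jumps(code):
    """Static check, done once at load time: every jump must land inside the code."""
    for pc, (op, arg) in enumerate(code):
        if op in ("JMP", "JZ") and not (0 <= arg < len(code)):
            raise ValueError(f"instruction {pc}: jump target {arg} is out of bounds")
    return True

def load_element(array, index):
    """Dynamic check, done on every access: null reference and array bounds."""
    if array is None:
        raise TypeError("null reference")
    if not (0 <= index < len(array)):
        raise IndexError(f"index {index} outside 0..{len(array) - 1}")
    return array[index]

good = [("PUSH", 1), ("JZ", 3), ("PUSH", 2), ("HALT", None)]
bad = [("PUSH", 1), ("JMP", 99)]

print(verify_jumps(good))
print(load_element([10, 20, 30], 1))
```

The static check costs nothing at run time, which is why everything that can be decided from the code alone is pushed to the loader; only value-dependent checks, such as the index in load_element, remain dynamic.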
Process VMs Key Issues: Network
  Many modern languages are strongly geared toward a networked environment.
  It is important to limit the bandwidth used in transmitting apps or components:
    The instruction set is encoded in a compact format.
  The virtual ISA is just a specification:
    Emulation converts the specification into native HW instructions.
  Allows dynamic class loading if needed:
    Distributes the class loading cost over the whole application execution.

Process VMs Key Issues: Performance
  Cons:
    OO languages are generally slower than procedural languages.
    Running OO languages in a VM is slower still.
  Pros:
    HW advances tend to arrive faster than SW optimizations.
    VMs are highly optimized.
JVM: Java Virtual Machine

JVM Architecture
  Java virtual ISA:
    Bytecode definition.
    Set of predefined types.
JVM Architecture
  Memory and Registers:
    Method area:
      Contains the code.
    Implicit program counter:
      It cannot be loaded or stored directly, as on a CPU.

JVM Architecture
  Memory and Registers:
    Java stacks:
      Store local variables, operands and method arguments.
    Implicit stack pointer (one per thread):
      Only accessible via pop/push.
    Stack size is implementation dependent:
      Overflow raises StackOverflowError.
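A minimal stack-machine sketch shows what "implicit" program counter and stack pointer mean in practice. It is not the JVM instruction set: the opcodes PUSH, ADD and MUL, the StackOverflow exception and the max_depth parameter are assumptions for illustration, and the depth limit is applied to the operand stack for simplicity, whereas a real JVM limits the depth of method frames.

```python
class StackOverflow(Exception):
    """Stand-in for the error raised when the stack limit is exceeded."""

def run(program, max_depth=16):
    stack = []  # operand stack; programs touch it only through push and pop
    pc = 0      # implicit program counter; only the interpreter loop updates it
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "PUSH":
            if len(stack) >= max_depth:
                raise StackOverflow(f"stack limit {max_depth} exceeded")
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()

# (2 + 3) * 4
result = run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)])
print(result)  # prints 20
```

Because no instruction names a register or a stack address, the program cannot corrupt the interpreter's own state, which is part of the protection argument made earlier.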
JVM Architecture
  Memory and Registers:
    Heap:
      Global memory for objects and arrays.
      Dynamically allocated when an object or array is created.
    Heap size is implementation dependent:
      Overflow raises OutOfMemoryError.

JVM Architecture
  Garbage Collection:
    Object lifecycle: instantiation, use, recycling...
    When the last reference is removed, the unreachable object becomes garbage.
    The GC recovers heap memory occupied by unreachable objects.
    The Java specification does not predetermine the algorithm used.
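Since the specification leaves the collection algorithm open, the following is just one possible approach, a toy mark-and-sweep over an object graph. The heap representation (a dictionary from object id to the ids it references) and the function name collect are assumptions for illustration only.

```python
def collect(heap, roots):
    """Reclaim every object that is not reachable from the roots.

    heap:  dict mapping object id -> list of object ids it references
    roots: ids referenced from stacks and static fields
    Returns the set of reclaimed ids and removes them from heap.
    """
    marked = set()
    worklist = list(roots)
    while worklist:                      # mark phase: follow references
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        worklist.extend(heap[obj])
    garbage = set(heap) - marked         # sweep phase: everything unmarked
    for obj in garbage:
        del heap[obj]
    return garbage

heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
freed = collect(heap, roots=["a"])
print(sorted(freed))  # prints ['c', 'd']
```

Objects c and d reference each other but are unreachable from the roots, so they are reclaimed; this is why the slide speaks of unreachable objects rather than objects with zero references.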
JVM Architecture
  Emulation Engine:
    Emulates the instructions expressed by the Java bytecodes.
    Main techniques:
      Basic interpretation.
      Removal of calls with known results (precoding).
      Binary translation.
    Uses native methods and the implicit PC and SP.

JVM Architecture
  Class Loader Subsystem:
    Converts class files into an internal representation in the VM.
    Locates classes:
      Dynamically, on demand.
      Locally or on the web.
    Checks the correctness and integrity of the .class files.
    Key component of the security model.
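The on-demand behaviour of the class loader can be sketched as a cache that is filled lazily and refuses files that fail a check. Everything here is hypothetical: the class LazyLoader, the in-memory repository, and especially the SHA-256 comparison, which merely stands in for the much richer bytecode verification a real JVM performs.

```python
import hashlib

class LazyLoader:
    def __init__(self, repository, trusted_digests):
        self.repository = repository            # name -> raw bytes (local disk or the web)
        self.trusted_digests = trusted_digests  # name -> expected SHA-256 hex digest
        self.loaded = {}                        # name -> internal representation

    def load(self, name):
        if name in self.loaded:                 # already loaded: no further cost
            return self.loaded[name]
        raw = self.repository[name]             # located only when first needed
        if hashlib.sha256(raw).hexdigest() != self.trusted_digests[name]:
            raise ValueError(f"integrity check failed for {name}")
        # Toy "internal representation": the file split into tokens.
        self.loaded[name] = raw.decode("utf-8").split()
        return self.loaded[name]

source = b"load add store"
loader = LazyLoader({"Main": source}, {"Main": hashlib.sha256(source).hexdigest()})
print(loader.load("Main"))  # prints ['load', 'add', 'store']
```

Nothing is fetched or checked until load is first called for a name, so the loading cost is spread over the run instead of being paid at startup.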
JVM: Protection
  The sandbox protects the local system outside the JVM.
  Protection inside the JVM:
    Problem: use in a distributed environment with unreliable code.
    Solution:
      The application can only access its own heap and stack, so it cannot modify the JVM.
      The application interacts with the JVM only through calls to local libraries containing reliable code.
      Program execution is verified statically and dynamically.

JVM: Protection
  Security Manager, a class from the java.lang API:
    Checks whether a requested operation is allowed:
      Returns if yes; throws an exception if not.
    Set of checkXxx methods: e.g., File..., Socket..., PropertyPermission.
  Associated with the application at startup time:
    Cannot be modified, erased or replaced.
  The user may decide what to allow:
    e.g., which files to expose and with what access, which network ports to open, etc.
  Limitation: only qualitative protection.
    No quantitative protection: on memory allocation, thread creation, stack size on recursion.
  The application is additionally limited by its owner's permissions:
    The JVM is a user-level process.
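The "returns if allowed, throws if not" contract of the security manager can be sketched as follows. This is not the java.lang API: the class SecurityManagerSketch, its method names and the guarded_read helper are hypothetical, loosely modelled on the check-style methods described above.

```python
class SecurityManagerSketch:
    """Policy fixed at startup; the application cannot change it afterwards."""

    def __init__(self, readable_paths, open_ports):
        self._readable = frozenset(readable_paths)
        self._ports = frozenset(open_ports)

    def check_read(self, path):
        if path not in self._readable:
            raise PermissionError(f"read denied: {path}")

    def check_connect(self, port):
        if port not in self._ports:
            raise PermissionError(f"connect denied: port {port}")

def guarded_read(manager, path, files):
    # A trusted library call: it consults the manager before touching the file.
    manager.check_read(path)   # returns silently if allowed, raises otherwise
    return files[path]

files = {"/tmp/data.txt": "hello", "/etc/secret": "s3cr3t"}
manager = SecurityManagerSketch(readable_paths=["/tmp/data.txt"], open_ports=[8080])
print(guarded_read(manager, "/tmp/data.txt", files))  # prints hello
```

The checks are yes/no decisions per operation. Nothing in this design counts how much memory or how many threads the application uses, which is the qualitative-only limitation noted on the slide.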
JVM: Protection
  Static checking (at loading):
    Types.
    Reference validity.
    Control transfer:
      Check that the code does not jump outside the process's memory.
  Dynamic checking (at runtime):
    Null references.
    Array bounds.
    Type conversion:
      Up-casting: checked at compile time.
      Down-casting: checked at runtime.

System VMs
Virtual Machine Monitor (VMM)
  Program responsible for virtualization:
    Also called hypervisor.
  Arbitrates access to physical resources.
  Presents a set of virtual resources to each of the hosted machines.
  Placed between the exposed HW and the conventional HW.
  Manages the HW allocation and access to the host platform.
  Gives each hosted OS the illusion of owning the resources.

System VMs Key Issues: State Management
  Manipulated state of the host machine:
    Store the state of the host's HW resources.
    Map the guest machine's state onto the host.
  Approaches:
    Use indirection to access the state of the hosted machine.
    Copy the hosted machine's state to the host.
    Halfway: copy what is frequently used, resort to indirection for what is seldom used.
System VMs Key Issues: Resource Management
  The VMM has to keep the host's HW resources under control:
    Assign them to the VMs and make sure that they are returned.
  A problem similar to time-sharing in the OS:
    At each moment, the running process believes it owns the resource until the dispatcher runs and takes over.
  Approach in system VMs:
    The hosted OS runs until:
      It uses a privileged instruction.
      There is a system interrupt.
      There is an exception (e.g. page fault).

Problems in x86
  Classic x86 does not allow interposition on all privileged instructions: some of them execute without trapping to the VMM.
  Two solutions:
    Binary rewriting:
      Sweep the binary code in memory and replace all privileged code with code that is interceptable by the VM (VMware).
    Paravirtualization:
      Do not use non-virtualizable instructions (Xen).
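Binary rewriting can be illustrated on a toy instruction list. This is a sketch under heavy assumptions, not how VMware's translator works: the opcode names, the TRAP_TO_VMM marker, the VMM class and run_guest are all invented for illustration. The rewriter sweeps the code and wraps each privileged instruction so that the VMM handles it against virtual state instead of letting it reach the real CPU.

```python
PRIVILEGED = {"CLI", "STI"}  # hypothetical opcodes that change interrupt state

def rewrite(code):
    """Sweep the code and replace privileged instructions with interceptable ones."""
    return [("TRAP_TO_VMM", ins) if ins[0] in PRIVILEGED else ins for ins in code]

class VMM:
    def __init__(self):
        self.virtual_interrupts_enabled = True  # per-guest virtual state
        self.log = []

    def emulate(self, ins):
        op = ins[0]
        if op == "CLI":
            self.virtual_interrupts_enabled = False
        elif op == "STI":
            self.virtual_interrupts_enabled = True
        self.log.append(op)

def run_guest(code, vmm):
    acc = 0
    for ins in code:
        if ins[0] == "TRAP_TO_VMM":
            vmm.emulate(ins[1])        # control passes to the VMM
        elif ins[0] == "ADD":
            acc += ins[1]              # unprivileged: runs directly
        else:
            raise RuntimeError(f"privileged {ins[0]} reached the CPU unmodified")
    return acc

guest_code = [("ADD", 2), ("CLI",), ("ADD", 3), ("STI",)]
vmm = VMM()
print(run_guest(rewrite(guest_code), vmm), vmm.log)  # prints 5 ['CLI', 'STI']
```

Paravirtualization reaches the same end state by a different route: instead of rewriting at load time, the guest OS source is changed so that it never issues the problematic instructions and calls the VMM explicitly.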
Paravirtualization: Xen
  Runs on an OS and provides virtual environments to execute other OSs.
  Executes a subset of the x86 ISA:
    Provides a modified interface to the hosted OS.
    Avoids x86 operations that are difficult to virtualize.
    Eliminates the need for more complex virtualization (binary rewriting).
  All HW accesses are rerouted to the host machine.
    When the calls return, the hosted OS receives the results as Xen events on a queue.
  x86 has 4 protection rings:
    In most OSs, the system runs in ring 0 and applications in ring 3.
    In Xen, the hosted OS runs in ring 1.
    This way it has system privileges, but the host VMM controls its requests.
  Changes to the Linux kernel: 1.36% (about 3000 lines of code), over 90% of native performance.

Migration
  Moving running VMs between machines.
    Alternatively, one can move the application state, but it is not always the simplest solution.
  Motivation:
    Load balancing.
    Security: move a suspicious VM to a protected/restricted location.
    Co-location: group VMs that are communicating intensely.
    Fault tolerance: move out of unstable HW.
    Maintenance: free a machine that is undergoing, or will undergo, maintenance or updates.
[Figure: Migrate to Load Balance]

Migration: Issues
  Time needed to migrate a large state:
    Transmit only part of the state.
    Transmit additional parts as they become needed.
  Marshalling and secure transmission:
    Need for compression and encryption.
  HW heterogeneity:
    Normally solved by virtualization itself.
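The idea of transmitting only part of the state and fetching the rest as it becomes needed can be sketched with a page table that pulls missing pages from the source on first access. The class LazyMigratedMemory, its fields and the choice of which pages count as "hot" are assumptions made for illustration; a real system would also handle writes, failures and network transfer.

```python
class LazyMigratedMemory:
    def __init__(self, source_pages, hot_pages):
        self._source = source_pages                          # still held by the source host
        self.local = {p: source_pages[p] for p in hot_pages}  # sent up front
        self.remote_fetches = 0

    def read(self, page_no):
        if page_no not in self.local:
            # Page was not part of the initial transfer: fetch it now.
            self.local[page_no] = self._source[page_no]
            self.remote_fetches += 1
        return self.local[page_no]

source = {0: b"kernel", 1: b"heap", 2: b"rarely used"}
memory = LazyMigratedMemory(source, hot_pages=[0, 1])

print(memory.read(0), memory.remote_fetches)  # prints b'kernel' 0
print(memory.read(2), memory.remote_fetches)  # prints b'rarely used' 1
```

The VM can resume on the destination after only the hot pages have moved; the price is extra latency on the first access to each remaining page and a continued dependency on the source host until all pages have been fetched.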
Example: VMware VMotion
  VMware's VM migration technology.
  Belongs to VMware VirtualCenter:
    Infrastructure management software.
  Targeted at x86 clusters:
    Connected by a LAN.

Example: VMware VMotion
  Live migration:
    1. Check that the VM is running stably.
    2. Base copy:
       Copy VM memory to the destination host.
       Mark copied pages with a flag.
       Continue execution.
    3. When the base copy finishes, suspend the VM on the source host.
Example: VMware VMotion
    4. Final copy:
       Send the incremental capsule, containing all pages that changed since the base copy.
    5. Restart the VM:
       Activate the VM on the new host.
       Notify the router of the new physical location of the virtual MAC address.
  VMotion allows concurrent migrations.

Example: VMware VMotion
  Limitations:
    Both nodes need to be in the same cluster, administered by the same VMware VirtualCenter.
    The file systems must be identical and hosted on a distributed file system.
    Both processors must have the same architecture.
    The applications that are running have to be non-distributed.
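The live-migration steps (base copy while running, suspend, final incremental copy, restart) can be traced in a small simulation. This is a toy model and not VMotion's implementation: the function live_migrate, the dictionary-of-pages memory, and the scripted list of guest writes are all assumptions for illustration. For simplicity it records every written page as changed, so a page written before its base copy may be sent twice.

```python
def live_migrate(source_pages, writes_during_copy):
    """Return (destination_pages, pages_sent_in_final_copy).

    source_pages:       dict page number -> bytes, mutated as the guest runs
    writes_during_copy: list of (page number, data) the guest performs
                        while the base copy is in progress
    """
    dest = {}
    dirty = set()
    pending = list(writes_during_copy)

    # Step 2: base copy, with the VM still executing on the source.
    for page_no in sorted(source_pages):
        dest[page_no] = source_pages[page_no]
        if pending:                       # the guest keeps running: one write per copied page
            target, data = pending.pop(0)
            source_pages[target] = data
            dirty.add(target)
    for target, data in pending:          # any writes left before the suspend
        source_pages[target] = data
        dirty.add(target)

    # Step 3: suspend the VM on the source; no more writes from here on.
    # Step 4: final copy of the pages that changed since the base copy.
    for page_no in dirty:
        dest[page_no] = source_pages[page_no]

    # Step 5: the VM would now be activated on the destination host.
    return dest, dirty

pages = {0: b"a", 1: b"b", 2: b"c"}
dest, resent = live_migrate(pages, [(0, b"A"), (2, b"C")])
print(dest == pages, sorted(resent))  # prints True [0, 2]
```

The VM is stopped only for the final copy, whose size depends on how many pages the guest dirtied during the base copy rather than on total memory size, which is what keeps downtime short.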
Next Time...
  Storage in Cloud Platforms