Towards Hardware Embedded Virtualization Technology: Architectural Enhancements to an ARM SoC
VtRES 2013
P. Garcia, T. Gomes, F. Salgado, J. Monteiro, A. Tavares
Summary
1. Current Landscape in Embedded Systems
2. Embedded Virtualization Use-Cases
3. The Case for Hardware Support to Virtualization
4. The Development Platform: Why an ARM SoC?
5. Hardware-Based Virtualization Extensions
6. Results, Conclusions and Research Roadmap
1 - Current Landscape in Embedded Systems
Contradicting requirements:
- Low cost (size, power consumption)
- Reliability (fault tolerance, hard real-time)
- Feature-richness (touch-screen graphical interfaces)
Design solutions:
- Application specificity
- Technologies for separation of concerns
1 - Current Landscape in Embedded Systems
Traditionally:
- COTS solutions: easy to get up and running, short development time
- Application-specific solutions: big engineering effort, long time-to-market
Nowadays:
- Use COTS products to develop application-specific solutions
1 - Current Landscape in Embedded Systems
- Proven IP hardware
- Proven software stacks
- Hypervisor
2 - Embedded Virtualization Use-Cases
- Consolidation: legacy software stacks alongside modern, feature-rich OSs
- Co-existence: GPOSs and RTOSs
- Functional and temporal partitioning: security and safety concerns, verification effort
2 - Embedded Virtualization Use-Cases
- Simpler use-cases than server/enterprise virtualization
- Most commonly, two partitions (RTOS or bare-metal + GPOS)
- In the safety-critical domain, replicated partitions for N-modular redundancy
- Type-0 hypervisors likely suffice for most applications
3 - The Case for Hardware Support to Virtualization
- A Type-0 hypervisor is simple: it lends itself well to hardware implementation
- Hardware implementations are more deterministic and offer higher performance
- Architectural features that merely expedite software functionality do not limit scalability
- Exploiting application specificity allows designing custom virtualized solutions
3 - The Case for Hardware Support to Virtualization
Research goals:
- Novel architectural and micro-architectural support for virtualization, at processor level and SoC level
- Expedite virtualization requirements without sacrificing scalability/flexibility
- Research the feasibility of a hardware-complete hypervisor for application-specific solutions
Methodology:
- Implement virtualization support on an FPGA-based ARM SoC
4 - The Development Platform: Why an ARM SoC?
- ARM is ubiquitous in the embedded domain
- ARM Virtualization Extensions (VE) already provide some support for virtualization
- Implementing novel architectural support required an open-source processor:
  - Modifying key pipeline elements
  - Adding internal registers
4 - The Development Platform: Why an ARM SoC?
- An Atmel AT91SAM9XE was cloned and implemented on a Xilinx Virtex 5 FPGA
- Implements the ARMv5TE architecture:
  - 16 KB ICache, 8 KB DCache
  - Memory Management Unit
  - Periodic Interval Timer (PIT)
  - Advanced Interrupt Controller (AIC)
  - USART
  - DDR2 controller
  - SD Card controller
4 - The Development Platform: Why an ARM SoC?
- The developed platform is (most likely) microarchitecturally different from the commercial parts
- Results will vary across implementations, but only with minor variations
- The developed platform is 100% compatible with existing ARM toolchains and software
5 - Hardware-Based Virtualization Extensions
CPU virtualization:
- New processor mode (Hyper mode)
- Instruction to enter Hyper mode:
  - Traps to the hypervisor if executed by a guest OS
  - Traps to the guest OS if executed by a user application
- Banked register file for the hypervisor
- Control registers (partition ID, exception handling register):
  - Control whether processor exceptions trap to the guest OS or the hypervisor
- CPU virtualization is (for now) nearly identical to ARM VE
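As a rough illustration of the trap rule on this slide, the C sketch below shows how a hypervisor-side dispatcher might route the "enter Hyper mode" trap depending on the mode that executed it. The struct fields and helper names are hypothetical, and on the modified core this routing is performed in hardware; the code only restates the rule.

```c
/*
 * Hedged sketch: dispatch of the "enter Hyper mode" trap.
 * vm_control_t and the commented-out helpers are hypothetical;
 * only the routing rule follows the slide.
 */
#include <stdint.h>

#define MODE_MASK 0x1Fu   /* low bits of the saved CPSR hold the mode */
#define MODE_USR  0x10u   /* ARM User mode                            */

typedef struct {
    uint32_t partition_id;   /* currently running guest                    */
    uint32_t exc_route;      /* exceptions routed to guest or hypervisor   */
} vm_control_t;

/* Called from the Hyper-mode trap vector with the trapping context's CPSR. */
void hyper_trap_dispatch(uint32_t saved_cpsr, vm_control_t *vmc)
{
    uint32_t mode = saved_cpsr & MODE_MASK;

    if (mode == MODE_USR) {
        /* A user application executed the instruction:
         * forward the trap to the guest OS.                     */
        /* inject_guest_trap(vmc);   -- hypothetical helper */
    } else {
        /* The guest OS executed it: treat it as a hypercall and
         * handle it in the hypervisor.                          */
        /* handle_hypercall(vmc);    -- hypothetical helper */
    }
    (void)vmc;
}
```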
5 - Hardware-Based Virtualization Extensions
Memory virtualization:
- ARM VE supports two-stage address translation
- For embedded use-cases (a small, fixed number of VMs), guest segmentation is likely to suffice
- Hypervisor registers control guest memory access (segmentation)
- Less overhead than the MMU double page-table walk
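The sketch below illustrates the segmentation idea, assuming one (base, limit) pair of hypervisor registers per guest as the slide describes: guest-physical to host-physical translation reduces to a bounds check plus an add, instead of a second-stage page-table walk. Names are illustrative only.

```c
/* Hedged sketch of segmentation-based guest memory translation. */
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t base;    /* host-physical base of the guest's segment */
    uint32_t limit;   /* segment size in bytes                     */
} guest_segment_t;

/* Guest-physical to host-physical: add + bounds check. */
bool gpa_to_hpa(const guest_segment_t *seg, uint32_t gpa, uint32_t *hpa)
{
    if (gpa >= seg->limit)
        return false;          /* out of bounds: trap to hypervisor */
    *hpa = seg->base + gpa;
    return true;
}
```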
5 - Hardware-Based Virtualization Extensions
Peripheral and interrupt virtualization:
- At this point, support has only been added to the PIT and AIC
- The PIT was extended with an additional Timer/Counter responsible for the VMM tick:
  - Accessible only in Hyper mode
  - Can trigger an interrupt that bypasses the AIC, causing a transition to Hyper mode (VMM scheduler ISR)
- The PIT was modified with update logic: in Hyper mode, PIT updating performs hardware timekeeping
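A minimal sketch of how the VMM tick might be programmed. The VMM_PIT_* names, offsets and bit positions are invented for illustration (they are not the real register map of the extended PIT); only the behaviour, a Hyper-mode-only timer whose interrupt bypasses the AIC and enters the VMM scheduler ISR, follows the slide.

```c
/* Hedged sketch: programming the hypothetical VMM-tick Timer/Counter. */
#include <stdint.h>

#define VMM_PIT_BASE 0xFFFFFD40u                               /* hypothetical */
#define VMM_PIT_MR   (*(volatile uint32_t *)(VMM_PIT_BASE + 0x0))
#define VMM_PIT_EN   (1u << 24)                                /* hypothetical enable bit */

/* Configure the VMM tick period; callable only from Hyper mode. */
void vmm_tick_init(uint32_t period_cycles)
{
    VMM_PIT_MR = (period_cycles & 0x000FFFFFu) | VMM_PIT_EN;
}

/* Entered directly (bypassing the AIC) when the VMM tick expires. */
void vmm_scheduler_isr(void)
{
    /* pick the next partition and switch to it (not shown) */
}
```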
5 - Hardware-Based Virtualization Extensions
Peripheral and interrupt virtualization:
- The AIC was extended with an additional Interrupt Monitoring Register
- Interrupt behavior depends on:
  - Guest Interrupt Mask Register (Guest IMR)
  - VMM Interrupt Monitoring Register (Hyper IMR)
5 - Hardware-Based Virtualization Extensions
Case: interrupt intended for the current partition only (Guest IMR set, Hyper IMR clear)
- No hypervisor invoked
5 - Hardware-Based Virtualization Extensions
Case: interrupt intended for another partition (Guest IMR clear, Hyper IMR clear)
- No hypervisor invoked; handled when the respective partition is loaded
5 - Hardware-Based Virtualization Extensions
Case: interrupt intended for another partition with a real-time requirement (Guest IMR clear, Hyper IMR set)
- Hypervisor partition-scheduling vector invoked
5 - Hardware-Based Virtualization Extensions
Case: interrupt may be intended for several partitions (Guest IMR set, Hyper IMR set)
- Hypervisor software invoked to decide the appropriate action
(The four cases are summarized in the sketch below.)
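The four cases above amount to a routing rule keyed on the per-interrupt Guest IMR and hypervisor Interrupt Monitoring Register bits. The C sketch below restates that rule; the enum and function names are illustrative, and on the extended AIC this decision is taken in hardware.

```c
/* Hedged sketch of the interrupt-routing rule implied by the four cases. */
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    ROUTE_TO_CURRENT_GUEST,   /* Guest IMR set,   Hyper IMR clear */
    ROUTE_DEFER_TO_OWNER,     /* Guest IMR clear, Hyper IMR clear */
    ROUTE_TO_HYPERVISOR,      /* Guest IMR clear, Hyper IMR set   */
    ROUTE_HYPERVISOR_DECIDES  /* Guest IMR set,   Hyper IMR set   */
} irq_route_t;

irq_route_t route_irq(uint32_t irq, uint32_t guest_imr, uint32_t hyper_imr)
{
    bool g = (guest_imr >> irq) & 1u;   /* irq assumed to be 0..31 */
    bool h = (hyper_imr >> irq) & 1u;

    if (g && !h)  return ROUTE_TO_CURRENT_GUEST;   /* current partition only     */
    if (!g && !h) return ROUTE_DEFER_TO_OWNER;     /* pending until owner runs   */
    if (!g && h)  return ROUTE_TO_HYPERVISOR;      /* real-time: reschedule now  */
    return ROUTE_HYPERVISOR_DECIDES;               /* shared: software decides   */
}
```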
5 - Hardware-Based Virtualization Extensions
Comparison with ARM VE:
- Either all interrupts trap to the hypervisor (additional latency in interrupt handling)
- Or all interrupts trap to the current guest (an unlikely configuration)
- The presented approach offers higher flexibility in interrupt handling
6 - Results, Conclusions and Research Roadmap
- Synthesis results on a Xilinx Virtex 5 FPGA
6 - Results, Conclusions and Research Roadmap
Performance results:
- No system-level benchmarking has been performed yet
- Experiments were conducted on hypervisor micro-operations
6 - Results, Conclusions and Research Roadmap
- Some features seem to be lacking in ARM VE, namely finer granularity in interrupt handling
- The presented approach offers a higher degree of control over interrupt handling
- Memory address translation (guest physical to host physical) can be simplified, decreasing translation latency in segmented memory models
- Experiments have demonstrated the performance increase offered by the virtualization extensions
6 - Results, Conclusions and Research Roadmap
Research roadmap:
- System-level benchmarking (Linux + FreeRTOS guests)
- Extensive cycle-accurate VMM characterization
- Identification of further limitations in ARM VE
- Experiments with a hardware-complete hypervisor:
  - Flexibility and scalability limitations
  - Performance benefits
  - Real-time benefits
- Hardware partition scheduling
- Peripheral-specific virtualization techniques
- Memory hierarchy features for virtualization
Thank you for your attention.
More details in the paper.
Questions?