Improving Agility and Elasticity in Bare-metal Clouds
  Yushi Omote, Takahiro Shinagawa, Kazuhiko Kato
  University of Tsukuba, The University of Tokyo
Bare-metal Clouds
  An IaaS for high performance and device functionality
  OS-transparent
  No virtual machine
  (Diagram labels: User, Physical Machine, Provider)
OS-deployment Problem
  Long wait time sacrifices agility and elasticity
  (1) Image copy (tens of minutes)
  (2) Reboot from local disk (a few minutes)
  (Diagram labels: Installer, User, Image, Image Server)
Existing Approach 1: Streaming Deployment [Clerc et al. IPCCC 10]
  Network boot + background copy
  Agility and Elasticity
  Performance
  OS transparency: OS-specific drivers are required.
  (Diagram labels: Image Server, User, Special Driver)
Existing Approach 2: Conventional VMMs [VMware 01, Xen 03, KVM 07]
  Streaming deployment with VMMs
  Agility and Elasticity
  OS transparency
  Performance: continuous virtualization overhead
  (Diagram labels: User, Image Server)
Deployment with a Special-purpose VMM
  1) Streaming deployment
  2) Seamless de-virtualization
  Agility and Elasticity, OS transparency, Performance
  (Diagram labels: User, User)
Challenge: Expose & Control Physical Devices
  Virtual devices? Direct I/O?
  Control I/Os
  Expose physical interface
Device-interface-level I/O Mediation
  A device mediator performs:
  (1) I/O interpretation to understand I/O context
  (2) I/O redirection to perform network booting
  (3) I/O multiplexing to perform background install
  (Diagram labels: Device Driver, Device Mediator, Physical device interface)
I/O Interpretation
  Determine when/how to mediate I/O requests
  Understand state transitions based on monitoring I/O
  (Diagram labels: Device Driver, Device Mediator, Device State Transitions)
I/O Redirection
  (1) Interpret the request (LBA=4, NUM=8)
  (2) Redirect to the image server (LBA=4, NUM=8)
  (3) Restart (small request to the disk, interrupt)
  (Diagram labels: Data, Image Server, Disk)
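A minimal sketch of the redirection idea, assuming a toy in-memory model: a read that touches sectors not yet installed locally is served from the image server, and later reads of the same sectors go to the local disk. The class `DiskMediator`, its fields, the 512-byte sector size, and the copy-on-read behavior are all hypothetical choices for illustration. The restart step with its small request and interrupt is not modeled.

```python
from dataclasses import dataclass, field

SECTOR_SIZE = 512  # bytes per sector (assumed for this sketch)


@dataclass
class DiskMediator:
    """Toy model of I/O redirection; every name here is hypothetical."""
    image: bytes                               # stands in for the image server
    local: bytearray                           # stands in for the local disk
    copied: set = field(default_factory=set)   # sectors already on local disk
    redirected: int = 0                        # reads served from the server

    def read(self, lba: int, num: int) -> bytes:
        """Serve a guest read of `num` sectors starting at sector `lba`."""
        sectors = range(lba, lba + num)
        start = lba * SECTOR_SIZE
        end = start + num * SECTOR_SIZE
        if all(s in self.copied for s in sectors):
            # Every sector is already installed: read the local disk.
            return bytes(self.local[start:end])
        # Interpret: the request touches sectors not yet installed.
        # Redirect: fetch the whole range from the image server instead.
        self.redirected += 1
        data = self.image[start:end]
        # Assumed copy-on-read: keep the fetched sectors locally.
        self.local[start:end] = data
        self.copied.update(sectors)
        return data
```

With a 32-sector image, calling `read(4, 8)` twice redirects only the first call; the second is served from the local copy.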
I/O Multiplexing
  (1) Request
  (2) Emulate
  (3) Queue
  (Diagram labels: Status Check, Request, Idle State, Image Server, Disk)
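A minimal sketch of the multiplexing idea, assuming a disk that handles one request at a time: while the mediator's own background-install request occupies the disk, guest status checks see an emulated idle state and guest requests are queued, then replayed once the disk is free. The class and method names are hypothetical, and real device registers and interrupts are not modeled.

```python
from collections import deque


class MultiplexingMediator:
    """Toy model of I/O multiplexing; all names are hypothetical."""

    def __init__(self):
        self.mediator_busy = False  # True while a background write is in flight
        self.pending = deque()      # guest requests held back meanwhile
        self.disk_log = []          # order in which requests reach the disk

    def start_background_write(self, lba):
        # (1) The mediator issues its own request to the disk.
        self.mediator_busy = True
        self.disk_log.append(("mediator", lba))

    def guest_status_check(self):
        # (2) Emulate an idle device so the guest does not observe the
        # mediator's request. In this toy model the disk is otherwise idle.
        return "idle"

    def guest_request(self, lba):
        if self.mediator_busy:
            # (3) Queue the guest request until the disk is free again.
            self.pending.append(lba)
        else:
            self.disk_log.append(("guest", lba))

    def background_write_done(self):
        # Release the disk and replay queued requests in arrival order.
        self.mediator_busy = False
        while self.pending:
            self.disk_log.append(("guest", self.pending.popleft()))
```

Queuing in arrival order keeps the guest's view of request ordering unchanged, which is the property this sketch is meant to show.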
CPU/Memory Virtualization for De-virtualizable VMM
  (Columns: CPU, Memory)
  Guest Physical Address = Physical Address
  No indirection
  VMM runs passively with VMX
  No guest scheduling
  Identity mapping exposes physical memory
  Mark regions as reserved (via BIOS INT15/e802)
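A minimal sketch of the two memory ideas on this slide, assuming a BIOS-style memory map held as a list of `(base, length, type)` tuples. The type codes (1 = usable, 2 = reserved) follow the INT 15h, AX=E820h convention, which is assumed here to be the kind of interface the slide refers to. `reserve_region` and `guest_to_physical` are hypothetical helpers.

```python
USABLE, RESERVED = 1, 2  # type codes as in the E820 convention (assumed)


def reserve_region(memory_map, start, length):
    """Return a new map with [start, start + length) marked RESERVED.

    memory_map is a sorted list of non-overlapping (base, length, type)
    tuples. Hypothetical helper for illustration only.
    """
    end = start + length
    out = []
    for base, size, kind in memory_map:
        top = base + size
        if kind != USABLE or top <= start or base >= end:
            out.append((base, size, kind))    # entry is not affected
            continue
        if base < start:                      # usable part below the hole
            out.append((base, start - base, USABLE))
        lo, hi = max(base, start), min(top, end)
        out.append((lo, hi - lo, RESERVED))   # the carved-out part
        if top > end:                         # usable part above the hole
            out.append((end, top - end, USABLE))
    return out


def guest_to_physical(gpa):
    """Identity mapping: a guest-physical address is the physical address."""
    return gpa
```

Because the mapping is the identity, the guest sees real physical addresses, so hiding a region has to happen in the memory map the guest reads rather than in the address translation.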
De-virtualization
  (1) Turn off I/O VM exits: find safe I/O timing
  (2) Turn off nested paging: unsynchronized TLB flush
  (3) Turn off CPU virtualization: ease VM exits condition (VMXOFF issue)
  (Diagram labels: Device Driver, H/W)
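A minimal sketch of the ordering of the three steps, assuming that "safe I/O timing" means no mediated request is still in flight. The `Vmm` class and its fields are hypothetical. The hardware details named on the slide (TLB flushing, the VMXOFF issue) are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Vmm:
    """Toy model of the three-step shutdown; names are hypothetical."""
    io_exits: bool = True        # I/O accesses trap to the VMM
    nested_paging: bool = True   # second-level address translation is on
    vmx_on: bool = True          # CPU virtualization is enabled
    inflight_io: int = 0         # mediated requests still in progress
    steps: list = field(default_factory=list)

    def try_devirtualize(self) -> bool:
        """Run the steps in order; refuse while mediated I/O is in flight."""
        if self.inflight_io > 0:
            return False         # assumed meaning of "not a safe I/O timing"
        self.io_exits = False    # (1) stop intercepting I/O
        self.steps.append("io_exits_off")
        self.nested_paging = False   # (2) stop nested paging
        self.steps.append("nested_paging_off")
        self.vmx_on = False      # (3) leave CPU virtualization
        self.steps.append("vmx_off")
        return True
```

Calling `try_devirtualize()` while `inflight_io` is non-zero returns `False` and leaves all three features enabled; once it drops to zero the steps run in the listed order.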
Performance Evaluation
  Deployed 32-GB image (Ubuntu 14.04 64-bit)
  Measured: OS-startup time, Cassandra throughput, storage throughput, InfiniBand latency
  An HPC cluster:
    Intel Xeon X5680 (3.33 GHz) / 96 GB RAM
    HDD 500 GB / 7200 RPM SATA
    Mellanox InfiniBand (4X QDR)
    Intel 82575 EM GbE network card
    Interconnected by a Mellanox Grid Director InfiniBand switch & a FUJITSU SR-S348TC1 GbE switch
OS-startup Time
  (Figure: elapsed time (sec), axis 0 to 600; segments: Image Copy, Reboot + Firmware init., VMM Boot, OS Boot)
  Image Copy: 370 + 145 + 29
  Proposed: 5 + 58
  Streaming (NFSRoot): 49
  VM Streaming (KVM/NFS): 30 + 42
  Quick startup (8.6 times faster)
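A quick check of the "8.6 times faster" figure. The grouping of the bar labels is an assumption inferred from the slide's numbers: 370 + 145 + 29 seconds is taken as the image-copy baseline (copy, reboot and firmware initialization, boot) and 5 + 58 seconds as the proposed system. The variable names are illustrative.

```python
# Assumed grouping of the bar labels; see the paragraph above.
image_copy_total = 370 + 145 + 29   # seconds: copy + reboot/firmware + boot
proposed_total = 5 + 58             # seconds: two boot segments

speedup = image_copy_total / proposed_total
print(f"{image_copy_total} s / {proposed_total} s = {speedup:.1f}x")
```

Under this grouping the ratio is 544 / 63, which rounds to the 8.6 quoted on the slide.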
Cassandra Throughput (Throughout Deployment)
  (Figure: throughput as % of bare-metal, axis 70% to 120%, vs. elapsed time, 50 to 1250 sec; series: Proposed, KVM (No Background Install))
  Seamless de-virtualization
  Eventual bare-metal performance
Storage Throughput
  (Figure: read and write throughput (MB/sec), axis 0 to 120, for Bare-metal, Deploy, Devirt, KVM/Local; bar labels: 117, 112, 112, 112, 112, 115, 101, 100)
  Bare-metal performance
InfiniBand RDMA Latency
  (Figure: latency (usec), axis 0.00 to 1.70, for Baremetal, Deploy, Devirt, KVM/Pass; bar labels: 1.61, 1.30, 1.30, 1.30)
  Bare-metal performance
Conclusion
  Improved agility and elasticity in bare-metal clouds
    De-virtualizable VMM with streaming deployment
    Device-interface-level I/O mediation
  Achieved quick startup of an OS: 8.6 times faster than image copy
  Preserved high performance & OS-transparency
Future Work
  Generating device mediators from specification
    Reduce development cost of device mediators
  More advanced features of IaaS clouds
    Live migration and checkpointing
Thank you