Cloud Operating Systems for Servers
Mike Day, Distinguished Engineer, Virtualization and Linux
August 20, 2014
mdday@us.ibm.com
What Makes a Good Cloud Operating System?
- Consumes Few Resources
- Fast Boot Time
- Provides Containers or Virtual Machines
- Automatically Deployed and Updated
- Updates are Atomic
- Runs Workloads with Excellent Performance
- Good Networking and Storage Support
  - Provides network and block storage to containers or virtual machines
Most Cloud Operating Systems are Based Upon Linux
- Kernel Configuration Enables Build of Tiny Kernels
Most Cloud Operating Systems are Based Upon Linux (cont'd.)
- Kernel Configuration Enables Tiny Kernels
- GPL Enables Source Modification
- Linux Enjoys a Heritage of Embedded Systems
  - Embedded Systems share many requirements with Cloud Operating Systems
- Choice of Network File Systems and Block Services
- Up-to-Date Networking and Storage Support
Innovation in Cloud Operating Systems
- Single-purpose Host OS - designed to Run Multiple Instances of a Different Guest OS
  - Perhaps Using the Same Kernel, Perhaps Not
- Clustered and Distributed Host OS and toolkit
Innovation in Cloud Operating Systems, cont'd.
- Host Designed to Migrate Workloads
- Atomic Updates of Host OS
- Use of Non-Traditional Systems Languages
  - C++, golang
- Unusual Performance Techniques
  - Single Memory Space
  - Abnormally High use of Lock-Free Algorithms and Structures
  - Collaborative Memory Management
  - Tickless Operation
Primary Techniques Used by Cloud Operating Systems to Reduce Overhead
- Shared Host Kernel
  - Linux Containers - Each Workload Shares the Host Kernel
- Tiny, Super-tuned Guest Kernel
  - Running in a Virtual Machine
  - OSv - lockless, single memory space, paravirtual I/O, Cooperative Memory Management, etc.
A Survey of Cloud Operating Systems
- CoreOS: Linux for Massive Server Deployments
  https://coreos.com
- Project Atomic: Deploy and Manage your Docker Containers
  http://www.projectatomic.io
- OSv: Probably the Best OS for Cloud Workloads
  http://osv.io
OSv
http://osv.io
- Specialty Operating System Designed to Run Efficiently in a Virtual Machine
- Single process group, Single Memory space
- Built-in VM for running Java and other languages with same byte codes
- C library, POSIX environment
- virtio drivers, net channels
- Implemented largely in C++
- Significant re-use of FreeBSD ZFS, networking stack
OSv Attacks on Overhead and Jitter
- OSv Attacks on Performance Overhead:
  - Avoids resource starvation through a very small kernel, single flat memory space
  - Reduces exits on faults through a single process group and single memory space
  - No need to translate between user-space and kernel-space addresses
- OSv Attacks on Jitter:
  - JVM collaborates with hypervisor, has intelligent garbage collection
  - Net channels move protocol processing out of the interrupt handler
  - Single-process execution environment reduces synchronization issues
  - Lock-free algorithms, RCU
  - Tickless Scheduler
Docker
http://www.docker.com
- Distributed runtime (with REST API) for deploying Linux Containers (LXC)
- Docker is really about containers (for now)
- Docker package format and online repositories provide the real value
- Linux Containers virtualize the host kernel
- Thinner virtualization than hypervisors, completely integrated with Linux
- Docker Container inherits the performance and jitter characteristics of the host kernel
Docker Attacks on Overhead and Jitter
- Docker Attacks on Performance Overhead:
  - With Docker containers, no additional resource translation beyond kernel and user spaces
  - Containers may use physical I/O devices, in which case we don't need interrupt virtualization
  - Uses less memory than most hypervisors
- Docker Attacks on Jitter:
  - Does not need to virtualize timer, other interrupts
  - More predictable scheduling model (one kernel scheduler - not two)
  - Holds true for I/O schedulers as well
CoreOS
- Small Linux Kernel
- Linux Containers
- Docker
- etcd - Distributed Dictionary - Provides Service Discovery, events and Configuration
- Atomic updates to Host OS through active/passive Partition Scheme
- fleet Clustering - Run Container Workloads Throughout the Cluster
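The active/passive partition scheme can be sketched roughly as follows. This is only an illustration of the idea, not CoreOS's actual updater: the `Host` class, the slot names `A` and `B`, and the `healthy` flag are all hypothetical. The new image is written to the passive slot while the active one keeps running, the host switches slots on reboot, and it stays on the old slot if the new image does not come up.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host with two OS slots; only one is active at a time."""
    slots: dict = field(default_factory=lambda: {"A": "v1", "B": None})
    active: str = "A"

    @property
    def passive(self) -> str:
        return "B" if self.active == "A" else "A"

    def stage(self, version: str) -> None:
        # Write the new image to the passive slot; the running OS is untouched.
        self.slots[self.passive] = version

    def reboot_into_passive(self, healthy: bool) -> str:
        # Switch slots on reboot; if the new image is unhealthy, keep the old slot.
        candidate = self.passive
        if healthy and self.slots[candidate] is not None:
            self.active = candidate
        return self.slots[self.active]


host = Host()
host.stage("v2")
running_after_bad_boot = host.reboot_into_passive(healthy=False)   # stays on v1
running_after_good_boot = host.reboot_into_passive(healthy=True)   # switches to v2
```

Because the running slot is never modified in place, an update either takes effect completely or not at all, which is what makes it atomic from the host's point of view.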
Project Atomic
- Small Linux Kernel
- Linux Containers
- Docker
- Atomic Updates with rpm-ostree
- etcd - distributed dictionary also used in CoreOS
- Anaconda Installer
Cloud OS Performance
- We Can Review Two Different Comparisons
  - Linux Containers versus KVM Virtual Machines
  - OSv Guest versus Linux Guest
LXC Versus KVM Virtual Machines
- Roughly Equal:
  - Memory Bandwidth
  - TCP Throughput
  - Sequential Block IO
  - NoSQL Deployment Scenario
- Containers Exceed VMs:
  - TCP Latency
  - Random Block IO and latency*
  - MySQL Throughput**
* Authors failed to use a key I/O optimization, and used a sub-optimal virtual disk configuration. They should have passed through block partitions and enabled host caching.
** Authors used a sub-optimal virtual disk configuration. They should have passed through block partitions.
http://goo.gl/zqfcl6
https://github.com/thewmf/kvm-docker-comparison
OSv Evaluation
- Compared OSv guest to Fedora 20 guest w/o firewall, on KVM host.
- https://www.usenix.org/system/files/conference/atc14/atc14-paperkivity.pdf
Macro benchmarks
- Memcached. UDP. Single-vCPU guest, loaded with memaslap (90% get, 10% set)
  - OSv throughput 22% better than Linux.
- Memcached reimplemented with packet-filtering API
  - OSv throughput 290% better than baseline.
- SPECjvm2008. Suite of CPU/memory intensive Java workloads.
  - Little use of OS services. Can't expect much improvement. Got 0.5%.
  - Good correctness test (diverse, checks results).
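To compare these figures, it can help to convert "N% better" into a multiplier over the baseline. The sketch below assumes "N% better throughput" means (1 + N/100) times the baseline, which is an interpretation rather than something the slide states; `speedup_factor` is a hypothetical helper name.

```python
def speedup_factor(percent_better: float) -> float:
    """Convert 'N% better throughput' into a multiplier over the baseline."""
    return 1.0 + percent_better / 100.0


# Percentages quoted on the slide; reading them as multipliers is an assumption.
memcached_udp = speedup_factor(22)       # about 1.22x the Linux guest
memcached_filter = speedup_factor(290)   # about 3.9x the baseline
specjvm = speedup_factor(0.5)            # about 1.005x, essentially unchanged
```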
OSv Micro benchmarks
- Netperf - measures network stack performance.
  - TCP single-stream throughput: 24% improvement.
  - UDP and TCP r/r latency: 37%-47% reduction.
- Context switch - two threads alternately wake each other with a pthreads condition variable.
  - 3-10 times faster than in Linux.
  - As little as 328 ns when the two threads are on the same CPU.
- JVM Balloon - microbenchmark where a large heap and a large page cache are needed, but not at the same time.
  - OSv 35% faster than Linux.
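The shape of the context-switch microbenchmark can be sketched as a two-thread ping-pong over a condition variable. This is not the benchmark code used in the evaluation: it uses Python's `threading.Condition` rather than pthreads directly, and the function name `ping_pong` and the round count are assumptions. Interpreter overhead dominates here, so the numbers are not comparable to the 328 ns figure; only the structure is the point.

```python
import threading
import time


def ping_pong(rounds: int) -> float:
    """Return the average time per hand-off, in seconds, between two threads."""
    cond = threading.Condition()
    turn = 0  # which thread may run next: 0 or 1

    def worker(me: int) -> None:
        nonlocal turn
        for _ in range(rounds):
            with cond:
                # Sleep until it is this thread's turn.
                while turn != me:
                    cond.wait()
                # Hand the turn to the other thread and wake it.
                turn = 1 - me
                cond.notify()

    threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return elapsed / (2 * rounds)


per_switch = ping_pong(1000)
print(f"{per_switch * 1e9:.0f} ns per hand-off")
```

Each hand-off forces the woken thread to be scheduled before any further progress is made, so the average time per hand-off approximates the cost of a wake-up plus a context switch.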
OSv Latest unofficial results
- Experimental, non-release code... Need more verification...
- Cassandra stress test, READ, 4 vCPUs, 4 GB RAM
  - OSv 34% better
- Tomcat, servlet sending fixed response, 128 concurrent HTTP connections, measure throughput. 4 vCPUs, 3 GB
  - OSv 41% better
www.ibm.com/systems/kvm
mdday@us.ibm.com