SOFT CONTAINER TOWARDS 100% RESOURCE UTILIZATION
Accela Zhao, Layne Peng




WHAT IS RESOURCE UTILIZATION?
[Figure: the capacity we buy vs. the capacity we use; the gap between them is $$$ wasted.]

ENERGY AND RESOURCE UTILIZATION
- Energy-related costs are 42% of the total (including buying new machines).
- An idle server consumes as much as 70% of the energy it uses when running at full speed.
- Low resource utilization is energy inefficient: wasted energy is wasted money.

A CLOSER LOOK AT THE CLOUD
- The key advantage of the cloud is workload consolidation.
- Improved resource utilization: fewer machines, more apps.
- Energy efficient, and it saves money.

CLOUD RESOURCE UTILIZATION BIG PICTURE
- Scheduling: choose the best resource placement when an app starts.
  Examples: Green Cloud, Paragon, and the schedulers in OpenStack, Kubernetes, Mesos, ...
- Migration: continuously optimize the resource placement while the app is running.
  Examples: OpenStack Watcher, VMware DRS
- Soft Container: dynamically bubble resource constraints up or down in response to co-located apps.
  Related: Google Heracles

CLOUD RESOURCE UTILIZATION BIG PICTURE
[Diagram: apps and the three mechanisms that manage their resources.]
- Scheduler: manages resource utilization at app kick-off.
- Soft Container: manages resource utilization at fine granularity inside a host.
- Migration: manages resource utilization across hosts while the app is running.

CLOUD RESOURCE UTILIZATION BIG PICTURE
- A battle between putting more apps on each host and guaranteeing each app's SLA.
- The key problem: resource interference.

THE KEY PROBLEM: RESOURCE INTERFERENCE
What is resource interference?
- Apps co-located on one host share resources such as CPU, cache, memory, ...
- They interfere with each other, which results in poorer performance than running standalone.
- Resource interference makes it easy to violate an SLA.
Related readings
- Google Heracles: an analysis of resource interference
- Paragon: resource interference-aware scheduling
- Bubble-up: a way to measure resource interference

RESOURCE INTERFERENCE: WHAT DOES IT LOOK LIKE?
[Figure: MySQL running standalone vs. co-located with a CPU- and disk-hungry task.]

RESOURCE INTERFERENCE: HOW TO MEASURE?
Bubble-up
- The setup: run the app co-located with resource benchmarks, where each benchmark stresses one type of resource.
- Resource interference the app tolerates: slowly increase the benchmark stress until the app fails its SLA. The critical point shows how much resource interference the app can tolerate.
- Resource interference the app causes: run the app at the load its SLA requires. The stress it puts on each type of resource is the interference the app causes.
Where to use it?
- Better resource utilization management: scheduling, migration, Soft Container, ...
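The "tolerated interference" step above can be sketched as a simple sweep. This is only an illustration under stated assumptions: `tolerated_interference`, `fake_latency` and the 30 ms SLA are hypothetical names and numbers, and the latency model stands in for a real benchmark run.

```python
from typing import Callable, Optional, Sequence


def tolerated_interference(
    stress_levels: Sequence[float],
    measure_latency_ms: Callable[[float], float],
    sla_latency_ms: float,
) -> Optional[float]:
    """Return the highest stress level at which the app still meets its SLA.

    stress_levels must be sorted in increasing order. Returns None when the
    app already violates the SLA at the lowest level.
    """
    tolerated = None
    for level in stress_levels:
        if measure_latency_ms(level) > sla_latency_ms:
            break  # critical point reached: stop increasing the stress
        tolerated = level
    return tolerated


def fake_latency(cpu_stress: float) -> float:
    """Hypothetical latency model; a real run would measure the app itself."""
    return 10.0 + 40.0 * cpu_stress  # milliseconds


levels = [i / 10 for i in range(11)]  # benchmark intensity from 0.0 to 1.0
critical = tolerated_interference(levels, fake_latency, sla_latency_ms=30.0)
print(critical)
```

Running one sweep per resource benchmark (CPU, disk, memory, ...) would give a tolerance profile per app, which is the kind of input a scheduler or Soft Container could use.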

RESOURCE INTERFERENCE: HOW TO MEASURE?
[Figure: MySQL running standalone vs. co-located with CPU stress vs. co-located with disk stress.]
In my case, MySQL is much more sensitive to CPU interference.

INTRODUCING SOFT CONTAINER
Motivations
- Increase resource utilization by co-locating more apps.
  E.g. a business service is critical but may not use all the resources on the host. Add low-priority Hadoop batch tasks to fill what is left.
- Respond to the dynamic nature of time-varying workloads.
  E.g. the business service may become more idle at lunch time; the Hadoop tasks can then expand their resource bubble and use the leftover.
- Guarantee the SLA of critical apps.
  E.g. when the business service suddenly requires more resources for processing, the Hadoop tasks shrink instantly to give resources back.
Challenges
- Resource control and isolation of interference
- Responding to dynamic workload changes

RESOURCES
- CPU: Core, Time, Quota
- Disk I/O: IOPS, Throughput
- Memory: Size, Bandwidth

RESOURCES - MISSING
- CPU: Core, Time, Quota
- Disk I/O: IOPS, Throughput
- Memory: Size, Bandwidth*
- Cache: LLC
- Network: Ulimit, Bandwidth
- GPU: Device
* Waiting & implemented some in house

ISOLATE THE RESOURCES - NAMESPACE
- clone(): creates a new process and attaches it to a new namespace
- unshare(): creates a new namespace and attaches an existing process to it
- setns(): attaches a process to an existing namespace
/proc/<pid>/ns:
  lrwxrwxrwx 1 root root 0 Jun 21 18:38 ipc -> ipc:[4026532509]
  lrwxrwxrwx 1 root root 0 Jun 21 18:38 mnt -> mnt:[4026532507]
  lrwxrwxrwx 1 root root 0 Jun 16 18:24 net -> net:[4026532512]
  lrwxrwxrwx 1 root root 0 Jun 21 18:38 pid -> pid:[4026532510]
  lrwxrwxrwx 1 root root 0 Jun 21 18:38 user -> user:[4026531837]
  lrwxrwxrwx 1 root root 0 Jun 21 18:38 uts -> uts:[4026532508]
We are still waiting for:
- security namespace
- security keys namespace
- device namespace
- time namespace
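The /proc/<pid>/ns links shown above can be read programmatically: two processes share a namespace when the inode number in the link target is the same. The helper names below (`parse_ns_link`, `namespaces_of`, `share_namespace`) are illustrative, not part of any library, and the `host_shell` values are made up for the comparison.

```python
import os
import re
from typing import Dict, Tuple

# A namespace link target looks like "net:[4026532512]": type, then inode number.
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target: str) -> Tuple[str, int]:
    """Split a /proc/<pid>/ns link target into (namespace type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid: str = "self", proc_root: str = "/proc") -> Dict[str, int]:
    """Map each entry in /proc/<pid>/ns to the inode of its namespace (Linux only)."""
    ns_dir = os.path.join(proc_root, pid, "ns")
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result


def share_namespace(a: Dict[str, int], b: Dict[str, int], entry: str) -> bool:
    """Two processes are in the same namespace when the inode numbers match."""
    return entry in a and entry in b and a[entry] == b[entry]


# "container" uses values from the listing above; "host_shell" is hypothetical.
container = {"net": 4026532512, "user": 4026531837}
host_shell = {"net": 4026531956, "user": 4026531837}
print(share_namespace(container, host_shell, "net"))
print(share_namespace(container, host_shell, "user"))
```

On a Linux host, `namespaces_of()` would return the same kind of mapping for the current process, and `namespaces_of("1234")` for another PID, given sufficient permissions.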

LIMIT THE RESOURCE - CGROUP
- Concepts: task, control group & hierarchy, subsystem
- What can be controlled (subsystems): blkio, cpu, cpuacct, cpuset, devices, freezer, memory, net_cls, net_prio, ns
- Usage: create a cgroup under a subsystem, then change its limit:
  # echo 524288000 > /sys/fs/cgroup/memory/foo/memory.limit_in_bytes
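The echo command above can also be issued from code. The sketch below assumes the cgroup v1 memory controller layout used on the slide; the function names are illustrative, and the demo writes into a temporary directory rather than /sys/fs/cgroup/memory so it runs without root.

```python
import os
import tempfile


def create_cgroup(subsystem_root: str, name: str) -> str:
    """Create a cgroup directory under a mounted subsystem and return its path."""
    path = os.path.join(subsystem_root, name)
    os.makedirs(path, exist_ok=True)
    return path


def set_memory_limit(cgroup_dir: str, limit_bytes: int) -> None:
    """Equivalent of `echo <limit> > <cgroup>/memory.limit_in_bytes` (cgroup v1)."""
    if limit_bytes <= 0:
        raise ValueError("limit must be positive")
    with open(os.path.join(cgroup_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))


def get_memory_limit(cgroup_dir: str) -> int:
    """Read the limit back as an integer number of bytes."""
    with open(os.path.join(cgroup_dir, "memory.limit_in_bytes")) as f:
        return int(f.read().strip())


# Demo against a throwaway directory standing in for /sys/fs/cgroup/memory.
demo_root = tempfile.mkdtemp()
foo = create_cgroup(demo_root, "foo")
set_memory_limit(foo, 500 * 1024 * 1024)  # 500 MiB = 524288000 bytes, as on the slide
print(get_memory_limit(foo))
```

On a real host the first argument to `create_cgroup` would be the mounted subsystem path, and the kernel creates the control files itself when the directory is made.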

MISSING - NETWORK
Community attempts: based on Traffic Control (tc)
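As a rough idea of what a tc-based limiter might issue, the sketch below builds a token bucket filter (tbf) command for one interface. The slide does not say which qdisc the community attempts use, so tbf, the device name `veth0` and the burst and latency defaults are all assumptions; the commands are only constructed here, not executed.

```python
from typing import List


def tbf_limit_command(
    device: str, rate: str, burst: str = "32kbit", latency: str = "400ms"
) -> List[str]:
    """Build a `tc` command that caps egress bandwidth on one interface."""
    return [
        "tc", "qdisc", "add", "dev", device, "root",
        "tbf", "rate", rate, "burst", burst, "latency", latency,
    ]


def clear_limit_command(device: str) -> List[str]:
    """Build the `tc` command that removes the root qdisc again."""
    return ["tc", "qdisc", "del", "dev", device, "root"]


# "veth0" is a hypothetical host-side interface of a container.
print(" ".join(tbf_limit_command("veth0", "100mbit")))
print(" ".join(clear_limit_command("veth0")))
```

A limiter would pass such a list to `subprocess.run(..., check=True)` with root privileges; changing the rate later means replacing the qdisc with a new rate.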

MISSING - GPU
NVIDIA's efforts:
a. GPUs are exposed as separate, normal devices in /dev
b. devices cgroup => partially supported: allow/deny/list access
   i. R (read)
   ii. W (write)
   iii. M (mknod)
Ref: https://github.com/nvidia/nvidia-docker/wiki/gpu-isolation
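The devices cgroup takes rules of the form "type major:minor access", written to its devices.allow or devices.deny file. The sketch below only formats such a rule; `device_rule` is an illustrative helper, and the major number 195 for NVIDIA character devices is an assumption that should be checked against the actual /dev entries on the host.

```python
from typing import Union


def device_rule(dev_type: str, major: Union[int, str], minor: Union[int, str], access: str) -> str:
    """Format one devices-cgroup rule, e.g. 'c 195:0 rw'.

    dev_type: 'c' (char), 'b' (block) or 'a' (all)
    major/minor: a device number, or '*' for any
    access: any combination of r (read), w (write), m (mknod)
    """
    if dev_type not in ("a", "b", "c"):
        raise ValueError(f"unknown device type: {dev_type!r}")
    if not access or set(access) - set("rwm"):
        raise ValueError(f"access must be a combination of r, w, m: {access!r}")
    return f"{dev_type} {major}:{minor} {access}"


# Assumed example: give a container read/write access to the first GPU only.
NVIDIA_MAJOR = 195  # assumption, verify with `ls -l /dev/nvidia*`
print(device_rule("c", NVIDIA_MAJOR, 0, "rw"))
print(device_rule("c", NVIDIA_MAJOR, "*", "rwm"))
```

This controls which GPU devices a container may open; it does not divide the compute or memory of a single GPU between containers, which is why the slide calls the support partial.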

MISSING - CACHE
Intel's efforts:
- Cache Monitoring Technology (CMT)
  - Lets an OS or VMM assign a software-defined ID to each application or VM scheduled to run on a core. This ID is called the Resource Monitoring ID (RMID).
  - Monitors cache occupancy on a per-RMID basis.
  - Lets an OS or VMM read the LLC occupancy for a given RMID at any time.
- Cache Allocation Technology (CAT)
  - The ability to enumerate the CAT capability and the associated LLC allocation support via CPUID.
  - Interfaces for the OS/hypervisor to group applications into classes of service (CLOS) and indicate the amount of last-level cache available to each CLOS. These interfaces are based on MSRs (Model-Specific Registers).
- Code and Data Prioritization (CDP)
  - An extension to CAT: a new CPUID feature flag is added within the CAT sub-leaves at CPUID.0x10.[ResID=1]:ECX[bit 2] to indicate support.

MISSING - MEMORY BANDWIDTH
Monitor: Memory Bandwidth Monitoring (MBM)
- Mechanisms in hardware to monitor cache occupancy and bandwidth statistics, as applicable to a given product generation, on a per software-ID basis.
- Mechanisms for the OS or hypervisor to read back the collected metrics, such as L3 occupancy or memory bandwidth, for a given software ID at any point during runtime.
Control
- Ref: Memory Bandwidth Management for Efficient Performance Isolation in Multi-core Platform: http://pertsserver.cs.uiuc.edu/~mcaccamo/papers/private/ieee_tc_journal_submitted_c.pdf
- Code: https://github.com/heechul/memguard
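Whether a host exposes the cache and memory-bandwidth features named on these slides can be checked from the CPU feature flags that Linux reports. The sketch below parses the text of /proc/cpuinfo; the flag names are those used by recent Linux kernels and may differ or be absent on older ones, and the sample text is made up so the example runs anywhere.

```python
from typing import Dict, Set

# Flag names as reported by recent Linux kernels (assumption: kernel-dependent).
RDT_FLAGS = {
    "cqm_occup_llc": "CMT: LLC occupancy monitoring",
    "cqm_mbm_total": "MBM: total memory bandwidth monitoring",
    "cat_l3": "CAT: L3 cache allocation",
    "cdp_l3": "CDP: L3 code and data prioritization",
}


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def rdt_support(cpuinfo_text: str) -> Dict[str, bool]:
    """Report which monitoring and allocation features the CPU advertises."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in RDT_FLAGS}


# Made-up sample; on a real host use open("/proc/cpuinfo").read() instead.
sample = "processor\t: 0\nflags\t\t: fpu vme cqm cat_l3 cqm_occup_llc\n"
support = rdt_support(sample)
for name, present in support.items():
    print(f"{RDT_FLAGS[name]:45s} {'yes' if present else 'no'}")
```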

WATCH THE WORKLOAD CHANGE
- Latencies: app request latency, disk I/O await, network response time
- Queue length: CPU load average, disk request queue size, network queue length
- Utilization: CPU utilization rate, disk utilization rate, network utilization rate
- Bandwidth: DRAM bandwidth, CPU bandwidth, disk bandwidth
- Request count: app request count, disk IOPS / req/s, network IOPS / req/s
- Granularity: global level, per-container level
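As one example of a global-level utilization metric, CPU utilization can be derived from two samples of the aggregate "cpu" line in /proc/stat. This is a minimal sketch: the function names are illustrative, the two sample lines are made up, and whether iowait counts as busy time is left as a switch because it changes the picture noticeably on disk-bound hosts.

```python
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into its counters.

    Field order: user nice system idle iowait irq softirq steal [guest guest_nice]
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def cpu_utilization(before: List[int], after: List[int], iowait_is_busy: bool = False) -> float:
    """Fraction of CPU time spent busy between two samples."""
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas[:8])  # guest time is already included in user time
    if total == 0:
        return 0.0
    idle = deltas[3]
    if not iowait_is_busy:
        idle += deltas[4]  # treat time waiting on I/O as idle
    return (total - idle) / total


# Made-up samples; a watcher would read /proc/stat twice, some interval apart.
t0 = parse_cpu_line("cpu 100 0 50 800 50 0 0 0")
t1 = parse_cpu_line("cpu 200 0 100 1400 300 0 0 0")
util = cpu_utilization(t0, t1)
util_with_iowait = cpu_utilization(t0, t1, iowait_is_busy=True)
print(util, util_with_iowait)
```

Per-container figures would come from the container's cgroup accounting files rather than /proc/stat.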

THE FEEDBACK CONTROL LOOP
[Diagram: Soft Container consists of a Controller, a Watcher and a Limiter that form a loop around the containers; the loop must respond immediately.]
How do we resize the containers immediately?
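One way to picture the loop is as three pluggable callables: a watcher that returns a metric, a controller that turns the metric into a new limit, and a limiter that applies it. The sketch below is an assumption about how the pieces could be wired, not the authors' implementation; `FeedbackLoop`, the policy and all constants are hypothetical.

```python
from typing import Callable, List

SLA_MS = 20.0    # hypothetical latency SLA of the critical app
MIN_QUOTA = 10   # hypothetical floor for the low-priority container
MAX_QUOTA = 100  # hypothetical ceiling
STEP = 30        # hypothetical expansion step


def shrink_fast_expand_slow(latency_ms: float, quota: int) -> int:
    """Controller policy: drop to the floor on an SLA violation, otherwise grow."""
    if latency_ms > SLA_MS:
        return MIN_QUOTA
    return min(MAX_QUOTA, quota + STEP)


class FeedbackLoop:
    def __init__(
        self,
        watch: Callable[[], float],
        decide: Callable[[float, int], int],
        limit: Callable[[int], None],
        initial_quota: int,
    ) -> None:
        self.watch, self.decide, self.limit = watch, decide, limit
        self.quota = initial_quota

    def tick(self) -> int:
        metric = self.watch()                         # Watcher: observe the critical app
        self.quota = self.decide(metric, self.quota)  # Controller: pick the new limit
        self.limit(self.quota)                        # Limiter: apply it to the container
        return self.quota


# Fake watcher and limiter so the sketch runs without containers.
latencies = iter([5.0, 5.0, 50.0, 5.0])
applied: List[int] = []
loop = FeedbackLoop(lambda: next(latencies), shrink_fast_expand_slow, applied.append, 40)
for _ in range(4):
    loop.tick()
print(applied)
```

The asymmetric policy reflects the stated goal: the low-priority workload gives resources back at once when the critical app needs them, and reclaims them gradually afterwards.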

HOW DO WE LOOK AT SHRINK & EXPANSION?
a. Create a new container
b. Live-migrate the contents to the new container:
   1. Transfer the existing data to the new container
   2. Transfer the instant data to the new container
c. Stop the old container
d. Start the new container
e. Route the traffic to the new container

IN THE CONTAINER'S WORLD
Process: 9527 /usr/sbin/httpd
a. Attach it to a new cgroup, or change the values of its cgroup
b. Done!
Control Groups (cgroup):
- CPU time: 20
- System memory: 1G
- Disk bandwidth: 2000
- Network bandwidth: 100 Mbps
Control Groups (cgroup):
- CPU time: 70
- System memory: 5G
- Disk bandwidth: 8000
- Network bandwidth: 1 Gbps
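Both options in step a come down to writing small files in the cgroup filesystem. The sketch below assumes cgroup v1 file names; mapping the slide's "CPU time" to cpu.shares is an assumption, the helper names are illustrative, and the demo uses a temporary directory in place of a real cgroup mount.

```python
import os
import tempfile
from typing import Dict


def update_values(cgroup_dir: str, values: Dict[str, int]) -> None:
    """Change limits in place by rewriting the cgroup's control files."""
    for filename, value in values.items():
        with open(os.path.join(cgroup_dir, filename), "w") as f:
            f.write(str(value))


def move_process(pid: int, cgroup_dir: str) -> None:
    """Attach a process to another cgroup by writing its PID to cgroup.procs."""
    with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
        f.write(str(pid))


def read_value(cgroup_dir: str, filename: str) -> int:
    with open(os.path.join(cgroup_dir, filename)) as f:
        return int(f.read().strip())


# Throwaway directories standing in for two cgroups of one subsystem mount.
root = tempfile.mkdtemp()
small = os.path.join(root, "small")
large = os.path.join(root, "large")
os.makedirs(small)
os.makedirs(large)

update_values(small, {"cpu.shares": 20, "memory.limit_in_bytes": 1 * 1024**3})
update_values(large, {"cpu.shares": 70, "memory.limit_in_bytes": 5 * 1024**3})

move_process(9527, large)  # the httpd process from the slide now gets the larger limits
print(read_value(large, "cgroup.procs"), read_value(large, "cpu.shares"))
```

Either path takes effect without stopping the process, which is the contrast with the create, migrate, stop, start and reroute sequence needed when resizing by replacement.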

IN THE CONTAINER'S WORLD
Process: 9527 /usr/sbin/httpd
a. Attach it to a new cgroup, or change the values of its cgroup
b. Done!
We need to take a fresh look at resource management from the container's perspective.

SOFT CONTAINER: IMPLEMENTATION
[Architecture diagram; components and their plugins:]
- Controller: algorithm plugins (expand, pin_idle, ..., algorithm plugin N)
- Container Repo: container type plugins (RunC, Docker, ..., container type N)
- Watcher: watcher plugins (CPU, Disk, ..., watcher plugin N)
- Metrics Store: CPU statistics, disk, and more
- Auto discovery
- Limiter: limiter plugins (RunC, Docker, ..., limiter plugin N)
- Containers
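A plugin layout like the one above is often built around a small registry keyed by plugin name. The sketch below shows that pattern for the Limiter only; it is an assumption about the general shape, not the project's actual code, and the plugins return a description of the action instead of calling Docker or RunC.

```python
from typing import Callable, Dict, Tuple

Action = Tuple[str, str, int]  # (runtime, container id, cpu shares)
LIMITER_PLUGINS: Dict[str, Callable[[str, int], Action]] = {}


def limiter_plugin(container_type: str):
    """Decorator that registers a limiter implementation for one container type."""
    def register(fn: Callable[[str, int], Action]) -> Callable[[str, int], Action]:
        LIMITER_PLUGINS[container_type] = fn
        return fn
    return register


@limiter_plugin("docker")
def limit_docker(container_id: str, cpu_shares: int) -> Action:
    # Placeholder: a real plugin would talk to the Docker daemon here.
    return ("docker", container_id, cpu_shares)


@limiter_plugin("runc")
def limit_runc(container_id: str, cpu_shares: int) -> Action:
    # Placeholder: a real plugin would update the RunC container here.
    return ("runc", container_id, cpu_shares)


def apply_limit(container_type: str, container_id: str, cpu_shares: int) -> Action:
    """Dispatch to whichever plugin handles this container type."""
    try:
        plugin = LIMITER_PLUGINS[container_type]
    except KeyError:
        raise ValueError(f"no limiter plugin for {container_type!r}") from None
    return plugin(container_id, cpu_shares)


print(sorted(LIMITER_PLUGINS))
print(apply_limit("docker", "abc123", 512))
```

Watcher and controller algorithm plugins could use the same registry pattern, so adding "plugin N" means adding one decorated function.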

SOFT CONTAINER: CURRENT STATUS
- Supports RunC and Docker containers
- A few controller algorithms that are effective
- Can be extended with more plugins
- Completely runnable!

Demo Time :-)

BENCHMARK RESULTS: BEFORE
If uncontrolled, the MySQL workload suffers severe interference from the co-located low-priority task.

BENCHMARK RESULTS: BEFORE
CPU utilization is far from saturation, while the workload varies over time. (In my case, though, disk I/O is highly utilized.)

BENCHMARK RESULTS: SOFT CONTAINER
With Soft Container (green line), the latency impact is kept under control. (We can improve the algorithm to cope better with peak workloads.)

BENCHMARK RESULTS: SOFT CONTAINER
Soft Container helps improve CPU utilization by co-locating new tasks with MySQL.

BENCHMARK RESULTS: SOFT CONTAINER
CPU utilization looks close to saturation once iowait time is added in.

BENCHMARK RESULTS: SOFT CONTAINER
How the resource bubble floats under the control of Soft Container. (The vibration thresholds are made very sensitive to workload changes.)
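A floating bubble with thresholds can be sketched as a dead-band rule: shrink the low-priority bubble when the critical app's utilization rises above an upper threshold, expand it when utilization falls below a lower one, and hold in between. Reading "vibration threshold" as this dead band is an interpretation, and every name and number below is hypothetical.

```python
from typing import List


def next_bubble(
    bubble: int,
    critical_util: float,
    low: float = 0.6,
    high: float = 0.8,
    step: int = 20,
    floor: int = 10,
    ceiling: int = 100,
) -> int:
    """Return the new size of the low-priority resource bubble."""
    if critical_util > high:
        return max(floor, bubble - step)    # critical app is busy: shrink
    if critical_util < low:
        return min(ceiling, bubble + step)  # critical app is idle: expand
    return bubble                           # inside the dead band: hold


# Made-up utilization trace of the critical app over six sampling intervals.
trace: List[int] = []
bubble = 50
for util in [0.30, 0.50, 0.70, 0.90, 0.95, 0.65]:
    bubble = next_bubble(bubble, util)
    trace.append(bubble)
print(trace)
```

Narrowing the gap between `low` and `high` makes the bubble react to smaller workload changes, at the cost of more frequent resizing.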

Q&A