Chatty Tenants and the Cloud Network Sharing Problem



Similar documents
Predictable Data Centers

Chatty Tenants and the Cloud Network Sharing Problem

Towards Predictable Datacenter Networks

Decentralized Task-Aware Scheduling for Data Center Networks

Silo: Predictable Message Completion Time in the Cloud

The Price Is Right: Towards Location-Independent Costs in Datacenters

IoNCloud: exploring application affinity to improve utilization and predictability in datacenters

Falloc: Fair Network Bandwidth Allocation in IaaS Datacenters via a Bargaining Game Approach

Beyond the Stars: Revisiting Virtual Cluster Embeddings

Towards Predictable Datacenter Networks

QoS-Aware Storage Virtualization for Cloud File Systems. Christoph Kleineweber (Speaker) Alexander Reinefeld Thorsten Schütt. Zuse Institute Berlin

Resource Provisioning of Web Applications in Heterogeneous Cloud. Jiang Dejun Supervisor: Guillaume Pierre

Resource Efficient Computing for Warehouse-scale Datacenters

Paolo Costa

Elasticity-Aware Virtual Machine Placement in K-ary Cloud Data Centers

ElasticSwitch: Practical Work-Conserving Bandwidth Guarantees for Cloud Computing

Allocating Bandwidth in Datacenter Networks: A Survey

Small is Better: Avoiding Latency Traps in Virtualized DataCenters

QoS & Traffic Management

Comparison of Windows IaaS Environments

Choreo: Network-Aware Task Placement for Cloud Applications

A Cooperative Game Based Allocation for Sharing Data Center Networks

Testing Network Virtualization For Data Center and Cloud VERYX TECHNOLOGIES

On Tackling Virtual Data Center Embedding Problem

IaaS Multi Tier Applications - Problem Statement & Review

Figure 1. The cloud scales: Amazon EC2 growth [2].

Data Center Network Topologies: VL2 (Virtual Layer 2)

Data Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team

A hierarchical multicriteria routing model with traffic splitting for MPLS networks

Cloud Computing Trends

End-to-end Performance Isolation through Virtual Datacenters

Airlift: Video Conferencing as a Cloud Service using Inter- Datacenter Networks

Recommendations for Performance Benchmarking

EyeQ: Practical Network Performance Isolation for the Multi-tenant Cloud

Towards an understanding of oversubscription in cloud

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment

Resource Provisioning of Web Applications in Heterogeneous Clouds

Computer Networks COSC 6377

On Tackling Virtual Data Center Embedding Problem

Performance Evaluation of Linux Bridge

Lecture 7: Data Center Networks"

Memory Access Control in Multiprocessor for Real-time Systems with Mixed Criticality

Cisco Meraki MX products come in 6 models. The chart below outlines MX hardware properties for each model: MX64 MX64W MX84 MX100 MX400 MX600

Fixed Price Website Load Testing

Scalable Network Virtualization in Software-Defined Networks

FairCloud: Sharing the Network in Cloud Computing

A dynamic optimization model for power and performance management of virtualized clusters

Cisco Meraki MX products come in 6 models. The chart below outlines MX hardware properties for each model: MX60 MX60W MX80 MX100 MX400 MX600

Rackspace Cloud Databases and Container-based Virtualization

2013 WAN Management Spectrum. October 2013

Testing & Assuring Mobile End User Experience Before Production. Neotys

Cloud Scale Resource Management: Challenges and Techniques

Cloud Computing: Meet the Players. Performance Analysis of Cloud Providers

Managed Virtualized Platforms: From Multicore Nodes to Distributed Cloud Infrastructures

COMMA: Coordinating the Migration of Multi-tier Applications

Network Infrastructure Services CS848 Project

Cloud Computing for Research Roger Barga Cloud Computing Futures, Microsoft Research

1. First thing you'll do is login to the New Control Panel, select Load Balancers from the list at the top, and then select Create Load Balancer.

What is Cloud Computing? Tackling the Challenges of Big Data. Tackling The Challenges of Big Data. Matei Zaharia. Matei Zaharia. Big Data Collection

Data Center Network Topologies: FatTree

SDN Interfaces and Performance Analysis of SDN components

Cloud Server Performance A Comparative Analysis of 5 Large Cloud IaaS Providers

STeP-IN SUMMIT June 18 21, 2013 at Bangalore, INDIA. Performance Testing of an IAAS Cloud Software (A CloudStack Use Case)

Relocating Windows Server 2003 Workloads

Web Load Stress Testing

Greenhead: Virtual Data Center Embedding Across Distributed Infrastructures

Performance and Scalability Best Practices in ArcGIS

Part2 Hyper-V Replica and Hyper-V Recovery Manager. Datacenter Specialist

INFO5011. Cloud Computing Semester 2, 2011 Lecture 11, Cloud Scheduling

Optimizing Data Center Networks for Cloud Computing

Installing Intercloud Fabric Firewall

Cloud Computing. Up until now

Variations in Performance and Scalability when Migrating n-tier Applications to Different Clouds

Reverse Auction-based Resource Allocation Policy for Service Broker in Hybrid Cloud Environment

IaaS Federation. Contrail project. IaaS Federation! Objectives and Challenges! & SLA management in Federations 5/23/11

Gatekeeper: Supporting Bandwidth Guarantees for Multi-tenant Datacenter Networks

Performance In the Cloud. White paper

How To Create A Cloud Based System For Aaas (Networking)

IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures

Dynamic Bandwidth Allocation for Multiple Network Connections: Improving User QoE and Network Usage of YouTube in Mobile Broadband

Towards an Optimized Big Data Processing System

COMMA: Coordinating the Migration of Multi-tier Applications

Fault-Tolerant Framework for Load Balancing System

Liferay Portal Performance. Benchmark Study of Liferay Portal Enterprise Edition

Efficient Load Balancing Algorithm in Cloud Computing

Microsoft Azure for IT Professionals 55065A; 3 days

Integrating Active Directory Federation Services (ADFS) with Office 365 through IaaS

WITH the ability to scale computing resources on demand

Performance Testing Percy Pari Salas

OPTIMAL MULTI SERVER CONFIGURATION FOR PROFIT MAXIMIZATION IN CLOUD COMPUTING

CLOUD PERFORMANCE TESTING - KEY CONSIDERATIONS (COMPLETE ANALYSIS USING RETAIL APPLICATION TEST DATA)

Introduction to Software Defined Networking (SDN) and how it will change the inside of your DataCentre

Investigating the Pricing Impact on the QoS of a Server Farm Deployed on the Cloud

Load Balance Scheduling Algorithm for Serving of Requests in Cloud Networks Using Software Defined Networks

ONE of the often cited benefits of cloud computing

Navigating The World of Cloud Computing

Experimental Awareness of CO 2 in Federated Cloud Sourcing

Load Imbalance Analysis

DOCLITE: DOCKER CONTAINER-BASED LIGHTWEIGHT BENCHMARKING ON THE CLOUD

What We Talk About When We Talk About Cloud Network Performance

Transcription:

Chatty Tenants and the Cloud Network Sharing Problem Hitesh Ballani, Keon Jang, Thomas Karagiannis Changhoon Kim, Dinan Gunawardena, Greg O Shea MSR Cambridge, Windows Azure

This talk is about... How to share the network in multi-tenant datacenters? Multi-tenant datacenters Public cloud datacenters Windows Azure, Amazon EC2, Rackspace, Tenants: users renting virtual machines Private cloud datacenters 2

A use-case of cloud datacenters Intra-tenant traffic Inter-tenant traffic Cloud Provider Storage Services Network is shared! Map reduce Physical Host Physical Host Virtual machines for blue tenant 3

Requirements for network sharing [ FairCloud] Tenants want predictable performance / cost Req 1. Minimum bandwidth guarantee Not all flows are equal: some tenants pay more Req 2. Proportionality Utilize spare resources as much as possible Req 3. High utilization 4

Existing solutions for network sharing Today Seawall Oktopus FairCloud (PS-L) FairCloud (PS-P) With Inter-tenant traffic Min-guarantee Proportionality High utilization Harder Problematic Prior work focuses on intra tenant traffic 5

Chatty tenants in the cloud Typical cloud applications have many dependency Provider Services A tenant s own File Storage SQL DB Web Frontend Cache Wan Optimizer Firewall Third Party Services in Marketplace 6

Inter-tenant traffic (%) Prevalence of inter-tenant traffic Measurement from 8 datacenters of a public cloud service provider 40 30 20 10 0 1 2 3 4 5 6 7 8 Datacenter Inter-tenant traffic accounts for 10-35% of traffic! 7

Min bandwidth guarantee is harder On what links bandwidth guarantee is required? Intra-tenant traffic Inter-tenant traffic Physical Host Physical Host Inter-tenant traffic leads to richer communication pattern and makes minimum bandwidth guarantee harder! 8

How to define proportionality? P and Q are paying same amount p q 1000Mbps r1 r2 r3 r4 Allocation P (Mbps) Q (Mbps) Per flow 250 750 Seawall 250 750 FairCloud 333 666 Q: Whose payment should dictate the flow bandwidth? 9

Hadrian Overview What semantics should we provide to tenants? Virtual network abstraction How to allocate bandwidth? Hose-compliant bandwidth allocation This talk How to place virtual machines? Greedy heuristic that guarantees min bandwidth 10

Hadrian Overview What semantics should we provide to tenants? Virtual network abstraction How to allocate bandwidth? Hose-compliant bandwidth allocation How to place virtual machines? Greedy heuristic that guarantees min bandwidth 11

State of the art: Hose-model Tenant Request: <N, B> Each is guaranteed to send/receive at B minimum bps of B bps B P 1 Virtual Switch 2 B P p B P Minimum bandwidth guarantee High-utilization Tenant P s 12

State of the art: Hose-model Tenant Request: <N, B> Each is guaranteed to send/receive at minimum of B bps Virtual Switch Virtual Switch Virtual Switch B P B P B P B Q B Q B Q B R B R B R 1 2 p 1 2 q 1 2 r Tenant P s Tenant Q s Tenant R s 13

Multi-hose model Tenant Request: <N, B> s in different tenants communicates with each other at a rate of min(bp, Bq) B P 1 Switch Switch Switch 2 B P p B P Multi Hose Switch B Q Assumes same inter- and intra- tenant bandwidth 1 2 B Q B Q q B R 1 2 B R B R r Tenant P s Tenant Q s Tenant R s Allows Inter-tenant communication 14

Hierarchical hose model Tenant Request: <N, B, B inter > B P 1 2 p Virtual Inter-tenant Switch Virtual Switch Virtual Switch Virtual Switch B P B P inter 1 2 B Q inter Assumes all-to-all communication B P B Q B Q B Q q B R 1 2 B R inter B R B R r Tenant P s Tenant Q s Tenant R s Separate inter-tenant bandwidth requirement 15

Communication dependency Most tenants communicate with only few other tenants Tenant Request: <N, B, B inter, list of dependent tenants> Reduces possible communication patterns Helps place dependent tenants closer Q:How about service tenants (e.g., storage )? Tenant Request: <N, B, B inter, *> 16

Hadrian Overview What semantics should we provide to tenants? Virtual network abstraction How to allocate bandwidth? Hose-compliant bandwidth allocation How to place virtual machines? Greedy heuristic that guarantees min bandwidth 17

Hose-compliant bandwidth allocation Whose payment should dictate the bandwidth of the flow? p 100Mbps 20 20 20 20 20 20 Min 40 40 5 flows 5 flows q 200Mbps Our approach : take minimum from two sides 40 40 40 18

Hose-compliant bandwidth allocation Weighted fair-share at the bottleneck p B p N p flows W pq? N q flows q B q W pq = min( B p N p, B q N q ) 19

Upper bound proportionality p 100Mbps 5 flows 20 20 20 20 20 Min (20,?) x 5 100 Max aggregate weight for p s flow = 100 Upper bound of total weight of s flows is proportional to the s payment 20

Minimum bandwidth guarantee Total weight for all flows of a given is bounded 500Mbps 500Mbps p q N flows 1000Mbps r1 r2 Total weight 1000 rk Placement algorithm enforces Total Weight Capacity on any link in the network Min bandwidth guarantee The verification can be formulated as max flow network problem 21

Evaluation Synthetic cloud workload benchmark Tenants submit requests for s and execute jobs A job has CPU Processing, Inter-tenant traffic, Intra-tenant traffic Inter-tenant traffic ratio: 10-40% Fraction of tenant w/ inter-tenant : 20% Environments Testbed: 16 end hosts Large scale simulation: 16,000 end hosts 22

Evaluation criteria Network sharing requirements Minimum bandwidth guarantee Upper-bound proportionality High-utilization Benefits of Hadrian Metric: acceptance ratio Comparison with Baseline: per flow sharing Existing approaches: Oktopus, FairCloud 23

Job completion time Always finishes in time Utilize spare resources 3.4x at 95 th percentile Relative job completion time against estimate Estimate = Minimum amount of bandwidth data / bandwidth guarantee requirement High utilization 24

Throughput for p (Mbps) Bandwidth allocation 500 p 1000Mbps r1 400 300 Per flow FairCloud(PS-L) Hose Compliant q N flows r2 rk 200 100 0 1 3 5 7 9 11 13 15 Number of flows for q Upper bound proportionality 25

Acceptance Ratio (%) Request acceptance ratio testbed 100 80 60 40 Simulation Testbed 35 37 72 72 37% more jobs 20 0 Baseline Hadrian Similar benefits in large scale simulations Testbed and Simulation shows consistent result 26

Summary We show that Inter-tenant traffic is prevalent 10~35% from a major public cloud provider We propose Hadrian Virtual network abstraction: inter-tenant, dependency Bandwidth allocation strategy: upper-bound proportionality Placement algorithm: greedy dependency aware packing Our evaluation show that Hadrian meets three network sharing requirements Hadrian delivers predictability and higher efficiency 27

Thank you