Designing and Experimenting with Data Center Architectures. Aditya Akella, UW-Madison


Gartner: data center spending to grow 3.7% to $146 billion in 2013.
Source: http://www.infotechlead.com/2013/03/28/gartner-data-center-spending-to-grow-3-7-to-146-billion-in-2013-8707
(Talk given at Charles E. Leiserson's 60th Birthday Symposium.)

What to build? This question has spawned a cottage industry in the computer networking research community:
- Fat-tree [SIGCOMM 2008]
- VL2 [SIGCOMM 2009, CoNEXT 2013]
- DCell [SIGCOMM 2008]
- BCube [SIGCOMM 2009]
- Jellyfish [NSDI 2012]

Fat-tree [SIGCOMM 2008]
Isomorphic to a butterfly network except at the top level.
Bisection width n/2, oversubscription ratio 1.
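To make the arithmetic behind these numbers concrete, here is a small Python sketch (not from the talk; the function name and the k = 48 example are illustrative) of the k-ary fat-tree construction from the SIGCOMM 2008 paper, in which k-port switches support k³/4 hosts at full bisection:

```python
# Parameters of a k-ary fat-tree built from k-port switches
# (Al-Fares et al., SIGCOMM 2008). Illustrative sketch only.
def fat_tree_params(k):
    assert k % 2 == 0, "the construction needs an even port count"
    hosts = k**3 // 4            # k pods x (k/2 edge switches) x (k/2 hosts each)
    edge = agg = k * (k // 2)    # edge and aggregation switches overall
    core = (k // 2) ** 2         # core switches
    bisection = hosts // 2       # bisection width n/2: oversubscription ratio 1
    return {"hosts": hosts, "edge": edge, "agg": agg,
            "core": core, "bisection": bisection}

print(fat_tree_params(48))       # 48-port switches: 27,648 hosts
```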

VL2 [SIGCOMM 2009, CoNEXT 2013]
Also called a Clos network.
Oversubscription ratio 1 (but 1 Gbps links at the leaves, 10 Gbps elsewhere).

How to compare networks?
- Bisection width
- Diameter
- Maximum degree
- Degree sequence
- Area or volume
- Fault tolerance
- Cost

A Universal Approach: build a single network that is competitive, for any application, with any other network that can be built at the same cost.

Area-Universality Theorem [Leiserson, 1985]: There is a fat-tree network of area n that can emulate any other network that can be laid out in area n with slowdown O(log³ n).
- Later improved to O(log n) slowdown.
- "Area" can be replaced by "volume".

Coarse Structure of Fat-Tree Network. [Figure: a tree of nodes and channels, with the leaves at the bottom.]

Example of Fine Structure of a Fat-Tree: the Butterfly Fat-Tree (Greenberg-Leiserson). [Figure: switches and links, with processors at the leaves.]

Recursive Decomposition of VLSI Layout
Idea: match fat-tree channel capacity to the maximum number of wires cut in the layout.
[Figure: recursive bisection of an area-n layout, with channel capacity 4√n at the top level and 8√n at the next; nodes, channels, and leaves labeled.]

Layout of Area-Universal Fat-Tree
Top layer capacity: 4√n. Next layer capacity: 8√n, but at half the wire length. Equal area at each layer!
A message crossing a link in the emulated network becomes a message the fat-tree sends across a pair of leaves.
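A quick numeric check of the equal-area observation, as a sketch under the assumption (per the 4√n and 8√n figures above) that total channel capacity doubles while wire length halves at each level below the root; n = 4096 is an arbitrary choice:

```python
import math

n = 4096                                     # assumed layout area
for level in range(4):
    capacity = 4 * math.sqrt(n) * 2**level   # total channel capacity at this layer
    wire_len = 1.0 / 2**level                # relative wire length at this layer
    print(level, capacity * wire_len)        # constant product: equal area per layer
```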

The Universal Approach for Data Center Design
Allocate the same amount of money to each level in a three- or four-level fat-tree network.
Levels: within a rack, between racks in a row, between rows, etc.
Rule of thumb: build the best network at the lowest level; match that cost at the higher levels (see the sketch below).
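A minimal sketch of this budgeting rule in Python (the per-level dollar-per-Gbps figures are invented for illustration, not from the talk): spend equally per level, and the capacity bought at each level follows from that level's unit cost.

```python
# Split the budget equally across fat-tree levels; capacity bought per
# level follows from that level's (hypothetical) cost per Gbps.
def universal_allocation(budget, cost_per_gbps):
    per_level = budget / len(cost_per_gbps)
    return {level: per_level / cost for level, cost in cost_per_gbps.items()}

costs = {"in-rack": 1.0, "rack-to-rack": 4.0, "row-to-row": 10.0}  # $/Gbps, invented
print(universal_allocation(3_000_000, costs))  # Gbps purchasable at each level
```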

Caveats
- Assumes that performance is proportional to cost (e.g., for one-third the cost, you can buy one-third the capacity).
- Assumes that it is physically possible to spend the same amount at each level.
- Assumes that bandwidth bottlenecks, not latency, limit performance.

Trends
We looked at a large-scale network deployed by a major provider of on-line services. Five years ago, money spent on the layers (bottom up) was in the ratio 6:2:1. In 2014, the ratio was 2:2:1: we're building nearly cost-universal networks today!
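Normalizing the two ratios shows how much closer the 2014 spending is to the equal per-level split that universality prescribes (a small illustrative computation, not data from the talk):

```python
def normalize(ratio):
    total = sum(ratio)
    return [round(r / total, 2) for r in ratio]

print(normalize([6, 2, 1]))  # [0.67, 0.22, 0.11]: bottom-heavy
print(normalize([2, 2, 1]))  # [0.4, 0.4, 0.2]: close to 1/3 per level
```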

The Need Addressed by CloudLab
How do we optimize software (e.g., to take superfluous latency out)? How do we best design and run applications? Storage, networking, virtualization, ...
To investigate these questions, we need flexible, scalable scientific infrastructure that enables exploration of fundamental science in the cloud and ensures repeatability of research.

The CloudLab Vision
A meta-cloud for building clouds:
- Build your own cloud on our hardware resources.
- Agnostic to specific cloud software: run existing cloud software stacks (like OpenStack, Hadoop, etc.) or new ones built from the ground up.
- Control and visibility all the way to the bare metal.
- Sliceable for multiple, isolated experiments at once.
With CloudLab, it will be as easy to get a cloud tomorrow as it is to get a VM today.

What Is CloudLab?
- Supports transformative cloud research.
- Built on Emulab and GENI.
- Control to the bare metal.
- Diverse, distributed resources.
- Repeatable and scientific.
[Diagram: four example slices run side by side (Slice A: geo-distributed storage research; Slice B: stock OpenStack; Slice C: virtualization and isolation research; Slice D: allocation and scheduling research for cyber-physical systems) across the Utah, Wisconsin, and Clemson sites, connected via GENI CC-NIE, Internet2 AL2S, and regional networks.]

CloudLab's Hardware
One facility, one account, three locations:
- About 5,000 cores each (15,000 total), 8-16 cores per node.
- Baseline: 4 GB RAM / core.
- Latest virtualization hardware.
- TOR / core switching design.
- 10 Gb to nodes, SDN; 100 Gb to Internet2 AL2S.
- Partnerships with multiple vendors.
Wisconsin (Cisco): storage and networking. Per node: 128 GB RAM, 2x 1 TB disk, 400 GB SSD. Clos topology.
Clemson (Dell): high memory: 16 GB RAM / core, 16 cores / node. Bulk block store; networking up to 40 Gb. High capacity.
Utah (HP): power-efficient: ARM64 / x86, power monitors, flash on the ARMs, disk on the x86. Very dense.

Wisconsin/Cisco topology. [Diagram: Nexus 3132Q 40G switches connect Nexus 3172PQ top-of-rack switches (8x 10G uplinks each) over 40G links; 20x12 servers attach at 2x 10G each.]

Compute and storage
- 90x Cisco 220 M4: 2x 8 cores @ 2.4 GHz, 128 GB RAM, 1x 480 GB SSD, 2x 1.2 TB HDD.
- 10x Cisco 240 M4: same CPU and RAM, plus 1x 1 TB HDD and 12x 3 TB HDD (donated by Seagate).
- Over the next year: 140 additional servers; a limited number of accelerators, e.g., FPGAs and GPUs (planned).

Networking
Nexus 3132Q and 3172PQ switches:
- OpenFlow 1.0 (working with Cisco on OF 1.3 support).
- Monitoring of instantaneous queue lengths.
- Fine-grained tracing of control-plane actions.
- Support for multiple virtual router instances per router.
- Support for many routing protocols.

Experiments supported
A large number of nodes/cores, and bare-metal control over nodes and switches, for sophisticated network/memory/storage research:
- Network I/O performance (e.g., Presto), intra-cloud routing (e.g., CONGA), and transport (e.g., DCTCP).
- Network virtualization (e.g., CloudNaaS, OpenNF).
- In-memory big-data frameworks (e.g., Spark/Shark, Graphene).
- Cloud-scale resource management and scheduling (e.g., Mesos, Tetris).
- New models for cloud storage (e.g., tiered storage, flat storage, IOFlow, split-level scheduling).
- New architectures (e.g., RAMCloud for storage).

Technology Foundations
Built on Emulab and GENI (ProtoGENI):
- Provisions, then gets out of the way.
- Run-time services are optional.
- Controllable through a web interface and GENI APIs.
A scientific instrument for repeatable research:
- Physical isolation for most resources.
- Profiles capture everything needed for an experiment: software, data, and hardware details.
- Profiles can be shared and published (e.g., in papers); see the sketch below.
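For illustration, a minimal sketch of a CloudLab profile written with geni-lib, the Python library CloudLab profiles use. The node names and the hardware type are assumptions for this example, not values from the talk.

```python
import geni.portal as portal

pc = portal.Context()
request = pc.makeRequestRSpec()

# Two bare-metal nodes on a shared LAN; the profile records the hardware
# details so the experiment can be re-instantiated repeatably.
lan = request.LAN("lan0")
for i in range(2):
    node = request.RawPC("node%d" % i)
    node.hardware_type = "c220g1"         # assumed Wisconsin node type
    iface = node.addInterface("if%d" % i)
    lan.addInterface(iface)

pc.printRequestRSpec(request)
```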

Sign Up At CloudLab.us