Networking in the Big Data Era



Networking in the Big Data Era Nelson L. S. da Fonseca Institute of Computing, State University of Campinas, Brazil e-mail: nfonseca@ic.unicamp.br IFIP/IEEE NOMS, Krakow May 7th, 2014

Outline: What is Big Data? What is the role of networking in Big Data? What are the sources of Big Data? What are the issues in networking for Big Data? How can Big Data be processed, transferred and stored in a friendly way?

What is Big Data?

In 60 seconds.. https://plus.google.com/+avinash/posts/mgyatu6mbhd

It is not just about Volume!

Big Data: Siewert, S. B. Big data in the cloud, IBM developerWorks, Tech. Rep., http://www.ibm.com/developerworks/library/bd-bigdatacloud/#what-is-big-data, July 9, 2013.

Big Data and Enterprise: http://wikibon.org/wiki/v/big_data_market_size_and_vendor_revenues Analytics: The real-world use of big data: How innovative enterprises extract value from uncertain data, Executive Report, IBM Institute for Business Value

What is the role of networking in Big Data?

https://www.usenix.org/legacy/event/usenix99/invited_talks/mashey.pdf

Infrastress: Alibaba Mall processed in a single day (Nov 11th, 2013) 105.8 million online transactions from 213 million users and 4.1 billion transactions

Networking..

Networking Computing Network Bandwidth Communication delays (tolerance) Degree of interactivity Storage

What are the sources of data?

Map-Reduce: Facebook trace analysis: 30% to 50% of running time is taken up by the communication phase

Schedulers that are data-location aware decrease network traffic as well as I/O operations. How to schedule tasks with heterogeneous (CPU, I/O) demands to promote load balance? How to benefit from YARN resource management?
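The idea of a data-location-aware scheduler can be illustrated with a minimal sketch. It is not how Hadoop or YARN implement scheduling; the function name `schedule`, the input layout and the tie-breaking rule (most free slots first) are assumptions made only for this example.

```python
def schedule(tasks, block_locations, free_slots):
    """Greedy locality-aware assignment (illustrative sketch only).

    tasks: list of (task_id, block_id) pairs
    block_locations: dict block_id -> set of nodes holding a replica
    free_slots: dict node -> number of free task slots (not mutated)
    Returns dict task_id -> (node, is_local).
    """
    slots = dict(free_slots)
    assignment = {}
    for task_id, block_id in tasks:
        # Nodes that hold the input block and still have a free slot.
        local = [n for n in sorted(block_locations.get(block_id, ()))
                 if slots.get(n, 0) > 0]
        if local:
            # Data-local: no network transfer needed for the input.
            node, is_local = max(local, key=lambda n: slots[n]), True
        else:
            # Fall back to any node with capacity; input crosses the network.
            candidates = [n for n in sorted(slots) if slots[n] > 0]
            if not candidates:
                raise RuntimeError("no free slots left")
            node, is_local = max(candidates, key=lambda n: slots[n]), False
        slots[node] -= 1
        assignment[task_id] = (node, is_local)
    return assignment


if __name__ == "__main__":
    plan = schedule(
        tasks=[("t1", "b1"), ("t2", "b1"), ("t3", "b2")],
        block_locations={"b1": {"n1"}, "b2": {"n2"}},
        free_slots={"n1": 1, "n2": 2},
    )
    remote = sum(1 for _node, is_local in plan.values() if not is_local)
    print(plan, "remote reads:", remote)
```

Each remote assignment stands for one input block that has to be fetched over the network, which is the traffic a locality-aware scheduler tries to avoid.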

Scientific Computation The Montage application created by NASA/IPAC stitches together multiple input images to create custom mosaics of the sky.

Current cloud tools do not provide an out-of-the-box solution to address application needs. Interconnects are the major obstacle to broad adoption of cloud computing for larger-scale, more tightly coupled HPC applications

Sensing-as-a-Service: C. Perera, A. Zaslavsky, P. Christen and D. Georgakopoulos, "Sensing as a service model for smart cities supported by Internet of Things", Transactions on Emerging Telecommunications Technologies, 2014; 25:81-93

Objects may need to be uniquely identified, or to be identified as belonging to a given class. Multi-service platform. Distributed processing of data traces. Distributed flow control. Privacy.

What are the issues in networking for Big Data?

Data Centers

Cloud Data Center Traffic: Cisco Global Cloud Index: Forecast and Methodology, 2012-2017

Canonical Data Center Architecture: Core (L3), Aggregation (L2), Edge (L2) Top-of-Rack, Application servers

Data Center Traffic: Most flows are small (< 10 KB). Most of the bytes are in the top 10% of large flows. Traffic leaving edge switches is ON-OFF, with lognormal distributions. Packet size distribution is bimodal (200 to 1400 B). T. Benson, A. Akella, and D. A. Maltz. 2010. Network traffic characteristics of data centers in the wild. In Proc. of the 10th ACM SIGCOMM Conference on Internet Measurement (IMC '10). ACM, New York, NY, USA, 267-280

Data Center Traffic: In cloud data centers the majority of flows stay within the rack (80%), while in enterprise and university data centers this varies from 40% to 90%. The core layer is the most utilized; the edge layer is lightly utilized. The core layer contains hot spots, but in fewer than 25% of links. No need for more bisection bandwidth. Most losses occur in links with low utilization, due to bursty traffic.
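The two flow-size observations (many small flows, most bytes in a few large flows) can be checked on any list of flow sizes with a few lines of code. The sketch below is illustrative only: `flow_summary` is a hypothetical helper, the sample sizes are made up, and 10 KB is taken as 10 * 1024 bytes.

```python
def flow_summary(flow_sizes, small_limit=10 * 1024, top_fraction=0.10):
    """Summarize a list of positive flow sizes given in bytes.

    Returns the fraction of flows smaller than small_limit and the
    share of all bytes carried by the largest top_fraction of flows.
    """
    if not flow_sizes:
        raise ValueError("flow_sizes must not be empty")
    ordered = sorted(flow_sizes, reverse=True)
    top_n = max(1, int(len(ordered) * top_fraction))
    total_bytes = sum(ordered)
    small = sum(1 for size in ordered if size < small_limit)
    return {
        "small_flow_fraction": small / len(ordered),
        "top_flows_byte_share": sum(ordered[:top_n]) / total_bytes,
    }


if __name__ == "__main__":
    # Made-up sample: nine 5 KB flows and one 1 MB flow.
    sample = [5 * 1024] * 9 + [1024 * 1024]
    print(flow_summary(sample))
```

In the made-up sample, nine of the ten flows are small, yet the single large flow carries nearly all of the bytes, which is the shape of distribution the measurements describe.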

VM Processes: VM arrival and departure processes are self-similar, power law. VMs in the system: ARIMA model. Yi Han, Jeffrey Chan and Christopher Leckie. Analysing Virtual Machine Usage in Cloud Computing. In Proceedings of the IEEE 2013 3rd International Workshop on Performance Aspects of Cloud and Service Virtualization, 2013

Need for longitudinal studies on cloud (data center) traffic characterization. Need for publicly available traces.

Data Center Network: Fat Tree, DCell, BCube, Jellyfish

Relieve upper-layer switches through load balancing, to avoid a few hot switches being overloaded. Networks should guarantee good isolation and stable service among multiple tenants.
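As a worked example for the first topology in this list, the sketch below computes the size of a k-ary fat tree built from identical k-port switches. The formulas come from the standard three-layer fat-tree construction, not from the slides, and `fat_tree_size` is a name chosen for this illustration.

```python
def fat_tree_size(k):
    """Element counts of a k-ary fat tree built from k-port switches.

    Standard construction: k pods, each with k/2 edge and k/2
    aggregation switches, (k/2)^2 core switches, and k/2 hosts per
    edge switch.
    """
    if k < 2 or k % 2:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,
        "aggregation_switches": k * half,
        "core_switches": half * half,
        "hosts": k * half * half,  # equals k^3 / 4
    }


if __name__ == "__main__":
    for ports in (4, 48):
        print(ports, fat_tree_size(ports))
```

With 48-port switches the construction reaches 27,648 hosts, which shows how the topology scales with commodity switch radix.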

Hybrid Data Center Networks Christoforos Kachris and Ioannis Tomkos "Optical interconnection networks for data centers", ONDM 2013

Need for high-radix, scalable, energy-efficient data centers that can sustain the exponential increase in network traffic.

VM Placement: Non-trivial network topology for scalability and reliability. Multi-path routing; routes can change dynamically. Heterogeneous services; large variety of run-time traffic patterns. Unpredictable traffic variability due to unpredictable request spikes and service-dependent operations.

Need for traffic-aware VM placement that takes into consideration the correlations of VM traffic as well as traffic variability; dynamic placement decisions.

VM Migration: Improvement of data and network locality. Not always possible to maintain the same IP address, leading to service disruption. WAN migration: trade-off between bandwidth and downtime. R. Boutaba, Q. Zhang and M. F. Zhani. Virtual Machine Migration in Cloud Computing Environments: Benefits, Challenges, and Approaches. In Communication Infrastructures for Cloud Computing. H. Mouftah and B. Kantarci (Editors). IGI-Global, USA. pp. 383-408, September, 2013

Transport protocols to handle service disruption. Sophisticated management strategies for large-scale VM deployment. Development of an inter-data-center VM migration framework. VM migration to facilitate the collaboration between cloud and mobile devices.
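A very simple form of traffic-aware placement can be sketched as a greedy heuristic that co-locates the VM pairs exchanging the most traffic. This is only an illustration under strong assumptions: it uses mean pairwise rates and ignores the traffic variability and dynamic re-placement that the slide calls for. The names `place_vms` and `inter_rack_traffic` and the input layout are hypothetical.

```python
def place_vms(traffic, vms, racks, capacity):
    """Greedy placement: heaviest-communicating VM pairs share a rack.

    traffic: dict (vm_a, vm_b) -> mean traffic rate between the pair
    vms: iterable of all VM names
    racks: list of rack names
    capacity: maximum number of VMs per rack
    Returns dict vm -> rack.
    """
    placement = {}
    load = {rack: 0 for rack in racks}

    def put(vm, preferred=None):
        if vm in placement:
            return
        if preferred is not None and load[preferred] < capacity:
            rack = preferred
        else:
            free = [r for r in racks if load[r] < capacity]
            if not free:
                raise RuntimeError("not enough rack capacity")
            rack = min(free, key=lambda r: load[r])  # least-loaded rack
        placement[vm] = rack
        load[rack] += 1

    # Visit pairs from heaviest to lightest traffic.
    for (a, b), _rate in sorted(traffic.items(), key=lambda kv: -kv[1]):
        if a in placement:
            put(b, placement[a])
        elif b in placement:
            put(a, placement[b])
        else:
            put(a)
            put(b, placement[a])
    for vm in vms:  # VMs that exchange no traffic
        put(vm)
    return placement


def inter_rack_traffic(traffic, placement):
    """Total rate that has to cross rack boundaries."""
    return sum(rate for (a, b), rate in traffic.items()
               if placement[a] != placement[b])


if __name__ == "__main__":
    demand = {("a", "b"): 10, ("c", "d"): 8, ("a", "c"): 1}
    result = place_vms(demand, ["a", "b", "c", "d"], ["r1", "r2"], capacity=2)
    print(result, inter_rack_traffic(demand, result))
```

The inter-rack total is the quantity a traffic-aware placement tries to minimize, since it is the load pushed onto the aggregation and core layers.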

How can Big Data be processed, transferred and stored in a friendly way?

Software Defined Data Center

Software Defined Data Center <http://youtu.be/uwb4kmghzaa>

Virtual Networks

VN Scheme   Description                                      Encapsulation   Scalability (# of VNs)
VLAN        Bridges VMs, for dedicated management            MAC-in-IP       2^12
VXLAN       Ameliorates scalability for cloud environments   MAC-in-UDP      2^24
NVGRE       Ameliorates scalability for cloud environments   MAC-in-GRE      2^24
Contrail    Uses OpenFlow                                    All             2^12
NSX         Uses OpenFlow                                    All             2^12

Network Virtualization

Scheme      Routing Protocol   Multicast Tree   Encapsulation
TRILL       IS-IS              Single           MAC-in-MAC
SPB         IS-IS              Multiple         MAC-in-MAC
NetLord     SPAIN              Single           MAC-in-(IP+MAC)
OpenFlow    All                Single           All
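The difference between a 12-bit and a 24-bit identifier space, as listed for VLAN versus VXLAN and NVGRE, can be made concrete with a short calculation and a sketch of how a 24-bit VXLAN Network Identifier (VNI) sits in the 8-byte VXLAN header. The header layout follows RFC 7348; the helper names `vxlan_header` and `parse_vni` are chosen for this illustration.

```python
import struct

VLAN_ID_BITS = 12  # 802.1Q VLAN ID field width
VNI_BITS = 24      # VXLAN Network Identifier field width


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a given VNI.

    Layout: 8 flag bits (0x08 = VNI valid), 24 reserved bits,
    24-bit VNI, 8 reserved bits.
    """
    if not 0 <= vni < 2 ** VNI_BITS:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08
    return struct.pack("!II", flags << 24, vni << 8)


def parse_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & 0x08:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8


if __name__ == "__main__":
    print("VLAN IDs:", 2 ** VLAN_ID_BITS)  # 4096
    print("VXLAN VNIs:", 2 ** VNI_BITS)    # 16777216
    print(vxlan_header(5000).hex(), parse_vni(vxlan_header(5000)))
```

Moving from 12 to 24 bits raises the number of distinct virtual networks from about four thousand to about sixteen million, which is why the overlay schemes are described as easing scalability for multi-tenant clouds.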

OpenFlow Switching [figure: an OpenFlow Switch, with a Secure Channel in software (sw) and a Flow Table in hardware (hw), connected to a Controller (PC) through the OpenFlow Switch specification]. The Stanford Clean Slate Program http://cleanslate.stanford.edu
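The division of labor between the flow table and the controller can be sketched as follows. This is a toy model, not the OpenFlow wire protocol: the classes, the field name `eth_dst` and the action strings are hypothetical, and the controller policy (a static MAC-to-port map) is an assumption made for the example.

```python
class FlowTable:
    """Priority-ordered list of (match, actions) entries."""

    def __init__(self):
        self.entries = []  # tuples of (priority, match dict, actions)

    def add(self, priority, match, actions):
        self.entries.append((priority, match, actions))
        self.entries.sort(key=lambda entry: -entry[0])

    def lookup(self, packet):
        for _priority, match, actions in self.entries:
            if all(packet.get(f) == v for f, v in match.items()):
                return actions
        return None  # table miss


class Controller:
    """Toy controller that knows which port each MAC address is on."""

    def __init__(self, mac_to_port):
        self.mac_to_port = mac_to_port
        self.packet_ins = 0

    def packet_in(self, table, packet):
        self.packet_ins += 1
        port = self.mac_to_port.get(packet["eth_dst"])
        if port is None:
            return ["flood"]
        actions = ["output:%d" % port]
        # Install a rule so later packets are handled by the switch alone.
        table.add(10, {"eth_dst": packet["eth_dst"]}, actions)
        return actions


def handle(table, controller, packet):
    """Switch fast path, with the controller consulted only on a miss."""
    actions = table.lookup(packet)
    if actions is None:
        actions = controller.packet_in(table, packet)
    return actions


if __name__ == "__main__":
    table, ctl = FlowTable(), Controller({"aa:aa": 1})
    print(handle(table, ctl, {"eth_dst": "aa:aa"}))  # miss, rule installed
    print(handle(table, ctl, {"eth_dst": "aa:aa"}))  # hit in the flow table
    print("controller consulted", ctl.packet_ins, "time(s)")
```

Only the first packet of a flow reaches the controller; the installed rule serves the rest, which is also why flow set-up delay and flow-table growth matter for scalability.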

ElasticTree [Brandon Heller, NSDI 2010]

SDN functional architecture

OpenDaylight Platform http://www.opendaylight.org/project/technical-overview

SDN and Hadoop P. Qin, B. Dai, B. Huang and G. Xu, Bandwidth-Aware Scheduling with SDN in Hadoop: A New Trend for Big Data, in Proc. of INFOCOM 2014

Multipath solution for fast traffic rerouting. Scalability: current support of 10^6 flows for data centers. Delay in flow set-up and proliferation of flow table entries can limit scalability. How to place a given number of controllers in a certain physical network such that predefined objectives are achieved?
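The controller placement question can be made concrete with a brute-force sketch. The slide leaves the objective open; here it is assumed to be the average latency from each node to its nearest controller, and the exhaustive search is only practical for small networks. The function name and the latency matrix are hypothetical.

```python
from itertools import combinations


def place_controllers(latency, k):
    """Try every set of k sites and keep the one with the lowest
    average node-to-nearest-controller latency.

    latency: dict node -> dict node -> latency (assumed symmetric)
    Returns (tuple of chosen sites, average latency).
    """
    nodes = sorted(latency)
    if not 1 <= k <= len(nodes):
        raise ValueError("k must be between 1 and the number of nodes")
    best = None
    for sites in combinations(nodes, k):
        average = sum(min(latency[n][s] for s in sites)
                      for n in nodes) / len(nodes)
        if best is None or average < best[1]:
            best = (sites, average)
    return best


if __name__ == "__main__":
    # Three nodes on a line: A - B - C, one latency unit per hop.
    line = {
        "A": {"A": 0, "B": 1, "C": 2},
        "B": {"A": 1, "B": 0, "C": 1},
        "C": {"A": 2, "B": 1, "C": 0},
    }
    print(place_controllers(line, 1))
    print(place_controllers(line, 2))
```

Other objectives, such as worst-case latency or controller load, would change only the scoring expression inside the loop.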

Network Programming Languages

Language    Short description
FML         High-level policy description language (e.g. access control)
Frenetic    Avoids race conditions through well-defined high-level programming abstractions
Nettle      Allows programmers to deal with streams instead of events
NetCore     Means for expressing packet-forwarding policies at a high level
Procera     High-level abstractions to describe reactive and temporal behaviors
Pyretic     Specifies network policies at a high level of abstraction, offering transparent composition and topology mapping
Flog        Combines ideas of FML and Frenetic, providing an event-driven and forward-chaining logic programming language
HFT         Enables hierarchical policy description with conflict-resolution operators, well suited for decentralized decision makers
FatTire     Enables hierarchical policy description with conflict-resolution operators, well suited for decentralized decision makers

SDN language support for the automation of the Software Defined Data Center. Abstraction of resources to support application requirements.

Software Defined Storage

Final Remarks: Need for characterization of traffic generated by Big Data applications. Use of communication patterns of Big Data processing to define resource allocation. Distributed processing of Big Data. Integration of SDN functionality into the automation of the SDDC to support the requirements of Big Data applications.