Trevi: Watering down storage hotspots with cool fountain codes. Toby Moncaster University of Cambridge

2 Trevi summary
- Trevi is a cool new approach to data centre storage
- It is based on existing ideas that are known to work well
- It leverages fountain coding to give a number of key advantages:
  - Resilience to loss
  - Data replication for free
  - (Reliable) multicasting of writes
  - Multisourcing of reads
  - Full support for multiple network paths
- Trevi works on commodity hardware and needs no changes to the application or network
- Trevi doesn't need TCP or any other reliable transport

3 Background
- Commodity data centres are here to stay: Amazon, Google, etc.
- This has led to lots of new ideas and innovations:
  - Novel programming paradigms: MapReduce, Hadoop, Ciel
  - New network topologies: FatTree, VL2, CamCube
  - Alternative transport protocols: MPTCP, DCTCP, HULL
- But storage was the poor relation (until recently):
  - GFS/Colossus is a distributed blob store with a central metadata server
  - Flat Datacenter Storage advocates a distributed metadata approach
- Both are liable to TCP-related problems such as incast and unnecessary retransmissions

4 Brief reminder on Fountain Codes
- Fountain codes are a form of reliable multicast [1]
- They are rateless and loss-tolerant
- Based on sparse erasure codes (Luby Transform, Tornado codes)
- 1-7% overhead (depends on statistical distribution)
- Use XOR for encode and decode operations

Figure: encoding and decoding example (+ denotes XOR).
  Encode (data blocks D1..D5 into symbols C1..C7):
    C1 = D1
    C2 = D1 + D2 + D4
    C3 = D2 + D3
    C4 = D3 + D4
    C5 = D4 + D5
    C6 = D1 + D3 + D5
    C7 = D5
  Transmit C1..C7; receive C1, C2, C4, C5, C7.
  Decode:
    C1 → D1
    C7 → D5
    C5 + D5 → D4
    C4 + D4 → D3
    C2 + D1 → C' (= D2 + D4)
    C' + D4 → D2

[1] A digital fountain approach to asynchronous reliable multicast. Byers, Luby et al.
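The decoding steps in this slide's example can be replayed with a small peeling decoder. The sketch below is illustrative only: the symbol recipe is copied from the slide, while the function names, the two-byte block contents and the fixed (non-random) recipe are assumptions made for the example. A real LT code draws each symbol's degree and neighbours from a probability distribution.

```python
from typing import Dict, FrozenSet

# Recipe from the slide: which data blocks are XORed into each coded symbol.
RECIPE: Dict[int, FrozenSet[int]] = {
    1: frozenset({1}),
    2: frozenset({1, 2, 4}),
    3: frozenset({2, 3}),
    4: frozenset({3, 4}),
    5: frozenset({4, 5}),
    6: frozenset({1, 3, 5}),
    7: frozenset({5}),
}


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: Dict[int, bytes]) -> Dict[int, bytes]:
    """Build every coded symbol C_c by XORing the data blocks listed in RECIPE."""
    block_len = len(next(iter(data.values())))
    symbols = {}
    for c, members in RECIPE.items():
        acc = bytes(block_len)  # all-zero block, the identity for XOR
        for d in members:
            acc = xor_bytes(acc, data[d])
        symbols[c] = acc
    return symbols


def peel_decode(received: Dict[int, bytes]) -> Dict[int, bytes]:
    """Recover data blocks by repeatedly resolving symbols with one unknown block."""
    pending = {c: (set(RECIPE[c]), val) for c, val in received.items()}
    decoded: Dict[int, bytes] = {}
    progress = True
    while progress:
        progress = False
        for c, (members, val) in list(pending.items()):
            # XOR out every block that has already been recovered.
            for d in list(members):
                if d in decoded:
                    val = xor_bytes(val, decoded[d])
                    members.discard(d)
            pending[c] = (members, val)
            if len(members) == 1:
                decoded[next(iter(members))] = val
                del pending[c]
                progress = True
            elif not members:
                del pending[c]  # symbol carries no new information
    return decoded


# Hypothetical two-byte data blocks D1..D5.
data = {1: b"aa", 2: b"bb", 3: b"cc", 4: b"dd", 5: b"ee"}
symbols = encode(data)
# Only C1, C2, C4, C5 and C7 arrive, as on the slide.
received = {c: symbols[c] for c in (1, 2, 4, 5, 7)}
decoded = peel_decode(received)
```

With C3 and C6 lost, the five surviving symbols are still enough to recover all five blocks, which is the loss tolerance the slide illustrates.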

5 Strawman Design
- Writing data (multicast):
  - Source encodes data chunks as symbols
  - Symbols are multicast to a subset of storage nodes
  - If symbols are lost, keep sending new ones until the data is recovered
  - Once all storage nodes have all the data, stop transmitting
- Reading data (multisourcing):
  - Client requests data from a set of storage nodes
  - Each node starts creating symbols and transmits them
  - Nodes randomise the coding so all symbols provide new information to the client
  - Client receives data in parallel from all servers
  - Once data is received, tell storage nodes to stop
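The strawman write loop can be sketched as a small simulation. This is a hedged illustration, not Trevi's implementation: it swaps the sparse LT/Tornado codes named in the slides for dense random XOR combinations over GF(2), because their decodability is easy to track as a matrix rank. The function names, the independent per-receiver loss model and the parameters are all assumptions made for the example.

```python
import random
from typing import Dict, List


def add_to_basis(basis: Dict[int, int], mask: int) -> bool:
    """Insert a GF(2) coefficient vector (bitmask) into the receiver's basis.

    basis maps a leading-bit position to the stored vector with that leading bit.
    Returns True if the symbol was innovative (increased the rank).
    """
    while mask:
        lead = mask.bit_length() - 1
        if lead not in basis:
            basis[lead] = mask
            return True
        mask ^= basis[lead]  # clear the leading bit and keep reducing
    return False


def multicast_write(k: int, n_receivers: int, loss: float, rng: random.Random) -> int:
    """Multicast random coded symbols until every receiver can decode k chunks.

    Returns the number of symbols the source had to send.
    """
    bases: List[Dict[int, int]] = [dict() for _ in range(n_receivers)]
    sent = 0
    while any(len(basis) < k for basis in bases):
        mask = rng.randrange(1, 1 << k)  # random non-empty subset of the k chunks
        sent += 1
        for basis in bases:
            # Each receiver drops the packet independently with probability `loss`.
            if len(basis) < k and rng.random() >= loss:
                add_to_basis(basis, mask)
    return sent


sent = multicast_write(k=16, n_receivers=3, loss=0.2, rng=random.Random(1))
```

No receiver ever asks for a specific lost symbol: the source keeps emitting fresh symbols, and each receiver stops caring once its rank reaches k. The symbols sent beyond k are the wasted bandwidth that the next slide raises as an issue.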

6 Strawman Issues
Our strawman has several obvious issues:
- Wasted bandwidth
  - If you transmit until all nodes send stop, you will waste bandwidth
  - All nodes see all traffic during writes (due to multicast)
- Lack of fairness
  - If you transmit at line rate you squeeze out other traffic
  - If you transmit at a lower rate you might still trigger congestion
- Overload at controller
  - If you transmit at line rate, slower storage controllers will be congested
  - Storage controllers may fail under excess load (snowball effect)

7 Detailed design
- Based on Flat Datacenter Storage:
  - Physical storage is divided into blobs, and data is divided into 8 MB tracts
  - Each blob has a Tract Server which controls the location of data tracts
  - Each node has a copy of the Tract Locator Table (TLT), lazily replicated
  - If node g wants tract i in a TLT of length l: use (hash(g) + i) mod l to get the line number
  - The TLT has multiple columns, where each column = 1 replica
  - Includes mechanisms to deal with stale data, node failure, etc.
- Our system adds a multicast address to each TLT entry
- Data is sent as encoded symbols, but stored as actual data chunks
- Use receiver-driven flow control
  - The receiver dictates the rate at which symbols are sent
  - The rate is determined by the minimum of:
    - the rate at which each storage server can store data
    - the rate at which a sender can send data
    - the congestion in the network
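The lookup rule and the rate rule on this slide are both simple enough to sketch. Everything concrete below is an assumption for illustration: SHA-1 as the hash, the four-row table, the server names, the multicast addresses and the rate units are invented, and g is treated as an opaque identifier string as the slide words it.

```python
import hashlib
from typing import Dict, List, Sequence

# Hypothetical Tract Locator Table: one row per line, one replica per column,
# plus the multicast address that Trevi adds to each entry.
TLT: List[Dict[str, object]] = [
    {"replicas": ["s0", "s1", "s2"], "mcast": "239.0.0.1"},
    {"replicas": ["s1", "s2", "s3"], "mcast": "239.0.0.2"},
    {"replicas": ["s2", "s3", "s0"], "mcast": "239.0.0.3"},
    {"replicas": ["s3", "s0", "s1"], "mcast": "239.0.0.4"},
]


def tlt_row(g: str, i: int, tlt_length: int) -> int:
    """Return the TLT line number for tract i: (hash(g) + i) mod l."""
    digest = hashlib.sha1(g.encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")  # stable hash, unlike built-in hash()
    return (h + i) % tlt_length


def sending_rate(storage_rates: Sequence[float], sender_rate: float,
                 network_rate: float) -> float:
    """Receiver-driven rate: the minimum of the three limits listed on the slide."""
    return min(min(storage_rates), sender_rate, network_rate)


row = TLT[tlt_row("example-id", 5, len(TLT))]
group = row["mcast"]          # where a writer would multicast its symbols
rate = sending_rate([400.0, 250.0, 900.0], sender_rate=1000.0, network_rate=600.0)
```

Because the tract number is added after hashing, consecutive tracts land on consecutive table lines, which spreads a large sequential read or write across many rows of the table.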

8 Potential refinements
- Predictive flow control
  - Use a hybrid push-pull model
  - Send enough symbols to ensure reception in the absence of loss
  - Then use a pull approach to fill in gaps
- Priority and scavenging
  - Trevi is ideal for scavenger-style transports
  - In the absence of competing traffic it can transmit at any rate
  - If competing traffic is present, then slow down the sending rate
  - Virtual queues can be used to measure this (especially at the final hop)
- Optimising for slow writes
  - If you are writing data to multiple nodes, one may be much slower
  - This has a big impact on the overall speed of writes. 2 solutions:
    1) Remove the slow node from the multicast group
    2) Ignore the slower node; if it becomes overwhelmed it can unsubscribe
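One possible reading of the hybrid push-pull refinement is sketched below. It is a simplified model under stated assumptions, not the authors' algorithm: the receiver is assumed to decode once it holds ceil(k × (1 + overhead)) symbols (the slides quote 1-7% overhead), losses are independent, and the function names are invented for the example.

```python
import math
import random
from typing import Tuple


def symbols_needed(k: int, overhead: float) -> int:
    """Symbols the receiver must hold to decode k chunks (simplifying assumption)."""
    return math.ceil(k * (1 + overhead))


def hybrid_transfer(k: int, overhead: float, loss: float,
                    rng: random.Random) -> Tuple[int, int]:
    """Push enough symbols for the loss-free case, then pull to fill the gaps.

    Returns (total symbols sent, number of pull rounds).
    """
    needed = symbols_needed(k, overhead)
    # Push phase: the sender transmits `needed` symbols without being asked.
    sent = needed
    received = sum(rng.random() >= loss for _ in range(needed))
    pull_rounds = 0
    # Pull phase: the receiver requests exactly as many symbols as it still lacks.
    while received < needed:
        request = needed - received
        pull_rounds += 1
        sent += request
        received += sum(rng.random() >= loss for _ in range(request))
    return sent, pull_rounds


sent, rounds = hybrid_transfer(k=1000, overhead=0.05, loss=0.01, rng=random.Random(3))
```

With no loss the transfer finishes in the push phase with zero pull rounds, so the common case costs no extra round trips. The pull requests carry only a count, since any fresh symbol is as useful as any other.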

9 Trevi summary (revisited)
Trevi uses fountain codes to provide scalable data centre storage
- Resilience to loss: no retransmissions, no timeouts
- Data replication for free: multicast is built in, so multiple copies of each blob are always stored
- (Reliable) multicasting of writes: makes for simple management of replication groups (just subscribe/unsubscribe)
- Multisourcing of reads: each node generates a different set of symbols, so all symbols provide new information
- Full support for multiple network paths: with careful planning, multicast can make full use of available paths
- Trevi works on commodity hardware, although hardware offload might speed up the XOR operations (e.g. NetFPGA)
- It needs no changes to the application: built on top of UDP, it uses a simple shim layer between the app and the network

10 Questions?


More information

TCP for Wireless Networks

TCP for Wireless Networks TCP for Wireless Networks Outline Motivation TCP mechanisms Indirect TCP Snooping TCP Mobile TCP Fast retransmit/recovery Transmission freezing Selective retransmission Transaction oriented TCP Adapted

More information

Paolo Costa costa@imperial.ac.uk

Paolo Costa costa@imperial.ac.uk joint work with Ant Rowstron, Austin Donnelly, and Greg O Shea (MSR Cambridge) Hussam Abu-Libdeh, Simon Schubert (Interns) Paolo Costa costa@imperial.ac.uk Paolo Costa CamCube - Rethinking the Data Center

More information

Enabling Multi-pipeline Data Transfer in HDFS for Big Data Applications

Enabling Multi-pipeline Data Transfer in HDFS for Big Data Applications Enabling Multi-pipeline Data Transfer in HDFS for Big Data Applications Liqiang (Eric) Wang, Hong Zhang University of Wyoming Hai Huang IBM T.J. Watson Research Center Background Hadoop: Apache Hadoop

More information

Background. Personal cloud services are gaining popularity

Background. Personal cloud services are gaining popularity Background Personal cloud services are gaining popularity Many providers enter the market. (e.g. Dropbox, Google, Microso

More information

Scalus A)ribute Workshop. Paris, April 14th 15th

Scalus A)ribute Workshop. Paris, April 14th 15th Scalus A)ribute Workshop Paris, April 14th 15th Content Mo=va=on, objec=ves, and constraints Scalus strategy Scenario and architectural views How the architecture works Mo=va=on for this MCITN Storage

More information

Enterprise QoS. Tim Chung Google Corporate Netops Architecture Nanog 49 June 15th, 2010

Enterprise QoS. Tim Chung Google Corporate Netops Architecture Nanog 49 June 15th, 2010 Enterprise QoS Tim Chung Google Corporate Netops Architecture Nanog 49 June 15th, 2010 Agenda Challenges Solu5ons Opera5ons Best Prac5ces Note: This talk pertains to Google enterprise network only, not

More information

Object Storage: Out of the Shadows and into the Spotlight

Object Storage: Out of the Shadows and into the Spotlight Technology Insight Paper Object Storage: Out of the Shadows and into the Spotlight By John Webster December 12, 2012 Enabling you to make the best technology decisions Object Storage: Out of the Shadows

More information

Computer Networks COSC 6377

Computer Networks COSC 6377 Computer Networks COSC 6377 Lecture 25 Fall 2011 November 30, 2011 1 Announcements Grades will be sent to each student for verificagon P2 deadline extended 2 Large- scale computagon Search Engine Tasks

More information

Neil Stobart Cloudian Inc. CLOUDIAN HYPERSTORE Smart Data Storage

Neil Stobart Cloudian Inc. CLOUDIAN HYPERSTORE Smart Data Storage Neil Stobart Cloudian Inc. CLOUDIAN HYPERSTORE Smart Data Storage Storage is changing forever Scale Up / Terabytes Flash host/array Tradi/onal SAN/NAS Scalability / Big Data Object Storage Scale Out /

More information

Distributed Systems. 23. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2015

Distributed Systems. 23. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2015 Distributed Systems 23. Content Delivery Networks (CDN) Paul Krzyzanowski Rutgers University Fall 2015 November 17, 2015 2014-2015 Paul Krzyzanowski 1 Motivation Serving web content from one location presents

More information

Multi-Datacenter Replication

Multi-Datacenter Replication www.basho.com Multi-Datacenter Replication A Technical Overview & Use Cases Table of Contents Table of Contents... 1 Introduction... 1 How It Works... 1 Default Mode...1 Advanced Mode...2 Architectural

More information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information Introduction Computer Network. Interconnected collection of autonomous computers that are able to exchange information No master/slave relationship between the computers in the network Data Communications.

More information

QoS issues in Voice over IP

QoS issues in Voice over IP COMP9333 Advance Computer Networks Mini Conference QoS issues in Voice over IP Student ID: 3058224 Student ID: 3043237 Student ID: 3036281 Student ID: 3025715 QoS issues in Voice over IP Abstract: This

More information

Advanced Computer Networks. Datacenter Network Fabric

Advanced Computer Networks. Datacenter Network Fabric Advanced Computer Networks 263 3501 00 Datacenter Network Fabric Patrick Stuedi Spring Semester 2014 Oriana Riva, Department of Computer Science ETH Zürich 1 Outline Last week Today Supercomputer networking

More information

Quantum StorNext. Product Brief: Distributed LAN Client

Quantum StorNext. Product Brief: Distributed LAN Client Quantum StorNext Product Brief: Distributed LAN Client NOTICE This product brief may contain proprietary information protected by copyright. Information in this product brief is subject to change without

More information

Hadoop Distributed File System (HDFS) Overview

Hadoop Distributed File System (HDFS) Overview 2012 coreservlets.com and Dima May Hadoop Distributed File System (HDFS) Overview Originals of slides and source code for examples: http://www.coreservlets.com/hadoop-tutorial/ Also see the customized

More information

Big Data: A Storage Systems Perspective Muthukumar Murugan Ph.D. HP Storage Division

Big Data: A Storage Systems Perspective Muthukumar Murugan Ph.D. HP Storage Division Big Data: A Storage Systems Perspective Muthukumar Murugan Ph.D. HP Storage Division In this talk Big data storage: Current trends Issues with current storage options Evolution of storage to support big

More information

How To Create A P2P Network

How To Create A P2P Network Peer-to-peer systems INF 5040 autumn 2007 lecturer: Roman Vitenberg INF5040, Frank Eliassen & Roman Vitenberg 1 Motivation for peer-to-peer Inherent restrictions of the standard client/server model Centralised

More information

Network edge and network core. millions of connected compu?ng devices: hosts = end systems running network apps

Network edge and network core. millions of connected compu?ng devices: hosts = end systems running network apps Computer Networks 1-1 What s the Internet: nuts and bolts view PC server wireless laptop cellular handheld access points wired links millions of connected compu?ng devices: hosts = end systems running

More information

Windows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft

Windows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft Windows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft Agenda: Windows Azure Storage Overview Architecture Key Design Points 2 Overview Windows Azure Storage Cloud Storage - Anywhere and

More information

NextGen Infrastructure for Big DATA Analytics.

NextGen Infrastructure for Big DATA Analytics. NextGen Infrastructure for Big DATA Analytics. So What is Big Data? Data that exceeds the processing capacity of conven4onal database systems. The data is too big, moves too fast, or doesn t fit the structures

More information