Building Internet-Scale Distributed Systems for Fun and Profit
1 Building Internet-Scale Distributed Systems for Fun and Profit
Peter R. Pietzuch
Large-Scale Distributed Systems Group, Distributed Software Engineering (DSE) Section, Department of Computing, Imperial College London
Oxford University Computer Laboratory, Oxford, June 2009
2 Internet-Scale Distributed Systems
- Search engines (e.g. Google, Yahoo, ...): global crawling, indexing and search
  Google: over 450,000 servers in at least 30 data centres world-wide (?)
- Content delivery networks (CDNs) (e.g. Akamai, Limelight, ...): scalable web hosting, file distribution, media streaming, ...
  Akamai: hosting for Microsoft.com, CNN.com, BBC iPlayer, ...
- Social networking sites (e.g. Facebook, Twitter, LinkedIn, ...)
  Facebook: serves 200 million users and stores 40 billion photos
- Cloud computing applications (e.g. Amazon, Microsoft, Google, ...): pay-as-you-use storage and computation for applications
  Amazon: bought servers worth $86 million in 2008 alone
3 Internet-Scale Distributed Systems
- Peer-to-peer computing (e.g. BitTorrent, BOINC, ...): users contribute resources for file sharing, scientific computing
  BitTorrent: 1/3 of all Internet traffic (?) [CacheLogic]
- Large-scale test-beds (e.g. PlanetLab, Emulab, ...): possible to deploy research systems in the real world
  PlanetLab: 1041 nodes at 500 sites (May 09)
  Great for student projects!
4 Properties of Internet-Scale Systems
- Large number of users, requests, resources, ...
  Single/multiple data centres, hosts and/or mobile clients
  Requirement: Scalability
- Wide-area Internet communication
  Cannot ignore network effects
  Requirement: Network-awareness
- Long-running, 24/7 service
  Must adapt to changing conditions and failure
  Requirement: Fault-tolerance
5 Why is Building Internet-Scale Systems Hard?
- Scalability is hard to achieve
  How to organise 1000s of processing hosts? What is the programming model?
- Applications must be intelligent about network use
  How can we achieve application requirements?
- Continuous network and node failures
  Lead to data loss, resource shortages, inconsistency
  PlanetLab: 630 healthy machines out of 1041 total (May 09)
  Google: 1 failure per hour in 10,000-node clusters (source: Google)
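The failure figures on this slide can be turned into rough per-node numbers. The sketch below assumes failures are independent and spread evenly over nodes, which the slide does not state; the function names are illustrative only.

```python
HOURS_PER_DAY = 24


def per_node_mtbf_hours(cluster_size, cluster_failures_per_hour):
    """Mean time between failures of one node, assuming failures are
    independent and evenly spread over the cluster (an assumption)."""
    return cluster_size / cluster_failures_per_hour


def healthy_fraction(healthy, total):
    """Fraction of machines that are currently usable."""
    return healthy / total


# Google figure from the slide: 1 failure per hour in a 10,000-node cluster.
mtbf_hours = per_node_mtbf_hours(10_000, 1.0)
print(f"per-node MTBF: {mtbf_hours:.0f} h (about {mtbf_hours / HOURS_PER_DAY:.0f} days)")

# PlanetLab figure from the slide: 630 healthy machines out of 1041.
print(f"PlanetLab healthy fraction: {healthy_fraction(630, 1041):.1%}")
```

Under these assumptions a single node fails roughly once every 417 days, yet the cluster as a whole sees a failure every hour, which is why failure handling has to be continuous.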
6 High-level Abstractions Help
- Google uses several layers of abstraction
- Runs applications (search, mail, ...) on top of highest layer
- Each layer is scalable, network-aware and fault-tolerant
[Figure: Google software stack with Google Apps on top of MapReduce computation, BigTable storage system, Chubby lock service and Google File System]
7 Large-Scale Distributed Systems Group
- Research goal: Support the design and engineering of scalable and robust Internet-wide applications
- Need to provide higher-level abstractions at different layers
- Many success stories from research exist, e.g. overlay networks, distributed hash tables, network coordinates, storage and replication mechanisms, ...
- Combination of networks, distributed systems & database research
[Figure: three layers: data management layer, application layer, network layer]
8 Talk Structure
III. Data management layer: Supporting imperfect data processing
  DISSP Project: Dependable Internet-scale stream processing
II. Application layer: Building adaptive overlay networks
  LANC CDN Project: Network/load-aware content delivery
I. Network layer: Improving Internet routing
  Ukairo Project: Detour routing for applications
9 I. Improving Internet Routing
- Internet-scale applications want custom communication paths
  Skype wants path with low packet loss
  iTunes wants path with high download rate
- Internet uses two-level hierarchical routing scheme
  Internet hosts part of autonomous systems (ASs)
  Inter-AS routing (BGP) and intra-AS routing (OSPF)
- Internet routing optimises for ISPs' concerns!
  One path for all applications and no control over returned path
[Figure: hosts a and b connected through autonomous systems AS 1 to AS 6]
10 Taking Detours on the Internet
- Idea: Take multiple Internet paths and stitch them together
- Resulting detour path may have better properties
- What causes Internet detour paths?
  Inter-AS routing not optimal + limited expressiveness
[Figure: direct path versus detour path between hosts a and b over AS 1 to AS 6, with a further node d]
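The stitching idea can be sketched as a search for a one-hop relay. This is a minimal illustration, not the Ukairo implementation: it assumes latency is the property of interest, that the latency of a stitched path is the sum of its two legs, and that pairwise measurements are already available. The `latency` table and the function name are hypothetical.

```python
def best_detour(latency, src, dst):
    """Return (relay, cost) for the best one-hop detour from src to dst.

    latency[x][y] is a measured latency from x to y. If no relay beats
    the direct path, return (None, direct_cost).
    """
    direct = latency[src][dst]
    best_relay, best_cost = None, direct
    for relay, from_relay in latency.items():
        if relay in (src, dst):
            continue
        first_leg = latency[src].get(relay)
        second_leg = from_relay.get(dst)
        if first_leg is None or second_leg is None:
            continue  # no measurement for one of the two legs
        cost = first_leg + second_leg  # stitched path: src -> relay -> dst
        if cost < best_cost:
            best_relay, best_cost = relay, cost
    return best_relay, best_cost


# Made-up measurements in milliseconds.
latency = {
    "a": {"b": 120, "d": 40},
    "b": {"a": 120, "d": 50},
    "d": {"a": 40, "b": 50},
}
print(best_detour(latency, "a", "b"))
```

For a bandwidth-oriented application the cost of a stitched path would instead be the minimum of the two legs, and the search would maximise rather than minimise.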
11 Finding Detours in the AS Graph [IPTPS 09]
- Idea: Analyse detours in the Internet AS graph
- Assume that similar AS-level paths benefit from similar detours
- Perform clustering on similarity metric: shared link count
[Figure: AS-level paths between hosts a to e over AS 1 to AS 7 that share an AS link; one path has a known good detour, the other a potential good detour]
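The shared link count can be sketched as follows. This is an illustration under stated assumptions: AS paths are lists of AS numbers, links are directed pairs of consecutive ASs, and a simple best-match lookup stands in for the clustering step, whose details the slide does not give. All names and AS numbers are made up.

```python
def as_links(path):
    """Set of directed AS-level links (consecutive AS pairs) on a path."""
    return set(zip(path, path[1:]))


def shared_link_count(path_a, path_b):
    """Similarity metric: number of AS links the two paths have in common."""
    return len(as_links(path_a) & as_links(path_b))


def suggest_detour(new_path, known_detours, min_shared=1):
    """Propose the detour of the most similar known path, or None.

    known_detours is a list of (as_path, detour_as) pairs for paths that
    are already known to benefit from a detour via detour_as.
    """
    best, best_shared = None, 0
    for as_path, detour_as in known_detours:
        shared = shared_link_count(new_path, as_path)
        if shared >= min_shared and shared > best_shared:
            best, best_shared = detour_as, shared
    return best


known = [([1, 2, 3, 4], 6), ([8, 9, 10], 11)]
print(shared_link_count([1, 2, 3, 4], [5, 2, 3, 7]))
print(suggest_detour([5, 2, 3, 7], known))
```

The second path shares the link between AS 2 and AS 3 with the first, so the detour known to help the first path becomes a candidate for the second.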
12 Ukairo Project: Detour Routing for Applications
- Deploying general-purpose detour routing plane on PlanetLab
  Continuously searches for Internet detour paths
  Nodes exchange discovered detours using gossiping
  Applications can use it transparently, e.g. web browser download
- Open research questions
  What is the overhead of finding detour paths?
  What happens if everybody uses detour routing?
  What do ISPs think about this?
  What are the lessons for future Internet designs?
13 Talk Structure
III. Data management layer: Supporting imperfect data processing
  DISSP Project: Dependable Internet-scale stream processing
II. Application layer: Building adaptive overlay networks
  LANC CDN Project: Network/load-aware delivery of content
I. Network layer: Improving Internet routing
  Ukairo Project: Detour routing for applications
14 II. Building Adaptive Overlay Networks
- Imagine your start-up idea of mugbook becomes an overnight success...
- How do you support such a website?
  Single web server?
  Multiple web servers in single data centre?
15 Content Delivery Networks
- Content delivery networks (CDNs) serve content to many clients world-wide
- Overlay network consists of:
  Distributed set of servers that maintain content replicas
  Clients (web browsers) that request content
16 Mapping Clients to Content Servers
- How do we assign clients to content servers?
  Load awareness: Don't direct clients to overloaded content servers
  Network awareness: Don't send traffic on congested network paths
- Many heuristics proposed in the past
  Geographic location
  Clustering of address prefixes
  Proprietary solutions
17 Cost Graph
- Associate each client/server pair with cost
- Use download times from servers as cost metric
  Incorporates load and network congestion
- But: measurement overhead remains high
  Can't measure all costs: need to estimate missing ones
18 Network Coordinates
- Idea: Assume cost graph embeddable in metric space
  Approximate missing measurements using Euclidean distances
- Assign each client/server a network coordinate C
  Distances between coordinates estimate download costs:
  $\lVert C(\mathrm{Client1}) - C(\mathrm{Server1}) \rVert = \mathrm{download\_time}$
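Estimating a missing cost from coordinates can be sketched in a few lines. The coordinates and names below are made up for illustration; only the Euclidean distance as the estimator comes from the slide.

```python
import math


def estimated_cost(coord_a, coord_b):
    """Estimate the cost between two nodes as the Euclidean distance
    between their network coordinates."""
    return math.dist(coord_a, coord_b)


# Hypothetical 2-dimensional coordinates.
coords = {
    "Client1": (0.0, 0.0),
    "Server1": (3.0, 4.0),
    "Server2": (6.0, 8.0),
}

# No direct measurement between Client1 and either server is needed.
print(estimated_cost(coords["Client1"], coords["Server1"]))
print(estimated_cost(coords["Client1"], coords["Server2"]))
```

With n nodes this replaces on the order of n squared pairwise measurements with one coordinate per node.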
19 Computing Network Coordinates
- Scalable, decentralised computation (e.g. using Vivaldi algorithm) [Dabek 04]
- 2-5 dimensions sufficient in practice
- Low measurement overhead
- Continuous process
[Figure: ~1500 web servers with network delay as cost]
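One coordinate update in the spirit of Vivaldi can be sketched as below. This is a simplified variant with a fixed step size `delta`; it omits the adaptive timestep and error weighting of the full algorithm, and the parameter values are assumptions chosen for illustration.

```python
import math
import random


def vivaldi_update(xi, xj, measured, delta=0.25):
    """Move node i's coordinate xi after one cost sample to node j.

    The node moves along the line through xj: away from xj if the
    coordinates underestimate the measured cost, towards it otherwise.
    """
    actual = math.dist(xi, xj)
    if actual == 0.0:
        # Coincident coordinates: pick a random direction to separate them.
        direction = [random.uniform(-1.0, 1.0) for _ in xi]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    else:
        direction = [a - b for a, b in zip(xi, xj)]
        norm = actual
    unit = [d / norm for d in direction]
    error = measured - actual  # positive: coordinates are too close
    return [x + delta * error * u for x, u in zip(xi, unit)]


xi, xj = [0.0, 0.0], [3.0, 4.0]
new_xi = vivaldi_update(xi, xj, measured=10.0, delta=0.5)
print(new_xi, math.dist(new_xi, xj))
```

Each node repeats this step for every new sample, which is why the slide describes the computation as a continuous, decentralised process.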
20 LANC Content-Delivery Network [ROADS 08]
- Use network coordinates to organise content servers and clients
- Clients keep track of content servers in neighbourhood
- Map clients to nearest content servers in space
- Overloaded content servers move away
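The two rules above, nearest-server mapping and overloaded servers moving away, can be sketched as follows. The slide does not say how a server chooses its direction or distance, so this sketch assumes it moves a fixed step away from the centroid of its current clients; `move_away` and its `step` parameter are hypothetical.

```python
import math


def nearest_server(client_coord, servers):
    """Name of the server whose coordinate is closest to the client."""
    return min(servers, key=lambda name: math.dist(client_coord, servers[name]))


def move_away(server_coord, client_coords, step=1.0):
    """New coordinate for an overloaded server: shifted by `step` away
    from the centroid of its clients (an assumed policy)."""
    centroid = [sum(axis) / len(client_coords) for axis in zip(*client_coords)]
    norm = math.dist(server_coord, centroid)
    if norm == 0.0:
        # Server sits on the centroid: push it along the first axis.
        return [server_coord[0] + step, *server_coord[1:]]
    return [s + step * (s - c) / norm for s, c in zip(server_coord, centroid)]


servers = {"s1": [1.0, 0.0], "s2": [4.0, 0.0]}
client = [0.0, 0.0]
before = nearest_server(client, servers)

# s1 becomes overloaded and moves away from its client.
servers["s1"] = move_away(servers["s1"], [client], step=5.0)
after = nearest_server(client, servers)
print(before, "->", after)
```

Because load is expressed as a change in position, the same nearest-neighbour lookup handles both network awareness and load awareness.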
21 Does it really work? (Yes!)
- Deployed LANC CDN on PlanetLab
  119 content servers and 16 clients
  Downloaded Linux distribution from 100 web servers world-wide
- Tried several different assignment strategies
[Figure: CDF of transfer data rate per request (KB/s) for the LANC CDN, Nearest, Random and Direct strategies]
22 Talk Structure
III. Data management layer: Supporting imperfect data processing
  DISSP Project: Dependable Internet-scale stream processing
II. Application layer: Building adaptive overlay networks
  LANC CDN Project: Network/load-aware delivery of content
I. Network layer: Improving Internet routing
  Ukairo Project: Detour routing for applications
23 III. Supporting Imperfect Data Processing
Global sensing infrastructures
- Runs continuous queries over sensor streams
- Failure takes out resources
[Figure: users and applications on top of a layer for data collection, fusion, aggregation & dissemination, fed by mobile sensing devices, traffic monitors, scientific instruments, RFID tags, cameras, body sensor networks, webfeeds, embedded sensors, wireless sensor networks and web content]
24 Stream Data Model
- Data sources emit streams of data tuples
- Tuples contain schema with fields, e.g. (ts, coord, image)
- Users submit declarative queries
- Range of operators (filter, join, transform, ...) process data tuples
[Figure: two coordinate transform operators feeding an image merging operator]
25 Failure Recovery in Stream Processing
- Use redundant resources to achieve dependability
  Run multiple copies of same query operator
- But: Internet-scale system may not have enough spare resources
  Instead accept degradation in processing quality
- Idea: Enhance stream data model to include quality information
[Figure: replicated image merging and coordinate transform operators]
26 Quality-Centric Stream Data Model
Enhance data tuples with:
- Weight: number of data sources in tuples
- Recall: fraction of received tuples
[Figure: example tuples D1 to D8, each annotated with data, weight and recall]
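A tuple carrying weight and recall can be sketched as a small data type. The merge rule below is an assumption made for illustration, not the DISSP design: a merged tuple's weight is the sum of its inputs' weights, and its recall is that weight divided by the number of sources the query expected. It also assumes the inputs come straight from the sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTuple:
    data: tuple      # payload values carried by the tuple
    weight: int      # number of data sources that contributed
    recall: float    # fraction of expected input that was received


def source_tuple(value):
    """Tuple emitted directly by one data source."""
    return QTuple(data=(value,), weight=1, recall=1.0)


def merge(received, expected_sources):
    """Merge tuples that arrived; lost tuples simply lower the recall."""
    weight = sum(t.weight for t in received)
    data = tuple(v for t in received for v in t.data)
    return QTuple(data=data, weight=weight, recall=weight / expected_sources)


# Six sources D1..D6 are expected, but D5 and D6 were lost to a failure.
arrived = [source_tuple(name) for name in ("D1", "D2", "D3", "D4")]
result = merge(arrived, expected_sources=6)
print(result)
```

The result tuple still flows to the user, but its recall of about 0.67 makes the effect of the failure explicit.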
27 What is it Good for?
- Provide feedback about result quality to users
  Measure of how much data made it into the result tuple
- Allow system to handle node and network failures
  1. Proactive operator replication: invest resources where failure impact highest
  2. Reactive failure recovery: decide based on lost recall if recovery worthwhile
- Support for smart load-shedding under resource shortage
  Discard tuples with lowest impact on overloaded processing nodes
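Load-shedding by impact can be sketched as keeping the heaviest tuples. This assumes that a tuple's weight is a usable proxy for its impact on the result, which the slide suggests but does not define precisely; `shed_load` and `capacity` are illustrative names.

```python
def shed_load(tuples, capacity):
    """Keep at most `capacity` tuples, discarding the lowest-weight ones.

    Each tuple is a (payload, weight) pair. Surviving tuples keep their
    arrival order; ties are broken in favour of earlier arrivals.
    """
    if capacity >= len(tuples):
        return list(tuples)
    by_weight = sorted(range(len(tuples)),
                       key=lambda i: tuples[i][1],
                       reverse=True)  # stable sort: ties stay in arrival order
    keep = sorted(by_weight[:capacity])
    return [tuples[i] for i in keep]


queue = [("a", 1), ("b", 3), ("c", 1), ("d", 2)]
print(shed_load(queue, capacity=2))
```

Random shedding would drop the same number of tuples but could lose the one built from three sources; weight-aware shedding loses the least information for the same saving.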
28 DISSP Project: Dependable Internet-Scale Stream Processing
- Currently building prototype system
  Anybody will be able to connect sensor sources + run queries
  System provides best-effort service given available resources
- Open questions
  What's the right data model for processing sensor data?
  How to discover data sources in a scalable fashion?
  How to perform query optimisation at a global scale?
[Figure: users and applications on top of a layer for data collection, fusion, aggregation & dissemination, fed by mobile sensing devices, scientific instruments, traffic monitors, RFID tags, cameras, body sensor networks, webfeeds, embedded sensors, wireless sensor networks and web content]
29 Research Outlook
- Programming model
  What are the right abstractions for building Internet-scale systems?
  Need richer Internet interface, not just send(packet,dest_ip)
  How do we build robust cloud applications? Currently too much focus on low-level services
- System management
  How do we provision Internet-scale systems?
  Scale up/down for sudden rise in popularity (flash crowds)
- Testing and evaluation
  How do we test, debug and evaluate Internet-scale systems?
  Hard to obtain reproducible results from PlanetLab experiments
30 Conclusions
- Internet-scale apps have new network requirements
  One size doesn't fit all, but it's hard to change the Internet
  Ukairo: Overlay networks can provide custom routing
- Internet-scale systems need new overlay abstractions
  Apply geometric algorithms to solve distributed systems problems
  LANC CDN: Metric space for node organisation in CDN
- Internet-scale systems require new data models
  Unrealistic to expect perfect processing; instead accept failure and overload as a fact of life
  DISSP: Make impact of failure on processing explicit
Thank You! Any Questions?
Peter Pietzuch <prp@doc.ic.ac.uk>
From Internet Data Centers to Data Centers in the Cloud
From Internet Data Centers to Data Centers in the Cloud This case study is a short extract from a keynote address given to the Doctoral Symposium at Middleware 2009 by Lucy Cherkasova of HP Research Labs
More informationDATA COMMUNICATOIN NETWORKING
DATA COMMUNICATOIN NETWORKING Instructor: Ouldooz Baghban Karimi Course Book: Computer Networking, A Top-Down Approach, Kurose, Ross Slides: - Course book Slides - Slides from Princeton University COS461
More informationIndirection. science can be solved by adding another level of indirection" -- Butler Lampson. "Every problem in computer
Indirection Indirection: rather than reference an entity directly, reference it ( indirectly ) via another entity, which in turn can or will access the original entity A x B "Every problem in computer
More informationWeb Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)
1 1 Distributed Systems What are distributed systems? How would you characterize them? Components of the system are located at networked computers Cooperate to provide some service No shared memory Communication
More informationWeb Caching and CDNs. Aditya Akella
Web Caching and CDNs Aditya Akella 1 Where can bottlenecks occur? First mile: client to its ISPs Last mile: server to its ISP Server: compute/memory limitations ISP interconnections/peerings: congestion
More informationMeasuring the Web: Part I - - Content Delivery Networks. Prof. Anja Feldmann, Ph.D. Dr. Ramin Khalili Georgios Smaragdakis, PhD
Measuring the Web: Part I - - Content Delivery Networks Prof. Anja Feldmann, Ph.D. Dr. Ramin Khalili Georgios Smaragdakis, PhD Acknowledgement Material presented in these slides is borrowed from presentajons
More informationData Center Content Delivery Network
BM 465E Distributed Systems Lecture 4 Networking (cont.) Mehmet Demirci Today Overlay networks Data centers Content delivery networks Overlay Network A virtual network built on top of another network Overlay
More informationDistributed Systems. 23. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2015
Distributed Systems 23. Content Delivery Networks (CDN) Paul Krzyzanowski Rutgers University Fall 2015 November 17, 2015 2014-2015 Paul Krzyzanowski 1 Motivation Serving web content from one location presents
More informationPeer-to-Peer Networks
Peer-to-Peer Networks Chapter 1: Introduction Jussi Kangasharju Chapter Outline Course outline and practical matters Peer-to-peer (P2P) overview Definition of P2P What is P2P and how it is different from
More informationScalable Internet/Scalable Storage. Seif Haridi KTH/SICS
Scalable Internet/Scalable Storage Seif Haridi KTH/SICS Interdisk: The Big Idea 2 Interdisk: The Big Idea I: 3 Interdisk: The Big Idea I: Internet is global data communication 4 Interdisk: The Big Idea
More informationDistributed Systems. 25. Content Delivery Networks (CDN) 2014 Paul Krzyzanowski. Rutgers University. Fall 2014
Distributed Systems 25. Content Delivery Networks (CDN) Paul Krzyzanowski Rutgers University Fall 2014 November 16, 2014 2014 Paul Krzyzanowski 1 Motivation Serving web content from one location presents
More informationTHEMIS: Fairness in Data Stream Processing under Overload
THEMIS: Fairness in Data Stream Processing under Overload Evangelia Kalyvianaki City University London, UK Marco Fiscato Imperial College London, UK Theodoros Salonidis IBM Research, USA Peter R. Pietzuch
More informationHadoop. http://hadoop.apache.org/ Sunday, November 25, 12
Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using
More informationHow To Understand The Power Of A Content Delivery Network (Cdn)
Overview 5-44 5-44 Computer Networking 5-64 Lecture 8: Delivering Content Content Delivery Networks Peter Steenkiste Fall 04 www.cs.cmu.edu/~prs/5-44-f4 Web Consistent hashing Peer-to-peer CDN Motivation
More informationThe Effect of Caches for Mobile Broadband Internet Access
The Effect of s for Mobile Jochen Eisl, Nokia Siemens Networks, Munich, Germany Haßlinger, Deutsche Telekom Technik,, Darmstadt, Germany IP-based content delivery: CDN & cache architecture Impact of access
More informationTesting & Assuring Mobile End User Experience Before Production. Neotys
Testing & Assuring Mobile End User Experience Before Production Neotys Agenda Introduction The challenges Best practices NeoLoad mobile capabilities Mobile devices are used more and more At Home In 2014,
More informationTraffic delivery evolution in the Internet ENOG 4 Moscow 23 rd October 2012
Traffic delivery evolution in the Internet ENOG 4 Moscow 23 rd October 2012 January 29th, 2008 Christian Kaufmann Director Network Architecture Akamai Technologies, Inc. way-back machine Web 1998 way-back
More informationDistributed Systems 19. Content Delivery Networks (CDN) Paul Krzyzanowski pxk@cs.rutgers.edu
Distributed Systems 19. Content Delivery Networks (CDN) Paul Krzyzanowski pxk@cs.rutgers.edu 1 Motivation Serving web content from one location presents problems Scalability Reliability Performance Flash
More informationA Topology-Aware Relay Lookup Scheme for P2P VoIP System
Int. J. Communications, Network and System Sciences, 2010, 3, 119-125 doi:10.4236/ijcns.2010.32018 Published Online February 2010 (http://www.scirp.org/journal/ijcns/). A Topology-Aware Relay Lookup Scheme
More informationContent Distribu-on Networks (CDNs)
Content Distribu-on Networks (CDNs) Jennifer Rexford COS 461: Computer Networks Lectures: MW 10-10:0am in Architecture N101 hjp://www.cs.princeton.edu/courses/archive/spr12/cos461/ Second Half of the Course
More informationA programming model in Cloud: MapReduce
A programming model in Cloud: MapReduce Programming model and implementation developed by Google for processing large data sets Users specify a map function to generate a set of intermediate key/value
More informationHadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh
1 Hadoop: A Framework for Data- Intensive Distributed Computing CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 2 What is Hadoop? Hadoop is a software framework for distributed processing of large datasets
More informationFlash Crowds & Denial of Service Attacks
Flash Crowds & Denial of Service Attacks Characterization and Implications for CDNs and Web sites Jaeyeon Jung MIT Laboratory for Computer Science Balachander Krishnamurthy and Michael Rabinovich AT&T
More informationSoftware Defined Networking What is it, how does it work, and what is it good for?
Software Defined Networking What is it, how does it work, and what is it good for? slides stolen from Jennifer Rexford, Nick McKeown, Michael Schapira, Scott Shenker, Teemu Koponen, Yotam Harchol and David
More informationCommunications Software. CSE 123b. CSE 123b. Spring 2003. Lecture 13: Load Balancing/Content Distribution. Networks (plus some other applications)
CSE 123b CSE 123b Communications Software Spring 2003 Lecture 13: Load Balancing/Content Distribution Networks (plus some other applications) Stefan Savage Some slides courtesy Srini Seshan Today s class
More informationHypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
More informationNetworking in the Hadoop Cluster
Hadoop and other distributed systems are increasingly the solution of choice for next generation data volumes. A high capacity, any to any, easily manageable networking layer is critical for peak Hadoop
More informationChallenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2
More informationDistributed Systems. 24. Content Delivery Networks (CDN) 2013 Paul Krzyzanowski. Rutgers University. Fall 2013
Distributed Systems 24. Content Delivery Networks (CDN) Paul Krzyzanowski Rutgers University Fall 2013 November 27, 2013 2013 Paul Krzyzanowski 1 Motivation Serving web content from one location presents
More informationBENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
More informationNoSQL. Thomas Neumann 1 / 22
NoSQL Thomas Neumann 1 / 22 What are NoSQL databases? hard to say more a theme than a well defined thing Usually some or all of the following: no SQL interface no relational model / no schema no joins,
More informationChapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
More informationCHAPTER 8 CONCLUSION AND FUTURE ENHANCEMENTS
137 CHAPTER 8 CONCLUSION AND FUTURE ENHANCEMENTS 8.1 CONCLUSION In this thesis, efficient schemes have been designed and analyzed to control congestion and distribute the load in the routing process of
More informationDistribution transparency. Degree of transparency. Openness of distributed systems
Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science steen@cs.vu.nl Chapter 01: Version: August 27, 2012 1 / 28 Distributed System: Definition A distributed
More informationStudy of Flexible Contents Delivery System. With Dynamic Server Deployment
Study of Flexible Contents Delivery System With Dynamic Server Deployment Yuko KAMIYA Toshihiko SHIMOKAWA and orihiko YOSHIDA Graduate School of Information Science, Kyushu Sangyo University, JAPA Faculty
More informationPeer-to-Peer Networks. Chapter 6: P2P Content Distribution
Peer-to-Peer Networks Chapter 6: P2P Content Distribution Chapter Outline Content distribution overview Why P2P content distribution? Network coding Peer-to-peer multicast Kangasharju: Peer-to-Peer Networks
More informationExploring Big Data in Social Networks
Exploring Big Data in Social Networks virgilio@dcc.ufmg.br (meira@dcc.ufmg.br) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about
More informationHadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN
Hadoop MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Understanding Hadoop Understanding Hadoop What's Hadoop about? Apache Hadoop project (started 2008) downloadable open-source software library (current
More informationFortiBalancer: Global Server Load Balancing WHITE PAPER
FortiBalancer: Global Server Load Balancing WHITE PAPER FORTINET FortiBalancer: Global Server Load Balancing PAGE 2 Introduction Scalability, high availability and performance are critical to the success
More informationA very short history of networking
A New vision for network architecture David Clark M.I.T. Laboratory for Computer Science September, 2002 V3.0 Abstract This is a proposal for a long-term program in network research, consistent with the
More informationContent Delivery Networks. Shaxun Chen April 21, 2009
Content Delivery Networks Shaxun Chen April 21, 2009 Outline Introduction to CDN An Industry Example: Akamai A Research Example: CDN over Mobile Networks Conclusion Outline Introduction to CDN An Industry
More informationIntroduction: Why do we need computer networks?
Introduction: Why do we need computer networks? Karin A. Hummel - Adapted slides of Prof. B. Plattner, plattner@tik.ee.ethz.ch - Add-on material included of Peterson, Davie: Computer Networks February
More informationCloud Computing. Theory and Practice. Dan C. Marinescu. Morgan Kaufmann is an imprint of Elsevier HEIDELBERG LONDON AMSTERDAM BOSTON
Cloud Computing Theory and Practice Dan C. Marinescu AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO M< Morgan Kaufmann is an imprint of Elsevier
More informationUKAIRO: Internet-Scale Bandwidth Detouring
UKAIRO: Internet-Scale Bandwidth Detouring Thom Haddow Imperial College London Moez Draief Imperial College London Sing Wang Ho Imperial College London Peter Pietzuch Imperial College London Cristian Lumezanu
More informationInternet Content Distribution
Internet Content Distribution Chapter 4: Content Distribution Networks (TUD Student Use Only) Chapter Outline Basics of content distribution networks (CDN) Why CDN? How do they work? Client redirection
More informationARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
More informationIntroduction to Hadoop
Introduction to Hadoop Miles Osborne School of Informatics University of Edinburgh miles@inf.ed.ac.uk October 28, 2010 Miles Osborne Introduction to Hadoop 1 Background Hadoop Programming Model Examples
More informationGLOBAL SERVER LOAD BALANCING WITH SERVERIRON
APPLICATION NOTE GLOBAL SERVER LOAD BALANCING WITH SERVERIRON Growing Global Simply by connecting to the Internet, local businesses transform themselves into global ebusiness enterprises that span the
More informationCloud Application Development (SE808, School of Software, Sun Yat-Sen University) Yabo (Arber) Xu
Lecture 4 Introduction to Hadoop & GAE Cloud Application Development (SE808, School of Software, Sun Yat-Sen University) Yabo (Arber) Xu Outline Introduction to Hadoop The Hadoop ecosystem Related projects
More informationStudying Black Holes on the Internet with Hubble
Studying Black Holes on the Internet with Hubble Ethan Katz-Bassett, Harsha V. Madhyastha, John P. John, Arvind Krishnamurthy, David Wetherall, Thomas Anderson University of Washington August 2008 This
More informationSo What s the Big Deal?
So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data
More informationDistributed Systems. Tutorial 12 Cassandra
Distributed Systems Tutorial 12 Cassandra written by Alex Libov Based on FOSDEM 2010 presentation winter semester, 2013-2014 Cassandra In Greek mythology, Cassandra had the power of prophecy and the curse
More informationBig Table A Distributed Storage System For Data
Big Table A Distributed Storage System For Data OSDI 2006 Fay Chang, Jeffrey Dean, Sanjay Ghemawat et.al. Presented by Rahul Malviya Why BigTable? Lots of (semi-)structured data at Google - - URLs: Contents,
More informationNoSQL Data Base Basics
NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS
More informationContent Distribution Networks (CDN)
229 Content Distribution Networks (CDNs) A content distribution network can be viewed as a global web replication. main idea: each replica is located in a different geographic area, rather then in the
More informationA PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM
A PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM Dr.S. DHANALAKSHMI 1, R. ANUPRIYA 2 1 Prof & Head, 2 Research Scholar Computer Science and Applications, Vivekanandha College of Arts and Sciences
More informationAdvanced Farm Administration with XenApp Worker Groups
WHITE PAPER Citrix XenApp Advanced Farm Administration with XenApp Worker Groups XenApp Product Development www.citrix.com Contents Overview... 3 What is a Worker Group?... 3 Introducing XYZ Corp... 5
More informationSpeak<geek> Tech Brief. RichRelevance Distributed Computing: creating a scalable, reliable infrastructure
3 Speak Tech Brief RichRelevance Distributed Computing: creating a scalable, reliable infrastructure Overview Scaling a large database is not an overnight process, so it s difficult to plan and implement
More informationCloud Enabled Emergency Navigation Using Faster-than-real-time Simulation
Cloud Enabled Emergency Navigation Using Faster-than-real-time Simulation Huibo Bi and Erol Gelenbe Intelligent Systems and Networks Group Department of Electrical and Electronic Engineering Imperial College
More informationInter-domain Routing. Outline. Border Gateway Protocol
Inter-domain Routing Outline Border Gateway Protocol Internet Structure Original idea Backbone service provider Consumer ISP Large corporation Consumer ISP Small corporation Consumer ISP Consumer ISP Small
More informationLarge-Scale Web Applications
Large-Scale Web Applications Mendel Rosenblum Web Application Architecture Web Browser Web Server / Application server Storage System HTTP Internet CS142 Lecture Notes - Intro LAN 2 Large-Scale: Scale-Out
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationScaling Big Data Mining Infrastructure: The Smart Protection Network Experience
Scaling Big Data Mining Infrastructure: The Smart Protection Network Experience 黃 振 修 (Chris Huang) SPN 主 動 式 雲 端 截 毒 技 術 架 構 師 About Me SPN 主 動 式 雲 端 截 毒 技 術 架 構 師 SPN Hadoop 基 礎 運 算 架 構 師 Hadoop in Taiwan
More informationDESIGN OF A PLATFORM OF VIRTUAL SERVICE CONTAINERS FOR SERVICE ORIENTED CLOUD COMPUTING. Carlos de Alfonso Andrés García Vicente Hernández
DESIGN OF A PLATFORM OF VIRTUAL SERVICE CONTAINERS FOR SERVICE ORIENTED CLOUD COMPUTING Carlos de Alfonso Andrés García Vicente Hernández 2 INDEX Introduction Our approach Platform design Storage Security
More informationNetwork Flow Data Fusion GeoSpatial and NetSpatial Data Enhancement
Network Flow Data Fusion GeoSpatial and NetSpatial Data Enhancement FloCon 2010 New Orleans, La Carter Bullard QoSient, LLC carter@qosient.com 1 Carter Bullard carter@qosient.com QoSient - Research and
More informationLoad Distribution in Large Scale Network Monitoring Infrastructures
Load Distribution in Large Scale Network Monitoring Infrastructures Josep Sanjuàs-Cuxart, Pere Barlet-Ros, Gianluca Iannaccone, and Josep Solé-Pareta Universitat Politècnica de Catalunya (UPC) {jsanjuas,pbarlet,pareta}@ac.upc.edu
More informationBased on Computer Networking, 4 th Edition by Kurose and Ross
Computer Networks Internet Routing Based on Computer Networking, 4 th Edition by Kurose and Ross Intra-AS Routing Also known as Interior Gateway Protocols (IGP) Most common Intra-AS routing protocols:
More informationHigh Throughput Computing on P2P Networks. Carlos Pérez Miguel carlos.perezm@ehu.es
High Throughput Computing on P2P Networks Carlos Pérez Miguel carlos.perezm@ehu.es Overview High Throughput Computing Motivation All things distributed: Peer-to-peer Non structured overlays Structured
More informationApache HBase. Crazy dances on the elephant back
Apache HBase Crazy dances on the elephant back Roman Nikitchenko, 16.10.2014 YARN 2 FIRST EVER DATA OS 10.000 nodes computer Recent technology changes are focused on higher scale. Better resource usage
More informationSkype network has three types of machines, all running the same software and treated equally:
What is Skype? Why is Skype so successful? Everybody knows! Skype is a P2P (peer-to-peer) Voice-Over-IP (VoIP) client founded by Niklas Zennström and Janus Friis also founders of the file sharing application
More informationEssential Ingredients for Optimizing End User Experience Monitoring
Essential Ingredients for Optimizing End User Experience Monitoring An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) White Paper Prepared for Neustar IT MANAGEMENT RESEARCH, Table of Contents Executive Summary...1
More informationInternet Firewall CSIS 4222. Packet Filtering. Internet Firewall. Examples. Spring 2011 CSIS 4222. net15 1. Routers can implement packet filtering
Internet Firewall CSIS 4222 A combination of hardware and software that isolates an organization s internal network from the Internet at large Ch 27: Internet Routing Ch 30: Packet filtering & firewalls