P2P Storage System. Presented by: Aakash Therani, Ankit Jasuja & Manish Shah

Size: px
Start display at page:

Download "P2P Storage System. Presented by: Aakash Therani, Ankit Jasuja & Manish Shah"

Transcription

1 P2P Storage System Presented by: Aakash Therani, Ankit Jasuja & Manish Shah

2 What is a P2P storage system? Peer-to-Peer(P2P) storage systems leverage the combined storage capacity of a network of storage devices(peers) contributed typically by autonomous end-users as a common pool of storage space to store and share content. Applications Distributed file systems Content sharing Back-up & archival storage Peer data management systems

3 What is a P2P storage system? Peer-to-Peer(P2P) storage systems leverage the combined storage capacity of a network of storage devices(peers) contributed typically by autonomous end-users as a common pool of storage space to store and share content. Applications Distributed file systems Content sharing Back-up & archival storage Peer data management systems

4 Designing P2P Storage Systems Factors to keep in mind while designing p2p storage systems Persistent Storage Availability- in the presence of network partitions Durability- against failure and attack Security Issues Access control Protection against content pollution Transactions Concurrency Control Fault Tolerance

5 Cloud Storage v/s P2P Storage When data is stored at server clusters within the internet, this kind of data storage is referred to as cloud storage. Cloud Storage Products Amazon S3- Amazon S3 (Simple Storage Service) is a web service that offers cloud storage through a simple HTTP-based interface. Dropbox- Dropbox is a cloud storage provider and file synchronization tool using the Amazon S3 storage facility as a back-end. When relying on the members of a group storing each other s data, it is called peer-to-peer (p2p) storage. P2P Storage Products Wuala- Wuala [37] is a commercial, distributed storage service that allows users to trade storage capacity in a P2P way

6 Classification of Storage Products Products can be classified based on the types of storage needs:- 1) Backup- Using the service as a backup facility for files stored locally on a computer (which is part of the peer network). This may involve keeping track of versions of files, asthey change over time. 2) File Synchronization- Keeping the same file tree that exists on a number of different computers in sync. When one file is changed on one computer, the copy of that file on the other computers is automatically updated. This type of functionality must deal with conflicts, e.g., in case the same file is changed on multiple computers at the same time. 3) Distributed file system-the online storage capacity is used to implement a distributed file system. One or more computers access the storage in a manner that is very similar to local file systems. 4) Content Sharing-Parts of the file tree stored online are used to share data with other people. By providing credentials to others, they can use the storage facility to read the part of the tree they were granted access to.

7 OceanStore: An Architecture for Global-Scale Persistent Storage

8 OceanStore: A True Data Utility Utility model: consumers pay a monthly fee in exchange for access to persistent storage Highly available data from anywhere Automatic replication for disaster recovery Strong security Providers would buy and sell capacity among themselves for mobile users Deep archival storage: use excess of storage space to ease data management

9 Ubiquitous Computing

10 Two Unique Goals 1) Ability to be constructed from an untrusted infrastructure Servers may crash without warning All information entering the infrastructure must be encrypted Servers participate in protocols for distributed consistency management 2) Support for Nomadic Data Locality is of utmost importance Promiscuous Caching: Data can be cached anywhere, anytime Continuous introspective monitoring to manage caching & locality

11 System Overview Persistent object: The fundamental unit in OceanStore Each object is named by a Globally Unique Identifier (GUID) Objects are replicated and stored on multiple servers Floating replicas: Replicas are independent of the server Two mechanisms to locate a replica 1) A fast, probabilistic algorithm to find the object near the requesting machine 2) If (1) fails, then it is located through a slower, deterministic algorithm

12 Underlying Technologies Naming Access Control Data Location and Routing Data Update Deep Archival Storage Introspection

13 Naming GUID: psuedo-random, fixed-length bit string Decentralized & resistant to attempts by adversaries Self-certifying path names GUID=hash(owner s key, filename) GUID of a server is a secure hash of its key GUID of a data fragment is a secure hash of the data content

14 Access Control OceanStore supports two primitive types of access controls 1) Reader Restriction Encrypt non-public data and distribute the key to users with read access Problem: There is no way to make a reader forget what he has read 2) Writer Restriction Through ACLs specified for each object by its owner Each user has a signing key, ACLs use that key for granting access Note: Reads are restricted at clients via key distribution, while writes are restricted at servers by ignoring unauthorized updates

15 Data Location and Routing Objects can reside on any of the OceanStore servers Use query routing to locate objects Every object is identified by one or more GUIDs Different replicas of the same object has the same GUID OceanStore messages are labeled with A destination GUID (built on top of IP) A random number A small predicate

16 Distributed Routing in OceanStore Routing is a two phase process. Data location and routing combined Advantage being we avoid multiple round trip time Routing itself is 2 tiered Fast probabilistic algorithm and slow reliable hierarchical method.

17 Bloom Filters Based on the idea of hill-climbing If a query cannot be satisfied by a server, local information is use to route the query to a likely neighbor - Via a modified version of a Bloom filter

18 Attenuated Bloom Filters An attenuated Bloom filter of depth D is an array of D normal Bloom filters ith Bloom filter is the union of all the Bloom filters for all of the nodes at a distance i One filter per network edge

19 Attenuated Bloom Filters Lookup 11010

20 The Global Algorithm: Wide-Scale Distributed Data Location Plaxton s randomized hierarchical distributed data structure Resolve one digit of the node id at a time Links form a series of random embedded trees, with each node as the root of one of these trees. Neighbor links can be used to route from anywhere to a given node If information about the GUID (such as its location) were stored at its root, then anyone could find this information simply by following neighbor links until they reached the root node for the GUID.

21 The Global Algorithm: Wide-Scale Distributed Data Location

22 Achieving Locality When a replica is placed somewhere in the system, its location is published to the routing infrastructure. The publishing process works its way to the object s root and deposits a pointer at every hop along the way. Each new replica only needs to traverse O(log(n)) hops to reach the root, where n is the number of the servers When someone searches for information, they climb the tree until they run into a pointer, after which they route directly to the object.

23 Achieving Fault Tolerance Avoid failures at roots Each root GUID is hashed with a small number of different salt values Make it difficult to target a single GUID for DoS attacks If failures are detected, just jump to any node to reach the root OceanStore continually monitors and repairs broken pointers

24 Advantages of Distributed Information Redundant paths to roots Scalable with a combination of probabilistic and global algorithms Easy to locate and recover failed components Plaxton links form a natural substrate for admission controls and multicasts

25 Achieving Maintenance-Free Operation Recursive node insertion and removal Replicated roots Use beacons to detect faults Time-to-live fields to update routes Second-chance algorithm to avoid false diagnoses of failed components Avoid the cost of recovering lost nodes Automatic reconstruction of data for failed servers

26 Update: Format and Semantics An update: a list of predicates associated with actions A set of predicates is evaluated in order The actions of the earliest true predicate are atomically applied Update is logged if it commits or aborts. Predicates: compare-version, compare-block, compare-size, search Actions: replace-block, insert-block, delete-block, append

27 Serializing Updates in an Untrusted Infrastructure Use a small primary tier of replicas to serialize updates Runs Byzantine agreement protocol Minimize communication Meanwhile, a secondary tier of replicas optimistically propagate updates among themselves Final ordering from primary tier is multicasted to secondary replicas

28 Update Path of an update: a) After generating an update, a client sends it directly to the object s inner ring b) While inner ring performs a Byzantine agreement to commit the update, secondary nodes propagate the update among themselves c) The result of update is multicast down the dissemination tree to all secondary nodes

29 The Full Update Path

30 Update commitment Fault tolerance: Guarantees fault tolerance if less than one third of the servers in the inner ring is malicious Secondary nodes do not participate in the Byzantine protocol, but receive consistency information

31 A Direct Path to Clients and Archival Storage Updates flow directly from a client to the primary tier, where they are serialized and then multicast to the secondary servers down the dissemination tree Updates are tightly coupled with archival Archival fragments are generated at serialization time, signed, encoded and distributed with updates

32 Deep Archival Storage Data is fragmented Each fragment is an object Erasure coding is used to increase reliability Administrative domains are ranked by their reliability and trustworthiness Avoid locations with correlated failures

33 Erasure Codes Erasure coding is a process that treats input data as a series of fragments (say n) and transforms these fragments into a greater number of fragments (say 2nor 4n) n Message Encoding Algorithm cn Encoding Transmission n Received Decoding Algorithm n Message

34 Introspection computation optimization observation Observation modules monitor the activity of a running system and track system behavior Optimization modules adjust the computation

35 Introspection Event handlers summarizes local events. These summaries are stored in a database. The information in the database is periodically analyzed and necessary actions are taken. A summary is sent to other nodes.

36 Uses of Introspection Cluster recognition Identify related files Replica management Adjust replication factors Migrate floating replicas

37 Introspection If a replica becomes unavailable: Clients will receive service from a more distant replica This produces extra load on distant replicas Introspective mechanism detects this and new replicas are created Above actions provide fault tolerance and automatic repair

38 Applications Groupware applications Personal information management tools Contact lists Calendars Distributed design tools

39 Conclusion Different from other systems : Utility model Untrusted infrastructure Truly nomadic data Use of introspection Prevention of denial of service attacks Rapid response to regional outages Analysis of access patterns

40 Dynamo: Amazon s Highly Available Keyvalue Store

41 Motivation Build a distributed storage system: Scale Simple: key-value Highly available Guarantee Service Level Agreements (SLA) Service Level Agreements (SLA) Application can deliver its functionality in abounded time: Every dependency in the platform needs to deliver its functionality with even tighter bounds. Example: service guaranteeing that it will provide a response within 300ms for 99.9% of its requests for a peak client load of 500 requests per second.

42 Design Consideration 1) Sacrifice strong consistency for availability 2) Conflict resolution is executed during read instead of write, i.e. always writeable. 3) Other principles: Incremental scalability. Symmetry. Decentralization. Heterogeneity.

43 Partition Algorithm Consistent hashing: the output range of a hash function is treated as a fixed circular space or ring. Virtual Nodes: Each node can be responsible for more than one virtual node. Advantages of using virtual nodes If a node becomes unavailable the load handled by this node is evenly dispersed across the remaining available nodes. When a node becomes available again, the newly available node accepts a roughly equivalent amount of load from each of the other available nodes. The number of virtual nodes that a node is responsible can decided based on its capacity, accounting for heterogeneity in the physical infrastructure.

44 Data Versioning & Vector Clock A put() call may return to its caller before the update has been applied at all the replicas A get() call may return many versions of the same object. Challenge: an object having distinct version sub-histories, which the system will need to reconcile in the future. Solution: uses vector clocks in order to capture causality between different versions of the same object. A vector clock is a list of (node, counter) pairs. Every version of every object is associated with one vector clock. If the counters on the first object s clock are less-than-or-equal to all of the nodes in the second clock, then the first is an ancestor of the second and can be forgotten.

45 Execution 1) Read / Write request on a key Arrives at a node (coordinator) Ideally the node responsible for the particular key Else forwards request to the node responsible for that key and that node will become the coordinator The first N healthy and distinct nodes following the key position are considered for the request Quorums are used R Read Quorum W Write Quorum R+W>N 2) Writes Requires generation of a new vector clock by coordinator Coordinator writes locally Forwards to N nodes, if W-1 respond then the write was successful 3) Reads Forwards to N nodes, if R-1 respond then forwards to user Only unique responses forwarded User handles merging if multiple versions exist

46 FreeNet: A Distributed Anonymous Information Storage and Retrieval System

47 FreeNet Introduction: P2P network for anonymous publishing and retrieval of data Decentralized Nodes collaborate in storage and routing Data centric routing Adapts to demands Addresses privacy & availability concerns Features: Anonymity for producers and consumers Deniability for information stores Resistance to denial attacks Efficient storing and routing Does NOT provide Permanent file storage Load balancing Anonymity for general n/w usage

48 Architecture Request: 1. Key 2. Hops to live 3. ID 4. Depth Each node local data store + routing table Request file through location independent keys Routing - chain of proxy requests - decision is local Graph structure actively evolves over time

49 Keys and Searching Problems with SSK - updating, versioning Content Hash Keys (CHK) Encrypted by a random encryption key Publish CHK + decryption key CHK + SSK easily updateable files 2 step process publish file, publish pointer Results in pointers to newer version Older versions accessed thru CHK Can be used for splitting files

50 File retrieving a d b e c f Location of keys: Hypertext spider Indirect files published with KSK of search words Publish bookmarks File retrieval Request forwarded to node in RT with closest lexicographic match for the binary key Request routing follows steepest-ascent hill climbing: first choice failure backtrack second choice Timers, hops - curtail request threads Files cached all along the retrieval path Self-reinforcing cycle results in key expertise

51 Data Management Finite data stores - nodes resort to LRU Routing table entries linger after data eviction Outdated (or unpopular) docs disappear automatically Bipartite eviction short term policy New files replace most recent files Prevents established files being evicted by attacks

52 Protocol and Security PROTOCOL Nodes with frequently changing IPs use ARKs Return address specified in requests threat? Messages do not always terminate when hops-to-live reaches 1 Depth is initialized by original requestor to arbitrarily small value Request state maintained at each node timers LRU SECURITY File integrity - KSK vulnerable to dictionary attacks DOS attacks Hash Cash to slow down Attempts to displace valid files are constrained by the insert procedure

53 Thank You..!!!

Peer-to-Peer and Grid Computing. Chapter 4: Peer-to-Peer Storage

Peer-to-Peer and Grid Computing. Chapter 4: Peer-to-Peer Storage Peer-to-Peer and Grid Computing Chapter 4: Peer-to-Peer Storage Chapter Outline Using DHTs to build more complex systems How DHT can help? What problems DHTs solve? What problems are left unsolved? P2P

More information

P2P Storage Systems. Prof. Chun-Hsin Wu Dept. Computer Science & Info. Eng. National University of Kaohsiung

P2P Storage Systems. Prof. Chun-Hsin Wu Dept. Computer Science & Info. Eng. National University of Kaohsiung P2P Storage Systems Prof. Chun-Hsin Wu Dept. Computer Science & Info. Eng. National University of Kaohsiung Outline Introduction Distributed file systems P2P file-swapping systems P2P storage systems Strengths

More information

OceanStore: An Architecture for Global-Scale Persistent Storage

OceanStore: An Architecture for Global-Scale Persistent Storage OceanStore: An Architecture for Global-Scale Persistent Storage John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patrick Eaton, Dennis Geels, Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon,

More information

Acknowledgements. Peer to Peer File Storage Systems. Target Uses. P2P File Systems CS 699. Serving data with inexpensive hosts:

Acknowledgements. Peer to Peer File Storage Systems. Target Uses. P2P File Systems CS 699. Serving data with inexpensive hosts: Acknowledgements Peer to Peer File Storage Systems CS 699 Some of the followings slides are borrowed from a talk by Robert Morris (MIT) 1 2 P2P File Systems Target Uses File Sharing is one of the most

More information

P2P-based Storage Systems

P2P-based Storage Systems Peer-to-Peer Networks and Applications P2P-based Storage Systems Dr.-Ing. Kalman Graffi Faculty for Electrical Engineering, Computer Science at the University of Paderborn This slide set is based on the

More information

Distributed Data Stores

Distributed Data Stores Distributed Data Stores 1 Distributed Persistent State MapReduce addresses distributed processing of aggregation-based queries Persistent state across a large number of machines? Distributed DBMS High

More information

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election

More information

Dynamo: Amazon s Highly Available Key-value Store

Dynamo: Amazon s Highly Available Key-value Store Dynamo: Amazon s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and

More information

Cassandra A Decentralized, Structured Storage System

Cassandra A Decentralized, Structured Storage System Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications of the ACM http://dl.acm.org/citation.cfm?id=1773922

More information

Scalability. We can measure growth in almost any terms. But there are three particularly interesting things to look at:

Scalability. We can measure growth in almost any terms. But there are three particularly interesting things to look at: Scalability The ability of a system, network, or process, to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth. We can measure growth in almost

More information

A Brief Analysis on Architecture and Reliability of Cloud Based Data Storage

A Brief Analysis on Architecture and Reliability of Cloud Based Data Storage Volume 2, No.4, July August 2013 International Journal of Information Systems and Computer Sciences ISSN 2319 7595 Tejaswini S L Jayanthy et al., Available International Online Journal at http://warse.org/pdfs/ijiscs03242013.pdf

More information

Async: Secure File Synchronization

Async: Secure File Synchronization Async: Secure File Synchronization Vera Schaaber, Alois Schuette University of Applied Sciences Darmstadt, Department of Computer Science, Schoefferstr. 8a, 64295 Darmstadt, Germany vera.schaaber@stud.h-da.de

More information

Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at

Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at distributing load b. QUESTION: What is the context? i. How

More information

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

1. Comments on reviews a. Need to avoid just summarizing web page asks you for: 1. Comments on reviews a. Need to avoid just summarizing web page asks you for: i. A one or two sentence summary of the paper ii. A description of the problem they were trying to solve iii. A summary of

More information

Outline. 15-744: Computer Networking. Narrow Waist of the Internet Key to its Success. NSF Future Internet Architecture

Outline. 15-744: Computer Networking. Narrow Waist of the Internet Key to its Success. NSF Future Internet Architecture Outline 15-744: Computer Networking L-15 Future Internet Architecture 2 Motivation and discussion Some proposals: CCN Nebula Mobility First XIA XIA overview AIP Scion 2 NSF Future Internet Architecture

More information

An Introduction to Peer-to-Peer Networks

An Introduction to Peer-to-Peer Networks An Introduction to Peer-to-Peer Networks Presentation for MIE456 - Information Systems Infrastructure II Vinod Muthusamy October 30, 2003 Agenda Overview of P2P Characteristics Benefits Unstructured P2P

More information

Distributed Systems. 23. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2015

Distributed Systems. 23. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2015 Distributed Systems 23. Content Delivery Networks (CDN) Paul Krzyzanowski Rutgers University Fall 2015 November 17, 2015 2014-2015 Paul Krzyzanowski 1 Motivation Serving web content from one location presents

More information

Storage Systems Autumn 2009. Chapter 6: Distributed Hash Tables and their Applications André Brinkmann

Storage Systems Autumn 2009. Chapter 6: Distributed Hash Tables and their Applications André Brinkmann Storage Systems Autumn 2009 Chapter 6: Distributed Hash Tables and their Applications André Brinkmann Scaling RAID architectures Using traditional RAID architecture does not scale Adding news disk implies

More information

The Sierra Clustered Database Engine, the technology at the heart of

The Sierra Clustered Database Engine, the technology at the heart of A New Approach: Clustrix Sierra Database Engine The Sierra Clustered Database Engine, the technology at the heart of the Clustrix solution, is a shared-nothing environment that includes the Sierra Parallel

More information

Adapting Distributed Hash Tables for Mobile Ad Hoc Networks

Adapting Distributed Hash Tables for Mobile Ad Hoc Networks University of Tübingen Chair for Computer Networks and Internet Adapting Distributed Hash Tables for Mobile Ad Hoc Networks Tobias Heer, Stefan Götz, Simon Rieche, Klaus Wehrle Protocol Engineering and

More information

CS5412: TIER 2 OVERLAYS

CS5412: TIER 2 OVERLAYS 1 CS5412: TIER 2 OVERLAYS Lecture VI Ken Birman Recap 2 A week ago we discussed RON and Chord: typical examples of P2P network tools popular in the cloud Then we shifted attention and peeked into the data

More information

ICP. Cache Hierarchies. Squid. Squid Cache ICP Use. Squid. Squid

ICP. Cache Hierarchies. Squid. Squid Cache ICP Use. Squid. Squid Caching & CDN s 15-44: Computer Networking L-21: Caching and CDNs HTTP APIs Assigned reading [FCAB9] Summary Cache: A Scalable Wide- Area Cache Sharing Protocol [Cla00] Freenet: A Distributed Anonymous

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Service Overview CloudCare Online Backup

Service Overview CloudCare Online Backup Service Overview CloudCare Online Backup CloudCare s Online Backup service is a secure, fully automated set and forget solution, powered by Attix5, and is ideal for organisations with limited in-house

More information

Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.

Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful. Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one

More information

www.basho.com Technical Overview Simple, Scalable, Object Storage Software

www.basho.com Technical Overview Simple, Scalable, Object Storage Software www.basho.com Technical Overview Simple, Scalable, Object Storage Software Table of Contents Table of Contents... 1 Introduction & Overview... 1 Architecture... 2 How it Works... 2 APIs and Interfaces...

More information

Sync Security and Privacy Brief

Sync Security and Privacy Brief Introduction Security and privacy are two of the leading issues for users when transferring important files. Keeping data on-premises makes business and IT leaders feel more secure, but comes with technical

More information

Multicast vs. P2P for content distribution

Multicast vs. P2P for content distribution Multicast vs. P2P for content distribution Abstract Many different service architectures, ranging from centralized client-server to fully distributed are available in today s world for Content Distribution

More information

Peer-to-Peer Networks. Chapter 6: P2P Content Distribution

Peer-to-Peer Networks. Chapter 6: P2P Content Distribution Peer-to-Peer Networks Chapter 6: P2P Content Distribution Chapter Outline Content distribution overview Why P2P content distribution? Network coding Peer-to-peer multicast Kangasharju: Peer-to-Peer Networks

More information

Anonymous Communication in Peer-to-Peer Networks for Providing more Privacy and Security

Anonymous Communication in Peer-to-Peer Networks for Providing more Privacy and Security Anonymous Communication in Peer-to-Peer Networks for Providing more Privacy and Security Ehsan Saboori and Shahriar Mohammadi Abstract One of the most important issues in peer-to-peer networks is anonymity.

More information

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop

More information

Distributed Systems. 25. Content Delivery Networks (CDN) 2014 Paul Krzyzanowski. Rutgers University. Fall 2014

Distributed Systems. 25. Content Delivery Networks (CDN) 2014 Paul Krzyzanowski. Rutgers University. Fall 2014 Distributed Systems 25. Content Delivery Networks (CDN) Paul Krzyzanowski Rutgers University Fall 2014 November 16, 2014 2014 Paul Krzyzanowski 1 Motivation Serving web content from one location presents

More information

Dynamo: Amazon s Highly Available Key-value Store

Dynamo: Amazon s Highly Available Key-value Store Dynamo: Amazon s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and

More information

BEST PRACTICES FOR IMPROVING EXTERNAL DNS RESILIENCY AND PERFORMANCE

BEST PRACTICES FOR IMPROVING EXTERNAL DNS RESILIENCY AND PERFORMANCE BEST PRACTICES FOR IMPROVING EXTERNAL DNS RESILIENCY AND PERFORMANCE BEST PRACTICES FOR IMPROVING EXTERNAL DNS RESILIENCY AND PERFORMANCE Your external DNS is a mission critical business resource. Without

More information

Dr. Arjan Durresi Louisiana State University, Baton Rouge, LA 70803 durresi@csc.lsu.edu. DDoS and IP Traceback. Overview

Dr. Arjan Durresi Louisiana State University, Baton Rouge, LA 70803 durresi@csc.lsu.edu. DDoS and IP Traceback. Overview DDoS and IP Traceback Dr. Arjan Durresi Louisiana State University, Baton Rouge, LA 70803 durresi@csc.lsu.edu Louisiana State University DDoS and IP Traceback - 1 Overview Distributed Denial of Service

More information

Global Server Load Balancing

Global Server Load Balancing White Paper Overview Many enterprises attempt to scale Web and network capacity by deploying additional servers and increased infrastructure at a single location, but centralized architectures are subject

More information

Secure Data transfer in Cloud Storage Systems using Dynamic Tokens.

Secure Data transfer in Cloud Storage Systems using Dynamic Tokens. Secure Data transfer in Cloud Storage Systems using Dynamic Tokens. P.Srinivas *,K. Rajesh Kumar # M.Tech Student (CSE), Assoc. Professor *Department of Computer Science (CSE), Swarnandhra College of Engineering

More information

SECURE AND TRUSTY STORAGE SERVICES IN CLOUD COMPUTING

SECURE AND TRUSTY STORAGE SERVICES IN CLOUD COMPUTING SECURE AND TRUSTY STORAGE SERVICES IN CLOUD COMPUTING Saranya.V 1, Suganthi.J 2, R.G. Suresh Kumar 3 1,2 Master of Technology, Department of Computer Science and Engineering, Rajiv Gandhi College of Engineering

More information

Physical Data Organization

Physical Data Organization Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

More information

Optimize Application Delivery Across Your Globally Distributed Data Centers

Optimize Application Delivery Across Your Globally Distributed Data Centers BIG IP Global Traffic Manager DATASHEET What s Inside: 1 Key Benefits 2 Globally Available Applications 4 Simple Management 5 Secure Applications 6 Network Integration 6 Architecture 7 BIG-IP GTM Platforms

More information

Hypertable Architecture Overview

Hypertable Architecture Overview WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for

More information

DFSgc. Distributed File System for Multipurpose Grid Applications and Cloud Computing

DFSgc. Distributed File System for Multipurpose Grid Applications and Cloud Computing DFSgc Distributed File System for Multipurpose Grid Applications and Cloud Computing Introduction to DFSgc. Motivation: Grid Computing currently needs support for managing huge quantities of storage. Lacks

More information

Cloud Storage and Peer-to-Peer Storage End-user considerations and product overview

Cloud Storage and Peer-to-Peer Storage End-user considerations and product overview Cloud Storage and Peer-to-Peer Storage End-user considerations and product overview Project : Gigaport3 Project year : 2010 Coordination : Rogier Spoor Author(s) : Arjan Peddemors Due date : Q2 2010 Version

More information

Department of Computer Science Institute for System Architecture, Chair for Computer Networks. File Sharing

Department of Computer Science Institute for System Architecture, Chair for Computer Networks. File Sharing Department of Computer Science Institute for System Architecture, Chair for Computer Networks File Sharing What is file sharing? File sharing is the practice of making files available for other users to

More information

Distributed Data Management

Distributed Data Management Introduction Distributed Data Management Involves the distribution of data and work among more than one machine in the network. Distributed computing is more broad than canonical client/server, in that

More information

How to Choose Between Hadoop, NoSQL and RDBMS

How to Choose Between Hadoop, NoSQL and RDBMS How to Choose Between Hadoop, NoSQL and RDBMS Keywords: Jean-Pierre Dijcks Oracle Redwood City, CA, USA Big Data, Hadoop, NoSQL Database, Relational Database, SQL, Security, Performance Introduction A

More information

Tornado: A Capability-Aware Peer-to-Peer Storage Network

Tornado: A Capability-Aware Peer-to-Peer Storage Network Tornado: A Capability-Aware Peer-to-Peer Storage Network Hung-Chang Hsiao hsiao@pads1.cs.nthu.edu.tw Chung-Ta King* king@cs.nthu.edu.tw Department of Computer Science National Tsing Hua University Hsinchu,

More information

Solaris For The Modern Data Center. Taking Advantage of Solaris 11 Features

Solaris For The Modern Data Center. Taking Advantage of Solaris 11 Features Solaris For The Modern Data Center Taking Advantage of Solaris 11 Features JANUARY 2013 Contents Introduction... 2 Patching and Maintenance... 2 IPS Packages... 2 Boot Environments... 2 Fast Reboot...

More information

Designing a Cloud Storage System

Designing a Cloud Storage System Designing a Cloud Storage System End to End Cloud Storage When designing a cloud storage system, there is value in decoupling the system s archival capacity (its ability to persistently store large volumes

More information

Cloud Computing at Google. Architecture

Cloud Computing at Google. Architecture Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale

More information

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2 DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing Slide 1 Slide 3 A style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

More information

Distributed Systems. Tutorial 12 Cassandra

Distributed Systems. Tutorial 12 Cassandra Distributed Systems Tutorial 12 Cassandra written by Alex Libov Based on FOSDEM 2010 presentation winter semester, 2013-2014 Cassandra In Greek mythology, Cassandra had the power of prophecy and the curse

More information

CS435 Introduction to Big Data

CS435 Introduction to Big Data CS435 Introduction to Big Data Final Exam Date: May 11 6:20PM 8:20PM Location: CSB 130 Closed Book, NO cheat sheets Topics covered *Note: Final exam is NOT comprehensive. 1. NoSQL Impedance mismatch Scale-up

More information

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL This chapter is to introduce the client-server model and its role in the development of distributed network systems. The chapter

More information

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo High Availability for Database Systems in Cloud Computing Environments Ashraf Aboulnaga University of Waterloo Acknowledgments University of Waterloo Prof. Kenneth Salem Umar Farooq Minhas Rui Liu (post-doctoral

More information

Dr Markus Hagenbuchner markus@uow.edu.au CSCI319. Distributed Systems

Dr Markus Hagenbuchner markus@uow.edu.au CSCI319. Distributed Systems Dr Markus Hagenbuchner markus@uow.edu.au CSCI319 Distributed Systems CSCI319 Chapter 8 Page: 1 of 61 Fault Tolerance Study objectives: Understand the role of fault tolerance in Distributed Systems. Know

More information

Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)

Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing) 1 1 Distributed Systems What are distributed systems? How would you characterize them? Components of the system are located at networked computers Cooperate to provide some service No shared memory Communication

More information

Peer to Peer Search Engine and Collaboration Platform Based on JXTA Protocol

Peer to Peer Search Engine and Collaboration Platform Based on JXTA Protocol Peer to Peer Search Engine and Collaboration Platform Based on JXTA Protocol Andraž Jere, Marko Meža, Boštjan Marušič, Štefan Dobravec, Tomaž Finkšt, Jurij F. Tasič Faculty of Electrical Engineering Tržaška

More information

How To Use Attix5 Pro For A Fraction Of The Cost Of A Backup

How To Use Attix5 Pro For A Fraction Of The Cost Of A Backup Service Overview Business Cloud Backup Techgate s Business Cloud Backup service is a secure, fully automated set and forget solution, powered by Attix5, and is ideal for organisations with limited in-house

More information

Security in Structured P2P Systems

Security in Structured P2P Systems P2P Systems, Security and Overlays Presented by Vishal thanks to Dan Rubenstein Columbia University 1 Security in Structured P2P Systems Structured Systems assume all nodes behave Position themselves in

More information

Ingegneria del Software II academic year: 2004-2005 Course Web-site: [www.di.univaq.it/ingegneria2/]

Ingegneria del Software II academic year: 2004-2005 Course Web-site: [www.di.univaq.it/ingegneria2/] Course: Ingegneria del Software II academic year: 2004-2005 Course Web-site: [www.di.univaq.it/ingegneria2/] Middleware Technology: Middleware Applications and Distributed Systems Lecturer: Henry Muccini

More information

SCALABILITY AND AVAILABILITY

SCALABILITY AND AVAILABILITY SCALABILITY AND AVAILABILITY Real Systems must be Scalable fast enough to handle the expected load and grow easily when the load grows Available available enough of the time Scalable Scale-up increase

More information

A Survey of Peer-to-Peer File Sharing Technologies

A Survey of Peer-to-Peer File Sharing Technologies Athens University of Economics and Business The ebusiness Centre (www.eltrun.gr) A Survey of Peer-to-Peer File Sharing Technologies White Paper Page 1 of 1 A Survey of Peer-to-Peer File Sharing Technologies

More information

Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family

Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family White Paper June, 2008 Legal INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL

More information

Achta's IBAN Validation API Service Overview (achta.com)

Achta's IBAN Validation API Service Overview (achta.com) Tel: 00 353 (0) 14773295 e: info@achta.com Achta's IBAN Validation API Service Overview (achta.com) Summary At Achta we have built a secure, scalable and cloud based API for SEPA. One of our core offerings

More information

Decentralized Peer-to-Peer Network Architecture: Gnutella and Freenet

Decentralized Peer-to-Peer Network Architecture: Gnutella and Freenet Decentralized Peer-to-Peer Network Architecture: Gnutella and Freenet AUTHOR: Jem E. Berkes umberkes@cc.umanitoba.ca University of Manitoba Winnipeg, Manitoba Canada April 9, 2003 Introduction Although

More information

A1 and FARM scalable graph database on top of a transactional memory layer

A1 and FARM scalable graph database on top of a transactional memory layer A1 and FARM scalable graph database on top of a transactional memory layer Miguel Castro, Aleksandar Dragojević, Dushyanth Narayanan, Ed Nightingale, Alex Shamis Richie Khanna, Matt Renzelmann Chiranjeeb

More information

Freenet: A Distributed Anonymous Information Storage and Retrieval System

Freenet: A Distributed Anonymous Information Storage and Retrieval System Freenet: A Distributed Anonymous Information Storage and Retrieval System Ian Clarke 1, Oskar Sandberg 2, Brandon Wiley 3, and Theodore W. Hong 4 1 Uprizer, Inc., 1007 Montana Avenue #323, Santa Monica,

More information

Module 15: Network Structures

Module 15: Network Structures Module 15: Network Structures Background Topology Network Types Communication Communication Protocol Robustness Design Strategies 15.1 A Distributed System 15.2 Motivation Resource sharing sharing and

More information

Application Design and Development

Application Design and Development C H A P T E R9 Application Design and Development Practice Exercises 9.1 What is the main reason why servlets give better performance than programs that use the common gateway interface (CGI), even though

More information

Scality RING High performance Storage So7ware for Email pla:orms, StaaS and Cloud ApplicaAons

Scality RING High performance Storage So7ware for Email pla:orms, StaaS and Cloud ApplicaAons Scality RING High performance Storage So7ware for Email pla:orms, StaaS and Cloud ApplicaAons Friday, March 18, 2011 MARKET ExponenAal Storage Demand The Digital Universe: Growing by a factor of 44 in

More information

Improving data integrity on cloud storage services

Improving data integrity on cloud storage services International Journal of Engineering Science Invention ISSN (Online): 2319 6734, ISSN (Print): 2319 6726 Volume 2 Issue 2 ǁ February. 2013 ǁ PP.49-55 Improving data integrity on cloud storage services

More information

TABLE OF CONTENTS THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY FOR SHAREPOINT DATA. Introduction. Examining Third-Party Replication Models

TABLE OF CONTENTS THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY FOR SHAREPOINT DATA. Introduction. Examining Third-Party Replication Models 1 THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY TABLE OF CONTENTS 3 Introduction 14 Examining Third-Party Replication Models 4 Understanding Sharepoint High Availability Challenges With Sharepoint

More information

BLOOM FILTERS: 1) Operations:

BLOOM FILTERS: 1) Operations: BLOOM FILTERS: - short binary array with of length m. - k different 'collision-resistant' hash functions - n number of elements mapped into the bloom filter 1) Operations: - add element: hash the element

More information

High Throughput Computing on P2P Networks. Carlos Pérez Miguel carlos.perezm@ehu.es

High Throughput Computing on P2P Networks. Carlos Pérez Miguel carlos.perezm@ehu.es High Throughput Computing on P2P Networks Carlos Pérez Miguel carlos.perezm@ehu.es Overview High Throughput Computing Motivation All things distributed: Peer-to-peer Non structured overlays Structured

More information

Distributed Architectures. Distributed Databases. Distributed Databases. Distributed Databases

Distributed Architectures. Distributed Databases. Distributed Databases. Distributed Databases Distributed Architectures Distributed Databases Simplest: client-server Distributed databases: two or more database servers connected to a network that can perform transactions independently and together

More information

A block based storage model for remote online backups in a trust no one environment

A block based storage model for remote online backups in a trust no one environment A block based storage model for remote online backups in a trust no one environment http://www.duplicati.com/ Kenneth Skovhede (author, kenneth@duplicati.com) René Stach (editor, rene@duplicati.com) Abstract

More information

ATTACKS ON CLOUD COMPUTING. Nadra Waheed

ATTACKS ON CLOUD COMPUTING. Nadra Waheed ATTACKS ON CLOUD COMPUTING 1 Nadra Waheed CONTENT 1. Introduction 2. Cloud computing attacks 3. Cloud TraceBack 4. Evaluation 5. Conclusion 2 INTRODUCTION Today, cloud computing systems are providing a

More information

Measurement Study of Wuala, a Distributed Social Storage Service

Measurement Study of Wuala, a Distributed Social Storage Service Measurement Study of Wuala, a Distributed Social Storage Service Thomas Mager - Master Thesis Advisors: Prof. Ernst Biersack Prof. Thorsten Strufe Prof. Pietro Michiardi Illustration: Maxim Malevich 15.12.2010

More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 Load Balancing Heterogeneous Request in DHT-based P2P Systems Mrs. Yogita A. Dalvi Dr. R. Shankar Mr. Atesh

More information

REMOTE BACKUP-WHY SO VITAL?

REMOTE BACKUP-WHY SO VITAL? REMOTE BACKUP-WHY SO VITAL? Any time your company s data or applications become unavailable due to system failure or other disaster, this can quickly translate into lost revenue for your business. Remote

More information

Plaxton Routing. - From a peer - to - peer network point of view. Lars P. Wederhake

Plaxton Routing. - From a peer - to - peer network point of view. Lars P. Wederhake Plaxton Routing - From a peer - to - peer network point of view Lars P. Wederhake Ferienakademie im Sarntal 2008 FAU Erlangen-Nürnberg, TU München, Uni Stuttgart September 2008 Overview 1 Introduction

More information

Index Terms Cloud Storage Services, data integrity, dependable distributed storage, data dynamics, Cloud Computing.

Index Terms Cloud Storage Services, data integrity, dependable distributed storage, data dynamics, Cloud Computing. Volume 3, Issue 5, May 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Privacy - Preserving

More information

Agenda. Distributed System Structures. Why Distributed Systems? Motivation

Agenda. Distributed System Structures. Why Distributed Systems? Motivation Agenda Distributed System Structures CSCI 444/544 Operating Systems Fall 2008 Motivation Network structure Fundamental network services Sockets and ports Client/server model Remote Procedure Call (RPC)

More information

HRG Assessment: Stratus everrun Enterprise

HRG Assessment: Stratus everrun Enterprise HRG Assessment: Stratus everrun Enterprise Today IT executive decision makers and their technology recommenders are faced with escalating demands for more effective technology based solutions while at

More information

High Availability Design Patterns

High Availability Design Patterns High Availability Design Patterns Kanwardeep Singh Ahluwalia 81-A, Punjabi Bagh, Patiala 147001 India kanwardeep@gmail.com +91 98110 16337 Atul Jain 135, Rishabh Vihar Delhi 110092 India jain.atul@wipro.com

More information

Plaxton routing. Systems. (Pastry, Tapestry and Kademlia) Pastry: Routing Basics. Pastry: Topology. Pastry: Routing Basics /3

Plaxton routing. Systems. (Pastry, Tapestry and Kademlia) Pastry: Routing Basics. Pastry: Topology. Pastry: Routing Basics /3 Uni Innsbruck Informatik Uni Innsbruck Informatik Peerto topeer Systems DHT examples, part (Pastry, Tapestry and Kademlia) Michael Welzl michael.welzl@uibk.ac.at DPS NSG Team http://dps.uibk.ac.at dps.uibk.ac.at/nsg

More information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information Introduction Computer Network. Interconnected collection of autonomous computers that are able to exchange information No master/slave relationship between the computers in the network Data Communications.

More information

Database Replication with MySQL and PostgreSQL

Database Replication with MySQL and PostgreSQL Database Replication with MySQL and PostgreSQL Fabian Mauchle Software and Systems University of Applied Sciences Rapperswil, Switzerland www.hsr.ch/mse Abstract Databases are used very often in business

More information

DISASTER RECOVERY WITH AWS

DISASTER RECOVERY WITH AWS DISASTER RECOVERY WITH AWS Every company is vulnerable to a range of outages and disasters. From a common computer virus or network outage to a fire or flood these interruptions can wreak havoc on your

More information

Social Networks and the Richness of Data

Social Networks and the Richness of Data Social Networks and the Richness of Data Getting distributed Webservices Done with NoSQL Fabrizio Schmidt, Lars George VZnet Netzwerke Ltd. Content Unique Challenges System Evolution Architecture Activity

More information

PSON: A Scalable Peer-to-Peer File Sharing System Supporting Complex Queries

PSON: A Scalable Peer-to-Peer File Sharing System Supporting Complex Queries PSON: A Scalable Peer-to-Peer File Sharing System Supporting Complex Queries Jyoti Ahuja, Jun-Hong Cui, Shigang Chen, Li Lao jyoti@engr.uconn.edu, jcui@cse.uconn.edu, sgchen@cise.ufl.edu, llao@cs.ucla.edu

More information

CHAPTER 7 SUMMARY AND CONCLUSION

CHAPTER 7 SUMMARY AND CONCLUSION 179 CHAPTER 7 SUMMARY AND CONCLUSION This chapter summarizes our research achievements and conclude this thesis with discussions and interesting avenues for future exploration. The thesis describes a novel

More information

OSPF Version 2 (RFC 2328) Describes Autonomous Systems (AS) topology. Propagated by flooding: Link State Advertisements (LSAs).

OSPF Version 2 (RFC 2328) Describes Autonomous Systems (AS) topology. Propagated by flooding: Link State Advertisements (LSAs). OSPF Version 2 (RFC 2328) Interior gateway protocol (IGP). Routers maintain link-state database. Describes Autonomous Systems (AS) topology. Propagated by flooding: Link State Advertisements (LSAs). Router

More information

Locality Based Protocol for MultiWriter Replication systems

Locality Based Protocol for MultiWriter Replication systems Locality Based Protocol for MultiWriter Replication systems Lei Gao Department of Computer Science The University of Texas at Austin lgao@cs.utexas.edu One of the challenging problems in building replication

More information

Mitigating Server Breaches with Secure Computation. Yehuda Lindell Bar-Ilan University and Dyadic Security

Mitigating Server Breaches with Secure Computation. Yehuda Lindell Bar-Ilan University and Dyadic Security Mitigating Server Breaches with Secure Computation Yehuda Lindell Bar-Ilan University and Dyadic Security The Problem Network and server breaches have become ubiquitous Financially-motivated and state-sponsored

More information

Managing and Maintaining Windows Server 2008 Servers

Managing and Maintaining Windows Server 2008 Servers Managing and Maintaining Windows Server 2008 Servers Course Number: 6430A Length: 5 Day(s) Certification Exam There are no exams associated with this course. Course Overview This five day instructor led

More information

FortiBalancer: Global Server Load Balancing WHITE PAPER

FortiBalancer: Global Server Load Balancing WHITE PAPER FortiBalancer: Global Server Load Balancing WHITE PAPER FORTINET FortiBalancer: Global Server Load Balancing PAGE 2 Introduction Scalability, high availability and performance are critical to the success

More information