Peer-to-Peer and Grid Computing. Chapter 4: Peer-to-Peer Storage
|
|
|
- Rhoda Randall
- 9 years ago
- Views:
Transcription
1 Peer-to-Peer and Grid Computing Chapter 4: Peer-to-Peer Storage
2 Chapter Outline Using DHTs to build more complex systems How DHT can help? What problems DHTs solve? What problems are left unsolved? P2P storage basics, with examples Splitting into blocks (CFS) Wide-scale replication (OceanStore) Modifiable filesystem with logs (Ivy) Future of P2P filesystems Kangasharju: Peer-to-Peer Networks 2
3 How to Use a DHT? Recall: DHT maps keys to values Applications based on DHTs must need this functionality Or: Must be designed in this way! Possible to design an application in several ways Keys and values are application specific For filesystem: Value = file For Value = message For distributed DB: Value = contents of entry, etc. Application stores values in DHT and uses them Simple, but a powerful tool Kangasharju: Peer-to-Peer Networks 3
4 Problems Solved by DHT DHT solves the problem of mapping keys to values in the distributed hash table Efficient storage and retrieval of values Efficient routing Robust against many failures Efficient in terms of network usage Provides hash table-like abstraction to application Kangasharju: Peer-to-Peer Networks 4
5 Problems NOT Solved by DHT Everything else except what is on previous slide In particular, the following problems Robustness No guarantees against big failures Threat models for DHTs not well-understood yet Availability Data not guaranteed to be available Only probabilistic guarantees (but possible to get high prob.) Consistency No support for consistency Data in DHT often highly replicated, consistency is a problem Version management No support for version management Might be possible to support this to some degree Kangasharju: Peer-to-Peer Networks 5
6 P2P FS: Introduction P2P filesystems (FS) or P2P storage systems were the first applications of DHTs Fundamental principle: Key = filename, Value = file contents Different kinds of systems Storage for read-only objects Read-write support Stand-alone storage systems Systems with links to standard filesystems (e.g., NFS) Kangasharju: Peer-to-Peer Networks 6
7 P2P FS: Current State Only examples of P2P filesystems come from research Research prototypes exist for many systems No wide-area deployment Experiments on research testbeds No examples of real deployment and real usage in wide-area After initial work, no recent advances? At least, not visible advances Three examples: Cooperative File System, CFS OceanStore Ivy Kangasharju: Peer-to-Peer Networks 7
8 P2P FS: Why? Why build P2P filesystems? Light-weight, reliable, wide-area storage At least in principle Distributed filesystems not widely deployed either Were studied already long time ago Gain experience with DHT and how DHTs could be used in real applications DHT abstraction is powerful, but it has limitations Understanding of the limitations is valuable Kangasharju: Peer-to-Peer Networks 8
9 P2P FS: Basic Techniques Three fundamental basic techniques for building distributed storage systems 1. Splitting files into blocks 2. Replicating files (or blocks!) 3. Using logs to allow modifications For now: Simple analysis of advantages and disadvantages and three examples Detailed performance analysis in Chapter 5 For blocks and replication Kangasharju: Peer-to-Peer Networks 9
10 Splitting Files into Blocks Why: Files are of different sizes and peers storing large files have to serve more data Pro: Dividing files into equal-sized blocks and storing blocks on different peers can achieve load balance If different files share blocks, we can save on storage Con: Instead of needing one peer online, all peers with all blocks must be online (see below) Need metadata about blocks to be stored somewhere Granularity tradeoff: Small blocks -> Good load balance, but lots of overhead and vice versa Kangasharju: Peer-to-Peer Networks 10
11 Replication Why: If file (or block) is stored only on one peer and that peer is offline, data is not available Replicating content to multiple peers significantly increases content availability Pro: High availability and reliability But only probabilistic guarantees Con: How to coordinate lots of replicas? Especially important if content can change Unreliable network requires high degree of replication for decent availability Wastes storage space Kangasharju: Peer-to-Peer Networks 11
12 Logs Why: If we want to change the stored files, we need to modify every stored replica Keep a log for every file (user, ) which gives information about the latest version Pro: Changes concentrated in one place Anyone can figure out what is the latest version Con: How to keep the log available? By replicating it? ;-) Kangasharju: Peer-to-Peer Networks 12
13 P2P FS: Overview Three examples of P2P filesystems CFS (blocks and replication) Basic, read-only system Based on Chord OceanStore (replication) Vision for a global storage system Based on Tapestry Ivy (logs) Read-write, provide NFS semantics Based on Chord Kangasharju: Peer-to-Peer Networks 13
14 CFS CFS = Cooperative File System Developed at MIT, by same people as Chord CFS based on the Chord DHT Read-only system, only 1 publisher CFS stores blocks instead of whole files Part of CFS is a generic block storage layer Features: Load balancing Performance equal to FTP Efficiency and fast recovery from failures Kangasharju: Peer-to-Peer Networks 14
15 CFS Properties Decentralized control Scalability (comes from Chord) Availability In absence of catastrophic failures, of course Load balance Load balanced according to peers capabilities Persistence Data will be stored for as long as agreed Quotas Possibility of per-user quotas Efficiency As fast as common FTP in wide area Kangasharju: Peer-to-Peer Networks 15
16 CFS: Layers, Clients, and Servers FS DHash DHash DHash Chord Chord Chord Client Server Server FS: provide filesystem API, interpret blocks as files DHash: Provides block storage Chord: DHT layer, slightly modified Clients access files, servers just provide storage Kangasharju: Peer-to-Peer Networks 16
17 Chord Layer in CFS Chord layer is slightly modified from basic Chord Instead of 1 successor, each node keeps track of r successors Finger tables as before Also, try to reduce lookup latency Nodes measure latencies to other nodes Report measured latencies to other nodes Kangasharju: Peer-to-Peer Networks 17
18 DHash Layer DHash stores blocks as opposed to whole files Better load balancing More network query traffic, but not really significant Each block replicated k times, with r k Two kinds of blocks: Content block, addressed by hash of contents Signed blocks (= root blocks), addressed by public key - Signed block is the root of the filesystem - One filesystem per publisher Blocks are cached in network Most of caching near the responsible node Blocks in local cache replaced according to least-recentlyused Consistency not a problem, blocks addressed by content - Root blocks different, may get old (but consistent) data Kangasharju: Peer-to-Peer Networks 18
19 DHash API DHash provides following API to clients: Put_h(block) Put_s(block, pubkey) Get(key) Store block Store or update signed block Fetch block associated with key Filesystem starts with root (= signed) block Block under publisher s public key and signed with private key For clients, read-only, publisher can modify filesystem by inserting a new root block Root block has pointers to other blocks - Either piece of a file or filesystem metadata (e.g., directory) Data stored for an agreed-upon finite interval Kangasharju: Peer-to-Peer Networks 19
20 CFS Filesystem Structure Public key Root block H(D) D H(F) F H(B1) B1 H(B2) Signature B2 Root block identified by publisher s public key Each publisher has its own filesystem Different publishers are on separate filesystems Other blocks identified based on hash of contents Other blocks can be metadata or pieces of file Kangasharju: Peer-to-Peer Networks 20
21 Load Balancing and Quotas Different servers have different capabilities One real server can run several virtual servers Number of virtual servers depends on the capabilities Big servers run more virtual servers CFS operates at virtual server level Virtual nodes on a single real node know each other Possible to use short cuts in routing Quotas can be used to limit storage for each node Quotas set on a per-ip address basis Better quotas require central administration Some systems implement better quotas, e.g., PAST Kangasharju: Peer-to-Peer Networks 21
22 Updates in CFS Only publisher can change data in filesystem CFS will store any block under its content hash Highly unlikely to find two blocks with same SHA-1 hash No explicit protection for content blocks needed Root block is signed by publisher Publisher must keep private key secret No explicit delete operation Data stored only for agreed-upon period Publisher must refresh periodically if persistence is needed Kangasharju: Peer-to-Peer Networks 22
23 OceanStore OceanStore developed at UC Berkeley Runs on Tapestry DHT Supports object modification Vision of ubiquitous computing: Intelligent devices, transparently in the environment Highly dynamic, untrusted environment Question: Where does persistent information reside? OceanStore aims to be the answer OceanStore s target: users, each with files, i.e., files total Kangasharju: Peer-to-Peer Networks 23
24 OceanStore: Basics and Goals Users pay for the storage service Several companies can provide services together Two goals: 1. Untrusted infrastructure Everything is encrypted, infrastructure unreliable However, assume that most servers are working correctly most of the time One class of servers trusted to follow protocol (but not trusted with data) 2. Nomadic data Anytime, anywhere Introspection used to tune system at run-time Kangasharju: Peer-to-Peer Networks 24
25 OceanStore Applications OceanStore suitable for many kinds of applications Storage and sharing of large amounts of data Data follows users dynamically Groupware applications Concurrent updates from many people For example, calendars, contact lists, etc. In particular, and other communication applications Streaming applications Also for sensor networks and dissemination Here we concentrate on storage Kangasharju: Peer-to-Peer Networks 25
26 System Overview Each object has globally unique identifier (GUID) Objects replicated and migrated on the fly Replicas located in two ways Probabilistic, fast algorithm tried first Slower, but deterministic algorithm used if first one fails Objects exist in two forms, active and archival Active is the latest version with handle for updates Archival is a permanent, read-only version - Archive versions encoded with erasure codes with lot of redundancy - Only a global disaster can make data disappear Kangasharju: Peer-to-Peer Networks 26
27 Naming and Access Control Object naming Self-certifying object names Hash of the object s owners key and human readable name Allows directories System has no root, but user can select her own root(s) Access control Restricting readers All data is encrypted, can restrict readers by not giving key Revoke read permission by re-encrypting everything Restricting writers All writes must be signed and compared against ACL Kangasharju: Peer-to-Peer Networks 27
28 Locating Objects OceanStore uses two mechanisms for locating objects 1. Probabilistic algorithm Frequently accessed objects likely to be nearby and easily found This algorithm finds them fast Uses attenuated Bloom filters See below for more details 2. Deterministic algorithm OceanStore based on Tapestry Deterministic routing is Tapestry s routing Guaranteed to find the object See Chapter 3 for the details Kangasharju: Peer-to-Peer Networks 28
29 Sidenote: Bloom Filters Bloom filters are a space-efficient probabilistic data structure that is used to test whether or not an element is a member of a set False positives are possible False negatives are NOT possible Bloom filter is an array of k bits Also need m different hash functions, each maps key to a bit To insert, calculate all m hash functions and set bits to 1 To check, calculate all m hash functions and if all bits are 1, key is probably in the set If any bit is 0, then it is definitely not in Kangasharju: Peer-to-Peer Networks 29
30 Sidenote: Bloom Filters To insert or check for an item, Bloom filters take average O(m) time Fixed constant! Independent of number of entries No other data structure allows for this (hash table is close) Attenuated Bloom filter of depth D is same as an array of D normal Bloom filters First filter is for locally stored objects at current node The i th Bloom filter is the union of all Bloom filters at distance i through any path from current node Attenuated Bloom filter for each network edge Queries routed on the edge where the distance to object is shortest Kangasharju: Peer-to-Peer Networks 30
31 Probabilistic Query Process Node Local filter Neighbor filter n 3 X n 1 n n Node n 1 wants to find object X, X hashes to bits 0, 1, 3 Node n 1 does not have 0, 1, and 3 in local filter Neighbor filter for n 2 has them, forward query to n 2 Node n 2 does not have them in local filter Filter for neighbor n 3 has them, forward to n 3 Node n 3 has object Kangasharju: Peer-to-Peer Networks 31
32 Update Model Update model in OceanStore based on conflict resolution Update semantics Each update has a list of predicates with associated actions Predicates evaluated in order Actions for first true predicate are atomically applied (commit) If no predicates are true, action aborts Update is logged in both cases Kangasharju: Peer-to-Peer Networks 32
33 Update Model: Predicates List of predicates is short Untrusted environment limits what predicates can do Available predicates: Compare-version (metadata comparison, easy) Compare-size (same as above) Compare-block (easy if encryption is position-dependent block cipher) Search (possible to search ciphertext, get boolean result) Kangasharju: Peer-to-Peer Networks 33
34 Available Operations Four operations available Replace-block Insert-block Delete-block Append If position-dependent cipher, Replace-block and Append are easy operations For Insert-block and Delete-block: Two kinds of blocks: Index and data blocks Index blocks can contain pointers to other blocks To insert a block, we replace old block with an index block which points to old block and new block - Actual blocks are appended to object May be susceptible to traffic analysis Kangasharju: Peer-to-Peer Networks 34
35 Serializing Updates Replicas divided into two tiers Primary tier is trusted to follow protocol Secondary tier is everyone else Primary tier cooperate in a Byzantine agreement protocol Secondary tier communicates with primary tier and secondary via epidemic algorithms Reason for two tiers: Fault-tolerant protocols possible with only a small number of replicas, protocols communication-intensive Primary tier is well-connected and small Kangasharju: Peer-to-Peer Networks 35
36 Byzantine Generals Problem Several divisions of the Byzantine army surround an enemy city. Each division is commanded by a general. The generals communicate only through messenger Need to arrive at a common plan after observing the enemy Some of the generals may be traitors Traitors can send false messages Required: An algorithm to guarantee that 1. All loyal generals decide upon the same plan of action, irrespective of what the traitors do. 2. A small number of traitors cannot cause the loyal generals to adopt a bad plan. Solution: (from L. Lamport) If no more than m generals out of n = 3m + 1 are traitors, everybody will follow the orders Kangasharju: Peer-to-Peer Networks 36
37 Path of an Update Update is sent to primary tier and to random replicas in secondary tier for that object Kangasharju: Peer-to-Peer Networks 37
38 Path of an Update Primary tier performs Byzantine agreement Secondary tier propagates update epidemically Kangasharju: Peer-to-Peer Networks 38
39 Path of an Update When primary tier has finished agreement protocol, the update is sent over multicast to all secondary replicas Kangasharju: Peer-to-Peer Networks 39
40 Deep Archival Storage Archival mechanism uses erasure codes Reed-Solomon, Tornado, etc. Generate redundant data fragments Created by the primary tier If there are enough fragments and they are spread widely, then it is likely that we can retrieve the data Archival copies are created when objects are changed Every version is archived Can be tuned to be done less frequently Kangasharju: Peer-to-Peer Networks 40
41 Ivy Ivy developed at MIT Based on Chord Provides NFS-like semantics At least for fully connected networks Any user can modify any file Ivy handles everything through logs Ivy presents a conventional filesystem interface Kangasharju: Peer-to-Peer Networks 41
42 Problems for Distributed Read/Write 1. Multiple distributed writers make it difficult to maintain consistent metadata 2. Unreliable participants make locking unattractive Locking could help maintain consistency 3. Participants cannot be trusted Machines may be compromised Need to be able to undo 4. Distributed filesystem may become partitioned System must remain operational during partitions Help applications repair conflicting updates made during partitions Kangasharju: Peer-to-Peer Networks 42
43 Solution: Logs Each participant maintains log of its changes Logs maintain filesystem data and metadata Participant has private snapshot of logs of others Logs stored in DHash (see under CFS for details) Participant writes to its own log, reads all others Log-head points to most recent log record Version vectors impose order on log records from multiple logs View block Log head : Log records Kangasharju: Peer-to-Peer Networks 43
44 Ivy: Views Each user writing to a filesystem has its own log Participants agree on a view View is a set of logs that comprise the filesystem View block is immutable ( root ) View block has log heads for all participants Kangasharju: Peer-to-Peer Networks 44
45 Combining Logs How to determine order of modifications from logs? Order should obey causality All participants should agree on the order Each new log record has a sequence number and a version vector Sequence number increasing (managed locally) Version vector has sequence numbers from other logs in view Version vector summarizes knowledge about log Log records ordered by comparing version vectors Vectors u and v comparable if u < v, v < u, or v = u Otherwise concurrent Simultaneous operations result in equal or concurrent vectors Ordered by public keys of participants May need special actions to return to consistency (overlapping modifications) Kangasharju: Peer-to-Peer Networks 45
46 Ivy: Snapshots Private snapshots avoid traversing whole filesystem Snapshot contains the entire state Each participant has own snapshot Contents of snapshots mostly the same for all participants DHash will store them only once To create snapshot, node: Gets all logs recent than current snapshot Write new snapshot New user must either build from scratch or take a trusted snapshot Kangasharju: Peer-to-Peer Networks 46
47 P2P FS: General Problems Built on top of unreliable nodes and network How to achieve reliability? Replication gives reliability (see Chapter 5) Replication makes maintaining consistency difficult Nodes cannot be trusted in the general case Must encrypt all data Hard to do content-based conflict resolution (e.g., diff) Performance Distributed filesystems have much lower performance Kangasharju: Peer-to-Peer Networks 47
48 P2P FS: Future What does future hold for P2P filesystems? What is right area of application? Intranet? Trusted environment, high bandwidth Possibly easy to deploy? - Need to make a product first? Global Internet? Lots of untrusted peers, widely distributed All the problems from above Can take off as a hobby project? Nowhere? P2P filesystems are a total waste of time? Kangasharju: Peer-to-Peer Networks 48
49 P2P FS: Future Do we need a P2P FS to build useful applications? Yes, they allow efficient distributed storage Working storage system is the basic building block of a useful application No, DHT is enough Several examples of P2P applications directly on top of a DHT Fully reliable P2P FS would make network into one virtual computer Modulo performance issues Building P2P apps would be like building normal apps Kangasharju: Peer-to-Peer Networks 49
50 P2P FS: Real-World Examples Microsoft has done lot of research in this area Even built prototypes Why no products on the market? Below some wild speculation P2P FS would compete with traditional file servers? P2P needs to be built in to the OS, would kill server market? Impossible to build a good enough P2P FS? Theoretically doable, but too slow and weak in practice? P2P filesystem can be done, but has no advantages? Possibly to build a useful system, but it costs as much as a server-based system? Kangasharju: Peer-to-Peer Networks 50
51 Chapter Summary How to build applications with DHTs Basic technologies of distributed storage As examples, 3 P2P filesystems CFS - Read-only OceanStore Ivy - Modifications allowed, global vision - NFS-like semantics, traditional filesystem interface Discussion about the future of P2P filesystems Kangasharju: Peer-to-Peer Networks 51
P2P Storage Systems. Prof. Chun-Hsin Wu Dept. Computer Science & Info. Eng. National University of Kaohsiung
P2P Storage Systems Prof. Chun-Hsin Wu Dept. Computer Science & Info. Eng. National University of Kaohsiung Outline Introduction Distributed file systems P2P file-swapping systems P2P storage systems Strengths
OceanStore: An Architecture for Global-Scale Persistent Storage
OceanStore: An Architecture for Global-Scale Persistent Storage John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patrick Eaton, Dennis Geels, Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon,
Peer-to-Peer Networks. Chapter 6: P2P Content Distribution
Peer-to-Peer Networks Chapter 6: P2P Content Distribution Chapter Outline Content distribution overview Why P2P content distribution? Network coding Peer-to-peer multicast Kangasharju: Peer-to-Peer Networks
Acknowledgements. Peer to Peer File Storage Systems. Target Uses. P2P File Systems CS 699. Serving data with inexpensive hosts:
Acknowledgements Peer to Peer File Storage Systems CS 699 Some of the followings slides are borrowed from a talk by Robert Morris (MIT) 1 2 P2P File Systems Target Uses File Sharing is one of the most
COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters
COSC 6374 Parallel Computation Parallel I/O (I) I/O basics Spring 2008 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network
Adapting Distributed Hash Tables for Mobile Ad Hoc Networks
University of Tübingen Chair for Computer Networks and Internet Adapting Distributed Hash Tables for Mobile Ad Hoc Networks Tobias Heer, Stefan Götz, Simon Rieche, Klaus Wehrle Protocol Engineering and
DFSgc. Distributed File System for Multipurpose Grid Applications and Cloud Computing
DFSgc Distributed File System for Multipurpose Grid Applications and Cloud Computing Introduction to DFSgc. Motivation: Grid Computing currently needs support for managing huge quantities of storage. Lacks
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications White Paper Table of Contents Overview...3 Replication Types Supported...3 Set-up &
Peer-to-peer Cooperative Backup System
Peer-to-peer Cooperative Backup System Sameh Elnikety Mark Lillibridge Mike Burrows Rice University Compaq SRC Microsoft Research Abstract This paper presents the design and implementation of a novel backup
Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
Tornado: A Capability-Aware Peer-to-Peer Storage Network
Tornado: A Capability-Aware Peer-to-Peer Storage Network Hung-Chang Hsiao [email protected] Chung-Ta King* [email protected] Department of Computer Science National Tsing Hua University Hsinchu,
Diagram 1: Islands of storage across a digital broadcast workflow
XOR MEDIA CLOUD AQUA Big Data and Traditional Storage The era of big data imposes new challenges on the storage technology industry. As companies accumulate massive amounts of data from video, sound, database,
A block based storage model for remote online backups in a trust no one environment
A block based storage model for remote online backups in a trust no one environment http://www.duplicati.com/ Kenneth Skovhede (author, [email protected]) René Stach (editor, [email protected]) Abstract
F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013
F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords
SwanLink: Mobile P2P Environment for Graphical Content Management System
SwanLink: Mobile P2P Environment for Graphical Content Management System Popovic, Jovan; Bosnjakovic, Andrija; Minic, Predrag; Korolija, Nenad; and Milutinovic, Veljko Abstract This document describes
Sync Security and Privacy Brief
Introduction Security and privacy are two of the leading issues for users when transferring important files. Keeping data on-premises makes business and IT leaders feel more secure, but comes with technical
PEER TO PEER CLOUD FILE STORAGE ---- OPTIMIZATION OF CHORD AND DHASH. COEN283 Term Project Group 1 Name: Ang Cheng Tiong, Qiong Liu
PEER TO PEER CLOUD FILE STORAGE ---- OPTIMIZATION OF CHORD AND DHASH COEN283 Term Project Group 1 Name: Ang Cheng Tiong, Qiong Liu 1 Abstract CHORD/DHash is a very useful algorithm for uploading data and
Designing a Cloud Storage System
Designing a Cloud Storage System End to End Cloud Storage When designing a cloud storage system, there is value in decoupling the system s archival capacity (its ability to persistently store large volumes
Scaling Distributed Database Management Systems by using a Grid-based Storage Service
Scaling Distributed Database Management Systems by using a Grid-based Storage Service Master Thesis Silviu-Marius Moldovan [email protected] Supervisors: Gabriel Antoniu, Luc Bougé {Gabriel.Antoniu,Luc.Bouge}@irisa.fr
Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election
Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)
1 1 Distributed Systems What are distributed systems? How would you characterize them? Components of the system are located at networked computers Cooperate to provide some service No shared memory Communication
Software to Simplify and Share SAN Storage Sanbolic s SAN Storage Enhancing Software Portfolio
Software to Simplify and Share SAN Storage Sanbolic s SAN Storage Enhancing Software Portfolio www.sanbolic.com Table of Contents About Sanbolic... 3 Melio File System... 3 LaScala Volume Manager... 3
Security in Structured P2P Systems
P2P Systems, Security and Overlays Presented by Vishal thanks to Dan Rubenstein Columbia University 1 Security in Structured P2P Systems Structured Systems assume all nodes behave Position themselves in
Cloud Computing at Google. Architecture
Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale
Cloud Based Application Architectures using Smart Computing
Cloud Based Application Architectures using Smart Computing How to Use this Guide Joyent Smart Technology represents a sophisticated evolution in cloud computing infrastructure. Most cloud computing products
Approximate Object Location and Spam Filtering on Peer-to-Peer Systems
Approximate Object Location and Spam Filtering on Peer-to-Peer Systems Feng Zhou, Li Zhuang, Ben Y. Zhao, Ling Huang, Anthony D. Joseph and John D. Kubiatowicz University of California, Berkeley The Problem
An Introduction to Peer-to-Peer Networks
An Introduction to Peer-to-Peer Networks Presentation for MIE456 - Information Systems Infrastructure II Vinod Muthusamy October 30, 2003 Agenda Overview of P2P Characteristics Benefits Unstructured P2P
ZooKeeper. Table of contents
by Table of contents 1 ZooKeeper: A Distributed Coordination Service for Distributed Applications... 2 1.1 Design Goals...2 1.2 Data model and the hierarchical namespace...3 1.3 Nodes and ephemeral nodes...
A Novel Data Placement Model for Highly-Available Storage Systems
A Novel Data Placement Model for Highly-Available Storage Systems Rama, Microsoft Research joint work with John MacCormick, Nick Murphy, Kunal Talwar, Udi Wieder, Junfeng Yang, and Lidong Zhou Introduction
Magnus: Peer to Peer Backup System
Magnus: Peer to Peer Backup System Naveen Gattu, Richard Huang, John Lynn, Huaxia Xia Department of Computer Science University of California, San Diego Abstract Magnus is a peer-to-peer backup system
A Brief Analysis on Architecture and Reliability of Cloud Based Data Storage
Volume 2, No.4, July August 2013 International Journal of Information Systems and Computer Sciences ISSN 2319 7595 Tejaswini S L Jayanthy et al., Available International Online Journal at http://warse.org/pdfs/ijiscs03242013.pdf
Cassandra A Decentralized, Structured Storage System
Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications of the ACM http://dl.acm.org/citation.cfm?id=1773922
InfiniteGraph: The Distributed Graph Database
A Performance and Distributed Performance Benchmark of InfiniteGraph and a Leading Open Source Graph Database Using Synthetic Data Objectivity, Inc. 640 West California Ave. Suite 240 Sunnyvale, CA 94086
COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters
COSC 6374 Parallel I/O (I) I/O basics Fall 2012 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network card 1 Network card
Considerations when Choosing a Backup System for AFS
Considerations when Choosing a Backup System for AFS By Kristen J. Webb President and CTO Teradactyl LLC. October 21, 2005 The Andrew File System has a proven track record as a scalable and secure network
Cloud Storage and Peer-to-Peer Storage End-user considerations and product overview
Cloud Storage and Peer-to-Peer Storage End-user considerations and product overview Project : Gigaport3 Project year : 2010 Coordination : Rogier Spoor Author(s) : Arjan Peddemors Due date : Q2 2010 Version
CS5412: TIER 2 OVERLAYS
1 CS5412: TIER 2 OVERLAYS Lecture VI Ken Birman Recap 2 A week ago we discussed RON and Chord: typical examples of P2P network tools popular in the cloud Then we shifted attention and peeked into the data
Chord - A Distributed Hash Table
Kurt Tutschku Vertretung - Professur Rechnernetze und verteilte Systeme Chord - A Distributed Hash Table Outline Lookup problem in Peer-to-Peer systems and Solutions Chord Algorithm Consistent Hashing
Architectures and protocols in Peer-to-Peer networks
Architectures and protocols in Peer-to-Peer networks Ing. Michele Amoretti [[email protected]] II INFN SECURITY WORKSHOP Parma 24-25 February 2004 Contents - Definition of Peer-to-Peer network - P2P
Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved.
Object Storage: A Growing Opportunity for Service Providers Prepared for: White Paper 2012 Neovise, LLC. All Rights Reserved. Introduction For service providers, the rise of cloud computing is both a threat
Deduplication has been around for several
Demystifying Deduplication By Joe Colucci Kay Benaroch Deduplication holds the promise of efficient storage and bandwidth utilization, accelerated backup and recovery, reduced costs, and more. Understanding
Distributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
Distributed File Systems. Chapter 10
Distributed File Systems Chapter 10 Distributed File System a) A distributed file system is a file system that resides on different machines, but offers an integrated view of data stored on remote disks.
Distributed Computing over Communication Networks: Topology. (with an excursion to P2P)
Distributed Computing over Communication Networks: Topology (with an excursion to P2P) Some administrative comments... There will be a Skript for this part of the lecture. (Same as slides, except for today...
A Survey and Comparison of Peer-to-Peer Overlay Network Schemes
% " #$! IEEE COMMUNICATIONS SURVEY AND TUTORIAL, MARCH 2004 1 A Survey and Comparison of Peer-to-Peer Overlay Network Schemes Eng Keong Lua, Jon Crowcroft, Marcelo Pias, Ravi Sharma and Steven Lim Abstract
How To Create A P2P Network
Peer-to-peer systems INF 5040 autumn 2007 lecturer: Roman Vitenberg INF5040, Frank Eliassen & Roman Vitenberg 1 Motivation for peer-to-peer Inherent restrictions of the standard client/server model Centralised
iservdb The database closest to you IDEAS Institute
iservdb The database closest to you IDEAS Institute 1 Overview 2 Long-term Anticipation iservdb is a relational database SQL compliance and a general purpose database Data is reliable and consistency iservdb
How to Choose Between Hadoop, NoSQL and RDBMS
How to Choose Between Hadoop, NoSQL and RDBMS Keywords: Jean-Pierre Dijcks Oracle Redwood City, CA, USA Big Data, Hadoop, NoSQL Database, Relational Database, SQL, Security, Performance Introduction A
Survey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in Cloud Prof. Pranalini S. Ketkar Ankita Bhimrao Patkure IT Department, DCOER, PG Scholar, Computer Department DCOER, Pune University Pune university
A Deduplication-based Data Archiving System
2012 International Conference on Image, Vision and Computing (ICIVC 2012) IPCSIT vol. 50 (2012) (2012) IACSIT Press, Singapore DOI: 10.7763/IPCSIT.2012.V50.20 A Deduplication-based Data Archiving System
A PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM
A PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM Dr.S. DHANALAKSHMI 1, R. ANUPRIYA 2 1 Prof & Head, 2 Research Scholar Computer Science and Applications, Vivekanandha College of Arts and Sciences
Dr Markus Hagenbuchner [email protected] CSCI319. Distributed Systems
Dr Markus Hagenbuchner [email protected] CSCI319 Distributed Systems CSCI319 Chapter 8 Page: 1 of 61 Fault Tolerance Study objectives: Understand the role of fault tolerance in Distributed Systems. Know
Increasing Storage Performance, Reducing Cost and Simplifying Management for VDI Deployments
Increasing Storage Performance, Reducing Cost and Simplifying Management for VDI Deployments Table of Contents Introduction.......................................3 Benefits of VDI.....................................4
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 Load Balancing Heterogeneous Request in DHT-based P2P Systems Mrs. Yogita A. Dalvi Dr. R. Shankar Mr. Atesh
An Overview of Distributed Databases
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 2 (2014), pp. 207-214 International Research Publications House http://www. irphouse.com /ijict.htm An Overview
DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2
DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing Slide 1 Slide 3 A style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.
Understanding Disk Storage in Tivoli Storage Manager
Understanding Disk Storage in Tivoli Storage Manager Dave Cannon Tivoli Storage Manager Architect Oxford University TSM Symposium September 2005 Disclaimer Unless otherwise noted, functions and behavior
ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective
ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 1: Distributed File Systems Finding a needle in Haystack: Facebook
Database Replication with Oracle 11g and MS SQL Server 2008
Database Replication with Oracle 11g and MS SQL Server 2008 Flavio Bolfing Software and Systems University of Applied Sciences Chur, Switzerland www.hsr.ch/mse Abstract Database replication is used widely
Distributed File Systems
Distributed File Systems Mauro Fruet University of Trento - Italy 2011/12/19 Mauro Fruet (UniTN) Distributed File Systems 2011/12/19 1 / 39 Outline 1 Distributed File Systems 2 The Google File System (GFS)
Agenda. Distributed System Structures. Why Distributed Systems? Motivation
Agenda Distributed System Structures CSCI 444/544 Operating Systems Fall 2008 Motivation Network structure Fundamental network services Sockets and ports Client/server model Remote Procedure Call (RPC)
Data Management in the Cloud
Data Management in the Cloud Ryan Stern [email protected] : Advanced Topics in Distributed Systems Department of Computer Science Colorado State University Outline Today Microsoft Cloud SQL Server
Service Overview CloudCare Online Backup
Service Overview CloudCare Online Backup CloudCare s Online Backup service is a secure, fully automated set and forget solution, powered by Attix5, and is ideal for organisations with limited in-house
The Google File System
The Google File System Motivations of NFS NFS (Network File System) Allow to access files in other systems as local files Actually a network protocol (initially only one server) Simple and fast server
File Management. Chapter 12
Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution
Distribution transparency. Degree of transparency. Openness of distributed systems
Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science [email protected] Chapter 01: Version: August 27, 2012 1 / 28 Distributed System: Definition A distributed
Online Transaction Processing in SQL Server 2008
Online Transaction Processing in SQL Server 2008 White Paper Published: August 2007 Updated: July 2008 Summary: Microsoft SQL Server 2008 provides a database platform that is optimized for today s applications,
CHAPTER 6: DISTRIBUTED FILE SYSTEMS
CHAPTER 6: DISTRIBUTED FILE SYSTEMS Chapter outline DFS design and implementation issues: system structure, access, and sharing semantics Transaction and concurrency control: serializability and concurrency
Operating System Concepts. Operating System 資 訊 工 程 學 系 袁 賢 銘 老 師
Lecture 7: Distributed Operating Systems A Distributed System 7.2 Resource sharing Motivation sharing and printing files at remote sites processing information in a distributed database using remote specialized
Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family
Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family White Paper June, 2008 Legal INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL
How to Choose your Red Hat Enterprise Linux Filesystem
How to Choose your Red Hat Enterprise Linux Filesystem EXECUTIVE SUMMARY Choosing the Red Hat Enterprise Linux filesystem that is appropriate for your application is often a non-trivial decision due to
Distributed Data Management
Introduction Distributed Data Management Involves the distribution of data and work among more than one machine in the network. Distributed computing is more broad than canonical client/server, in that
Veeam Cloud Connect. Version 8.0. Administrator Guide
Veeam Cloud Connect Version 8.0 Administrator Guide April, 2015 2015 Veeam Software. All rights reserved. All trademarks are the property of their respective owners. No part of this publication may be
Real World Enterprise SQL Server Replication Implementations. Presented by Kun Lee [email protected]
Real World Enterprise SQL Server Replication Implementations Presented by Kun Lee [email protected] About Me DBA Manager @ CoStar Group, Inc. MSSQLTip.com Author (http://www.mssqltips.com/sqlserverauthor/15/kunlee/)
Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.
Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one
Considerations when Choosing a Backup System for AFS
Considerations when Choosing a Backup System for AFS By Kristen J. Webb President and CTO Teradactyl LLC. June 18, 2005 The Andrew File System has a proven track record as a scalable and secure network
Big Data With Hadoop
With Saurabh Singh [email protected] The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
Network File System (NFS) Pradipta De [email protected]
Network File System (NFS) Pradipta De [email protected] Today s Topic Network File System Type of Distributed file system NFS protocol NFS cache consistency issue CSE506: Ext Filesystem 2 NFS
Solaris For The Modern Data Center. Taking Advantage of Solaris 11 Features
Solaris For The Modern Data Center Taking Advantage of Solaris 11 Features JANUARY 2013 Contents Introduction... 2 Patching and Maintenance... 2 IPS Packages... 2 Boot Environments... 2 Fast Reboot...
Chapter 11: File System Implementation. Operating System Concepts with Java 8 th Edition
Chapter 11: File System Implementation 11.1 Silberschatz, Galvin and Gagne 2009 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation
Hierarchy storage in Tachyon. [email protected], [email protected], [email protected]
Hierarchy storage in Tachyon [email protected], [email protected], [email protected] Hierarchy storage in Tachyon... 1 Introduction... 1 Design consideration... 2 Feature overview... 2 Usage design...
Using Peer to Peer Dynamic Querying in Grid Information Services
Using Peer to Peer Dynamic Querying in Grid Information Services Domenico Talia and Paolo Trunfio DEIS University of Calabria HPC 2008 July 2, 2008 Cetraro, Italy Using P2P for Large scale Grid Information
Distributed File Systems
Distributed File Systems File Characteristics From Andrew File System work: most files are small transfer files rather than disk blocks? reading more common than writing most access is sequential most
FAQs for Oracle iplanet Proxy Server 4.0
FAQs for Oracle iplanet Proxy Server 4.0 Get answers to the questions most frequently asked about Oracle iplanet Proxy Server Q: What is Oracle iplanet Proxy Server (Java System Web Proxy Server)? A: Oracle
NFS File Sharing. Peter Lo. CP582 Peter Lo 2003 1
NFS File Sharing Peter Lo CP582 Peter Lo 2003 1 NFS File Sharing Summary Distinguish between: File transfer Entire file is copied to new location FTP Copy command File sharing Multiple users can access
Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com
Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com David Pallmann, Neudesic Windows Azure Windows Azure is the foundation of Microsoft s Cloud Platform It is an Operating
UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure
UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure Authors: A O Jaunsen, G S Dahiya, H A Eide, E Midttun Date: Dec 15, 2015 Summary Uninett Sigma2 provides High
ScaleArc idb Solution for SQL Server Deployments
ScaleArc idb Solution for SQL Server Deployments Objective This technology white paper describes the ScaleArc idb solution and outlines the benefits of scaling, load balancing, caching, SQL instrumentation
Physical Data Organization
Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor
CipherShare Features and Benefits
CipherShare s and CipherShare s and Security End-to-end Encryption Need-to-Know: Challenge / Response Authentication Transitive Trust Consistent Security Password and Key Recovery Temporary Application
Multicast vs. P2P for content distribution
Multicast vs. P2P for content distribution Abstract Many different service architectures, ranging from centralized client-server to fully distributed are available in today s world for Content Distribution
