PaxosLease: Diskless Paxos for Leases
|
|
|
- Claude Paul
- 9 years ago
- Views:
Transcription
1 PaxosLease: Diskless Paxos for Leases Marton Trencseni, Attila Gazso, Holger Reinhardt, Abstract This paper describes PaxosLease, a distributed algorithm for lease negotiation. PaxosLease is based on Paxos, but does not require disk writes or clock synchrony. PaxosLease is used for master lease negotation in the open-source Keyspace replicated key-value store. 1 Introduction In concurrent programming, locks are a basic primitive which processes use to synchronize access to a shared resource. In a system where locks are granted without an expiry time (and without a supervisor process), failure of the lock owner before it releases the lock may cause other processes to block. In a highly-available distributed system, one wishes to avoid single node failures to cause the entire system to block. Also, restarting the system in case of failure may be harder than restarting a multi-threaded program. Thus, in distributed systems, leases take the place of locks to avoid starvation. A lease is a lock with an expiry time. If the owner of the lock fails or becomes disconnected from the rest, its lease automatically expires and other nodes can acquire it. We assume the basic setup is as follows: the system consists of a set of proposers who follow the proposer s algorithm and a set of acceptors who follow the acceptor s algorithm. It is assumed the system is non-byzantine, i.e. the nodes do not cheat (are not hacked) by not following their algorithm. The number of acceptors is fixed and does not change. A naive, majority vote type algorithm can be given which correctly solves the distributed lease problem; correct meaning that the lease is held by no more than one node at any time. However, this simple algorithm will frequently block in the presence of many proposers, hence the need for a more sophisticated approach. The naive majority algorithm is as follows: proposers start local timeouts for T seconds and send requests to the acceptors for the lease with timespan T. The acceptors, upon receiving the request, start a timer for T seconds and sends an accept message to the proposer. After the timeout, the acceptors clear their state. If an acceptor receives a request but its state is not empty it either does not answer or sends a reject message. To make sure only one proposer has the lease at any time, the proposer must receive accept messages from a majority of acceptors; then it has the lease until its local timer expires. As discussed previously, it is possible (and likely), that with many proposers, noone will be able to get a majority and proposers will continually block each other. For example, with three proposers 1, 2 and 3 and three acceptors A, B and C, if 1
2 the distributed state is such that A accepts 1 s request, B 2 s and C 3 s, then no proposer has a majority. The system must wait until the timeouts expire and the acceptors discard their state at which point the proposers will try again. However, it is likely that the system will block again. The solution described in this paper is to follow the scheme of Paxos [1], and introduce prepare and propose phases, which avoids this kind of blocking altogether 1. Paxos solves the problem of replicated state-machines, where each node has a local copy of a state-machine, and they wish to reach consensus on the next state transition. Paxos is a majority based algorithm, meaning a majority of the nodes have to be up and communicating for consensus to be possible. Paxos deals with reaching consensus on a single state transition, so in practice several Paxos rounds are run in sequence to negotiate the sequence of state transitions [3]. In Paxos, acceptors record their state to disk before sending responses, which guarantees that once a value (state transition) is chosen, it is chosen at all later times; in other words, all state machines go through the same sequence of state transitions, whatever error conditions occur. Unlike earlier Paxos-based distributed lease algorithms such as Fatlease [5], PaxosLease does not make any time synchrony assumptions about local clocks of nodes (no global synchronization is required). Also, Fatlease runs Paxos rounds in succession for lease commands, whereas PaxosLease makes use of the temporal nature of leases and avoids this complication altogether for a simpler and elegant algorithm. PaxosLease is a natural specialization of Paxos. As in Paxos, it is assumed that the number of nodes is fixed (and node identities are globally known). PaxosLease deals with a special replicated state-machine of the form: node 1 has the lease no one has the lease node 2 has the lease... Figure 1: PaxosLease distributed state machine. To acquire the lease, PaxosLease proposer nodes propose the value: node i has the lease, and will automatically return to the no one has the lease state after the expiry time. Proposers may also extend their leases by proposing the node i has the lease value again before their previous lease expires, and may optionally release the lease before it expires. Similar to Paxos, PaxosLease intrinsically handles all relevant failure conditions: 1. nodes stop and restart 2. network splits 1 Another solution may be to let the system block, but introduce an undo mechanism which lets some proposers undo their proposals to let one proposer acquire the lease. 2
3 3. message loss and reordering 4. in-transit message delays 2 Definitions A PaxosLease cell consists of proposers and acceptors. We assume there are n acceptors and any number of proposers. In practice, nodes often act as proposers and acceptors, but this is an implementation issue and does not affect the discussion. Proposers send prepare request and propose request messages to acceptors, who reply with prepare response and propose response messages. The messages have the following structure: 1. prepare request = ballot number 2. prepare response = ballot number, answer, accepted proposal 3. propose request = ballot number, lease 4. propose response = ballot number, answer A ballot number and a lease together make up a proposal. A lease is made up of a proposer id (who wishes to become the lease owner) and a timespan T. Acceptors store the following state information: 1. highest ballot number promised: acceptors ignore messages whose ballot number is less than this 2. accepted proposal: the last proposal (ballot number and lease) accepted There is a globally known maximal lease time M. Proposers always acquire leases for timespans T < M. Ballot numbers are globally unique and monotonously increasing per proposer. In practice, this can is achieved by ballot numbers being made up of a proposer id field, a restart counter field and a run counter field (at the most significant end) specific to this run of the proposer. The restart counter is incremented each time the proposer starts up and is written to stable storage. PaxosLease preserves the lease-invariant: at any given time, there is no more than one proposer which holds the lease. 3 Basic algorithm This section describes the basic flow of the algorithm in terms of the proposer s and acceptor s algorithm. The proposers send prepare and propose requests, acceptors reply with prepare and propose responses. If all goes well, it takes two round-trip times for the proposer to acquire the lease. 1. A proposer wishes to acquire the lease for T < M seconds. It generates a ballot number [request.ballotnumber], and sends prepare request messages to a majority of acceptors. Proposer::Propose() state.ballotnumber = NextBallotNumber() request.type = PrepareRequest request.ballotnumber = state.ballotnumber Broadcast(request) 3
4 2. Acceptors, upon receiving a prepare request check whether the ballot number [request.ballotnumber] is higher than their local highest ballot number promised [state.highestpromised]. If it is lower, they either discard the message or send a prepare response with a reject answer. If it is equal or higher, the acceptor constructs a prepare response with an accept answer containing the currently accepted proposal[state.acceptedproposal], which may be empty. The acceptor sets its highest ballot number promised [state.highestpromised] equal to the ballot number of the message [request.ballotnumber] and sends the prepare response back to the proposer. Acceptor::OnPrepareRequest() if (request.ballotnumber < state.highestpromised) return state.highestpromised = request.ballotnumber response.type = PrepareRespose response.ballotnumber = request.ballotnumber response.acceptedproposal = state.acceptedproposal // may be empty Send(response) 3. The proposer examines the prepare responses from the acceptors. If a majority of acceptors responded with empty proposals, meaning they are open to accept new proposals, the proposer is free to propose itself as the lease owner for time T. It starts a timer which will expire in T seconds and sends propose requests containing the ballot number and the lease (its proposer id and T). Proposer::OnPrepareResponse() if (response.ballotnumber!= state.ballotnumber) return // some other proposal if (response.acceptedproposal == empty ) numopen++ if (numopen < majority) return state.timeout = T SetTimeout(state.timeout) request.type = ProposeRequest request.ballotnumber = state.ballotnumber request.proposal.proposerid = self.proposerid request.proposal.timeout = state.timeout Broadcast(request) Proposer::OnTimeout() state.ballotnumber = empty // set in Proposer::Propose() state.leaseowner = false // set in Proposer::OnProposeResponse() 4. Acceptors, upon receiving a propose request check whether the ballot number [request.ballotnumber] is higher than their highest ballot number promised [state.highestpromised]. If it is lower, they can either discard the message or send a propose response with a reject answer. If it is equal or higher, the acceptor accepts the proposal: if starts a timeout 4
5 which will expire in T seconds and sets its accepted proposal to the proposal received (if it has stored a previous proposal, it discards it). It then constructs a propose response with an accept answer containing the ballot number [request.ballotnumber]. After the timeout expires the acceptor resets its accepted proposal to empty. Acceptors never reset their highest ballot number promised, except when restarting. Acceptor::OnProposeRequest() if (request.ballotnumber < state.highestpromised) return state.acceptedproposal = request.proposal SetTimeout(state.acceptedProposal.timeout) response.type = ProposeResponse response.ballotnumber = request.ballotnumber Send(response) Acceptor::OnTimeout() state.acceptedproposal = empty 5. The proposer examines the propose responses messages. If a majority of acceptors responded accepting the proposal, the proposer has the lease until its local timeout (started in step 3) expires. The time at which the proposer receives the last message necessary for a majority is the time at which it acquired the lease and may switch its internal state to I have the lease. Proposer::OnProposeResponse() if (response.ballotnumber!= state.ballotnumber) return // some other proposal numaccepted++ if (numaccepted < majority) return state.leaseowner = true // I am the lease owner Note that acceptors do not store their state to stable storage. Upon restart, they start out in a blank state. To make sure restarting nodes do not violate the lease invariant, nodes wait M seconds before rejoining the network. M is a globally known maximal lease time known to all nodes, and proposers always acquire leases for T < M seconds. It is important to note that since all that is transmitted are timespans (relative times), only the proposer which has the lease knows it has the lease. It cannot tell other nodes it has the lease (similar to learn messages in classical Paxos), since other nodes cannot know how much time these learn messages spent in transit. Thus, only the proposer which has the lease knows it has the lease. All the other nodes know is that they do not have the lease. In other words, each proposer has two states regarding the lease: I do not have the lease and I don t know who has the lease and I have the lease. Of course, nodes may send out learn messages as hints which may be used in higher-level applications or heuristics, but this is beyond the scope of this paper. 5
6 start timer for T sec prepare requests prepare responses propose requests propose responses timer expired proposer start timer for T sec proposer has the lease // lease acquired: "I have the lease" X lease expired: "I don't have the lease" acceptor // X acceptor // X acceptor start timer for T sec // X timer expired state discarded Figure 2: Time flow of a proposer acquiring the lease. It is possible that a proposer cannot get a majority of acceptors to send favorable responses in steps 3. and 5. above. In this case the proposer may sleep for a while and then restart its algorithm at step 1. with a higher ballot number. 4 Proof of lease invariance First we give the reader an intuitive notion of why PaxosLease works. Figure 2. (above) is a pictorial explanation: the proposer starts its timer before sending out propose requests, acceptors only start their timer later, before sending out propose responses, thus, if a majority of acceptors stored the state and started timers, no other proposer will be able to get the lease before the proposer s timer expires. No two proposers will simultaneously believe themselves to be lease holders. More formally, PaxosLease guarantees that if proposer i receives propose accept messages from a majority of acceptors for its proposal with ballot number b and timespan T at time t now, then if it started its timer at time t start, no other proposer will receive a similar majority until t end = t start + T. Proof: Suppose proposer p with ballot number b acquired the lease. It received empty prepare responses of type accept from a majority of acceptors and started its timer at t start, received propose responses of type accept from a majority of acceptors at t acquire, and thus has the lease until t end = t start + T. Let A 1 be the majority of acceptors who replied with empty prepare responses to p s prepare requests and A 2 be a majority of acceptors who accepted p s proposal and sent prepare responses of type accept. Part 1. No other proposer q can hold the lease between t acquire and t end with ballot number b < b. In order to hold the lease, the proposer q must get a majority A 2 of acceptors to accept its lease. Let a be an acceptor who is in both A 2 and A 1 : since b < b, a must have first accepted q s proposal and then sent a prepare response to p. However, if a sent an empty prepare response to p its state must have been empty, its timer must have expired, thus q s timer also expired, thus q already lost the lease. There is no overlap between p and q s lease. Part 2. No other proposer q can hold the lease between t acquire and t end with ballot number b < b. In order to hold the lease, the proposer q must get a majority 6
7 A 1 of acceptors to send it empty prepare responses. Let a be an acceptor who is in both A 1 and A 2: since b < b, a must have first accepted p s proposal and then sent a prepare response to q. However, since a accepted p s proposal, if it sent an empty prepare response to q its state must have been empty, its timer must have expired, thus p s timer also expired, thus p already lost the lease. There is no overlap between p and q s lease. 5 Liveness Paxos-type algorithms such as PaxosLease contain the possibility of dynamical deadlock: two proposers can continually generate higher and higher ballot numbers, send prepare requests to acceptors, who continually increase their highest ballot number promised, and neither of the proposers can get acceptors to accept their proposals. In practice, this can be worked around by proposers waiting a small but random time before re-running their algorithm. The main advantage of Paxos-type algorithms is that static deadlocks, as described in the naive majority vote algorithm, are not possible because proposers can overwrite acceptor states and the algorithm guarantees that no majority is overwritten. 6 Extending leases In some situations, once a proposer has the lease, it is important that it can hold on to it beyond its original lease time. A typical case is when the lease identifies the master node in a distributed system and it is desirable that a node can hold on to its mastership for a long time. To accomodate this, only the proposer s algorithm needs to be modified. In step 3., if a majority responded either with empty proposals or its existing proposal whose lease has not yet expired, it can propose itself again as the lease holder. This will allow the proposer to extend its lease by O(T) time. The acceptor s algorithm does not change. 7 Releasing leases In the algorithm described so far, the proposers lease automatically expires after some time. In some situations, it is important to release the lease as soon as possible so that other nodes may acquire it. A typical case is a distributed process, where workers acquire a lease for a resource, do work on it, and then wish to release the lease as soon as possible so that other workers can acquire it. To accomodate this, any proposer may at any point in time send special release messages to acceptors which contain the ballot number of the lease they wish to release. Before sending out the release messages, the proposer switches its internal state from I have the lease to I do not have the lease. If an acceptor receives a release message, it checks whether its accepted ballot number matches. If if does, it discards its state; otherwise it does nothing. Proposers can also send out release messages to other proposers as hints, telling them that they may be able to acquire the lease now. 7
8 8 Leases for many resources The algorithm defines lease actions concerning a single resource R. In practice nodes may deal with many resources, such as leases in a distributed process. PaxosLease can be used by running independent instances of the algorithm for each resource by tagging each message, proposer and acceptor state with a resource identifier. A node acting as proposer and acceptor requires no more than 100 bytes of memory per PaxosLease instance, which means it can handle 10 million resource leases per gigabyte of main memory. This and the fact that PaxosLease does not require disk syncs or time synchronization means the algorithm may be used in a wide variety of scenarios for fine-grained locking. 9 Implementation PaxosLease is used for master lease negotiation in Scalien Software s Keyspace replicated key-value store. Since Keyspace is licensed under the open-source AGPL license, this reference implementation, which contains many practical optimizations, is freely available to all interested readers. The source code and binaries may be downloaded at 10 Genealogy Leslie Lamport invented Paxos in 1990, but it was only published in This paper, The Part-Time Parliament was greek to many readers, which led to a second paper, Paxos Made Simple. Paxos solves the distributed consensus problem by introducing prepare and propose phases and having acceptors write their state to stable storage before replying to messages. Multiple rounds of Paxos may be run sequentially to negotiate state transitions of a replicated state machine. Paxos was popularized by its use in Google s in-house distributed stack, described in Paxos Made Live - An Engineering Perspective and The Chubby Lock Service for Loosely-Coupled Distributed Systems. In Google s Chubby, multiple, sequential rounds of Paxos are used to reach consensus on the next write opereration on a replicated database, another way of looking at a replicated state machine. Fatlease, described in FaTLease: Scalable Fault-Tolerant Lease Negotiation with Paxos solves the same problem as PaxosLease, but it is more complicated as it mimicks the multiple Paxos rounds introduced by the mentioned Google papers, instead of simply expiring the state of acceptors, as in PaxosLease. Also, Fatlease requires the nodes to synchronize their clocks, making it unattractive for realworld use. PaxosLease was inspired by Fatlease and addresses the aforementioned weaknesses. References [1] L. Lamport, The Part-Time Parliament, ACM Transactions on Computer Systems 16, 2 (May 1998), [2] L. Lamport, Paxos Made Simple, ACM SIGACT News 32, 4 (Dec. 2001), [3] T. Chandra, R. Griesemer, J. Redstone, Paxos Made Live - An Engineering Perspective, PODC 07: 26th ACM Symposium on Principles of Distributed Computing 8
9 [4] M. Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems, OSDI 06: Seventh Symposium on Operating System Design and Implementation. [5] F. Hupfeld et al., FaTLease: Scalable Fault-Tolerant Lease Negotiation with Paxos, HPDC08, June 2327, 2008, Boston, Massachusetts, USA. [6] AGPL License. 9
Cheap Paxos. Leslie Lamport and Mike Massa. Appeared in The International Conference on Dependable Systems and Networks (DSN 2004 )
Cheap Paxos Leslie Lamport and Mike Massa Appeared in The International Conference on Dependable Systems and Networks (DSN 2004 ) Cheap Paxos Leslie Lamport and Mike Massa Microsoft Abstract Asynchronous
Brewer s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
Brewer s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services Seth Gilbert Nancy Lynch Abstract When designing distributed web services, there are three properties that
Name: 1. CS372H: Spring 2009 Final Exam
Name: 1 Instructions CS372H: Spring 2009 Final Exam This exam is closed book and notes with one exception: you may bring and refer to a 1-sided 8.5x11- inch piece of paper printed with a 10-point or larger
Vertical Paxos and Primary-Backup Replication
Vertical Paxos and Primary-Backup Replication Leslie Lamport, Dahlia Malkhi, Lidong Zhou Microsoft Research 9 February 2009 Abstract We introduce a class of Paxos algorithms called Vertical Paxos, in which
Chapter 10. Backup and Recovery
Chapter 10. Backup and Recovery Table of Contents Objectives... 1 Relationship to Other Units... 2 Introduction... 2 Context... 2 A Typical Recovery Problem... 3 Transaction Loggoing... 4 System Log...
ECE 358: Computer Networks. Homework #3. Chapter 5 and 6 Review Questions 1
ECE 358: Computer Networks Homework #3 Chapter 5 and 6 Review Questions 1 Chapter 5: The Link Layer P26. Let's consider the operation of a learning switch in the context of a network in which 6 nodes labeled
(Refer Slide Time: 00:01:16 min)
Digital Computer Organization Prof. P. K. Biswas Department of Electronic & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture No. # 04 CPU Design: Tirning & Control
Paxos Made Live - An Engineering Perspective
Paxos Made Live - An Engineering Perspective Tushar Chandra Robert Griesemer Joshua Redstone June 20, 2007 Abstract We describe our experience in building a fault-tolerant data-base using the Paxos consensus
CHAPTER 11: Flip Flops
CHAPTER 11: Flip Flops In this chapter, you will be building the part of the circuit that controls the command sequencing. The required circuit must operate the counter and the memory chip. When the teach
Linux Process Scheduling Policy
Lecture Overview Introduction to Linux process scheduling Policy versus algorithm Linux overall process scheduling objectives Timesharing Dynamic priority Favor I/O-bound process Linux scheduling algorithm
Homework 1 (Time, Synchronization and Global State) - 100 Points
Homework 1 (Time, Synchronization and Global State) - 100 Points CS25 Distributed Systems, Fall 2009, Instructor: Klara Nahrstedt Out: Thursday, September 3, Due Date: Thursday, September 17 Instructions:
Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification
Introduction Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification Advanced Topics in Software Engineering 1 Concurrent Programs Characterized by
Data Management for Internet-Scale Single-Sign-On
Data Management for Internet-Scale Single-Sign-On Sharon E. Perl Google Inc. Margo Seltzer Harvard University & Oracle Corporation Abstract Google offers a variety of Internet services that require user
Advanced Computer Networks Project 2: File Transfer Application
1 Overview Advanced Computer Networks Project 2: File Transfer Application Assigned: April 25, 2014 Due: May 30, 2014 In this assignment, you will implement a file transfer application. The application
Synchronization in. Distributed Systems. Cooperation and Coordination in. Distributed Systems. Kinds of Synchronization.
Cooperation and Coordination in Distributed Systems Communication Mechanisms for the communication between processes Naming for searching communication partners Synchronization in Distributed Systems But...
Victor Shoup Avi Rubin. fshoup,[email protected]. Abstract
Session Key Distribution Using Smart Cards Victor Shoup Avi Rubin Bellcore, 445 South St., Morristown, NJ 07960 fshoup,[email protected] Abstract In this paper, we investigate a method by which smart
How To Understand The Power Of Zab In Zookeeper
Zab: High-performance broadcast for primary-backup systems Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini Yahoo! Research {fpj,breed,serafini}@yahoo-inc.com Abstract Zab is a crash-recovery
Transactional Support for SDN Control Planes "
Transactional Support for SDN Control Planes Petr Kuznetsov Telecom ParisTech WTTM, 2015 Software Defined Networking An emerging paradigm in computer network management Separate forwarding hardware (data
Distributed Systems. Tutorial 12 Cassandra
Distributed Systems Tutorial 12 Cassandra written by Alex Libov Based on FOSDEM 2010 presentation winter semester, 2013-2014 Cassandra In Greek mythology, Cassandra had the power of prophecy and the curse
1 Organization of Operating Systems
COMP 730 (242) Class Notes Section 10: Organization of Operating Systems 1 Organization of Operating Systems We have studied in detail the organization of Xinu. Naturally, this organization is far from
Transport Layer Protocols
Transport Layer Protocols Version. Transport layer performs two main tasks for the application layer by using the network layer. It provides end to end communication between two applications, and implements
Distributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
Cloud Computing at Google. Architecture
Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale
Dr Markus Hagenbuchner [email protected] CSCI319. Distributed Systems
Dr Markus Hagenbuchner [email protected] CSCI319 Distributed Systems CSCI319 Chapter 8 Page: 1 of 61 Fault Tolerance Study objectives: Understand the role of fault tolerance in Distributed Systems. Know
PART III. OPS-based wide area networks
PART III OPS-based wide area networks Chapter 7 Introduction to the OPS-based wide area network 7.1 State-of-the-art In this thesis, we consider the general switch architecture with full connectivity
Database Replication with Oracle 11g and MS SQL Server 2008
Database Replication with Oracle 11g and MS SQL Server 2008 Flavio Bolfing Software and Systems University of Applied Sciences Chur, Switzerland www.hsr.ch/mse Abstract Database replication is used widely
Quiz for Chapter 6 Storage and Other I/O Topics 3.10
Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [6 points] Give a concise answer to each
CAs and Turing Machines. The Basis for Universal Computation
CAs and Turing Machines The Basis for Universal Computation What We Mean By Universal When we claim universal computation we mean that the CA is capable of calculating anything that could possibly be calculated*.
1 Workflow Design Rules
Workflow Design Rules 1.1 Rules for Algorithms Chapter 1 1 Workflow Design Rules When you create or run workflows you should follow some basic rules: Rules for algorithms Rules to set up correct schedules
Database Tuning and Physical Design: Execution of Transactions
Database Tuning and Physical Design: Execution of Transactions David Toman School of Computer Science University of Waterloo Introduction to Databases CS348 David Toman (University of Waterloo) Transaction
(Pessimistic) Timestamp Ordering. Rules for read and write Operations. Pessimistic Timestamp Ordering. Write Operations and Timestamps
(Pessimistic) stamp Ordering Another approach to concurrency control: Assign a timestamp ts(t) to transaction T at the moment it starts Using Lamport's timestamps: total order is given. In distributed
Database Management System Prof. D. Janakiram Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.
Database Management System Prof. D. Janakiram Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. 23 Concurrency Control Part -4 In the last lecture, we have
ZooKeeper. Table of contents
by Table of contents 1 ZooKeeper: A Distributed Coordination Service for Distributed Applications... 2 1.1 Design Goals...2 1.2 Data model and the hierarchical namespace...3 1.3 Nodes and ephemeral nodes...
MASSACHUSETTS INSTITUTE OF TECHNOLOGY. 6.033 Computer Systems Engineering: Spring 2013. Quiz I Solutions
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.033 Computer Systems Engineering: Spring 2013 Quiz I Solutions There are 11 questions and 8 pages in this
Distributed Data Stores
Distributed Data Stores 1 Distributed Persistent State MapReduce addresses distributed processing of aggregation-based queries Persistent state across a large number of machines? Distributed DBMS High
Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
Distributed Data Management
Introduction Distributed Data Management Involves the distribution of data and work among more than one machine in the network. Distributed computing is more broad than canonical client/server, in that
ZME_WALLC-S Z-Wave Secure Wall Controller
ZME_WALLC-S Z-Wave Secure Wall Controller Firmware Version : 1.0 Quick Start S This device operates as Z-Wave sensor in two modes: the normal control mode (daily use) or in management mode for setup. Pushing
Back-up Server DOC-OEMSPP-S/2014-BUS-EN-10/12/13
Back-up Server DOC-OEMSPP-S/2014-BUS-EN-10/12/13 The information contained in this guide is not of a contractual nature and may be subject to change without prior notice. The software described in this
Real-Time Scheduling 1 / 39
Real-Time Scheduling 1 / 39 Multiple Real-Time Processes A runs every 30 msec; each time it needs 10 msec of CPU time B runs 25 times/sec for 15 msec C runs 20 times/sec for 5 msec For our equation, A
How To Make A Correct Multiprocess Program Execute Correctly On A Multiprocedor
How to Make a Correct Multiprocess Program Execute Correctly on a Multiprocessor Leslie Lamport 1 Digital Equipment Corporation February 14, 1993 Minor revisions January 18, 1996 and September 14, 1996
Redis Cluster. a pragmatic approach to distribution
Redis Cluster a pragmatic approach to distribution All nodes are directly connected with a service channel. TCP baseport+4000, example 6379 -> 10379. Node to Node protocol is binary, optimized for bandwidth
Sequential Logic: Clocks, Registers, etc.
ENEE 245: igital Circuits & Systems Lab Lab 2 : Clocks, Registers, etc. ENEE 245: igital Circuits and Systems Laboratory Lab 2 Objectives The objectives of this laboratory are the following: To design
Analysis and Modeling of MapReduce s Performance on Hadoop YARN
Analysis and Modeling of MapReduce s Performance on Hadoop YARN Qiuyi Tang Dept. of Mathematics and Computer Science Denison University [email protected] Dr. Thomas C. Bressoud Dept. of Mathematics and
Traditional IBM Mainframe Operating Principles
C H A P T E R 1 7 Traditional IBM Mainframe Operating Principles WHEN YOU FINISH READING THIS CHAPTER YOU SHOULD BE ABLE TO: Distinguish between an absolute address and a relative address. Briefly explain
EZ DUPE DVD/CD Duplicator
EZ DUPE DVD/CD Duplicator User s Manual Version 3.0 0 TABLE OF CONTENTS Introduction 2 Setup 11 LCD Front Panel Overview 2 o Auto Start Time 11 Menu Overview 3-5 o Display Mode 12 Functions 6 o Button
A Framework for Highly Available Services Based on Group Communication
A Framework for Highly Available Services Based on Group Communication Alan Fekete [email protected] http://www.cs.usyd.edu.au/ fekete Department of Computer Science F09 University of Sydney 2006,
Modeling Complex Time Limits
Modeling Complex Time Limits Oleg Svatos Department of Information Technologies, University of Economics, Prague, Czech Republic [email protected] Abstract: In this paper we analyze complexity of time limits
EKSAMEN / EXAM TTM4100 18 05 2007
1.1 1.1.1...... 1.1.2...... 1.1.3...... 1.1.4...... 1.1.5...... 1.1.6...... 1.1.7...... 1.1.8...... 1.1.9...... 1.1.10.... 1.1.11... 1.1.16.... 1.1.12... 1.1.17.... 1.1.13... 1.1.18.... 1.1.14... 1.1.19....
Big Table A Distributed Storage System For Data
Big Table A Distributed Storage System For Data OSDI 2006 Fay Chang, Jeffrey Dean, Sanjay Ghemawat et.al. Presented by Rahul Malviya Why BigTable? Lots of (semi-)structured data at Google - - URLs: Contents,
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after
Midterm Exam CMPSCI 453: Computer Networks Fall 2011 Prof. Jim Kurose
Midterm Exam CMPSCI 453: Computer Networks Fall 2011 Prof. Jim Kurose Instructions: There are 4 questions on this exam. Please use two exam blue books answer questions 1, 2 in one book, and the remaining
Load Testing and Monitoring Web Applications in a Windows Environment
OpenDemand Systems, Inc. Load Testing and Monitoring Web Applications in a Windows Environment Introduction An often overlooked step in the development and deployment of Web applications on the Windows
Evaluating HDFS I/O Performance on Virtualized Systems
Evaluating HDFS I/O Performance on Virtualized Systems Xin Tang [email protected] University of Wisconsin-Madison Department of Computer Sciences Abstract Hadoop as a Service (HaaS) has received increasing
THE WINDOWS AZURE PROGRAMMING MODEL
THE WINDOWS AZURE PROGRAMMING MODEL DAVID CHAPPELL OCTOBER 2010 SPONSORED BY MICROSOFT CORPORATION CONTENTS Why Create a New Programming Model?... 3 The Three Rules of the Windows Azure Programming Model...
Load Balancing in the Cloud Computing Using Virtual Machine Migration: A Review
Load Balancing in the Cloud Computing Using Virtual Machine Migration: A Review 1 Rukman Palta, 2 Rubal Jeet 1,2 Indo Global College Of Engineering, Abhipur, Punjab Technical University, jalandhar,india
It is the thinnest layer in the OSI model. At the time the model was formulated, it was not clear that a session layer was needed.
Session Layer The session layer resides above the transport layer, and provides value added services to the underlying transport layer services. The session layer (along with the presentation layer) add
OpenFlow Based Load Balancing
OpenFlow Based Load Balancing Hardeep Uppal and Dane Brandon University of Washington CSE561: Networking Project Report Abstract: In today s high-traffic internet, it is often desirable to have multiple
Notes on Assembly Language
Notes on Assembly Language Brief introduction to assembly programming The main components of a computer that take part in the execution of a program written in assembly code are the following: A set of
Lesson Objectives. To provide a grand tour of the major operating systems components To provide coverage of basic computer system organization
Lesson Objectives To provide a grand tour of the major operating systems components To provide coverage of basic computer system organization AE3B33OSD Lesson 1 / Page 2 What is an Operating System? A
Comparison of Different Implementation of Inverted Indexes in Hadoop
Comparison of Different Implementation of Inverted Indexes in Hadoop Hediyeh Baban, S. Kami Makki, and Stefan Andrei Department of Computer Science Lamar University Beaumont, Texas (hbaban, kami.makki,
Counters. Present State Next State A B A B 0 0 0 1 0 1 1 0 1 0 1 1 1 1 0 0
ounter ounters ounters are a specific type of sequential circuit. Like registers, the state, or the flip-flop values themselves, serves as the output. The output value increases by one on each clock cycle.
Distributed Databases
C H A P T E R19 Distributed Databases Practice Exercises 19.1 How might a distributed database designed for a local-area network differ from one designed for a wide-area network? Data transfer on a local-area
Concepts of digital forensics
Chapter 3 Concepts of digital forensics Digital forensics is a branch of forensic science concerned with the use of digital information (produced, stored and transmitted by computers) as source of evidence
Level 2 Routing: LAN Bridges and Switches
Level 2 Routing: LAN Bridges and Switches Norman Matloff University of California at Davis c 2001, N. Matloff September 6, 2001 1 Overview In a large LAN with consistently heavy traffic, it may make sense
Leader Election Using NewSQL Database Systems
Leader Election Using NewSQL Database Systems Salman Niazi 1( ), Mahmoud Ismail 1, Gautier Berthou 2, and Jim Dowling 1,2 1 KTH - Royal Institute of Technology, Stockholm, Sweden {smkniazi,maism,jdowling}@kth.se
Java Virtual Machine Locks
Java Virtual Machine Locks SS 2008 Synchronized Gerald SCHARITZER (e0127228) 2008-05-27 Synchronized 1 / 13 Table of Contents 1 Scope...3 1.1 Constraints...3 1.2 In Scope...3 1.3 Out of Scope...3 2 Logical
ETEC 2301 Programmable Logic Devices. Chapter 10 Counters. Shawnee State University Department of Industrial and Engineering Technologies
ETEC 2301 Programmable Logic Devices Chapter 10 Counters Shawnee State University Department of Industrial and Engineering Technologies Copyright 2007 by Janna B. Gallaher Asynchronous Counter Operation
RT-QoS for Wireless ad-hoc Networks of Embedded Systems
RT-QoS for Wireless ad-hoc Networks of Embedded Systems Marco accamo University of Illinois Urbana-hampaign 1 Outline Wireless RT-QoS: important MA attributes and faced challenges Some new ideas and results
1. Comments on reviews a. Need to avoid just summarizing web page asks you for:
1. Comments on reviews a. Need to avoid just summarizing web page asks you for: i. A one or two sentence summary of the paper ii. A description of the problem they were trying to solve iii. A summary of
MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research
MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With
Distributed Systems, Failures, and Consensus. Jeff Chase Duke University
Distributed Systems, Failures, and Consensus Jeff Chase Duke University The Players Choose from a large set of interchangeable terms: Processes, threads, tasks, Processors, nodes, servers, clients, Actors,
Distributed Lucene : A distributed free text index for Hadoop
Distributed Lucene : A distributed free text index for Hadoop Mark H. Butler and James Rutherford HP Laboratories HPL-2008-64 Keyword(s): distributed, high availability, free text, parallel, search Abstract:
Chapter 2: OS Overview
Chapter 2: OS Overview CmSc 335 Operating Systems 1. Operating system objectives and functions Operating systems control and support the usage of computer systems. a. usage users of a computer system:
Exception and Interrupt Handling in ARM
Exception and Interrupt Handling in ARM Architectures and Design Methods for Embedded Systems Summer Semester 2006 Author: Ahmed Fathy Mohammed Abdelrazek Advisor: Dominik Lücke Abstract We discuss exceptions
Licensing for BarTender s Automation Editions. Understanding Printer-Based Licensing WHITE PAPER
Licensing for BarTender s Automation Editions Understanding Printer-Based Licensing and How to Configure Seagull License Server WHITE PAPER Contents Introduction to Printer-Based Licensing 3 Available
A Study on Workload Imbalance Issues in Data Intensive Distributed Computing
A Study on Workload Imbalance Issues in Data Intensive Distributed Computing Sven Groot 1, Kazuo Goda 1, and Masaru Kitsuregawa 1 University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan Abstract.
DNP Points List and Implementation
S&C Electric Company BankGuard Plus DNP Points List and Implementation This appendix describes the DNP points and DNP implementation for the BankGuard PLUS Control, using software UPPD106S. DNP Points
Real-Time Systems Prof. Dr. Rajib Mall Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur
Real-Time Systems Prof. Dr. Rajib Mall Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No. # 26 Real - Time POSIX. (Contd.) Ok Good morning, so let us get
KWIC Implemented with Pipe Filter Architectural Style
KWIC Implemented with Pipe Filter Architectural Style KWIC Implemented with Pipe Filter Architectural Style... 2 1 Pipe Filter Systems in General... 2 2 Architecture... 3 2.1 Pipes in KWIC system... 3
CS 557- Project 1 A P2P File Sharing Network
Assigned: Feb 26, 205 Due: Monday, March 23, 205 CS 557- Project A P2P File Sharing Network Brief Description This project involves the development of FreeBits, a file sharing network loosely resembling
ICOM 5026-090: Computer Networks Chapter 6: The Transport Layer. By Dr Yi Qian Department of Electronic and Computer Engineering Fall 2006 UPRM
ICOM 5026-090: Computer Networks Chapter 6: The Transport Layer By Dr Yi Qian Department of Electronic and Computer Engineering Fall 2006 Outline The transport service Elements of transport protocols A
Comparison of Request Admission Based Performance Isolation Approaches in Multi-tenant SaaS Applications
Comparison of Request Admission Based Performance Isolation Approaches in Multi-tenant SaaS Applications Rouven Kreb 1 and Manuel Loesch 2 1 SAP AG, Walldorf, Germany 2 FZI Research Center for Information
Performance Tuning Guide for ECM 2.0
Performance Tuning Guide for ECM 2.0 Rev: 20 December 2012 Sitecore ECM 2.0 Performance Tuning Guide for ECM 2.0 A developer's guide to optimizing the performance of Sitecore ECM The information contained
Bigdata High Availability (HA) Architecture
Bigdata High Availability (HA) Architecture Introduction This whitepaper describes an HA architecture based on a shared nothing design. Each node uses commodity hardware and has its own local resources
Citrix EdgeSight for Load Testing User s Guide. Citrx EdgeSight for Load Testing 2.7
Citrix EdgeSight for Load Testing User s Guide Citrx EdgeSight for Load Testing 2.7 Copyright Use of the product documented in this guide is subject to your prior acceptance of the End User License Agreement.
Final Exam. Route Computation: One reason why link state routing is preferable to distance vector style routing.
UCSD CSE CS 123 Final Exam Computer Networks Directions: Write your name on the exam. Write something for every question. You will get some points if you attempt a solution but nothing for a blank sheet
Replication on Virtual Machines
Replication on Virtual Machines Siggi Cherem CS 717 November 23rd, 2004 Outline 1 Introduction The Java Virtual Machine 2 Napper, Alvisi, Vin - DSN 2003 Introduction JVM as state machine Addressing non-determinism
- IGRP - IGRP v1.22 Aaron Balchunas
1 - GRP - GRP (nterior Gateway Routing Protocol) GRP is a isco-proprietary Distance-Vector protocol, designed to be more scalable than RP, its standardized counterpart. GRP adheres to the following Distance-Vector
