Distributed Systems Principles and Paradigms. Chapter 04: Communication. 4.1 Layered Protocols. Basic networking model.




Distributed Systems Principles and Paradigms
Maarten van Steen, VU Amsterdam, Dept. Computer Science, steen@cs.vu.nl
Chapter 04: Communication
Version: November 5, 2012

Layered Protocols
- Low-level layers
- Transport layer
- Application layer
- Middleware layer

Basic networking model
[Figure: the seven-layer reference model. From top (7) to bottom (1): Application, Presentation, Session, Transport, Network, Data link, Physical. Each layer communicates with its peer through the corresponding protocol (application protocol, presentation protocol, session protocol, transport protocol, network protocol, data link protocol, physical protocol), all on top of the network.]
Drawbacks
- Focus on message-passing only
- Often unneeded or unwanted functionality
- Violates access transparency

Low-level layers
Recap
- Physical layer: contains the specification and implementation of bits, and their transmission between sender and receiver
- Data link layer: prescribes the transmission of a series of bits into a frame to allow for error and flow control
- Network layer: describes how packets in a network of computers are to be routed.
Observation
For many distributed systems, the lowest-level interface is that of the network layer.

Transport Layer
Important
The transport layer provides the actual communication facilities for most distributed systems.
Standard Internet protocols
- TCP: connection-oriented, reliable, stream-oriented communication
- UDP: unreliable (best-effort) datagram communication
Note
IP multicasting is often considered a standard available service (which may be dangerous to assume).

Middleware Layer
Observation
Middleware is invented to provide common services and protocols that can be used by many different applications:
- A rich set of communication protocols
- (Un)marshaling of data, necessary for integrated systems
- Naming protocols, to allow easy sharing of resources
- Security protocols for secure communication
- Scaling mechanisms, such as for replication and caching
Note
What remains are truly application-specific protocols... such as?

Types of communication
[Figure: a request travels along a time axis to a storage facility and on to the server, and a reply travels back. Marked on the figure: synchronize at request submission, synchronize at request delivery, synchronize after processing by server, and a transmission interrupt.]
Distinguish
- Transient versus persistent communication
- Asynchronous versus synchronous communication

Transient versus persistent
- Transient communication: the communication server discards a message when it cannot be delivered at the next server, or at the receiver.
- Persistent communication: a message is stored at a communication server as long as it takes to deliver it.

Places for synchronization
- At request submission
- At request delivery
- After request processing

Client/Server
Some observations
Client/Server computing is generally based on a model of transient synchronous communication:
- Client and server have to be active at the time of communication
- Client issues request and blocks until it receives reply
- Server essentially waits only for incoming requests, and subsequently processes them
Drawbacks of synchronous communication
- Client cannot do any other work while waiting for reply
- Failures have to be handled immediately: the client is waiting
- The model may simply not be appropriate (mail, news)

Messaging
Message-oriented middleware
Aims at high-level persistent asynchronous communication:
- Processes send each other messages, which are queued
- Sender need not wait for immediate reply, but can do other things
- Middleware often ensures fault tolerance

Remote Procedure Call (RPC)
- Basic RPC operation
- Parameter passing
- Variations

Basic RPC operation
Observations
- Application developers are familiar with the simple procedure model
- Well-engineered procedures operate in isolation (black box)
- There is no fundamental reason not to execute procedures on a separate machine
Conclusion
Communication between caller & callee can be hidden by using the procedure-call mechanism.
[Figure: timeline of an RPC. The client calls the remote procedure, sends a request and waits for the result; the server calls the local procedure and returns the results in a reply; the client returns from the call.]

Basic RPC operation
[Figure: the client process on the client machine calls k = add(i,j). The client stub builds a message containing proc: "add", int: val(i), int: val(j); the message is sent across the network; the server OS hands it to the server stub, which unpacks it and makes a local call to the implementation of add.]
1. Client procedure calls client stub.
2. Stub builds message; calls local OS.
3. OS sends message to remote OS.
4. Remote OS gives message to stub.
5. Stub unpacks parameters and calls server.
6. Server makes local call and returns result to stub.
7. Stub builds message; calls OS.
8. OS sends message to client's OS.
9. Client's OS gives message to stub.
10. Client stub unpacks result and returns to the client.

RPC: Parameter passing
Parameter marshaling
There's more than just wrapping parameters into a message:
- Client and server machines may have different data representations (think of byte ordering)
- Wrapping a parameter means transforming a value into a sequence of bytes
- Client and server have to agree on the same encoding:
  - How are basic data values represented (integers, floats, characters)
  - How are complex data values represented (arrays, unions)
- Client and server need to properly interpret messages, transforming them into machine-dependent representations.
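
The stub steps and the marshaling concerns above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the message layout (a length-prefixed procedure name followed by two 32-bit integers in network byte order) is invented here, the function names are hypothetical, and the "network" is replaced by a direct function call.

```python
import struct

def marshal_add_request(i: int, j: int) -> bytes:
    """Client stub, step 2: build a message for proc "add" (assumed layout)."""
    name = b"add"
    # '!' selects network byte order (big-endian) with fixed sizes, so both
    # sides agree on the encoding regardless of their native representation.
    return struct.pack(f"!H{len(name)}sii", len(name), name, i, j)

def unmarshal_add_request(message: bytes):
    """Server stub, step 5: unpack the procedure name and the parameters."""
    (name_len,) = struct.unpack_from("!H", message, 0)
    name, i, j = struct.unpack_from(f"!{name_len}sii", message, 2)
    return name.decode("ascii"), i, j

def add(i: int, j: int) -> int:
    """The actual implementation living at the server."""
    return i + j

def server_stub(message: bytes) -> bytes:
    """Steps 5 to 7: unpack, make the local call, marshal the result."""
    name, i, j = unmarshal_add_request(message)
    if name != "add":
        raise ValueError(f"unknown procedure: {name}")
    return struct.pack("!i", add(i, j))

def client_stub_add(i: int, j: int, transport=server_stub) -> int:
    """Steps 1 to 10 from the caller's view; transport stands in for the OS and network."""
    reply = transport(marshal_add_request(i, j))
    (k,) = struct.unpack("!i", reply)
    return k
```

Calling `client_stub_add(2, 3)` looks like a local call to the caller, while every value crosses the stub boundary as a byte sequence in an agreed encoding.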

RPC: Parameter passing
RPC parameter passing: some assumptions
- Copy in/copy out semantics: while the procedure is executed, nothing can be assumed about parameter values.
- All data that is to be operated on is passed by parameters. This excludes passing references to (global) data.
Conclusion
Full access transparency cannot be realized.
Observation
A remote reference mechanism enhances access transparency:
- Remote reference offers unified access to remote data
- Remote references can be passed as parameter in RPCs

Asynchronous RPCs
Essence
Try to get rid of the strict request-reply behavior, but let the client continue without waiting for an answer from the server.
[Figure: (a) the client calls the remote procedure, sends a request and waits for the result; the server calls the local procedure and returns the results in a reply. (b) the client calls the remote procedure and waits only for acceptance; the server accepts the request and then calls the local procedure, while the client has already returned from the call.]

Deferred synchronous RPCs
[Figure: the client calls the remote procedure, waits for acceptance, and returns from the call once the server accepts the request. The server calls the local procedure and later calls the client with a one-way RPC to return the results, interrupting the client, which acknowledges.]
Variation
Client can also do a (non)blocking poll at the server to see whether results are available.
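
The deferred synchronous pattern, including the (non)blocking poll, resembles what Python's standard `concurrent.futures` offers. The sketch below is an analogy only: a worker thread stands in for the remote server, and `remote_add` and `demo` are hypothetical names, not part of any RPC library.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def remote_add(i: int, j: int) -> int:
    """Stand-in for the procedure executed at the server."""
    time.sleep(0.05)  # simulated network and processing delay
    return i + j

def demo():
    with ThreadPoolExecutor(max_workers=1) as server:
        # "Call remote procedure": returns as soon as the request is accepted.
        future = server.submit(remote_add, 2, 3)
        # The client continues with other work instead of blocking.
        local_work = sum(range(1000))
        # Non-blocking poll: is the result available yet? (may be True or False)
        ready_early = future.done()
        # Blocking poll: wait until the result has been delivered.
        result = future.result()
    return local_work, ready_early, result
```

Dropping the final `future.result()` call, and ignoring the return value altogether, would correspond to a one-way asynchronous call.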

RPC in practice
[Figure: Uuidgen produces an interface definition file, which the IDL compiler turns into a client stub, a header, and a server stub. Client code and server code #include the header. The C compiler produces client and client stub object files, and server and server stub object files; the linker combines each pair with the runtime library into a client binary and a server binary.]

Client-to-server binding (DCE)
Issues
(1) Client must locate the server machine, and (2) locate the server.
[Figure: directory machine, server machine, and client machine. 1. The server registers its endpoint with the DCE daemon (endpoint table). 2. The server registers its service with the directory server. 3. The client looks up the server at the directory server. 4. The client asks the DCE daemon for the endpoint. 5. The client does the RPC.]

4.3 Message-Oriented Communication
- Transient Messaging
- Message-Queuing System
- Message Brokers
- Example: IBM WebSphere

Transient messaging: sockets
Berkeley socket interface
SOCKET    Create a new communication endpoint
BIND      Attach a local address to a socket
LISTEN    Announce willingness to accept N connections
ACCEPT    Block until request to establish a connection
CONNECT   Attempt to establish a connection
SEND      Send data over a connection
RECEIVE   Receive data over a connection
CLOSE     Release the connection

Transient messaging: sockets
[Figure: the server executes socket, bind, listen, accept, read, write, close; the client executes socket, connect, write, read, close. Accept and connect form a synchronization point.]

Sockets: Python code
SERVERPORT and N are left symbolic, as on the slide: set them to an agreed port number and a backlog size before running. Data is sent as bytes, as Python 3 requires.

Server:
    import socket
    HOST = ''                 # empty string: accept connections on all local interfaces
    PORT = SERVERPORT
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((HOST, PORT))
    s.listen(N)               # allow at most N queued connections
    (conn, addr) = s.accept() # returns new socket + address of client
    while True:               # forever
        data = conn.recv(1024)
        if not data:          # client closed the connection
            break
        conn.sendall(data)    # echo the data back
    conn.close()

Client:
    import socket
    HOST = 'distsys.cs.vu.nl'
    PORT = SERVERPORT
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))
    s.sendall(b'Hello, world')
    data = s.recv(1024)
    s.close()

Message-oriented middleware
Essence
Asynchronous persistent communication through support of middleware-level queues. Queues correspond to buffers at communication servers.
PUT      Append a message to a specified queue
GET      Block until the specified queue is nonempty, and remove the first message
POLL     Check a specified queue for messages, and remove the first. Never block
NOTIFY   Install a handler to be called when a message is put into the specified queue

Message broker
Observation
Message queuing systems assume a common messaging protocol: all applications agree on message format (i.e., structure and data representation)
Message broker
Centralized component that takes care of application heterogeneity in an MQ system:
- Transforms incoming messages to target format
- Very often acts as an application gateway
- May provide subject-based routing capabilities (Enterprise Application Integration)

Message broker
[Figure: a source client, a message broker, and a destination client, each with a queuing layer on top of the OS, connected by the network. The broker runs a broker program and holds a repository with conversion rules and programs.]
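
To make the PUT, GET, POLL and NOTIFY primitives concrete, here is a minimal in-memory sketch built on Python's standard `queue` module. The class name `QueueManager` and its methods are assumptions made for illustration; the queues live in a single process, so this shows the interface only and provides none of the persistence a real message-queuing system offers.

```python
import queue
from collections import defaultdict

class QueueManager:
    """Toy, single-process stand-in for middleware-level queues."""

    def __init__(self):
        self._queues = defaultdict(queue.Queue)  # queue name -> FIFO buffer
        self._handlers = {}                      # queue name -> callback

    def put(self, name, message):
        """PUT: append a message to the specified queue."""
        self._queues[name].put(message)
        handler = self._handlers.get(name)
        if handler is not None:
            handler(name)  # NOTIFY callback: a message has arrived

    def get(self, name):
        """GET: block until the queue is nonempty, then remove the first message."""
        return self._queues[name].get()

    def poll(self, name):
        """POLL: remove the first message if there is one; never block."""
        try:
            return self._queues[name].get_nowait()
        except queue.Empty:
            return None

    def notify(self, name, handler):
        """NOTIFY: install a handler called when a message is put into the queue."""
        self._handlers[name] = handler
```

A sender can call `put` and move on immediately; a receiver chooses between blocking (`get`), checking (`poll`), or being called back (`notify`).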

IBM's WebSphere MQ
Basic concepts
- Application-specific messages are put into, and removed from, queues
- Queues reside under the regime of a queue manager
- Processes can put messages only in local queues, or through an RPC mechanism

IBM's WebSphere MQ
Message transfer
- Messages are transferred between queues
- Message transfer between queues at different processes requires a channel
- At each endpoint of a channel is a message channel agent (MCA)
- Message channel agents are responsible for:
  - Setting up channels using lower-level network communication facilities (e.g., TCP/IP)
  - (Un)wrapping messages from/in transport-level packets
  - Sending/receiving packets

IBM's WebSphere MQ
[Figure: a sending client and a receiving client, each running a program on top of the MQ interface. Each program talks to its queue manager through stubs, using (synchronous) RPC over a local network. The sending queue manager holds a routing table and a send queue; the receiving side holds the client's receive queue. MCAs at both queue managers exchange messages by (asynchronous) message passing over the enterprise network, which also leads to other remote queue managers.]
- Channels are inherently unidirectional
- MCAs are started automatically when messages arrive
- Any network of queue managers can be created
- Routes are set up manually (system administration)

IBM's WebSphere MQ
Routing
By using logical names, in combination with name resolution to local queues, it is possible to put a message in a remote queue.
[Figure: four queue managers QMA, QMB, QMC and QMD. Each has an alias table mapping logical names (LA1, LA2) to queue managers, and a routing table mapping each destination queue manager to a local send queue (SQ1 or SQ2).]

4.4 Stream-Oriented Communication
- Support for continuous media
- Streams in distributed systems
- Stream management

Continuous media
Observation
All communication facilities discussed so far are essentially based on a discrete, that is, time-independent exchange of information.
Continuous media
Characterized by the fact that values are time dependent:
- Audio
- Video
- Animations
- Sensor data (temperature, pressure, etc.)

Continuous media
Transmission modes
Different timing guarantees with respect to data transfer:
- Asynchronous: no restrictions with respect to when data is to be delivered
- Synchronous: define a maximum end-to-end delay for individual data packets
- Isochronous: define a maximum and minimum end-to-end delay (jitter is bounded)

Stream
Definition
A (continuous) data stream is a connection-oriented communication facility that supports isochronous data transmission.
Some common stream characteristics
- Streams are unidirectional
- There is generally a single source, and one or more sinks
- Often, the sink, the source, or both are wrappers around hardware (e.g., camera, CD device, TV monitor)
- Simple stream: a single flow of data, e.g., audio or video
- Complex stream: multiple data flows, e.g., stereo audio or combination audio/video

Streams and QoS
Essence
Streams are all about timely delivery of data. How do you specify this Quality of Service (QoS)? Basics:
- The required bit rate at which data should be transported.
- The maximum delay until a session has been set up (i.e., when an application can start sending data).
- The maximum end-to-end delay (i.e., how long it will take until a data unit makes it to a recipient).
- The maximum delay variance, or jitter.
- The maximum round-trip delay.

Enforcing QoS
Observation
There are various network-level tools, such as differentiated services, by which certain packets can be prioritized.
Also
Use buffers to reduce jitter:
[Figure: timeline (0 to 20 sec) for packets 1 to 8, showing when each packet departs the source, arrives at the buffer, and is removed from the buffer, together with the time spent in the buffer. A gap in playback is marked.]

Enforcing QoS
Problem
How to reduce the effects of packet loss (when multiple samples are in a single packet)?

Enforcing QoS
[Figure: (a) Frames 1 to 16 are sent in order; a lost packet leaves a gap of consecutive lost frames in the delivered stream. (b) Frames are sent interleaved in the order 1 5 9 13, 2 6 10 14, 3 7 11 15, 4 8 12 16; a lost packet leaves lost frames spread across the delivered stream.]
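
The interleaving idea from the last figure can be reproduced with list slicing. The sketch below assumes 16 frames packed into 4 packets, as in the figure; the function names and the choice of which packet gets lost are hypothetical.

```python
def contiguous(frames, packets):
    """Scheme (a): each packet carries consecutive frames."""
    size = len(frames) // packets
    return [frames[k * size:(k + 1) * size] for k in range(packets)]

def interleave(frames, packets):
    """Scheme (b): packet p carries frames p, p + packets, p + 2*packets, ..."""
    return [frames[p::packets] for p in range(packets)]

def frames_lost(packet_list, lost_index):
    """Frames missing at the receiver when one packet is dropped."""
    return sorted(packet_list[lost_index])

frames = list(range(1, 17))
plain = contiguous(frames, 4)
mixed = interleave(frames, 4)

# Dropping the third packet (index 2) in each scheme:
gap = frames_lost(plain, 2)      # four consecutive frames: one long gap
spread = frames_lost(mixed, 2)   # four isolated frames: short, scattered gaps
```

Both schemes lose the same number of frames; interleaving only changes where the losses fall, at the cost of the receiver having to buffer several packets before it can play out frames in order.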

Stream synchronization
Problem
Given a complex stream, how do you keep the different substreams in synch?
Example
Think of playing out two channels that together form stereo sound. The difference should be less than 20 to 30 µsec!

Stream synchronization
[Figure: at the receiver's machine, the application contains a procedure that reads two audio data units for each video data unit from the incoming streams, which arrive via the OS from the network.]
Alternative
Multiplex all substreams into a single stream, and demultiplex at the receiver. Synchronization is handled at the multiplexing/demultiplexing point (MPEG).

4.5 Multicast Communication
- Application-level multicasting
- Gossip-based data dissemination

Application-level multicasting
Essence
Organize nodes of a distributed system into an overlay network and use that network to disseminate data.
Chord-based tree building
1. Initiator generates a multicast identifier mid.
2. Lookup succ(mid), the node responsible for mid.
3. Request is routed to succ(mid), which will become the root.
4. If P wants to join, it sends a join request to the root.
5. When the request arrives at Q:
   - Q has not seen a join request before: it becomes forwarder; P becomes child of Q. The join request continues to be forwarded.
   - Q knows about the tree: P becomes child of Q. No need to forward the join request anymore.

ALM: Some costs
[Figure: end hosts A, B, C and D attached to routers Ra, Rb, Rc, Rd and Re across the Internet, with a cost on each link, and an overlay network connecting the end hosts.]
- Link stress: How often does an ALM message cross the same physical link? Example: a message from A to D needs to cross <Ra, Rb> twice.
- Stretch: Ratio in delay between the ALM-level path and the network-level path. Example: messages from B to C follow a path of length 71 at ALM level, but 47 at network level, so stretch = 71/47.

Epidemic Algorithms
- General background
- Update models
- Removing objects

Principles
Basic idea
Assume there are no write-write conflicts:
- Update operations are performed at a single server
- A replica passes updated state to only a few neighbors
- Update propagation is lazy, i.e., not immediate
- Eventually, each update should reach every replica
Two forms of epidemics
- Anti-entropy: Each replica regularly chooses another replica at random, and exchanges state differences, leading to identical states at both afterwards
- Gossiping: A replica which has just been updated (i.e., has been contaminated) tells a number of other replicas about its update (contaminating them as well).

Anti-entropy
Principle operations
A node P selects another node Q from the system at random.
- Push: P only sends its updates to Q
- Pull: P only retrieves updates from Q
- Push-Pull: P and Q exchange mutual updates (after which they hold the same information).
Observation
For push-pull it takes $O(\log(N))$ rounds to disseminate updates to all $N$ nodes (round = when every node has taken the initiative to start an exchange).

Anti-entropy: analysis (extra)
Basics
Consider a single source, propagating its update. Let $p_i$ be the probability that a node has not received the update after the $i$-th cycle.
Analysis: staying ignorant
- With pull, $p_{i+1} = (p_i)^2$: the node was not updated during the $i$-th cycle and should contact another ignorant node during the next cycle.
- With push, $p_{i+1} = p_i \left(1 - \frac{1}{N}\right)^{N(1 - p_i)} \approx p_i e^{-1}$ (for small $p_i$ and large $N$): the node was ignorant during the $i$-th cycle and no updated node chooses to contact it during the next cycle.
- With push-pull: $(p_i)^2 \cdot (p_i \cdot e^{-1})$
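
The three recurrences for the ignorant fraction can be iterated numerically to see how quickly each variant converges. This is a sketch under assumptions: the function names are hypothetical, the push factor is computed in its exact form rather than the $e^{-1}$ approximation, and the starting value $p_0 = 1 - 1/N$ (a single informed source) is a choice made here.

```python
import math

def next_fraction(p: float, mode: str, n: int) -> float:
    """One cycle of the recurrence for the probability of staying ignorant."""
    # Probability that none of the N(1 - p) updated nodes picks this node.
    push_factor = (1.0 - 1.0 / n) ** (n * (1.0 - p))
    if mode == "pull":
        return p * p
    if mode == "push":
        return p * push_factor
    if mode == "push-pull":
        return (p * p) * (p * push_factor)
    raise ValueError(f"unknown mode: {mode}")

def history(mode: str, cycles: int, n: int = 10_000) -> list:
    """Ignorant fraction after 0, 1, ..., cycles cycles, starting from one source."""
    p = 1.0 - 1.0 / n
    values = [p]
    for _ in range(cycles):
        p = next_fraction(p, mode, n)
        values.append(p)
    return values

pull = history("pull", 25)
push = history("push", 25)
push_pull = history("push-pull", 25)
```

Comparing the three lists cycle by cycle shows the pattern the analysis predicts: push shrinks the ignorant fraction by a roughly constant factor once few ignorant nodes remain, while pull squares it.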

Anti-entropy performance
[Figure: fraction of ignorant nodes (0 to 1) plotted against the cycle number (0 to 25), with one curve for pull and one for push.]

Gossiping
Basic model
A server S having an update to report contacts other servers. If a server is contacted to which the update has already propagated, S stops contacting other servers with probability $1/k$.
Observation
If $s$ is the fraction of ignorant servers (i.e., which are unaware of the update), it can be shown that with many servers
$s = e^{-(k+1)(1-s)}$

Gossiping
[Figure: $\ln(s)$ plotted against $k$ for $k$ = 1 to 15, with $\ln(s)$ ranging from about -2.5 down to -15.]
Consider 10,000 nodes:
k    s           Ns
1    0.203188    2032
2    0.059520    595
3    0.019827    198
4    0.006977    70
5    0.002516    25
6    0.000918    9
7    0.000336    3
Note
If we really have to ensure that all servers are eventually updated, gossiping alone is not enough.
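
The table for 10,000 nodes can be recomputed by solving the equation for the ignorant fraction s numerically. The sketch below uses plain fixed-point iteration, which is a choice made here and not something the slides prescribe; it starts from s = 0 because s = 1 is a trivial solution of the same equation. The function name is hypothetical.

```python
import math

def gossip_residue(k: int, iterations: int = 200) -> float:
    """Solve s = exp(-(k + 1) * (1 - s)) for the non-trivial root by iteration."""
    s = 0.0  # start below the root; s = 1.0 would stay at the trivial solution
    for _ in range(iterations):
        s = math.exp(-(k + 1) * (1.0 - s))
    return s

def table(n: int = 10_000, max_k: int = 7):
    """Rows (k, s, expected number of ignorant servers out of n)."""
    return [(k, gossip_residue(k), round(n * gossip_residue(k)))
            for k in range(1, max_k + 1)]

for k, s, remaining in table():
    print(f"{k:2d}  {s:.6f}  {remaining}")
```

Even for larger k a few servers are expected to remain ignorant, which is the point of the closing note: gossiping needs to be combined with something else, such as anti-entropy, if every server must be reached.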

Deleting values
Fundamental problem
We cannot remove an old value from a server and expect the removal to propagate. Instead, mere removal will be undone in due time using epidemic algorithms.
Solution
Removal has to be registered as a special update by inserting a death certificate.

Deleting values
Next problem
When to remove a death certificate (it is not allowed to stay forever):
- Run a global algorithm to detect whether the removal is known everywhere, and then collect the death certificates (looks like garbage collection)
- Assume death certificates propagate in finite time, and associate a maximum lifetime with a certificate (can be done at risk of not reaching all servers)
Note
It is necessary that a removal actually reaches all servers.
Question
What's the scalability problem here?

Example applications
Typical apps
- Data dissemination: Perhaps the most important one. Note that there are many variants of dissemination.
- Aggregation: Let every node $i$ maintain a variable $x_i$. When two nodes gossip, they each reset their variable to
  $x_i, x_j \leftarrow (x_i + x_j)/2$
  Result: in the end each node will have computed the average $\bar{x} = \sum_i x_i / N$.

Example application: aggregation
Aggregation (continued)
When two nodes gossip, they each reset their variable to
$x_i, x_j \leftarrow (x_i + x_j)/2$
Result: in the end each node will have computed the average $\bar{x} = \sum_i x_i / N$.
Question
What happens if initially $x_i = 1$ and $x_j = 0$, $j \neq i$?
Question
How can we start this computation without pre-assigning a node $i$ to start as the only one with $x_i \leftarrow 1$?
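
The averaging rule is easy to simulate. The sketch below assumes that in each step two distinct nodes are picked uniformly at random to gossip, which is a simplification chosen here; the function name and the fixed seed are likewise hypothetical.

```python
import random

def gossip_average(values, exchanges, seed=0):
    """Repeatedly let two random nodes replace their values by their mean."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    x = list(values)
    n = len(x)
    for _ in range(exchanges):
        i, j = rng.sample(range(n), 2)   # two distinct nodes gossip
        x[i] = x[j] = (x[i] + x[j]) / 2  # both reset to the pairwise mean
    return x

# One node starts with 1, all others with 0.
start = [1.0] + [0.0] * 9
final = gossip_average(start, exchanges=2000)
```

Each exchange preserves the sum of all values, so with one node starting at 1 and the rest at 0 every value drifts toward 1/N. A node can therefore estimate the system size as the reciprocal of its own value.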