Clock Synchronization using Packet Streams

Philipp Blum and Lothar Thiele
Institut für Technische Informatik und Kommunikationsnetze
ETH Zürich, ETH Zentrum, CH-8092 Zürich, Switzerland
E-mail: {blum,thiele}@tik.ee.ethz.ch

Abstract. Recent distributed applications in the domains of digital audio and sensor networks require clock synchronization in the order of 1µs. The achievable precision depends on system properties like synchronization message delay and synchronization message pattern. These properties are often unknown or difficult to determine. Therefore we propose a new analysis for clock synchronization algorithms. The analysis is based on two properties: safe synchronization and optimal selective synchronization. We present two algorithms that use an arbitrary stream of synchronization messages to achieve clock synchronization over an asynchronous communication channel and on clocks with unknown and variable drift. The first algorithm needs no parameterization and is safe and optimal selective. The second algorithm improves the first with a drift compensation mechanism. It is safe. A parameter of the algorithm allows choosing between fast drift compensation and a high probability of optimal selective behavior. Simulation results show that the algorithms can achieve 1µs precision on an 802.11b wireless LAN in ad-hoc mode.

1 Introduction and Related Work

Recent distributed applications in the domain of high-quality digital audio require clock synchronization between nodes in the order of 1µs. Similar precision requirements in the context of sensor networks have been reported by [6]. Tight synchronization is difficult because of clock drift and non-deterministic message delay. Another limiting factor is the amount of communication spent on synchronization. Several probabilistic clock synchronization algorithms have been proposed to achieve a good precision under partially unknown system specifications.
[3, 1, 9] use a client-server approach to read reference time repeatedly until this succeeds with a specified precision bound. [3] achieves a precision in the order of 1ms using 4 messages per minute, [1] is similar. [9] achieves 1µs using highly deterministic Myrinet communication. The drawback of client-server schemes is that they do not scale well with system size and synchronization message frequency. A producer-consumer approach is taken in [2]. With unidirectional communication only, a precision in the order of 1ms is achieved. This approach requires that statistical properties of the message delays are known. For broadcast networks, [5] proposes a scheme that eliminates the medium access time from the synchronization path and achieves a precision in the order of 1µs using commercial off-the-shelf technology.

Unknown system properties like non-deterministic message delay and synchronization message pattern make it difficult to apply bounds on the achievable precision as proposed by [8, 7, 10, 12, 11]. Therefore we propose an analysis of clock synchronization algorithms under completely unknown system specifications in terms of message delay and message pattern. Instead of bounds on the achievable precision, we propose two properties that describe good algorithms. Safe synchronization never degrades the precision of a synchronized clock. Optimal selective synchronization never fails to improve the precision if there is a chance to do so. As a consequence, the achieved precision can only improve when more communication is spent on clock synchronization.

We propose two new algorithms. The first algorithm needs no parameterization and is safe and optimal selective. The second algorithm improves the first in that it compensates clock drift. It is safe. A parameter of the algorithm allows choosing between fast drift compensation and a high probability of optimal selective behavior. We evaluate the algorithms based on real network and clock traces, obtained with an 802.11b wireless LAN in ad-hoc mode and standard Linux PCs.

Our assumptions on communication and clocks are similar to those of [8, 12, 11]. Our system model differs from [3, 1, 9], which assume a client-server synchronization algorithm with communication in both directions. It differs from [2] in that we make no statistical assumptions on the message delay distribution. It differs from [5] in that we do not require a broadcast medium, though our producer-consumer scheme is also directly applicable in a broadcast system. Our algorithms are similar to Lamport's algorithm for physical clocks [8]. They differ in that they start software clocks that always progress more slowly than reference time.

Presentation overview: In section 2, we introduce a model for clock synchronization algorithms that use streams of synchronization messages. Section 3 introduces the properties safe synchronization and optimal selective synchronization and presents the local selection algorithm. In section 4, we present the local selection algorithm with drift compensation. Section 5 presents simulation results based on real network and clock traces.

This work has been developed in cooperation with BridgeCo AG, Ringstrasse 14, 8600 Dübendorf, Switzerland. The author has received funding from the Swiss Commission for Technology and Innovation (KTI), Effingerstrasse 27, 3003 Bern, Switzerland.

2 System model

The system we study in this paper is distributed over two nodes. On the first node, a process that has access to reference time generates synchronization messages. Reference time is assumed to be a positive real number $t \in \mathbb{R}^{+}$. We refer to this node and the associated process as the synchronization source or producer. The second node has no direct access to reference time, but it can read a local hardware clock that has an arbitrary offset and an unknown and variable drift relative to reference time.
We refer to this node as the synchronization consumer. Upon every arrival of a synchronization message, a clock synchronization algorithm is invoked on the consumer.

2.1 Communication

Definition 1. A synchronization message is a tuple $(s, r)$ where $s \in \mathbb{R}$ represents its send time, i.e. reference time at the moment when the message is sent at the producer node, and where $r \in \mathbb{R}$ represents its receive time, i.e. reference time at the moment when the message is received at the consumer node. The payload of a synchronization message includes at least its send time. The delay of a message is the difference between its receive and send time, i.e. $d \stackrel{\mathrm{def}}{=} r - s$. The delay of a synchronization message is positive, i.e. $d > 0$.

Definition 2. A synchronization stream is an infinite sequence $S \stackrel{\mathrm{def}}{=} ((s_i, r_i))_{i \in \mathbb{N}_{\ge 1}}$ of synchronization messages received by the consumer, ordered by their receive time, i.e. $\forall i \in \mathbb{N}_{\ge 1} : r_i < r_{i+1}$.

Messages can get lost in the communication channel: The index $i$ only counts those synchronization messages that eventually reach the consumer. Messages can get reordered in the channel: The index $i$ counts the synchronization messages in their order of arrival, which is not necessarily the order in which they are produced.

2.2 Clocks

The consumer node has access to a read-only hardware clock $h$. The clock synchronization algorithm computes a new software clock $c_i$ whenever it is executed. An application that requires synchronization always reads the software clock that has been started last. Software clocks are a well known concept [13, 4] and are also known under the name of logical clocks.

Definition 3. The hardware clock is a monotonically increasing and differentiable function $h : \mathbb{R} \to \mathbb{R}$ that maps all values of reference time $t$ to clock time $h(t)$. The drift of the hardware clock is a function $\rho^h : \mathbb{R} \to \mathbb{R}$ that maps all values of reference time $t$ to the derivative of $h$ minus 1:

$$\rho^h(t) \stackrel{\mathrm{def}}{=} \frac{dh(t)}{dt} - 1. \qquad (1)$$

Assumption 1. The drift of the hardware clock is bounded by the known constant $\rho^h_{\max}$. The first derivative of the drift is bounded by the known constant $\delta^h_{\max}$:

$$\forall t \in \mathbb{R} : \; |\rho^h(t)| < \rho^h_{\max} \;\wedge\; \left|\frac{d\rho^h(t)}{dt}\right| < \delta^h_{\max}. \qquad (2)$$

Definition 4. A software clock is a monotonically increasing and differentiable function $c_i : \mathbb{R} \to \mathbb{R}$ that maps values of reference time $t$ to clock time $c_i(t)$. The function $c_i$ is computed at time $t = r_i$ and its domain is $\{t \in \mathbb{R} \mid t \ge r_i\}$. The index $i$ expresses that the software clock $c_i(t)$ is started in the $i$-th execution of the synchronization algorithm. The drift of the software clock is defined as $\rho_i(t) \stackrel{\mathrm{def}}{=} dc_i(t)/dt - 1$ and the precision as $\epsilon_i(t) \stackrel{\mathrm{def}}{=} c_i(t) - t$. We will use the shortcuts $c_i \stackrel{\mathrm{def}}{=} c_i(r_i)$, $\epsilon_i \stackrel{\mathrm{def}}{=} \epsilon_i(r_i)$, $\rho_i \stackrel{\mathrm{def}}{=} \rho_i(r_i)$ and $\rho^h_i \stackrel{\mathrm{def}}{=} \rho^h(r_i)$. These definitions apply analogously to $c'_i$, $\rho'_i$ and $\epsilon'_i$ used in section 4.

2.3 Clock synchronization algorithm

Upon the reception of the synchronization message $(s_i, r_i)$, the consumer invokes a clock synchronization algorithm. The input for this algorithm is the payload of the synchronization message, i.e. $s_i$, and a timestamp from the local hardware clock $h_i = h(r_i)$. The output is a new software clock $c_i(t)$, which is supposed to be synchronized to reference time. The algorithm has access to memory to retrieve information that has been stored in previous executions. This allows the algorithm to use the current value of all previously started software clocks $c_j(r_i)$, $j < i$, in its computation of the new software clock.
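The model of this section can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `SyncMessage`, `make_stream` and `hardware_clock`, the constant drift of 50 ppm, the offset of 5 s and the three example messages are hypothetical and do not come from the measurements reported later. The sketch shows a stream in which one message is lost and two are reordered, and the pair $(s_i, h_i)$ that the consumer hands to the synchronization algorithm on each arrival.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncMessage:
    s: float  # send time in reference time (carried in the payload)
    r: float  # receive time in reference time (not known to the consumer)

    @property
    def delay(self) -> float:
        return self.r - self.s  # d = r - s, positive by Definition 1


def make_stream(sent):
    """Order the delivered messages by receive time (Definition 2).

    `sent` holds (s, r) pairs; r is None for a message lost in the channel.
    """
    delivered = [SyncMessage(s, r) for s, r in sent if r is not None]
    if any(m.delay <= 0 for m in delivered):
        raise ValueError("a message cannot arrive before it is sent")
    return sorted(delivered, key=lambda m: m.r)


def hardware_clock(t, offset=5.0, drift=50e-6):
    """h(t) for an assumed constant drift, so that dh/dt - 1 = drift."""
    return offset + (1.0 + drift) * t


# Three messages are sent; the second is lost, the third overtakes the first.
stream = make_stream([(0.0, 0.9), (0.2, None), (0.4, 0.5)])

# Input of the synchronization algorithm on each arrival: s_i and h_i = h(r_i).
inputs = [(m.s, hardware_clock(m.r)) for m in stream]
```

The index of a message in `stream` corresponds to the index $i$ of Definition 2: it counts arrivals, not send order, and lost messages never receive an index.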
3 Analysis of clock synchronization using streams

We have made no assumptions on communication except that messages arrive after they have been sent. Therefore it is not possible to calculate a finite bound on the precision a clock synchronization algorithm can achieve. Instead, we propose properties that describe a good algorithm and try to find implementable algorithms for which these properties hold under the assumptions made in the previous section.

3.1 Safe clock synchronization

Definition 5. Safety. A clock synchronization algorithm is safe if in all executions except the first, it starts a software clock with an initial precision that is equal to or better than the precision of the previously started software clock at this time:

$$\forall i \in \mathbb{N}_{>1} : \; |\epsilon_i| \le |\epsilon_{i-1}(r_i)|. \qquad (3)$$

The probabilistic algorithms are not safe. [3, 1, 9] select updates if the measured bound on the precision is better than a configured threshold. However, this condition does not imply that the precision itself improves by the update. [2] updates the clock after a fixed number of synchronization messages have been received. Thus a better precision of the new clock reading is only probable but not guaranteed.
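A small numeric illustration may help to see why a threshold on the measured bound does not imply safety. All values are hypothetical and are not taken from the cited papers; $E_i$ denotes an assumed measured bound on the precision of the new reading and $E_{\max}$ an assumed configured threshold.

$$
\begin{aligned}
E_{\max} &= 500\,\mu s, \quad E_i = 300\,\mu s \le E_{\max} && \Rightarrow \text{the update is selected},\\
\epsilon_{i-1}(r_i) &= -40\,\mu s, \quad \epsilon_i = -250\,\mu s && \Rightarrow |\epsilon_i| = 250\,\mu s > 40\,\mu s = |\epsilon_{i-1}(r_i)|.
\end{aligned}
$$

The new reading respects its bound, since $|\epsilon_i| \le E_i$, and yet the update violates (3), because the previously started clock happened to be more precise at time $r_i$ than the bound could reveal.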

Definition 6. A selective synchronization algorithm is a clock synchronization algorithm that calculates a candidate initial value $\tilde{c}_i \in \mathbb{R}$ and a decision $select_i \in \{true, false\}$. The new software clock is started with the initial value $c_i$:

$$c_i := \begin{cases} \tilde{c}_i & \text{if } select_i = true \\ c_{i-1}(r_i) & \text{if } select_i = false \end{cases} \qquad (4)$$

We have found a selective clock synchronization algorithm that is safe for all synchronization streams:

Definition 7. The local selection algorithm is a selective synchronization algorithm with:

$$\tilde{c}_i := s_i, \qquad (5)$$
$$select_i := (i = 1) \vee (s_i > c_{i-1}(r_i)), \qquad (6)$$
$$c_i(t) := c_i + (1 - \rho^h_{\max})(h(t) - h_i). \qquad (7)$$

The precision of the candidate initial value is defined as $\tilde{\epsilon}_i \stackrel{\mathrm{def}}{=} \tilde{c}_i - r_i$. In the case of the local selection algorithm, it is $\tilde{\epsilon}_i = s_i - r_i = -d_i$, which is always negative. The algorithm selects a candidate initial value only if it is ahead of the current software clock. Equations (5) and (6) correspond in essence to Lamport's algorithm [8]. In addition, the local selection algorithm assures by (7) that the drift of the software clock $c_i(t)$ is always negative.

Proposition 1. The local selection algorithm is safe.

Proof. We have to show that if a candidate initial value is selected, it does not degrade precision, i.e. $select_i \Rightarrow |\tilde{\epsilon}_i| \le |\epsilon_{i-1}(r_i)|$. From (6) we get $select_i \Rightarrow s_i > c_{i-1}(r_i)$. Therefore, $s_i - r_i > c_{i-1}(r_i) - r_i$ and $\tilde{\epsilon}_i > \epsilon_{i-1}(r_i)$. Since $0 > \tilde{\epsilon}_i > \epsilon_{i-1}(r_i)$, we get $|\tilde{\epsilon}_i| \le |\epsilon_{i-1}(r_i)|$, which is what we wanted to prove. Since the proof did not require (7), Lamport's algorithm is also safe.

3.2 Optimal selective clock synchronization

There also exists a trivial algorithm that is safe: Start all new software clocks with the current reading of the previously started software clock, $c_i = c_{i-1}(r_i)$. Though safe, this is obviously not a good clock synchronization algorithm. Therefore we complement the safety property by a second property that a synchronization algorithm should fulfill:

Definition 8. Optimal selective. A selective synchronization algorithm is optimal selective if in all executions, it starts a software clock with an initial precision that is equal to or better than the precision of the candidate initial value:

$$\forall i \in \mathbb{N}_{\ge 1} : \; |\epsilon_i| \le |\tilde{\epsilon}_i|. \qquad (8)$$

The property holds if an algorithm never fails to improve the precision when it has the chance to do so. The property does not hold for [3, 1, 9] because updates are selected on the basis of the precision bounds and not the precision itself. Different interpretations are possible for [2]: The property is either not defined, if the algorithm is executed only after the required number of messages have been received, or it does not hold, if the algorithm is executed upon every message arrival and only selects an update in fixed intervals.

Proposition 2. The local selection algorithm is optimal selective.

Proof. We have to show that all candidate initial values that have not been selected would have degraded precision, i.e. $\neg select_i \Rightarrow |\tilde{\epsilon}_i| \ge |\epsilon_{i-1}(r_i)|$. From (6) we get $\neg select_i \Rightarrow \tilde{\epsilon}_i \le \epsilon_{i-1}(r_i)$. To conclude our proof we have to show that $\epsilon_{i-1}(r_i)$ is negative. Since $\epsilon_{i-1}(r_i) = \epsilon_{i-1} + \int_{r_{i-1}}^{r_i} \rho_{i-1}(t)\,dt$ and $\forall t \ge r_{i-1} : \rho_{i-1}(t) = \rho^h(t) - \rho^h_{\max} < 0$, we get $\epsilon_{i-1}(r_i) < \epsilon_{i-1}$. By (5), (6) and (4), we know that $\epsilon_{i-1} = \max(\tilde{c}_{i-1}, c_{i-2}(r_{i-1})) - r_{i-1}$ and thus $\epsilon_{i-1} < \max(0, \epsilon_{i-2})$. Together with the anchor $\epsilon_1 = \tilde{\epsilon}_1 < 0$, this proves by induction that $\forall i \in \mathbb{N}_{\ge 1} : \epsilon_i < 0$.

This algorithm always makes the right decision even though it knows neither the precision of the candidate initial value $\tilde{\epsilon}_i$ nor that of the current value of the previously started software clock $\epsilon_{i-1}(r_i)$. This is possible because the software clock and the candidate initial value are both always behind reference time. It then suffices to select only those candidate initial values that are larger, and therefore closer to reference time, than the current reading of the previously started clock. The algorithm of Lamport [8] is not optimal selective: if the consumer clock is ahead of reference time and its drift is positive, the algorithm will never select an update again and the precision degrades forever.

The drawback of the local selection algorithm is that in three out of four cases, the absolute value of the software clock drift is larger than that of the hardware clock drift, assuming that $\rho^h(t)$ is uniformly distributed in the interval $[-\rho^h_{\max}, \rho^h_{\max}]$. This is because $|\rho^h(t) - \rho^h_{\max}| > |\rho^h(t)|$ holds whenever $\rho^h(t) < \rho^h_{\max}/2$.

4 Drift compensation

In this section we extend the local selection algorithm presented in the previous section with a drift compensation mechanism.

Definition 9. The local selection algorithm with drift compensation is a selective synchronization algorithm that uses the same expressions as the local selection algorithm for the candidate initial value $\tilde{c}_i$ (5) and the decision $select_i$ (6).
Additionally, it calculates the drift compensation term $\beta_i$, using the parameter $b$. A modified software clock $c'_i(t)$ is started at time $t = r_i$:

$$\beta_i := \begin{cases} -\rho^h_{\max} & \text{if } i = 1 \\ \max\{\beta_{i-1} - \delta^h_{\max}(h_i - h_{i-1}),\; \beta_{i,j} \mid j \in \mathbb{N}_{\ge 1},\, j < i\} & \text{otherwise} \end{cases} \qquad (9)$$

$$\beta_{i,j} := \beta_j + \frac{\tilde{c}_i - c'_j(r_i)}{h_i - h_j} - \frac{b}{h_i - h_j} - 2\delta^h_{\max}(h_i - h_j), \qquad (10)$$

$$c'_i(t) := c'_i + \left(1 + \max\left(\beta_i - \tfrac{1}{2}\delta^h_{\max}(h(t) - h_i),\; -\rho^h_{\max}\right)\right)(h(t) - h_i). \qquad (11)$$

Without looking at the drift compensation mechanism, we can state the following.

Proposition 3. The local selection algorithm with drift compensation is safe.

The proof of proposition 1 applies directly to the algorithm with drift compensation, since both algorithms use the same $\tilde{c}_i$ and $select_i$. But what about optimal selectivity? The proof of proposition 2 required that the current reading of the previously started clock has a negative precision. This is guaranteed if the drift of all software clocks is always negative. The drift of the modified software clock is

$$\rho'_i(t) = \rho^h(t) + \max\left(\beta_i - \delta^h_{\max}(h(t) - h_i),\; -\rho^h_{\max}\right), \qquad (12)$$

and by assumption (2) smaller than its initial value, i.e. $\rho'_i(t) < \rho'_i$. If it can be shown that the initial drift $\rho'_i$ is always negative, then the local selection algorithm with drift compensation is optimal selective.

After the first execution with $\beta_1 = -\rho^h_{\max}$, the modified software clock is the same as that from (7). In later executions, the drift compensation term $\beta_i$ is set to the maximum of all potential drift compensation terms $\beta_{i,j}$, but it is never set below the current drift compensation of the previously started clock, $\beta_{i-1} - \delta^h_{\max}(h_i - h_{i-1})$. The potential drift compensation terms $\beta_{i,j}$ are computed based on the comparison of the new candidate initial value $\tilde{c}_i$ with the current reading of a previously started software clock $c'_j(r_i)$, $j < i$. Compared to $\beta_j$, the drift compensation term $\beta_{i,j}$ can be increased by $(\tilde{c}_i - c'_j(r_i))/(h_i - h_j)$, but is decreased by $b/(h_i - h_j) + 2\delta^h_{\max}(h_i - h_j)$. The parameter $b$ is a constant, therefore it decreases $\beta_{i,j}$ most for software clocks $c'_j(t)$ that have been started only recently (small $h_i - h_j$). The term $2\delta^h_{\max}(h_i - h_j)$ has the opposite effect: Clocks that have been started a long time ago cannot be used for drift compensation. Therefore, a real implementation of the algorithm need not evaluate all possible $\beta_{i,j}$.

Proposition 4. If $|\epsilon'_j| < b$, then the local selection algorithm with drift compensation is optimal selective.

Proof. We have to show that the drift of a newly started software clock $c'_i(t)$ is negative, i.e. $\rho'_i < 0$, assuming that drift compensation is based on the previously started software clock $c'_j(t)$, i.e. $\beta_i = \beta_{i,j}$. By (12) we get $\epsilon'_j(r_i) > \epsilon'_j + \rho'_j(h_i - h_j) - \delta^h_{\max}(h_i - h_j)^2$. Since $\tilde{\epsilon}_i < 0$, we also have $\epsilon'_j(r_i) < c'_j(r_i) - \tilde{c}_i$. Using $\rho^h_j = \rho'_j - \beta_j$ and $\rho^h_i < \rho^h_j + \delta^h_{\max}(h_i - h_j)$, we derive $\rho^h_i < -\beta_j - (\tilde{c}_i - c'_j(r_i) + \epsilon'_j)/(h_i - h_j) + 2\delta^h_{\max}(h_i - h_j)$. Introducing $|\epsilon'_j| < b$, we get $\rho^h_i < -\beta_j - (\tilde{c}_i - c'_j(r_i) - b)/(h_i - h_j) + 2\delta^h_{\max}(h_i - h_j)$ and finally, with (10), $\rho'_i = \rho^h_i + \beta_i < 0$.

Clearly, no finite $b$ can guarantee $|\epsilon'_j| < b$ in our asynchronous communication model. A large $b$ assures a high probability of optimal selectivity but makes drift compensation slow. A small $b$ provides a fast drift compensation that sometimes computes software clocks with a positive initial drift $\rho'_i > 0$. The algorithm is then no longer optimal selective: Future candidate initial values may have a better precision than $c'_i(t)$, but are not selected, because $c'_i(t)$ is ahead of reference time.
Since the drift of all software clocks $c'_i(t)$ always decreases and eventually becomes negative, the precision of this software clock also becomes negative at some time in the future. After this moment, all candidate initial values that improve the precision of the software clock are selected again.

5 Experimental results

We have evaluated the local selection algorithms through Matlab simulations. The hardware clock drift and the synchronization message delays have been measured and recorded on a real system consisting of two standard Linux PCs and an 802.11b wireless LAN in ad-hoc mode. The timestamp counter register (TSC) of the producer PC served as the reference time source. The TSC of the consumer PC served as the hardware clock. The timestamps $s_i$ and $h_i$ were recorded within the wireless LAN driver. Additional measures had to be taken to measure $r_i$: Externally generated impulses fed into the parallel ports generate simultaneous interrupts on both PCs. Pairs of timestamps recorded in the corresponding interrupt service routines allow interpolating the precision of the hardware clock at the time of receiving synchronization messages. Thus, the receive times $r_i$ can be derived.

We recorded two streams of 15 synchronization messages each, received in ca. 5 minutes, i.e. on average one synchronization message is received every 2 ms. The first scenario is that of a completely empty communication medium. In the second scenario, the medium is shared with two additional stations that do not participate in clock synchronization, but periodically exchange large files via ftp. The recorded delays are shown in Fig. 1, left side. The deterministic delay of the synchronization messages has been removed a priori, because it is not the subject of this study. Computation time of the algorithm is neglected. The numerical results of the local selection algorithm without drift compensation are shown in Fig. 1, right side.
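The two algorithms evaluated here can be sketched compactly. The following Python code is a minimal illustration of Definitions 7 and 9 under stated assumptions, not the Matlab simulation used for the results in this section. The names `SoftwareClock`, `LocalSelection` and `simulate` are hypothetical, and so are the constants, the constant hardware drift of 50 ppm and the uniformly distributed synthetic delays, which stand in for the recorded traces. Unlike the simplified implementation evaluated in this paper, the sketch evaluates every $\beta_{i,j}$, which is affordable only for short streams.

```python
import random
from dataclasses import dataclass

# Illustrative constants, assumed for this sketch (not the measured values).
RHO_MAX = 100e-6   # bound on hardware clock drift, rho^h_max
DELTA_MAX = 1e-9   # bound on drift change per second, delta^h_max


@dataclass
class SoftwareClock:
    c0: float    # initial value c_i
    h0: float    # hardware timestamp h_i taken when the clock was started
    beta: float  # drift compensation term beta_i

    def read(self, h: float) -> float:
        """Clock value at hardware time h >= h0, following equation (11)."""
        dh = h - self.h0
        rate = max(self.beta - 0.5 * DELTA_MAX * dh, -RHO_MAX)
        return self.c0 + (1.0 + rate) * dh


class LocalSelection:
    """Local selection algorithm; b=None switches drift compensation off."""

    def __init__(self, b=None):
        self.b = b
        self.clocks = []  # all software clocks started so far

    def on_message(self, s: float, h: float) -> SoftwareClock:
        """Process send time s and local timestamp h; start a new clock."""
        if not self.clocks:
            clock = SoftwareClock(s, h, -RHO_MAX)   # i = 1: always selected
        else:
            prev = self.clocks[-1]
            current = prev.read(h)                  # c_{i-1}(r_i)
            c0 = s if s > current else current      # equations (4) to (6)
            beta = -RHO_MAX                         # plain variant, equation (7)
            if self.b is not None:                  # equations (9) and (10)
                beta = prev.beta - DELTA_MAX * (h - prev.h0)
                for old in self.clocks:
                    dh = h - old.h0
                    beta_ij = (old.beta + (s - old.read(h) - self.b) / dh
                               - 2.0 * DELTA_MAX * dh)
                    beta = max(beta, beta_ij)
            clock = SoftwareClock(c0, h, beta)
        self.clocks.append(clock)
        return clock


def simulate(b=None, n=300, rho_h=50e-6, seed=1):
    """Run on a synthetic stream; return initial precisions and beta terms."""
    rng = random.Random(seed)
    algo = LocalSelection(b)
    precisions, betas = [], []
    for k in range(n):
        s = float(k)                         # one message per second
        r = s + rng.uniform(10e-6, 200e-6)   # synthetic delay, always positive
        h = 5.0 + (1.0 + rho_h) * r          # hardware clock with constant drift
        clock = algo.on_message(s, h)
        precisions.append(clock.c0 - r)      # epsilon_i = c_i - r_i
        betas.append(clock.beta)
    return precisions, betas


plain, _ = simulate(b=None)
compensated, betas = simulate(b=1e-3)
```

In this synthetic setting all delays stay well below the chosen $b$ of 1 ms, so the condition of Proposition 4 holds and every started clock should remain behind reference time in both variants. With a smaller $b$, or with delays that occasionally exceed it, the compensated variant may start clocks with positive drift, as discussed at the end of section 4.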

[Figure 1. Left column, "Synchronization message delays": a), b) delay $d_i$ [ms] over receive time $r_i$; c) CDF($d_i$) over delay $d_i$ [ms]. Right column, "Precision achieved by the local selection algorithm": d), e) precision $\epsilon_i$ over receive time $r_i$; f) CDF($\epsilon_i$) over precision $\epsilon_i$.]

Fig. 1. Left: Delay of messages in an 802.11b ad-hoc network. a) Exclusive access to the medium, no other stations. b) Medium shared with 2 additional stations that periodically exchange large files via ftp. c) Cumulative distribution function of a) and b). The average delay is different, the minimal delay remains constant. Right: Precision of the local selection algorithm without drift compensation. d) Exclusive medium access. e) Shared medium. f) The algorithm achieves in 90% of the time a precision better than 15µs in the exclusive medium scenario and 4µs in the shared medium scenario.

In the implementation of the local selection algorithm with drift compensation, equation (9) has been simplified to $\beta_i := \max\{\beta_{i-1} - \tfrac{1}{2}\delta^h_{\max}(h_i - h_{i-1}),\; \beta_{i,i-5}\}$, which reduces computation time (only $\beta_{i,i-5}$ has to be evaluated) and memory requirements (5 times the space required to store $c_j$, $h_j$, $\beta_j$). The parameter $b$ has been set to 4µs. Fig. 2, left side, shows the numerical results. Drift compensation improves the achieved precision, especially in the shared medium scenario. Fig. 2, right side, shows hardware and software clock drift. Drift compensation removes unknown and variable hardware clock drift.

6 Conclusion

We found algorithms that never actively degrade precision and never fail to improve precision if there is a chance to do so. The precision achieved by these algorithms can only improve when additional synchronization messages are received. This is not the case for the probabilistic algorithms [3, 1, 9] and [2].
While the safety property also applies to Lamport's algorithm [8], only the local selection algorithms are optimal selective. Experimental results show that the local selection algorithm with drift compensation can achieve synchronization in the range of 1µs, not counting the deterministic delay, on a standard wireless LAN in ad-hoc mode, requiring less than one received synchronization message every 2ms. These parameters match the requirements of high-quality audio distribution and the capabilities of commonly available technology (wireless LAN, Linux PCs). The results are comparable to those of [5]. While our algorithm seems to require more communication, it can also be used when no broadcast medium is available.

Schreiber and Sigg describe in their master's thesis [14] an implementation of the local selection algorithms on Linux PCs that achieves a precision of 4µs. The discrepancy to the 1µs achieved in simulation is mainly due to the simple deterministic delay elimination mechanism used. Deterministic delay elimination is not possible without communication in the inverse direction, i.e. from the consumer to the producer. It is straightforward to combine the local selection algorithms with a mechanism that progressively removes the deterministic delay by exchanging messages with the synchronization source in a client-server style, rather than the producer-consumer pattern employed by the local selection algorithms. It remains to be studied how this combination can be made most efficient in terms of total bandwidth consumption and local memory and computation requirements.

[Figure 2. Left column, "Precision achieved by the local selection algorithm with drift compensation": a), b) precision $\epsilon_i$ over receive time $r_i$; c) CDF($\epsilon_i$) over precision $\epsilon_i$. Right column, "Clock drift in the shared medium scenario": d) drift [ppm] of $\rho^h_i$, $\rho_i$ and $\rho'_i$ over receive time $r_i$; e) hardware clock drift [ppm]; f) software clock drift with drift compensation [ppm].]

Fig. 2. Left: Precision of the local selection algorithm with drift compensation. a) Exclusive medium access. b) Shared medium. c) The algorithm achieves in 90% of the time a precision better than 8µs in both scenarios. Right: Drift rate of the hardware clock and the two synchronized clocks, all for the shared medium scenario. d) The drift of the software clock without drift compensation $\rho(t)$ is equal to the hardware clock drift $\rho^h(t)$ minus the maximal hardware clock drift $\rho^h_{\max}$ = 1ppm. The drift of the software clock with drift compensation $\rho'(t)$ is initially equal to that of the software clock without drift compensation and strives towards zero. e) Close-up of $\rho^h(t)$. f) Close-up of $\rho'(t)$, considerably more stable than $\rho^h(t)$.

References

1. Gianluigi Alari and Augusto Ciuffoletti. Implementing a probabilistic clock synchronization algorithm. Real Time Systems, 13(1):25-46, 1997.
2. K. Arvind. Probabilistic clock synchronization in distributed systems. IEEE Transactions on Parallel and Distributed Systems, 5(5):474-487, May 1994.
3. Flaviu Cristian. Probabilistic clock synchronization. Journal of Distributed Computing, 3:146-158, 1989.
4. Danny Dolev, Rüdiger Reischuk, Ray Strong, and Ed Wimmers. A decentralized high performance time service architecture. Technical Report 95/26, Institute for Computer Science, University of Lübeck, November 1995.
5. Jeremy Elson, Lewis Girod, and Deborah Estrin. Fine grained network time synchronization using reference broadcasts. Technical Report 020008, Laboratory for Embedded Collaborative Systems LECS, UCLA, May 2002.
6. Lewis Girod, Vladimir Bychkovskiy, Jeremy Elson, and Deborah Estrin. Locating tiny sensors in time and space: A case study. In International Conference on Computer Design ICCD, September 2002.
7. Joseph Y. Halpern, Nimrod Megiddo, and Ashfaq A. Munshi. Optimal precision in the presence of uncertainty. Journal of Complexity, 1(2):170-196, 1985.
8. Leslie Lamport. Time, clocks and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, July 1978.
9. Cheng Liao, Margaret Martonosi, and Douglas W. Clark. Experience with an adaptive globally-synchronizing clock algorithm. In ACM Symposium on Parallel Algorithms and Architectures, pages 106-114, 1999.
10. Jennifer Lundelius and Nancy Lynch. An upper and lower bound for clock synchronization. Information and Control, 62(2/3):190-204, August/September 1984.
11. Rafail Ostrovsky and Boaz Patt-Shamir. Optimal and efficient clock synchronization under drifting clocks. In Symposium on Principles of Distributed Computing, pages 3-12, 1999.
12. Boaz Patt-Shamir and Sergio Rajsbaum. A theory of clock synchronization. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing, Montreal, Canada, pages 810-819, May 1994.
13. F. Schmuck and F. Cristian. Continuous clock amortization need not affect the precision of a clock synchronization algorithm. In Proceedings of the Ninth ACM Symposium on Principles of Distributed Computing, pages 133-143, 1990.
14. Eric Schreiber and Daniel Sigg. Clock synchronization for wireless LAN. Master's thesis, ETH Zürich, Department of Information Technology and Electrical Engineering, Institute TIK, Gloriastrasse 35, ETH Zentrum, 8092 Zürich, July 2002. (www.tik.ee.ethz.ch/~blum).