The Controlled Logical Clock: a Global Time for Trace Based Software Monitoring of Parallel Applications in Workstation Clusters




Copyright 1996 IEEE. Copies may not be used in any way that implies IEEE endorsement of a product or service of an employer. Copies may not be offered for sale.

Rolf Rabenseifner
Computing Center, University of Stuttgart, Allmandring 30, D-70550 Stuttgart, Germany, Tel. ++49 7 685553, e-mail: rabenseifner@rus.uni-stuttgart.de

Abstract

Event tracing and monitoring of parallel applications are difficult if each processor has its own unsynchronized clock. A survey is given of several strategies to generate a global time, and their limits are discussed. The controlled logical clock is a new method based on Lamport's logical clock and provides a method to modify inexact timestamps of tracefiles. The new timestamps guarantee the clock condition, i.e. that the receive event of a message has a later timestamp than the send event. The corrected timestamps can also be used for performance measurements with pairs of events in different processes. The controlled logical clock is motivated, and it is analyzed in detail by computer simulations. No additional protocol overhead is needed for the new method while tracing an application. It can be implemented as a filter for tracefiles or it can be integrated into monitor tools for parallel applications.

Keywords: Logical time, global time, clock, monitoring, clock synchronization, causality, distributed computing, parallel computing.

1. Introduction

With global time a correct representation of the time sequence of tracing information is possible. Parallel and distributed applications can be analyzed to obtain an insight into their behavior and their performance. On systems without global time there are different strategies for approximating a global time. This paper presents a new algorithm, the controlled logical clock. It was developed especially for workstation clusters or comparable systems and can be used as a filter for tracefiles.
The processors' clocks used for generating timestamps are synchronized subsequently if the tracefile contains backward references. Backward references are messages with a timestamp of the send event that is later than the timestamp of the receive event. The controlled logical clock modifies the timestamps to fulfill the clock condition [8], i.e. the send event always has an earlier timestamp than the corresponding receive event. Besides the generation of the timestamps and their collection in the monitor tool there is no additional protocol effort. An exchange of the timestamps along with the messages is not necessary because those tools have complete insight into the trace information of the application.

The controlled logical clock requires clock differences to be limited to about 2 ms by a previous synchronization. This can be achieved by an explicit synchronization before and after the run of the application [2] and a linear interpolation as done by several monitor tools, or it can be done by a resource-saving continuous synchronization, e.g. with xntp [22]. The controlled logical clock does not modify the timestamps of the original clocks $C_i$ as long as the timestamp of a receive event is later than the timestamp of the corresponding send event plus the minimum message delay.

The controlled logical clock is a modification of Lamport's piecewise differentiable logical clock $LC$ with $dLC_i(t)/dt := dC_i(t)/dt$ as described also in [6]. As opposed to that clock, the time difference $LC' - C$ of the controlled logical clock is limited. It is well suited for visualizing the causal order. It can also be used for averaged time difference measurements between events in different processes. For such measurements within one process the original clock is more appropriate. But the timestamps of the controlled logical clock can be used also because, in general, its deviation is less than 5 %. The small deviation matters because most monitor tools do not support more than one time scale.

2. Related Works

For monitoring and performance analysis of parallel and distributed applications there are different approaches to synchronizing local clocks. Lamport's discrete logical clock [8] can be used directly for monitoring [4, 3]. In addition to this Raynal [27] proposes an algorithm to prevent the drift between the logical clocks. The vector clock, an enhancement from Fidge [, 2] and Mattern [2], allows an equivalent representation of the causal order given by the send/receive event pairs. It is used in some monitor tools [5,, 9]. In [3] global events are introduced and in [25] spontaneous events (e.g. collisions on a network) are taken into account. In the summary in [29] the limits of the logical clock and the vector clock are illustrated.

[Figure 1. Global waits visualized in real time and in discrete logical time]

A further limit of the usability of the logical clock is shown in Fig. 1. The example has two global waits. Process C waits until a message can be received from A or B. The time axis is dotted during periods of waiting. In the left picture the example is visualized by using a real time axis. In the right picture the logical time is used as time scale. This representation is only of limited value because it seems that process A has sent its message at the same logical time at which the global wait of process C started, yet this message is not taken into account in that global wait. The wait is not finished until process B sends its message at t = 3. This example shows that the logical clock is not sufficient for visualizing monitor information.

A representation without backward references can be achieved alternatively with a sufficiently exact synchronization of the local clocks [5, 8, 2]. Often used methods are an exact synchronization and drift estimation before the beginning of the application [8, 9] or, better, before and after the application [2] with a linear interpolation while the application is running. The exact synchronization can be done by deterministic [28] or probabilistic [2, 3] methods. Continuous synchronization with little resource usage (e.g. xntp [22]) is normally not sufficiently exact due to the large jitter of the message delays in local area networks.
The trace based synchronization is another alternative. The differences between the clocks are computed by the rule that a receive event must not arrive before the send event plus the minimum transfer time. Duda [7] has developed two algorithms, one with a regression analysis, and another with a convex hull. Using a minimal spanning tree algorithm Jezequel [6] has adapted Duda's algorithm to arbitrary processor topologies. Hofmann [4] has improved Duda's algorithm by using a simple minimum/maximum strategy, and he has proposed dividing the execution time into several intervals to compensate for different clock drifts in long running applications. Babaoğlu and Drummond [, 6] show that an almost no cost clock synchronization is possible if the application makes a full message exchange between all processors in sufficiently short intervals. A survey of further work can be found in [3]. The limits of these methods are given by the message delay jitter, by the non-linear relation of message delay and message length, and by a one-sided communication topology in some applications (e.g. producer/consumer scenarios).

The controlled logical clock is a novel development based on both of Lamport's logical clocks, the discrete case and that with $dLC_i(t)/dt := dC_i(t)/dt$. It also contains components of the trace based synchronization because it does not need any further protocol overhead if it is used in monitor and debug tools. It implements a subsequent synchronization based on existing timestamps and their communication relations. In contrast to the trace based synchronization, the controlled logical clock can also be used in systems with clock ticks longer than the minimum message delay. But the controlled logical clock needs a preceding synchronization that limits the differences between the original clocks to the fourfold of the minimum message delay.

3. The Clock Condition

For monitoring, clocks are needed which are sufficiently exact for performance measurements and which hold the clock condition defined below [8]. The clock condition must hold to enable the visualization of a program in a spacetime diagram.
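The rule used above, that a receive event must not be timestamped earlier than its send event plus the minimum transfer time, can be checked message by message. The following is a minimal sketch, assuming each traced message is available as a pair of timestamps; the `Message` type and the `backward_references` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One send/receive pair of a trace (hypothetical record layout)."""
    send_ts: float  # timestamp of the send event, taken from the sender's clock
    recv_ts: float  # timestamp of the receive event, taken from the receiver's clock


def backward_references(messages, min_delay=0.0):
    """Return the messages whose receive timestamp is earlier than
    the send timestamp plus min_delay.

    With min_delay = 0 this selects messages whose send timestamp is
    later than the receive timestamp (backward references).
    """
    return [m for m in messages if m.recv_ts < m.send_ts + min_delay]


if __name__ == "__main__":
    trace = [Message(1.0, 1.5), Message(2.0, 1.9), Message(3.0, 3.05)]
    print(backward_references(trace))        # strict backward references
    print(backward_references(trace, 0.1))   # also too-short transfer times
```

A trace for which this list is empty with `min_delay` set to the minimum message delay would not need any timestamp correction for its message pairs.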
Definition 1. $n$ is the number of processes, $e_i^j$ is the $j$-th event in process $i$, $E = \{ e_i^j \mid i = 1..n;\ j = 1..j_{max}(i) \}$ is the set of events, and $M = \{ (e_k^l, e_i^j) \mid e_k^l \text{ is a send event, and } e_i^j \text{ is its corresponding receive event} \}$ is the set of send/receive pairs. $e_i^j$ is an internal event if it is not a send event and not a receive event.

For two events $e_k^l$, $e_i^j$ the relation "$e_k^l$ happened directly before $e_i^j$", shortened by $e_k^l \tilde{\to} e_i^j$, holds if and only if

(a) $(e_k^l, e_i^j) \in M$, or (1)

(b) the events are in the same process and they succeed one another, i.e. $k = i \wedge l = j - 1$. (2)

The relation "happened before", shortened by $\to$, is the transitive hull of the relation $\tilde{\to}$, i.e. the smallest relation that additionally satisfies

(c) $e_k^l \to e_i^j \wedge e_i^j \to e_n^m \implies e_k^l \to e_n^m$ (3)

Definition 2. A clock $C: E \to \mathbb{R}$ satisfies the Clock Condition if and only if

$$\forall\, e_k^l \in E,\ e_i^j \in E:\quad e_k^l \to e_i^j \implies C(e_k^l) < C(e_i^j) \qquad (4)$$

Neglecting different clock drifts the following theorem can be simply shown:

Theorem 1. If the differences between the processors' clocks are constantly less than the minimum message delay then the clock condition is held by the processors' clocks.

A more precise theorem for theoretical continuous clocks with limited drifts can be found in [7, 2]. In [5] real clocks with discrete clock ticks are analyzed. This paper examines the case that the clock differences are in general longer than the minimum message delay, but limited by about the fourfold of the minimum message delay. This limit can be achieved with low CPU and network costs by usual software synchronization tools. In this case the premise of Theorem 1 does not hold.

4. The Simple Logical Clock

The simple logical clock is an enhancement of Lamport's logical clock. The name "simple" is chosen to distinguish clearly between it and the controlled logical clock that is itself based on the simple one. First a weak synchronization must be done. There are different methods to achieve clock errors less than about the fourfold of the minimum message delay, e.g. SBA (sampling before and after the application's run with a linear balancing in the meantime [2]) or a low cost clock synchronization by software (timeslave, timed, xntp). Because these methods need not synchronize with the wall clock time (UTC), the synchronization is modeled with:

Definition 3. $t$ is the wall clock time, $T(t)$ is the global time to which the process clocks $C_i(t)$ ($i = 1..n$) are synchronized with limited errors, i.e. constants $e^-$ and $e^+$ exist with $-e^- \le C_i(t) - T(t) \le e^+$.
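The SBA method mentioned above (sampling the clock offset before and after the run and balancing linearly in between) can be sketched as follows. This is only an illustration under stated assumptions: the offset of a local clock against a reference clock is assumed to have been measured once at the start and once at the end of the run, and the function name `sba_correct` and its parameters are hypothetical.

```python
def sba_correct(local_ts, t_start, offset_start, t_end, offset_end):
    """Correct a local timestamp by linear interpolation of the clock offset.

    offset_start and offset_end are the measured differences
    (local clock minus reference clock) at the local times t_start and
    t_end. Between the two samples the offset is assumed to change
    linearly, i.e. the clock drift is assumed to be constant.
    """
    if t_end == t_start:
        raise ValueError("the two synchronization samples must differ in time")
    fraction = (local_ts - t_start) / (t_end - t_start)
    offset = offset_start + fraction * (offset_end - offset_start)
    return local_ts - offset


if __name__ == "__main__":
    # Offset grows from 2 to 4 time units during the run (illustrative values).
    for ts in (100, 150, 200):
        print(ts, "->", sba_correct(ts, 100, 2, 200, 4))
```

The corrected clocks obtained this way play the role of the $C_i$ in Definition 3; the remaining errors are what the logical clocks of the following sections have to cope with.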
Now for each process $i$ the logical clock $LC_i$ will be defined: it nearly stops while $LC_i > C_i$ and else it equals $C_i$; at each receive event it is set to the maximum of its current value and of the sender's clock at the time of sending plus the minimum transfer delay. The minimum transfer delay should be estimated as a byproduct of the synchronization at the beginning or at the end. This logical clock is a modification of Lamport's logical clock. In the following it is named the simple logical clock. In the next section it will be enhanced to the controlled logical clock.

Algorithm 1. The simple logical clock $LC_i$ is exactly defined with

$$LC_i(e_i^j) := \begin{cases} \max\big(LC_k(e_k^l) + \mu_{k,i},\ LC_i(e_i^{j-1}) + \delta_i,\ C_i(t(e_i^j))\big) & \text{if } \exists\, e_k^l:\ (e_k^l, e_i^j) \in M \quad (5) \\ \max\big(LC_i(e_i^{j-1}) + \delta_i,\ C_i(t(e_i^j))\big) & \text{otherwise} \quad (6) \end{cases}$$

and if $j = 1$ then the terms $LC_i(e_i^{j-1}) + \delta_i$ must be omitted, with

$\delta_i$ = minimal difference between two events in process $i$, i.e. $\delta_i$ are constants with
$$\delta_i > 0 \ \wedge\ \forall\, i, j:\ T(t(e_i^j)) - T(t(e_i^{j-1})) \ge \delta_i \qquad (7)$$

$\mu_{k,i}$ = minimum message delay of messages from process $k$ to process $i$, i.e. $\mu_{k,i}$ are constants with
$$\mu_{k,i} > 0 \ \wedge\ \forall\, (e_k^l, e_i^j) \in M:\ T(t(e_i^j)) - T(t(e_k^l)) \ge \mu_{k,i} \qquad (8)$$

The global simple logical clock is then defined as
$$LC(e_i^j) := LC_i(e_i^j) \qquad (9)$$

Theorem 2. The simple logical clock $LC$ satisfies the clock condition.

Proof. The algorithm satisfies Lamport's rules IR1 and IR2 in [8] and therefore holds the clock condition. □

In the following the error will be modeled based on $C_i$: for all processes $i$ and all times $t$ between the begin and the end of the traced run (which contains all events $e_i^j$),
$$-e^- \le -e_i^- \le C_i(t) - T(t) \le e_i^+ \le e^+ \qquad (10)$$
i.e. the corrected clocks $C_i$ are at maximum $e^+$ fast and at maximum $e^-$ slow in comparison with the global time $T$.

Theorem 3. If all clocks $C_i$ are not more than $e^+$ fast then the simple logical clock $LC$ is also not more than $e^+$ fast, i.e.
$$\forall\, e_i^j:\ C_i(t(e_i^j)) - T(t(e_i^j)) \le e^+ \implies \forall\, e_i^j:\ LC(e_i^j) - T(t(e_i^j)) \le e^+$$

Proof. Assume there is an $(i, j)$ with
$$LC(e_i^j) - T(t(e_i^j)) > e^+ \qquad (11)$$
and without loss of generality $e_i^j$ may be the earliest event satisfying (11). (12)

Case a) $LC(e_i^j) = LC_i(e_i^j)$ is defined with (5). Then, by (9) and (5),
$$LC(e_i^j) - T(t(e_i^j)) = \max\big(LC_k(e_k^l) + \mu_{k,i},\ LC_i(e_i^{j-1}) + \delta_i,\ C_i(t(e_i^j))\big) - T(t(e_i^j))$$
and

(i) if in the maximum in (5) the first term is valid: $= LC_k(e_k^l) + \mu_{k,i} - T(t(e_i^j)) \le LC_k(e_k^l) - T(t(e_k^l))$ by (8), which is $\le e^+$ by (12);

(ii) if in the maximum in (5) the second term is valid: $= LC_i(e_i^{j-1}) + \delta_i - T(t(e_i^j)) \le LC_i(e_i^{j-1}) - T(t(e_i^{j-1}))$ by (7), which is $\le e^+$ by (12);

(iii) if in the maximum in (5) the third term is valid: $= C_i(t(e_i^j)) - T(t(e_i^j)) \le e^+$ by the premise.

Therefore in all three cases (i) to (iii) there is a contradiction to the assumption (11).

Case b) $LC(e_i^j) = LC_i(e_i^j)$ is defined with (6). Then, analogously to the cases a)(ii) and a)(iii), the contradiction can be shown.

Therefore in all possible cases a contradiction to the assumption (11) can be shown and therefore the theorem is proved. □

Theorem 4. If in a process $i$ the clock $C_i$ is never more than $e_i^-$ slow then the simple logical clock $LC$ in this process is also never more than $e_i^-$ slow, i.e.
$$\forall\, i:\ \Big(\forall\, j:\ T(t(e_i^j)) - C_i(t(e_i^j)) \le e_i^-\Big) \implies \Big(\forall\, j:\ T(t(e_i^j)) - LC(e_i^j) \le e_i^-\Big)$$

Proof. By (9), (5) and (6), $LC(e_i^j)$ is a maximum whose last term is $C_i(t(e_i^j))$. Hence
$$T(t(e_i^j)) - LC(e_i^j) \le T(t(e_i^j)) - C_i(t(e_i^j)) \le e_i^-$$
where the last step is the premise. □

Observation. There is no symmetry between Theorem 3 and Theorem 4: the being fast of the simple logical clock $LC$ is limited only globally by the maximum of the being fast of all involved clocks, whereas the being slow of $LC$ is limited in each process independently by the being slow of the corresponding $C_i$. A modeling of the errors $e^-$ and $e^+$ for continuous and discrete clocks can be seen in [26].

5. The Controlled Logical Clock

In Algorithm 1 the term $LC_k(e_k^l) + \mu_{k,i}$ in (5) can cause the logical clock $LC_i(e_i^j)$ to be set forward compared to $C_i(t(e_i^j))$. At subsequent events in the same process the term $LC_i(e_i^{j-1}) + \delta_i$ in (6) causes that in principle $LC_i$ stops (i.e. it advances only by the small ticks $\delta_i$) until it has fallen back to the value of $C_i(t(e_i^j))$ (that is the last term in (5) and (6)). Now the simple logical clock from Alg. 1 will be enhanced in two steps to the controlled logical clock in order to run more continuously and to prevent the alternate advancing and stopping.

Algorithm 2. $LC'_i$ is the basis of the controlled logical clock:

$$LC'_i(e_i^j) := \begin{cases} \max\big(LC'_k(e_k^l) + \mu_{k,i},\ LC'_i(e_i^{j-1}) + \delta_i,\ LC'_i(e_i^{j-1}) + \gamma_i^j \,(C_i(t(e_i^j)) - C_i(t(e_i^{j-1}))),\ C_i(t(e_i^j))\big) & \text{if } \exists\, e_k^l:\ (e_k^l, e_i^j) \in M \quad (13) \\ \max\big(LC'_i(e_i^{j-1}) + \delta_i,\ LC'_i(e_i^{j-1}) + \gamma_i^j \,(C_i(t(e_i^j)) - C_i(t(e_i^{j-1}))),\ C_i(t(e_i^j))\big) & \text{otherwise} \quad (14) \end{cases}$$

and if $j = 1$ then the terms containing $LC'_i(e_i^{j-1})$ must be omitted, with $\delta_i$ and $\mu_{k,i}$ defined as in Alg. 1 and with freely selectable $\gamma_i^j \in [0, 1]$. This algorithm will be completed later by the control unit in Alg. 3. The global logical clock is based on the logical clock of the individual processes:
$$LC'(e_i^j) := LC'_i(e_i^j) \qquad (15)$$

Obviously $LC$ of Alg. 1 equals $LC'$ of Alg. 2 for $\gamma_i^j \equiv 0$. For $\gamma_i^j \equiv 1$ the definition of $LC'$ is identical with Lamport's piecewise differentiable logical clock $LC$ with $dLC_i/dt = dC_i/dt$ for the time between receive events.

Theorem 4 (the limitation of being slow) is also valid for $LC'$. The proof is the same. However, Theorem 3 (the limitation of being fast) is not valid for $LC'$. For any large $e$ and any positive lower limit $\gamma_{min}$ it is possible to construct examples that have, for each choice of $\gamma_i^j \ge \gamma_{min}$, an event $e_i^j$ with $LC'(e_i^j)$ being fast by more than $e$, i.e. $LC'(e_i^j) - T(t(e_i^j)) > e$. The technical report [26] shows two examples with unlimited error increase. In most cases, related to the distribution of the clock errors and with normal applications that not only communicate but also compute, one can assume that $\gamma_{min} > 0.5$, i.e. that for values less than 0.5 the logical clock $LC'$ is limited in its being fast.

To limit the being fast of $LC'$ it makes sense to control $\gamma_i^j \in [0, 1]$ by a control loop because normally values near 1 are desirable and also possible. Fig. 2 shows its structure.

[Figure 2. Control loop for $\gamma_i^j$. A logical clock with $\gamma = 0$ takes $C_i(t(e_i^j))$ as input and outputs $LC_i(e_i^j)$; a logical clock with variable $\gamma$ takes $C_i(t(e_i^j))$ and $\gamma_i^j$ as input and outputs $LC'_i(e_i^j)$; the controller takes $LC_i(e_i^j)$, $LC'_i(e_i^j)$ and $C_i(t(e_i^j))$ as input and outputs $\gamma_i^j$.]

The controller tries to limit the differences between $LC'$ and $T$, i.e. it limits the output error $LC' - T$. The controller must estimate the output error indirectly because $T(t(e_i^j))$ is unknown. The input error $C_i - T$ can not be determined for the same reason. Provided that the errors of $C_i$ are limited as in (10),
$$\forall\, e_i^j:\ -e^- \le -e_i^- \le C_i(t(e_i^j)) - T(t(e_i^j)) \le e_i^+ \le e^+$$
Theorem 3 implies
$$\underbrace{LC - C}_{\text{measurable}} \le e^+ + e^-$$
and the error $LC' - T$ is limited by
$$\underbrace{LC' - T}_{\text{to be limited by the controller}} \le e^+ + \underbrace{(LC' - C)}_{\text{measurable}}$$
This is the motivation for the following controller algorithm:

Algorithm 3. For this algorithm the logical clocks in Alg. 1 and 2 are calculated stepwise. In each step $s$ new values for $LC_i(e_i^j)$ and $LC'_i(e_i^j)$ are computed for those processes $i \in P_s$ for which the input values are already determined (i.e. $i \notin P_s$ if $\exists\, (e_k^l, e_i^j) \in M$ and $LC_k(e_k^l)$ and $LC'_k(e_k^l)$ are not computed in a former step). After each step the control variables $\gamma_i^j$ are calculated again for each $i \in P_s$ by the following algorithm:

$$D_{i,0} := q_{init} \qquad (16)$$
$$D'_{i,0} := q_{init} \qquad (17)$$
$$D_{i,s} := \max\big(LC_i(e_i^j) - C_i(t(e_i^j)),\ q_{factor}\,(D_{i,s-1} - q_{min}) + q_{min}\big) \qquad (18)$$
$$D'_{i,s} := \max\big(LC'_i(e_i^j) - C_i(t(e_i^j)),\ q_{factor}\,(D'_{i,s-1} - q_{min}) + q_{min}\big) \qquad (19)$$
$$\gamma_i^1 := \gamma_{max} \qquad (20)$$
$$\gamma_i^{j+1} := \begin{cases} \gamma_i^j \cdot degress & \text{if } D'_{i,s} > l_{upper}\, D_{i,s} \quad (21) \\ \min(\gamma_i^j / degress,\ \gamma_{max}) & \text{if } D'_{i,s} < l_{lower}\, D_{i,s} \quad (22) \\ \gamma_i^j & \text{otherwise} \quad (23) \end{cases}$$

Here $j$ is an abbreviation of $j_{i,s}$, and $j_{i,s}$ is the index of that event of which the logical clock is computed in process $i$ in step $s$.

Explanations: In each step, new logical clock values are computed for as many processes as possible. The proposed control mechanism computes $\gamma_i^{j+1}$ only on the basis of data belonging to process $i$. Therefore the computation of the logical clock values can be partially parallelized and the computation of the controller can be fully parallelized.
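The update rules of Alg. 1, 2 and 3 can be sketched in a few lines. This is a simplified illustration and not the author's implementation: it assumes a single $\delta$ and a single $\mu$ for all processes, a constant $\gamma$ per run instead of the per-event $\gamma_i^j$, and an event list already ordered so that every send precedes its receive. The names `Event`, `logical_clock`, `deviation_update` and `controller_step` are hypothetical, and the numeric values in the demo are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    proc: int                      # process index i
    clock: float                   # original timestamp C_i(t(e))
    send_of: Optional[int] = None  # index of the matching send event, if a receive


def logical_clock(events, delta, mu, gamma=0.0):
    """Compute logical timestamps for an ordered event list.

    gamma = 0 gives the simple logical clock (Alg. 1); 0 < gamma <= 1
    adds the gamma term of Alg. 2 with a constant gamma.
    """
    lc = [0.0] * len(events)
    last = {}  # process index -> index of its previous event
    for idx, ev in enumerate(events):
        candidates = [ev.clock]                       # last term of the maximum
        prev = last.get(ev.proc)
        if prev is not None:                          # omitted for the first event
            candidates.append(lc[prev] + delta)
            candidates.append(lc[prev] + gamma * (ev.clock - events[prev].clock))
        if ev.send_of is not None:                    # receive event: sender term
            candidates.append(lc[ev.send_of] + mu)
        lc[idx] = max(candidates)
        last[ev.proc] = idx
    return lc


def deviation_update(lc_value, clock_value, d_prev, q_factor, q_min):
    """Deviation measure with forgetting, following (18) and (19)."""
    return max(lc_value - clock_value, q_factor * (d_prev - q_min) + q_min)


def controller_step(gamma, d_simple, d_ctrl, gamma_max, degress, l_upper, l_lower):
    """One controller update of gamma, following (21) to (23)."""
    if d_ctrl > l_upper * d_simple:
        return gamma * degress
    if d_ctrl < l_lower * d_simple:
        return min(gamma / degress, gamma_max)
    return gamma


if __name__ == "__main__":
    trace = [
        Event(0, 10.0),             # send in process 0
        Event(1, 5.0),              # internal event in process 1
        Event(1, 8.0, send_of=0),   # receive with a backward reference
        Event(1, 9.0),
        Event(1, 20.0),
    ]
    print(logical_clock(trace, delta=0.5, mu=1.0, gamma=0.0))
    print(logical_clock(trace, delta=0.5, mu=1.0, gamma=1.0))
```

In the demo the receive event is moved forward to the sender's timestamp plus $\mu$ in both runs. With $\gamma = 0$ the clock then advances only by $\delta$ until the original clock has caught up, whereas with $\gamma = 1$ it keeps running at the rate of the original clock and stays ahead, which is the behavior the controller is meant to bound.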
With $q_{init}$, $q_{min}$, $q_{factor}$, $\gamma_{max}$, $degress$, $l_{upper}$ and $l_{lower}$ the controller must be adjusted. Based on the results in Section 6 the following values are recommended for Ethernet or FDDI based clusters: $q_{init}$ = 25s, $q_{min}$ = 25s, $q_{factor}$ = :9, $\gamma_{max}$ = :95, $degress$ = :9, $l_{upper}$ = 2:, $l_{lower}$ = :8.

$D_{i,s}$ is a measurement for the deviation of the simple logical clocks $LC_i$ from the system clocks $C_i$. $D_{i,s}$ computes the maximum of $LC_i - C_i$ while forgetting older values after some time. $q_{factor} \in (0, 1)$ is the forget-factor, see (18). $q_{min}$ in (18) is a lower limit for $D_{i,s}$. It says which clock errors should be tolerated. $q_{min}$ should also be used as start value $q_{init}$ in (16).

$D'_{i,s}$ is defined in the same way as $D_{i,s}$. It measures the deviation of the controlled logical clocks $LC'_i$ from the system clocks $C_i$. $D'_{i,s}$ computes the maximum of $LC'_i - C_i$ (see (17) and (19)).

With (21) the controller tries to limit the deviation $D'_{i,s}$ as soon as it exceeds the upper limit ($l_{upper}\, D_{i,s}$). By the reduction of $\gamma_i^j$ the behavior of the controlled logical clock becomes stepwise more like the behavior of the simple logical clock. $\gamma_i^j$ in (22) is stepwise increased as soon as $D'_{i,s}$ comes below the lower limit ($l_{lower}\, D_{i,s}$). The upper boundary for this increase is $\gamma_{max}$. $\gamma_{max}$ determines how much the controlled logical clock is slowed down to compensate for being fast.

6. Simulation of the Controlled Logical Clock

The efficiency of the controlled logical clock defined by Alg. 2 and 3 was examined by computer simulations. The components of this computer simulation are the following:

(a) the simulated run of a parallel FE-computation on a regular grid produces the set of events $E = \{ e_i^j \}$ with the wall clock timestamps $t(e_i^j)$,
(b) simulated system clocks with a given error behavior are used to generate differences between $C_i(t(e_i^j))$ and $t(e_i^j)$,
(c) the controlled logical clock computes the values of $LC(e_i^j)$ and $LC'(e_i^j)$, and
(d) an analysis module evaluates the controlled logical clock.

This section summarizes the most important results. Details can be found in the technical report [26].

6.1. The Set of Events

The base is a fictitious FE-computation with $n_1 \times n_2$ processes. Each iteration in each process consists of
1. computation of finite elements at the borders to the neighbors,
2. sending the new results to the neighbors,
3. computing the remaining finite elements,
4. and receiving the new values from the neighbors.

[Figure 3. A fictitious parallel computation]

Execution times and message delays were defined randomly between given limits. These limits were varied in different simulations.

6.2. Simulated Clock Errors

$T(t) := t$ can be chosen because the difference between $T$ and $t$ does not influence the outcome of the algorithms. The different clock errors used in the simulations are denoted by pictograms. They plot $C_i(t) - t$ against $t$. Most types are continuous clocks with varying drift rates and a default maximum clock error; one type is a discrete clock with a default tick length.

6.3. The Logical Clock Module

The logical clock module is an implementation of the Algorithms 1, 2 and 3. It needed approximately sizeof(void) n 2 + n bytes of memory for its variables and about max(6:2=n :726 ; 2:85n :9 )s execution time for each event on a R8 processor, i.e. between 5 and s for each event if up to processors are producing the events.

6.4. The Analysis

Many observables were examined but only a few main results will be presented here. One advantage of computer simulations is that observables can be analyzed that are not usually accessible in real experiments. In our experiments this is the global time $t$ and all derived observables such as $LC' - T$.
For the evaluation the following observables of the controlled logical clock are used: $F_{LC'}$, the averaged being fast of $LC'$; $S_{LC'}$, the averaged being slow of $LC'$; and $A_{LC'}$, the averaged absolute deviation of $LC'$. $A_{LC'}$ is the average over all processes of the sum of $|LC' - T|$ for all the time intervals of two succeeding events, related to the whole execution time. All these observables were also examined for the simple logical clock: $F_{LC}$, $S_{LC}$, and $A_{LC}$. Then criteria for the evaluation were defined as:

$F_{LC'} / F_{LC}$ should be less than 2, i.e. the controller should limit the being fast to the double of the being fast of the simple logical clock;

$S_{LC'} / S_{LC}$ should be clearly less than 1, i.e. the controlled logical clock should clearly improve any being slow;

$A_{LC'}$ is a measurement for the usefulness of the controlled logical clock for performance measurements. It should be less than 5 %.

6.5. The Results

Experiments with several clock error profiles show that the results are fairly independent of the duration but strictly dependent on the clock error. Experiments with different drift changing profiles show that the results are independent of these curves. Therefore the following experiment is representative of a lot of other experiments: one of the clocks is constantly fast.

[Figure 4. Average of being fast $F_{LC'}$ and $F_{LC}$ per process; panels: controlled logical clock, simple logical clock]

Figure 5. Average of the absolute deviation, $A_{LC}$, per process for the controlled logical clock and the simple logical clock.

Fig. 4 and Fig. 5 show the absolute results for each process and the averages F and A for both logical clocks. Process No. 8 is fast by a constant amount all the time. Fig. 4 shows that the controlled logical clock is not really faster than the simple logical clock; however, the controlled logical clock is fast in a wider neighborhood of process No. 8. This is a good sign because the controlled logical clock tries to bring all clocks to the fastest one. Fig. 5 shows that the absolute deviation of the controlled logical clock is clearly better than that of the simple logical clock. In the protocol of the experiment it can be seen that only in six processes is $A_{LC}$ of the controlled logical clock greater than 5 % and that the maximum is 3 %, whereas the simple logical clock has values of 4-82 % in three processes. The criteria defined in the last section are fulfilled by the controlled logical clock in this experiment.

Another type of experiment uses clocks that are slow. Here the controlled logical clock can compensate for up to 65 % of the being slow. The absolute deviation of that process is reduced from $A_{LC}$ = 96. % to $A_{LC}$ = 3,2 % ($A_{LC}$ = .7 %), and the critical observables are clearly less than the defined criteria.

Experiments with clock errors of different height show that the clock error should be limited to about four times the minimum message delay if local time differences of about a tenth of the minimum message delay are to be measured with an error of less than 5 % on average over all processes. Varying the controller parameters shows that the controller is very stable.

In the last experiment, clocks with a clock tick in the millisecond range are used. In this case the simple logical clock already provides all benefits. The controlled logical clock does not really improve the results, but it also does not make them worse.

7. Usage of the Controlled Logical Clock

The controlled logical clock is designed for monitoring distributed and parallel applications. It can be used as a filter: a tracefile with backward references, i.e. with timestamps not fulfilling the clock condition, is modified after it has been recorded. The clock condition is then guaranteed, and in general performance measurements are possible with the new timestamps.

The controlled logical clock is designed primarily to work on events sequentially from the beginning. If it is used inside a monitor tool, a problem arises if the tool allows interactive repositioning with subsequent scrolling forward and backward. To solve this, Algorithms 1, 2 and 3 can be started at any position in a tracefile. For scrolling backward one can use the same algorithms, but the sign of the timestamps must be changed and the roles of the send and receive events must be exchanged. There is one disadvantage: the values of the modified timestamps depend on the choice of the start event, i.e. the user sees the same event with (slightly) different timestamp values.

By assigning two timestamps to each event the monitor can compensate for this disadvantage. The two values are: (a) the time of the system clocks, normally corrected by algorithms fulfilling Def. 3, which can be used to measure time differences inside a process or to refer to an event; and (b) the time of the controlled logical clock, which can be used to visualize the events in space-time diagrams and to measure time differences between two processes, and whose values depend on the start event chosen after the last repositioning.

It is recommended that an estimate of the minimum transfer delay be obtained as a byproduct of the synchronization algorithm in use. The distribution of the message delays has a significant region below the most frequent delay values. Our measurements in an Ethernet and an FDDI ring indicate that in general the minimum message delay is larger than a quarter of the most frequent round trip time of an empty remote procedure call.
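The rule for scrolling backward given in this section (change the sign of the timestamps and exchange the roles of send and receive events) can be illustrated with a small sketch. It does not implement Algorithms 1, 2 and 3; it only shows that the mirrored trace is again an ordinary forward trace in which the clock condition holds for exactly the same messages. The trace layout, the `Event` type, and the functions `mirror` and `violations` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    time: float                # timestamp of the event
    kind: str                  # "send", "recv" or "local"
    msg: Optional[int] = None  # message identifier for send/recv events


SWAP = {"send": "recv", "recv": "send", "local": "local"}


def mirror(trace: Dict[int, List[Event]]) -> Dict[int, List[Event]]:
    """Reverse each process, negate timestamps and swap send/receive."""
    return {p: [replace(e, time=-e.time, kind=SWAP[e.kind])
                for e in reversed(events)]
            for p, events in trace.items()}


def violations(trace: Dict[int, List[Event]], min_delay: float) -> List[int]:
    """Messages whose receive is earlier than send + min_delay."""
    sent: Dict[int, float] = {}
    received: Dict[int, float] = {}
    for events in trace.values():
        for e in events:
            if e.kind == "send":
                sent[e.msg] = e.time
            elif e.kind == "recv":
                received[e.msg] = e.time
    return sorted(m for m in sent
                  if m in received and received[m] < sent[m] + min_delay)


# Message 1 is received (0.5) before it was sent (1.0): a backward reference.
trace = {
    0: [Event(1.0, "send", 1), Event(5.0, "recv", 2)],
    1: [Event(0.5, "recv", 1), Event(3.0, "send", 2)],
}
backward = mirror(trace)
```

Because $t_r \ge t_s + \mu$ is equivalent to $-t_s \ge -t_r + \mu$, any correction procedure written for forward traversal can be applied to `backward` unchanged, and mirroring its result again yields timestamps on the original time axis.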
Additional application areas are possible because Alg. 1, 2 and 3 can be integrated into parallel applications in the same way as Lamport's logical clock. For this, only the two already computed timestamps $LC_k(e_k^l)$ of the send event must additionally be transferred inside each message $(e_k^l, e_i^j) \in M$. In this way all data required to compute Alg. 1 and 2 is available in each process. The control operation can also be done locally, in which case the step index s must be replaced by the event index j.

8. Summary

The controlled logical clock is a method of correcting the timestamps of a parallel application. It guarantees that the corrected timestamps fulfill the clock condition. In addition, the being fast of the new timestamps is bounded and the being slow is reduced. The new timestamps are suited for performance measurements and for event visualization in space-time diagrams.

Previously synchronized clocks with a maximum difference of about two milliseconds are necessary. In workstation clusters such synchronization methods are usually already installed for other purposes. The controlled logical clock is insensitive to a drift jitter of a few percent, which is sometimes used for synchronizing the system clocks. In combination with such synchronization methods, an additional synchronization before and after the sampling can be dropped. Mainly in the case of long execution times the controlled logical clock is better than other methods, because those methods assume only a small variance in the speed of the system clocks.

9. Future Work

First tests of the controlled logical clock in practice, as a filter for VAMPIR/PARvis [24] monitor tracefiles, have been carried out. They show that the reduction rule (2) can be weakened. The principle of the current controller limits the value of the controller parameter to below 1, in practice to 0.95. For this reason the controlled logical clock often falls back to the value of the system clock C. It is planned to lift this limit, or at least to relax it, with a modified controller, so that the controlled logical clock LC achieves a better approximation of the maximum of all system clocks C.

The simulation has also shown that the clock errors should be limited to about four times the message passing delay. The reasons for that limitation, and more precise rules, must be examined.

Additional questions arise if one wants to use the controlled logical clock in systems combining message passing with one-sided communication, i.e. GET and PUT operations directly into the memory of a remote process. In [23] an ATM adapter is described in which the PUTs are implemented mainly in hardware. Besides the problem that it is normally not possible to record a trace event in the remote process, one must study the effects of the different latencies of message passing and of one-sided communication.

References

[1] O. Babaoğlu and R. Drummond. (Almost) no cost clock synchronization. In Proceedings of 17th International Symposium on Fault-Tolerant Computing, pages 42-47.
IEEE Computer Society Press, July 1987.
[2] F. Cristian and C. Fetzer. Probabilistic internal clock synchronization. In Proceedings. 13th Symposium on Reliable Distributed Systems, Dana Point, CA, USA, Oct. 25-27, 1994, pages 22-31. IEEE Computer Society Press, 1994.
[3] F. Cristian and C. Fetzer. Probabilistic internal clock synchronization. Technical Report CS94-367, University of California, San Diego, May 8 1995. ftp://cs.ucsd.edu/pub/team/nternalprobclocksync.ps.z, ftp://cs.ucsd.edu/pub/cfetzer/cs94-367.ps.z.
[4] J. E. Cuny, A. A. Hough, and J. Kundu. Logical time in visualizations produced by parallel programs. In Proceedings. Visualization '92, Boston, MA, USA, Oct. 19-23, 1992, pages 186-193. IEEE Computer Society Press, 1992.
[5] G. v. Dijk and A. v. d. Wal. Partial ordering of synchronization events for distributed debugging in tightly-coupled multiprocessor systems. In A. Bode, editor, Distributed Memory Computing, 2nd European Conference, EDMCC2, Munich, FRG, LNCS 487, pages 9. Springer-Verlag, April 22-24 1991.
[6] R. Drummond and O. Babaoğlu. Low-cost clock synchronization. Distributed Computing, 6(4):193-203, July 1993.
[7] A. Duda, G. Harrus, Y. Haddad, and G. Bernard. Estimating global time in distributed systems. In Proceedings of the 7th International Conference on Distributed Computing Systems, Berlin, September 21-25, 1987, pages 299-306. IEEE Computer Society Press, 1987.
[8] T. H. Dunigan. Hypercube clock synchronization. Technical Report ORNL TM-11744, Oak Ridge National Laboratory, TN, February 1991.
[9] T. H. Dunigan. Hypercube clock synchronization. ORNL TM-11744 (updated), September 1994. http://www.epm.ornl.gov/dungan/clock.ps.
[10] D. Edwards and P. Kearns. DTVS: a distributed trace visualization system. In Proceedings. Sixth IEEE Symposium on Parallel and Distributed Processing, Dallas, Oct. 26-29, 1994, pages 28 288. IEEE Computer Society Press, 1994.
[11] C. J. Fidge. Timestamps in message-passing systems that preserve partial ordering. In Proceedings of 11th Australian Computer Science Conference, pages 56-66, February 1988.
[12] C. J. Fidge.
Partial orders for parallel debugging. ACM SIGPLAN Notices, 24(1):183-194, January 1989.
[13] D. Haban and W. Weigel. Global events and global breakpoints in distributed systems. In Proceedings of 21st Hawaii International Conference on System Sciences, pages 166-175, vol. II, 1988.
[14] R. Hofmann. Gemeinsame Zeitskala für lokale Ereignisspuren. In B. Walke and O. Spaniol, editors, Messung, Modellierung und Bewertung von Rechen- und Kommunikationssystemen, 7. GI/ITG-Fachtagung, Aachen, 21.-23. September 1993. Springer-Verlag, Berlin, 1993. ftp://fau79.nformatk.un-erlangen.de/pub/doc/mmb93 globtme.ps.z.
[15] R. Hofmann. Gesicherte Zeitbezüge für die Leistungsanalyse in parallelen und verteilten Systemen. Dissertation, Universität Erlangen-Nürnberg, Technische Fakultät, 1993. ftp://fau79.nformatk.un-erlangen.de/pub/doc/mmd26#3.ps.z.
[16] J.-M. Jézéquel. Building a global time on parallel machines. In J.-C. Bermond and M. Raynal, editors, Proceedings of the 3rd International Workshop on Distributed Algorithms, LNCS 392, pages 136-147. Springer-Verlag, 1989.
[17] J.-M. Jézéquel. Outils pour l'expérimentation d'algorithmes distribués sur machines parallèles. PhD thesis, Université de Rennes, Oct. 1989.
[18] L. Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, July 1978.
[19] W. S. Lloyd and P. Kearns. Tracing the execution of distributed programs. Journal of Systems and Software, 21(3):201-214, June 1993.
[20] E. Maillet and C. Tron. On efficiently implementing global time for performance evaluation on multiprocessor systems. Journal of Parallel and Distributed Computing, 28:84-93, 1995.
[21] F. Mattern. Virtual time and global states of distributed systems. In M. Cosnard and P. Quinton, editors, Proceedings of International Workshop on Parallel and Distributed Algorithms, Chateau de Bonas, France, October 1988, pages 215-226. Elsevier Science Publishers B. V., Amsterdam, 1989.
[22] D. L. Mills. Network time protocol (version 3), specification, implementation and analysis. RFC 1305, Request for Comments, March 1992.
[23] T. Mummert, C.
Kosak, P. Steenkiste, and A. Fisher. Fine grain parallel communication on general purpose LANs. In International Conference on Supercomputing, ACM, Philadelphia, May 1996. http://www.cs.cmu.edu/afs/cs/project/warp/archve/nectar-papers/96cs.ps.
[24] W. E. Nagel and A. Arnold. Performance visualization of parallel programs: The PARvis environment. Technical report, Forschungszentrum Jülich, 1995. http://www.kfa-juelch.de/zam/pt/redec/softtools/partools/parvs.html.
[25] R. L. Probert, H. Yu, and K. Saleh. Relative-clock-based specification and test result analysis of distributed systems. In Eleventh Annual International Phoenix Conference on Computers and Communications, Scottsdale, AZ, USA, April 1-3, 1992, pages 687-694. IEEE, New York, 1992.
[26] R. Rabenseifner. Die geregelte logical Clock: Definition, Simulation und Anwendung. Technical Report RUS-3, Rechenzentrum, Universität Stuttgart, Germany, May 1996. http://www.un-stuttgart.de/people/rabensefner/log clock rus3.html.
[27] M. Raynal. A distributed algorithm to prevent mutual drift between n logical clocks. Information Processing Letters, 24:199-202, 1987.
[28] F. B. Schneider. Understanding protocols for Byzantine clock synchronization. Technical Report 87-859, Department of Computer Science, Cornell University, August 1987. http://cs-tr.cs.cornell.edu/tr/cornellcs:tr87-859/prnt.
[29] R. Schwarz and F. Mattern. Detecting causal relationships in distributed computations: in search of the holy grail. Distributed Computing, 7(3):149-174, 1994.
[30] Z. Yang and T. A. Marsland. Annotated bibliography on global states and times in distributed systems. Operating Systems Review, 27(3):55-74, July 1993.
[31] M. Zaki, M. El-Nahas, and H. Allam. DPDP: an interactive debugger for parallel and distributed processing. Journal of Systems and Software, 22():45 6, July 1993.