Fair Stateless Model Checking




Madanlal Musuvathi, Shaz Qadeer
Microsoft Research
{madanm,qadeer}@microsoft.com

Abstract

Stateless model checking is a useful state-space exploration technique for systematically testing complex real-world software. Existing stateless model checkers are limited to the verification of safety properties on terminating programs. However, realistic concurrent programs are nonterminating, a property that significantly reduces the efficacy of stateless model checking in testing them. Moreover, existing stateless model checkers are unable to verify that a nonterminating program satisfies the important liveness property of livelock-freedom, a property that requires the program to make continuous progress for any input. To address these shortcomings, this paper argues for incorporating a fair scheduler in stateless exploration. The key contribution of this paper is an explicit scheduler that is (strongly) fair and at the same time sufficiently nondeterministic to guarantee full coverage of safety properties. We have implemented the fair scheduler in the CHESS model checker. We show through theoretical arguments and empirical evaluation that our algorithm satisfies two important properties: 1) it visits all states of a finite-state program, achieving state coverage at a faster rate than existing techniques, and 2) it finds all livelocks in a finite-state program. Before this work, nonterminating programs had to be manually modified in order to apply CHESS to them. The addition of fairness has allowed CHESS to be effectively applied to real-world nonterminating programs without any modification. For example, we have successfully booted the Singularity operating system under the control of CHESS.

Categories and Subject Descriptors: D.2.4 [Software Engineering]: Software/Program Verification: formal methods, validation; D.2.5 [Software Engineering]: Testing and Debugging: debugging aids, diagnostics, monitors, tracing

General Terms: Algorithms, Reliability, Verification

Keywords: Concurrency, fairness, liveness, model checking, multithreading, shared-memory programs, software testing

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. PLDI '08, June 7-13, 2008, Tucson, Arizona, USA. Copyright © 2008 ACM 978-1-59593-860-2/08/06...$5.00.

1. Introduction

Concurrent programs are difficult to get right. Subtle interactions among communicating threads in the program can result in unexpected behaviors. These behaviors typically result in bugs that occur late in the software development cycle or even after the software is released. Traditional methods of testing, such as various forms of stress and random testing, more often than not miss these bugs.

    Phil1:
    while (true) {
        Acquire(fork1);
        if (TryAcquire(fork2)) break;
        Release(fork1);
    }
    // eat
    Release(fork1);
    Release(fork2);

    Phil2:
    while (true) {
        Acquire(fork2);
        if (TryAcquire(fork1)) break;
        Release(fork2);
    }
    // eat
    Release(fork2);
    Release(fork1);

Figure 1. Example of a nonterminating program.

Model checking [5, 24] is a promising method for detecting and debugging deep concurrency-related errors. A model checker systematically explores the state space of a given system and verifies that each reachable state satisfies a given property. This paper is concerned with stateless model checking, a style of state-space search first proposed in Verisoft [8]. A stateless model checker explores the state space of the program without capturing the individual program states. The program is executed under the control of a special scheduler that controls all the nondeterminism in the program. This scheduler systematically enumerates all execution paths of the program obtained by the nondeterministic choices.
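As a rough illustration of what "enumerating execution paths without capturing states" can look like, the following Python sketch explores a small terminating program by replaying it from scratch for every schedule prefix. It is only a toy under stated assumptions: the names `run`, `explore`, and `make_incrementer`, the modelling of a thread as a list of atomic steps, and the example program are all hypothetical and are not taken from Verisoft or CHESS.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: a thread is a list of atomic steps acting on shared memory.
Step = Callable[[Dict[str, int]], None]


def run(threads: Sequence[List[Step]],
        schedule: Sequence[int]) -> Tuple[Dict[str, int], List[int]]:
    """Re-execute the program from its initial state along `schedule`.

    Returns the shared memory afterwards and the ids of the threads that
    still have steps left (the enabled threads).
    """
    shared = {"x": 0}
    pc = [0] * len(threads)              # per-thread program counters
    for tid in schedule:
        threads[tid][pc[tid]](shared)
        pc[tid] += 1
    enabled = [t for t in range(len(threads)) if pc[t] < len(threads[t])]
    return shared, enabled


def explore(threads: Sequence[List[Step]]) -> List[Tuple[Tuple[int, ...], int]]:
    """Enumerate every complete schedule; only choice sequences are stored."""
    results = []
    pending: List[List[int]] = [[]]      # schedule prefixes still to extend
    while pending:
        schedule = pending.pop()
        shared, enabled = run(threads, schedule)   # replay, no state snapshot
        if not enabled:                            # execution has terminated
            results.append((tuple(schedule), shared["x"]))
        for tid in enabled:
            pending.append(schedule + [tid])
    return results


def make_incrementer(tmp: str) -> List[Step]:
    """A non-atomic increment of x: read into a private slot, then write back."""
    def read(shared: Dict[str, int]) -> None:
        shared[tmp] = shared["x"]

    def write(shared: Dict[str, int]) -> None:
        shared["x"] = shared[tmp] + 1

    return [read, write]


results = explore([make_incrementer("tmp0"), make_incrementer("tmp1")])
final_values = sorted({x for _, x in results})
```

For two threads of two steps each, the sketch visits all six interleavings and finds both final values of `x` (1 when the increments race, 2 otherwise). The replay loop only works because every execution of this toy program terminates.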
Stateless model checking is particularly suited for exploring the state space of large programs, because precisely capturing all the essential state of a large program can be a daunting task. Apart from the global variables, heap, thread stacks, and register contexts, the state of a running program can be stored in the operating system, the hardware, and in the worst case, in a different machine across a network. Even if all the program state can be captured, processing such large states can be very expensive [12, 21].

On the downside, stateless model checking is directly applicable only to terminating programs. Such programs terminate under all executions and, equivalently, have acyclic state spaces. In our experience, most realistic concurrent programs have cyclic state spaces. This paper introduces the novel technique of fair stateless model checking for effectively searching the state spaces of nonterminating programs.

Nontermination and cyclic state spaces present a significant obstacle to existing stateless model checkers. To illustrate the problem, consider the nonterminating program in Figure 1. The program is a variation of the dining philosophers example with two threads Phil1 and Phil2 trying to acquire two resources fork1 and fork2. Phil1 acquires fork1 and then attempts to acquire fork2 without blocking. If this attempt fails, then it releases fork1 and retries. Phil2 tries to acquire the two resources in the reverse order. The retry loops in the two threads create cycles in the state space of the program.

A typical stateless model checker is ineffective in detecting errors in such programs for two fundamental reasons. First, to avoid divergence resulting from nonterminating executions, the model checker must be run with a depth bound. To get good coverage for safety verification, this bound must be large enough to allow exploring the deepest state in the state space. However, as the bound increases, the model checker spends exponentially more resources unrolling cycles in the state space than visiting new states. A useful measure of the wasteful work performed during the search is the number of nonterminating executions explored for a particular depth bound. Figure 2 shows that, for our example (Figure 1), as the depth bound increases the number of nonterminating executions explored increases exponentially.

Figure 2. Number of nonterminating executions increases exponentially with the depth bound. [Plot: number of nonterminating executions (logarithmic axis, 1 to 100000) against depth bound (15 to 40).]

Second, nontermination introduces the possibility of livelocks, an entirely new class of errors characterized by the inability of the program to make progress. For example, the repeated execution of the transition sequence Phil1: Acquire(fork1), Phil2: Acquire(fork2), Phil1: TryAcquire(fork2), Phil2: TryAcquire(fork1), Phil1: Release(fork1), Phil2: Release(fork2) is a livelock. Depth-bounded stateless model checking does not have the ability to detect such errors.

Fair stateless model checking solves both the aforementioned problems by performing state-space search with respect to a fair and demonic scheduler. Our first key insight is that correct programs make continuous progress on fair schedules. A schedule is fair if every thread that is enabled infinitely often is scheduled infinitely often. Conversely, a schedule is unfair if a thread is starved of its chance to execute despite being enabled infinitely often.
For example, the schedule in which Phil1 performs Acquire(fork1) and then Phil2 repeatedly executes Acquire(fork2), TryAcquire(fork1), Release(fork2) is unfair. A cycle in the state space of a correct program corresponds to an unfair schedule in which an enabled thread is starved continuously so that the other threads participating in the cycle are unable to make progress. By performing state-space search with respect to a fair scheduler, the model checker is able to prune such cycles away. Note that a cycle in an incorrect program, such as the one in Figure 1, might correspond to a livelock. However, such an erroneous cycle must be fair, otherwise it would not be considered an error by the programmer. Our scheduler does not prune such cycles and will in fact generate an infinite execution in the limit. It is this ability to distinguish between fair and unfair executions that gives fair stateless model checking the ability to detect livelocks.

Footnote 1 (on the definition of a fair schedule): In the literature, this notion of fairness is qualified as strong fairness. For brevity, we simply refer to this notion without the qualifier in this paper.

    Init: x := 0;

    Thread t
    a: x := 1;
    b: end;

    Thread u
    c: while (x != 1)
    d:     yield();
    e: end;

    [State-space diagram: states (a,c), (a,d), (b,c), (b,d), (b,e), with transitions labeled by the executing thread, t or u.]

Figure 3. Example of a nonterminating program.

Obviously, a fair scheduler is restricted from making some scheduling decisions that are otherwise available to a scheduler with no fairness requirement. It is important to ensure that these restrictions do not reduce the coverage achieved during state-space exploration. Our second key insight enables us to do so. We observe that threads in correct programs indicate when they are unable to make progress by yielding the processor. A yield is usually indicated by the presence of a sleep operation or a timeout while waiting on a resource. To achieve fairness, the scheduler only penalizes yielding threads and prioritizes threads that are able to make progress. In particular, the scheduler is fully nondeterministic in the absence of yield operations.
Section 3 describes the fair scheduling algorithm and provides theoretical results to characterize its soundness. To support these theoretical results, Section 4 provides experimental results which indicate that our algorithm achieves complete state coverage on a variety of programs.

We have implemented our scheduler in the CHESS model checker. This algorithm extends the ability of CHESS to handle nonterminating programs. Prior to this implementation, any program given to the checker had to be manually modified to terminate under all schedules. This manual effort was a significant hurdle to the deployment of CHESS because, in our experience, real programs are almost always nonterminating. By not requiring this manual effort, the fair scheduling algorithm has significantly improved the applicability of CHESS to real-world programs; we can now boot the Singularity operating system [13] under the control of CHESS. We present our evaluation, including several bugs found, in Section 4.

In summary, the main novel contributions of this paper are the following:

- We have introduced fair stateless model checking, a novel technique for systematic testing of nonterminating programs. Our method significantly enhances the applicability of stateless model checking to large programs with cyclic state spaces. In addition, it allows stateless model checkers to detect a new class of livelock errors.
- We have implemented our algorithm in the stateless model checker CHESS. The algorithm makes it much easier to apply CHESS to large programs and found three previously unknown errors in real-world programs, of which two are livelocks.

2. Overview

In this section, we present an overview of our method for systematically testing nonterminating programs. We use the example program in Figure 3 to motivate the discussion. This program has two threads and a global variable x initially set to zero. The first thread t sets x to 1, while the second thread u spins in a loop waiting for the update of x by t. The state space of this program is shown at the right of Figure 3. For this simple program, the state can be captured by the program counter of the two threads. For instance, the state (a,c) is the initial state where the two threads are about to execute the instructions at locations a and c respectively. The state space contains a cycle between (a,c) and (a,d), resulting from the spin loop of u. Obviously, this program does not terminate under the schedule that continuously runs u without giving a chance for t to execute.

Our method is applicable to programs that are expected to terminate under all fair schedules. That is, nontermination under a fair schedule is unexpected and is potentially an error. However, there is no requirement on these programs to terminate under unfair schedules. We call such programs fair-terminating. The program in Figure 3 is fair-terminating since its only infinite execution is not fair. This execution continuously starves thread t despite t being enabled infinitely often.

Our intuition for fair-terminating programs is based upon our observation of the test harnesses for real-world concurrent programs. In practice, concurrent programs are tested by combining them with a suitable test harness that makes them fair-terminating. A fair scheduler eventually gives a chance to every thread in the program to make progress, ensuring that the program as a whole makes progress towards the end of the test. Such a test harness can be created even for systems such as cache-coherence protocols that are designed to "run forever"; the harness limits the number of cache requests from the external environment.

In addition, the notion of fair termination coincides with the intuitive behavior programmers expect of their concurrent programs. For instance, one expects the program in Figure 3 to terminate when run on a real machine. This expectation is due to our implicit assumption that the underlying thread scheduler in the operating system is fair.

In this paper, we provide a solution to the following important problem:

Input: A concurrent program Q and a safety property ϕ.
Problem: Determine if Q is fair-terminating and satisfies ϕ.
If Q is not fair-terminating, produce a fair nonterminating execution of Q. If Q violates ϕ, produce a finite execution of Q violating ϕ.

All previous solutions proposed for this problem are stateful; they require capturing the state of the program Q. As discussed in the introduction, capturing the state of a large program is error-prone and expensive. The main contribution of this paper is a practical stateless solution to this problem.

Our solution, called fair stateless model checking, uses a fair and demonic scheduler for systematically exploring the set of fair executions of the program Q. The scheduler maintains a partial order on the set of threads in each state. Intuitively, this partial order defines a scheduling priority over threads in each state: an enabled thread cannot be scheduled in a state if a higher priority thread, as determined by the partial order, is enabled in that state. The priority is updated appropriately during the execution of a program with the guarantee that any infinite execution generated by our scheduler is fair (Section 3). An execution obtained by unrolling an unfair cycle in the state space of a nonterminating program is pruned by our scheduler, thereby leading to a more efficient search.

While being fair, the scheduler must also be demonic and attempt to generate enough schedules to achieve full state coverage. For example, a scheduler that generates a single fair schedule is useless for finding bugs because it misses most behaviors of the program! Similarly, a round-robin scheduler does not consider many interleavings of the threads in the program. Ideally, it would be desirable for a fair scheduler to generate all possible fair executions of the program. But the set of all fair executions of a fair-terminating program, even for a fixed input, may be (enumerably) infinite. Therefore, it is impossible for any stateless model checker, including ours, to generate all fair executions in a bounded amount of time. However, for checking safety properties, it is only necessary to generate enough executions to cover all reachable states of the program.
To achieve full state coverage, our scheduler depends on an important characteristic of correct programs. We observed that threads in correct programs indicate when they are unable to make progress by yielding the processor. Whenever a thread waits for a resource that is not available, it either blocks on the resource or yields the processor. A block or a yield tells the operating system scheduler to perform a context switch, hopefully to the thread holding the resource required by the waiting thread. If the waiting thread does not yield the processor and continues to spin idly, it will needlessly waste its time slice and slow down the program; such a behavior is consequently considered an error. Therefore, in addition to being fair-terminating, correct programs also satisfy the following good samaritan property: if a thread is scheduled infinitely often, then it yields infinitely often. The program in Figure 3 satisfies this property because there is a yield statement in the spin loop of thread u. Also, a thread that terminates after executing a finite number of steps obviously satisfies the good samaritan property.

Our scheduler introduces an edge in the priority order only when a thread yields and thereby indicates lack of progress. Thus, our scheduler ensures that in the absence of yield operations, the priority order remains empty and all threads have equal priority. At each scheduling point, the scheduler nondeterministically attempts all scheduling choices that respect the priority order. Our intuition is that programs are parsimonious in the use of the yield operations as their excessive use may unnecessarily decrease performance. Therefore, these operations are used sparingly, typically at the back edges of spin loops. Consequently, every reachable state is likely to be reachable via a yield-free execution along which our algorithm behaves like the standard nondeterministic scheduler used in model checkers. We provide theoretical results (in Section 3) characterizing the coverage of our algorithm.
We also provide experimental evaluation (in Section 4) to show that our algorithm provides complete state coverage on a variety of realistic programs.

In summary, fair stateless model checking is a semi-algorithm which takes as input a program that is expected to satisfy the good samaritan property and be fair-terminating. There are four outcomes possible when this algorithm is applied to such a program.

1. The algorithm terminates with a safety violation.
2. The algorithm diverges and generates, in the limit, an infinite execution that violates the good samaritan property.
3. The algorithm diverges and generates, in the limit, an infinite fair execution.
4. The algorithm terminates without finding any errors.

In theory, the second and third outcomes manifest in an infinite execution being generated. In practice, it is not possible for a stateless model checker to identify or generate an infinite execution. Therefore, we ask the user to set a large bound on the execution depth. This bound can be orders of magnitude greater than the maximum number of steps the user expects the program to execute. The model checker stops if an execution exceeding the bound is reached and reports a warning to the user. This execution is then examined by the user to see if it actually indicates an error. In the rare case it is not, the user simply increases the bound and runs the model checker again. Using this mechanism, our algorithm is able to detect the livelock in the program of Figure 1.

3. Fair stateless model checking

In this section, we describe fair stateless model checking in detail. We fix a multithreaded program Q with a finite set Tid of threads. The program Q starts execution in its initial state s_0. At each step, one thread in Tid performs a transition and updates the state. In this presentation, we assume that the transition relation of each thread is deterministic and consequently thread scheduling is the only source of nondeterminism. However, our method is easily generalized to accommodate a nondeterministic but finitely-branching thread transition relation.

The program Q is equipped with state predicates enabled(t) and yield(t) for each thread t ∈ Tid. The predicate enabled(t) is true in a state s iff thread t is enabled in s. The predicate yield(t) is true in a state s iff thread t is enabled in s and executing thread t in s results in a yield.

An execution $s_0 \xrightarrow{t_0} s_1 \xrightarrow{t_1} s_2 \cdots$ is a finite or infinite sequence of states and transitions. Each such execution is equipped with a state predicate sched(t) for each thread t ∈ Tid such that for all n ≥ 0, sched(t) is true in s_n if and only if t_n = t. A finite execution $s_0 \xrightarrow{t_0} s_1 \xrightarrow{t_1} s_2 \cdots s_n$ is terminating if enabled(t) is false at s_n for each t ∈ Tid. Such a state s_n is called a deadlock. A state s is reachable if it is the final state of an execution.

An infinite execution $\sigma = s_0 \xrightarrow{t_0} s_1 \xrightarrow{t_1} s_2 \cdots$ is fair iff for all threads t ∈ Tid, if t is enabled infinitely often in σ then t is scheduled infinitely often in σ. This property is formalized as the following linear temporal logic [23] formula:

$$\mathit{SF} = \forall t \in \mathit{Tid} : \mathbf{GF}\,\mathit{enabled}(t) \Rightarrow \mathbf{GF}\,\mathit{sched}(t)$$

Every infinite execution σ of Q is expected to satisfy the following good samaritan property:

$$\mathit{GS} = \forall t \in \mathit{Tid} : \mathbf{GF}\,\mathit{sched}(t) \Rightarrow \mathbf{GF}\,(\mathit{sched}(t) \wedge \mathit{yield}(t))$$

Intuitively, this property states that for all threads t, if t is scheduled infinitely often, then in infinitely many of those transitions thread t also yields. The fair stateless model checking algorithm, apart from detecting safety violations, also attempts to detect an infinite execution that either violates the good samaritan property or is fair.
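The two formulas can be checked mechanically on executions that eventually repeat a cycle forever, because on such an execution "infinitely often" simply means "somewhere in the repeated cycle". The Python sketch below does this for SF and GS. It is an illustration only: the `Step` record, the function names, and the restriction to ultimately periodic executions are assumptions made for the example and are not part of the paper's algorithm.

```python
from typing import FrozenSet, NamedTuple, Sequence


class Step(NamedTuple):
    """One transition of the repeated cycle (hypothetical encoding)."""
    enabled: FrozenSet[str]   # threads enabled in the state before the step
    sched: str                # thread scheduled in that state
    yields: bool              # True if the scheduled transition is a yield


def strongly_fair(cycle: Sequence[Step], tids: Sequence[str]) -> bool:
    """SF: every thread enabled somewhere in the cycle is scheduled in it."""
    return all(
        any(s.sched == t for s in cycle)
        for t in tids
        if any(t in s.enabled for s in cycle)
    )


def good_samaritan(cycle: Sequence[Step], tids: Sequence[str]) -> bool:
    """GS: every thread scheduled in the cycle also yields in the cycle."""
    return all(
        any(s.sched == t and s.yields for s in cycle)
        for t in tids
        if any(s.sched == t for s in cycle)
    )


tids = ("t", "u")
both = frozenset(tids)

# The spin loop of Figure 3 with only u running: u tests x, then yields.
spin = [Step(both, "u", False), Step(both, "u", True)]

# A hypothetical variant whose loop body never yields.
busy = [Step(both, "u", False)]

spin_is_fair = strongly_fair(spin, tids)          # t is enabled but starved
spin_is_samaritan = good_samaritan(spin, tids)    # u yields in every round
busy_is_samaritan = good_samaritan(busy, tids)    # u spins without yielding
```

On the Figure 3 cycle, GS holds while SF fails, which is exactly the kind of unfair execution the scheduler is meant to prune; the busy-wait variant already fails GS.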
We refer to the value of the predicates enabled(t), sched(t), and yield(t) in state s as s.enabled(t), s.sched(t), and s.yield(t) respectively. We also use s.ES to refer to the set {t ∈ Tid | s.enabled(t)} of threads enabled in state s. Given a relation R ⊆ Tid × Tid and a set X ⊆ Tid, we define

$$\mathit{pre}(R, X) = \{\, x \in \mathit{Tid} \mid \exists y \in \mathit{Tid} : (x, y) \in R \wedge y \in X \,\}$$

We present the fair model checking algorithm (Algorithm 1) as a nondeterministic fair scheduler. Our algorithm makes explicit the available nondeterministic choices at each scheduling point. It is easy to augment this description with either a stack to perform depth-first search or a queue to perform breadth-first search. To focus on the essence of our algorithm, we have elided the (standard) search mechanism from the algorithm description.

The algorithm takes as input a multithreaded program Q together with its initial state init. It assumes that the program Q comes with a function NextState that takes a state s and a thread t and returns the state that results from executing t in state s. The algorithm starts with the initial state and an empty execution. In each iteration of the loop (lines 7-30), the algorithm extends the current execution by one step. The algorithm terminates with a complete terminating execution when the return statement on line 9 is executed.

     1  init.P := {};
     2  ∀u ∈ Tid : init.E(u) := {};
     3  ∀u ∈ Tid : init.D(u) := Tid;
     4  ∀u ∈ Tid : init.S(u) := Tid;
     5  curr := init;
     6  while true do
     7      T := curr.ES \ pre(curr.P, curr.ES);
     8      if T = {} then
     9          return;
    10      end
    11      t := Choose(T);
    12      next := NextState(curr, t);
    13      next.P := curr.P \ (Tid × {t});
    14      foreach u ∈ Tid do
    15          next.E(u) := curr.E(u) ∩ next.ES;
    16          if u = t then
    17              next.D(u) := curr.D(u) ∪ (curr.ES \ next.ES);
    18          else
    19              next.D(u) := curr.D(u);
    20          end
    21          next.S(u) := curr.S(u) ∪ {t};
    22      end
    23      if curr.yield(t) then
    24          H := (next.E(t) ∪ next.D(t)) \ next.S(t);
    25          next.P := next.P ∪ ({t} × H);
    26          next.E(t) := next.ES;
    27          next.D(t) := {};
    28          next.S(t) := {};
    29      end
    30      curr := next;
    31  end

Algorithm 1: Fair stateless model checking

Each thread t ∈ Tid partitions an execution σ into windows; a window of thread t lasts from a state immediately after a yielding transition by thread t to the state immediately after the next yielding transition by t. Our algorithm maintains for each state s several auxiliary predicates that record information about a window of thread t.

1. S(t) is the set of threads that have been scheduled since the last yield by thread t.
2. E(t) is the set of threads that have been continuously enabled since the last yield by thread t.
3. D(t) is the set of threads that have been disabled by some transition of thread t since the last yield by thread t.

In addition to these predicates, each state s also contains a relation s.P which represents a priority ordering on threads. Specifically, if (t, u) ∈ s.P then t will be scheduled in s only when s.enabled(t) and ¬s.enabled(u).

In each iteration, the algorithm first computes T (line 7), the set of schedulable threads that satisfy the priorities in curr.P. If T is empty, then the execution terminates. Otherwise, the algorithm selects a thread t nondeterministically from T (line 11) and schedules t to obtain the next state. It is this nondeterminism inherent in the execution of Choose(T) on line 11 that a model checker must explore. As explained earlier, it is easy to add systematic depth-first or breadth-first search capability to our algorithm. Line 13 removes all edges with sink t from P to decrease the relative priority of t. The loop at lines 14-22 updates the auxiliary predicates for each thread u ∈ Tid.
The set E of continuously enabled threads is updated by taking the intersection of its current value with the set of enabled threads in next (line 15). The set D of threads disabled by thread t is updated by taking the union of its current value with the set of threads disabled by the latest transition (line 17). The set of scheduled threads is updated on line 21. Finally, if the transition just executed is a yielding transition, then the data structures are updated appropriately to mark the beginning of a new window of the yielding thread t (lines 23-29). The set H computed on line 24 contains only those threads that were never scheduled in the current window of thread t and were either continuously enabled, or disabled by thread t at some point in the window. Line 25 reduces the priority of t with respect to the threads in H.

    State   Next transition of u   ES      S(u)    D(u)    E(u)    P
    (a,c)   u: while (x != 1)      {t,u}   {t,u}   {t,u}   {}      {}
    (a,d)   u: yield()             {t,u}   {t,u}   {t,u}   {}      {}
    (a,c)   u: while (x != 1)      {t,u}   {}      {}      {t,u}   {}
    (a,d)   u: yield()             {t,u}   {u}     {}      {t,u}   {}
    (a,c)   u: while (x != 1)      {t,u}   {}      {}      {t,u}   {(u,t)}

Figure 4. Emulation of Algorithm 1 on the spin loop in Figure 3. When thread u yields in the second transition from state (a,d), the P in the subsequent state ensures that u does not enter the spin loop the second time.

Figure 4 shows an emulation of Algorithm 1 for the program in Figure 3. For conciseness, Figure 4 only shows the emulation when the scheduler attempts to schedule the thread u continuously. We focus the emulation on the values of the relation P and the predicates S(u), D(u), and E(u). The relation P is initialized to be empty. The predicates S(u), D(u) and E(u) are initialized in such a way that their values remain unchanged until the first yield of thread u. These values also guarantee that the update of P at the first yield of any thread leaves the value of P unchanged. This behavior ensures that the first window of any thread begins after its first yield, at which point the predicates S(u), D(u) and E(u) get initialized appropriately.

In the emulation, the scheduler executes thread u continuously. Starting from the initial state (a,c), the first window of u begins once the scheduler has scheduled u twice.
At this point, u has gone through the spin loop once and the state is (a,c) again. In this state, P = {}, S(u) = {}, D(u) = {}, and E(u) = ES = {t, u}. When u is executed for one more step, u is added to S(u) and the state becomes (a,d). In this state, yield(u) is true as u will yield if executed from this state. However, the P relation is still empty, allowing the scheduler to choose either of the two threads. If the scheduler chooses to schedule u again, the thread completes the second iteration of the loop and the program enters the state (a,c). Algorithm 1 adds the edge (u, t) to P because the set H on line 24 evaluates to {t}. Thus, the algorithm gives the yielding thread u a lower priority than the pending thread t. This update to P makes the set of scheduler choices T = {t}. Thus, the scheduler is forced to schedule t, which enables u to exit its loop.

Generalizing this example, if the thread t was not enabled in the state (a,c), say if t was waiting on a lock currently held by u, the scheduler will continue to schedule u till it releases the lock. Further, if t was waiting on a lock held by some other thread v in the program, the fairness algorithm will guarantee that eventually v makes progress, releasing the lock.

THEOREM 1. Every infinite execution σ generated by Algorithm 1 satisfies the property GS ⇒ SF.

PROOF. We do the proof by contradiction. Suppose Algorithm 1 generates an infinite execution $\sigma = s_0 \xrightarrow{t_0} s_1 \xrightarrow{t_1} s_2 \cdots$ that satisfies GS but does not satisfy SF. Therefore, there is a thread u such that σ satisfies GF enabled(u) ∧ FG ¬sched(u). That is, the execution σ eventually reaches a state s_i after which u is never scheduled but is enabled infinitely often. Let T be the set of threads that are scheduled infinitely often in σ. Since Tid is finite and σ is an infinite execution, the set T is nonempty. Since σ satisfies GS, every thread in T must yield infinitely often. There are two cases: thread u is enabled forever after s_i, or thread u is both enabled infinitely often and disabled infinitely often after s_i.

Case 1: Suppose thread u is enabled forever after s_i. Consider an arbitrary t ∈ T.
Since $\sigma$ satisfies $GS$ and t is scheduled infinitely often in $\sigma$, there exist $j$ and $k$ such that $i < j < k$, $s_j.sched(t)$, $s_j.yield(t)$, $s_k.sched(t)$, and $s_k.yield(t)$. Consider the iteration of the loop in lines 6–31 of Algorithm 1 in which $curr = s_k$. Since u is forever enabled but never scheduled after $s_i$, we have that $u \in next.E(t)$ and $u \notin next.S(t)$ at line 24. Therefore, at line 25, $u \in H$ and the edge $(t, u)$ is added to $next.P$. Since thread u is continuously enabled after $s_i$ and therefore after $s_k$, the edge $(t, u)$ precludes the scheduling of t after $s_k$ (line 7). This is a contradiction since t is scheduled infinitely often.

Case 2: Suppose thread u is both enabled infinitely often and disabled infinitely often after $s_i$. Since $T$ is finite, there must be some thread $t \in T$ that disables u infinitely often. Since $\sigma$ satisfies $GS$ and t is scheduled infinitely often in $\sigma$, t disables u at some point after $i$ and between two states where $yield(t)$ holds. Formally, there exist $j$, $k$, and $l$ such that $i < j < k \le l$, $s_j.sched(t)$, $s_j.yield(t)$, $s_k.sched(t)$, $s_k.enabled(u)$, $\neg s_{k+1}.enabled(u)$, $s_l.sched(t)$, and $s_l.yield(t)$. Consider the iteration of the loop in lines 6–31 of Algorithm 1 in which $curr = s_l$. Since u is never scheduled after $s_i$, we have that $u \in next.D(t)$ and $u \notin next.S(t)$ at line 24. Therefore, at line 25, $u \in H$ and the edge $(t, u)$ is added to $next.P$. Since thread u is never scheduled after $s_i$ and therefore after $s_k$, the edge $(t, u)$ is present in $s_n.P$ for all $n \ge k$. This edge precludes the scheduling of t after $s_k$ (line 7) whenever thread u is also enabled. This is a contradiction since thread t disables thread u infinitely often.

Theorem 1 yields the following termination guarantee about Algorithm 1.

THEOREM 2. If no infinite execution of $Q$ satisfies the property $GS \Rightarrow SF$, then Algorithm 1 terminates on $Q$.

PROOF. We do the proof by contradiction. Suppose Algorithm 1 does not terminate on $Q$. Then the tree of executions explored by the algorithm is infinite. Since $Tid$ is finite, this execution tree is finitely branching. By König's lemma, there must be an infinite execution in this tree.
Since every execution generated by Algorithm 1 is an execution of $Q$, this execution must satisfy $GS \Rightarrow SF$ (by Theorem 1). Hence, we arrive at a contradiction.

Now, we prove certain desirable properties of the fair model checking algorithm.

THEOREM 3. At line 7 of Algorithm 1, the set $T$ is empty if and only if the set $curr.ES$ is empty.

PROOF. The proof relies on the fact that the $P$ relation, when viewed as edges in a graph with nodes from $Tid$, contains no cycles. The loop invariant at line 6 requiring that $curr.P$ is an acyclic relation is sufficient to prove our theorem. We first prove this loop invariant. Upon loop entry, we have $curr = init$. The relation $curr.P$ is empty and the invariant is trivially true. Consider an arbitrary iteration of the loop. We assume the loop invariant at the beginning of this iteration and prove it at the end. In each iteration of the loop, whenever outgoing edges from t are added to $P$ at line 25, all incoming edges into t have already been removed earlier at line 13. Line 21 adds t to $next.S(u)$ for all $u \in Tid$, and therefore it is guaranteed that $t \in next.S(t)$ at line 24 and $t \notin H$ at line 25. Thus, even line 25 does not add any incoming edges into t, and each iteration of the loop leaves $P$ acyclic.

We now show that the loop invariant implies our theorem. Clearly, if $curr.ES$ is empty at line 7 then $T$ is also empty. We prove the other direction by contradiction. Suppose $T$ is empty but $curr.ES$ is nonempty. Therefore $curr.ES \subseteq pre(curr.P, curr.ES)$. Consider the projection of the relation $curr.P$ onto the set $curr.ES$. Since $curr.P$ is acyclic and $curr.ES$ is nonempty and finite, this projection is an acyclic relation on a nonempty finite set and therefore contains a maximal element. That is, there exists $t \in curr.ES$ such that $\forall u \in curr.ES : (t, u) \notin curr.P$. This contradicts our assumption $curr.ES \subseteq pre(curr.P, curr.ES)$.

Theorem 3 guarantees that Algorithm 1 never reports a false deadlock. In practice, this means that the algorithm can always drive the program to a terminating state, without requiring the execution to be pruned at a partial execution. Avoiding such pruning, which typically happens with depth-bounding techniques, avoids wasting scarce model checking resources.

While the theorems above hold even for infinite-state systems, the theorems stated below provide intuition for the efficacy of the fair model checking algorithm on finite-state systems.
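The bookkeeping described for Figure 4 and the schedulable set of Theorem 3 can be sketched in a few lines of Python. This is a minimal sketch reconstructed from the prose above, not the authors' implementation: the class name `FairSchedulerState`, its method names, and the helper `emulate_spin_loop` are hypothetical, and the update rules for $E$, $D$, and $S$ are assumptions based on the descriptions of lines 13, 17, 21, and 24–25.

```python
TID = frozenset({"t", "u"})  # thread names follow the running example


class FairSchedulerState:
    """Auxiliary scheduler state (hypothetical names, reconstructed from the prose)."""

    def __init__(self):
        self.P = set()                        # edge (a, b): a has lower priority than b
        self.E = {v: set() for v in TID}      # continuously enabled in v's window
        self.D = {v: set(TID) for v in TID}   # disabled by v during v's window
        self.S = {v: set(TID) for v in TID}   # scheduled during v's window

    def schedulable(self, enabled):
        """T = ES minus every thread that has an edge to an enabled thread."""
        blocked = {a for (a, b) in self.P if b in enabled}
        return set(enabled) - blocked

    def step(self, t, enabled_before, enabled_after, yields):
        """Bookkeeping after thread t takes one transition."""
        self.P = {(a, b) for (a, b) in self.P if b != t}  # drop edges into t
        for v in TID:
            self.E[v] &= set(enabled_after)
            self.S[v].add(t)
        self.D[t] |= set(enabled_before) - set(enabled_after)
        if yields:  # a new window of t begins
            H = (self.E[t] | self.D[t]) - self.S[t]
            self.P |= {(t, h) for h in H}
            self.E[t] = set(enabled_after)
            self.D[t] = set()
            self.S[t] = set()


def emulate_spin_loop():
    """Schedule u for two loop iterations: test, yield, test, yield."""
    es = {"t", "u"}  # assumption: both threads stay enabled throughout
    state = FairSchedulerState()
    for yields in (False, True, False, True):
        assert "u" in state.schedulable(es)  # P is still empty at each of these steps
        state.step("u", es, es, yields)
    return state, state.schedulable(es)


if __name__ == "__main__":
    final_state, choices = emulate_spin_loop()
    print(final_state.P, choices)
```

Under these assumptions the second yield of u adds the edge (u, t), so the only remaining scheduler choice is t, and scheduling t removes the edge again. Because edges into a thread are dropped before edges out of it are added, the relation stays acyclic, which is the invariant the proof of Theorem 3 relies on.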
For the remainder of this section, we assume that $Q$ is a finite-state program. For finite-state systems, all infinite behaviors traverse cycles in the state space. A cycle $\tau$ is a transition sequence $x_0 \xrightarrow{t_0} x_1 \cdots x_n \xrightarrow{t_n} x_0$ such that the states $x_0, x_1, \ldots, x_n$ are all distinct. The cycle $\tau$ is reachable if the state $x_0$ is reachable. The cycle $\tau$ is fair if for every thread $t \in Tid$, either $\neg x_i.enabled(t)$ for all $i \in [0, n]$ or $t = t_i$ for some $i \in [0, n]$. The cycle $\tau$ is unfair if it is not fair. Note that an infinite execution that traverses an unfair cycle forever is not fair. The following theorem shows that our algorithm unrolls an unfair cycle fully at most twice and thus significantly reduces wasteful search.

THEOREM 4. Suppose every infinite execution of $Q$ satisfies the property $GS$, $s_0 \to s_1 \to \cdots \to x_0$ is a finite execution of $Q$, and $x_0 \xrightarrow{t_0} x_1 \cdots x_n \xrightarrow{t_n} x_0$ is an unfair cycle in $Q$. Then, Algorithm 1 does not generate the execution $s_0 \to s_1 \to \cdots \to x_0 \xrightarrow{t_0} x_1 \cdots x_n \xrightarrow{t_n} x_0 \xrightarrow{t_0} x_1 \cdots x_n \xrightarrow{t_n} x_0 \xrightarrow{t_0} x_1 \cdots x_n$.

PROOF. Since $x_0 \xrightarrow{t_0} x_1 \cdots x_n \xrightarrow{t_n} x_0$ is an unfair cycle, there is a thread u such that $u \ne t_i$ for all $i \in [0, n]$ and $x_i.enabled(u)$ for some $i \in [0, n]$. Let $N(i)$ denote $(i + 1) \bmod (n + 1)$. Since $x_i.enabled(u)$ for some $i \in [0, n]$, there are only two cases: either $x_i.enabled(u)$ for all $i \in [0, n]$, or there exists $i \in [0, n]$ such that $x_i.enabled(u)$ and $\neg x_{N(i)}.enabled(u)$.

Case 1: Suppose $x_i.enabled(u)$ for all $i \in [0, n]$. Since every infinite execution of $Q$ satisfies $GS$, there exists $i \in [0, n]$ such that $x_i.yield(t_i)$ is true. Therefore the edge $(t_i, u)$ is in the priority graph after the execution of $t_i$ the second time the unfair cycle is executed. Since u is never scheduled, this edge is not removed, and consequently $t_i$ cannot be scheduled at the next occurrence of $x_i$.

Case 2: Suppose there exists $i \in [0, n]$ such that $x_i.enabled(u)$ and $\neg x_{N(i)}.enabled(u)$. Since every infinite execution of $Q$ satisfies $GS$, there exists $j \in [0, n]$ such that $t_i = t_j$ and $x_j.yield(t_j)$ is true. Therefore the edge $(t_j, u)$ is added to the priority graph after the execution of $t_j$ the second time the unfair cycle is executed.
Since u is never scheduled, this edge is not removed, and consequently $t_i = t_j$ cannot be scheduled at the next occurrence of $x_i$.

Now, we present two theorems that characterize the soundness of our algorithm on finite-state systems. Consider a finite transition sequence $\alpha = x_0 \xrightarrow{t_0} x_1 \cdots x_n$. The yield count of thread t in $\alpha$, denoted by $\delta(\alpha, t)$, is the cardinality of the set $\{\, 0 \le i < n \mid t = t_i \wedge x_i.yield(t) \,\}$. The yield count of $\alpha$, denoted by $\delta(\alpha)$, is the maximum of $\delta(\alpha, t)$ over all threads $t \in Tid$. The yield count of a reachable state $s$ is the minimum of $\delta(\sigma)$ over all executions $\sigma$ whose final state is $s$. The following theorem captures the soundness guarantee of our algorithm for safety properties.

THEOREM 5. Algorithm 1 either generates an infinite execution or visits every reachable state of $Q$ whose yield count is zero.

PROOF. Suppose Algorithm 1 does not generate an infinite execution. Therefore, the tree of executions explored by the algorithm is finite (by König's lemma). This tree is guaranteed to contain all executions whose yield count is zero because along such an execution the priority graph remains empty throughout. Every reachable state of $Q$ with yield count zero is the final state of an execution in which there are no yields. Therefore, the algorithm eventually visits all such reachable states.

Every infinite execution generated by our algorithm reveals a liveness error (Theorem 1). Theorem 5 indicates that our algorithm is sound with respect to safety properties if the program $Q$ does not have any liveness errors and all reachable states are reachable by yield-free executions. Theorem 6 below captures the soundness of our algorithm with respect to liveness properties as well.

THEOREM 6. Suppose $x_0$ is a reachable state of $Q$ whose yield count is zero and $\tau = x_0 \xrightarrow{t_0} x_1 \cdots x_n \xrightarrow{t_n} x_0$ is a fair cycle whose yield count is at most one. Then Algorithm 1 generates an infinite execution.

PROOF. We do the proof by contradiction. Suppose Algorithm 1 does not generate an infinite execution. By Theorem 5, the state $x_0$ is eventually visited with an empty priority graph.
If the yield count of $\tau$ is zero, then there are no yields in $\tau$ and $\tau$ can be executed repeatedly to generate an infinite execution. Suppose the yield count of $\tau$ is one. Since $\tau$ is fair, every thread that is enabled anywhere in the cycle is also scheduled in the cycle. Moreover, a thread may yield at most once in $\tau$. Therefore, the set $S(t)$ contains both $E(t)$ and $D(t)$ at the unique yield point of t, if any. Consequently, the yield of t does not add any edges to $P$. The priority graph thus stays empty and $\tau$ can again be executed repeatedly, which contradicts the assumption.

As discussed in Section 2, we expect that all reachable states are reachable by a yield-free execution due to the parsimonious use of the yield operation by real programs. If this is not the case, then our algorithm can be parameterized by a small constant $k > 0$ so as to only process every $k$-th yield of a thread. The soundness theorems (both for safety and liveness) for the parameterized algorithm are straightforward generalizations of the corresponding theorems stated above.

4. Evaluation

This section presents the empirical evaluation of the fair demonic scheduler described in Section 3. We have implemented our algorithm in the CHESS software model checker. CHESS is designed for systematic testing of shared-memory multithreaded programs. To use CHESS, the user provides a test case that exercises a concurrent scenario. CHESS executes this test repeatedly, while controlling the thread schedule such that every execution of the test takes a different interleaving. CHESS is stateless and avoids capturing any state, including the initial state of the program, during state-space search.

Programs               LOC      Threads   Synch Ops
Dining Philosophers    54       3         48
Work-Stealing Queue    1266     3         99
Promise                14044    3         26
APE                    18947    4         247
Dryad Channels         16036    5         273
Dryad Fifo             18093    25        4892
Singularity kernel     174601   14        167924
Table 1. Characteristics of input programs to CHESS.

The implementation of the fair scheduler in CHESS maintains data structures to implement the auxiliary state used in Algorithm 1. An important issue is the inference of yielding transitions; our implementation treats every synchronization operation with a finite timeout and every explicit processor yield as a yielding operation. Another important issue is the integration of fair scheduling with the context-bounded search [22] strategy implemented in CHESS. In a concurrent execution, a preemption occurs when the scheduler forces a context switch despite the currently running thread being enabled. Context-bounded search explores only those executions in which the number of preemptions is bounded by a small number provided by the user of CHESS. Fair scheduling is easily combined with context-bounding. The only subtle aspect of the combination is that fair scheduling can introduce a preemption when the currently running thread gets a lower priority than another enabled thread. For soundness of the context-bounded search, it is important not to count such preemptions.

4.1 Ability to handle large nonterminating programs

Prior to the implementation of the fair scheduler, CHESS, like other stateless model checkers, could only handle terminating programs. As depth bounding was unsatisfactory for our purposes, any input program with nonterminating behavior required manual modification.
As an example of the effort required, consider the simple program in Figure 3. One can fix the nonterminating behavior by introducing a synchronization variable that u blocks on when waiting for an update to x. In addition, the behavior of t (and all other threads that access x) should be modified to appropriately signal the synchronization variable after an update to x, a non-local change to the program. Finally, one has to ensure that the introduced synchronization does not result in deadlocks due to adverse interactions with existing synchronizations in the program. In practice, such modifications typically require intimate knowledge of the program, and in our experience, are difficult and error-prone. Previously, it took us several weeks to prepare a realistic program as an input to CHESS.

With the fair scheduler in place, CHESS could readily handle nonterminating programs. Table 1 describes the programs CHESS is currently able to handle. Table 1 also provides the maximum number of threads created and synchronization operations performed per execution of these programs in CHESS. In particular, we are able to systematically test the entire boot and shutdown process of the Singularity operating system [13]. Also, we have run CHESS on unmodified versions of Dryad, a distributed execution engine for coarse-grained data-parallel applications [15], and APE (Asynchronous Processing Environment), a library in the Windows operating system that provides a set of data structures and functions for asynchronous multithreaded code.

Apart from these large programs, CHESS is also able to handle low-level synchronization libraries that typically employ nonblocking algorithms. Manually modifying them to be terminating is either impossible or requires algorithmic changes. We have applied CHESS to an implementation [20] of the work-stealing queue algorithm originally designed for the Cilk multithreaded programming system [7], and Promise, a library for data-parallel programs.

4.2 Coverage of safety properties

This section demonstrates that the fairness algorithm is effective for checking safety properties.
First, we show that the algorithm achieves 100% state coverage for the first two programs in Table 1. Also, we show that fairness significantly improves the speed of the state space search, for various search strategies. Finally, we demonstrate the ability of the fairness algorithm to find both existing and previously unknown safety violations in large programs. All the experiments described in the paper were performed on an off-the-shelf computer with an Intel Xeon 2.80GHz CPU with 2 processors and 3 GB of memory running the Windows Vista Enterprise operating system.

4.2.1 State coverage

CHESS is a stateless model checker and thus does not have the capability to capture program states. To measure state coverage, we manually added facilities to extract states for two examples: the dining philosophers and the work-stealing queue. The state of these programs consists of the state of all global variables, the heap, and the stack of all threads in the program. While the bits comprising the state can be automatically extracted, we had to manually abstract the (infinite) state of the program into a reasonable, finite representation. Also, in order to avoid multiple representations of behaviorally equivalent heaps, we used a simple heap-canonicalization algorithm [14].

Table 2 shows the results from our coverage experiments for these two examples, each with two configurations. For each of the configurations, we used four search strategies: a context-bounded search with bounds (cb) from 1 to 3 and a depth-first search (dfs). For each strategy, we ran CHESS with and without the fairness algorithm. As termination is not guaranteed without fairness, the search proceeds only up to a depth bound (db) varying from 20 to 60. Once the depth bound is reached, a random search [17] is performed until the end of the execution is reached. New states visited during the random search are included while measuring state coverage. To measure the total number of states reachable with a strategy, we also performed a stateful search of the state space and stored the state signatures in a hash table. We used this table to check if the subsequent runs cover all of the states.
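The coverage measurement just described amounts to comparing sets of state signatures. The following is only a sketch of that idea, assuming each abstracted state is available as a flat dictionary of hashable values; the function names `signature` and `coverage` are hypothetical, and sorting the dictionary items is a simple stand-in for the heap canonicalization of [14], not that algorithm.

```python
import hashlib


def signature(state):
    """Hash an abstracted state (assumed here to be a dict of name -> value)."""
    canonical = repr(sorted(state.items()))  # order-independent textual form
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def coverage(reference_states, visited_states):
    """Return (fraction of reference states covered, number of extra states).

    reference_states: states found by the stateful search (the hash table).
    visited_states:   states seen by one stateless run.
    """
    reference = {signature(s) for s in reference_states}
    visited = {signature(s) for s in visited_states}
    if not reference:
        raise ValueError("reference set must not be empty")
    covered = len(reference & visited) / len(reference)
    extra = len(visited - reference)
    return covered, extra


if __name__ == "__main__":
    ref = [{"pc": "a", "x": 0}, {"pc": "b", "x": 1}]
    run = [{"x": 0, "pc": "a"}, {"pc": "b", "x": 1}, {"pc": "c", "x": 2}]
    print(coverage(ref, run))
```

Reporting the extra states separately is a design choice of this sketch: it keeps a run that wanders beyond the reference set from being mistaken for incomplete coverage.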
In our experiments, we found that the fairness algorithm achieves 100% coverage on all but one of the cases. The fairness algorithm times out for a depth-first search strategy on the work-stealing queue with two stealers. The fourth column in Table 2 shows the number of states explored with fairness, and except for the one case above, this number is greater than or equal to the total number of states in the third column. The number of states explored with fairness can be larger than the total number of states, as the fairness algorithm introduces additional preemption points. This essentially forces the fairness algorithm to visit states that are beyond the current context bound. For comparison, Table 2 also shows the number of states visited for runs without fairness with different depth bounds. For some runs, the search with small depth bounds terminates without visiting all the states. In other cases with larger depth bounds, the search times out. For the work-stealing queue with two stealers, all runs using the depth-first strategy time out.

Configuration                        Search    Total   With      Without fairness
                                     strategy  states  fairness  db=20  db=30  db=40  db=50  db=60
Dining Philosophers, 2 philosophers  cb=1      27      27        27     27     27     27     27
                                     cb=2      28      28        28     28     28     28     28
                                     cb=3      29      29        29     29     29     29     29
                                     dfs       29      29        29     29     29     29     29
Dining Philosophers, 3 philosophers  cb=1      102     102       102    102    102    102    102
                                     cb=2      144     144       143    144    144    144    144
                                     cb=3      167     171       167    169    169    169    169
                                     dfs       177     177       174    177    177    168*   139*
Work-Stealing Queue, 1 stealer       cb=1      278     278       236    278    278    278    278
                                     cb=2      814     814       554    765    814    814    814
                                     cb=3      1287    1297      694    1133   1287   1287   1287
                                     dfs       1726    1726      871    1505   1726   1307*  683*
Work-Stealing Queue, 2 stealers      cb=1      350     378       238    334    350    350    350
                                     cb=2      1838    2000      971    1630   1822   1838   1838
                                     cb=3      3271    3311      1805   2955   3269   3271   3271
                                     dfs       4826    1321*     1686*  1239*  460*   245*   245*
Table 2. Number of states visited for the context-bounded and depth-first strategies both with and without fairness. Search without fairness is not guaranteed to terminate and thus needs to be pruned at a depth bound. The state coverage achieved for runs that did not terminate within 5000 seconds is marked with a *.

4.2.2 Rate of state coverage

Fair stateless search can be more efficient as it does not unroll unfair cycles in the state space (Theorem 4). To quantify this, Figures 5 and 6 show the time taken to complete the search for two of the four configurations in Table 2. The results were similar for the other two (smaller) configurations. The figures show the time taken to complete the search for each of the search strategies with and without fairness. For the runs without fairness, we report the time for various depth bounds. Note that the y-axis on these figures is in log scale. The runs with fairness explore the state space exponentially faster than the runs without fairness, without sacrificing state coverage. In Figure 6, for the cb=3 strategy, the run with a depth bound of 20 completes faster than the run with fairness but does not cover all states. Also, the depth-first strategy times out in all runs.
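To make the notion of an unfair cycle used above concrete, the definition from Section 3 can be checked mechanically: a cycle is fair if every thread is either disabled in all states of the cycle or scheduled somewhere in it. The sketch below assumes a cycle is given as a list of pairs (enabled set at $x_i$, thread $t_i$ scheduled from $x_i$); this representation and the function names `starved_threads` and `is_fair` are hypothetical.

```python
def starved_threads(cycle, tids):
    """Threads enabled somewhere in the cycle but never scheduled in it.

    cycle: list of (enabled_set, scheduled_thread) pairs, one per state x_i.
    tids:  the finite set of all thread identifiers.
    """
    scheduled = {t for _, t in cycle}
    enabled_somewhere = set().union(*(es for es, _ in cycle))
    return {t for t in tids if t in enabled_somewhere and t not in scheduled}


def is_fair(cycle, tids):
    """A cycle is fair exactly when no thread is starved by it."""
    return not starved_threads(cycle, tids)


if __name__ == "__main__":
    tids = {"t", "u"}
    # The spin loop of the running example: u loops while t stays enabled.
    spin = [({"t", "u"}, "u"), ({"t", "u"}, "u")]
    print(starved_threads(spin, tids), is_fair(spin, tids))
```

In this representation the spin loop is unfair because t is enabled yet never scheduled, which is the kind of cycle that, by Theorem 4, the fair search does not keep unrolling.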
4.2.3 Ability to find errors

The experiments above show that fairness improves the efficacy of safety checking without sacrificing soundness for two programs. On larger programs, for which state extraction is not manually feasible, we demonstrate the efficacy of the fairness algorithm indirectly by demonstrating its ability to find safety errors in the programs.

A prior version of CHESS had found six bugs on versions of the work-stealing queue and the Dryad channels, modified to be terminating. We ran CHESS on the unmodified programs with a context bound of 2 preemptions, both with and without fairness. Since these programs are nonterminating, we set the depth bound to 250 for the search without fairness. This depth bound is the minimum required to find these errors. Table 3 compares the performance of the search with fairness to the search without fairness. As the table shows, the fairness algorithm finds the first five errors much faster, both in terms of the number of executions explored prior to the buggy execution and the time taken for the search. The sixth error is not found by the search without fairness. The seventh row in Table 3 shows a previously unknown bug that the fairness algorithm found in Dryad. The bug is caused by an incorrect fix of bug 3 by the developer of Dryad. This error is also not found by the search without fairness.

Figure 5. Dining philosophers (3). [Plot not recoverable: time in seconds, log scale from 1 to 10000, for the strategies cb = 1, cb = 2, cb = 3, and dfs; series: fair, nf db=20, nf db=30, nf db=40, nf db=50, nf db=60.]

Figure 6. Work-stealing queue (2). [Plot not recoverable: time in seconds, log scale from 1 to 10000, for the strategies cb = 1, cb = 2, cb = 3, and dfs; series: fair, nf db=20, nf db=30, nf db=40, nf db=50, nf db=60.]

4.3 Liveness violations

The fairness algorithm diverges in two cases: when the program violates the good samaritan property (Section 2) or when it contains a livelock. Both these outcomes indicate correctness or performance errors in the program. We demonstrate one instance of each violation that CHESS found in existing programs.

               No. of executions       Time (secs)
Bugs           With       Without      With       Without
               fairness   fairness     fairness   fairness
WSQ bug 1      82         182          2          8
WSQ bug 2      112        432          8          24
WSQ bug 3      212        843          12         74
Dryad bug 1    310        11024        8          273
Dryad bug 2    4002       47030        114        1247
Dryad bug 3    25113      -            754        >7200
Dryad bug 4    21014      -            677        >7200
Table 3. Number of executions of the test and the time required to find errors with and without fairness in the work-stealing queue (WSQ) and Dryad channels. The fourth Dryad error is a previously unknown error that CHESS found in the fix of the first three errors.

    void Worker::Run(Object obj) {
        while (!stop) {
            while (!stop && task != null) {
                // perform task
                ...
                task = PopNextTask();
            }
            if (!stop) {
                task = group.Idle(this);
            }
        }
    }

    Task WorkerGroup::Idle(Worker currentWorker) {
        while (!stop) {
            ...
            // No work to be found
            // Yield to other threads.
            currentWorker.YieldExponential();
            ...
        }
        return null;
    }
Figure 7. Violation of the good samaritan property. Under certain cases, the outer loop in Worker::Run results in the thread spinning idly until time-slice expiration.

4.3.1 Good samaritan property violation

We used CHESS to test the implementation of a library that provides efficient parallel execution of tasks. This library maintains a collection of worker threads, partitioned into a set of worker groups. CHESS detected a violation of the good samaritan property during the shutdown of the library. There is a field called stop in both the Worker and WorkerGroup classes. During the shutdown process, the stop field in a worker group and the stop field in each worker of that worker group are set to true, causing all the workers to eventually finish. However, there is a small window of time during which the stop field in the worker group is true but the stop field in one of the workers is false. In this situation, if the queue of tasks is empty, then the worker spins in a loop without yielding the processor until its time slice expires. This behavior starves other threads, potentially including the one that is responsible for setting the stop field of the worker to true.

    volatile int x;
    // ...
    int x_temp = InterlockedRead(x);
    if (common case 1) break;
    if (common case 2) break;
    ...
    // spin in the uncommon case
    while (x_temp != 1) {
        Sleep(1); // yield
        // BUG: should read x once again
    }
Figure 8. The spin loop incorrectly waits on a temporary cache of the global variable, resulting in a livelock.

4.3.2 Livelock in Promise

We used CHESS to test the implementation of promises, a concurrency primitive for specifying data parallelism. The implementation is optimized for efficiency and selectively uses low-level hardware primitives for performance. CHESS detected a livelock in promises caused by a simple programming error. While we are unable to provide the actual code snippet, Figure 8 shows pseudocode exhibiting the same error. For the sake of performance, programmers tend to make local copies of shared global variables. The livelock in Figure 8 occurs when the program erroneously waits for the local copy to change, without updating the copy with the value in the global variable. This bug was hard to detect as it only occurred in those rare thread interleavings in which the common cases shown in the pseudocode were inapplicable.

5. Related work

The need for fairness when reasoning about concurrent programs is well known [19, 2, 6, 3, 10]. Of the different useful notions of fairness [6, 18], this paper deals with strong fairness (also known as strong process fairness [3] or fairness [19]). Fairness has also been studied extensively in the context of model checking [5, 24, 25, 1, 12] of temporal logic specifications, all of which deal with stateful model checking. To our knowledge, this paper is the first to propose fairness as a means of improving the efficacy of stateless model checking for safety verification. Also, this paper is the first to extend the ability of stateless model checking to comprehensively detect livelocks.

Our fair scheduling algorithm is related to the explicit scheduler construction in Apt and Olderog [2, 6]. This scheduler is primarily used as a proof methodology for proving the termination of concurrent programs.
In particular, the scheduler requires generating a random integer for the priority of a thread after every step. Such unbounded nondeterminism [10], while useful in generating all fair schedules of a program, cannot be effectively implemented in a model checker. In contrast, our algorithm requires a finite choice among a subset of enabled threads at each step of the execution.

Most operating system schedulers employ mechanisms to fairly share resources among competing threads and users. These algorithms typically manipulate priorities [16, 11] based on resource usage or use randomized schemes [27] to guarantee fairness. These algorithms are not designed to expose the nondeterministic choices of the scheduler and cannot be used in a model checker.

The model checker described in this paper belongs to the class that directly executes programs [8, 26, 21, 28] (as opposed to analyzing abstract program models). The technique of stateless model checking was proposed in Verisoft [8] and has been successful for systematically testing industrial concurrent systems [4]. Stateless

model checking typically relies on partial-order reduction techniques [9] to reduce the state space explored. Partial-order reduction can only determine the equivalence of two executions of the same length and thus is inherently incapable of detecting equivalence of executions that revisit the same state in a cycle. Partial-order reduction, however, can be used to significantly reduce the set of all fair schedules of fair-terminating programs, an interesting avenue of future research that we are currently pursuing.

Killian et al. [17] use a hybrid stateful and stateless technique to find liveness errors in network protocols. Their algorithm can only find those liveness violations that are characterized by the presence of dead states and will not find the livelock in the dining philosopher example (Section 1) or the violation of the good samaritan property described in Section 4.3.1. Also, their algorithm provides no soundness guarantees of finding livelocks.

6. Conclusions and future work

This paper proposes the use of a fair scheduler in stateless model checking. Fairness both enables the detection of liveness errors and substantially improves the efficiency of safety verification in a stateless model checker. The incorporation of the fair scheduling algorithm in the CHESS model checker has significantly improved the applicability of CHESS to large nonterminating programs. The fairness-enhanced CHESS has found many liveness errors in several industry-scale programs.

Currently, CHESS checks two liveness properties: fair termination and the good-samaritan rule. We would like to extend CHESS to check an arbitrary liveness property. To motivate this work, we are currently identifying liveness properties that are useful for multithreaded software. We are also investigating extensions to the fair scheduler to handle unbounded thread creation.

Acknowledgements

We thank Tom Ball for help with the evaluation of the fair scheduler. We also thank Patrice Godefroid, Andreas Podelski, Mihalis Yannakakis, and the reviewers for valuable comments on a prior version of the paper.

References

[1] S. Aggarwal, C. Courcoubetis, and P. Wolper. Adding liveness properties to coupled finite-state machines. ACM Transactions on Programming Languages and Systems, 12(2):303–339, 1990.
[2] K.R. Apt and E.-R. Olderog. Proof rules and transformations dealing with fairness. Science of Computer Programming, 3:65–100, 1983.
[3] Krzysztof R. Apt, Nissim Francez, and Shmuel Katz. Appraising fairness in languages for distributed programming. In POPL 87: Principles of Programming Languages, pages 189–198, 1987.
[4] Satish Chandra, Patrice Godefroid, and Christopher Palm. Software model checking in practice: an industrial case study. In ICSE 02: International Conference on Software Engineering, pages 431–441, 2002.
[5] E.M. Clarke and E.A. Emerson. Synthesis of synchronization skeletons for branching time temporal logic. In Logic of Programs, LNCS 131, pages 52–71. Springer-Verlag, 1981.
[6] Nissim Francez. Fairness. In Texts and Monographs in Computer Science. Springer-Verlag, 1986.
[7] Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. The implementation of the Cilk-5 multithreaded language. In PLDI 98: Programming Language Design and Implementation, pages 212–223. ACM Press, 1998.
[8] P. Godefroid. Model checking for programming languages using Verisoft. In POPL 97: Principles of Programming Languages, pages 174–186. ACM Press, 1997.
[9] Patrice Godefroid. Partial-Order Methods for the Verification of Concurrent Systems: An Approach to the State-Explosion Problem. LNCS 1032. Springer-Verlag, 1996.
[10] Orna Grumberg, Nissim Francez, and Shmuel Katz. Fair termination of communicating processes. In PODC 84: Principles of Distributed Computing, pages 254–265. ACM Press, 1984.
[11] Joseph L. Hellerstein. Achieving service rate objectives with decay usage scheduling. IEEE Transactions on Software Engineering, 19(8):813–825, 1993.
[12] G. Holzmann. The model checker SPIN. IEEE Transactions on Software Engineering, 23(5):279–295, May 1997.
[13] Galen C. Hunt, Mark Aiken, Manuel Fähndrich, Chris Hawblitzel, Orion Hodson, James R. Larus, Steven Levi, Bjarne Steensgaard, David Tarditi, and Ted Wobber. Sealing OS processes to improve dependability and safety. In Proceedings of the EuroSys Conference, pages 341–354, 2007.
[14] Radu Iosif. Exploiting heap symmetries in explicit-state model checking of software. In ASE 01: Automated Software Engineering, pages 254–261, 2001.
[15] Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. In Proceedings of the EuroSys Conference, pages 59–72, 2007.
[16] J. Kay and P. Lauder. A fair share scheduler. Communications of the ACM, 31(1):44–55, 1988.
[17] Charles Edwin Killian, James W. Anderson, Ranjit Jhala, and Amin Vahdat. Life, death, and the critical transition: Finding liveness bugs in systems code. In NSDI 07: Symposium on Networked Systems Design and Implementation, pages 243–256, 2007.
[18] M. Z. Kwiatkowska. Survey of fairness notions. Information and Software Technology, 31(7):371–386, 1989.
[19] Daniel J. Lehmann, Amir Pnueli, and Jonathan Stavi. Impartiality, justice and fairness: The ethics of concurrent termination. In ICALP 81: International Conference on Automata Languages and Programming, pages 264–277, 1981.
[20] Daan Leijen. Futures: a concurrency library for C#. Technical Report MSR-TR-2006-162, Microsoft Research, 2006.
[21] M. Musuvathi, D. Park, A. Chou, D. Engler, and D. L. Dill. CMC: A pragmatic approach to model checking real code. In OSDI 02: Operating Systems Design and Implementation, pages 75–88, 2002.
[22] Madanlal Musuvathi and Shaz Qadeer. Iterative context bounding for systematic testing of multithreaded programs. In PLDI 07: Programming Language Design and Implementation, pages 446–455, 2007.
[23] Amir Pnueli. The temporal logic of programs. In FOCS 77: Foundations of Computer Science, pages 46–57, 1977.
[24] J. Queille and J. Sifakis. Specification and verification of concurrent systems in CESAR. In Fifth International Symposium on Programming, LNCS 137, pages 337–351. Springer-Verlag, 1981.
[25] M. Y. Vardi and P. Wolper. An automata-theoretic approach to automatic program verification. In LICS 86: Logic in Computer Science, pages 322–331. IEEE Computer Society Press, 1986.
[26] W. Visser, K. Havelund, G. Brat, and S. Park. Model checking programs. In ASE 00: Automated Software Engineering, pages 3–12, 2000.
[27] Carl A. Waldspurger and William E. Weihl. Lottery scheduling: Flexible proportional-share resource management. In OSDI 94: Operating Systems Design and Implementation, pages 1–11, 1994.
[28] Junfeng Yang, Paul Twohey, Dawson R. Engler, and Madanlal Musuvathi. Using model checking to find serious file system errors. ACM Transactions on Computer Systems, 24(4):393–423, 2006.