Sequentialization targeted to Bounded Model-Checkers
Salvatore La Torre
Dipartimento di Informatica, Università degli Studi di Salerno
Sequentialization
Code-to-code translation from a multithreaded program to an equivalent sequential one.
[Figure: a concurrent program (threads T1, T2, ..., Tm, each with its local variables, plus shared variables) is translated into a sequential program, which is passed to a sequential verifier.]
Re-use of existing tools (delegate the analysis to the backend tool).
Fast prototyping (designers can concentrate only on concurrency features).
Can work with different backends.
Sequentialization
alters the original program structure by injecting control code (which is an overhead for the backend);
requires careful attention to the details of the translation for well-performing tools;
replaces concurrency with nondeterminism.
Backend: Bounded Model Checkers
Reduce program verification to SAT/SMT solving.
Very effective technique to discover bugs in sequential programs.
Performance relies on the performance of the underlying solvers.
Impressive improvement of SAT solvers in recent years.
[Figure: Vars (log scale, 1 to 100000) against Year (1960 to 2010).]
Implementations with BMC backend
LR seq. [Lal-Reps, CAV 08]: eager, bounded round-robin.
LMP seq. [La Torre-Madhusudan-Parlato]: lazy, any scheduling.
With BMC, LR works better than LMP [Ghafari-Hu-Rakamaric, SPIN 10].
Implementations of LR:
CSeq: Pthreads C programs [Fischer-Inverso-Parlato, ASE 13]
STORM: also dynamic memory allocation [Lahiri-Qadeer-Rakamaric, CAV 09]
Corral [Lal-Qadeer-Lahiri, CAV 12]
Delay-bounded scheduling [Emmi-Qadeer-Rakamaric, POPL 11]
Real-time systems [Chaki-Gurfinkel-Strichman, FMCAD 11]
Lal/Reps Sequentialization
Considers only round-robin schedules with k rounds:
thread → function, run to completion;
global memory copy for each round (scalar → array);
context switch → round counter++;
first thread starts with nondeterministic memory contents;
other threads continue with the content left by their predecessor.
Checker prunes away inconsistent simulations:
assume(S_{i+1,0} == S_{i,n});
requires a second set of memory copies;
errors can only be checked at the end of the simulation;
requires explicit error checks.
[Figure: grid of memory snapshots S_{i,j} for rounds i = 0..k (columns) and threads T0..Tn (rows), from S_{0,0} to S_{k,n}.]
CSeq Tool Architecture [Fischer-Inverso-Parlato, ASE 13]
pycparser, AST traversal with unparsing:
insert new type declarations, modify memory accesses;
insert context-switch simulation code at each sequence point;
insert explicit error checks;
insert checker and code for pthread functions.
[Figure: CSeq takes a concurrent C program P and the bounds k, N, and produces a sequential non-deterministic C program P'; a sequential tool then reports SAFE or UNSAFE.]
Can we improve this?
Eager sequentializations cannot rely on error checks built into the backend; they also require specific techniques to handle programs with heap-allocated memory.
LR (but also LMP) uses additional copies of shared variables; BMC will make more copies with loop/recursion unwinding, which can seriously affect the formula size.
Two new sequentializations [Inverso-Tomasco-Fischer-La Torre-Parlato]
Lazy-CSeq [CAV 14, TACAS/SVCOMP 14]: lazy; light-weight (few additional variables and code).
MU-CSeq [TACAS/SVCOMP 14]: bound on memory unwindings; extension to message-passing programs.
Outline
Bounded Model-checking for programs
Lazy sequentialization: Lazy-CSeq
Memory unwindings: MU-CSeq
Experiments
Conclusions
How does it work?
Transform a program into a set of equations:
1. Simplify control flow
2. Unwind all of the loops
3. Convert into Static Single Assignment (SSA)
4. Convert into equations
5. Bit-blast
6. Solve with a SAT solver
7. Convert the SAT assignment into a counterexample
Control Flow Simplifications
All side effects are removed: e.g., j=i++ becomes j=i; i=i+1.
Control flow is made explicit: continue, break are replaced by goto.
All loops are simplified into one form: for, do, while are replaced by while.
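As a rough illustration of these three rewrites, the sketch below pairs a small function with a hand-simplified version. The names `sum_original` and `sum_simplified` are hypothetical, and the rewriting was done by hand for this example; it is not the output of any tool.

```c
/* Original: a side effect inside an expression, a for loop, and continue. */
int sum_original(int n) {
    int i = 0, j, s = 0;
    for (int k = 0; k < n; k++) {
        j = i++;               /* side effect inside an expression */
        if (j % 2) continue;   /* implicit control flow */
        s += j;
    }
    return s;
}

/* Hand-simplified: no side effects inside expressions, goto instead of
   continue, and a single while loop. */
int sum_simplified(int n) {
    int i = 0, j, s = 0;
    int k = 0;
    while (k < n) {
        j = i;
        i = i + 1;
        if (j % 2) goto next;
        s = s + j;
    next:
        k = k + 1;
    }
    return s;
}
```

Both functions add up the even values among 0..n-1, so they can be compared directly on a few inputs.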
Loop Unwinding
while() loops are unwound iteratively; break / continue are replaced by goto.
Original:
    void f() {
      while(cond) { Body; }
      Remainder;
    }
After one unwinding:
    void f() {
      if(cond) {
        Body;
        while(cond) { Body; }
      }
      Remainder;
    }
After two unwindings:
    void f() {
      if(cond) {
        Body;
        if(cond) {
          Body;
          while(cond) { Body; }
        }
      }
      Remainder;
    }
Unwinding assertion
while() loops are unwound iteratively; break / continue are replaced by goto.
Assume statements are inserted after the last iteration: they block execution if the program runs longer than the bound permits.
After three unwindings:
    void f() {
      if(cond) {
        Body;
        if(cond) {
          Body;
          if(cond) {
            Body;
            while(cond) { Body; }
          }
        }
      }
      Remainder;
    }
With the remaining loop replaced by the unwinding assume:
    void f() {
      if(cond) {
        Body;
        if(cond) {
          Body;
          if(cond) {
            Body;
            assume(!cond);   // unwinding assume
          }
        }
      }
      Remainder;
    }
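The effect of the bound can be tried on a concrete loop. The following is a minimal sketch, assuming a bound of 3; plain C has no `assume`, so the hypothetical `blocked` flag records the cases where `assume(!cond)` would cut the execution off. The function names are made up for this example.

```c
#include <stdbool.h>

/* Original loop: counts how often n can be halved before it reaches 0. */
int halvings(unsigned n) {
    int steps = 0;
    while (n > 0) {
        n = n / 2;
        steps = steps + 1;
    }
    return steps;
}

/* The same loop unwound three times. *blocked is set where the
   unwinding assume(!(n > 0)) would block the execution. */
int halvings_unwound3(unsigned n, bool *blocked) {
    int steps = 0;
    *blocked = false;
    if (n > 0) {
        n = n / 2; steps = steps + 1;
        if (n > 0) {
            n = n / 2; steps = steps + 1;
            if (n > 0) {
                n = n / 2; steps = steps + 1;
                if (n > 0) *blocked = true;   /* assume(!(n > 0)) fails */
            }
        }
    }
    return steps;
}
```

For inputs that need at most three iterations the two functions agree; for larger inputs the unwound version reports a blocked run, which is why results obtained under a bound only cover executions that fit within it.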
Transforming Loop-Free Programs Into Equations (1)
Easy to transform when every variable is only assigned once!
Program: x = a; y = x + 1; z = y - 1;
Constraints: x = a && y = x + 1 && z = y - 1
Transforming Loop-Free Programs Into Equations (2)
When a variable is assigned multiple times, introduce a new variable at each assignment and use the latest one in the right-hand sides that follow.
[Figure: a program and its SSA form, side by side.]
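A small hand-made instance of this renaming, with hypothetical function names, might look as follows. Each assignment in the second function writes a fresh version of x, so its body can be read directly as the constraint x1 = a0 + b0 && x2 = x1 * 2 && x3 = x2 - a0.

```c
/* Original: x is assigned three times. */
int assign_original(int a, int b) {
    int x;
    x = a + b;
    x = x * 2;
    x = x - a;
    return x;
}

/* SSA form: every assignment defines a fresh version of x, and each
   right-hand side reads the latest version defined so far. */
int assign_ssa(int a0, int b0) {
    int x1 = a0 + b0;
    int x2 = x1 * 2;
    int x3 = x2 - a0;
    return x3;
}
```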
What about conditionals?
Program:
    if (v) x = y; else x = z;
    w = x;
SSA Program, first attempt:
    if (v0) x0 = y0; else x1 = z0;
    w1 = x??;
What should x be?
SSA Program, with a selector:
    if (v0) x0 = y0; else x1 = z0;
    x2 = v0 ? x0 : x1;
    w1 = x2;
For each join point, add new variables with selectors.
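The selector can be checked on the slide's own example. In this sketch (function names are hypothetical) both branch values are computed unconditionally, which is harmless because the assignments have no side effects, and the selector picks the version that reaches the join point.

```c
/* Original: x is assigned on both branches and read after the join. */
int branch_original(int v, int y, int z) {
    int x, w;
    if (v) x = y; else x = z;
    w = x;
    return w;
}

/* SSA form with a selector at the join point. */
int branch_ssa(int v0, int y0, int z0) {
    int x0 = y0;              /* version written by the then-branch */
    int x1 = z0;              /* version written by the else-branch */
    int x2 = v0 ? x0 : x1;    /* selector at the join point */
    int w1 = x2;
    return w1;
}
```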
Example
CBMC: Bounded Model Checker for C
A tool by D. Kroening/Oxford and Ed Clarke/CMU.
[Figure: CBMC pipeline. C program, parser, goto-program, static analysis, equations, CNF-gen, CNF, SAT solver. UNSAT means SAFE; SAT goes through CEX-gen and yields UNSAFE + CEX.]
BMC: from sequential to concurrent
BMC for sequential programs has been used to discover subtle errors in applications; robust BMC tools exist for C programs (e.g. CBMC, LLBMC, ESBMC).
Attempts to apply BMC to multi-threaded programs face problems: the number of interleavings grows exponentially with #threads and #statements.
Recent solutions for multi-threaded programs: partial orders, sequentializations.
Lazy-CSeq schema
P' simulates all computations (up to K rounds) of P.
Lazy: avoid exploring infeasible runs.
[Figure: threads T1, T2, ..., Tn are translated into the sequential composition (T'1 ; T'2 ; ... ; T'n) repeated K times.]
Lazy-CSeq Sequentialization
Translation P → P': unwinding, inlining (bounded program).
thread T → function T'
Main driver:
    for each round in [1..K]
      for each thread in [1..N]
        T'_thread();
Main driver
    void main(void) {
      for(r=1; r<=K; r++) {
        ct=1;                          // thread 1
        if(active[ct]) {               // only active threads
          cs=pc[ct]+nondet_uint();     // guess cs
          assume(cs<=size[ct]);        // legal?
          fseq_1(arg[ct]);             // simulate thread
          pc[ct]=cs;                   // store cs
        }
        : : :
        ct=n;                          // thread n
        if(active[ct]) {
          : : :
        }
      }
    }
K: rounds
ct: thread counter (the thread currently simulated)
active[j]: true iff thread j is active
cs: guessed context-switch position for the coming simulation
pc[j]: position at which thread j was context-switched out
size[j]: last position for context switches in thread j
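The driver can be exercised in plain C if the nondeterministic guess is replaced by an explicit schedule. The sketch below is an illustration under that assumption, not the code generated by Lazy-CSeq: `guess[r][t]` stands in for `nondet_uint()`, a `false` return value stands in for a failing `assume`, and the two simulated threads (each a non-atomic increment of a shared counter) use plain range checks on their positions. All names other than `pc`, `cs`, `ct`, `size` and `active` are hypothetical.

```c
#include <stdbool.h>

enum { N = 2, K = 2 };                  /* threads and rounds */
static int c;                           /* shared counter */
static int tmp[N + 1];                  /* thread locals, made static */
static unsigned pc[N + 1], cs, ct;      /* as in the driver above */
static const unsigned size[N + 1] = {0, 2, 2};
static const bool active[N + 1] = {false, true, true};

/* Simulated thread: executes its statements at positions pc[ct] <= p < cs. */
static void inc_seq(void) {
    if (pc[ct] <= 0 && 0 < cs) tmp[ct] = c;       /* position 0: read  */
    if (pc[ct] <= 1 && 1 < cs) c = tmp[ct] + 1;   /* position 1: write */
}

/* Runs K rounds over N threads with the given guesses. Returns false
   when assume(cs <= size[ct]) would block the run. */
bool run_schedule(unsigned guess[K][N + 1], int *result) {
    c = 0;
    for (unsigned t = 1; t <= N; t++) pc[t] = 0;
    for (unsigned r = 1; r <= K; r++) {
        for (ct = 1; ct <= N; ct++) {
            if (!active[ct]) continue;            /* only active threads */
            cs = pc[ct] + guess[r - 1][ct];       /* guess cs */
            if (cs > size[ct]) return false;      /* legal? */
            inc_seq();                            /* simulate thread */
            pc[ct] = cs;                          /* store cs */
        }
    }
    *result = c;
    return true;
}
```

With guesses that let each thread run to completion in round 1 the counter ends at 2. If thread 1 is switched out between its read and its write, it overwrites the update of thread 2 in round 2 and the counter ends at 1. A BMC backend explores all such guesses symbolically instead of one schedule at a time.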
Thread T → function T'
Translation P → P': unwinding, inlining.
Main driver: for round in [1..K], for thread in [1..N], T'_thread();
Inside T':
    var x;   becomes   static var x;
    stmt;    becomes   guard; stmt;
Thread T → function T'
Thread simulation, round 1:
guess context-switch point cs_1 (in main);
execute stmts before cs_1;
jump in multiple hops to the end and return.
[Figure: in round 1, T' executes from its start up to the context switch cs_1 and skips the rest.]
Thread T → function T'
Thread simulation, round i > 1:
guess context-switch point cs_i (in main);
jump in multiple hops to pc_{i-1};
execute stmts from pc_{i-1} to cs_i;
jump in multiple hops to the end and return.
[Figure: in round i, T' skips up to pc_{i-1}, resumes and executes up to the context switch cs_i, then skips the rest.]
Instrumenting
Jumping in and out of thread executions (stmt; becomes guard; stmt;).
Multiple hops:
    #define J(A,B) if ( pc[ct] > A || A >= cs ) goto B;
Ex. At position 5, J(5,6) jumps to 6 (the next position) when 5 is not in the interval [pc[ct], cs).
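The hopping behaviour of J can be observed on a toy thread whose statements only record that they ran. This is a minimal sketch: `counting_thread`, `simulate_round` and the `executed` array are hypothetical, and the guess that the real driver draws nondeterministically is passed in explicitly.

```c
enum { THREADS = 1 };
static unsigned pc[THREADS + 1];  /* pc[t]: position where thread t was switched out */
static unsigned cs;               /* context-switch position guessed for this round */
static unsigned ct;               /* thread being simulated */
static int executed[4];           /* executed[p]: how many times statement p has run */

/* Hop over the statement at position A unless pc[ct] <= A < cs. */
#define J(A, B) if (pc[ct] > (A) || (A) >= cs) goto B;

/* A four-statement thread body; statement p just records that it ran. */
void counting_thread(void) {
        J(0, l1) executed[0]++;
    l1: J(1, l2) executed[1]++;
    l2: J(2, l3) executed[2]++;
    l3: J(3, l4) executed[3]++;
    l4: return;
}

/* One simulation round of thread t; guess replaces pc[ct] + nondet_uint(). */
void simulate_round(unsigned t, unsigned guess) {
    ct = t;
    cs = guess;
    counting_thread();
    pc[ct] = cs;
}
```

Calling `simulate_round(1, 2)` and then `simulate_round(1, 4)` runs statements 0 and 1 in the first round and statements 2 and 3 in the second, so every statement runs exactly once across the two rounds.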
Branching stmts
Use macro:
    #define G(L) assume(cs >= L);
Translation:
    if cond { body1 } else { body2 }
becomes
    if cond { body1 } else { G(A) body2 } G(B)
where A is the first position of body2 and B is the first position after the if statement.
Rules out spurious thread executions in which the context-switch position lies inside the branch that is not taken.
Example
    2:J(2,3)  if(c>0)
    3:J(3,4)    c++;
              else { G(4)
    4:J(4,5)    c=0;
                if(!(tmp>0)) goto l1;
    5:J(5,6)    c++; tmp--;
                if(!(tmp>0)) goto l1;
    6:J(6,7)    c++; tmp--;
                assume(!(tmp>0));
                l1: G(7);
              } G(7)
    7:J(7,8)  pthread_mutex_unlock(&m);
Assume pc=2 and cs=3 (only 2: must be executed), and additionally suppose c<=0 holds.
Execution jumps to the else stmt and exits. In main, pc is then set to 3 (cs).
When resumed, the thread will start from 3: executing c++!
Adding the guard G(4) right after the else rules out this infeasible computation.
Note that this does not rule out any good executions: the sketched one is captured by guessing cs=4 instead of cs=3.
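The spurious run described in this example can be reproduced on a cut-down version of the branch (the unwound loop is left out). The sketch below rests on several assumptions: positions 2 to 4 mirror the slide and position 5 is simply the end of the thread, a single thread is simulated so `pc` is a scalar, the hypothetical `guards_on` switch turns G on or off, and the hypothetical `blocked` flag stands in for a failing `assume`.

```c
#include <stdbool.h>

static unsigned pc, cs;  /* resume position and guessed context switch */
static int c;            /* shared variable of the example */
static bool guards_on;   /* hypothetical switch: enable or disable G */
static bool blocked;     /* set when an assume would block the run */

#define J(A, B) if (pc > (A) || (A) >= cs) goto B;
/* Stand-in for G(L) = assume(cs >= L): record the block and stop. */
#define G(L) if (guards_on && cs < (L)) { blocked = true; return; }

static void branch_thread(void) {
        J(2, l3) if (c > 0) {
    l3:     J(3, l4) c++;
        } else {
            G(4)
    l4:     J(4, l5) c = 0;
        }
        G(5)
    l5: return;
}

/* Simulates two rounds starting at position 2 with c == 0. */
int run_two_rounds(bool with_guards, unsigned cs1, unsigned cs2,
                   bool *was_blocked) {
    guards_on = with_guards;
    c = 0; pc = 2; blocked = false;
    cs = cs1; branch_thread();
    if (!blocked) { pc = cs; cs = cs2; branch_thread(); }
    *was_blocked = blocked;
    return c;
}
```

Without guards, the guesses cs=3 then cs=4 end with c == 1 although the condition c>0 was false, which is the infeasible computation. With guards the same guesses are blocked in the first round, while the guesses cs=4 then cs=5 go through and end with c == 0.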
Main idea
Guess a memory unwinding of the shared memory: the sequence of writes into the shared memory of an execution.
Execute each thread s.t. its local computation is consistent with the memory unwinding; any scheduling that fits it is considered.
Crucial notion: all the shared memory operations are done through a memory object.
Memory object
Stores a sequence of writes.
Each write is a triple (thread, variable, value).
Is used through an interface.
Several implementations are possible.
Memory object interface
    int mem_init(uint V, uint W, uint T);
instantiates a memory object:
V = number of variables
W = number of writes
T = max number of threads
Memory object interface
    int read(uint th_id, uint var);
    void write(uint th_id, uint var, int value);
    int mem_thread_create(uint parent_id);
returns the id for the created thread
    void terminate(uint th_id);
checks that all writes of th_id have been executed
MU-CSeq sequentialization
A function for each thread: thread T → function T'
Main driver:
    void proc main(void)
      call mem_init(V, W, T);           // instantiate memory
      ct := mem_thread_create(0);       // register main thread
      call main_{W,T}(x1, ..., xk);
      call terminate(ct);               // all writes are executed
      assert(_error != 1)               // error check
Simulation
Basic idea: simulate all executions compatible with the guessed memory unwinding.
Uses auxiliary variables: pos (current index into the memory object) and ct (id of the currently simulated thread).
Every thread is translated into a function; simulation starts from the main thread.
Each thread creation interrupts the simulation of the current thread: the new thread is called with the current value of pos.
Terminated threads must have completed their memory writes: terminate(ct) must hold.
The old thread is resumed with the old memory position (in the activation record).
Simulation
Basic idea: simulate all executions compatible with the guessed memory unwinding.
Every read / write is translated into a function that simulates a valid access into the unwound memory (similarly for pthread functions).
Implementation I: Explicit Read
Guess and store the sequence of individual write operations:
add N copies of the shared variables ("memory"): _memory[i,j] is the value of the j-th variable after the i-th write;
add an array to record the writes ("writes"): the i-th write is by _thr[i], which has written to _var[i];
add an array to record the validity of writes ("barrier"): the next write (after i) by thread t is at _next[i,t].

    pos | memory     | writes   | barrier
        | x   y   z  | thr var  | next[1]  next[2]  ...  next[t]
    0   | 0   0   0  | 0   0    | 1        3             N+1
    1   | 4   0   0  | 1   x    | 2        3             N+1
    2   | 4   2   0  | 1   y    | 4        3             N+1
    3   | 4   3   0  | 2   y    | 4        N             N+1
    4   | 4   3   42 | 1   z    | N+1      N             N+1
    ...
    W   | 0   3   42 | 2   x    | N+1      N+1           N+1
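The three arrays can be rebuilt from the write sequence alone, which gives a way to check the table. The sketch below assumes that the unwinding has exactly five writes (the rows the slide elides between position 4 and position W are dropped) and that only two threads exist. The array names without the leading underscore, `val`, `next_write` and `build_tables` are hypothetical.

```c
enum { V = 3, W = 5, T = 2 };           /* variables, writes, threads */
enum { X, Y, Z };                       /* indices of x, y, z */

/* Write i is by thr[i] to var[i] with value val[i]; index 0 is the
   initial state, so positions run from 1 to W as in the table. */
static const unsigned thr[W + 1] = {0, 1, 1, 2, 1, 2};
static const unsigned var[W + 1] = {0, X, Y, Y, Z, X};
static const int      val[W + 1] = {0, 4, 2, 3, 42, 0};

static int      memory[W + 1][V];          /* value of variable j after write i */
static unsigned next_write[W + 1][T + 1];  /* next write after i by t, or W + 1 */

void build_tables(void) {
    for (unsigned j = 0; j < V; j++) memory[0][j] = 0;
    for (unsigned i = 1; i <= W; i++) {
        /* Copy the previous row, then apply write i. */
        for (unsigned j = 0; j < V; j++) memory[i][j] = memory[i - 1][j];
        memory[i][var[i]] = val[i];
    }
    for (unsigned t = 1; t <= T; t++) {
        unsigned nxt = W + 1;
        for (unsigned i = W + 1; i-- > 0;) {    /* i = W down to 0 */
            next_write[i][t] = nxt;
            if (i >= 1 && thr[i] == t) nxt = i;
        }
    }
}
```

After `build_tables()` the rows of `memory` match the table (for instance `memory[4]` is 4, 3, 42), and `next_write[i][1]` for i = 0..4 is 1, 2, 4, 4, W+1, matching the next[1] column.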
Simulating reads and writes
[Table: the same memory, writes and barrier arrays as on the previous slide.]
    int read(uint th_id, uint var) {
      uint th_pos;
      if is_terminated(th_id) then return;
      th_pos = Jump(th_id);            // position at which the read takes place
      return (mem[th_pos][var]);
    }
Simulating reads and writes
    void write(uint th_id, uint var, int val) {
      uint jump;
      if is_terminated(th_id) then return;
      jump = _next[pos][th_id];        // next write of this thread
      assume ( (jump <= last_write_pos)
            && (_var[jump] == var)
            && (value[jump] == val) );
      pos = jump;
    }
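The consistency checks behind read and write can be tried on a three-write unwinding. This sketch makes several simplifying assumptions that are not in the slides: there is a single shared variable, each thread keeps its own position in a hypothetical `pos[]` array instead of the global pos saved in activation records, the nondeterministic jump of a read is passed in as an explicit `guess`, read values are recovered by scanning back through the writes, and a `false` return value stands in for a failing `assume`.

```c
#include <stdbool.h>

enum { W = 3, T = 2 };                  /* writes and threads */
/* Guessed unwinding of one shared variable (index 0): write 1 by
   thread 1 (value 1), write 2 by thread 2 (value 2), write 3 by
   thread 1 (value 3). Index 0 is the initial state. */
static const unsigned w_thr[W + 1] = {0, 1, 2, 1};
static const unsigned w_var[W + 1] = {0, 0, 0, 0};
static const int      w_val[W + 1] = {0, 1, 2, 3};
static unsigned pos[T + 1];             /* pos[t]: position reached by thread t */

/* Position of the next write by thread t after position i, or W + 1. */
static unsigned next_by(unsigned i, unsigned t) {
    for (unsigned j = i + 1; j <= W; j++)
        if (w_thr[j] == t) return j;
    return W + 1;
}

/* Value of variable v after the first i writes (0 if never written). */
static int value_at(unsigned i, unsigned v) {
    for (; i >= 1; i--)
        if (w_var[i] == v) return w_val[i];
    return 0;
}

/* A write must match the next write of thread t in the unwinding. */
bool mu_write(unsigned t, unsigned v, int value) {
    unsigned jump = next_by(pos[t], t);
    if (jump > W || w_var[jump] != v || w_val[jump] != value) return false;
    pos[t] = jump;
    return true;
}

/* A read may happen at any position from pos[t] up to just before the
   next write of thread t; guess selects that position. */
bool mu_read(unsigned t, unsigned v, unsigned guess, int *out) {
    if (guess < pos[t] || guess >= next_by(pos[t], t)) return false;
    pos[t] = guess;
    *out = value_at(guess, v);
    return true;
}
```

A thread that tries to write a value different from the one recorded for its next write is cut off, and a thread cannot read past its own next pending write. These two conditions are what keeps each local computation consistent with the guessed unwinding.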
Other implementations
Implementation II (Implicit Read):
the W x V array mem is replaced with a W-array;
var_next_write[i] is the smallest j > i s.t. var[i] = var[j];
saves memory but causes a more complex Read implementation (larger formula).
Implementation III: mixed I and II.
Evaluation: SV-COMP 2014
Lazy-CSeq won the Gold Medal and MU-CSeq I won the Silver Medal in the Concurrency category.
Benchmarks: 76 concurrent C programs, 4,500 l.o.c.
UNSAFE instances: 20 programs containing a bug.
SAFE instances: all the others.
Ranking:
1) CSeq-Lazy: 1,000s, 136pts
2) CSeq-MU I: 1,200s, 136pts
3) CBMC: 29,000s, 128pts
Results: small verification times, small memory footprint, no missed bugs!
Evaluation: bug-hunting
Lazy-CSeq vs Native Concurrency Handling (UNSAFE instances)
Legend:
_1: timeout (750s)
_2: internal error
_3: manual translation not done
_4: test case rejected
_5: unknown failure
Evaluation: state space coverage
Lazy-CSeq + CBMC vs CBMC (SAFE instances)
How far is it possible to push the unwind bound with the two different methods and still finish the analysis within the given time and space requirements (10GB, 750s)?
Bounding the rounds allows deeper exploration of loops.
Alternative coverage of the state space.
Evaluation: formula size Lazy-CSeq + CBMC vs CBMC (v4.7) (on SV-COMP14 SAFE instances)
MU-CSeq
Conclusions
LR and LMP sequentializations: complete analysis of concurrent programs up to a given number of context switches (context-bounded analysis [Qadeer-Rehof, CAV 05]); unbounded analysis within each context.
Lazy-CSeq and MU-CSeq use additional bounding criteria: the backend tool will bound the computations anyway, and this can be exploited to induce simplifications of the generated formula.
Effective for bug-hunting: many concurrency errors show up within few context switches.
Lazy-CSeq
Lightweight sequentialization designed to take advantage of modern sequential BMC tools:
small BMC formulae for a small number of rounds;
reduced memory footprint and verification times.
Laziness:
avoids handling the spurious errors typical of eager exploration;
inherits from the backend tool all checks for sequential C: array-bounds checks, division by zero, pointer checks, overflow checks, reachability of error labels and assertion failures, etc.;
no need to handle dynamic memory allocation.
MU-CSeq
Ongoing research.
Preliminary experiments show it is competitive with state-of-the-art model-checkers:
II improves on I on some benchmarks;
III performs better than both I and II in general, and when it is outperformed it stays close to the best of the two.
NEXT: memory unwinding can be used to sequentialize message-passing programs (send/receive instead of read/write).
Lazy-CSeq and MU-CSeq: joint work with
Ermenegildo Tomasco, Omar Inverso, Gennaro Parlato (UK)
Bernd Fischer (South Africa)