Modelling systems, and model extraction from software

[Automated Reasoning, 2013/2014 1b Lecture 2]
Doina Bucur (d.bucur@rug.nl), Nov 2013

Summary

In this lecture...
- Modelling computing systems
  - Reactive systems
  - Kripke structures. Definition and intuition
- Modelling hardware, software, and protocols
  - Digital circuits: synchronous and asynchronous
  - Extracting models from software (sequential or concurrent)
  - Communication protocols
  - Modelling with Promela
- Historical and bibliographical notes
- Assignment

You learn to model various systems (software and hardware) in an automaton-like formalism of states and transitions called a transition system, and to extract a model from software. Checking the model against a state property is then immediate: you simply browse through all the states of the model and check your property. We apply this manual model checking to mutual exclusion algorithms. The assignments then move you towards using Spin to automate this.

Systems may be categorized:
- By form of system implementation: software, hardware (logic circuits, ASICs, etc.).
- By system behaviour and its complexity:
  - deterministic or nondeterministic
  - terminating or not terminating
  - synchronous or asynchronous
  - lazy or real-time
  - sequential or concurrent
  - transformational or reactive.

Reactive systems

A transformational computation is not interruptible (pictured as an input/output black box). A reactive system interacts with the environment; input events may arrive at any time (pictured as a "black cactus"). Reactive includes transformational.

[Figure: a system S drawn as a black box (left) and as a black cactus (right).]

A reactive system may be any combination of (non)terminating, (not) concurrent, (a)synchronous, etc. Examples: concurrent software, device drivers, networked machines, social humans, etc.; i.e., most complex systems.

The modelling formalism you see here suits reactive systems.

Modelling a reactive system...

...is by writing:
- Its internal state at any given time; this state describes the current values stored in all memory (stack, heap, registers), plus the program counter.
- What this state will transition into, given an incoming event.

For this, it is standard to use a variation of finite-state automata called Kripke structures.

Kripke structures

Definition (Kripke structure). A Kripke structure $M$ over a set of atomic propositions $AP$ is a tuple $M = (S, S_0, T, L)$ where
- $S$ is a finite set of states, or state space;
- $S_0 \subseteq S$ is the subset of initial states;
- $T \subseteq S \times S$ is a transition relation. $T$ should be total: $\forall s \in S, \exists s' \in S$ so that $T(s, s')$;
- $L : S \to \mathcal{P}(AP)$ is a function labelling each state with the subset of atomic propositions which are true in that state.
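The definition maps directly onto a small data structure. The following is a minimal sketch in Python, not part of the lecture material: the class name `Kripke`, its field names, and the two-state `light` example are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Kripke:
    states: frozenset       # S
    initial: frozenset      # S_0, a subset of S
    transitions: frozenset  # T, a set of (s, s') pairs
    labels: dict            # L, mapping each state to a set of atomic propositions

    def is_well_formed(self) -> bool:
        """S_0 is a subset of S, T is a subset of S x S, and L labels every state."""
        return (self.initial <= self.states
                and all(s in self.states and t in self.states
                        for s, t in self.transitions)
                and set(self.labels) == set(self.states))

    def is_total(self) -> bool:
        """Every state has at least one outgoing transition."""
        sources = {s for s, _ in self.transitions}
        return self.states <= sources


# Hypothetical two-state example: a light that toggles forever.
light = Kripke(
    states=frozenset({"off", "on"}),
    initial=frozenset({"off"}),
    transitions=frozenset({("off", "on"), ("on", "off")}),
    labels={"off": set(), "on": {"lit"}},
)
```

Removing the transition from "on" back to "off" would leave "on" without a successor, so `is_total()` would then return `False`.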

A simple example: from code to Kripke structure

    l0  int x = y = 0;
    l1  while (true)
    l2      x = (x+1) mod 3, y = x+1;

- State $s \in S$ is a valuation of $V = \{x, y\}$; note that I left out the program counter to keep the example simple.
- Call the domain of both variables $D$. Then, $S \subseteq D \times D$. In $s_0$, $x = 0$, $y = 0$.
- $AP$ expresses variable valuations.
- A transition $\alpha \in T$ encodes an atomic assignment. l2 is atomic here.

Match the transitions to statements in the original program!

[Figure: Kripke structure with states s0 {x=0,y=0}, s1 {x=1,y=2}, s2 {x=2,y=3}, s3 {x=0,y=1} and transitions α0: s0 → s1, α1: s1 → s2, α2: s2 → s3, α3: s3 → s1.]
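The states of this example can also be found mechanically, by starting from the initial valuation and applying statement l2 until no new state appears. This is a minimal sketch in Python under the same simplification as the slide (no program counter, l2 atomic); the function names `step` and `explore` are hypothetical.

```python
from collections import deque


def step(state):
    """Statement l2, taken as one atomic step: x = (x+1) mod 3, y = x+1."""
    x, _ = state
    x = (x + 1) % 3
    return (x, x + 1)


def explore(initial):
    """Collect all states and transitions reachable from the initial state."""
    states, transitions = [initial], []
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        t = step(s)
        transitions.append((s, t))
        if t not in states:       # a new state: remember it and explore it later
            states.append(t)
            queue.append(t)
    return states, transitions


states, transitions = explore((0, 0))
for s, t in transitions:
    print(f"s{states.index(s)} {s} -> s{states.index(t)} {t}")
```

States are numbered in the order in which they are discovered, which here coincides with the numbering s0 to s3 used in the figure.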

Some intuition behind this formalism

- $S$ must be finite. This is not an artificial limitation of the model: in all information systems, $D$ may be large (think 64-bit data types, in which case the size of $D$ is $2^{64}$), but is finite. (Digital systems never allow variables to have the full set of values from $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{R}$.)
- For a correct model, transitions must represent atomic system statements. Deciding on atomicity takes knowledge of the language, compiler, scheduler:
  - Basic assignments in high-level languages may be atomic. However, machine-language translations may have finer grain.
  - atomic {code} constructs, if available, make code atomic.
  - Disabling interrupts in sequential low-level code makes the code atomic.
- $T$ having to be total is an artificial (but harmless) feature of the model.

...and some intuition behind model checking

[Figure: the Kripke structure of the simple example, with states s0 {x=0,y=0}, s1 {x=1,y=2}, s2 {x=2,y=3}, s3 {x=0,y=1} and transitions α0 to α3.]

In this simple example: suppose you want to know whether
- y eventually becomes equal to 3
- x and y can ever be equal to 1 at the same time
- y is always greater than or equal to x
- the program terminates
- the program can hit a deadlock
- the program has an overflow

You can answer all these by searching through the model. This is what an automated model checker also does.
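Searching through the model can be sketched in a few lines of Python. The transition table `T` below is the four-state example written out by hand, and all variable names are hypothetical. Because this model has a single path, finding a reachable state with y = 3 is enough to answer the "eventually" question; in a branching model that question needs a look at paths, not only at states.

```python
# States (x, y) and their successors, written out by hand from the example.
T = {(0, 0): [(1, 2)], (1, 2): [(2, 3)], (2, 3): [(0, 1)], (0, 1): [(1, 2)]}
S0 = (0, 0)


def reachable(start):
    """Depth-first search over the transition table."""
    seen, stack = set(), [start]
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(T[s])
    return seen


R = reachable(S0)
y_reaches_3 = any(y == 3 for _, y in R)               # some reachable state has y = 3
both_equal_1 = any(x == 1 and y == 1 for x, y in R)   # x = 1 and y = 1 together
y_always_ge_x = all(y >= x for x, y in R)             # invariant over all states
has_deadlock = any(not T[s] for s in R)               # a state without successor
max_y = max(y for _, y in R)                          # compare with the type's range
```

Each answer is a single pass over the reachable states, which is why the number of states dominates the cost of checking state properties.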

An equivalent logic representation for Kripke structures

As per definition, the information is in state labels:

[Figure: the Kripke structure of the simple example, with states s0 {x=0,y=0}, s1 {x=1,y=2}, s2 {x=2,y=3}, s3 {x=0,y=1} and transitions α0 to α3.]

In this equivalent predicate-logic form, the same information is on transitions instead:
- A state is a conjunctive formula: $s_0 \equiv (x = 0) \wedge (y = 0)$.
- Transitions, also: $\alpha_i \equiv (x' = (x + 1) \bmod 3) \wedge (y' = x' + 1)$, $i = 0..3$. Primed variables represent next-state variables. Close to the program execution.
- A next state $s_1$ is the conjunction of the previous state $s_0$ and the transition $\alpha_0$: $s_1 = s_0 \wedge \alpha_0$.
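To see the predicate-logic form at work, the transition can be written as a Boolean function over current and next-state values, and the successors of a state found as the next-state valuations that satisfy it. This is only a sketch: the finite domain `D = range(4)` is an assumption made here so that the search is finite, and the names `alpha` and `successors` are hypothetical.

```python
from itertools import product

D = range(4)  # assumed finite domain for both x and y


def alpha(x, y, x_next, y_next):
    """Transition formula: x' = (x+1) mod 3  and  y' = x' + 1."""
    return x_next == (x + 1) % 3 and y_next == x_next + 1


def successors(x, y):
    """All next states (x', y') that satisfy s AND alpha for the state s = (x, y)."""
    return [(xn, yn) for xn, yn in product(D, D) if alpha(x, y, xn, yn)]
```

Note that `alpha` never mentions the current value of y, so the states (0, 0) and (0, 1) have the same successor, exactly as in the figure.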

Paths in Kripke structures

A path $\pi$ from state $s$ in a Kripke structure is an infinite sequence of states $\pi = s_0 s_1 s_2 \ldots$ so that $s_0 = s$ and $(s_i, s_{i+1}) \in T$, $\forall i \geq 0$. $\pi^i$ is the suffix of $\pi$ starting at $s_i$.

Clearly, infinite paths exist in finite-state machines.

Later, we see temporal properties, e.g., "only after x becomes 1, y can become greater than x". For these, we will examine paths.

You can now imagine that the complexity of model checking increases with:
- the number of states in the model
- how expressive the property is

A synchronous circuit

These are called sequential circuits: the output depends on current and past inputs. They include memory (flip-flops): Q becomes equal to D only after a clock pulse. This is a fragment of CPU design.

- Three boolean variables, $D = \{0, 1\}$, $S = D \times D \times D$; say all start equal to 0.
- At each transition, the bits change, e.g., $b_0' \Leftrightarrow \neg b_0$, synchronously;
- The single transition is the conjunction of these changes:
  $(b_0' \Leftrightarrow \neg b_0) \wedge (b_1' \Leftrightarrow b_0 \oplus b_1) \wedge (b_2' \Leftrightarrow (b_0 \wedge b_1) \oplus b_2)$

[Figure: modulo-8 counter built from three clocked D flip-flops (input D, output Q, clock clk) holding the bits b2, b1, b0.]
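The conjunction can be checked by simulation. The sketch below, in Python with hypothetical names `sync_step` and `value`, updates all three bits in one step and records the number they encode, with b0 as the least significant bit.

```python
def sync_step(b0, b1, b2):
    """All three bits change at once: the conjunction of the three updates."""
    return (not b0, b0 != b1, (b0 and b1) != b2)   # != on booleans is XOR


def value(b0, b1, b2):
    """The number encoded by the bits, b0 being the least significant."""
    return int(b0) + 2 * int(b1) + 4 * int(b2)


state = (False, False, False)   # all bits start equal to 0
counts = []
for _ in range(9):
    counts.append(value(*state))
    state = sync_step(*state)
```

After nine steps `counts` holds the values visited from the all-zero state, which lets you confirm the modulo-8 behaviour by inspection.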

An asynchronous circuit

These are called combinatorial circuits: the output depends solely on the current input (no memory, no notion of time, no means of synchronization). They are constructed with logic gates.

- Three boolean inputs, $V = \{b_0, b_1, b_2\}$, $D = \{0, 1\}$, $S \subseteq D \times D \times D$;
- At each transition, the bits change independently by $b_0' \Leftrightarrow \neg b_0$, etc.; these changes may interleave;
- The transition is a disjunction of these changes:
  $(b_0' \Leftrightarrow \neg b_0) \vee (b_1' \Leftrightarrow b_0 \oplus b_1) \vee (b_2' \Leftrightarrow (b_0 \wedge b_1) \oplus b_2)$

[Figure: incrementing circuit over the bits b2, b1, b0.]
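In the asynchronous case a state can have several successors, one per disjunct. The sketch below is in Python with a hypothetical function name, and it makes one assumption that the formula on the slide does not spell out: when one bit is updated, the other two keep their value, which is the usual reading of interleaving.

```python
def async_successors(b0, b1, b2):
    """One successor per disjunct: exactly one bit is updated per step.

    Assumption: the bits that are not updated keep their current value.
    """
    return {
        (not b0, b1, b2),                 # b0' <=> not b0
        (b0, b0 != b1, b2),               # b1' <=> b0 xor b1
        (b0, b1, (b0 and b1) != b2),      # b2' <=> (b0 and b1) xor b2
    }
```

For example, `async_successors(True, True, False)` yields three distinct next states, whereas the synchronous counter has exactly one successor for the same state.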

Sequential software

You can automatically translate a program into a transition system.

    x = y = 0; z = *;   /* z is not initialized */
    while (true)
        x = (x+1) mod 3, y = x+1;

One method: Generate the control-flow graph (CFG) from program source. Then, start with the initial state, and add a transition (and a new state) for each program statement.

[Figure: the CFG of the program, and the resulting Kripke structure with states (x, y, z) = (0,0,*), (1,2,*), (2,3,*), (0,1,*).]

Sequential software

A formal method: Start by giving the entry point to each basic program statement a label l; the program counter will get these values:

        /* start with x = y = 0, z uninitialized */
    l1  while (true)
    l2      x = (x+1) mod 3, y = x+1;
    l3

Then, we define a top-down translation $Tr(l_1\ P\ l_3)$ for the program $P$ in predicate-logic form:
- while loop: $Tr(l_1\ \texttt{while(true)}\ \{l_2\ P_2\}\ l_3)$ is the disjunction
  $(pc = l_1) \wedge (pc' = l_2) \wedge same(x, y, z) \ \vee\ Tr(l_2\ P_2\ l_1)$
- assignment: $Tr(l_2\ \texttt{x = (x+1) mod 3, y = x+1;}\ l_1)$ is the conjunction
  $(pc = l_2) \wedge (x' = (x + 1) \bmod 3) \wedge (y' = x' + 1) \wedge same(z) \wedge (pc' = l_1)$

Generalizing Tr(S)

$Tr(l\ \texttt{x=y;}\ l') \equiv (pc = l) \wedge (x' = y) \wedge (pc' = l') \wedge same(V \setminus \{x\})$

$Tr(l\ \texttt{if(b)}\ \{l_1\ P_1\}\ \texttt{else}\ \{l_2\ P_2\}\ l') \equiv$
$\quad (pc = l) \wedge (b) \wedge (pc' = l_1) \wedge same(V)$
$\quad \vee\ (pc = l) \wedge (\neg b) \wedge (pc' = l_2) \wedge same(V)$
$\quad \vee\ Tr(l_1\ P_1\ l') \vee Tr(l_2\ P_2\ l')$

$Tr(l\ \texttt{while(b)}\ \{l_1\ P_1\}\ l') \equiv$
$\quad (pc = l) \wedge (b) \wedge (pc' = l_1) \wedge same(V)$
$\quad \vee\ (pc = l) \wedge (\neg b) \wedge (pc' = l') \wedge same(V)$
$\quad \vee\ Tr(l_1\ P_1\ l)$

$Tr(l_1\ P_1\ l_2\ P_2\ l') \equiv Tr(l_1\ P_1\ l_2) \vee Tr(l_2\ P_2\ l')$

This is what a software model checker will do to acquire a formal model.
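The recursive shape of Tr can be sketched as a function that walks a program and returns one guarded edge per disjunct. Everything below is a hypothetical illustration in Python, not the way any particular model checker represents programs: the tuple-based mini-AST, the function names, and the representation of a disjunct as a `(pc, guard, update, pc')` tuple are assumptions, and the `if` case is left out for brevity.

```python
# Hypothetical mini-AST:
#   ("assign", update)                 an atomic assignment
#   ("while", cond, (l_body, body))    a loop whose body starts at label l_body
#   ("seq", p1, l_mid, p2)             p1 followed by p2, with label l_mid between
def tr(l, prog, l_next):
    """Return the edges (pc, guard, update, pc') of prog between labels l and l_next."""
    same = lambda env: dict(env)                 # same(V): no variable changes
    kind = prog[0]
    if kind == "assign":
        return [(l, lambda env: True, prog[1], l_next)]
    if kind == "while":
        _, cond, (l_body, body) = prog
        return ([(l, cond, same, l_body),                          # enter the body
                 (l, lambda env: not cond(env), same, l_next)]     # leave the loop
                + tr(l_body, body, l))                             # body returns to l
    if kind == "seq":
        _, p1, l_mid, p2 = prog
        return tr(l, p1, l_mid) + tr(l_mid, p2, l_next)
    raise ValueError(kind)


def body_update(env):
    """x = (x+1) mod 3, y = x+1, with every other variable unchanged."""
    x = (env["x"] + 1) % 3
    return {**env, "x": x, "y": x + 1}


program = ("while", lambda env: True, ("l2", ("assign", body_update)))
edges = tr("l1", program, "l3")


def successors(pc, env):
    """Next (pc', valuation) pairs: every edge from pc whose guard holds."""
    return [(pc2, upd(env)) for pc1, guard, upd, pc2 in edges
            if pc1 == pc and guard(env)]
```

Calling `successors("l1", {"x": 0, "y": 0})` follows the loop-entry edge, and `successors("l2", ...)` applies the assignment; the edge to l3 exists in `edges` but its guard never holds for `while (true)`.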

Concurrent software

Concurrency simply means interleaving the transitions of the sequential programs (as for asynchronous circuits). Processes share variables.

Take the concurrent program $P \equiv l\ (P_1 \parallel P_2 \parallel \ldots \parallel P_n)\ l'$, where:
- The program counters $pc_i$ form a set $PC$.
- The sets of variables each process can access are $V_1, V_2, \ldots, V_n$; their union is $V$.
- Add start/end labels $l_i$ and $l_i'$ for each process $P_i$.

Essentially, $Tr(l\ P\ l')$ is:

$\bigvee_{i=1}^{n} \left( Tr(l_i\ P_i\ l_i') \wedge same(V \setminus V_i) \wedge same(PC \setminus \{pc_i\}) \right)$

(excluding a first and last formula starting and ending all processes)

A mutual exclusion example (slide 18)

Have $P_0 \parallel P_1$ symmetric processes, where $P_0$:

    l0   while (true)
    nc0      wait_until(turn=0);   /* busy waiting */
    cr0      turn = 1;
    l0'

and symmetric for $P_1$ (any 0 becomes 1 and vice versa). Consider both initial states.

$P_0$ is represented as predicate logic, then as Kripke structure.

$Tr(l_0\ P_0\ l_0') \equiv$
$\quad (pc_0 = l_0) \wedge (pc_0' = nc_0) \wedge same(turn)$
$\quad \vee\ (pc_0 = nc_0) \wedge (turn = 0) \wedge (pc_0' = cr_0) \wedge same(turn)$
$\quad \vee\ (pc_0 = nc_0) \wedge (turn \neq 0) \wedge (pc_0' = nc_0) \wedge same(turn)$
$\quad \vee\ (pc_0 = cr_0) \wedge (turn' = 1) \wedge (pc_0' = l_0)$

[Figure: Kripke structure of P0, with states (pc0=l0, turn=0), (pc0=nc0, turn=0), (pc0=cr0, turn=0), (pc0=l0, turn=1), (pc0=nc0, turn=1).]

...then, $P_0 \parallel P_1$ is (slide 19):

[Figure: Kripke structure of the parallel composition, with twelve states (pc0, pc1, turn): (l0, l1, 0), (l0, l1, 1), (l0, nc1, 0), (nc0, l1, 0), (l0, nc1, 1), (nc0, l1, 1), (nc0, nc1, 0), (cr0, l1, 0), (l0, cr1, 1), (nc0, nc1, 1), (cr0, nc1, 0), (nc0, cr1, 1).]

Here you see why modelling program counters is useful. Check visually:
- Does this program ensure mutual exclusion?
- Can a process be denied forever entrance to its critical region? Why?

If you find it hard to answer the Why?, try labelling all your graph's transitions, so that you can match transitions to the original code.
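The composed state space can also be generated rather than drawn. The sketch below, in Python with hypothetical function names, encodes the moves of one process, interleaves the two processes, and collects the states reachable from both initial states. It computes `mutual_exclusion` so that you can compare it with your visual check; the second question is about paths and is left to you.

```python
def process_moves(i, pc, turn):
    """Moves of process P_i from local location pc: a list of (pc', turn')."""
    if pc == "l":
        return [("nc", turn)]
    if pc == "nc":                      # wait_until(turn == i), busy waiting
        return [("cr", turn)] if turn == i else [("nc", turn)]
    if pc == "cr":                      # turn = 1 - i, then back to the loop head
        return [("l", 1 - i)]
    raise ValueError(pc)


def successors(state):
    """Interleaving: exactly one process moves; the other program counter stays."""
    pc0, pc1, turn = state
    moves = [(p, pc1, t) for p, t in process_moves(0, pc0, turn)]
    moves += [(pc0, p, t) for p, t in process_moves(1, pc1, turn)]
    return moves


def reachable(initial_states):
    """Depth-first search from all initial states."""
    seen, stack = set(), list(initial_states)
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(successors(s))
    return seen


states = reachable([("l", "l", 0), ("l", "l", 1)])
mutual_exclusion = all(not (pc0 == "cr" and pc1 == "cr") for pc0, pc1, _ in states)
```

Printing `sorted(states)` gives a list you can match one by one against the states in the figure.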

Communication with channels

A communication protocol is nothing but a concurrent program. The model includes:
- Channels c to send on and receive from; these are modelled like shared variables;
- Sending constant m on c is c!m; this is like an assignment c = m;
- Receiving variable x on c is c?x; this is also like an assignment x = c;
- This send/receive can be synchronous or asynchronous (i.e. buffered).

Modelling with Promela

High-level language; features from the Communicating Sequential Processes (CSP, C. A. R. Hoare) logic, and C. You have:
- processes and typed data variables, including inter-process communication channels with sophisticated operations;
- control-flow similar to that of C;
- a form of macro called inline (no functions);
- channels may model either asynchronous or synchronous communication, and may include a buffer;
- a great way to model busy-waiting on a condition: (a == b) is a statement blocking the process until a equals b;
- ...and many quirky features.

Historical notes

The term "reactive systems" was coined in "On the Development of Reactive Systems", David Harel and Amir Pnueli, in NATO Advanced Study Institute (ASI) Series, Vol. F13, Logics and Models of Concurrent Systems, Springer-Verlag 1985. The term "black cactus", also.

Unfolding programs into transition systems has been used for verification of programs since the 70s, e.g. "Formal verification of Parallel Programs", R. M. Keller, in Comm. ACM, 19(7), 1976.

Bibliographical notes

- The content in this lecture covers Ch. 2, "Modeling Systems", from "Model Checking", E. M. Clarke, O. Grumberg, D. A. Peled.
- To get to grips with Spin, go for Ch. 2 "Building Verification Models", Ch. 3 "An Overview of Promela", Ch. 18 "Overview of Spin Options", Ch. 19 "Overview of Pan Options", from "The Spin Model Checker", G. J. Holzmann. Alternatives: the Spin manual spinroot.com/spin/man/manual.html, ../index.html, ../roadmap.html.
- You may also read on the topic from Ch. 2, "Modelling Concurrent Systems", from "Principles of Model Checking", C. Baier, J.-P. Katoen.

Assignment

- (40%) Do the exercises at http://spinroot.com/spin/man/exercises.html from [1.a] to [1.f]. Dig through the manual pages http://spinroot.com/spin/man/index.html for info.
- (10%) Go back to the mutual exclusion example on slides 18-19. Answer the Why? question on slide 19.
- (50%)
  (a) Model Peterson's mutual exclusion algorithm (with two processes) as a Kripke structure; you find the algorithm on Wikipedia. Compact the algorithm as much as possible before you start, and make the threads loop infinitely. Then, visually check on the model the Wikipedia statements under Mutual exclusion (P0 and P1 can never be in the critical section at the same time) and Bounded waiting.
  (b) Finally, fire up SPIN, find an implementation of this (peterson.pml in SPIN's sources already contains an assertion for mutual exclusion), and check that with an exhaustive search. Decode the output and compare the size of this state space with the one above. Then, prove whether mutual exclusion is still ensured without the second wait condition (on turn), and simulate the error trace to see why it fails.

See http://www.pst.ifi.lmu.de/~hammer/statespaces/peterson/ for some graphical entertainment. Yes, you can model Peterson with two processes in 30 states (even fewer!).

Assignment administration

- Handing in: in the lab session or electronically by email; either way, before the end of Wed next week (Nov 27).
- Grading: this is worth 15% of your Practical grade (and 9% of the final grade).