Games for Controller Synthesis


Games for Controller Synthesis. Laurent Doyen, EPFL. MoVeP'08.

The Synthesis Question
Given a plant P and a specification φ, is there a controller C such that the closed-loop system C ∥ P satisfies φ?
Example: the plant consists of a tank, a thermometer and a gas burner; the specification is to maintain the temperature in the range [T_min, T_max]; the unknown is a digital controller.

Synthesis as a game
Given a plant P and a specification φ, is there a controller C such that the closed-loop system C ∥ P satisfies φ? If a controller C exists, then construct such a controller.
[Figure: the controller chooses the controllable actions (inputs a?), the plant chooses the uncontrollable actions (outputs u!); the specification φ relates them, e.g. φ1(a1, u1), φ2(a2, u2).]
Plant: 2-player game arena, Input (Player 1, System, Controller) vs. Output (Player 2, Environment, Plant).
Specification: game objective for Player 1.
Controller: winning strategy for Player 1.
We are often interested in simple controllers: finite-state, or even stateless (memoryless). We are also often interested in least restrictive controllers.

Example
Objective: avoid Bad.
[Figure: game graph of the example; controllable actions on?, off?, delay?; uncontrollable actions hot!, cold!.]
Winning strategy = Controller. [Figure: the same graph restricted to the moves selected by the winning strategy.]

Games for Synthesis
Several types of games:
- Turn-based vs. concurrent
- Perfect-information vs. partial-information
- Sure vs. almost-sure winning
- Objective: graph labelling vs. monitor
- Timed vs. untimed
- Stochastic vs. deterministic
- etc.
This tutorial: games played on graphs, 2 players, turn-based, ω-regular objectives.

Outline
- Part #1: perfect information
- Part #2: partial information

Two-player game structures

Rounded states belong to Player 1; square states belong to Player 2.

Playing the game: the players move a token along the edges of the graph. The token is initially in v0. In rounded states, Player 1 chooses the next state. In square states, Player 2 chooses the next state.

Play: v0 v1 v3 v0 v2 ...

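Two-player game graphs
A 2-player game graph $G$ consists of: a set $V$ of states, partitioned into the states $V_1$ of Player 1 and the states $V_2$ of Player 2; a set of edges $E \subseteq V \times V$; and an initial state $v_0 \in V$.
A play in $G$ is an infinite sequence $v_0 v_1 v_2 \dots$ of states, starting in the initial state, such that $(v_i, v_{i+1}) \in E$ for all $i \geq 0$.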
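The definitions of game graph and play can be sketched in Python as follows. This is only an illustration under assumptions: the class name `GameGraph`, the helper `is_play_prefix`, the ownership of the states and the edge set are hypothetical (the slide's figure is not part of the transcript); only the play prefix v0 v1 v3 v0 v2 is taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameGraph:
    player1: frozenset  # states where Player 1 moves (rounded states)
    player2: frozenset  # states where Player 2 moves (square states)
    edges: frozenset    # set of (source, target) pairs
    initial: str        # initial position of the token

    def successors(self, v):
        """States reachable from v in one move."""
        return {t for (s, t) in self.edges if s == v}


def is_play_prefix(g, states):
    """Check that `states` starts in the initial state and follows edges."""
    if not states or states[0] != g.initial:
        return False
    return all((a, b) in g.edges for a, b in zip(states, states[1:]))


# Hypothetical arena: ownership and edges are illustrative assumptions.
g = GameGraph(
    player1=frozenset({"v0", "v3"}),
    player2=frozenset({"v1", "v2"}),
    edges=frozenset({("v0", "v1"), ("v0", "v2"), ("v1", "v3"),
                     ("v3", "v0"), ("v2", "v2")}),
    initial="v0",
)
print(is_play_prefix(g, ["v0", "v1", "v3", "v0", "v2"]))
```

A play is infinite, so the sketch can only validate finite prefixes of plays.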

Who is winning?
Play: v0 v1 v3 v0 v2 ...
A winning condition for Player k is a set $W_k$ of plays. A 2-player game is zero-sum if the two winning conditions are complementary, i.e. every play belongs to exactly one of $W_1$ and $W_2$.

Winning condition
Typical winning conditions, each defined by a set of states:
- Reachability: the plays that visit the set at least once.
- Safety: the plays that never leave the set.
- Büchi: the plays that visit the set infinitely often.
- coBüchi: the plays that eventually stay in the set forever.
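The four objectives can be checked mechanically on ultimately periodic plays. The sketch below assumes a play is given as a finite `prefix` followed by a `cycle` repeated forever; this "lasso" representation and the function names are assumptions made for illustration, and general plays need not have this shape.

```python
def reachability(prefix, cycle, target):
    """The play visits `target` at least once."""
    return any(v in target for v in prefix + cycle)


def safety(prefix, cycle, safe):
    """The play never leaves `safe`."""
    return all(v in safe for v in prefix + cycle)


def buchi(prefix, cycle, target):
    """The play visits `target` infinitely often.

    Exactly the states on the cycle occur infinitely often.
    """
    return any(v in target for v in cycle)


def cobuchi(prefix, cycle, target):
    """The play eventually stays in `target` forever."""
    return all(v in target for v in cycle)


# Hypothetical play: v0 v1 v3 followed by (v0 v2) repeated forever.
prefix, cycle = ["v0", "v1", "v3"], ["v0", "v2"]
print(reachability(prefix, cycle, {"v3"}))
print(buchi(prefix, cycle, {"v3"}))
```

On this play the reachability objective for {v3} holds while the Büchi objective for {v3} does not, since v3 occurs only in the prefix.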

Remark
A winning condition for Player k is a set of plays.
[Figure: game graph whose states carry priorities p = 0, ..., 4.]

Strategies
Players use strategies to play the game, i.e. to choose the successor of the current state. A strategy for Player k is a function that maps every finite prefix of a play ending in a state of Player k to a successor of that state.
Graph: nondeterministic generator of behaviors. Strategy: deterministic selector of behavior. Graph + strategies for both players → behavior (the outcome).

Winning strategies
Given a game $G$ and winning conditions $W_1$ and $W_2$, a strategy $\lambda_k$ is winning for Player k in $(G, W_k)$ if for all strategies $\lambda_{3-k}$ of Player 3-k, the outcome of $\{\lambda_k, \lambda_{3-k}\}$ in $G$ is a play in $W_k$. Player k is winning in $(G, W_k)$ if Player k has a winning strategy.
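To make the notion of outcome concrete, here is a small sketch restricted to memoryless strategies, represented as dictionaries from states to successors. The restriction to memoryless strategies, the function name `outcome_prefix` and the example strategies are assumptions for illustration; general strategies may depend on the whole history.

```python
def outcome_prefix(player1, initial, strategy1, strategy2, length):
    """First `length` states of the play produced by two memoryless strategies."""
    play = [initial]
    while len(play) < length:
        v = play[-1]
        # The owner of the current state picks the next state.
        strategy = strategy1 if v in player1 else strategy2
        play.append(strategy[v])
    return play


# Hypothetical arena: Player 1 owns v0 and v3, Player 2 owns v1 and v2.
player1 = {"v0", "v3"}
strategy1 = {"v0": "v1", "v3": "v0"}
strategy2 = {"v1": "v3", "v2": "v2"}
print(outcome_prefix(player1, "v0", strategy1, strategy2, 5))
```

Once both strategies are fixed the play is unique, which is the sense in which strategies are deterministic selectors of behavior.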

Winning strategies = Controllers that enforce winning plays

Symbolic algorithms to solve games

Controllable predecessors
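The formulas of this slide did not survive in the transcript. The sketch below assumes the usual definition of the controllable predecessor operator: a state is a controllable predecessor of a set X if Player 1 owns it and some successor is in X, or Player 2 owns it and all successors are in X. The name `cpre`, the arena and the assumption that every state has at least one successor are illustrative choices.

```python
def successors(edges, v):
    """States reachable from v in one move."""
    return {t for (s, t) in edges if s == v}


def cpre(player1, player2, edges, X):
    """States from which Player 1 can force the next state to be in X.

    Assumes every state has at least one successor.
    """
    can_choose = {v for v in player1 if successors(edges, v) & X}
    cannot_escape = {v for v in player2 if successors(edges, v) <= X}
    return can_choose | cannot_escape


# Hypothetical arena: Player 1 owns v0, v2, v3 and Player 2 owns v1.
PLAYER1 = {"v0", "v2", "v3"}
PLAYER2 = {"v1"}
EDGES = {("v0", "v1"), ("v0", "v2"), ("v1", "v3"),
         ("v1", "v0"), ("v2", "v2"), ("v3", "v3")}

print(sorted(cpre(PLAYER1, PLAYER2, EDGES, {"v2"})))
```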


Symbolic algorithm to solve safety games

Solving safety games
To win a safety game, Player 1 should be able to force the game to be in the set of safe states at every step.
The algorithm computes, for n = 0, 1, 2, ..., the set of states in which Player 1 can force the game to stay in the safe set for the next n steps. Each set is obtained from the previous one by intersecting the safe set with the controllable predecessors of the previous set, so the sequence is decreasing.
Its limit is the set of states from which Player 1 can confine the game in the safe set forever, no matter how Player 2 behaves.
This limit is a solution of the corresponding set equation, and it is the greatest solution. We say that it is the greatest fixpoint of the underlying function, written with the greatest fixpoint operator $\nu$.
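The iteration can be sketched as follows, assuming the usual formulation: with safe set T, the winning set is the greatest fixpoint of the function X ↦ T ∩ CPre(X). The symbols T and CPre, the function names and the arena are assumptions, since the slide's formulas are not in the transcript.

```python
def successors(edges, v):
    return {t for (s, t) in edges if s == v}


def cpre(player1, player2, edges, X):
    """States from which Player 1 can force the next state to be in X."""
    return ({v for v in player1 if successors(edges, v) & X}
            | {v for v in player2 if successors(edges, v) <= X})


def solve_safety(player1, player2, edges, safe):
    """Greatest fixpoint of X -> safe ∩ CPre(X), computed by iteration."""
    win = set(safe)                      # 0 steps: all safe states
    while True:
        nxt = safe & cpre(player1, player2, edges, win)
        if nxt == win:                   # no state was removed: fixpoint
            return win
        win = nxt


# Hypothetical arena: Player 1 owns v0, v2, v3; Player 2 owns v1; avoid v3.
PLAYER1 = {"v0", "v2", "v3"}
PLAYER2 = {"v1"}
EDGES = {("v0", "v1"), ("v0", "v2"), ("v1", "v3"),
         ("v1", "v0"), ("v2", "v2"), ("v3", "v3")}

print(sorted(solve_safety(PLAYER1, PLAYER2, EDGES, {"v0", "v1", "v2"})))
```

In this arena v1 is removed in the first iteration because Player 2 can move from v1 to v3, and the iteration then stabilizes on {v0, v2}.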

On fixpoint computations

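Partial order
A partially ordered set $(S, \sqsubseteq)$ is a set $S$ equipped with a partial order $\sqsubseteq$, i.e. a relation such that:
- $x \sqsubseteq x$ (reflexivity),
- $x \sqsubseteq y$ and $y \sqsubseteq x$ imply $x = y$ (anti-symmetry),
- $x \sqsubseteq y$ and $y \sqsubseteq z$ imply $x \sqsubseteq z$ (transitivity).
$\sqsubseteq$ is not necessarily total, i.e. there can be $x, y \in S$ such that $x \not\sqsubseteq y$ and $y \not\sqsubseteq x$.

Partial order
Let $X \subseteq S$. $s \in S$ is an upper bound of $X$ if $x \sqsubseteq s$ for all $x \in X$. $s$ is a least upper bound (lub) of $X$ if (1) $s$ is an upper bound of $X$, and (2) $s \sqsubseteq s'$ for all upper bounds $s'$ of $X$.
Note: if $X$ has a least upper bound, then it is unique (by anti-symmetry), and we write $\bigsqcup X$.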


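Partial order
A set $X \subseteq S$ is a chain if for all $x, y \in X$, $x \sqsubseteq y$ or $y \sqsubseteq x$. The partially ordered set $(S, \sqsubseteq)$ is complete if (1) $\emptyset$ has a lub, written $\bot$, and (2) every chain $X \subseteq S$ has a lub.

Fixpoints
Let $f : S \to S$ be a function.
$f$ is monotonic if $x \sqsubseteq y$ implies $f(x) \sqsubseteq f(y)$.
$f$ is continuous if (1) $f$ is monotonic, and (2) $f(\bigsqcup X) = \bigsqcup f(X)$ for every chain $X$, where $f(X) = \{ f(x) \mid x \in X \}$. Note: $f(X)$ is a chain by monotonicity, and therefore $\bigsqcup f(X)$ exists.
$x$ is a fixpoint of $f$ if $f(x) = x$. $x$ is a least fixpoint of $f$ if (1) $x$ is a fixpoint of $f$, and (2) $x \sqsubseteq x'$ for all fixpoints $x'$ of $f$.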

Kleene-Tarski Theorem
Let $(S, \sqsubseteq)$ be a partially ordered set. If $(S, \sqsubseteq)$ is a complete partial order, and $f : S \to S$ is a continuous function, then $f$ has a least fixpoint, denoted $\mathrm{lfp}(f)$, and $\mathrm{lfp}(f) = \bigsqcup \{ f^i(\bot) \mid i \geq 0 \}$.
Proof: exercise.
Over finite sets $S$, all monotonic functions are continuous.
The greatest fixpoint of $f$ can be defined dually by $\mathrm{gfp}(f) = \bigsqcap \{ f^i(\top) \mid i \geq 0 \}$, where $\bigsqcap$ is the greatest lower bound operator (dual of $\bigsqcup$) and $\top$ is the greatest element of $S$ (dual of $\bot$).
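On a finite lattice of sets ordered by inclusion, the theorem yields a simple algorithm: iterate the function from the least (resp. greatest) element until nothing changes. The sketch below is a minimal illustration; the names `lfp` and `gfp` and the example function `f` are assumptions, not taken from the slides.

```python
def lfp(f, bottom):
    """Least fixpoint of a monotonic f on a finite lattice, from `bottom`."""
    x = bottom
    while True:
        y = f(x)
        if y == x:
            return x
        x = y


def gfp(f, top):
    """Greatest fixpoint of a monotonic f on a finite lattice, from `top`."""
    x = top
    while True:
        y = f(x)
        if y == x:
            return x
        x = y


# Example on subsets of {0, ..., 4} ordered by inclusion:
# f keeps the even elements of X and always adds 0 (f is monotonic).
def f(X):
    return frozenset({0}) | frozenset(x for x in X if x % 2 == 0)


print(sorted(lfp(f, frozenset())))
print(sorted(gfp(f, frozenset(range(5)))))
```

The two loops are identical; only the starting point differs, which mirrors the duality between least and greatest fixpoints. Termination relies on the lattice being finite and on `f` being monotonic.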

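Safety game
Winning states of a safety game: the greatest fixpoint of the function that maps a set of states to the safe states among its controllable predecessors.
Limit of the iterations: the intersection of the decreasing sequence of sets obtained by applying this function repeatedly, starting from the set of all states.
Partial order: the subsets of the state space ordered by inclusion, with union as least upper bound, intersection as greatest lower bound, the empty set as least element and the set of all states as greatest element.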

Symbolic algorithm to solve reachability games

Solving reachability games
To win a reachability game, Player 1 should be able to force the game to be in the target set after finitely many steps.
Let $R_i$ be the set of states from which Player 1 can force the game to be in the target set within at most $i$ steps: $R_0$ is the target set, and $R_{i+1}$ adds to $R_i$ its controllable predecessors.
The limit of this iteration is the least fixpoint of the underlying function, written with the least fixpoint operator $\mu$.
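A sketch of this iteration, assuming the usual formulation as the least fixpoint of X ↦ T ∪ CPre(X) for a target set T. The function names, the arena and the choice to return the step index of each winning state are assumptions made for illustration.

```python
def successors(edges, v):
    return {t for (s, t) in edges if s == v}


def cpre(player1, player2, edges, X):
    """States from which Player 1 can force the next state to be in X."""
    return ({v for v in player1 if successors(edges, v) & X}
            | {v for v in player2 if successors(edges, v) <= X})


def solve_reachability(player1, player2, edges, target):
    """Map each winning state to the least i such that it belongs to R_i."""
    rank = {v: 0 for v in target}        # R_0 is the target set
    step = 0
    while True:
        step += 1
        reached = set(rank)
        new = cpre(player1, player2, edges, reached) - reached
        if not new:                      # R_{i+1} = R_i: least fixpoint
            return rank
        for v in new:
            rank[v] = step


# Hypothetical arena: Player 1 owns v0, v2, v3; Player 2 owns v1.
PLAYER1 = {"v0", "v2", "v3"}
PLAYER2 = {"v1"}
EDGES = {("v0", "v1"), ("v0", "v2"), ("v1", "v3"),
         ("v1", "v0"), ("v2", "v2"), ("v3", "v3")}

print(solve_reachability(PLAYER1, PLAYER2, EDGES, {"v2"}))
```

The ranks give a memoryless winning strategy for free: in a Player 1 state of rank i > 0, move to any successor of smaller rank.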

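Symbolic algorithms
Let $G$ be a 2-player game graph.
Theorem: Player 1 has a winning strategy for a safety (resp. reachability) objective if and only if the initial state belongs to the corresponding greatest (resp. least) fixpoint.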

Remarks (I)
Memoryless strategies are always sufficient to win parity games, and therefore also to win games with safety, reachability, Büchi and coBüchi objectives.
[Figure: a memoryless winning strategy.]

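Remarks (II)
Parity games are determined: in every state, either Player 1 or Player 2 has a winning strategy. More generally, zero-sum games with Borel objectives are determined [Martin75].
For instance, since the complement of a safety objective is a reachability objective for the complementary set of states, Player 1 does not win the safety game iff Player 2 wins the reachability game.
Claim: the complement of the greatest fixpoint computed for Player 1's safety objective is the least fixpoint computed for Player 2's reachability objective. Proof: exercise. Hint: relate the controllable predecessors of Player 2 to those of Player 1 by complementation.
[Figure: states in which Player 1 wins for the safety objective, and states in which Player 2 wins for the complementary reachability objective.]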
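Determinacy can be checked on a small example by computing both winning regions and verifying that they partition the state space. The sketch assumes the usual controllable-predecessor formulation with the roles of the players swapped for Player 2; all names and the arena are hypothetical.

```python
def successors(edges, v):
    return {t for (s, t) in edges if s == v}


def cpre(mover, opponent, edges, X):
    """States from which the player owning `mover` can force the next state into X."""
    return ({v for v in mover if successors(edges, v) & X}
            | {v for v in opponent if successors(edges, v) <= X})


def safety_region(mover, opponent, edges, safe):
    """Greatest fixpoint of X -> safe ∩ CPre(X)."""
    win = set(safe)
    while True:
        nxt = safe & cpre(mover, opponent, edges, win)
        if nxt == win:
            return win
        win = nxt


def reach_region(mover, opponent, edges, target):
    """Least fixpoint of X -> target ∪ CPre(X)."""
    win = set(target)
    while True:
        nxt = target | cpre(mover, opponent, edges, win)
        if nxt == win:
            return win
        win = nxt


# Hypothetical arena: Player 1 owns v0, v2, v3; Player 2 owns v1.
PLAYER1 = {"v0", "v2", "v3"}
PLAYER2 = {"v1"}
EDGES = {("v0", "v1"), ("v0", "v2"), ("v1", "v3"),
         ("v1", "v0"), ("v2", "v2"), ("v3", "v3")}
STATES = PLAYER1 | PLAYER2
SAFE = {"v0", "v1", "v2"}

win1 = safety_region(PLAYER1, PLAYER2, EDGES, SAFE)          # Player 1 avoids v3
win2 = reach_region(PLAYER2, PLAYER1, EDGES, STATES - SAFE)  # Player 2 reaches v3
print(sorted(win1), sorted(win2))
```

Every state is winning for exactly one of the two players in this arena, as determinacy predicts.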

Games of imperfect information

The Synthesis Question
[Figure: game graph of the example, with actions on, off and delay.]
The controller knows the state of the plant ("perfect information"). This, however, is often unrealistic:
- sensors provide partial information (imprecision),
- sensors have internal delays,
- some variables of the plant are invisible, etc.

Imperfect information
Observations: Obs 0, Obs 1, Obs 2. [Figure: the states of the example graph grouped into the three observations.]
When observing Obs 2, there is no unique good choice: memory is necessary.

Playing the game
[Figure: the example graph; all states are Player 2 states, and the actions on, off and delay are nondeterministic.]
Player 2 moves a token along the edges of the graph; Player 1 does not see the position of the token. Player 1 chooses an action (on, off, delay), and then Player 2 resolves the nondeterminism and announces the color (observation) of the state.

Example of a play:
- Player 2 chooses v1 and announces Obs 0.
- Player 1 plays action delay.
- Player 2 chooses v3 and announces Obs 2.
- Player 1 plays action off.
- Player 2 chooses v2.
Player 2: v1 delay v3 off v2 ...
Player 1: delay off ...

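Imperfect information
A game graph + observation structure. [Figure: the example graph with edges labelled on, off and delay.]

Strategies
Player 1 chooses a letter in the action alphabet; Player 2 resolves nondeterminism. An observation-based strategy for Player 1 is a function from finite sequences of observations to letters. A strategy for Player 2 is a function that, given the history of the play and the letter chosen by Player 1, selects a successor state.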

Outcome

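Winning strategies
A winning condition for Player 1 is a set of sequences of observations. This set defines the set of winning plays: the plays whose sequence of observations belongs to it. Player 1 is winning if there is an observation-based strategy for Player 1 all of whose outcomes are winning plays.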

Solving games of imperfect information

Imperfect information
Games of imperfect information can be solved by a reduction to games of perfect information.
[Diagram: imperfect-information game (G, Obs) → subset construction → perfect-information game → classical techniques → winning region.]

Subset construction
After a finite prefix of a play, Player 1 has a partial knowledge of the current state of the game: a set of states, called a cell. The play starts from an initial cell; each time Player 1 plays a letter σ and Player 2 chooses a successor (v2 in the example), the current knowledge is updated to a new cell.

Subset construction
The perfect-information game is defined from the imperfect-information one by giving its state space, initial state, transitions and parity condition.
Theorem: Player 1 is winning in the imperfect-information game (G, Obs) with parity condition p if and only if Player 1 is winning in the perfect-information game obtained by the subset construction.
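The formal definition of the construction is not preserved in the transcript. The sketch below assumes the standard knowledge update: after Player 1 plays an action and Player 2 announces an observation, the new cell contains the successors of the current cell under that action that carry the announced observation. The function names, edges and observation map are hypothetical and do not reproduce the slide's example graph.

```python
def update(edges, obs, cell, action, observation):
    """New cell after playing `action` from `cell` and seeing `observation`."""
    return frozenset(t for (s, a, t) in edges
                     if s in cell and a == action and obs[t] == observation)


def cell_successors(edges, obs, cell, action):
    """All non-empty cells Player 2 can lead to when Player 1 plays `action`."""
    cells = {update(edges, obs, cell, action, o) for o in set(obs.values())}
    return {c for c in cells if c}


# Hypothetical labelled edges (source, action, target) and observations.
EDGES = {("v1", "delay", "v2"), ("v1", "delay", "v3"),
         ("v2", "off", "v1"), ("v3", "off", "v4"),
         ("v2", "on", "v4"), ("v3", "on", "v1")}
OBS = {"v1": "Obs 0", "v2": "Obs 2", "v3": "Obs 2", "v4": "Obs 1"}

cell = update(EDGES, OBS, frozenset({"v1"}), "delay", "Obs 2")
print(sorted(cell))
print(sorted(sorted(c) for c in cell_successors(EDGES, OBS, cell, "off")))
```

In this hypothetical graph the cell {v2, v3} shows why Player 1 cannot tell the two states apart: both carry the same observation, yet the same action leads to different observations from each.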

Imperfect information
[Diagram: imperfect-information game (G, Obs) → subset construction (exponential blow-up) → perfect-information game → classical techniques → winning region.]
[Diagram: the perfect-information game is kept implicit; a direct symbolic algorithm computes the winning region from (G, Obs).]

Symbolic algorithm
Controllable predecessor: maps a set of cells to a set of cells.
[Figure: cells within observations Obs 1 and Obs 2.]
The union of two controllable cells is not necessarily controllable, but if a cell s is controllable (i.e. winning for Player 1), then all sub-cells s' ⊆ s are controllable: copy the strategy from s.
The sets of cells computed by the fixpoint iterations are therefore downward-closed, and it is sufficient to keep only the maximal cells.

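Antichains
A downward-closed set of cells is represented by the antichain of its maximal cells, i.e. a set of cells that are pairwise incomparable for inclusion.
The controllable predecessor operator is monotone with respect to the following order on antichains: $q \sqsubseteq q'$ iff for every cell $s \in q$ there is a cell $s' \in q'$ with $s \subseteq s'$.
Least upper bound and greatest lower bound are defined by: $q \sqcup q' = \lceil \{ s \mid s \in q \text{ or } s \in q' \} \rceil$ and $q \sqcap q' = \lceil \{ s \cap s' \mid s \in q,\ s' \in q' \} \rceil$, where $\lceil \cdot \rceil$ keeps only the maximal cells.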
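The antichain operations can be sketched with Python frozensets standing for cells. The order and the two bounds below follow the usual definitions for antichains of maximal cells; the function names and the example antichains are assumptions for illustration.

```python
def maximal(cells):
    """Keep only the cells that are maximal for inclusion."""
    cells = set(cells)
    return {c for c in cells if not any(c < d for d in cells)}


def leq(q1, q2):
    """q1 ⊑ q2: every cell of q1 is included in some cell of q2."""
    return all(any(s <= t for t in q2) for s in q1)


def lub(q1, q2):
    """Least upper bound: maximal cells of the union."""
    return maximal(q1 | q2)


def glb(q1, q2):
    """Greatest lower bound: maximal pairwise intersections."""
    return maximal({s & t for s in q1 for t in q2})


# Hypothetical antichains over the states 1, 2, 3.
q1 = {frozenset({1, 2}), frozenset({3})}
q2 = {frozenset({2, 3})}

print(sorted(sorted(c) for c in lub(q1, q2)))
print(sorted(sorted(c) for c in glb(q1, q2)))
```

Storing only maximal cells keeps the representation small: the cell {3} disappears from the least upper bound because it is covered by {2, 3}.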

Symbolic algorithms
Let $G$ be a 2-player game graph of imperfect information, and Obs a set of observations. Games of imperfect information can be solved by the same fixpoint formulas as for perfect information.
Theorem: Player 1 has a winning strategy if and only if the initial cell is contained in a cell of the corresponding fixpoint.

Solving safety games
[Figure: example game with observations o1, o2, o3.]
Does Player 1 have an observation-based strategy to avoid v3? We compute the fixpoint by successive iterations. [Figures: the successive iterations, up to the fixed point.]
Player 1 is winning since the initial cell is covered by the fixed point, and a winning strategy can be extracted from it.

Remarks
1. Finite memory may be necessary to win safety and reachability games of imperfect information, and therefore also for Büchi, coBüchi, and parity objectives.
2. Games of imperfect information are not determined.

Non-determinacy
[Figure: example game with observations o1 and o2 and actions a and b.]
Any fixed strategy of Player 1 can be spoiled by a strategy of Player 2 as follows: in the state where Player 2 has a choice, Player 2 chooses one successor if Player 1's strategy plays b in the next step, and the other successor if it plays a.
Hence Player 1 cannot enforce the objective. Similarly, Player 2 cannot enforce its complement, because when a strategy of Player 2 is fixed, one of the two actions is a spoiling choice for Player 1.

Remarks
3. Randomized strategies are more powerful, already for reachability objectives.

Randomization
[Figure: example game with observations o1 and o2.]
The following strategy of Player 1 wins with probability 1: at every step, play a and b uniformly at random. After each visit to {v1, v2}, no matter the strategy of Player 2, Player 1 has probability 1/2 to win (reach v3).
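The probability-1 claim follows from a short calculation, sketched here in exact arithmetic. It assumes that each visit to {v1, v2} is an independent attempt that succeeds with probability 1/2 and that a failed attempt leads back to {v1, v2}; the function name `win_probability` is hypothetical.

```python
from fractions import Fraction


def win_probability(visits, p=Fraction(1, 2)):
    """Probability of having reached v3 after `visits` independent attempts."""
    # Player 1 loses all attempts with probability (1 - p) ** visits.
    return 1 - (1 - p) ** visits


for n in (1, 2, 10):
    print(n, win_probability(n))
```

The probability of not having won after n visits is (1/2)^n, which tends to 0, so the strategy wins with probability 1 even though no single play is guaranteed to win.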

Summary

Conclusion
- Games for controller synthesis: symbolic algorithms using fixpoint formulas.
- Imperfect information is more realistic and gives more robust controllers, but it is exponentially harder to solve.
- Antichains exploit the structure of the subset construction: it is sufficient to keep only the maximal elements.

Conclusion
The antichain principle has applications in other problems where subset constructions are used:
- Finite automata: language inclusion, universality, etc. [De Wulf, D, Henzinger, Raskin 06]
- Alternating Büchi automata: emptiness and language inclusion. [D, Raskin 07]
- LTL: satisfiability and model-checking. [De Wulf, D, Maquet, Raskin 08]

Alaska: Antichains for Logic, Automata and Symbolic Kripke Structure Analysis. http://www.antichains.be

Acknowledgments
Antichains for games is joint work with Krishnendu Chatterjee, Martin De Wulf, Tom Henzinger and Jean-François Raskin. Special thanks to Jean-François Raskin for preparing the slides.

Thank you! Questions?