Equilibrium computation: Part 1
1 Equilibrium computation: Part 1
Nicola Gatti (Politecnico di Milano, Italy)
Troels Bjerre Sørensen (Duke University, USA)
2 Outline
1 Models and solution concepts
  - Mechanisms in strategic form
  - Solution concepts
2 Non-equilibrium solution concept computation
  - Finding dominated actions
  - Finding never-best-response actions
3 Computing a Nash equilibrium with strategic form games
  - Matrix games
  - Bimatrix games
  - Polymatrix games
4 Computing correlation-based equilibria with strategic form games
  - Computing a correlated equilibrium
  - Computing a leader-follower equilibrium
4 Game model
Definition
A game is formally defined by a pair:
- Mechanism M, defining the rules of the game
- Strategies σ, defining the behavior of each agent in the game
Mechanisms
There are three main classes of mechanisms:
- Strategic-form mechanisms: agents play without observing the actions undertaken by the opponents (simultaneous games)
- Extensive-form mechanisms: there is a sequential tree-based structure according to which an agent can observe some opponents' actions
- Stochastic-form mechanisms: there is a sequential graph-based structure according to which an agent can observe some opponents' actions
5 Games in strategic form (1)
Definition
A strategic-form mechanism is a tuple M = (N, {A_i}_{i∈N}, X, f, {U_i}_{i∈N}) where:
- N: set of agents
- A_i: set of actions available to agent i
- X: set of outcomes
- f : ∏_{i∈N} A_i → X: outcome function
- U_i : X → R: utility function of agent i
7 Games in strategic form (2)
Example: Rock-Paper-Scissors
- N = {agent 1, agent 2}
- A_1 = A_2 = {R, P, S}
- X = {win1, win2, tie}
- f(R, S) = f(P, R) = f(S, P) = win1; f(S, R) = f(R, P) = f(P, S) = win2; tie otherwise
- U_i(win_i) = 1, U_i(win_{-i}) = -1, U_i(tie) = 0
Matrix-based representation (rows: agent 1, columns: agent 2):
       R       P       S
R    0, 0   -1, 1    1, -1
P    1, -1   0, 0   -1, 1
S   -1, 1    1, -1   0, 0
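The mechanism maps directly to code. A minimal sketch (function and variable names are illustrative, not from the slides) that encodes the outcome function f and utilities U_i, and recovers the payoff bimatrix from them:

```python
# Rock-Paper-Scissors as a strategic-form mechanism (N, A_i, X, f, U_i).
# Names are illustrative, not from the slides.

BEATS = {("R", "S"), ("P", "R"), ("S", "P")}  # (winning action, losing action)

def outcome(a1, a2):
    """Outcome function f: A_1 x A_2 -> X."""
    if (a1, a2) in BEATS:
        return "win1"
    if (a2, a1) in BEATS:
        return "win2"
    return "tie"

def utility(i, x):
    """Utility U_i: X -> R for agent i in {1, 2}."""
    if x == "tie":
        return 0
    return 1 if x == f"win{i}" else -1

# Payoff bimatrix recovered from the mechanism; rows/columns ordered R, P, S.
actions = ["R", "P", "S"]
U1 = [[utility(1, outcome(r, c)) for c in actions] for r in actions]
U2 = [[utility(2, outcome(r, c)) for c in actions] for r in actions]
```

Building U1 and U2 from f and U_i reproduces the matrix representation above.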
8 Games in strategic form (3)
Example: three-player game with A_1 = {a, b}, A_2 = {L, R}, A_3 = {A, B, C} (agent 3 selects the matrix):
A:     L        R
  a  2, 2, 1  0, 3, 0
  b  3, 0, 2  1, 1, 4
B:     L        R
  a  2, 3, 0  0, 4, 1
  b  3, 1, 2  1, 2, 0
C:     L        R
  a  2, 1, 0  1, 0, 2
  b  0, 3, 1  2, 3, 1
9 Matrix-based games
Classification
- Matrix game: the agents' utilities can be represented by a single matrix (this happens with two-agent constant-sum games: U_1 + U_2 = constant for every entry)
- Bimatrix game: two-agent general-sum game
- Polymatrix game: the utility U_i of each agent i can be expressed as a set of matrices U_{i,j}, each depending only on the actions of agent i and agent j
  - with non-polymatrix games, U_i has ∏_{j∈N} |A_j| entries
  - with polymatrix games, U_i has |A_i| · Σ_{j∈N, j≠i} |A_j| entries
11 Strategies
Definition
- A strategy σ_i of agent i is a probability distribution over the actions A_i
- Calling x_{i,j} the probability with which agent i plays action j, and x_i the vector of the x_{i,j}, we require x_i ≥ 0 and 1ᵀ x_i = 1
- A strategy profile σ is the collection of one strategy per agent, σ = (σ_1, ..., σ_N)
Example
With Rock-Paper-Scissors, strategies can be:
x_1 = (x_{1,R}, x_{1,P}, x_{1,S}) = (0.2, 0.8, 0.0)
x_2 = (x_{2,R}, x_{2,P}, x_{2,S}) = (0.6, 0.0, 0.4)
12 Expected utility (1)
Definition
- The expected utility of agent i related to an action j is (U_i ∏_{k∈N, k≠i} x_k)_j, where (A)_j denotes the j-th row of matrix A
- U_i ∏_{k∈N, k≠i} x_k is the vector of expected utilities of agent i
- The expected utility of agent i related to a strategy x_i is x_iᵀ U_i ∏_{k∈N, k≠i} x_k
13 Expected utility (2)
Example
Given a payoff matrix U_1 and strategies x_1, x_2:
- the expected utilities related to each action of agent 1 are the entries of U_1 x_2
- the expected utility related to the strategy of agent 1 is x_1ᵀ U_1 x_2
[numeric matrix and vectors of this slide lost in transcription]
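For two agents the formulas reduce to the matrix-vector products U_1 x_2 and x_1ᵀ U_1 x_2. A concrete sketch in plain Python, using the Rock-Paper-Scissors payoffs and the strategies from the earlier strategies slide:

```python
# Expected utilities in a two-player game: per-action vector U1 x2
# and scalar x1^T U1 x2, for agent 1 in Rock-Paper-Scissors.
U1 = [[0, -1, 1],   # R
      [1,  0, -1],  # P
      [-1, 1,  0]]  # S

x1 = [0.2, 0.8, 0.0]  # agent 1's strategy over (R, P, S)
x2 = [0.6, 0.0, 0.4]  # agent 2's strategy over (R, P, S)

# (U1 x2)_j = expected utility of agent 1 when playing pure action j
per_action = [sum(U1[j][k] * x2[k] for k in range(3)) for j in range(3)]

# x1^T U1 x2 = expected utility of agent 1's mixed strategy
ev = sum(x1[j] * per_action[j] for j in range(3))

print(per_action)  # approx [0.4, 0.2, -0.6]
print(ev)          # approx 0.24
```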
15 Game equivalence
Definition
Given two games with utility functions U_1, ..., U_N and U'_1, ..., U'_N respectively, if, for every i ∈ N, there is a positive affine transformation between U_i and U'_i, i.e.
U'_i = α_i U_i + β_i · 1,  with α_i > 0,
where 1 denotes a matrix of ones, then the two games are equivalent
Example
[numeric matrices of this slide lost in transcription]
17 Solutions and solution concepts
Definition
Given:
- the strategy x_i of each agent i
- the belief x̂_{i,j} that each agent i holds over the strategy x_j of each opponent j
a solution is a pair (σ, μ), where μ is the set of the agents' beliefs, such that:
- Rationality constraints: the strategies of each agent are optimal w.r.t. her beliefs
- Information constraints: the beliefs of each agent are somehow consistent w.r.t. the opponents' strategies
Definition
A solution concept defines the set of rationality and information constraints
20 Solution concept classification
Non-equilibrium solution concepts
- Dominance and iterated dominance
- Never best response and iterated never best response
- Maxmin strategy and minmax strategy
Equilibrium solution concepts without correlation
- Nash relaxations: conjectural equilibrium, self-confirming equilibrium
- Nash
- Nash refinements: perfect equilibrium, proper equilibrium
Equilibrium solution concepts with correlation
- One-agent-based correlation: leader-follower/Stackelberg/commitment equilibrium
- Device-based correlation: correlated equilibrium
21 Dominance (1)
Definition
Action j ∈ A_i is strictly dominated if there is a strategy x over A_i that, for every action of the opponents, provides an expected utility larger than action j:
e_jᵀ U_i < xᵀ U_i
where e_j is a vector of zeros except for position j, wherein there is a 1
Example (rows: agent 1, columns: agent 2); action C of agent 1 is dominated by action B:
       D       E       F
A    4, 1    1, 2    1, 3
B    1, 4    4, 0    4, 1
C    0, 1    2, 5    2, 0
23 Dominance (2)
Weak dominance
Action j ∈ A_i is weakly dominated if there is a strategy x over A_i that, for every action of the opponents, provides an expected utility equal to or larger than that of action j
Dominance and rationality
- No rational agent will play an action that is strictly dominated
- Strictly dominated actions can be safely removed from the game, never being played
- The application of strict dominance leads to a reduced game that is equivalent to the original one
- Weakly dominated actions could be played by agents
26 Dominance and mixed strategies
Property
Dominance with mixed strategies is stronger than with pure strategies
Example (rows: agent 1, columns: agent 2):
       D       E       F
A    4, 1    1, 2    1, 3
B    1, 4    4, 0    4, 1
C    2, 1    2, 5    2, 0
Dominance in pure strategies: no action of agent 1 is dominated by another action
Dominance in mixed strategies: action C is dominated, e.g., by x = (1/2, 1/2, 0)
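The claim can be checked directly: the mix over A and B earns 2.5 against every column, strictly more than C's constant 2. A quick sketch:

```python
# Agent 1's payoffs from the example (rows A, B, C; columns D, E, F).
U1 = [[4, 1, 1],   # A
      [1, 4, 4],   # B
      [2, 2, 2]]   # C

x = [0.5, 0.5, 0.0]  # candidate mixed strategy over (A, B, C)

# Expected payoff of the mix against each pure opponent action
mix_payoff = [sum(x[a] * U1[a][k] for a in range(3)) for k in range(3)]

# Strict dominance of C: the mix must be strictly better in every column
strictly_dominates_C = all(m > c for m, c in zip(mix_payoff, U1[2]))
print(mix_payoff, strictly_dominates_C)  # [2.5, 2.5, 2.5] True
```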
27 Dominance with more than two agents
Example (agent 3 selects the matrix):
A:     L        R
  a  2, 2, 1  0, 3, 0
  b  3, 0, 2  1, 1, 4
B:     L        R
  a  2, 3, 0  0, 4, 1
  b  3, 1, 2  1, 2, 0
C:     L        R
  a  2, 1, 0  1, 0, 2
  b  3, 3, 1  2, 3, 1
Action a is dominated by action b
28 Dominance as a solution concept
Comments
- Dominance does not require any assumption over the information available to each agent, except for the knowledge of her own utility
- Dominance prescribes which actions to play and which not to play, independently of the opponents' strategies
- Dominance does not prescribe any strategy over the non-dominated actions
- We have an equilibrium in dominant strategies if dominance removes all the actions except one for every agent
Example (rows: agent 1, columns: agent 2); the left game has a dominant-strategy equilibrium, the right one does not:
       S       C
S    2, 2    0, 3
C    3, 0    1, 1

       H       T
H    2, 0    0, 2
T    0, 2    2, 0
29 Iterated dominance
Definition
Under the assumption of complete information over the utilities and common knowledge of rationality and utilities, each agent can forecast the dominated actions of the opponents and iteratively remove her own actions
Example (rows: agent 1, columns: agent 2):
       D       E       F
A    3, 2    2, 1    2, 0
B    0, 2    0, 5    3, 3
C    0, 1    1, 2    1, 4
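Under these assumptions the removal procedure can be run mechanically. A sketch of iterated strict dominance by pure strategies on the example game (a full solver would also check dominance by mixed strategies; here pure-strategy checks suffice):

```python
# Iterated removal of pure-strategy strictly dominated actions
# in a bimatrix game. U1/U2 are indexed as [row][col].
U1 = [[3, 2, 2], [0, 0, 3], [0, 1, 1]]  # agent 1: rows A, B, C
U2 = [[2, 1, 0], [2, 5, 3], [1, 2, 4]]  # agent 2: cols D, E, F

rows, cols = {0, 1, 2}, {0, 1, 2}

def dominated(keep_own, keep_opp, payoff):
    """Return an action strictly dominated by another pure action, or None."""
    for j in keep_own:
        for d in keep_own:
            if d != j and all(payoff(d, k) > payoff(j, k) for k in keep_opp):
                return j
    return None

changed = True
while changed:
    changed = False
    r = dominated(rows, cols, lambda a, k: U1[a][k])
    if r is not None:
        rows.discard(r); changed = True
    c = dominated(cols, rows, lambda a, k: U2[k][a])
    if c is not None:
        cols.discard(c); changed = True

print(rows, cols)  # surviving actions: only A for agent 1, only D for agent 2
```

For strict dominance the elimination order does not affect the final reduced game; here the procedure leaves the single profile (A, D).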
31 Best response
Definition
The best response of agent i is an action that maximizes her expected utility given the strategies of the opponents as input:
BR_i(σ_{-i}) = argmax_{j∈A_i} e_jᵀ U_i ∏_{k∈N, k≠i} x_k,  where the x_k are given
Comments
- BR_i(σ_{-i}) can return multiple actions
- A rational agent will play only best-response actions
- Any mixed strategy over best-response actions is a best response
- Any action that is not a never best response is said to be rationalizable
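For two players, BR_1(σ_2) is just the set of argmax entries of U_1 x_2. A sketch using the Rock-Paper-Scissors payoffs and the opponent strategy from the earlier example:

```python
# Best response BR_1(sigma_2) in a two-player game: the set of pure
# actions maximizing agent 1's expected utility against x2.
U1 = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # RPS payoffs, rows/cols R, P, S
x2 = [0.6, 0.0, 0.4]                       # opponent's (given) strategy

expected = [sum(U1[j][k] * x2[k] for k in range(3)) for j in range(3)]
best = max(expected)
BR = [j for j in range(3) if abs(expected[j] - best) < 1e-12]
print([["R", "P", "S"][j] for j in BR])  # the best-response action(s)
```

Against this x2 the expected utilities are approximately (0.4, 0.2, -0.6), so the unique best response is R.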
33 Never best response
Definition
A never best response of agent i is an action j such that there is no opponents' strategy profile for which action j is a best response:
j ∉ BR_i(σ_{-i})  for every σ_{-i}
Comments
- No rational agent will play never-best-response actions
- Never-best-response actions can be safely removed
- Rationalizability requires each agent to know only her own utilities; no assumption is required over the information on the opponents' utilities and rationality
35 Never best response (continued)
Comments
- When information on the utilities and rationality is complete and common, rationalizability can be iterated
36 Rationalizability and dominance (1)
Comments
- Dominance and rationalizability are equivalent with two agents (the proof is by strong LP duality)
- With more than two agents, every dominated action is a never best response, but the reverse may not hold (rationalizability removes a larger number of actions than dominance)
- The main difference:
  - dominance is similar to rationalizability, but it implicitly assumes that the opponents correlate their strategies as a unique agent
  - rationalizability explicitly considers each opponent as a different, uncorrelated agent
- If an action is dominated when the opponents can correlate, it is also dominated when they cannot
- If an action is dominated when the opponents cannot correlate, it may not be when they can
37 Rationalizability and dominance (2)
Example (agent 3 selects the matrix):
A:      L          R
  a  0, 0, 0    0, 0, 0
  b  8, 8, 8    0, 0, 0
B:      L          R
  a  0, 0, 0    8, 8, 8
  b  0, 0, 0    0, 0, 0
C:      L          R
  a  4, 4, 4    0, 0, 0
  b  0, 0, 0    4, 4, 4
D:      L          R
  a  3, 3, 3    3, 3, 3
  b  3, 3, 3    3, 3, 3
Action D is not strictly dominated, but it is a never best response
38 Maxmin
Assumptions
- An agent does not know anything about her opponents
- An agent aims to maximize her utility in the worst case (safety level)
Definition
A maxmin strategy σ* of agent i is defined as:
σ* = argmax_{σ_i} min_{σ_{-i}} E[U_i]
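For a two-player game, the maxmin strategy of agent 1 solves a linear program: maximize v subject to x_1ᵀ U_1 earning at least v in every opponent column, with x_1 a distribution. A sketch with scipy (assumed available; not referenced by the slides); for Rock-Paper-Scissors the maxmin strategy is uniform with safety level 0:

```python
import numpy as np
from scipy.optimize import linprog

# Maxmin strategy of agent 1 via LP over variables z = (x_1, v):
# maximize v s.t. sum_a x_a U1[a][k] >= v for every opponent column k.
U1 = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])  # RPS payoffs
n, m = U1.shape

c = np.zeros(n + 1); c[-1] = -1.0           # linprog minimizes, so use -v
A_ub = np.hstack([-U1.T, np.ones((m, 1))])  # v - x^T U1[:, k] <= 0
b_ub = np.zeros(m)
A_eq = np.array([[1.0] * n + [0.0]])        # probabilities sum to 1
b_eq = np.array([1.0])
bounds = [(0, None)] * n + [(None, None)]   # x >= 0, v free in sign

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x_maxmin, v = res.x[:n], res.x[-1]
print(x_maxmin, v)  # approx [1/3, 1/3, 1/3] and 0.0
```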
39 Minmax
Assumptions
- An agent knows the utility of the opponent
- An agent aims to minimize the opponent's expected utility
Definition
A minmax strategy σ* of agent i is defined as:
σ* = argmin_{σ_i} max_{σ_{-i}} E[U_{-i}]
41 Nash equilibrium (1)
Assumptions
- Agents do not communicate before playing
- Agents know the utilities of the opponents, and this information is common
Definition
A Nash equilibrium is a strategy profile (x*_1, ..., x*_n) such that:
(x*_i)ᵀ U_i ∏_{j∈N, j≠i} x*_j ≥ x_iᵀ U_i ∏_{j∈N, j≠i} x*_j   for every x_i, for every i ∈ N
Comments
- In a Nash equilibrium, no agent can gain by changing her strategy given that the opponents do not change (i.e., every x*_i is a randomization over best responses)
- Coalition deviations are not considered
42 Nash equilibrium (2)
Definition
A Nash equilibrium is a strategy profile (x*_1, ..., x*_n) such that:
(x*_i)ᵀ U_i ∏_{j∈N, j≠i} x*_j ≥ e_kᵀ U_i ∏_{j∈N, j≠i} x*_j   for every k ∈ A_i, for every i ∈ N
Comments
- We can substitute the constraints over all x_i (infinitely many) with constraints over all k ∈ A_i (|A_i| constraints) because x_iᵀ U_i ∏_{j≠i} x_j is a convex combination of the different e_kᵀ U_i ∏_{j≠i} x_j
- Hence x_iᵀ U_i ∏_{j≠i} x_j is smaller than or equal to max_k e_kᵀ U_i ∏_{j≠i} x_j
- Since we cannot know in advance which k has the largest e_kᵀ U_i ∏_{j≠i} x_j, we impose that the equilibrium utility be larger than or equal to all the e_kᵀ U_i ∏_{j≠i} x_j
- We obtain a finite number of constraints, linear in the size of the game
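The finite constraint set gives an immediate equilibrium check for two-player games: compare each agent's expected utility with every pure-action deviation. A sketch, using the S/C game from the dominance slides (helper name is my own):

```python
# Check the Nash conditions x1^T U1 x2 >= (U1 x2)_k for every pure
# action k, and symmetrically for agent 2 (finitely many constraints).
def is_nash(x1, x2, U1, U2, tol=1e-9):
    n, m = len(U1), len(U1[0])
    u1 = [sum(U1[j][k] * x2[k] for k in range(m)) for j in range(n)]
    u2 = [sum(x1[j] * U2[j][k] for j in range(n)) for k in range(m)]
    v1 = sum(x1[j] * u1[j] for j in range(n))
    v2 = sum(x2[k] * u2[k] for k in range(m))
    return v1 >= max(u1) - tol and v2 >= max(u2) - tol

U1 = [[2, 0], [3, 1]]  # agent 1, actions S, C
U2 = [[2, 3], [0, 1]]  # agent 2, actions S, C
print(is_nash([0, 1], [0, 1], U1, U2))  # (C, C) is an equilibrium: True
print(is_nash([1, 0], [1, 0], U1, U2))  # (S, S) is not: False
```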
43 Nash theorem
Theorem
Every finite game admits at least one Nash equilibrium in mixed strategies
Comments
- The proof is by Brouwer's fixed-point theorem: a Nash equilibrium is a fixed point
- Pure-strategy Nash equilibria may not exist (e.g., matching pennies)
- Multiple equilibria can coexist
- With continuous games, things are more complicated (a continuous game may not admit any Nash equilibrium, even in mixed strategies)
44 Example (1)
Pure-strategy equilibrium (rows: agent 1, columns: agent 2):
       D       E       F
A    1, 3    2, 1    1, 0
B    3, 2    0, 5    2, 3
C    0, 1    1, 2    3, 3
Pure-strategy equilibrium:
       D       E       F
A    6, 2    2, 1    1, 6
B    3, 2    3, 3    2, 3
C    0, 6    1, 2    3, 3
45 Example (2)
Multiple pure-strategy equilibria (rows: agent 1, columns: agent 2):
       D       E       F
A    6, 2    2, 1    1, 6
B    3, 2    3, 3    2, 3
C    0, 6    1, 2    9, 9
No pure-strategy equilibrium:
       D       E       F
A    6, 2    2, 1    1, 6
B    3, 2    0, 3    2, 3
C    0, 6    1, 2    3, 3
46 Nash equilibrium and Pareto efficiency
Example (rows: agent 1, columns: agent 2):
       S       C
S    2, 2    0, 3
C    3, 0    1, 1
- There is a unique Nash equilibrium, (C, C)
- (C, C) is Pareto dominated by (S, S)
- (C, C) is the unique Pareto-dominated strategy profile
- There is no relationship between Pareto dominance and Nash equilibrium
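Pure-strategy equilibria of a bimatrix game can be enumerated by checking, for each cell, whether either agent gains by a unilateral deviation. A sketch on the game above (function name is my own):

```python
# Enumerate pure-strategy Nash equilibria of a bimatrix game:
# cell (r, c) is an equilibrium iff r maximizes column c of U1
# and c maximizes row r of U2.
def pure_nash(U1, U2):
    n, m = len(U1), len(U1[0])
    return [(r, c) for r in range(n) for c in range(m)
            if U1[r][c] >= max(U1[i][c] for i in range(n))
            and U2[r][c] >= max(U2[r][j] for j in range(m))]

U1 = [[2, 0], [3, 1]]   # agent 1: rows S, C
U2 = [[2, 3], [0, 1]]   # agent 2: cols S, C
print(pure_nash(U1, U2))  # [(1, 1)], i.e. the unique equilibrium (C, C)
```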
50 Perturbed games (1)
Perturbation
Given a set of actions A_i, a perturbation over it assigns each action j ∈ A_i a minimal probability f_{i,j} > 0 with Σ_{j∈A_i} f_{i,j} < 1
Parametric perturbation
A perturbation f_{i,j} = f_{i,j}(ε) with ε ∈ [0, 1]
Perturbed game
Given a perturbation f_{i,j}, a perturbed game is a game in which strategies are constrained as x_{i,j} ≥ f_{i,j} for every i ∈ N, j ∈ A_i
Perturbation and Nash equilibrium
The introduction of a perturbation (i.e., a perturbed game) affects the set of Nash equilibria
51 Perturbed games (2)
Example (rows: agent 1, columns: agent 2):
       C         D
A   10, 10     0, 0
B    0, 0      1, 1
Perturbation f_{1,A} = f_{1,B} = f_{2,C} = f_{2,D} = 0.2:
- (A, C) and (B, D) are Nash equilibria without perturbation
- (0.8A + 0.2B, 0.8C + 0.2D) is a Nash equilibrium with perturbation: all the probability except for the perturbation is put on (A, C)
- (0.2A + 0.8B, 0.2C + 0.8D) is not a Nash equilibrium with perturbation: all the probability except for the perturbation cannot be put on (B, D)
Perturbation f_{1,A} = f_{1,B} = f_{2,C} = f_{2,D} = 0.05:
- (B, D) is a Nash equilibrium without perturbation
- (0.05A + 0.95B, 0.05C + 0.95D) is a Nash equilibrium with perturbation: all the probability except for the perturbation is put on (B, D)
52 Perfect equilibrium (1)
Definition
A strategy profile σ* is a perfect equilibrium if there is an f_{i,j}(ε) such that, calling σ*(ε) a sequence of Nash equilibria of the associated perturbed games for ε ∈ [0, ε_0], σ*(ε) → σ* as ε → 0
Example (rows: agent 1, columns: agent 2):
       C       D
A    1, 1    0, 0
B    0, 0    0, 0
- For every f_{1,A}(ε) > 0, action D is not a best response
- For every f_{2,C}(ε) > 0, action B is not a best response
- (B, D) is a Nash equilibrium, but it is not perfect
- (A, C) is a perfect equilibrium
53 Perfect equilibrium (2)
Properties
- An equilibrium is perfect if it remains a Nash equilibrium when minimally perturbed
- Every finite game admits at least one perfect equilibrium
- Every perfect equilibrium is a Nash equilibrium in which no weakly dominated action is played
- The converse (i.e., every Nash equilibrium in which no weakly dominated action is played is a perfect equilibrium) is true only for two-player games
- There is no relationship between perfect equilibrium and Pareto efficiency
- We can safely consider only f_{i,j}(ε) that are polynomial in ε
54 Perfect equilibrium (3)
Example (agent 3 selects the matrix):
E:      C          D
  A  1, 1, 1    1, 0, 1
  B  1, 1, 1    0, 0, 1
F:      C          D
  A  1, 1, 1    0, 0, 0
  B  0, 1, 0    1, 0, 0
- F is weakly dominated for agent 3
- D is weakly dominated for agent 2
- (A, C, E) and (B, C, E) are Nash equilibria without weakly dominated actions
- (A, C, E) is not perfect
55 Perfect equilibrium (4)
Example (rows: agent 1, columns: agent 2):
       C         D
A    1, 1     10, 0
B    0, 10   10, 10
- There are two pure-strategy Nash equilibria, (A, C) and (B, D)
- Actions B and D are weakly dominated
- The unique perfect equilibrium is (A, C)
- (B, D) Pareto dominates (A, C)
56 Perfect equilibrium (5)
Example (rows: agent 1, columns: agent 2)
Without c and C:
       A       B
a    1, 1    0, 0
b    0, 0    0, 0
With c and C:
       A         B         C
a    1, 1      0, 0     -1, -2
b    0, 0      0, 0      0, -2
c   -2, -1   -2, 0     -2, -2
- Without c and C, the unique perfect equilibrium is (a, A)
- With c and C, (b, B) is a perfect equilibrium
- The introduction of strictly dominated actions may change the set of perfect equilibria
57 Proper equilibrium (1)
Perfection weakness
The perfect equilibrium is sensitive to weakly dominated actions
Aim
The design of a solution concept refining the Nash equilibrium that is not sensitive to weakly dominated actions
Properness idea
A proper equilibrium is a perfect equilibrium with a specific perturbation: given two actions j and k of agent i, if j provides a utility strictly larger than k, then the perturbation is subject to f_{i,k} ≤ ε f_{i,j}
In other words
The perturbation has the property that a good action must be played (due to the perturbation) with probability larger than the probability of a bad action
58 Proper equilibrium (2)
Properties
- Every game admits at least one proper equilibrium
- The proper equilibrium removes weakly dominated strategies in two-player games
- With more agents, the proper equilibrium may not remove weakly dominated strategies
59 Correlated equilibrium (1)
Assumptions
- Agents can correlate in some way
- Typically, a correlation device is considered that sends a different signal to each agent
Definition
A correlated equilibrium is a tuple (v, π, σ), where v is a tuple of random variables v = (v_1, ..., v_n) with respective domains D = (D_1, ..., D_n), π is a joint distribution over v, and σ = (σ_1, ..., σ_n) is a vector of mappings σ_i : D_i → A_i, such that for each agent i and every alternative mapping σ'_i : D_i → A_i:
Σ_{d∈D} π(d_i, d_{-i}) U_i(σ_i(d_i), σ_{-i}(d_{-i})) ≥ Σ_{d∈D} π(d_i, d_{-i}) U_i(σ'_i(d_i), σ_{-i}(d_{-i}))
It is possible to limit the strategies σ_i to be pure
60 Correlated equilibrium (2)
Properties
- Every Nash equilibrium is a correlated equilibrium in which there is only one signal per agent
- A correlated equilibrium may not be a Nash equilibrium
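In the common special case where the device's signal is the recommended action itself (a canonical-form reduction assumed by this sketch, not spelled out on the slide), checking a correlated equilibrium reduces to linear incentive constraints: no agent should gain by deviating from any recommendation. A two-player sketch on the S/C game used earlier:

```python
# Correlated-equilibrium check in canonical form: pi is a joint
# distribution over action profiles; for every agent, recommended
# action a and deviation b, obeying the recommendation must be optimal.
def is_correlated_eq(pi, U1, U2, tol=1e-9):
    n, m = len(U1), len(U1[0])
    for a in range(n):            # agent 1's recommendation
        for b in range(n):
            gain = sum(pi[a][c] * (U1[b][c] - U1[a][c]) for c in range(m))
            if gain > tol:
                return False
    for a in range(m):            # agent 2's recommendation
        for b in range(m):
            gain = sum(pi[r][a] * (U2[r][b] - U2[r][a]) for r in range(n))
            if gain > tol:
                return False
    return True

U1 = [[2, 0], [3, 1]]  # S/C game, agent 1
U2 = [[2, 3], [0, 1]]  # agent 2
print(is_correlated_eq([[0, 0], [0, 1]], U1, U2))  # all mass on (C, C): True
print(is_correlated_eq([[1, 0], [0, 0]], U1, U2))  # all mass on (S, S): False
```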
61 Leader-follower equilibrium (1)
Assumptions
- An agent, called the leader, can announce (commit to) her strategy to the opponents
- The other agents, called followers, act knowing the commitment
- The announcement must be credible
Definition
A leader-follower equilibrium is a strategy profile in which the expected utility of the leader is maximized, given that the followers act knowing the strategy of the leader
62 Leader-follower equilibrium (2)
Properties
With two agents:
- The follower acts in pure strategies, selecting the best response that maximizes the leader's expected utility
- The leader-follower equilibrium is always (weakly) better for the leader than the Nash equilibrium
With more than two agents:
- The followers may act in mixed strategies
- The leader-follower equilibrium may be worse than the Nash equilibrium
63 Finding dominated actions (1)
Formulation
Given action j ∈ A_i, dominance can be checked with the following linear program:
max_{x_i, r}  r
s.t.  U_iᵀ x_i − 1 r ≥ U_iᵀ e_j
      1ᵀ x_i = 1
      x_i ≥ 0,  r free in sign
- If the optimal value r is > 0, then j is strictly dominated
- If the optimal value r is = 0, then j is weakly dominated
- If the optimal value r is < 0, then j is not dominated
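The LP can be handed to any solver. A sketch with scipy (assumed available; not referenced by the slides), applied to the earlier mixed-strategy dominance example, where action C is strictly dominated with optimal margin r = 1/2; here the dominating mix is restricted to the other actions:

```python
import numpy as np
from scipy.optimize import linprog

def dominance_value(U, j):
    """Solve max r s.t. sum_a x_a U[a, k] - r >= U[j, k] for every
    opponent column k, with x a distribution over the other actions.
    Optimal r > 0 means action j is strictly dominated."""
    n, m = U.shape
    c = np.zeros(n + 1); c[-1] = -1.0          # linprog minimizes, so use -r
    A_ub = np.hstack([-U.T, np.ones((m, 1))])  # r - x^T U[:, k] <= -U[j, k]
    b_ub = -U[j, :].astype(float)
    A_eq = np.array([[1.0] * n + [0.0]])       # x sums to 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * n + [(None, None)]
    bounds[j] = (0.0, 0.0)                     # exclude j from the mix
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

U1 = np.array([[4, 1, 1], [1, 4, 4], [2, 2, 2]])  # rows A, B, C
r = dominance_value(U1, 2)
print(r > 0)  # True: C is strictly dominated (by the mix 1/2 A + 1/2 B)
```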
65 Finding dominated actions (2)
Duality
Given action j ∈ A_i, dominance can be checked as follows:
max_{x, r}  r
s.t.  1 (e_jᵀ U_i x) − U_i x − 1 r ≥ 0
      1ᵀ x = 1
      x ≥ 0
(where x is a correlated strategy of the opponents)
- If the optimal value r is > 0, then j is not dominated
- If the optimal value r is = 0, then j is weakly dominated
- If the optimal value r is < 0, then j is strictly dominated
Equivalence
By strong duality, the primal and dual problems are equivalent
66 Finding never best response actions
Formulation
Given action j ∈ A_i, rationalizability can be checked as follows:
max_{{x_k}_{k∈N, k≠i}, r}  r
s.t.  1 (e_jᵀ U_i ∏_{k∈N, k≠i} x_k) − U_i ∏_{k∈N, k≠i} x_k − 1 r ≥ 0
      1ᵀ x_k = 1   for every k ∈ N, k ≠ i
      x_k ≥ 0      for every k ∈ N, k ≠ i
- If the optimal value r is > 0, then j is not a never best response
- If the optimal value r is = 0, then j is a weak never best response
- If the optimal value r is < 0, then j is a strict never best response
67 Dominance vs. rationalizability

Comments
    Rationalizability poses additional constraints w.r.t. dominance: the opponents' strategies must be uncorrelated
    As a result, the optimal value of r is never larger than the value obtained with dominance
    Since the optimal value of r can only decrease, it may become negative, in which case the action is a never best response even though it is not dominated
68 Dominance vs. rationalizability: complexity

Complexity results
    Dominance: checking whether an action is dominated is in P for every n-player game, since it can be formulated as a linear programming problem
    Rationalizability: checking whether an action is a never best response is in P for two-player games, since it can be formulated as a linear programming problem, and is NP-complete for n-player games with n > 2, by reduction from 3-SAT
69 Matrix games

Strategies
    x_1 = [x_{1,1} ... x_{1,n_1}]^T
    x_2 = [x_{2,1} ... x_{2,n_2}]^T

Payoff matrices
    U_1 = [ a_{1,1}   ...  a_{1,n_2}   ]
          [ ...       ...  ...         ]
          [ a_{n_1,1} ...  a_{n_1,n_2} ]

    U_2 = -U_1^T (zero-sum game)
70 Best response

Agent 1's best response (primal)

    max_{x_1}  x_1^T U_1 x_2
    s.t.  1^T x_1 = 1
          x_1 >= 0

Example

    max_{x_1}  x_{1,1} sum_{i=1}^{3} a_{1,i} x_{2,i} + x_{1,2} sum_{i=1}^{3} a_{2,i} x_{2,i} + x_{1,3} sum_{i=1}^{3} a_{3,i} x_{2,i}
    s.t.  x_{1,1} + x_{1,2} + x_{1,3} = 1
          x_{1,1}, x_{1,2}, x_{1,3} >= 0

Comments
    sum_{i=1}^{3} a_{j,i} x_{2,i} is the utility expected by agent 1 from taking action j, for j = 1, 2, 3
    Agent 1 will play only the actions j that maximize sum_{i=1}^{3} a_{j,i} x_{2,i}; therefore two actions j, j' will both be played only if they provide the same expected utility
    The strategy space is the simplex Delta_2
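Computing agent 1's expected utility per action against a fixed x_2 makes the structure of the LP concrete: the optimum is attained by putting all probability on an argmax action. A small sketch with exact arithmetic (the 3x3 matrix and the strategy x2 are made-up data, not from the slides):

```python
from fractions import Fraction as F

U1 = [[3, 0, 2],
      [1, 4, 1],
      [0, 2, 5]]                       # hypothetical payoff matrix of agent 1
x2 = [F(1, 2), F(1, 4), F(1, 4)]       # fixed mixed strategy of agent 2

# Expected utility of each action j: sum_i a_{j,i} * x_{2,i}
values = [sum(F(a) * p for a, p in zip(row, x2)) for row in U1]
best = max(values)
# A best response puts positive probability only on actions attaining `best`
support = [j for j, v in enumerate(values) if v == best]
```

Here action 1 is the unique maximizer, so every optimal x_1 is the pure strategy e_1; two actions would enter the support only if their expected utilities tied.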
76 Best response: dual

Agent 1's best response (dual)

    min_{v_1}  v_1
    s.t.  1 v_1 - U_1 x_2 >= 0
          v_1 free in sign

Example

    min_{v_1}  v_1
    s.t.  v_1 - sum_{i=1}^{3} a_{1,i} x_{2,i} >= 0
          v_1 - sum_{i=1}^{3} a_{2,i} x_{2,i} >= 0
          v_1 - sum_{i=1}^{3} a_{3,i} x_{2,i} >= 0
          v_1 free in sign

Comments
    v_1 is an upper bound of sum_{i=1}^{3} a_{j,i} x_{2,i} for every j
    Since we are minimizing v_1, the optimal v_1 equals the largest sum_{i=1}^{3} a_{j,i} x_{2,i}
79 Primal-dual relation

Agent 1's best response (primal)

    u* = max_{x_1}  x_1^T U_1 x_2
         s.t.  1^T x_1 = 1
               x_1 >= 0

Agent 1's best response (dual)

    v* = min_{v_1}  v_1
         s.t.  1 v_1 - U_1 x_2 >= 0
               v_1 free in sign

Strong duality
The mathematical program being linear, we have u* = v* by strong duality
80 Minmax strategy

Definition
The minmax strategy of agent -i is the strategy that minimizes agent i's expected utility when agent i acts as best responder

Formulation
We consider x_2 as a variable and not as an input to the problem:

    min_{v_1, x_2}  v_1
    s.t.  1 v_1 - U_1 x_2 >= 0
          v_1 free in sign
          1^T x_2 = 1
          x_2 >= 0

Agent 2 is trying to minimize the expected utility of agent 1
81 Minmax and duality: maxmin strategy

Definition
The maxmin strategy of agent i is the strategy that maximizes agent i's expected utility when agent -i acts to minimize agent i's expected utility

Maxmin formulation
The dual of the minmax problem is the maxmin formulation:

    max_{u_1, x_1}  u_1
    s.t.  1 u_1 - U_1^T x_1 <= 0
          u_1 free in sign
          1^T x_1 = 1
          x_1 >= 0

Strong duality
By strong duality, the maxmin problem and the minmax problem produce the same value
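When agent 1 has only two actions in a zero-sum game, the maxmin LP collapses to maximizing the lower envelope of the column lines over p = x_{1,1}, so it can be solved exactly by checking the endpoints and the pairwise crossings of those lines. A sketch under that restriction (the function name and the matching-pennies payoffs are illustrative):

```python
from fractions import Fraction as F

def maxmin_two_rows(A):
    """Maxmin value and strategy p = x_{1,1} for the row agent of a
    zero-sum game whose payoff matrix A has exactly two rows.  The optimum
    of min_c (p*A[0][c] + (1-p)*A[1][c]) over p in [0, 1] is attained at
    p = 0, p = 1 or a crossing of two column lines, so we enumerate those."""
    rows = [[F(v) for v in r] for r in A]
    cols = list(zip(*rows))
    def floor_val(p):                     # lower envelope at p
        return min(p * c0 + (1 - p) * c1 for c0, c1 in cols)
    candidates = {F(0), F(1)}
    for i in range(len(cols)):
        for k in range(i + 1, len(cols)):
            d = (cols[i][0] - cols[i][1]) - (cols[k][0] - cols[k][1])
            if d != 0:
                p = (cols[k][1] - cols[i][1]) / d   # lines i and k cross here
                if 0 <= p <= 1:
                    candidates.add(p)
    return max((floor_val(p), p) for p in candidates)

value, p = maxmin_two_rows([[1, -1],
                            [-1, 1]])   # matching pennies: value 0 at p = 1/2
```

By the strong-duality argument on the slide, this value also equals the minmax value that agent 2 can force against agent 1.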
83 Example

Strategies
    x_1 = [x_{1,1} x_{1,2} x_{1,3}]^T
    x_2 = [x_{2,1} x_{2,2} x_{2,3}]^T

Payoff matrices
    U_1 =            U_2 = -U_1^T
84 Maxmin: graphical interpretation (1)

[Figure: agent 1's strategy space, the simplex with vertices (1,0,0), (0,1,0), (0,0,1) in coordinates (x_{1,1}, x_{1,2}, x_{1,3}), shown progressively partitioned into the regions labelled x_{2,1}, x_{2,2}, x_{2,3} where each of agent 2's actions minimizes agent 1's expected utility]
90 Maxmin: graphical interpretation (2)

[Figure: cross-section of the previous picture over coordinates (x_{l,1}, x_{l,2}); only the axis labels survive from the original slide]
91 Minmax/maxmin/Nash with zero-sum games (1)

Comments
    The maxmin strategy of agent i is the best response to the minmax strategy of agent -i (by strong duality)
    With zero-sum games, the maxmin and minmax strategies of agent i are the same (see below)

With U_1 = -U_2^T, agent 1's minmax problem can be rewritten step by step into agent 2's maxmin problem:

    minmax                                maxmin
    min_{v_1, x_2}  v_1                   max_{u_2, x_2}  u_2
    s.t.  1 v_1 - U_1 x_2 >= 0            s.t.  1 u_2 - U_2^T x_2 <= 0
          v_1 free in sign                      u_2 free in sign
          1^T x_2 = 1                           1^T x_2 = 1
          x_2 >= 0                              x_2 >= 0

    Step 1: substitute U_1 = -U_2^T in the minmax problem: 1 v_1 + U_2^T x_2 >= 0
    Step 2: replace min v_1 with max -v_1 (same optimal solutions)
    Step 3: set u_2 = -v_1: the constraint becomes 1 u_2 - U_2^T x_2 <= 0, which is exactly agent 2's maxmin problem
96 Minmax/maxmin/Nash with zero-sum games (2)

Comments
    The maxmin strategy of agent i is the best response to the minmax strategy of agent -i (by strong duality)
    With zero-sum games, the maxmin and minmax strategies of agent i are the same
    Maxmin/minmax strategies constitute a Nash equilibrium:
        Suppose agent -i plays her minmax/maxmin strategy
        The best response of agent i is her maxmin/minmax strategy
        Likewise, the best response of agent -i is her own maxmin/minmax strategy
97 Bimatrix games

Strategies
    x_1 = [x_{1,1} ... x_{1,n_1}]^T
    x_2 = [x_{2,1} ... x_{2,n_2}]^T

Payoff matrices
    U_1 = [ a_{1,1}   ...  a_{1,n_2}   ]      U_2 = [ b_{1,1}   ...  b_{1,n_1}   ]
          [ ...       ...  ...         ]            [ ...       ...  ...         ]
          [ a_{n_1,1} ...  a_{n_1,n_2} ]            [ b_{n_2,1} ...  b_{n_2,n_1} ]
98 Bimatrix games: example

Strategies
    x_1 = [x_{1,1} x_{1,2} x_{1,3}]^T
    x_2 = [x_{2,1} x_{2,2} x_{2,3}]^T

Payoff matrices
    U_1 =            U_2 =
99 Equilibrium constraints: primal

Agent 1's best response (primal)

    max_{x_1}  x_1^T U_1 x_2
    s.t.  1^T x_1 = 1
          x_1 >= 0

Example

    max_{x_1}  x_{1,1} sum_{i=1}^{3} a_{1,i} x_{2,i} + x_{1,2} sum_{i=1}^{3} a_{2,i} x_{2,i} + x_{1,3} sum_{i=1}^{3} a_{3,i} x_{2,i}
    s.t.  x_{1,1} + x_{1,2} + x_{1,3} = 1
          x_{1,1}, x_{1,2}, x_{1,3} >= 0

Comments
    sum_{i=1}^{3} a_{j,i} x_{2,i} is the utility expected by agent 1 from taking action j, for j = 1, 2, 3
    Agent 1 will play only the actions j that maximize sum_{i=1}^{3} a_{j,i} x_{2,i}; therefore two actions j, j' will both be played only if they provide the same expected utility
    The strategy space is the simplex Delta_2
105 Strategies, simplices, and best response

[Figure: the two simplices of agent 1's strategies (x_{1,1}, x_{1,2}, x_{1,3}) and agent 2's strategies (x_{2,1}, x_{2,2}, x_{2,3}), built up progressively with points such as (0.6,0.3,0), (0.3,0.6,0), (0,0.5,0.5), (0.41,0.34,0.25), (0.25,0,0.75), (0,0.3,0.6) marking strategies and the induced best-response regions on the opposite simplex]
111 Equilibrium constraints: dual

Agent 1's best response (dual)

    min_{v_1}  v_1
    s.t.  1 v_1 - U_1 x_2 >= 0
          v_1 free in sign

Example

    min_{v_1}  v_1
    s.t.  v_1 - sum_{i=1}^{3} a_{1,i} x_{2,i} >= 0
          v_1 - sum_{i=1}^{3} a_{2,i} x_{2,i} >= 0
          v_1 - sum_{i=1}^{3} a_{3,i} x_{2,i} >= 0
          v_1 free in sign

Comments
    v_1 is an upper bound of sum_{i=1}^{3} a_{j,i} x_{2,i} for every j
    Since we are minimizing v_1, the optimal v_1 equals the largest sum_{i=1}^{3} a_{j,i} x_{2,i}
114 Primal-dual relation

Agent 1's best response (primal)

    u* = max_{x_1}  x_1^T U_1 x_2
         s.t.  1^T x_1 = 1
               x_1 >= 0

Agent 1's best response (dual)

    v* = min_{v_1}  v_1
         s.t.  1 v_1 - U_1 x_2 >= 0
               v_1 free in sign

Strong duality
The mathematical program being linear, we have u* = v* by the strong duality theorem
115 Equilibrium constraints: complementary slackness

Agent 1's best response (complementary slackness)

    1 v_1 - U_1 x_2 >= 0
    x_1 >= 0
    1^T x_1 = 1
    v_1 free in sign
    x_1^T (1 v_1 - U_1 x_2) = 0

Example: linear complementarity constraints

    x_{1,1} = 0  or  v_1 - sum_{i=1}^{3} a_{1,i} x_{2,i} = 0
    x_{1,2} = 0  or  v_1 - sum_{i=1}^{3} a_{2,i} x_{2,i} = 0
    x_{1,3} = 0  or  v_1 - sum_{i=1}^{3} a_{3,i} x_{2,i} = 0

Comments
    Action j is played only if sum_{i=1}^{3} a_{j,i} x_{2,i} equals v_1, that is, only if it is a best response
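Complementary slackness is easy to verify numerically for a candidate equilibrium: compute the slacks 1 v_1 - U_1 x_2 and check that every action played with positive probability has zero slack. A sketch on a made-up 2x2 game (all data are illustrative):

```python
from fractions import Fraction as F

U1 = [[2, 0],
      [0, 2]]                      # hypothetical payoff matrix of agent 1
x1 = [F(1, 2), F(1, 2)]            # candidate strategy of agent 1
x2 = [F(1, 2), F(1, 2)]            # candidate strategy of agent 2
v1 = F(1)                          # candidate value for agent 1

# Slack of each best-response inequality: v1 - sum_i a_{j,i} x_{2,i}
slacks = [v1 - sum(F(a) * p for a, p in zip(row, x2)) for row in U1]
feasible = all(s >= 0 for s in slacks)                        # 1 v1 - U1 x2 >= 0
complementary = sum(p * s for p, s in zip(x1, slacks)) == 0   # x1^T (...) = 0
```

Both played actions attain the value v_1 = 1, so the complementarity product vanishes; raising v_1 above 1 would keep feasibility but break complementarity.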
117 Equilibrium constraints: formulations (1)

LCP formulation

    1 v_1 - U_1 x_2 >= 0
    1 v_2 - U_2 x_1 >= 0
    x_1 >= 0
    x_2 >= 0
    1^T x_1 = 1
    1^T x_2 = 1
    v_1 free in sign
    v_2 free in sign
    x_1^T (1 v_1 - U_1 x_2) = 0
    x_2^T (1 v_2 - U_2 x_1) = 0
118 Mixed integer linearization

LC constraints

    x_1^T (1 v_1 - U_1 x_2) = 0

Linearization
    We introduce a vector s_1 of binary variables
    When s_{1,j} = 1, action j is in the support of agent 1; it is not otherwise
    An action j can be played with positive probability only if it is in the support: x_1 <= s_1
    If an action is in the support, it must be a best response (i.e., it must return v_1); calling U_1^max the largest entry of U_1:

        1 v_1 - U_1 x_2 - U_1^max (1 - s_1) <= 0

    If s_{1,j} = 1, then v_1 = (U_1 x_2)_j, given that 1 v_1 - U_1 x_2 >= 0
    If s_{1,j} = 0, the constraint is always satisfied, U_1^max being not smaller than v_1
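The big-M behaviour of the linearized constraint can be traced on a tiny instance: with s_{1,j} = 1 the constraint pins action j to the value v_1, and with s_{1,j} = 0 the U_1^max term makes it vacuous. A sketch with made-up numbers:

```python
from fractions import Fraction as F

U1 = [[4, 0],
      [1, 3]]                          # hypothetical payoffs of agent 1
Umax = max(max(row) for row in U1)     # the constant U_1^max = 4
x2 = [F(1, 2), F(1, 2)]

# Expected utility of each action against x2 (both equal 2 here)
ev = [sum(F(a) * p for a, p in zip(row, x2)) for row in U1]
v1 = max(ev)                           # value of a best response

# Left-hand side of the linearized constraint 1 v1 - U1 x2 - Umax (1 - s1) <= 0
def lhs(s1):
    return [v1 - ev[j] - Umax * (1 - s1[j]) for j in range(2)]

tight = lhs([1, 1])    # both actions in the support: constraints bind at 0
slack = lhs([1, 0])    # action 2 out of the support: its constraint goes slack
```

In the MILP, a solver chooses s_1 and s_2; here they are fixed by hand only to show which constraints bind.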
119 Equilibrium constraints: formulations (2)

MILP formulation

    1 v_1 - U_1 x_2 >= 0
    1 v_2 - U_2 x_1 >= 0
    x_1 >= 0
    x_2 >= 0
    1^T x_1 = 1
    1^T x_2 = 1
    x_1 <= s_1
    x_2 <= s_2
    v_1 free in sign
    v_2 free in sign
    s_1 binary vector
    s_2 binary vector
    1 v_1 - U_1 x_2 - U_1^max (1 - s_1) <= 0
    1 v_2 - U_2 x_1 - U_2^max (1 - s_2) <= 0
120 Equilibrium constraints: formulations (3)

Checking the equilibrium existence with a given support
Given s_1 and s_2, the mathematical problem is linear:

    1 v_1 - U_1 x_2 >= 0
    1 v_2 - U_2 x_1 >= 0
    x_1 >= 0
    x_2 >= 0
    1^T x_1 = 1
    1^T x_2 = 1
    x_1 <= s_1
    x_2 <= s_2
    v_1 free in sign
    v_2 free in sign
    1 v_1 - U_1 x_2 - U_1^max (1 - s_1) <= 0
    1 v_2 - U_2 x_1 - U_2^max (1 - s_2) <= 0
121 Equilibrium constraints: formulations (4)

Support enumeration
    All the supports are scanned according to some heuristics and pruning techniques
    For each support, the algorithm checks the existence of an equilibrium by linear programming

Comparison between formulations
    LCP: the problem can be solved by the Lemke-Howson algorithm (the solution space is O(2.6^n), where n is the number of actions per agent)
    MILP: the problem can be solved by MILP solvers (this allows one to find optimal equilibria)
    Support enumeration: the problem can be solved by repeatedly calling LP solvers (the solution space is O(4^n), where n is the number of actions per agent)
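For 2x2 bimatrix games the support-enumeration loop can be written out in full: in nondegenerate games only support pairs of equal size matter, pure-pure pairs are checked directly for mutual best response, and the full-support pair is found by solving the two one-variable indifference equations in place of the LP. A sketch (the function name is hypothetical; U2 follows the slides' n_2 x n_1 convention; the matching-pennies instance is illustrative):

```python
from fractions import Fraction as F
from itertools import product

def support_enumeration_2x2(U1, U2):
    """Support enumeration for a 2x2 bimatrix game.  U1 is n1 x n2; U2 is
    n2 x n1 (rows indexed by agent 2's actions).  For nondegenerate games
    it suffices to scan support pairs of equal size: the four pure pairs,
    plus the full-support pair solved via indifference conditions."""
    U1 = [[F(v) for v in row] for row in U1]
    U2 = [[F(v) for v in row] for row in U2]
    equilibria = []
    for i, j in product(range(2), range(2)):     # size-1 supports
        if U1[i][j] >= U1[1 - i][j] and U2[j][i] >= U2[1 - j][i]:
            equilibria.append(([F(k == i) for k in range(2)],
                               [F(k == j) for k in range(2)]))
    # size-2 supports: each agent mixes so the opponent is indifferent
    dy = (U1[0][0] - U1[1][0]) + (U1[1][1] - U1[0][1])
    dx = (U2[0][0] - U2[1][0]) + (U2[1][1] - U2[0][1])
    if dy != 0 and dx != 0:
        y0 = (U1[1][1] - U1[0][1]) / dy          # makes agent 1 indifferent
        x0 = (U2[1][1] - U2[0][1]) / dx          # makes agent 2 indifferent
        if 0 < y0 < 1 and 0 < x0 < 1:
            equilibria.append(([x0, 1 - x0], [y0, 1 - y0]))
    return equilibria

# Matching pennies as a bimatrix game: the unique equilibrium mixes 50/50
eqs = support_enumeration_2x2([[1, -1], [-1, 1]],
                              [[-1, 1], [1, -1]])
```

The general algorithm replaces the closed-form indifference step with the LP of the previous slides, one solve per candidate support pair.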
123 Lemke-Howson

Basic idea
    It works with the linear complementarity formulation
    An auxiliary problem is generated from the original problem
    A solution of the auxiliary problem is found
    A solution of the original problem is derived from the solution of the auxiliary problem

Algorithm characteristics
    Like the simplex algorithm, it is based on tableaux and basis exchange by pivoting
    Unlike the simplex algorithm, it employs a specific pivoting rule, called complementary pivoting
124 Auxiliary problem (1)

Original problem

    1 v_1 - U_1 x_2 >= 0
    1 v_2 - U_2 x_1 >= 0
    x_1 >= 0
    x_2 >= 0
    1^T x_1 = 1
    1^T x_2 = 1
    v_1 free in sign
    v_2 free in sign
    x_1^T (1 v_1 - U_1 x_2) = 0
    x_2^T (1 v_2 - U_2 x_1) = 0

Transformation
    Under the assumption that v_i > 0, rescale the strategies as x_1 := x_1 / v_2 and x_2 := x_2 / v_1
    If U_i > 0 (entrywise), we have v_i > 0; by adding a constant value to every entry of an arbitrary U_i we obtain an equivalent game with positive U_i and, thus, we can always apply the transformation
127 Auxiliary problem (2)

New problem

    U_1 x_2 <= 1
    U_2 x_1 <= 1
    x_1 >= 0
    x_2 >= 0
    1^T x_1 = 1/v_2
    1^T x_2 = 1/v_1
    v_1 > 0
    v_2 > 0
    v_1 v_2 x_1^T (1 - U_1 x_2) = 0
    v_1 v_2 x_2^T (1 - U_2 x_1) = 0

Simplification
    The constraints v_1 v_2 x_i^T (1 - U_i x_{-i}) = 0 are equivalent to x_i^T (1 - U_i x_{-i}) = 0 when v_1, v_2 > 0
129 Auxiliary problem (3)

New problem

    U_1 x_2 <= 1
    U_2 x_1 <= 1
    x_1 >= 0
    x_2 >= 0
    1^T x_1 = 1/v_2
    1^T x_2 = 1/v_1
    v_1 > 0
    v_2 > 0
    x_1^T (1 - U_1 x_2) = 0
    x_2^T (1 - U_2 x_1) = 0

Simplification
    The constraints 1^T x_i = 1/v_{-i} can be dropped: they always hold for a suitable v_{-i} > 0, except when x_i = 0
131 Auxiliary problem (4)

New problem

    U_1 x_2 <= 1
    U_2 x_1 <= 1
    x_1 >= 0
    x_2 >= 0
    x_1^T (1 - U_1 x_2) = 0
    x_2^T (1 - U_2 x_1) = 0

The above problem is equivalent to the original problem, except that it admits an additional solution (0, 0)

Definition
    Solution (0, 0) is called the artificial solution
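The rescaling behind this system can be checked on a concrete equilibrium: with entrywise-positive payoffs, dividing x_1 by v_2 and x_2 by v_1 yields a nonzero solution of the auxiliary problem, with zero slack on the support. A sketch on a hypothetical symmetric 2x2 game (all data made up):

```python
from fractions import Fraction as F

U1 = [[3, 1],
      [1, 3]]                      # agent 1's payoffs, entrywise positive
U2 = [[3, 1],
      [1, 3]]                      # agent 2's payoffs (n2 x n1 convention)
x1 = [F(1, 2), F(1, 2)]            # an equilibrium strategy of agent 1
x2 = [F(1, 2), F(1, 2)]            # an equilibrium strategy of agent 2

# Equilibrium values v1 = x1^T U1 x2 and v2 = x2^T U2 x1 (both equal 2 here)
v1 = sum(x1[i] * sum(F(U1[i][j]) * x2[j] for j in range(2)) for i in range(2))
v2 = sum(x2[j] * sum(F(U2[j][i]) * x1[i] for i in range(2)) for j in range(2))

x1p = [xi / v2 for xi in x1]       # rescaled strategies of the auxiliary problem
x2p = [xj / v1 for xj in x2]
r1 = [1 - sum(F(U1[i][j]) * x2p[j] for j in range(2)) for i in range(2)]
r2 = [1 - sum(F(U2[j][i]) * x1p[i] for i in range(2)) for j in range(2)]
# U1 x2p <= 1 and U2 x1p <= 1 hold, with zero slack on every supported action
```

Renormalizing x1p and x2p to sum to one recovers the original equilibrium, which is how a solution of the auxiliary problem is mapped back.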
133 Auxiliary problem (5)

New problem with slack variables

    r_1 + U_1 x_2 = 1
    r_2 + U_2 x_1 = 1
    x_1 >= 0
    x_2 >= 0
    r_1 >= 0
    r_2 >= 0
    x_1^T r_1 = 0
    x_2^T r_2 = 0

Complementary variables
    Variables x_{i,j} and r_{i,j} are said to be complementary

Complementary solution
    In a solution of the above problem, at least one variable in every pair of complementary variables is equal to 0
136 Labeling (1)

Labels
    We assign each variable x_{i,j}, r_{i,j} the label (i, j)
    There are n_1 + n_2 different labels
    The set of labels L(s) associated with a solution s is composed of the labels of all the variables whose value is 0

Example

    s:  x_{1,1} = 0.3   x_{2,1} = 0     r_{1,1} = 0     r_{2,1} = 0.2
        x_{1,2} = 0.7   x_{2,2} = 0.2   r_{1,2} = 0.2   r_{2,2} = 0.6
        x_{1,3} = 0     x_{2,3} = 0.8   r_{1,3} = 0.1   r_{2,3} = 0

    L(s) = {(1,1), (1,3), (2,1), (2,3)}

Example

    s:  x_{1,1} = 0     x_{2,1} = 0     r_{1,1} = 0.2   r_{2,1} = 0.2
        x_{1,2} = 0     x_{2,2} = 0     r_{1,2} = 0.2   r_{2,2} = 0.6
        x_{1,3} = 0     x_{2,3} = 0     r_{1,3} = 0.8   r_{2,3} = 0.4

    L(s) = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)}

Labels and solutions
    A completely labelled solution is a complementary solution
    An almost completely labelled solution s has the property that L(s) contains all the labels except one
    Every vertex of the best response polytope on a simplex Delta_{n_i} has n_i labels (or more if the game is degenerate)
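The label set of a solution is mechanical to compute: collect (i, j) for every zero-valued variable. A sketch reproducing the first example above (encoding the variables as tuples is an implementation choice, not from the slides):

```python
# A solution of the auxiliary problem, keyed by (kind, agent, action)
s = {
    ("x", 1, 1): 0.3, ("x", 1, 2): 0.7, ("x", 1, 3): 0.0,
    ("x", 2, 1): 0.0, ("x", 2, 2): 0.2, ("x", 2, 3): 0.8,
    ("r", 1, 1): 0.0, ("r", 1, 2): 0.2, ("r", 1, 3): 0.1,
    ("r", 2, 1): 0.2, ("r", 2, 2): 0.6, ("r", 2, 3): 0.0,
}

# L(s): labels of all variables whose value is 0
labels = sorted({(i, j) for (_, i, j), value in s.items() if value == 0})

all_labels = [(i, j) for i in (1, 2) for j in (1, 2, 3)]
completely_labelled = labels == all_labels              # needs all n1 + n2 labels
almost_completely_labelled = len(set(all_labels) - set(labels)) == 1
```

This solution carries only four of the six labels, so it is neither completely nor almost completely labelled.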
140 Labeling (2)

[Figure: the two best response polytopes, one per agent, with every vertex and facet annotated with its labels among (1,1), (1,2), (1,3), (2,1), (2,2), (2,3)]
142 Moving on almost complementary solutions (1)

Vertex neighbors
    Every vertex of the best response polytope on a simplex Delta_{n_i} has n_i neighbors (or more if the game is degenerate)
    Each neighbor differs by exactly one label
    All the neighbors of a vertex differ by different labels (except for degenerate games)

Almost complementary solution neighbors
    Given an almost complementary solution s such that label l does not appear in L(s), there are two neighbors s', s'' whose label sets also lack l
    s' is obtained by moving on the best response polytope of agent 1
    s'' is obtained by moving on the best response polytope of agent 2
    s' and s'' can be either almost complementary solutions or complementary solutions
144 Moving on almost complementary solutions (2)

Almost complementary solution paths
    Given an almost complementary solution s lacking label l and a direction, it is possible to traverse a path of almost complementary solutions lacking label l
    These paths can have terminals in complementary solutions, or be cycles
    If we start from a complementary solution and move along a path of almost complementary solutions lacking label l, we reach a complementary solution
    The artificial solution (0, 0) is a complementary solution, and therefore it can be used as the starting point
146 Lemke Howson algorithm
Algorithm
1 The initial solution s is the artificial solution (0, 0)
2 Select a label l = (i, j) and move to the neighbor s' of s such that l is not in L(s')
3 if s' is complementary, return it
4 move along the almost complementary path lacking label l
5 go to Step 3
Comments
There are n_1 + n_2 different initializations of the algorithm
Each initialization leads to a different path
Along the almost complementary path, the selection of the next neighbor can be done by removing the label that appears twice
147-151 Lemke Howson: graphical representation [Figure, shown over five slides: the successive steps of the Lemke Howson path on the labeled best-response polytopes, from the artificial solution (0, 0) to a complementary solution]
153 Comments on Lemke Howson algorithm
Comments on pivoting
The movement between neighbors can be done by pivoting, as in the simplex algorithm
The specific pivoting rule adopted is called complementary pivoting
Complementary pivoting is a heuristic to move on the vertices of the best response polytopes
Comments on complexity
The complexity of the algorithm is measured in terms of the number of pivoting steps
It is possible to generate games in which all the paths have length exponential in the number of the agents' actions
In general, paths can have extremely different lengths, and there is usually some short path
154 Lemke Howson algorithm: tableaux implementation (1)
Tableaux
On agent 1's simplex (slack variables r_{2,1}, ..., r_{2,n} initially in the basis, columns x_{1,1}, ..., x_{1,n}, constants 1):
r_{2,j} + U_2(j, 1) x_{1,1} + ... + U_2(j, n) x_{1,n} = 1, for j = 1, ..., n
On agent 2's simplex (slack variables r_{1,1}, ..., r_{1,n} initially in the basis, columns x_{2,1}, ..., x_{2,n}, constants 1):
r_{1,i} + U_1(i, 1) x_{2,1} + ... + U_1(i, n) x_{2,n} = 1, for i = 1, ..., n
156 Lemke Howson algorithm: tableaux implementation (2)
Comments
Variables r_{i,j} are in the basis and have positive values (= 1)
Variables x_{i,j} are out of the basis and thus have value zero
The basic solution is feasible, all the basic variables being positive
Each variable out of the basis gives a label
Algorithm initialization
Select arbitrarily a variable x_{i,j} to enter the basis (e.g., x_{1,1})
Select the variable r_{i,j} to leave the basis by the minimum ratio test:
Consider the column vector of coefficients associated with the entering variable
Collect all the strictly positive coefficients c_k, where k is the row, and the corresponding constants q_k (q is the vector of constants)
Select the row k such that k = arg min{q_k / c_k} (remember that, by construction, c_k > 0 and q_k >= 0)
The leaving variable is the basic variable in position k
Make a pivoting step
157 Lemke Howson algorithm: tableaux implementation (3)
Complementary pivoting
The variable entering the basis is the complement of the variable that has just left the basis
Pivoting is repeated until the basis is complementary (i.e., until the variable carrying the initially dropped label leaves the basis)
Once a solution is found, normalize: x_1 = x_1 / (1^T x_1), x_2 = x_2 / (1^T x_2)
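The tableaux implementation above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it uses exact rational arithmetic and the plain (non-lexicographic) minimum ratio test, so it assumes a nondegenerate game with strictly positive payoffs; the function name and the test games are hypothetical.

```python
from fractions import Fraction

def lemke_howson(A, B, k=0):
    """Complementary pivoting on the two best-response polytopes.
    A, B: payoff matrices of agents 1 and 2 (all entries > 0 so the
    polytopes are bounded); k: the initially dropped label.
    Labels: x_i and r_i share label i; y_j and s_j share label m + j,
    so entering "the complement" means entering the same label in the
    other tableau.  Returns the normalized mixed strategies (x, y)."""
    m, n = len(A), len(A[0])
    F = Fraction
    # Tableau 1: B^T x + s = 1 (rows j; the slacks s_j are initially basic).
    T1 = [[F(B[i][j]) for i in range(m)]
          + [F(1) if jj == j else F(0) for jj in range(n)] + [F(1)]
          for j in range(n)]
    # Tableau 2: r + A y = 1 (rows i; the slacks r_i are initially basic).
    T2 = [[F(1) if ii == i else F(0) for ii in range(m)]
          + [F(A[i][j]) for j in range(n)] + [F(1)]
          for i in range(m)]
    basis1, basis2 = [m + j for j in range(n)], list(range(m))

    def pivot(T, basis, col):
        # minimum ratio test over the strictly positive coefficients
        r = min((r for r in range(len(T)) if T[r][col] > 0),
                key=lambda r: T[r][-1] / T[r][col])
        piv = T[r][col]
        T[r] = [v / piv for v in T[r]]
        for s in range(len(T)):
            if s != r and T[s][col] != 0:
                f = T[s][col]
                T[s] = [a - f * b for a, b in zip(T[s], T[r])]
        leaving, basis[r] = basis[r], col
        return leaving

    tabs, t, label = ((T1, basis1), (T2, basis2)), (0 if k < m else 1), k
    while True:
        label = pivot(*tabs[t], label)
        if label == k:      # the dropped label came back: basis complementary
            break
        t = 1 - t           # enter the complement in the other tableau
    x, y = [F(0)] * m, [F(0)] * n
    for r, b in enumerate(basis1):
        if b < m:
            x[b] = T1[r][-1]
    for r, b in enumerate(basis2):
        if b >= m:
            y[b - m] = T2[r][-1]
    return [v / sum(x) for v in x], [w / sum(y) for w in y]
```

On a 2x2 game whose unique equilibrium is fully mixed, the returned strategies are the exact rational probabilities, with no floating-point error.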
159 Example (1)
Initial tableaux: two 3x3 tableaux, one with basis r_{2,1}, r_{2,2}, r_{2,3} and columns x_{1,1}, x_{1,2}, x_{1,3}, one with basis r_{1,1}, r_{1,2}, r_{1,3} and columns x_{2,1}, x_{2,2}, x_{2,3} [the numeric payoff entries were lost in transcription]
Choosing the first entering variable
We choose x_{1,1}
160 Example (2)
Minimum ratio test on the x_{1,1} column of the first tableau
Only the coefficients corresponding to rows r_{2,1} and r_{2,3} are strictly positive, 3 and 2 respectively
r_{2,1}: 1/3 ≈ 0.33
r_{2,3}: 1/2 = 0.5
The leaving variable is r_{2,1} (given that 1/3 < 1/2)
162 Example (3)
Pivoting: x_{1,1} replaces r_{2,1} in the basis of the first tableau [the tableau entries were lost in transcription]
Next entering variable
Since the previous leaving variable was r_{2,1}, the next entering variable is its complement x_{2,1}
Minimum ratio test on the x_{2,1} column of the second tableau:
r_{1,1}: 1/2 = 0.5
r_{1,3}: 1/3 ≈ 0.33 (leaving variable)
164 Example (4)
Pivoting: x_{2,1} replaces r_{1,3} in the basis of the second tableau [the tableau entries were lost in transcription]
Next entering variable
Since the previous leaving variable was r_{1,3}, the next entering variable is its complement x_{1,3}
Minimum ratio test on the x_{1,3} column of the first tableau:
x_{1,1}: 1/1 = 1
r_{2,3}: 3/3 = 1
The two ratios tie: this is the degenerate case discussed next
165 Degeneracy
Interpretation
When the minimum ratio test returns two possible leaving variables, removing either one leads to the same vertex
At such a vertex, we get multiple extra labels (one for each variable satisfying the minimum ratio test)
This is a drawback for the algorithm, because it allows for multiple choices at each step
With multiple choices, path following is no longer well defined, and the completeness of the algorithm is not assured
166 Degeneracy: graphical interpretation [Figure: a degenerate vertex on the labeled best-response polytopes]
168 Removing degeneracy (1)
Lexico positiveness
A vector y is said to be lexico positive if, scanning its components y_j from j = 1 on, the first non null y_j is positive
Example (the original vectors were lost in transcription; illustrative replacements)
y_1 = (0, 0, 2, -1) and y_3 = (1, 0, 0, 0) are lexico positive, while y_2 = (0, -1, 3, 0) is not
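The definition can be checked mechanically; a minimal sketch (the function name is hypothetical):

```python
def lexico_positive(y):
    """True iff the first nonzero component of y, scanned from the left,
    is positive; the all-zero vector is not lexico positive."""
    for v in y:
        if v != 0:
            return v > 0
    return False
```

Note that the all-zero vector is neither lexico positive nor lexico negative.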
169 Removing degeneracy (2)
Symbolic perturbation
In order to break ties, it is possible to introduce a symbolic perturbation in ε, with ε → 0+, over the constants
Call ε = [ε ε^2 ... ε^n]^T
Constraints can be perturbed as follows:
r_2 + U_2 x_1 = 1 becomes r_2 + U_2 x_1 = 1 + Q_1 ε
r_1 + U_1 x_2 = 1 becomes r_1 + U_1 x_2 = 1 + Q_2 ε
If Q_1 and Q_2 are full rank n matrices, then ties are not possible
170-172 Perturbation: graphical interpretation [Figure, shown over three slides: the effect of the symbolic perturbation on the degenerate vertex of the labeled best-response polytopes]
173 Symbolic perturbation and tableaux
Perturbed tableaux
On agent 1's simplex (basis r_{2,1}, ..., r_{2,n}, columns x_{1,1}, ..., x_{1,n}, perturbed constants):
r_{2,j} + U_2(j, 1) x_{1,1} + ... + U_2(j, n) x_{1,n} = 1 + Q_1(j, 1) ε + ... + Q_1(j, n) ε^n
On agent 2's simplex (basis r_{1,1}, ..., r_{1,n}, columns x_{2,1}, ..., x_{2,n}, perturbed constants):
r_{1,i} + U_1(i, 1) x_{2,1} + ... + U_1(i, n) x_{2,n} = 1 + Q_2(i, 1) ε + ... + Q_2(i, n) ε^n
174 Basis exchange with symbolic perturbation
Lexico minimum ratio test
Instead of considering only the constant q_k, the entire perturbed row of constants (q, Q)_k is considered
The minimum ratio test is applied lexicographically, from the ε^0 coefficient (corresponding to q_k) onward
The row k attaining the lexico minimum ratio indicates the leaving variable
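The lexico minimum ratio test can be sketched as follows (an illustrative helper assuming exact rational arithmetic; the function name and data layout are hypothetical):

```python
from fractions import Fraction

def lexico_min_ratio(column, rhs):
    """Row selection for the perturbed tableau.  column[k] is the entering
    column's coefficient c_k; rhs[k] is the row of perturbed constants
    [q_k, Q_k1, ..., Q_kn] (the coefficients of eps^0, eps^1, ...).
    Among rows with c_k > 0, pick the lexicographically smallest ratio
    vector rhs[k] / c_k; with independent perturbation rows, no two ratio
    vectors are equal, so the choice is unique."""
    candidates = [k for k in range(len(column)) if column[k] > 0]
    return min(candidates,
               key=lambda k: [Fraction(q) / column[k] for q in rhs[k]])
```

When the ε^0 ratios tie, the comparison simply continues on the ε columns, which is exactly how the tie in the previous example is resolved.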
175 Example (1)
Initial tableaux: the two tableaux of the previous example, each augmented with the perturbation columns ε, ..., ε^n [the numeric entries were lost in transcription]
176 Example (2)
After two pivoting steps, the first tableau has basis x_{1,1}, r_{2,2}, r_{2,3} [the numeric entries were lost in transcription]
Lexico minimum ratio test
The ratio vectors of x_{1,1} and r_{2,3} tie on the ε^0 component; the tie is broken lexicographically on the ε columns, and r_{2,3} is the leaving variable
177 Exception
Lexico minimum ratio test and equilibria
If the variable associated with the label that appears twice satisfies the (non lexico) minimum ratio test, then remove it
Otherwise, apply the lexico minimum ratio test
This exception is not necessary, but it improves the efficiency of the algorithm
178 Lemke algorithm (1)
Standard LCP
w, z >= 0
w = M z + q
z^T w = 0
Basic idea
The Lemke algorithm can be used to solve standard LCPs
The algorithm works as follows:
Build a tableau containing the linear constraints of the LCP
Augment the tableau with an artificial variable z_0 such that w = M z + d z_0 + q, where d is called the covering vector
z_0 enters the basis, and the leaving variable is the w_i such that q_i is the minimum (most negative) value
Apply complementary pivoting until z_0 leaves the basis
179 Lemke algorithm (2)
Original tableau
Basis w_1, ..., w_n; row i reads w_i - m_{i,1} z_1 - ... - m_{i,n} z_n = q_i
When q >= 0, a trivial solution is w = q and z = 0
Augmented tableau
Row i reads w_i - m_{i,1} z_1 - ... - m_{i,n} z_n - d_i z_0 = q_i
When q has negative components, w = q is not a feasible basic solution
180 Lemke algorithm (3)
Initial tableau (1)
z_0 is the entering variable
w_j with j = arg min{q_j} is the leaving variable (say, w_n)
By a pivoting step on the z_0 column of row n:
row i (i < n): the constant becomes q_i - q_n d_i / d_n, the z_j coefficient becomes -m_{i,j} + m_{n,j} d_i / d_n, the w_n coefficient becomes -d_i / d_n, and the z_0 column becomes 0
row n (now the z_0 row): -w_n / d_n + (m_{n,1} / d_n) z_1 + ... + (m_{n,n} / d_n) z_n + z_0 = -q_n / d_n
182 Lemke algorithm (4)
Initial tableau (2)
By choosing an appropriate d, the pivot yields a feasible basic solution of the augmented tableau
Lemke uses d = 1
-q_n / d_n is strictly positive, q_n being strictly negative
q_i - q_n d_i / d_n is nonnegative, since q_n is the most negative constant (q_n <= q_i)
Complementary pivoting
Variables z_i, w_i are said to be complementary
According to complementary pivoting, the entering variable at each step is the complement of the variable that left at the previous step (e.g., if w_n has left the basis, z_n is the next entering variable)
Complementary pivoting is applied until z_0 leaves the basis
184 Lemke algorithm (5)
Comments
The Lemke algorithm follows a path of almost complementary solutions
If the entering variable cannot enter the basis because no leaving variable satisfies the minimum ratio test, the algorithm stops (ray termination)
In general, an LCP may not admit any solution
Completeness
The Lemke algorithm is not assured to be complete: it can end in ray termination even if the problem admits a solution
When the problem satisfies some properties, the algorithm is assured to be complete
A sufficient condition is the satisfaction of the following two properties:
z^T M z >= 0 for every z >= 0
z^T q >= 0 for every z >= 0 such that z^T M z = 0 and M z >= 0
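The steps above can be sketched for a generic LCP. This is a minimal illustration with covering vector d = 1, exact rationals, and the plain (non-lexicographic) ratio test, so a nondegenerate problem is assumed; it is not the authors' implementation, and the test LCP is hypothetical.

```python
from fractions import Fraction

def lemke(M, q):
    """Lemke's algorithm for the LCP  w = M z + q, w, z >= 0, z^T w = 0,
    with covering vector d = 1.  Returns z, or None on ray termination."""
    n = len(q)
    F = Fraction
    if all(qi >= 0 for qi in q):
        return [F(0)] * n               # trivial solution w = q, z = 0
    # Columns: w_0..w_{n-1} -> 0..n-1, z_0..z_{n-1} -> n..2n-1, z0 -> 2n.
    # Rows encode w - M z - 1 z0 = q, with the w's initially basic.
    T = [[F(1) if c == i else F(0) for c in range(n)]
         + [F(-M[i][j]) for j in range(n)] + [F(-1), F(q[i])]
         for i in range(n)]
    basis = list(range(n))

    def pivot(r, col):
        piv = T[r][col]
        T[r] = [v / piv for v in T[r]]
        for s in range(n):
            if s != r and T[s][col] != 0:
                f = T[s][col]
                T[s] = [a - f * b for a, b in zip(T[s], T[r])]
        leaving, basis[r] = basis[r], col
        return leaving

    # z0 enters; the row with the most negative constant leaves, which
    # makes every right-hand side nonnegative.
    leaving = pivot(min(range(n), key=lambda i: T[i][-1]), 2 * n)
    while True:
        entering = leaving + n if leaving < n else leaving - n  # complement
        rows = [r for r in range(n) if T[r][entering] > 0]
        if not rows:
            return None                 # ray termination
        r = min(rows, key=lambda r: T[r][-1] / T[r][entering])
        leaving = pivot(r, entering)
        if leaving == 2 * n:            # z0 left the basis: solution found
            break
    z = [F(0)] * n
    for r, b in enumerate(basis):
        if n <= b < 2 * n:
            z[b - n] = T[r][-1]
    return z
```

For M positive definite (which satisfies the sufficient condition above) the path always terminates in a solution.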
186 Lemke algorithm for bimatrix games
Formulation
z stacks the variables v_1^+, v_1^-, v_2^+, v_2^-, x_1, x_2; M is built from the simplex constraints (blocks 1^T and -1^T) and the payoff blocks involving U_1 and U_2; q collects the corresponding constants [the exact block layout was lost in transcription]
Comments
We have v_i = v_i^+ - v_i^-
When U_i < 0, M and q are such that the properties assuring the completeness of the algorithm are satisfied
187 Lemke algorithm for polymatrix games (1)
Expected utility with polymatrix games
Agent i's expected utility can be expressed as the sum over opponents j of x_i^T U_{i,j} x_j
This linearity can be exploited to design efficient algorithms
Formulation for three agent games
z stacks v_1^+, v_1^-, v_2^+, v_2^-, v_3^+, v_3^-, x_1, x_2, x_3; M is built from the simplex constraints (blocks 1^T and -1^T) and the pairwise payoff blocks U_{1,2}, U_{1,3}, U_{2,1}, U_{2,3}, U_{3,1}, U_{3,2}; q collects the corresponding constants [the exact block layout was lost in transcription]
With more agents the formulation is similar
188 Lemke algorithm for polymatrix games (2)
Comments
Exactly as in the bimatrix case, in the polymatrix case M and q satisfy the properties that assure the completeness of the Lemke algorithm
The Lemke algorithm can therefore be used efficiently
Non polymatrix game approximation via polymatrix games
Some algorithms find approximate Nash equilibria of non polymatrix games by approximating these games with polymatrix games
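The pairwise structure of the expected utility can be illustrated directly. A minimal sketch, with a hypothetical encoding in which payoff matrices are indexed by ordered agent pairs:

```python
from fractions import Fraction

def polymatrix_utility(i, U, x):
    """Agent i's expected utility in a polymatrix game.
    U[(i, j)] is i's payoff matrix against agent j; x[k] is agent k's
    mixed strategy.  The utility is the sum of the bimatrix terms
    x_i^T U_{i,j} x_j, one per opponent, hence linear in each single x_j."""
    total = Fraction(0)
    for (a, j), Uij in U.items():
        if a != i:
            continue
        for s, ps in enumerate(x[i]):
            for t, pt in enumerate(x[j]):
                total += Fraction(ps) * Uij[s][t] * pt
    return total
```

Storing one matrix per ordered pair keeps the representation polynomial in the number of agents, whereas a general game's payoff tensor grows exponentially.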
190 Computing a correlated equilibrium
Correlated equilibrium and recommendation
Each game admits at least one correlated equilibrium, in which the random variable is interpreted as a recommendation to each agent of what action to play
We can search for a correlated equilibrium in which, when agent i is recommended to play a given action, she cannot gain more by deviating from the recommendation
Formulation
sum over a_{-i} of π(a_i, a_{-i}) U_i(a_i, a_{-i}) >= sum over a_{-i} of π(a_i, a_{-i}) U_i(a_i', a_{-i}), for all a_i, a_i' in A_i and all i in N
sum over a in A of π(a) = 1
π(a) >= 0 for all a in A
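The obedience constraints of the formulation above can be checked directly for a given joint distribution. An illustrative sketch for two agents; the game used in the test is the classic game of chicken, not an example from the slides:

```python
from fractions import Fraction

def is_correlated_eq(U1, U2, pi):
    """Check the incentive (obedience) constraints of a correlated
    equilibrium for a two-agent game.  pi[a1][a2] is the probability
    that the mediator recommends the outcome (a1, a2)."""
    m, n = len(U1), len(U1[0])
    for a1 in range(m):      # row agent: obeying a1 beats any deviation d1
        for d1 in range(m):
            if sum(pi[a1][a2] * (U1[a1][a2] - U1[d1][a2])
                   for a2 in range(n)) < 0:
                return False
    for a2 in range(n):      # column agent, symmetrically
        for d2 in range(n):
            if sum(pi[a1][a2] * (U2[a1][a2] - U2[a1][d2])
                   for a1 in range(m)) < 0:
                return False
    return True
```

These same inequalities, with π as the variables, are exactly the LP constraints: checking a candidate is evaluation, finding one is linear programming.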
191 Computing a correlated equilibrium (2)
Comments
The computation of a correlated equilibrium can be done in polynomial time, the problem being an LP
The possibility to correlate allows for joint randomization over all the outcomes
With a Nash equilibrium, the randomization over the outcomes is induced by the product of the randomizations over the single agents' actions
The product of multiple strategies leads to non linear constraints
192 Formulation (1)
Bi level maximization problem
max over x_l of x_l^T U_l x_f, subject to 1^T x_l = 1, x_l >= 0, and x_f solving the follower's problem: max over x_f of x_l^T U_f x_f, subject to 1^T x_f = 1, x_f >= 0
Multi LP
max over j of u_j, where u_j = max over x_l of x_l^T U_l e_j, subject to 1 (x_l^T U_f e_j) >= U_f^T x_l, 1^T x_l = 1, x_l >= 0
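The multi LP idea can be made concrete for a two-action leader, where each LP is one-dimensional: the leader strategies making a given follower action a best response form an interval in p = Pr(first leader action), and the leader's linear utility over that interval is maximized at an endpoint. An illustrative sketch, assuming the standard tie-breaking in the leader's favor; the function name is hypothetical and the test game is a textbook commitment example, not the slides' example:

```python
from fractions import Fraction

def stackelberg_2xN(Ul, Uf):
    """Leader-follower equilibrium for a 2-action leader.  For each
    follower action j, intersect the half-lines where j is a (weak) best
    response, then evaluate the leader's linear utility at the interval
    endpoints.  Returns (leader value, Pr(action 0), follower action)."""
    n = len(Ul[0])
    best = None
    for j in range(n):
        lo, hi = Fraction(0), Fraction(1)
        feasible = True
        for k in range(n):
            if k == j:
                continue
            a = Fraction(Uf[0][j] - Uf[0][k])  # gap if leader plays action 0
            b = Fraction(Uf[1][j] - Uf[1][k])  # gap if leader plays action 1
            # j weakly beats k iff p*a + (1-p)*b >= 0, i.e. p*(a-b) >= -b
            if a == b:
                feasible = feasible and b >= 0
            elif a > b:
                lo = max(lo, -b / (a - b))
            else:
                hi = min(hi, -b / (a - b))
        if not feasible or lo > hi:
            continue                    # j is never a best response
        for p in (lo, hi):              # linear objective: check endpoints
            v = p * Ul[0][j] + (1 - p) * Ul[1][j]
            if best is None or v > best[0]:
                best = (v, p, j)
    return best
```

The optimum typically sits at an indifference point of the follower, which is why commitment can strictly beat every Nash equilibrium payoff of the leader.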
193 Example
Payoff matrices U_l and U_f, and a plot of agent l's expected utility as a function of her strategy (x_{l,1}, x_{l,2}) [the matrices and the figure were lost in transcription]
194 Leader follower equilibrium constraints (1)
Follower's best response constraints
1 v_f - U_f x_l >= 0
x_f >= 0
1^T x_f = 1
v_f free in sign
x_f^T (1 v_f - U_f x_l) = 0
Linearization
1 v_f - U_f x_l >= 0
x_f binary vector
1^T x_f = 1
v_f free in sign
1 v_f - U_f x_l <= M (1 - x_f), with M a sufficiently large constant
195 Leader follower equilibrium constraints (2)
MINLP formulation
max over x_l, x_f, v_f of x_l^T U_l x_f, subject to:
1^T x_l = 1, x_l >= 0
1 v_f - U_f x_l >= 0
x_f binary vector, 1^T x_f = 1, v_f free in sign
1 v_f - U_f x_l <= M (1 - x_f)
Linearization
The bilinear objective x_l^T U_l x_f is replaced by the linear objective 1^T v_l, with additional linear constraints tying v_l to U_l, x_l and x_f [the exact coupling constraints were lost in transcription]
196 Formulation (2)
MILP formulation
max over x_l, x_f, v_f, v_l of 1^T v_l, subject to:
1^T x_l = 1, x_l >= 0
1 v_f - U_f x_l >= 0
x_f binary vector, 1^T x_f = 1, v_f free in sign
1 v_f - U_f x_l <= M (1 - x_f)
linear constraints tying v_l to U_l, x_l and x_f [lost in transcription]
198 Formulation (3)
LP formulation
By resorting to the concept of correlated equilibrium, we obtain: maximize the leader's expected utility, sum over a of π(a) U_l(a), subject to:
sum over a_l of π(a_l, a_f) U_f(a_l, a_f) >= sum over a_l of π(a_l, a_f) U_f(a_l, a_f'), for all a_f, a_f' in A_f
sum over a in A of π(a) = 1
π(a) >= 0 for all a in A
Interpretations
A leader follower equilibrium is an optimal (maximizing the leader's utility) correlated equilibrium without the rationality constraints of the leader
The leader commits to a correlated equilibrium:
Agent l chooses a distribution π(a_l, a_f) over the outcomes
She draws (a_l, a_f) according to the distribution
She recommends agent f to play a_f
She plays a_l
199 LP formulation
Comments
The LP returns a probability distribution over the outcomes such that π(a_l, a_f) = 0 for all but a single a_f
The returned π(a_l, a_f), restricted to that a_f, is exactly the strategy of agent l
The unique a_f such that π(a_l, a_f) != 0 is agent f's best response
Bargaining Solutions in a Social Network
Bargaining Solutions in a Social Network Tanmoy Chakraborty and Michael Kearns Department of Computer and Information Science University of Pennsylvania Abstract. We study the concept of bargaining solutions,
Nonlinear Optimization: Algorithms 3: Interior-point methods
Nonlinear Optimization: Algorithms 3: Interior-point methods INSEAD, Spring 2006 Jean-Philippe Vert Ecole des Mines de Paris [email protected] Nonlinear optimization c 2006 Jean-Philippe Vert,
Simplex method summary
Simplex method summary Problem: optimize a linear objective, subject to linear constraints 1. Step 1: Convert to standard form: variables on right-hand side, positive constant on left slack variables for
Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania
Moral Hazard Itay Goldstein Wharton School, University of Pennsylvania 1 Principal-Agent Problem Basic problem in corporate finance: separation of ownership and control: o The owners of the firm are typically
1 Solving LPs: The Simplex Algorithm of George Dantzig
Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.
The Multiplicative Weights Update method
Chapter 2 The Multiplicative Weights Update method The Multiplicative Weights method is a simple idea which has been repeatedly discovered in fields as diverse as Machine Learning, Optimization, and Game
Introduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.
56:171 Operations Research Midterm Exam Solutions Fall 2001
56:171 Operations Research Midterm Exam Solutions Fall 2001 True/False: Indicate by "+" or "o" whether each statement is "true" or "false", respectively: o_ 1. If a primal LP constraint is slack at the
Linear Programming Notes V Problem Transformations
Linear Programming Notes V Problem Transformations 1 Introduction Any linear programming problem can be rewritten in either of two standard forms. In the first form, the objective is to maximize, the material
Duality in Linear Programming
Duality in Linear Programming 4 In the preceding chapter on sensitivity analysis, we saw that the shadow-price interpretation of the optimal simplex multipliers is a very useful concept. First, these shadow
A Game Theoretic Formulation of the Service Provisioning Problem in Cloud Systems
A Game Theoretic Formulation of the Service Provisioning Problem in Cloud Systems Danilo Ardagna 1, Barbara Panicucci 1, Mauro Passacantando 2 1 Politecnico di Milano,, Italy 2 Università di Pisa, Dipartimento
Oligopoly: Cournot/Bertrand/Stackelberg
Outline Alternative Market Models Wirtschaftswissenschaften Humboldt Universität zu Berlin March 5, 2006 Outline 1 Introduction Introduction Alternative Market Models 2 Game, Reaction Functions, Solution
! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.
Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of
Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling
Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one
Two-Stage Stochastic Linear Programs
Two-Stage Stochastic Linear Programs Operations Research Anthony Papavasiliou 1 / 27 Two-Stage Stochastic Linear Programs 1 Short Reviews Probability Spaces and Random Variables Convex Analysis 2 Deterministic
Chapter 11 Monte Carlo Simulation
Chapter 11 Monte Carlo Simulation 11.1 Introduction The basic idea of simulation is to build an experimental device, or simulator, that will act like (simulate) the system of interest in certain important
IEOR 4404 Homework #2 Intro OR: Deterministic Models February 14, 2011 Prof. Jay Sethuraman Page 1 of 5. Homework #2
IEOR 4404 Homework # Intro OR: Deterministic Models February 14, 011 Prof. Jay Sethuraman Page 1 of 5 Homework #.1 (a) What is the optimal solution of this problem? Let us consider that x 1, x and x 3
INTEGER PROGRAMMING. Integer Programming. Prototype example. BIP model. BIP models
Integer Programming INTEGER PROGRAMMING In many problems the decision variables must have integer values. Example: assign people, machines, and vehicles to activities in integer quantities. If this is
! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.
Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of three
Integrating Benders decomposition within Constraint Programming
Integrating Benders decomposition within Constraint Programming Hadrien Cambazard, Narendra Jussien email: {hcambaza,jussien}@emn.fr École des Mines de Nantes, LINA CNRS FRE 2729 4 rue Alfred Kastler BP
Measuring the Performance of an Agent
25 Measuring the Performance of an Agent The rational agent that we are aiming at should be successful in the task it is performing To assess the success we need to have a performance measure What is rational
UCLA. Department of Economics Ph. D. Preliminary Exam Micro-Economic Theory
UCLA Department of Economics Ph. D. Preliminary Exam Micro-Economic Theory (SPRING 2011) Instructions: You have 4 hours for the exam Answer any 5 out of the 6 questions. All questions are weighted equally.
Chapter 7. Sealed-bid Auctions
Chapter 7 Sealed-bid Auctions An auction is a procedure used for selling and buying items by offering them up for bid. Auctions are often used to sell objects that have a variable price (for example oil)
Discuss the size of the instance for the minimum spanning tree problem.
3.1 Algorithm complexity The algorithms A, B are given. The former has complexity O(n 2 ), the latter O(2 n ), where n is the size of the instance. Let n A 0 be the size of the largest instance that can
Scheduling Shop Scheduling. Tim Nieberg
Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations
24. The Branch and Bound Method
24. The Branch and Bound Method It has serious practical consequences if it is known that a combinatorial problem is NP-complete. Then one can conclude according to the present state of science that no
3. Evaluate the objective function at each vertex. Put the vertices into a table: Vertex P=3x+2y (0, 0) 0 min (0, 5) 10 (15, 0) 45 (12, 2) 40 Max
SOLUTION OF LINEAR PROGRAMMING PROBLEMS THEOREM 1 If a linear programming problem has a solution, then it must occur at a vertex, or corner point, of the feasible set, S, associated with the problem. Furthermore,
Minimizing costs for transport buyers using integer programming and column generation. Eser Esirgen
MASTER STHESIS Minimizing costs for transport buyers using integer programming and column generation Eser Esirgen DepartmentofMathematicalSciences CHALMERS UNIVERSITY OF TECHNOLOGY UNIVERSITY OF GOTHENBURG
Adaptive Linear Programming Decoding
Adaptive Linear Programming Decoding Mohammad H. Taghavi and Paul H. Siegel ECE Department, University of California, San Diego Email: (mtaghavi, psiegel)@ucsd.edu ISIT 2006, Seattle, USA, July 9 14, 2006
Some representability and duality results for convex mixed-integer programs.
Some representability and duality results for convex mixed-integer programs. Santanu S. Dey Joint work with Diego Morán and Juan Pablo Vielma December 17, 2012. Introduction About Motivation Mixed integer
Big Data - Lecture 1 Optimization reminders
Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Schedule Introduction Major issues Examples Mathematics
Direct Methods for Solving Linear Systems. Matrix Factorization
Direct Methods for Solving Linear Systems Matrix Factorization Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University c 2011
1 Representation of Games. Kerschbamer: Commitment and Information in Games
1 epresentation of Games Kerschbamer: Commitment and Information in Games Game-Theoretic Description of Interactive Decision Situations This lecture deals with the process of translating an informal description
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
How To Solve A Minimum Set Covering Problem (Mcp)
Measuring Rationality with the Minimum Cost of Revealed Preference Violations Mark Dean and Daniel Martin Online Appendices - Not for Publication 1 1 Algorithm for Solving the MASP In this online appendix
When is Reputation Bad? 1
When is Reputation Bad? 1 Jeffrey Ely Drew Fudenberg David K Levine 2 First Version: April 22, 2002 This Version: November 20, 2005 Abstract: In traditional reputation theory, the ability to build a reputation
Social Media Mining. Network Measures
Klout Measures and Metrics 22 Why Do We Need Measures? Who are the central figures (influential individuals) in the network? What interaction patterns are common in friends? Who are the like-minded users
Notes from Week 1: Algorithms for sequential prediction
CS 683 Learning, Games, and Electronic Markets Spring 2007 Notes from Week 1: Algorithms for sequential prediction Instructor: Robert Kleinberg 22-26 Jan 2007 1 Introduction In this course we will be looking
A Constraint Programming based Column Generation Approach to Nurse Rostering Problems
Abstract A Constraint Programming based Column Generation Approach to Nurse Rostering Problems Fang He and Rong Qu The Automated Scheduling, Optimisation and Planning (ASAP) Group School of Computer Science,
Week 7 - Game Theory and Industrial Organisation
Week 7 - Game Theory and Industrial Organisation The Cournot and Bertrand models are the two basic templates for models of oligopoly; industry structures with a small number of firms. There are a number
Optimization Modeling for Mining Engineers
Optimization Modeling for Mining Engineers Alexandra M. Newman Division of Economics and Business Slide 1 Colorado School of Mines Seminar Outline Linear Programming Integer Linear Programming Slide 2
Econ 430 Lecture 9: Games on Networks
Alper Duman Izmir University Economics, May 10, 2013 Semi-Anonymous Graphical Games refer to games on networks in which agents observe what their neighbours do The number of agents taking a specific action
Linear Programming. April 12, 2005
Linear Programming April 1, 005 Parts of this were adapted from Chapter 9 of i Introduction to Algorithms (Second Edition) /i by Cormen, Leiserson, Rivest and Stein. 1 What is linear programming? The first
Computational Game Theory and Clustering
Computational Game Theory and Clustering Martin Hoefer [email protected] 1 Computational Game Theory? 2 Complexity and Computation of Equilibrium 3 Bounding Inefficiencies 4 Conclusion Computational
Proximal mapping via network optimization
L. Vandenberghe EE236C (Spring 23-4) Proximal mapping via network optimization minimum cut and maximum flow problems parametric minimum cut problem application to proximal mapping Introduction this lecture:
NP-Hardness Results Related to PPAD
NP-Hardness Results Related to PPAD Chuangyin Dang Dept. of Manufacturing Engineering & Engineering Management City University of Hong Kong Kowloon, Hong Kong SAR, China E-Mail: [email protected] Yinyu
A Game Theoretical Framework for Adversarial Learning
A Game Theoretical Framework for Adversarial Learning Murat Kantarcioglu University of Texas at Dallas Richardson, TX 75083, USA muratk@utdallas Chris Clifton Purdue University West Lafayette, IN 47907,
Infinitely Repeated Games with Discounting Ù
Infinitely Repeated Games with Discounting Page 1 Infinitely Repeated Games with Discounting Ù Introduction 1 Discounting the future 2 Interpreting the discount factor 3 The average discounted payoff 4
4 Learning, Regret minimization, and Equilibria
4 Learning, Regret minimization, and Equilibria A. Blum and Y. Mansour Abstract Many situations involve repeatedly making decisions in an uncertain environment: for instance, deciding what route to drive
CHAPTER 9. Integer Programming
CHAPTER 9 Integer Programming An integer linear program (ILP) is, by definition, a linear program with the additional constraint that all variables take integer values: (9.1) max c T x s t Ax b and x integral
5.1 Bipartite Matching
CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson
SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH
31 Kragujevac J. Math. 25 (2003) 31 49. SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH Kinkar Ch. Das Department of Mathematics, Indian Institute of Technology, Kharagpur 721302, W.B.,
ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015
ECON 459 Game Theory Lecture Notes Auctions Luca Anderlini Spring 2015 These notes have been used before. If you can still spot any errors or have any suggestions for improvement, please let me know. 1
Linear Programming in Matrix Form
Linear Programming in Matrix Form Appendix B We first introduce matrix concepts in linear programming by developing a variation of the simplex method called the revised simplex method. This algorithm,
How I won the Chess Ratings: Elo vs the rest of the world Competition
How I won the Chess Ratings: Elo vs the rest of the world Competition Yannis Sismanis November 2010 Abstract This article discusses in detail the rating system that won the kaggle competition Chess Ratings:
