Game Theory 1. Introduction Dmitry Potapov CERN

What is Game Theory?
- Game theory is about interactions among agents that are self-interested
  - I'll use "agent" and "player" synonymously
- Self-interested: each agent has its own description of what states are desirable
  - Generally model this using utility theory
- Utility function: maps each state of the world to a real number that measures how much an agent likes that state

Example: TCP Users
- Internet traffic is governed by the TCP protocol
- TCP's backoff mechanism: if the rate at which you're sending packets causes congestion, reduce the rate until congestion subsides
- Suppose that
  - You're trying to finish an important project
  - It's extremely important for you to have a fast connection
  - Only one other person is using the Internet
  - That person wants a fast connection just as much as you do
- You each have 2 possible actions:
  - C (use a correct implementation)
  - D (use a defective implementation that won't back off)

Action Profiles and their Payoffs
- An action profile is a choice of action for each agent
  - You both use C => average packet delay is 1 ms
  - You both use D => average delay is 3 ms (router overhead)
  - One of you uses D, the other uses C: the D user's delay is 0, the C user's delay is 4 ms
- Payoff matrix:
  - Your options are the rows
  - The other agent's options are the columns
  - Each cell = an action profile
  - 1st number in the cell is your payoff or utility (I'll use those terms synonymously); in this case, the negative of your delay
  - 2nd number in each cell is the other agent's payoff

           C         D
    C   -1, -1    -4,  0
    D    0, -4    -3, -3

Some questions
Examples of the kinds of questions game theory attempts to answer:
- Which action should you use: C or D?
- Does it depend on what you think the other person will do?
- What kind of behavior can the network operator expect?
- Would any two users behave the same?
- Will this change if users can communicate with each other beforehand?
- Under what changes to the delays would the users' decisions still be the same?
- How would you behave if you knew you would face this situation repeatedly with the same person?

           C         D
    C   -1, -1    -4,  0
    D    0, -4    -3, -3

Some Fields where Game Theory is Used
- Economics: auctions, markets, bargaining, fair division, social networks
- Government and politics: voting systems, negotiations, international relations, war, human rights
  [Figure: a trench in World War I]
- Evolutionary biology: communication, population ratios, territoriality, altruism, parasitism, symbiosis, social behavior
- Computer science: artificial intelligence, multi-agent systems, computer networks, robotics
- Engineering: communication networks, control systems, road networks

Expected utility maximization
- Games against nature
- Nature is considered an impartial agent (may be represented by U = const)
- Expected utility: $U = \sum_{i=1}^{N} p_i U_i$
- Assume that $p_1 = 0.7$ (rain) and $p_2 = 0.3$ (no rain). Then:

                   Rain   No rain
    Umbrella         0       2      U = (0)(0.7) + (2)(0.3) = 0.6
    No umbrella     -5      10      U = (-5)(0.7) + (10)(0.3) = -0.5

- It is rational to take an umbrella!
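
The comparison above can be checked numerically. This is a minimal sketch, assuming that the first column is rain (probability 0.7) and the second is no rain (probability 0.3); the helper name `expected_utility` is illustrative and not part of the slides.

```python
def expected_utility(utilities, probabilities):
    """Return sum_i p_i * U_i for a single action."""
    return sum(p * u for p, u in zip(probabilities, utilities))

# Assumed ordering of the states of nature: rain, no rain
probabilities = [0.7, 0.3]

# Utility of each action in each state, as on the slide
actions = {
    "umbrella": [0, 2],
    "no umbrella": [-5, 10],
}

# Pick the action with the highest expected utility
best = max(actions, key=lambda a: expected_utility(actions[a], probabilities))
print(best)
```

With these numbers the umbrella wins because its expected utility is positive while the alternative's is negative.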

Zero-sum Games
- These games are purely competitive
- Constant-sum game: for every action profile, the sum of the payoffs is the same, i.e., there is a constant c such that for every action profile $(a_1, \ldots, a_n)$,
  $u_1(a_1, \ldots, a_n) + \cdots + u_n(a_1, \ldots, a_n) = c$
- Any constant-sum game can be transformed into an equivalent game in which the sum of the payoffs is always 0: just subtract c/n from every payoff
- Thus constant-sum games are usually called zero-sum games
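
The "subtract c/n" transformation can be sketched in a few lines. The game below is a hypothetical two-player constant-sum game with c = 2, chosen only for illustration; `to_zero_sum` is an assumed helper name.

```python
from fractions import Fraction

def to_zero_sum(game, c):
    """Subtract c/n from every payoff of a constant-sum game.

    `game` maps each action profile to a tuple of payoffs, one per agent.
    """
    shifted = {}
    for profile, payoffs in game.items():
        n = len(payoffs)
        shifted[profile] = tuple(Fraction(u) - Fraction(c, n) for u in payoffs)
    return shifted

# Hypothetical constant-sum game: payoffs in every cell add up to c = 2
game = {
    ("H", "H"): (2, 0),
    ("H", "T"): (0, 2),
    ("T", "H"): (0, 2),
    ("T", "T"): (2, 0),
}
zero_sum = to_zero_sum(game, 2)
```

Every agent's payoff shifts by the same amount in every cell, so each agent's preferences over outcomes are unchanged.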

Examples
- Matching Pennies
  - Two agents, each has a penny
  - Each independently chooses to display Heads or Tails
  - If same, agent 1 gets both pennies
  - Otherwise agent 2 gets both pennies

              Heads     Tails
    Heads     1, -1    -1,  1
    Tails    -1,  1     1, -1

- Rock, Paper, Scissors (Roshambo)
  - 3-action generalization of matching pennies
  - If both choose the same, no winner
  - Otherwise, paper beats rock, rock beats scissors, scissors beats paper

Dominant strategies
- Now let's assume that Weather is a conscious, self-interested agent

                    Rain      No rain
    Umbrella        0,  0      2,  -2
    No umbrella    -5,  5     10, -10

- Weather: "Column 2 always gives me less utility than column 1. There is no use in my choosing it!"
- Man: "Surely, Weather will choose to rain, so I'd better take my umbrella!"
- It is rational for the man to take an umbrella and for Weather to rain!
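
The dominance argument can be checked mechanically. This is a rough sketch, assuming the umbrella game is zero-sum (Weather's payoff is the negative of the man's) and that column 1 is rain; `dominant_action` is an illustrative helper, not something from the slides.

```python
def dominant_action(payoffs):
    """payoffs[a][b] = own utility for own action a against opponent action b.

    Return the index of a strictly dominant action, or None if there is none.
    """
    n = len(payoffs)
    for a in range(n):
        beats_all_others = all(
            all(payoffs[a][b] > payoffs[other][b] for b in range(len(payoffs[a])))
            for other in range(n)
            if other != a
        )
        if beats_all_others:
            return a
    return None

# Man: rows = umbrella, no umbrella; columns = rain, no rain
man = [[0, 2], [-5, 10]]
# Weather (assumed zero-sum): rows = rain, no rain; columns = umbrella, no umbrella
weather = [[0, 5], [-2, -10]]

weather_choice = dominant_action(weather)   # index 0 means rain
man_dominant = dominant_action(man)         # the man has no dominant action
# The man's best reply once Weather is expected to rain (column 0)
man_reply = max(range(len(man)), key=lambda a: man[a][0])
```

Only Weather has a dominant action here; the man's choice follows from anticipating it.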

Nash Equilibrium

           C1        C2
    R1    2, -2     0,  0
    R2   -1,  1     4, -4

- R: "If I choose R1, C is better off with C2; if I choose R2, C is better off with C1. C will consider the worst-case scenario and will choose C1. So I must choose R1!"
- C: "R will choose R1, so I must choose C2!"
- R: "C will choose C2, so I must choose R2!"
- C: "R will choose R2, so I must choose C1!"
- No dominant strategies
- Solution: use the Nash equilibrium: choose R1 with probability p = 5/7 and R2 with probability 1 - p = 2/7
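
The probability 5/7 follows from making the column player indifferent between C1 and C2. This is a minimal sketch, assuming the game is zero-sum with row payoffs 2, 0, -1, 4 and has no pure equilibrium; `row_mix_2x2` is a hypothetical helper name.

```python
from fractions import Fraction

def row_mix_2x2(a):
    """a[i][j] = row player's payoff in a 2x2 zero-sum game without a saddle point.

    Return (p, value): p is the probability of the first row that makes the
    column player indifferent, i.e. it solves
        p*a11 + (1-p)*a21 = p*a12 + (1-p)*a22
    """
    (a11, a12), (a21, a22) = a
    p = Fraction(a22 - a21, a11 - a12 - a21 + a22)
    value = p * a11 + (1 - p) * a21
    return p, value

p, value = row_mix_2x2([[2, 0], [-1, 4]])
print(p, 1 - p, value)
```

With this mix the row player's expected payoff is the same whichever column is chosen, so C cannot exploit the choice.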

Nash Equilibrium

           C1        C2
    R1    2, -2     0,  0
    R2   -1,  1     4, -4

Non-zero-sum games
The game of chicken

            C            D
    C     1,   1     -10,   10
    D    10, -10    -100, -100

What is the rational decision?
- Yank your steering wheel off and throw it out of the window (H. Kahn)
  - Need to make sure that your opponent sees this
  - If your opponent uses the same strategy, you have a problem (are killed)
- Choose the maximin strategy (C, C)
  - Not an equilibrium
- Choose a pure equilibrium (C, D) or (D, C)
  - Not symmetric
- Choose the mixed equilibrium: C with p = 10/11, D with p = 1/11
  - Average payoff is 0, but the non-equilibrium (C, C) gives 1
- No satisfactory answer as to which strategy is the most rational!
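
The numbers quoted for the maximin strategy and the mixed equilibrium can be reproduced directly. This is a sketch under the assumption that the row player's payoffs are 1, -10, 10, -100 (rows and columns ordered C, D) and that the game is symmetric.

```python
from fractions import Fraction

# Row player's payoffs; index 0 = C, index 1 = D (assumed ordering)
u = [[1, -10], [10, -100]]

# Maximin: the action whose worst-case payoff is largest
maximin = max(range(2), key=lambda a: min(u[a]))

# Mixed equilibrium: the opponent plays C with probability p such that
# the row player is indifferent between C and D:
#   p*u[0][0] + (1-p)*u[0][1] = p*u[1][0] + (1-p)*u[1][1]
p = Fraction(u[1][1] - u[0][1], u[0][0] - u[0][1] - u[1][0] + u[1][1])

# Expected payoff of either action against that mix
expected = p * u[0][0] + (1 - p) * u[0][1]
print(maximin, p, expected)
```

By symmetry the same probability applies to both players.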

Non-zero-sum games
The deterrence game

            S2        T2
    S1    1,  0     0,  1
    T1    0,  0    -x, -y        (x, y > 0)

Model: Cuban Missile Crisis
- S1: not attack Cuba; T1: attack Cuba
- S2: remove Russian missiles; T2: keep Russian missiles
- How to induce Cuba to choose S2?
  - Send a threat: "If you choose T2, I am going to choose T1"
  - Is it a credible threat or a bluff?
- Assume that Cuba will defy the threat with probability p. Then a solution (T.C. Schelling, 1960) is: it is worthwhile to threaten with some probability π such that
  $\frac{1 - p}{p x} > \pi > \frac{1}{1 + y}$
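
The two bounds on π can be derived from the payoff matrix and evaluated for concrete numbers. This is a sketch, assuming the payoffs at (T1, T2) are (-x, -y) with x, y > 0; the values p = 1/10, x = 10, y = 9 are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def threat_interval(p, x, y):
    """Return (lower, upper) bounds on the threat probability pi.

    lower: player 2 prefers S2 (payoff 0) to T2 when
           (1 - pi)*1 + pi*(-y) < 0,   i.e.  pi > 1/(1 + y)
    upper: threatening is worthwhile for player 1 when
           (1 - p)*1 + p*pi*(-x) > 0,  i.e.  pi < (1 - p)/(p*x)
    """
    lower = Fraction(1) / (1 + Fraction(y))
    upper = (1 - Fraction(p)) / (Fraction(p) * Fraction(x))
    return lower, upper

# Hypothetical numbers, not taken from the slides
lower, upper = threat_interval(Fraction(1, 10), 10, 9)
print(lower, upper)
```

If the interval is empty (lower >= upper), no probabilistic threat is both deterring and worth making under these assumptions.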

The deterrence game
Real-world implementation
- Assume π = 1/6
- The problem is: one doesn't release 1/6th of a nuclear war! One either releases a full-blown nuclear attack or none at all
- The chances of winning are the same as in Russian roulette
- The question reduces to the question of how rational it is to play Russian roulette

The Prisoner's Dilemma
- The TCP users' game is more commonly called the Prisoner's Dilemma
- Scenario: two prisoners are in separate rooms
  - For each prisoner, the police have enough evidence for a 1-year prison sentence
  - They want to get enough evidence for a 4-year prison sentence
  - They tell each prisoner, "If you testify against the other prisoner, we'll reduce your prison sentence by 1 year"
- C = Cooperate (with the other prisoner): refuse to testify
- D = Defect: testify against the other prisoner
- Both prisoners cooperate => both stay in prison for 1 year
- Both prisoners defect => both stay in prison for 4 - 1 = 3 years
- One defects, other cooperates => cooperator stays in prison for 4 years; defector goes free

           C         D
    C   -1, -1    -4,  0
    D    0, -4    -3, -3

Prisoner's Dilemma

           C         D
    C   -1, -1    -4,  0
    D    0, -4    -3, -3

- The paradox: the strategy profile (D, D) is a dominant-strategy equilibrium (for example, for every strategy of the column player, the row player prefers D to C)
- But (C, C) gives both players a bigger payoff
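
One way to see the paradox is to enumerate the pure-strategy equilibria by brute force. This is a minimal sketch, assuming payoffs equal to the negative of the prison terms; `pure_nash` is an illustrative helper name.

```python
from itertools import product

def pure_nash(game, actions):
    """game[(r, c)] = (row payoff, column payoff).

    Return the profiles in which neither player gains by deviating alone.
    """
    equilibria = []
    for r, c in product(actions, repeat=2):
        u_row, u_col = game[(r, c)]
        row_ok = all(game[(r2, c)][0] <= u_row for r2 in actions)
        col_ok = all(game[(r, c2)][1] <= u_col for c2 in actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Payoffs assumed to be the negative of the years in prison
pd = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}
print(pure_nash(pd, ["C", "D"]))
```

The only profile that survives is (D, D), even though both players get more at (C, C).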

Backward Induction
- To find subgame-perfect equilibria, we can use backward induction
- Identify the equilibria in the bottom-most nodes
  - Assume they'll be played if the game ever reaches these nodes
- For each node x, recursively compute a vector $v_x = (v_{x1}, \ldots, v_{xn})$ that gives every agent's equilibrium utility
- At each node x, if i is the agent to move, then i's equilibrium action is to move to a child y of x for which i's equilibrium utility $v_{yi}$ is highest
  - Thus $v_x = v_y$

    1 [v = (3,8)]
      A -> 2 [v = (3,8)]
             C -> (3,8)
             D -> (8,3)
      B -> 2 [v = (2,10)]
             E -> (5,5)
             F -> 1 [v = (2,10)]
                    G -> (2,10)
                    H -> (1,0)
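
The recursion can be sketched as follows. The tree encoding (a leaf is a payoff tuple, an internal node is a dict with `name`, `player`, and `children`) and the node names are assumptions made for this example; agents are indexed from 0, so agent 1 is player 0. Ties are broken in favor of the first action listed.

```python
def backward_induction(node, plan):
    """Return the equilibrium payoff vector of `node`.

    Also records the mover's chosen action at every decision node in `plan`.
    """
    if isinstance(node, tuple):          # leaf: payoff vector
        return node
    player = node["player"]
    best_action, best_value = None, None
    for action, child in node["children"].items():
        value = backward_induction(child, plan)
        if best_value is None or value[player] > best_value[player]:
            best_action, best_value = action, value
    plan[node["name"]] = best_action
    return best_value                    # v_x = v_y for the chosen child y

# The example tree from the slide (node names are hypothetical labels)
tree = {"name": "root", "player": 0, "children": {
    "A": {"name": "after A", "player": 1,
          "children": {"C": (3, 8), "D": (8, 3)}},
    "B": {"name": "after B", "player": 1, "children": {
        "E": (5, 5),
        "F": {"name": "after F", "player": 0,
              "children": {"G": (2, 10), "H": (1, 0)}},
    }},
}}

plan = {}
value = backward_induction(tree, plan)
print(value, plan)
```

The plan covers every decision node, including those off the equilibrium path, which is what makes the result subgame-perfect.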

The Centipede Game
- Two possible moves: C (continue) and S (stop)
- Agent 1 makes the first move
- At each terminal node, the payoffs are as shown

    Agent to move:     1       2       1     ...     1        2
    Payoffs if S:    1, 1    0, 3    2, 2    ...   99, 99   98, 101
    Payoffs if the last agent chooses C: 100, 100

A Problem with Backward Induction
The Centipede Game
- Can extend this game to any length
- The payoffs are constructed in such a way that for each agent, the only SPE is always to choose S
- This equilibrium isn't intuitively appealing
  - Seems unlikely that an agent would choose S near the start of the game
  - If the agents continue the game for several moves, they'll both get higher payoffs
- In lab experiments, subjects continue to choose C until close to the end of the game
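
The claim that stopping is the only SPE choice at every node can be checked by running backward induction along the chain. The slides elide the middle of the game, so the payoff pattern below ((m, m) when agent 1 stops, (m - 1, m + 2) when agent 2 stops, for m = 1..99) is an assumption that merely agrees with the payoffs shown at both ends; `centipede_spe` is a hypothetical helper.

```python
def centipede_spe(stop_payoffs, final_payoff):
    """stop_payoffs[k] = payoffs if the mover at node k chooses S.

    Agents alternate, agent 1 (index 0) moving first. Returns the list of
    SPE actions per node and the resulting outcome.
    """
    value = final_payoff                  # payoffs if everyone continues
    actions = []
    for k in reversed(range(len(stop_payoffs))):
        mover = k % 2
        if stop_payoffs[k][mover] >= value[mover]:
            actions.append("S")
            value = stop_payoffs[k]
        else:
            actions.append("C")
    actions.reverse()
    return actions, value

# Assumed pattern for the payoffs the slide leaves out
stops = []
for m in range(1, 100):
    stops.append((m, m))            # agent 1 stops
    stops.append((m - 1, m + 2))    # agent 2 stops

actions, outcome = centipede_spe(stops, (100, 100))
print(set(actions), outcome)
```

Under this assumed pattern each mover gains exactly 1 by stopping rather than letting the other agent stop at the next node, which is what drives the unraveling.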

A Problem with Backward Induction
- Suppose agent 1 chooses C
- If you're agent 2, what do you do?
  - SPE analysis says you should choose S
  - But SPE analysis also says you should never have gotten here at all
  - How to amend your beliefs and course of action based on this event?
- Fundamental problem in game theory
  - Differing accounts of it, depending on
    - the probabilistic assumptions made
    - what is common knowledge (whether there is common knowledge of rationality)
    - how to revise our beliefs in the face of an event with probability 0

Backward Induction in Zero-Sum Games
- Backward induction works much better in zero-sum games
  - No zero-sum version of the Centipede Game, because we can't have increasing payoffs for both players
- Only need one number: agent 1's payoff (= negative of agent 2's payoff)
- Propagate agent 1's payoff up to the root
  - At each node where it's agent 1's move, the value is the maximum of the labels of its children
  - At each node where it's agent 2's move, the value is the minimum of the labels of its children
  - The root's label is the value of the game (from the Minimax Theorem)
- In practice, it may not be possible to generate the entire game tree
  - E.g., the extensive-form representation of chess has about 10^150 nodes
- Need a heuristic search algorithm
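
The max/min propagation can be written as a short recursion. This is a sketch that assumes the whole tree fits in memory and that the agents strictly alternate moves; the nested-list encoding and the example tree are hypothetical.

```python
def minimax(node, agent1_to_move=True):
    """Leaf: a number (agent 1's payoff). Internal node: a list of children.

    Returns the value propagated up to `node`.
    """
    if not isinstance(node, list):        # leaf
        return node
    values = [minimax(child, not agent1_to_move) for child in node]
    # Agent 1 maximizes its own payoff; agent 2 minimizes it
    return max(values) if agent1_to_move else min(values)

# Hypothetical two-level tree: agent 1 moves at the root, agent 2 below
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))
```

For trees too large to enumerate, the same recursion is typically cut off at a depth limit and the leaves replaced by a heuristic evaluation.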

Summary
- Basic concepts: payoffs, pure strategies, mixed strategies
- Some classifications of games based on their payoffs
  - Zero-sum: Roshambo, Matching Pennies
  - Non-zero-sum: Prisoner's Dilemma, Game of chicken
- Backward induction