Game Theory and Algorithms Lecture 10: Extensive Games: Critiques and Extensions



Similar documents
6.207/14.15: Networks Lecture 15: Repeated Games and Cooperation

Sequential lmove Games. Using Backward Induction (Rollback) to Find Equilibrium

Minimax Strategies. Minimax Strategies. Zero Sum Games. Why Zero Sum Games? An Example. An Example

ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015

Decision Making under Uncertainty

Bayesian Nash Equilibrium

Game Theory 1. Introduction

Lecture V: Mixed Strategies

6.042/18.062J Mathematics for Computer Science. Expected Value I

Bonus Maths 2: Variable Bet Sizing in the Simplest Possible Game of Poker (JB)

AN INTRODUCTION TO GAME THEORY

Week 7 - Game Theory and Industrial Organisation

Théorie de la décision et théorie des jeux Stefano Moretti

Nash Equilibria and. Related Observations in One-Stage Poker

Lab 11. Simulations. The Concept

Economics 1011a: Intermediate Microeconomics

Backward Induction and Subgame Perfection

6.254 : Game Theory with Engineering Applications Lecture 2: Strategic Form Games

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10

Math 728 Lesson Plan

6.3 Conditional Probability and Independence

6.254 : Game Theory with Engineering Applications Lecture 1: Introduction

Microeconomic Theory Jamison / Kohlberg / Avery Problem Set 4 Solutions Spring (a) LEFT CENTER RIGHT TOP 8, 5 0, 0 6, 3 BOTTOM 0, 0 7, 6 6, 3

1. Write the number of the left-hand item next to the item on the right that corresponds to it.

Games of Incomplete Information

Phantom bonuses. November 22, 2004

Christmas Gift Exchange Games

Psychology and Economics (Lecture 17)

Game Theory in Wireless Networks: A Tutorial

Game Theory and Poker

Probability and Expected Value

Probabilistic Strategies: Solutions

Economics 159: Introduction to Game Theory. Summer 2015: Session A. Online Course

Reading 13 : Finite State Automata and Regular Expressions

Colored Hats and Logic Puzzles

Video Poker in South Carolina: A Mathematical Study

21. Unverifiable Investment, Hold Up, Options and Ownership

Binomial lattice model for stock prices

Unit 4 DECISION ANALYSIS. Lesson 37. Decision Theory and Decision Trees. Learning objectives:

If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

In the situations that we will encounter, we may generally calculate the probability of an event

Rafael Witten Yuze Huang Haithem Turki. Playing Strong Poker. 1. Why Poker?

Texas Hold em. From highest to lowest, the possible five card hands in poker are ranked as follows:

Chapter 7. Sealed-bid Auctions

Perfect Bayesian Equilibrium

6.207/14.15: Networks Lectures 19-21: Incomplete Information: Bayesian Nash Equilibria, Auctions and Introduction to Social Learning

Competition and Regulation. Lecture 2: Background on imperfect competition

UCLA. Department of Economics Ph. D. Preliminary Exam Micro-Economic Theory

Feb 7 Homework Solutions Math 151, Winter Chapter 4 Problems (pages )

FINAL EXAM, Econ 171, March, 2015, with answers

Bayesian Tutorial (Sheet Updated 20 March)

Probability, statistics and football Franka Miriam Bru ckler Paris, 2015.

Infinitely Repeated Games with Discounting Ù

Pascal is here expressing a kind of skepticism about the ability of human reason to deliver an answer to this question.

Combinatorics 3 poker hands and Some general probability

1 Nonzero sum games and Nash equilibria

Capital Structure. Itay Goldstein. Wharton School, University of Pennsylvania

1 Representation of Games. Kerschbamer: Commitment and Information in Games

Best-response play in partially observable card games

Lecture 8 The Subjective Theory of Betting on Theories

Game theory and AI: a unified approach to poker games

Oligopoly and Strategic Pricing

Compact Representations and Approximations for Compuation in Games

Wald s Identity. by Jeffery Hein. Dartmouth College, Math 100

Two-sample inference: Continuous data

Equilibrium: Illustrations

The fundamental question in economics is 2. Consumer Preferences

Elementary Statistics and Inference. Elementary Statistics and Inference. 17 Expected Value and Standard Error. 22S:025 or 7P:025.

Laboratory work in AI: First steps in Poker Playing Agents and Opponent Modeling

The Game of Blackjack and Analysis of Counting Cards

i/io as compared to 4/10 for players 2 and 3. A "correct" way to play Certainly many interesting games are not solvable by this definition.

Lecture 25: Money Management Steven Skiena. skiena

Expected Value. 24 February Expected Value 24 February /19

10 Evolutionarily Stable Strategies

Creating a NL Texas Hold em Bot

Expected Value and the Game of Craps

POLYNOMIAL FUNCTIONS

Chapter 31 out of 37 from Discrete Mathematics for Neophytes: Number Theory, Probability, Algorithms, and Other Stuff by J. M.

Dynamics and Equilibria

How to Win Texas Hold em Poker

Competition and Fraud in Online Advertising Markets

Two-State Options. John Norstad. January 12, 1999 Updated: November 3, 2011.

14.74 Lecture 12 Inside the household: Is the household an efficient unit?

Math Games For Skills and Concepts

Cournot s model of oligopoly

Money Unit $$$$$$$$$$$$$$$$$$$$$$$$ First Grade

A Statistical Analysis of Popular Lottery Winning Strategies

Week 4: Gambler s ruin and bold play

AP Stats - Probability Review

Champion Poker Texas Hold em

Von Neumann and Newman poker with a flip of hand values

The Plaintiff s Attorney in the Liability Insurance Claims Settlement Process: A Game Theoretic Approach

The Peruvian coin flip Cryptographic protocols

Understanding Options: Calls and Puts

A capacity planning / queueing theory primer or How far can you go on the back of an envelope? Elementary Tutorial CMG 87

Simon Fraser University Spring Econ 302 D200 Final Exam Solution Instructor: Songzi Du Tuesday April 21, 2015, 12 3 PM

GOD S BIG STORY Week 1: Creation God Saw That It Was Good 1. LEADER PREPARATION

Find an expected value involving two events. Find an expected value involving multiple events. Use expected value to make investment decisions.

Gaming the Law of Large Numbers

Transcription:

Game Theory and Algorithms Lecture 0: Extensive Games: Critiques and Extensions March 3, 0 Summary: We discuss a game called the centipede game, a simple extensive game where the prediction made by backwards induction contrasts sharply with what happens in practice. Then, we briefly discuss a standard model even more general than extensive with simultaneous moves: games of incomplete information. We give some examples showing how its concepts of randomization and equilibrium are more complex than those encountered previously. Imps and Centipedes The Bottle Imp is a short story by the Scottish writer Robert Louis Stevenson, from 89. The central object of the story is a magical bottle inhabited by an imp who bestows good fortune and wealth upon its owner. There is a catch: each owner of the bottle must sell it before they die or else they will burn in hell; and each sale price must be strictly less than the previous sale price. Legend has it that the imp was the source of power for Napoleon and Captain Cook. In the story, a man buys it, gets a beautiful wife, then sells it, and contracts leprosy. He is able to track down the current owner of the bottle but unfortunately that owner paid only for it. Should the man buy it for and seal his doom, or live in a leper colony? What would you do in this situation? Is it rational to buy the bottle in the first place? We will return to this story later, but for now let us introduce a related one, the centipede game (introduced by Rosenthal in 98). In the centipede game, we have two players with a joint investment. ˆ The two players take alternate turns caring for the investment, which starts off with a value of $0. ˆ On each turn, either the player can Invest, increasing the value of the investment by one dollar, or else they can Cash In. Lecture Notes for a course given by David Pritchard at EPFL, Lausanne.

ˆ If a player cashes in, the game ends: the value of the investment is split equally between the two players, and the player who cashed in gets a $ bonus. ˆ We limit the investment s value (e.g., at $00); the next player to move once it reaches this value is forced to Cash In. This is a finite extensive game without simultaneous moves. Let s draw its game tree we will use a limit of $5 to make the tree smaller, and multiply everything by to remove fractions: I I I I I C C C C C C, 0, 3 4, 3, 5 6, 4 5, 7 The game gets its name since the shape of the tree looks a little bit like a caterpillar. Next, we would like to see what backwards induction predicts in this game. ˆ If the investment reaches $0 in value, the second player is forced to cash in, giving utilities (5, 7). ˆ If the investment reaches $8 in value, the first player can either cash in to get utility 6, or invest and get utility 5. So, he would cash in. ˆ Likewise, if the investment reaches $6 in value, the second player will cash in since 5 4. ˆ Continuing in this way, the only subgame perfect equilibrium is the strategy profile (always cash in, always cash in). So the SPE behaviour would give a payoff of $ to player and $0 to player (no matter how large the limit is).. Experiments In experiments, players do not do what SPEs predict (even though we are in the special class without ties where SPEs seemed to be very logical predictors). Here are the results of one particular study []: ˆ The experiment took place in 989 with a maximum investment value of $6. ˆ The actual cash value given to participants was $0.0 x instead of x. ˆ Players played about 0 games, never facing the same opponent twice. Roughly, the result was a bell curve: most of the games terminated after 3 or 4 Invest moves, with a smooth decrease to a handful of games stopping immediately or going as long as possible.

.. Discussion One fact which is important is that the players non-equilibrium behaviour actually gives them more money than equilibrium behaviour would. In this way it is similar to what is sometimes reported when humans play a finitely-repeated Prisoners Dilemma: there is cooperation at the start, and back-stabbing at the end. Nonetheless, even taking into account randomization, the game has no mixed SPE other than the pure SPE we already found. So, it means that at least one of the assumptions (greed, rationality, maximizing their income in every subgame, predicting and simulating their opponent) is invalid. It raises some interesting questions: ˆ Suppose player Invests on her first move. Then player could deduce that player is not using an SPE strategy; and given that player Invested once, might they not Invest again? ˆ One might approximate large versions of this game by an infinite game; in this game (always invest, always invest) is an SPE. Is it easier to do this approximate calculation rather than reason everything through? We also note that even the weaker notion of Nash equilibria always leads to Cashing in on the first move.. Bottle Imp Conclusion The bottle imp game leads to basically the same analysis as the centipede game, where the game ends once we hit the minimum possible monetary value. The bottle imp game illustrates that the fact that there are players is immaterial; we could equally well have a different player play at each node. And for the story... As you may recall, our hero was faced with the decision whether to become a leper, or buy the bottle for, condemning his soul to hell since he d have to sell it for less than. Luckily, his wife suggests they sail to a colony where they use centimes worth /5 of a cent. His wife bribes a sailor to buy it for four centimes, and she buys it afterwards. The man discovers this and bribes a drunk to buy it from her from two centimes, proposing to buy it back from the drunk for centime. The drunk buys the bottle, and the man tries to buy it back, politely reminding that the man who has that bottle goes to hell. The drunk replies I reckon I m going anyway, and this bottle s the best thing to go with I ve struck yet. No, sir! This is my bottle now, and you can go and fish for another. The man and wife live happily ever after. Games of Incomplete Information We now explain a little bit about how the model of extensive games seen previously can be extended to include poker. The relevant model is called a game with incomplete information. 3

Such a game is represented with the same sort of game tree used for an extensive game without simultaneous moves; we also require that the edges (actions) are labelled. Next, we draw information sets around some of the internal nodes of the tree. ˆ The information sets are disjoint (i.e. each internal node lies in at most one information set). ˆ Any two nodes in an information set must be labelled by the same player. ˆ Any two nodes in an information set must have the same labels on the edges coming out of them. Then, our model is that each player cannot distinguish nodes in the same information set. That is to say, if nodes h, h both lie in the same information set (say for player i), then any strategy for player i must choose the same action (label) at h and h.. Examples The first example shows that strategic games can be modelled as extensive games with incomplete information. We actually mentioned this in passing in the first lecture: for the taxpayer game, the citizen moves before the government, but the government can t distinguish between the taxpayer s two moves before choosing their own move. p \p A DA L,3 4, DL, 3,3 L DL In the diagrams, we represent information sets with dashed circles. The second example gives an example of a simplified game of poker: A DA A DA,3 4,, 3, 3 ˆ It is a zero-sum game, and the root node is controlled by external randomness. ˆ Both players ante $, and player is dealt a high or low card. We imagine player as always having a middling card. ˆ Player looks at her card and gets to bet $ or check. If she checks, the game ends; she wins with a high card and loses with a low card. 4

ˆ If player bet, then player can call or fold. Folding loses the $ ante immediately, while calling means that the stakes are ±$ depending on the type of player s card. {player high} chance {player low}, check bet bet check, fold call fold call,,,, For this particular game, there is a unique pair of maxminimizing strategies, as follows. Player calls /3 of the time. Player always bets with a high card, and also bets /3 of the time with a low card (intuitively, he bluffs sometimes since otherwise player would never need to call).. Equilibria Equilibrium concepts in games of incomplete information tend to be more complicated. We give one example here, adapted from Osborne & Rubinstein. A B C, L 3, R L R 0, 0,, Consider the strategy profile (A, R). Is it an equilibrium? The reasonableness of playing R for player depends on her estimation of Pr[B]/(Pr[B] + Pr[C]) which is 0/0 if player is always using A. The subgame-perfect concept of checking subgames is not so useful here, since there are no subgames. So equilibrium concepts tend to incorporate new ideas such as beliefs, or 5

convergent sequences of strategy profiles, or allowing errors. We mention a few names of these concepts: (weak) sequential equilibria, trembling-hand perfect equilibria, and perfect equilibria, which allow a second most likely choice..3 Mixed Strategies So far, we did not explicitly describe how mixed strategies are modelled in extensive games. There are actually two alternatives: ˆ A player can flip a generalized coin once at the start of the game, and then they pick a pure strategy thereafter. (Probability distribution over strategy functions.) ˆ At each node, a player gets to independently flip a new generalized coin, and makes just one action based on the result of that flip. (Single strategy function whose output is drawn from a distribution, called a behavioural mixed strategy.) These turn out to be equivalent for all games studied before today, but are not equivalent in general. The distinction between the definitions are easiest to see with an example. It also will show that with incomplete information, the two concepts are different, and neither one generalizes the other. A B A B [outcome z] [outcome x] [outcome y] Consider the game depicted above. Player is forgetful whether he has already moved or not. ˆ If player flips a coin before the game starts and thereafter picks a pure strategy A or B uniformly at random, then he gets outcome x half of the time, and outcome z half of the time. ˆ If player flips a coin before each move and picks a uniformly random choice, then he can get to outcome x, y each a quarter of the time, and z half of the time. ˆ Neither model can achieve the distribution on outcomes achieved by the other model. 6

There is a natural sufficient condition for the equivalence of the two models: a game has perfect recall if for all pairs h, h in the same information set for some player i, the sequence of prior moves made by player i are the same in h and h. (We assume for this definition that nodes in different information sets get distinct labels for their moves.) For example, the game just pictured does not have perfect recall because of the two nodes in the same information set, one was reached by action sequence, and the other by (A). One nice thing that happens with perfect recall is that we can work on trees in a bottomup fashion. Moreover, we get the equivalence mentioned above. Theorem. In games with perfect recall, randomizing at the start of the game is outcomeequivalent to randomizing each choice separately. The proof idea is to use conditional probability. The next topic will be mechanism design. As a segue from extensive games, we will mention cake-cutting games. References [] R. D. McKelvey and T. R. Palfrey. An experimental study of the centipede game. Econometrica, 60(4):pp. 803 836, 99. 7