RISK AVERSION IN GAME SHOWS




Steffen Andersen, Glenn W. Harrison, Morten I. Lau and E. Elisabet Rutström

ABSTRACT

We review the use of behavior from television game shows to infer risk attitudes. These shows provide evidence when contestants are making decisions over very large stakes, and in a replicated, structured way. Inferences are generally confounded by the subjective assessment of skill in some games, and the dynamic nature of the task in most games. We consider the game shows Card Sharks, Jeopardy!, Lingo, and finally Deal Or No Deal. We provide a detailed case study of the analyses of Deal Or No Deal, since it is suitable for inference about risk attitudes and has attracted considerable attention.

Risk Aversion in Experiments. Research in Experimental Economics, Volume 12, 361-406. Copyright © 2008 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 0193-2306/doi:10.1016/S0193-2306(08)00008-2

Observed behavior on television game shows constitutes a controlled natural experiment that has been used to estimate risk attitudes. Contestants are presented with well-defined choices where the stakes are real and sizeable, and the tasks are repeated in the same manner from contestant to contestant. We review behavior in these games, with an eye to inferring risk attitudes. We describe the types of assumptions needed to evaluate behavior, and propose a general method for estimating the parameters of structural models of choice behavior for these games. We illustrate with a detailed case study of behavior in the U.S. version of Deal Or No Deal (DOND).

In Section 1 we review the existing literature in this area that is focused on risk attitudes, starting with Gertner (1993) and the Card Sharks program. We then review the analysis of behavior on Jeopardy! by Metrick (1995) and on Lingo by Beetsma and Schotman (2001) [note 1].

In Section 2 we turn to a detailed case study of the DOND program that has generated an explosion of analyses trying to estimate large-stakes risk aversion. We explain the basic rules of the game, which is shown with some variations in many countries. We then review complementary laboratory experiments that correspond to the rules of the naturally occurring game show. Finally, we discuss alternative modeling strategies employed in related DOND literature.

Section 3 proposes a general method for estimating choice models in the stochastic dynamic programming environment that most of these game shows employ. We resolve the curse of dimensionality in this setting by using randomization methods and certain simplifications to the forward-looking strategies adopted. We discuss the ability of our approach to closely approximate the fully dynamic path that agents might adopt. We illustrate the application of the method using data from the U.S. version of DOND, and estimate a simple structural model of expected utility theory choice behavior. The manner in which our method can be extended to other models is also discussed.

Finally, in Section 4 we identify several weaknesses of game show data, and how they might be addressed. We stress the complementary use of natural experiments, such as game shows, and laboratory experiments.

1. PREVIOUS LITERATURE

1.1. Card Sharks

The game show Card Sharks provided an opportunity for Gertner (1993) to examine dynamic choice under uncertainty involving substantial gains and losses.
Two key features of the show allowed him to examine the hypothesis of asset integration: each contestant's stake accumulates from round to round within a game, and some contestants come back for repeat plays after winning substantial amounts. The game involves each contestant deciding in a given round whether to bet that the next card drawn from a deck will be higher or lower than some face card on display. Fig. 1 provides a rough idea of the layout of the Money Cards board before any face cards are shown. Fig. 2 provides a representation of the board from a computerized laboratory implementation [note 2] of Card Sharks. In Fig. 2 the subject has a face card with a 3, and is about to enter the first bet.

Fig. 1. Money Cards Board in Card Sharks.

Fig. 2. Money Cards Board from Lab Version of Card Sharks.

Cards are drawn without replacement from a standard 52-card deck, with no Jokers and with Aces high. Contestants decide on the relative value of the next card, and then on an amount to bet that their choice is correct. If they are correct their stake increases by the amount bet; if they are incorrect their stake is reduced by the amount bet; and if the new card is the same as the face card there is no change in the stake. Every contestant starts off with an initial stake of $200, and bets could be in increments of $50 of the available stake. After three rounds in the first, bottom row of cards, they move to the second, middle row and receive an additional $200 (or $400 in some versions). If the stake goes to zero in the first row, contestants go straight to the second row and receive the new stake; otherwise, the additional stake is added to what remains from row one. The second row includes three choices, just as in the first row. After these three choices, and if the stakes have not dropped to zero, they can play the final bet. In this case they have to bet at least one-half of their stake, but otherwise the betting works the same way. One feature of the game is that contestants sometimes have the option to switch face cards in the hope of getting one that is easier to win against [note 3].

The show aired in the United States in two major versions. The first, between April 1978 and October 1981, was on NBC and had Jim Perry as the host. The second, between January 1986 and March 1989, was on CBS and had Bob Eubanks as the host [note 4]. The maximum prize was $28,800 on the NBC version and $32,000 on the CBS version, and would be won if the contestant correctly bet the maximum amount in every round. This only occurred once. Using official inflation calculators [note 5] this converts into 2006 dollars between $89,138 and $63,936 for the NBC version, and between $58,920 and $52,077 for the CBS version.
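The maximum prizes quoted above follow directly from the betting rules, and a short calculation makes the arithmetic explicit. The sketch below is illustrative only: the function name and its parameters are hypothetical, and pairing the $200 second-row bonus with the NBC version and the $400 bonus with the CBS version is an assumption inferred from the two maximum prizes rather than something stated in the text.

```python
def max_money_cards_prize(second_row_bonus, start=200, bets_per_row=3):
    """Largest attainable stake when every bet is all-in and every bet wins.

    Hypothetical helper: `second_row_bonus` is the extra stake received on
    reaching the second row ($200 or $400 depending on the version).
    """
    stake = start
    for _ in range(bets_per_row):   # first (bottom) row: three bets
        stake *= 2                  # a winning all-in bet doubles the stake
    stake += second_row_bonus       # bonus added on moving to the second row
    for _ in range(bets_per_row):   # second (middle) row: three bets
        stake *= 2
    return stake * 2                # final bet, again all-in and won


# Assumption: $200 bonus on NBC, $400 bonus on CBS.
nbc_max = max_money_cards_prize(second_row_bonus=200)
cbs_max = max_money_cards_prize(second_row_bonus=400)
print(nbc_max, cbs_max)
```

Under these assumptions the two calls reproduce the $28,800 and $32,000 figures, which is a useful consistency check on the rule description.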

These stakes are actually quite modest in relation to contemporary game shows in the United States, such as DOND described below, which typically has a maximal stake of $1,000,000. Of course, maximal stakes can be misleading, since Card Sharks and DOND are both long shot lotteries. Average earnings in the CBS version used by Gertner (1993) were $4,677, which converts to between $8,611 and $7,611 in 2006, whereas average earnings in DOND have been $131,943 for the sample we report later (excluding a handful of special shows with significantly higher prizes).

1.1.1. Estimates of Risk Attitudes

The analysis of Gertner (1993) assumes a Constant Absolute Risk Aversion (CARA) utility function, since he did not have information on household wealth and viewed that as necessary to estimate a Constant Relative Risk Aversion (CRRA) utility function. We return to the issue of household wealth later. Gertner (1993) presents several empirical analyses. He initially (p. 511) focuses on the last round, and uses the optimal investment formula

$$b = \frac{\ln(p_{win}) - \ln(p_{lose})}{2a}$$

where the probabilities of winning and losing the bet $b$ are defined by $p_{win}$ and $p_{lose}$, and the utility function is $U(W) = -\exp(-aW)$ for wealth $W$ [note 6]. From observed bets he infers $a$.

There are several potential problems with this approach. First, there is an obvious sample selection problem from only looking at the last round, although this is not a major issue since relatively few contestants go bankrupt (less than 3%). Second, there is the serious problem of censoring at bets of 50% or 100% of the stake. Gertner (1993, p. 510) is well aware of the issue, and indeed motivates several analytical approaches to these data by a desire to avoid it: "Regression estimates of absolute risk aversion are sensitive to the distribution assumptions one makes to handle the censoring created by the constraints that a contestant must bet no more than her stake and at least half of her stake in the final round. Therefore, I develop two methods to estimate a lower bound on the level of risk aversion that do not rely on assumptions about the error distribution."

The first method he uses is just to assume that the censored responses are in fact the optimal response. The 50% bets are assumed to be optimal bets, when in fact the contestant might wish to bet less (but cannot due to the final-round betting rules); thus inferences from these responses will be biased towards showing less risk aversion than there might actually be. Conversely, contestants making 100% bets are assumed to be risk neutral, when in fact they might be risk lovers; thus inferences from these responses will be biased towards showing more risk aversion than there might actually be. Two wrongs do not make a right, although one does encounter such claims in empirical work. Of course, this approach still relies on exactly the same sort of assumptions about the interpretation of behavior, although not formalized in terms of an error distribution. And it is not apparent that the estimates will be lower bounds, since this censoring issue biases inferences in either direction. The average estimate of ARA to emerge is 0.000310, with a standard error of 0.000017, but it is not clear how one should interpret this estimate since it could be an overestimate or an underestimate.

The second approach is a novel and early application of simulation methods, which we will develop in greater detail below. A computer simulates optimal play by a risk-neutral agent playing the entire game 10 million times, recognizing that the cards are drawn without replacement. The computer does not appear to recognize the possibility of switching cards, but that is not central to the methodological point. The average return from this virtual lottery (VL) is $6,987 with a standard deviation of $10,843. It is not apparent that the lottery would have a Gaussian distribution of returns, but that can be allowed for in a more complete numerical analysis as we show later, and is again not central to the main methodological point.
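Returning to the final-round formula used in the first method, the following minimal sketch shows how a bet maps to an implied CARA coefficient and how the 50%/100% constraints censor the desired bet. It is an illustration under stated assumptions, not Gertner's code: all function names are hypothetical, and the win and lose probabilities assume that only the face card has left the deck, ignoring cards revealed in earlier rounds.

```python
import math


def fresh_deck_probs(face_rank, call_higher):
    """Win/lose/tie probabilities for one bet (ranks 2..14, Ace = 14).

    Assumption: only the face card is missing from the 52-card deck.
    """
    higher = 4 * (14 - face_rank)   # cards ranked above the face card
    lower = 4 * (face_rank - 2)     # cards ranked below the face card
    tie = 3                         # remaining cards of the same rank
    win, lose = (higher, lower) if call_higher else (lower, higher)
    return win / 51, lose / 51, tie / 51


def optimal_bet(p_win, p_lose, a):
    """Unconstrained bet b = [ln(p_win) - ln(p_lose)] / (2a) under CARA."""
    return (math.log(p_win) - math.log(p_lose)) / (2.0 * a)


def implied_ara(p_win, p_lose, bet):
    """Invert the formula: the coefficient a that rationalizes a given bet."""
    return (math.log(p_win) - math.log(p_lose)) / (2.0 * bet)


def expected_utility(bet, p_win, p_lose, p_tie, a, wealth=0.0):
    """Expected CARA utility with U(W) = -exp(-a W); a tie leaves W unchanged."""
    u = lambda w: -math.exp(-a * w)
    return p_win * u(wealth + bet) + p_lose * u(wealth - bet) + p_tie * u(wealth)


def censored_bet(bet, stake):
    """Final-round rule: bet at least half the stake and at most the stake."""
    return min(max(bet, 0.5 * stake), stake)


p_win, p_lose, p_tie = fresh_deck_probs(face_rank=5, call_higher=True)
desired = optimal_bet(p_win, p_lose, a=0.00031)
print(desired, censored_bet(desired, stake=1000.0))
```

When the desired bet lies outside the interval from half the stake to the full stake, the observed bet sits at the bound, and inverting the formula at that bound no longer recovers the contestant's coefficient. This is the censoring problem described above.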
The next step is to compare this distribution with the observed distribution of earnings, which was an average of $4,677 with a standard deviation of $4,258, and use a revealed preference argument to infer what risk attitudes must have been in play for this to have been the outcome instead of the VL: "A second approach is to compare the sample distribution of outcomes with the distribution of outcomes if a contestant plays the optimal strategy for a risk-neutral contestant. One can solve for the coefficient of absolute risk aversion that would make an individual indifferent between the two distributions. By revealed preference, an average contestant prefers the actual distribution to the expected-value maximizing strategy, so this is an estimate of the lower bound of constant absolute risk aversion" (pp. 511/512).

This approach is worth considering in more depth, because it suggests estimation strategies for a wide class of stochastic dynamic programming problems which we develop in Section 3. This exact method will not work once one moves beyond special cases such as risk neutrality, where outcomes and behavior in later rounds have no effect on optimal behavior in earlier rounds. But we will see that an extension of the method does generalize.

The comparison proposed here generates a lower bound on the ARA, rather than a precise estimate, since we know that an agent with an even higher ARA would also implicitly choose the observed distribution over the virtual risk-neutral (RN) distribution. Obviously, if one could generate VL distributions for a wide range of ARA values, it would be possible to refine this estimation step and select the ARA that maximizes the likelihood of the data. This is, in fact, exactly what we propose later as a general method for estimating risk attitudes in such settings. The ARA bound derived from this approach is 0.0000711, less than one-fourth of the estimate from the first method.

Gertner (1993, p. 512) concludes that "The Card Sharks data indicate a level of risk aversion higher than most existing estimates. Contestants do not seem to behave in a risk-loving and enthusiastic way because they are on television, because anything they win is gravy, or because the producers of the show encourage excessive risk-taking. I think this helps lend credence to the potential importance and wider applicability of the anomalous results I document below." His first method does not provide any basis for these claims, since risk loving is explicitly assumed away. His second method does indicate that the average player behaves as if risk averse, but there are no standard errors on that bound. Thus, one simply cannot say that it is statistically significant evidence of risk aversion.

1.1.2. EUT Anomalies

The second broad set of empirical analyses by Gertner (1993) considers a regression model of bets in the final round, and shows some alleged violations of expected utility theory (EUT). The model is a two-limit tobit specification, recognizing that bets at 50% and 100% may be censored.
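For readers unfamiliar with the two-limit tobit, the sketch below writes out the standard log-likelihood with observation-specific limits (half the stake and the full stake). It is a generic textbook form, assuming normally distributed errors with constant variance; the function names, the linear index `xb`, and the sample numbers are hypothetical and are not taken from Gertner's specification or data.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def two_limit_tobit_loglik(y, xb, sigma, lower, upper):
    """Log-likelihood of observed bets y given the latent index xb.

    lower[i] and upper[i] are the censoring limits for observation i,
    e.g. 0.5 * stake and stake in the final round.
    """
    total = 0.0
    for yi, mi, lo, hi in zip(y, xb, lower, upper):
        if yi <= lo:      # censored from below: P(latent bet <= lo)
            total += math.log(norm_cdf((lo - mi) / sigma))
        elif yi >= hi:    # censored from above: P(latent bet >= hi)
            total += math.log(norm_cdf((mi - hi) / sigma))
        else:             # interior bet: ordinary normal density
            total += norm_logpdf((yi - mi) / sigma) - math.log(sigma)
    return total


# Hypothetical data: three final-round bets with a $2,000 stake each.
stakes = [2000.0, 2000.0, 2000.0]
bets = [1000.0, 1400.0, 2000.0]
index = [900.0, 1400.0, 2300.0]   # made-up fitted values of the latent bet
ll = two_limit_tobit_loglik(bets, index, 300.0,
                            [0.5 * s for s in stakes], stakes)
print(ll)
```

In practice the index would be a linear function of the explanatory variables and the parameters would be chosen to maximize this function; the heteroskedasticity correction mentioned in the text would make `sigma` vary across observations.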
However, most of the settings in which contestants might rationally bet 50% or 100% are dropped. Bets with a face card of 2 or an Ace are dropped since they are sure things in the sense that the optimal bet cannot result in a loss (the bet is simply returned if the same card is then turned up). Similarly, bets with a face card of 8 are dropped, since contestants almost always bet the minimum. These deletions amount to 258 of the 844 observations, which is not a trivial sub-sample.

The regression model includes several explanatory variables. The central ones are cash and stake. Variable cash is the accumulated earnings by the contestant to that point over all repetitions of the game. So this includes previous plays of the game for champions, as well as earnings accumulated in rounds 1-6 of the current game. Variable stake is the accumulated earnings in the current game, so it excludes earnings from previous games. One might expect the correlation of stake and cash to be positive and high, since the average number of times the game is played in these data is 1.85 (= 844/457). Additional explanatory variables include a dummy for new players that are in their first game; the ratio of cash to the number of times the contestant has played the whole game (the ratio is 0 for new players); the value of any cars that have been won, given by the stated sticker price of the car; and dummy variables for each of the possible face card pairs (in this game a 3 is essentially the same as a King, a 4 the same as a Queen, etc.). The stake variable is included as an interaction with these face dummies, which are also included by themselves [note 7]. The model is estimated with or without a multiplicative heteroskedasticity correction, and the latter estimates are preferred. Card-counters are ignored when inferring probabilities of a win, and this seems reasonable as a first approximation.

Gertner (1993, Section VI) draws two striking conclusions from this model. The first is that stake is statistically significant in its interactions with the face cards. The second is that the cash variable is not significant. The first result is said to be inconsistent with EUT since earnings in this show are small in relation to wealth, and "The desired dollar bet should depend upon the stakes only to the extent that the stakes impact final wealth. Thus, risky decisions on Card Sharks are inconsistent with individuals maximizing a utility function over just final wealth. If one assumes that utility depends only on wealth, estimates of zero on card intercepts and significant coefficients on the stake variable imply that outside wealth is close to zero. Since this does not hold, one must reject utility depending only on final wealth" (p. 517).

This conclusion bears close examination. First, there is a substantial debate as to whether EUT has to be defined over final wealth, whatever that is, or can be defined just over outcomes in the choice task before the contestant (e.g., see Cox and Sadiraj (2006) and Harrison, Lau, and Rutström (2007) for references to the historical literature). So even if one concludes that the stake matters, this is not fatal for specifications of EUT defined over prizes, as clearly recognized by Gertner (1993, p. 519) in his reference to Markowitz (1952). Second, the deletion of all extreme bets likely leads to a significant understatement of uncertainty about coefficient estimates. Third, the regression does not correct for panel effects, and these could be significant since the variables cash and stake are correlated with the individual [note 8]. Hence their coefficient estimates might be picking up other, unobservable effects that are individual-specific.

The second result is also said to be inconsistent with EUT, in conjunction with the first result. The logic is that stake and cash should have an equal effect on terminal wealth, if one assumes perfect asset integration and that utility is defined over terminal wealth. But one has a significant effect on bets, and the other does not. Since the assumptions that utility is defined over terminal wealth and that asset integration is perfect are implicitly maintained by Gertner (1993, p. 517ff.), he concludes that EUT is falsified. However, one can include terminal wealth as an argument of utility without also assuming perfect asset integration (e.g., Cox & Sadiraj, 2006). This is also recognized explicitly by Gertner (1993, p. 519), who considers the possibility that contestants have multi-attribute utility functions, so that they care about something in addition to wealth [note 9]. Thus, if one accepts the statistical caveats about samples and specifications for now, these results point to the rejection of a particular, prominent version of EUT, but they do not imply that all popular versions of EUT are invalid.

1.2. Jeopardy!

In the game show Jeopardy! there is a subgame referred to as Final Jeopardy. At this point, three contestants have cash earnings from the initial rounds. The skill component of the game consists of hearing some text read out by the host, at which point the contestants jump in to state the question that the text provides the answer to [note 10]. In Final Jeopardy the contestants are told the general subject matter for the task, and then have to privately and simultaneously state a wager amount from their accumulated points. They can wager any amount up to their earned endowment at that point, and are rewarded with even odds: if they are correct they get that wager amount added, but if they are incorrect they have that amount deducted. The winner of the show is the contestant with the most cash after this final stage.
The winner gets to keep the earnings and come back the following day to try and continue as champion. In general, these wagers are affected by the risk attitudes of contestants. But they are also affected by their subjective beliefs about their own skill level relative to the other two contestants, and by what they think the other contestants will do. So this game cannot be fully analyzed without making some game-theoretic assumptions.

Jeopardy! was first aired in the United States in 1964, and continued until 1975. A brief season returned between 1978 and 1979, and then the modern era began in 1984 and continues to this day. The format changes have been relatively small, particularly during the modern era. The data used by Metrick (1995) come from shows broadcast between October 1989 and January 1992, and reflect more than 1,150 decisions.

Metrick (1995) examines behavior in Final Jeopardy in two stages [note 11]. The first stage considers the subset of shows in which one contestant is so far ahead in cash that the bet only reveals risk attitudes and beliefs about own skill. In such runaway games there exist wagers that will ensure victory, although there might be some rationale prior to September 2003 for someone to bet an amount that could lead to a loss. Until then, the champion had to retire after five wins, so if one had enough confidence in one's skill at answering such questions, one might rationally bet more than was needed to ensure victory. After September 2003 the rules changed, so the champion stays on until defeated.

In the runaway games Metrick (1995, p. 244) uses the same formula that Gertner (1993) used for CARA utility functions. The only major difference is that the probability of winning in Jeopardy! is not known objectively to the observer [note 12]. His solution is to substitute the observed fraction of correct answers, akin to a rational expectations assumption, and then solve for the CARA parameter a that accounts for the observed bets. The result is an estimate of a equal to 0.000066 with a standard error of 0.000056. Thus, there is slight evidence of risk aversion, but it is not statistically significant, leading Metrick (1995, p. 245) to conclude that these contestants behaved in a risk-neutral manner.

The second stage of the analysis considers subsamples in which two players have accumulated scores that are sufficiently close that they have to take beliefs about the other into account, but where there is a distant third contestant who can be effectively ignored.
Metrick (1995) cuts this Gordian knot of strategic considerations by assuming that contestants view themselves as betting against contestants whose behavior can be characterized by their observed empirical frequencies. He does not use these data to make inferences about risk attitudes.

1.3. Lingo

The underlying game in Lingo involves a team of two people guessing a hidden five-letter word. Fig. 3 illustrates one such game from the U.S. version. The team is told the first letter of the word, and can then just state words. If incorrect, the words that are tried are used to reveal letters in the correct word if there are any. To take the example in Fig. 3, the true word was STALL. So the initial S was shown. The team suggested SAINT and was informed (by light grey coloring) that A and T are present in the correct word. The team is not told the order of the letters A and T in the correct word. The team then suggested STAKE, and was informed that the T and A were in the right place (by grey coloring) and that no other letters were in the correct word. The team then tried STAIR, SEATS, and finally STALL. Most teams are able to guess the correct word in five rounds.

Fig. 3. The Word Puzzle in Lingo.

The game occurs in two stages. In the first stage, one team of two plays against another team for several of these Lingo word-guessing games. The couple with the most money then goes on to the second stage, which is the one of interest for measuring risk attitudes because it is non-interactive. So the winning couple comes into the main task with a certain earned endowment (which could be augmented by an unrelated game called jackpot). The team also comes in with some knowledge of its own ability to solve these word-guessing puzzles. In the Dutch data used by Beetsma and Schotman (2001), spanning 979 games, the frequency distribution of the number of solutions across rounds 1-5 in the final stage was 0.14, 0.32, 0.23, 0.13, 0.081, and 0.089, respectively. Every round that the couple requires to guess the word means that they have to pick one ball from an urn affecting their payoffs, as described below. If they do not solve the word puzzle, they have to pick six balls. These balls determine if the team goes bust or survives something called the Lingo Board in that round.

An example of the Lingo Board is shown in Fig. 4, from Beetsma and Schotman (2001, Fig. 3) [note 13]. There are 35 balls in the urn numbered from 1 to 35, plus one golden ball. If the golden ball is picked then the team wins the cash prize for that round and gets a free pass to the next round. If one of the numbered balls is picked, then the fate of the team depends on the current state of the Lingo Board. The team goes bust if they get a row, column, or diagonal of X's, akin to the parlor game noughts and crosses. So solving the word puzzle in fewer moves is good, since it means that fewer balls have to be drawn from the urn, and hence that the survival probability is higher. In the example from Fig. 4, drawing a 5 would be fatal, drawing an 11 would not be, and drawing a 1 would not be if a 2 or 8 had not been previously drawn.

Fig. 4. Example of a Lingo Board.

If the team survives a round it gets a cash prize, and is asked if they want to keep going or stop. This lasts for five rounds. So apart from the skill part of the game, guessing the words, this is the only choice the team makes. This is therefore a stop-go problem, in which the team balances current earnings with the lottery of continuing and either earning more cash or going bust. If the team chooses to continue the stake doubles; if the golden ball had been drawn it is replaced in the urn. If the team goes bust it takes home nothing. Teams can play the game up to three times, then retire from the show.

Risk attitudes are involved when the team has to balance the current earnings with the lottery of continuing. That lottery depends on subjective beliefs about the skill level of the team, the state of the Lingo Board at that point, and the perception of the probabilities of drawing a fatal number or the golden ball. In many respects, apart from the skill factor and the relative symmetry of prizes, this game is remarkably like DOND, as we see later.

Beetsma and Schotman (2001) evaluate data from 979 finals. Each final lasts several rounds, so the sample of binary stop/continue decisions is larger, and constitutes a panel. Average earnings in this final round in their sample are 4,106 Dutch guilders (f), with potential earnings, given the initial stakes brought into the final, of around f 15,136. The average exchange rate in 1997, which is roughly the period these data are from, was about $0.514 per guilder, so these stakes are around $2,110 on average, and up to roughly $7,780. These are not life-changing prizes, like the top prizes in DOND, but are clearly substantial in relation to most lab experiments.

Beetsma and Schotman (2001, Section 4) show that the stop/continue decisions have a simple monotonic structure if one assumes CRRA or CARA utility. Since the odds of surviving never get better with more rounds, if it is optimal to stop in one round then it will always be optimal to stop in any later round. This property does not necessarily hold for other utility functions. But for these utility functions, which are still an important class, one can calculate a threshold survival probability $p_i^*$ for any round $i$ such that the team should stop if the actual survival probability falls below it. This threshold probability does depend on the utility function and parameter values for it, but in a closed-form fashion that can be easily evaluated within a maximum-likelihood routine.
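To give a feel for what a threshold survival probability looks like, the sketch below computes one for a deliberately simplified, one-step version of the stop-go choice under CRRA utility. This is not the closed form derived by Beetsma and Schotman: it assumes the team either keeps the current prize or continues once, doubling the prize with the survival probability and losing it otherwise, and it ignores the golden ball, later rounds, and the option value of returning to the show. All function names are hypothetical.

```python
import math


def crra_utility(c, r):
    """CRRA utility of consumption c with relative risk aversion r."""
    if c <= 0.0 and r >= 1.0:
        raise ValueError("need positive wealth when r >= 1")
    if abs(r - 1.0) < 1e-12:
        return math.log(c)                  # limiting case r = 1
    return c ** (1.0 - r) / (1.0 - r)


def myopic_stop_threshold(r, prize, wealth=0.0):
    """Survival probability at which stopping and continuing once are equal.

    Stop:      wealth + prize for sure.
    Continue:  wealth + 2 * prize with probability p, wealth otherwise.
    The team stops when its survival probability is below the returned value.
    """
    u_bust = crra_utility(wealth, r)
    u_stop = crra_utility(wealth + prize, r)
    u_double = crra_utility(wealth + 2.0 * prize, r)
    return (u_stop - u_bust) / (u_double - u_bust)


# Risk neutrality (r = 0) gives 0.5; the reported CRRA estimates give more.
print(myopic_stop_threshold(0.0, 4106.0))
print(myopic_stop_threshold(0.42, 4106.0))
print(myopic_stop_threshold(6.99, 4106.0, wealth=50000.0))
```

With utility defined over prizes only (zero baseline wealth) the threshold in this simplified setting reduces to $2^{r-1}$, so it does not depend on the size of the prize; with positive baseline wealth it does.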
¹⁴ Each team can play the game three times before it has to retire as a champion. The specification of the problem clearly recognizes the option value in the first game of coming back to play the game a second or third time, and then the option value in the second game of coming back to play a third time. The certainty-equivalent of these option values depends, of course, on the risk attitudes of the team. But the estimation procedure "black boxes" these option values to collapse the estimation problem down to a static one: they are free parameters to be estimated along with the parameter of the utility function. Thus, they are not constrained by the expected returns and risk of future games, the functional form of utility, and the specific parameter values being evaluated in the maximum-likelihood routine. Beetsma and Schotman (2001, p. 839) do clearly check that the option value in the first game exceeds the option value in the second game, but (a) they only examine point estimates, and make no claim that this

difference is statistically significant,¹⁵ and (b) there is no check that the absolute values of these option values are consistent with the utility function and parameter values. In addition, there is no mention of any corrections for the fact that each team makes several decisions, and that errors for that team are likely correlated.

With these qualifications, the estimate of the CRRA parameter is 0.42, with a standard error of 0.05, if one assumes that utility is only defined over the monetary prizes. It rises to 6.99, with a standard error of 0.72, if one assumes a baseline wealth level of f 50,000, which is the preferred estimate. Each of these estimates is significantly different from 0, implying rejection of risk neutrality in favor of risk aversion. The CARA specification generates comparable estimates.

One extension is to allow for probability weighting on the actual survival probability \(p_i\) in round i. The weighting occurs in the manner of original Prospect Theory, due to Kahneman and Tversky (1979), and not in the rank-dependent manner of Quiggin (1982, 1993) and Cumulative Prospect Theory. One apparent inconsistency is that the actual survival probabilities are assumed to be weighted subjectively, but the threshold survival probabilities \(p_i^{*}\) are not, which seems odd (see their Eq. (18), p. 843). The results show that estimates of the degree of concavity of the utility function increase substantially, and that contestants systematically overweight the actual survival probability. We return to some of the issues of structural estimation of models assuming decision weights, in a rank-dependent manner, in the discussion of DOND and Andersen, Harrison, Lau, and Rutström (2006a, 2006b).

2. DEAL OR NO DEAL

2.1. The Game Show as a Natural Experiment

The basic version of DOND is the same across all countries.
We explain the general rules by focusing on the version shown in the United States, and then consider variants found in other countries. The show confronts the contestant with a sequential series of choices over lotteries, and asks for a simple binary decision: whether to play the (implicit) lottery or take some deterministic cash offer. A contestant is picked from the studio audience. They are told that a known list of monetary prizes, ranging from $0.01 up to $1,000,000, has been placed in 26 suitcases.¹⁶ Each suitcase is carried onstage by attractive female models, and has a number from 1 to

26 associated with it. The contestant is informed that the money has been put in the suitcase by an independent third party, and in fact it is common that any unopened cases at the end of play are opened so that the audience can see that all prizes were in play. Fig. 5 shows how the prizes are displayed to the subject at the beginning of the game.

Fig. 5. Opening Display of Prizes in TV Game Show Deal or No Deal.

The contestant starts by picking one suitcase that will be "his" case. In round 1, the contestant must pick 6 of the remaining 25 cases to be opened, so that their prizes can be displayed. Fig. 6 shows how the display changes after the contestant picks the first case: in this case the contestant unfortunately picked the case containing the $300,000 prize. A good round for a contestant occurs if the opened prizes are low, and hence the odds increase that his case holds the higher prizes. At the end of each round the host is phoned by a "banker" who makes a deterministic cash offer to the contestant. In one of the first American shows (12/21/2005) the host made a point of saying clearly that "I don't know what's in the suitcases, the banker doesn't, and the models don't."

The initial offer in early rounds is typically low in comparison to expected offers in later rounds. We use an empirical offer function later, but the qualitative trend is quite clear: the bank offer starts out at roughly 10% of

the expected value of the unopened cases, and increases by about 10% of that expected value for each round. This trend is significant, and serves to keep all but extremely risk-averse contestants in the game for several rounds. For this reason, it is clear that the case that the contestant "owns" has an option value in future rounds.

Fig. 6. Prizes Available After One Case Has Been Opened.

In round 2, the contestant must pick five cases to open, and then there is another bank offer to consider. In succeeding rounds, 3–10, the contestant must open 4, 3, 2, 1, 1, 1, 1, and 1 cases, respectively. At the end of round 9, there are only two unopened cases, one of which is the contestant's case.

In round 9 the decision is a relatively simple one from an analyst's perspective: either take the non-stochastic cash offer or take the lottery with a 50% chance of either of the two remaining unopened prizes. We could assume some latent utility function, and estimate parameters for that function that best explain observed binary choices. Unfortunately, relatively few contestants get to this stage, having accepted offers in earlier rounds. In our data, only 9% of contestants reach that point. More serious than the smaller sample size, one naturally expects that risk attitudes would affect those surviving to this round. Thus, there would be a serious sample attrition bias if one just studied choices in later rounds.
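The round structure just described can be checked with a few lines of arithmetic. The sketch below tallies the unopened cases after each round of the U.S. version and encodes a stylized reading of the offer path (roughly 10% of expected value in round 1, rising by about 10 percentage points per round). The stylized fraction is an assumption made for illustration only; the analysis in the text relies on an empirical offer function.

```python
# Cases opened in rounds 1-10 of the U.S. version, as described in the text.
CASES_OPENED = [6, 5, 4, 3, 2, 1, 1, 1, 1, 1]
TOTAL_CASES = 26


def unopened_after_each_round(total=TOTAL_CASES, schedule=CASES_OPENED):
    """Unopened cases (including the contestant's own) after each round."""
    remaining = total
    counts = []
    for opened in schedule:
        remaining -= opened
        counts.append(remaining)
    return counts


def stylized_offer_fraction(round_number):
    """Stylized offer path (an assumption): 10% of expected value in round 1,
    rising by 10 percentage points per round, capped at 100%."""
    return min(1.0, 0.10 * round_number)


for rnd, left in enumerate(unopened_after_each_round(), start=1):
    print(rnd, left, stylized_offer_fraction(rnd))
```

The tally leaves two unopened cases at the end of round 9 and only the contestant's own case after round 10, matching the description above.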

The bank offer gets richer and richer over time, ceteris paribus the random realizations of opened cases. In other words, if each unopened case truly has the same subjective probability of having any remaining prize, there is a positive expected return to staying in the game for more and more rounds. A risk-averse subject that might be just willing to accept the bank offer, if the offer were not expected to get better and better, would choose to continue to another round since the expected improvement in the bank offer provides some compensation for the additional risk of going into another round.

Thus, to evaluate the parameters of some latent utility function given observed choices in earlier rounds, we have to mentally play out all possible future paths that the contestant faces.¹⁷ Specifically, we have to play out those paths assuming the values for the parameters of the likelihood function, since they affect when the contestant will decide to "deal" with the banker, and hence the expected utility of the compound lottery. This corresponds to procedures developed in the finance literature to price path-dependent derivative securities using Monte Carlo simulation (e.g., Campbell, Lo, & MacKinlay, 1997, Section 9.4). We discuss general numerical methods for this type of analysis later.

Saying "no deal" in early rounds provides one with the option of being offered a better deal in the future, ceteris paribus the expected value of the unopened prizes in future rounds. Since the process of opening cases is a martingale process, even if the contestant gets to pick the cases to be opened, it has a constant future expected value in any given round equal to the current expected value. This implies, given the exogenous bank offers (as a function of expected value), that the dollar value of the offer will get richer and richer as time progresses. Thus, bank offers themselves will be a submartingale process.

In the U.S. 
version the contestants are joined after the first round by several family members or friends, who offer suggestions and generally add to the entertainment value. But the contestant makes the decisions. For example, in the very first show a lady was offered $138,000, and her hyperactive husband repeatedly screamed out "no deal!" She calmly responded, "At home, you do make the decisions. But ... we're not at home!" She turned the deal down, as it happens, and went on to take an offer of only $25,000 two rounds later.

Our sample consists of 141 contestants recorded between December 19, 2005 and May 6, 2007. This sample includes 6 contestants who participated in special versions, for ratings purposes, in which the top prize was increased from $1 million to $2 million, $3 million, $4 million, $5 million or $6 million.¹⁸ The biggest winner on the show so far has been Michelle Falco, who was lucky enough to be on the September 22, 2006 show with a top prize

of $6 million. Her penultimate offer was $502,000 when the 3 unopened prizes were $10, $750,000 and $1 million, which has an expected value of $583,337. She declined the offer, and opened the $10 case, resulting in an offer of $808,000 when the expected value of the two remaining prizes was $875,000. She declined the offer, and ended up with $750,000 in her case.

In other countries there are several variations. In some cases there are fewer prizes, and fewer rounds. In the United Kingdom there are only 22 monetary prizes, ranging from 1p up to £250,000, and only 7 rounds. In round 1 the contestant must pick 5 boxes, and then in each round until round 6 the contestant has to open 3 boxes per round. So there can be a considerable swing from round to round in the expected value of unopened boxes, compared to the last few rounds of the U.S. version. At the end of round 6 there are only 2 unopened boxes, one of which is the contestant's box.

Some versions substitute the option of switching the contestant's box for an unopened box, instead of a bank offer. This is particularly common in the French and Italian versions, and relatively rare in other versions.

Things become much more complex in those versions in which the bank offer in any round is statistically informative about the prize in the contestant's case. In that case the contestant has to make some correction for this possibility, and also consider the strategic behavior of the banker's offer. Bombardini and Trebbi (2005) offer clear evidence that this occurs in the Italian version of the show, but there is no evidence that it occurs in the U.K. version.

The Australian version offers several additional options at the end of the normal game, called Chance, SuperCase, and Double Or Nothing. In many cases they are used as "entertainment filler," for games that otherwise would finish before the allotted 30 min. 
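The figures quoted above for Michelle Falco's last two offers, and the box count for the U.K. version, follow from simple arithmetic. The short sketch below reproduces them; the variable names are illustrative only.

```python
def expected_value(prizes):
    """Mean of the unopened prizes, each treated as equally likely."""
    return sum(prizes) / len(prizes)


# Michelle Falco's last two decisions, using the prizes quoted in the text.
ev_penultimate = expected_value([10, 750_000, 1_000_000])
ev_final = expected_value([750_000, 1_000_000])

# Bank offers as a share of the expected value of the unopened prizes.
share_penultimate = 502_000 / ev_penultimate
share_final = 808_000 / ev_final

# U.K. version: 22 boxes, 5 opened in round 1, then 3 per round in rounds 2-6.
uk_unopened_after_round_6 = 22 - 5 - 3 * 5

print(round(ev_penultimate), ev_final)
print(round(share_penultimate, 3), round(share_final, 3))
print(uk_unopened_after_round_6)
```

Both offers were below the expected value of the remaining prizes, with the final offer a larger share of it than the penultimate one, in line with the rising offer path described earlier.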
It has been argued, most notably by Mulino, Scheelings, Brooks, and Faff (2006), that these options should rationally change behavior in earlier rounds, since they provide some uncertain "insurance" against saying "deal" earlier rather than later.

2.2. Comparable Laboratory Experiments

We also implemented laboratory versions of the DOND game, to complement the natural experimental data from the game shows.¹⁹ The instructions were provided by hand and read out to subjects to ensure that every subject took some time to digest them. As far as possible, they rely on screen shots of the software interface that the subjects were to use to enter their choices. The opening page for the common practice session in the lab, shown in Fig. 7, provides the subject with basic information about the task

before them, such as how many boxes there were and how many boxes needed to be opened in any round.²⁰ In the default setup the subject was given the same frame as in the Australian and U.S. game shows: this version has more prizes (26 instead of 22) and more rounds (9 instead of 6) than the U.K. version.

Fig. 7. Opening Screen Shot for Laboratory Experiment.

After clicking on the "Begin" box, the lab subject was given the main interface, shown in Fig. 8. This provided the basic information for the DOND task. The presentation of prizes was patterned after the displays used on the actual game shows. The prizes are shown in the same nominal denomination as the Australian daytime game show, and the subject was told that an exchange rate of 1,000:1 would be used to convert earnings in the DOND task into cash payments at the end of the session. Thus, the top cash prize the subject could earn was $200 in this version.

The subject was asked to click on a box to select "his box," and then round 1 began. In the instructions we illustrated a subject picking box #26, and then six boxes, so that at the end of round 1 he was presented with a deal from the banker, shown in Fig. 9. The prizes that had been opened in round 1 were shaded on the display, just as they are in the game show display. The subject is then asked to accept $4,000 or continue. When the

game ends the DOND task earnings are converted to cash using the exchange rate, and the experimenter is prompted to come over and record those earnings. Each subject played at their own pace after the instructions were read aloud.

Fig. 8. Prize Distribution and Display for Laboratory Experiment.

One important feature of the experimental instructions was to explain how bank offers would be made. The instructions explained the concept of the expected value of unopened prizes, using several worked numerical examples in simple cases. Then subjects were told that the bank offer would be a fraction of that expected value, with the fractions increasing over the rounds as displayed in Fig. 10. This display was generated from Australian game show data available at the time. We literally used the parameters defining the function shown in Fig. 10 when calculating offers in the experiment, and then rounded to the nearest dollar.
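As a rough illustration of the offer rule explained to subjects, the sketch below computes a bank offer as a fraction of the expected value of the unopened prizes, rounds it to the nearest dollar, and converts task earnings to cash at the 1,000:1 exchange rate. The fraction passed in is hypothetical: the parameters behind Fig. 10 are not reproduced in the text, so no attempt is made to match them.

```python
def lab_bank_offer(unopened_prizes, fraction):
    """Bank offer in task dollars: a fraction of expected value, rounded.

    `fraction` is a hypothetical input standing in for the round-specific
    value read off the offer path. Python's round() sends exact halves to
    the nearest even integer, which may differ from the software used in
    the experiment.
    """
    expected_value = sum(unopened_prizes) / len(unopened_prizes)
    return round(fraction * expected_value)


def cash_payment(task_earnings, exchange_rate=1000):
    """Convert DOND task earnings into cash at the stated 1,000:1 rate."""
    return task_earnings / exchange_rate


# The top nominal prize of 200,000 pays $200; an accepted $4,000 offer pays $4.
print(cash_payment(200_000), cash_payment(4_000))
```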

Fig. 9. Typical Bank Offer in Laboratory Experiment.

The subjects for our laboratory experiments were recruited from the general student population of the University of Central Florida in 2006.²¹ We have information on 676 choices made by 89 subjects.

We estimate the same models for the lab data as for the U.S. game show data. We are not particularly interested in getting the same quantitative estimates per se, since the samples, stakes, and context differ in obvious ways. Instead our interest is whether we obtain the same qualitative results: is the lab reliable in terms of the qualitative inferences one draws from it? Our null hypothesis is that the lab results are the same as the naturally occurring results. If we reject this hypothesis one could infer that we have just not run the right lab experiments in some respect, and we have some sympathy for that view. On the other hand, we have implemented our lab experiments in exactly the manner that we would normally do as lab experimenters. So we

are definitely able to draw conclusions in this domain about the reliability of conventional lab tests compared to comparable tests using naturally occurring data. These conclusions would then speak to the questions raised by Harrison and List (2004) and Levitt and List (2007) about the reliability of lab experiments.

Fig. 10. Information on Bank Offers in Laboratory Experiment. [Plot titled "Path of Bank Offers": bank offer as a fraction of expected value of unopened cases (vertical axis, 0 to 1) against round (horizontal axis, 1 to 9).]

2.3. Other Analyses of Deal or No Deal

A large literature on DOND has evolved quickly.²² Appendix B in the working paper version documents in detail the modeling strategies adopted in the DOND literature, and similarities and differences to the approach we propose.²³ In general, three types of empirical strategies have been employed to model observed DOND behavior.

The first empirical strategy is the calculation of CRRA bounds at which a given subject is indifferent between one choice and another. These bounds can be calculated for each subject and each choice, so they have the advantage of not assuming that each subject has the same risk preferences, just that they use the same functional form. The studies differ in terms of

how they use these bounds, as discussed briefly below. The use of bounds such as these is familiar from the laboratory experimental literature on risk aversion: see Holt and Laury (2002), Harrison, Johnson, McInnes, and Rutström (2005), and Harrison, Lau, Rutström, and Sullivan (2005) for discussion of how one can then use interval regression methods to analyze them. The limitation of this approach, discussed in Harrison and Rutström (2008, Section 2.1), is that it is difficult to go beyond the CRRA or other one-parameter families, and in particular to examine other components of choice under uncertainty (such as more flexible utility functions, preference weighting or loss aversion).²⁴ Post, van den Assem, Baltussen, and Thaler (2006) use CRRA bounds in their analysis, and it has been employed in various forms by others as noted below.

The second empirical strategy is the examination of specific choices that provide "trip wire" tests of certain propositions of EUT, or provide qualitative indicators of preferences. For example, decisions made in the very last rounds often confront the contestant with the expected value of the unopened prizes, and allow one to identify those who are risk loving or risk averse directly. The limitation of this approach is that these choices are subject to sample selection bias, since risk attitudes and other preferences presumably played some role in whether the contestant reached these critical junctures. Moreover, they provide limited information at best, and do not allow one to define a metric for errors. If we posit some stochastic error specification for choices, as is now common, then one has no way of knowing if these specific choices are the result of such errors or a manifestation of latent preferences. Blavatskyy and Pogrebna (2006) illustrate the sustained use of this type of empirical strategy, which is also used by other studies in some respects.
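The first strategy, CRRA bounds, can be illustrated with the simplest case: a final-round choice between a sure offer and a 50/50 lottery over two prizes. The sketch below bisects for the CRRA coefficient at which the certainty equivalent of the lottery equals the offer. It assumes utility defined over prizes alone (no baseline wealth), equal odds, and no choice errors, and it does not handle earlier rounds, where future offers matter. The function names are hypothetical, and the example inputs are the prizes and offer quoted earlier for Michelle Falco's final decision.

```python
import math


def crra_certainty_equivalent(prizes, r):
    """Certainty equivalent of an equal-odds lottery over positive prizes."""
    scale = min(prizes)                 # CRRA is scale-invariant; rescale first
    scaled = [p / scale for p in prizes]
    if abs(r - 1.0) < 1e-9:             # log utility at r = 1
        ce = math.exp(sum(math.log(p) for p in scaled) / len(scaled))
    else:
        mean_u = sum(p ** (1.0 - r) for p in scaled) / len(scaled)
        ce = mean_u ** (1.0 / (1.0 - r))
    return ce * scale


def indifference_crra(prizes, offer, lo=0.0, hi=50.0, tol=1e-8):
    """Bisect for the r at which the certainty equivalent equals the offer."""
    ce_lo = crra_certainty_equivalent(prizes, lo)
    ce_hi = crra_certainty_equivalent(prizes, hi)
    if not (ce_hi < offer < ce_lo):
        raise ValueError("offer is not bracketed on the interval [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if crra_certainty_equivalent(prizes, mid) > offer:
            lo = mid                    # lottery still preferred: raise r
        else:
            hi = mid
    return 0.5 * (lo + hi)


bound = indifference_crra([750_000, 1_000_000], 808_000)
print(bound)
```

Under these assumptions, declining the offer implies a CRRA coefficient below the computed bound, and accepting it would imply one above it. This is the sense in which each observed choice brackets the parameter.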
The third empirical strategy is to propose a latent decision process and estimate the structural parameters of that process using maximum likelihood. This is the approach we favor, since it allows one to examine structural issues rather than rely on ad hoc proxies for underlying preferences. Harrison and Rutström (2008, Section 2.2) discuss the general methodological advantages of this approach.

3. A GENERAL ESTIMATION STRATEGY

The DOND game is a dynamic stochastic task in which the contestant has to make choices in one round that generally entail consideration of future consequences. The same is true of the other game shows used for estimation

of risk attitudes. In Card Sharks the level of bets in one round generally affects the scale of bets available in future rounds, including bankruptcy, so for plausible preference structures one should take this effect into account when deciding on current bets. Indeed, as explained earlier, one of the empirical strategies employed by Gertner (1993) can be viewed as a precursor to our general method. In Lingo the stop/continue structure, where a certain amount of money is being compared to a virtual money lottery, is evident. We propose a general estimation strategy for such environments, and apply it to DOND. The strategy uses randomization to break the general "curse of dimensionality" that is evident if one considers this general class of dynamic programming problems (Rust, 1997).

3.1. Basic Intuition

The basic logic of our approach can be explained from the data and simulations shown in Table 1. We restrict attention here to the first 75 contestants that participated in the standard version of the television game with a top prize of $1 million, to facilitate comparison of dollar amounts. There are nine rounds in which the banker makes an offer, and in round 10 the contestant simply opens his case. Only 7 contestants, or 9% of the sample of 75, continued to round 10, with most accepting the banker's offer in rounds 6, 7, 8, and 9. The average offer is shown in column 4. We stress that this offer is stochastic from the perspective of the sample as a whole, even if it is non-stochastic to the specific contestant in that round. Thus, to see the logic of our approach from the perspective of the individual decision-maker, think of the offer as a non-stochastic number, using the average values shown as a proximate indicator of the value of that number in a particular instance.

In round 1 the contestant might consider up to nine virtual lotteries (VLs).
He might look ahead one round and contemplate the outcomes he would get if he turned down the offer in round 1 and accepted the offer in round 2. This VL, realized in virtual round 2 in the contestant's thought experiment, would generate an average payoff of $31,141 with a standard deviation of $23,655. The top panel of Fig. 11 shows the simulated distribution of this particular lottery. The distributions of payoffs to these VLs are highly skewed, so the standard deviation may be slightly misleading if one thinks of these as Gaussian distributions. However, we just use the standard deviation as one pedagogic indicator of the uncertainty of the payoff in the VL: in our formal analysis we consider the complete distribution of the VL in a nonparametric manner.
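A virtual lottery of this kind can be approximated by simulation. The sketch below is a simplified illustration, not the authors' procedure: it assumes the standard U.S. prize list (the text gives only its range and size), replaces the empirical offer function with a hypothetical rule of 10% of expected value per round, picks a hypothetical set of six prizes opened in round 1, and assumes a CRRA utility function for the comparison at the end. It also omits round 10, where the contestant opens his own case rather than receiving an offer.

```python
import random
import statistics

# Assumption: the standard U.S. prize list, 26 prizes from $0.01 to $1,000,000.
US_PRIZES = [0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
             1_000, 5_000, 10_000, 25_000, 50_000, 75_000, 100_000,
             200_000, 300_000, 400_000, 500_000, 750_000, 1_000_000]
CASES_OPENED = [6, 5, 4, 3, 2, 1, 1, 1, 1]  # cases opened in rounds 1-9


def offer_fraction(round_number):
    """Hypothetical stand-in for the empirical offer function."""
    return 0.10 * round_number


def simulate_virtual_lottery(unopened, current_round, target_round,
                             n_draws, rng):
    """Simulated offers in `target_round` if all earlier offers are declined."""
    n_left = len(unopened) - sum(CASES_OPENED[current_round:target_round])
    share = offer_fraction(target_round)
    # Sampling the survivors uniformly is the same as opening cases at random.
    return [share * statistics.fmean(rng.sample(unopened, n_left))
            for _ in range(n_draws)]


def crra_utility(x, r):
    """CRRA utility for r != 1 (an assumed functional form)."""
    return x ** (1.0 - r) / (1.0 - r)


def prefers_to_continue(current_offer, vl_offers, r):
    """True if the virtual lottery's expected utility beats the sure offer."""
    expected_u = statistics.fmean([crra_utility(x, r) for x in vl_offers])
    return expected_u > crra_utility(current_offer, r)


rng = random.Random(2006)
opened_in_round_1 = {400, 500, 750, 1_000, 5_000, 10_000}  # hypothetical
unopened = [p for p in US_PRIZES if p not in opened_in_round_1]
offer_now = offer_fraction(1) * statistics.fmean(unopened)

for target in range(2, 10):
    vl = simulate_virtual_lottery(unopened, 1, target, 5_000, rng)
    print(target, round(statistics.fmean(vl)), round(statistics.stdev(vl)))
```

Because the expected value of the unopened prizes is unchanged by opening cases at random, the mean of each simulated lottery is driven by the offer fraction, while the dispersion grows as fewer cases remain. The numbers will not match Table 1, which rests on observed game states and the empirical offer function.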

Table 1. Virtual Lotteries for US Deal or No Deal Game Show.

                               Looking at Virtual Lottery Realized in ...
Round  Active  Deal! Avg. offer    Round 2    Round 3    Round 4    Round 5    Round 6    Round 7    Round 8    Round 9   Round 10
    1      75      0    $16,180    $31,141    $53,757    $73,043    $97,275   $104,793   $120,176   $131,165   $136,325   $136,281
         100%                    ($23,655)  ($45,996)  ($66,387) ($107,877) ($102,246) ($121,655) ($154,443) ($176,425) ($258,856)
    2      75      0    $33,453               $53,535    $72,588    $96,887   $104,369   $119,890   $130,408   $135,877   $135,721
         100%                               ($46,177)  ($66,399) ($108,086) ($102,222) ($121,492) ($133,239) ($175,278) ($257,049)
    3      75      0    $54,376                          $73,274    $97,683   $105,117   $120,767   $131,563   $136,867   $136,636
         100%                                          ($65,697) ($107,302) ($101,271) ($120,430) ($153,058) ($173,810) ($255,660)
    4      75      1    $75,841                                     $99,895   $107,290   $123,050   $134,307   $139,511   $139,504
         100%                                                    ($108,629) ($101,954) ($120,900) ($154,091) ($174,702) ($257,219)
    5      74      5   $103,188                                               $111,964   $128,613   $140,275   $145,710   $145,757
          99%                                                               ($106,137) ($126,097) ($160,553) ($180,783) ($266,303)
    6      69     16   $112,818                                                          $128,266   $139,774   $145,348   $145,301
          92%                                                                          ($124,945) ($159,324) ($180,593) ($266,781)
    7      53     20   $119,746                                                                     $136,720   $142,020   $142,323
          71%                                                                                     ($154,973) ($170,118) ($246,044)
    8      33     16   $107,779                                                                                $116,249   $116,020
          44%                                                                                                ($157,005) ($223,979)
    9      17     10    $79,363                                                                                            $53,929
          23%                                                                                                           ($113,721)
   10       7
           9%

Note: Data drawn from observations of contestants on the U.S. game show, plus authors' simulations of virtual lotteries as explained in the text. "Active" is the number of contestants still active in each round, with its share of the 75 contestants shown beneath; standard deviations of the virtual lotteries are in parentheses.