The Two Envelopes Problem



Similar documents
An Innocent Investigation

Inflation. Chapter Money Supply and Demand

arxiv: v1 [math.pr] 5 Dec 2011

Session 7 Fractions and Decimals

Pascal is here expressing a kind of skepticism about the ability of human reason to deliver an answer to this question.

The Deadly Sins of Algebra

Decision Making under Uncertainty

6.042/18.062J Mathematics for Computer Science December 12, 2006 Tom Leighton and Ronitt Rubinfeld. Random Walks

If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

Unit 1 Number Sense. In this unit, students will study repeating decimals, percents, fractions, decimals, and proportions.

Constrained optimization.

Practical Probability:

The Basics of Graphical Models

Two-State Options. John Norstad. January 12, 1999 Updated: November 3, 2011.

The Reinvestment Assumption Dallas Brozik, Marshall University

CHAPTER 2 Estimating Probabilities

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Revised Version of Chapter 23. We learned long ago how to solve linear congruences. ax c (mod m)

Lecture 8 The Subjective Theory of Betting on Theories

9.2 Summation Notation

Beating Roulette? An analysis with probability and statistics.

Psychology and Economics (Lecture 17)

Tips for Solving Mathematical Problems

Why is Insurance Good? An Example Jon Bakija, Williams College (Revised October 2013)

Discrete Math in Computer Science Homework 7 Solutions (Max Points: 80)

Linear Programming Notes VII Sensitivity Analysis

5544 = = = Now we have to find a divisor of 693. We can try 3, and 693 = 3 231,and we keep dividing by 3 to get: 1

Double Deck Blackjack

Game Theory and Poker

A New Interpretation of Information Rate

Book Review of Rosenhouse, The Monty Hall Problem. Leslie Burkholder 1

SOLVING EQUATIONS WITH EXCEL

Objective. Materials. TI-73 Calculator

Week 4: Gambler s ruin and bold play

Solutions for Practice problems on proofs

PYTHAGOREAN TRIPLES KEITH CONRAD

Part 2: One-parameter models

WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT?

Moving on! Not Everyone Is Ready To Accept! The Fundamental Truths Of Retail Trading!

136 CHAPTER 4. INDUCTION, GRAPHS AND TREES

Factoring & Primality

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10

What is a day trade?

1 Portfolio Selection

The Taxman Game. Robert K. Moniot September 5, 2003

Online Appendix Feedback Effects, Asymmetric Trading, and the Limits to Arbitrage

ECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE

Copyrighted Material. Chapter 1 DEGREE OF A CURVE

Second Order Linear Nonhomogeneous Differential Equations; Method of Undetermined Coefficients. y + p(t) y + q(t) y = g(t), g(t) 0.

Week 1: Introduction to Online Learning

Notes from Week 1: Algorithms for sequential prediction

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University

1 Short Introduction to Time Series

ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015

Theorem (informal statement): There are no extendible methods in David Chalmers s sense unless P = NP.

Evaluating Trading Systems By John Ehlers and Ric Way

Copyright (c) 2015 Christopher Small and The Art of Lawyering. All rights reserved.

Power Rankings: Math for March Madness

A Short Course in Logic Zeno s Paradox

Chapter 31 out of 37 from Discrete Mathematics for Neophytes: Number Theory, Probability, Algorithms, and Other Stuff by J. M.

Decision Theory Rational prospecting

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Bonus Maths 2: Variable Bet Sizing in the Simplest Possible Game of Poker (JB)

A Production Planning Problem

Wald s Identity. by Jeffery Hein. Dartmouth College, Math 100

1.7 Graphs of Functions

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania

Expected Value and the Game of Craps

SPREAD BETTING DISCIPLINE

Answer: (a) Since we cannot repeat men on the committee, and the order we select them in does not matter, ( )

help! I don t feel I have the confidence to help and I can t find the time He can spell better than me!! I m sure my daughter doesn t want my help

Tom wants to find two real numbers, a and b, that have a sum of 10 and have a product of 10. He makes this table.

We never talked directly about the next two questions, but THINK about them they are related to everything we ve talked about during the past week:

Economics 1011a: Intermediate Microeconomics

Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page

Chapter 11 Number Theory

Solution to Homework 2

OPRE 6201 : 2. Simplex Method

2 When is a 2-Digit Number the Sum of the Squares of its Digits?

1. Overconfidence {health care discussion at JD s} 2. Biased Judgments. 3. Herding. 4. Loss Aversion

Integrals of Rational Functions

2. Information Economics

What Is Pay Per Click Advertising?

CROSS EXAMINATION OF AN EXPERT WITNESS IN A CHILD SEXUAL ABUSE CASE. Mark Montgomery

How To Check For Differences In The One Way Anova

Applied Algorithm Design Lecture 5

The One Key Thing You Need to Be Successful In Prospecting and In Sales

The Trip Scheduling Problem

! Insurance and Gambling

1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let

Lies My Calculator and Computer Told Me

The Prime Numbers. Definition. A prime number is a positive integer with exactly two positive divisors.

DESCRIBING OUR COMPETENCIES. new thinking at work

Section 1.5 Exponents, Square Roots, and the Order of Operations

Transcription:

1 The Two Envelopes Problem Rich Turner and Tom Quilter The Two Envelopes Problem, like its better known cousin, the Monty Hall problem, is seemingly paradoxical if you are not careful with your analysis. In this note we present analyses of both the careless and careful kind, providing pointers to common pitfalls for authors on this topic. As with the Monty Hall problem, the key is to condition on all the facts you have when you crank the handle of Bayesian inference. Sometimes facts unexpectedly have to provide you with information: and you have to take them into account. This point is well known, and most explanations of the paradox point this out. However authors then tend to make a number of mistakes. Firstly they claim the conventional analysis corresponds to using a uniform-improper distribution. This is wrong: it corresponds to a log-uniform improper distribution. Secondly they claim the root of the paradox is in the use of an un-normalisable distribution. This is merely a conjecture and a toy example provided here suggests that improper distributions exist which do not result in the paradox. The Two Envelopes Problem s ultimate lesson is that inference always involves making assumptions and any attempt to blindly use uniformative priors as a method for avoiding making assumptions can run into trouble. After all the innocuous uniform (or log-uniform) prior corresponds to a very strong assumption about your data. 0.1 The problem You are playing a game for money. There are two envelopes on a table. You know that one contains $X and the other $X, [but you do not know which envelope is which or what the number X is]. Initially you are allowed to pick one of the envelopes, to open it, and see that it contains $Y. You then have a choice: walk away with the $Y or return the envelope to the table and walk away with whatever is in the other envelope. What should you do? 0. The Paradox Clearly, to decide whether we should switch or stick we want to compute the expected return from switching ( R ) and compare it to

the return from sticking (trivially, $Y ). Initially, the chances of choosing the envelope with the highest contents P (C=H) are equal to the chances of choosing the envelope containing the smaller sum P (C=L), and they are both 1. A careless analysis analysis might then reason that the expected return from switching is therefore: R = 1 Y P (C=H) + Y P (C=L) (1) = 1 Y 1 + Y 1 () = 5 4 Y (3) This would mean we expect to gain 1 Y from switching on average. 4 This is paradoxical as it says It doesn t matter which envelope you choose initially - you should always switch. 0.3 A thought experiment So goes the usual explanation of the paradox. Where s the flaw in the analysis? In particular, a key question we have to resolve is does observing the contents of the first envelope provides us with useful information. If it does not, then it will not matter whether we stick or switch [just as it didn t matter which envelope we chose to start with]. First I m going to pursuade you that observing the contents does provide you with useful information. Then I m going to show you how to use Bayes theorem to correctly use this information to compute the optimal decision. Imagine you make your first choice, and open up the envelope only to discover that it contains a bill - the envelope is charging you money - Y is negative. Not only is this annoying, it is also very unexpected. Who said we could lose money playing this game? If we use the above analysis for the new situation, we should now stick and not switch. So the paradox - that we should switch regardless of Y - (partly) melts away if the assumption that Y is always positive is broken. Put another way: if Y can be negative or positive then we d have to observe Y before we decided whether switch. The moral of the story is that we should pay closer attention to our assumptions and, in particular, to what Y is telling us.

3 0.4 Condition on all the infomation So we made a mistake: When computing the expected reward above we should have taken into account [technically, condition on] all the information we have available to us at the time. Equation 1 should therefore have read: R = 1 Y P (C=H Y ) + Y P (C=L Y ) (4) We didn t do this because we naively we thought the amount of money that the envelope contained didn t provide any useful information (ie. P (C=H Y ) = P (C=H), conditional independence). However the thought experiment above shows us that this is not necessarily true, and moreover whether it is or not depends on our assumptions. The cure is that we should work through the analysis stating the assumptions explicitly. The way to incorporate these assumptions explcitly is to use Bayes theorem: P (C=i Y ) = P (C=i Y ) = P (Y C=i)P (C=i) P (Y ) P (Y C=i)P (C=i) P (Y C=H)P (C=H) + P (Y C=L)P (C=L) (5) (6) We have already noted that the probability that P (C=H) = P (C=L) = 1 and so the factors of 1 in the denominator and numerator all cancel and we can plug the result into the formula for the expected reward: 1Y P (Y C=H) + Y P (Y C=L) R = P (Y C=H) + P (Y C=L) = 1 Y 1 + 4γ(X) 1 + γ(x) (7) (8) P (Y C=L) The expected reward therefore depends on γ = and we are P (Y C=H) now forced to state our beliefs about the two distributions:p (Y C=L) and P (Y C=H) to calculate the expected reward. 0.5 Uniform beliefs At first sight a sensible distribution - in the sense that it is noncommital - is a uniform prior between two limits, Y min and Y max say:

4 p(y C=L) = 1 Y max Y min (9) = 1 (10) Z 1 p(y C=H) = (11) Y max Y min = 1 (1) Z For the sake of clarity let s think about the situation where Y is always greater than zero (Y min = 0). What are the possible values for γ? p(y C=L)dY p(y C=H)dY Well, if 0 < Y < Y min then γ = = 1/, so: R = 3Y/ and we should always switch. This makes sense - it is twice as likely that the envelope contains the lower amount in this case, so switching should yield + 1 1, which it does. 3 3 However, if Y min < Y < Y max then γ = 0 and R = 1 - we must have chosen the envelope containing the larger amount, the expected reward from switching is therefore $ 1 and so we should stick. There is no paradox. 0.6 Improper uniform beliefs We might now say, well, I m very unsure about what value Y would take and therefore I ll make Y max extremely large, and take the limit Y max 1. This turns out to be a terrible assumption for any practical situation. As we take this limit p(y C=L) and p(y C=H) become very ill matched to any possible real data in such a way that this causes problems. All data we observe will lie beneath Y max (as it s tending to infinity). However, in this regime we believe it is twice as likely that the envelope contains the lower amount, and so we switch whatever we see. Put another way, in order to stick we would have to observe a Y which is larger than Y max, but we ll never encounter an envelope containing more than infinity dollars so we ll always switch. Although we tried to make the most non-commital assumptions possible - we ended up making very definite, absurd assumptions which 1 In this limit the uniform distribution is un-normalisable and such distributions are called improper although there are numbers greater than infinity, it s tough to write legal cheques for these amounts

5 makes sensible inference impossible. 3 Before we move on, it s worth mentioning that it might have been tempting to set γ equal to one in the above case. After all, in the limit both p(y C=L) and p(y C=H) are (improper) uniform distributions between zero and infinity. We would then have γ = 1 and recover the old result R = 5Y/4. However, setting the two densities equal is not consistent with the assumption that one envelope contains an amount twice that in the other. This assumption demands one density be half of the other (and to have twice the range). So the original analysis does not correspond to assuming a uniform improper distribution (as some authors claim). To conclude, it s impossible to use this particular improper distributions without being careful (or else we introduce a contradiction with our other assumptions) and even if we are careful they correspond to absurd assumptions (where we switch unless Y is greater than infinity). 0.7 The log-uniform beliefs If the γ = 1 case does not correspond to pair of uniform distributions p(y C=H) and p(y C=L), does it correspond to another choice? The answer, somewhat suprisingly, is yes. To see this we need to do some more work to simplify the form of γ. We can do this by realising that specifying p(y C=L) immediately specifies p(y C=H) (due to the doubling constraint). We can derive a relation which relates the two via manipulation which is easy to skrew up (as, for example, many authors on this paradox do - even in publications). Here s one way to make sure you get the right result: We want to sum up all the probability at Y in the old distribution which gets mapped to Y = Y in the new distribution. So we write down: p(y C=H) = = 0 p(y C=L)δ(Y Y )dy (13) p(u/ C=L)δ(U Y )du/ (14) = p(y/ C=L)/ (15) This makes intuitive sense: A chunk of probability in the old distribution p(y C=L)dY is mapped to a region centred at Y = Y but it is smeared out to extend over a length dy so the density at p(y C=L) must be half that of p(y/ C=L). 3 Note, however, that as we take the limit the paradox (that we should always switch regardless of Y ) does not rear it s ugly head.

6 If γ equals one then this means: p(y C=L) = p(y C=H) = p(y/ C = L)/ for all Y, which holds when p(y C=L) = 1/Y. This is a uniform distribution on log Y and says, I have know idea about the scale of Y. I think Y is as likely to be between 1 and 10 as it is to be between 10 and 100. This seems more sensible than the uniform prior before - for one thing, it decreases with Y. However, this distribution is not normalisable (without introducing a lower and upper cut-off and then γ is not 1 for all Y ). Limiting arguments fail for the same reason they did in the uniform case: in the region where both the distributions are non-zero, you should always switch. So taking Y max to infinity pushes the regime where you should stick to a place where no data will land. Again, the limit corresponding to the improper distribution corresponds to a terrible set of assumptions. 0.8 A sensible approach Is there a sensible approach? In a word yes. However, it will depend on what assumptions you want to make. In a way the previous discussion is a red-herring: you might be quite happy to use a uniform distribution, or a log-uniform distribution - and everything will be ok so long as you think carefully about the choice for the ranges (am I playing with a friend for his poket money, or am I through to the final stages of a game-show called Who wants to be a billionaire?). 4 One final alternative (which does not have hard cut-offs) is to place an exponential distribution over Y : P (Y C=L) = λ exp( λy ) (16) 1/λ is the characteristic length scale over which the density decays. From eqn. 8 we should switch when: 1 γ(y ) (17) = exp( λy/) (18) (An intuition for this equation is that it is the result of a hypothesis test: we shuld switch when it is more likely we ve chosen the envelope containing the smaller amount. That is when P (C = L Y ) > P (C = H Y ). Plugging in values we ll get the same result as above.) 4 One alternative is to place a prior over the ranges and integrate them out: P (Y C=L) = R P (Y C=L, Y min, Y max)p (Y min, Y max)dy mindy max

7 So we switch when Y ln 4 3, which is proportional to the λ λ characteristic length scale, as expected. 0.9 I want to specify a marginal, not a conditional! We have discussed how specifying the conditional distrution P (Y C = L) immediately specifies the other conditional P (Y C = H) = 1P (Y/ C = L). Together these two distributions combine to form the marginal distribution, or more intuitively, the distribution over values of money we expect to see in envelopes: P (Y ) = P (Y C = L)P (C = L) + P (Y C = H)P (C = H). This is uniquely determined once we specify P (Y C = L) as P (Y ) = 1P (Y C = L) + 1 P (Y/ C = L). This opens up the 4 possibility of the reverse: specifying a marginal and then deriving the two conditionals. The above equation can be solved by recursion, yielding: P (Y C = L) = n=0 ( )n P ( n Y ) under the proviso that N P ( N Y C = L) 0 as N. This equation says: You give me a marginal, I ll give you the conditional. 0.10 A sensible improper distribution Does using an improper distribution always lead to the paradox? Some author s claim so, but it hasn t been proved. More over the intuition from the above examples is that it wasn t the improper nature of the distribution which lead to the paradox (as some authors glibly state) - it was the fact that they corresponded to stupid assumptions. Let s try and invent a distribution which correspondeds to reasonable assumptions in the limit where they become un-normalisable. 5 The key intuition is this: we ll be fine as long as the point we always switch is not situated in the extreme of the range of allowed Y values. For instance, if the conditional distributions have some periodic struture of correctly chosen amplitude then there can be regions when one distribution is much higher than the other, and regions when the converse is true. In this case the paradox will not arise. Whatsmore, the marginal distribution P (Y ) implied by the two periodic conditionals may not be such a stupid one 6. Let me give a simple example: 5 Another point noted by authors is that the improper distribution above has an infinite mean value - this may correspond to another undesireable assumption too. For instance there are proper distributions which have infinite expected value and in such cases the paradox can occur too. 6 Although a counter example might be enough to convince the mathematicians the paradox doesn t arise from un-normalisabilty, it would be nice if such a counter example corresponded to a realistic

8 P (Y C = L) = 4η n + 1 > x > n (19) P (Y C = L) = 1η otherwise (0) Over some range L that we extend to infinity (and therefore η 0) to make our improper prior. This specifies the other conditional: And the marginal: P (Y C = H) = η 4n + > x > 4n (1) P (Y C = H) = 1/η otherwise () P (Y ) = 3η 4n + 1 > x > 4n (3) P (Y ) = 3/η 4n + > x > 4n + 1 (4) P (Y ) = 9/4η 4n + 3 > x > 4n + (5) P (Y ) = 3/4η 4n + 4 > x > 4n + 3 (6) Is this a sensible marginal? Well, imagine that some types of coins were harder to come by than others. For example, imagine only 1,5,9,13... valued coins existed. Then we might like something like this (perhaps we d like to add in a 1/Y decay too). When should we switch? This depends on Y in the following way: Region P (Y C = L) P (Y C = H) switch? I 4n + 1 > x > 4n 4η η doesn t matter II 4n + > x > 4n + 1 1η η no III 4n + 3 > x > 4n + 4η 1/η yes IV 4n + 4 > x > 4n + 3 1η 1/η doesn t matter So we do need to listen to what the data are telling us even though the condtionals are improper.