Frequentist vs. Bayesian Statistics



Frequentist vs. Bayesian Statistics

Common situation in science: we have some data and we want to know the true physical law describing it. We want to come up with a model that fits the data. Example: we look at n = 10 random galaxies and find that m = 4 are spirals. So what is the true ratio of spirals in the universe, r?

Frequentist: There are true, fixed parameters in a model (though they may be unknown at times). Data contain random errors which have a certain probability distribution (Gaussian, for example). Mathematical routines analyze the probability of getting certain data, given a particular model ("If I flip a fair coin, what is the probability of getting exactly 50% heads and 50% tails?").

Bayesian: There are no true model parameters. Instead, all parameters are treated as random variables with probability distributions. Random errors in the data have no probability distribution; rather, the model parameters are random, with their own distributions. Mathematical routines analyze the probability of a model, given some data ("If I flip a coin and get X heads and Y tails, what is the probability that the coin is fair?"). The statistician makes a guess (the prior distribution) and then updates that guess with the data.

Both approaches address the same fundamental problem, but attack it in reverse order: probability of getting the data, given a model, versus probability of a model, given some data. It is quite common to get the same basic result out of both methods, but many will argue that the Bayesian approach more closely relates to the fundamental problem in science (we have some data, and we want to infer the most likely truth).

Bayes Theorem

The primary tool of Bayesian statistics. It allows one to estimate the probability of measuring/observing something, given that you have already measured/observed some other relevant piece of information:

P(B|A) = P(A|B) P(B) / P(A)

P(B|A) = probability of measuring B given A
P(A|B) = probability of measuring A given B
P(B) = prior probability of measuring B, before any data are taken
P(A) = prior probability of measuring A, before any data are taken
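The theorem can be sketched in a few lines of code. This is a minimal illustration, not from the slides: the two-hypothesis framing and the function name are my own, with P(A) expanded over B and not-B by the law of total probability.

```python
# Bayes' theorem for a hypothesis B and evidence A, with P(A) expanded
# over B and not-B via the law of total probability. Names are illustrative.

def bayes_posterior(p_a_given_b, p_b, p_a_given_not_b):
    """Return P(B|A) = P(A|B) P(B) / P(A)."""
    p_a = p_a_given_b * p_b + p_a_given_not_b * (1.0 - p_b)  # P(A)
    return p_a_given_b * p_b / p_a

# Sanity check: if the evidence can only occur under B (P(A|not-B) = 0),
# the posterior is 1 regardless of the prior.
print(bayes_posterior(1.0, 0.3, 0.0))  # → 1.0
```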

A simple example

Drug testing: let's say 0.5% of people are drug users, and our test is 99% accurate (it correctly identifies 99% of drug users and 99% of non-drug users). What is the probability of being a drug user if you have tested positive? Our Bayes theorem reads:

P(user|pos) = P(pos|user) P(user) / P(pos) = (0.99 × 0.005) / (0.01 × 0.995 + 0.99 × 0.005) ≈ 0.33

P(pos|user) = 0.99 (99% effective at detecting users)
P(user) = 0.005 (only 0.5% of people actually are users)
P(pos) = 0.01 × 0.995 + 0.99 × 0.005 (the 1% chance of non-users, 99.5% of the population, testing positive, plus the 99% chance of the users, 0.5% of the population, testing positive)

Only a 33% chance that a positive test is correct! This example assumes we know something about the general population (users vs. non-users), but we usually don't!
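The arithmetic above is easy to check numerically. A minimal sketch with the slide's numbers (variable names are my own):

```python
# Worked check of the drug-testing example: 0.5% prevalence, 99% sensitivity
# and 99% specificity.

p_user = 0.005              # prior: 0.5% of people are users
p_pos_given_user = 0.99     # sensitivity: P(pos | user)
p_pos_given_nonuser = 0.01  # false-positive rate: P(pos | non-user)

# P(pos) by the law of total probability
p_pos = p_pos_given_user * p_user + p_pos_given_nonuser * (1 - p_user)

# Bayes' theorem: P(user | pos)
p_user_given_pos = p_pos_given_user * p_user / p_pos
print(round(p_user_given_pos, 3))  # → 0.332
```

Despite the "99% accurate" test, two out of three positives are false, because non-users vastly outnumber users.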

Example: Galaxy Populations

We looked at n = 10 random galaxies and found m = 4 spirals. What is the ratio of spirals in the universe, r? We are introducing an unknown model parameter. Bayes Theorem reads:

P(r|data) = P(data|r) P(r) / P(data)

P(r|data) = probability of getting r, given our current data (what we want to know)
P(data|r) = probability of measuring the current data for a given r
P(r) = probability of r before any data are taken (known as a prior)
P(data) = prior probability of measuring the data. This acts as a normalizing constant, and is defined as

P(data) = ∫ P(data|r) P(r) dr

In other words, it is the probability of finding the data considering all possible values of r.

P(r|data) = P(data|r) P(r) / ∫ P(data|r) P(r) dr

Since there are only two possible measurements (spiral or not spiral), P(data|r) is adequately described by a binomial distribution:

P(data|r) = [n! / (m!(n−m)!)] r^m (1−r)^(n−m) = [10! / (4!6!)] r^4 (1−r)^6

We'll assume that before any data were taken, we figured all possible values of r were equally likely, so we'll set P(r) = 1 (our prior). P(data) is just an integral, and we find

∫ P(data|r) P(r) dr = [10! / (4!6!)] ∫ r^4 (1−r)^6 dr = [10! / (4!6!)] × [4!6! / 11!] = 1/11

P(r|data) = P(data|r) P(r) / ∫ P(data|r) P(r) dr

Putting all this together and simplifying, we get:

P(r|data) = 2310 r^4 (1−r)^6

This is just a probability distribution for r, centered around 0.4 as we would expect. Also, as expected, more data makes the result more robust (shown by the red curve in the slide's figure).
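The normalization and the location of the peak can be verified numerically. A minimal sketch, assuming n = 10 galaxies with m = 4 spirals and a uniform prior; the grid resolution and variable names are my own:

```python
# Numerical check of the galaxy posterior: with a uniform prior p(r) = 1,
# the normalized posterior should be 2310 r^4 (1-r)^6, peaked at r = 0.4.

from math import factorial

n, m = 10, 4
binom = factorial(n) // (factorial(m) * factorial(n - m))  # 10!/(4!6!) = 210

def likelihood(r):
    """P(data | r): binomial probability of m spirals out of n."""
    return binom * r**m * (1 - r)**(n - m)

# p(data) = integral of likelihood * prior over r in [0, 1] (trapezoid rule)
steps = 100_000
rs = [i / steps for i in range(steps + 1)]
p_data = sum(likelihood(r) for r in rs[1:-1]) / steps + \
         (likelihood(rs[0]) + likelihood(rs[-1])) / (2 * steps)

print(round(1 / p_data, 1))     # → 11.0, so the posterior constant is 210 * 11 = 2310
print(max(rs, key=likelihood))  # → 0.4, the posterior peak
```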

The role of priors

In the previous example, we assumed that all values of r were equally likely before we took any data. Often, we'll know something else (apart from the data) which we'll want to incorporate into our prior (physics, models, a hunch, etc.). As an example, let's say we run a cosmological simulation which suggests r ≈ 0.7 ± 0.05. We'll use this as our prior, P(r), and estimate it as a Gaussian distribution centered around 0.7, with σ = 0.05:

P(data|r) = [n! / (m!(n−m)!)] r^m (1−r)^(n−m)   (same as before)

P(r) = [1 / (σ√(2π))] exp(−(r − 0.7)² / (2σ²))   (new prior)

The role of priors

Our new distribution: notice the profound effect the prior can have on the result. The more data one has, the more the prior is overwhelmed, but it clearly plays a powerful (and potentially dangerous) role at low sample sizes.

Priors can be very controversial, especially when you have no extra information on which to base your prior. Uniform priors, like the one we originally chose, are considered too agnostic, even though they may seem like the safest approach.
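The pull of an informative prior can be seen directly by comparing posterior peaks. A minimal sketch, assuming the slides' numbers (m = 4 spirals out of n = 10, Gaussian prior with mean 0.7 and σ = 0.05); all names are illustrative:

```python
# Compare the posterior peak under a uniform prior vs. a Gaussian prior.
# The data alone peak at 0.4; the informative prior drags the peak upward.

from math import exp, pi, sqrt

n, m = 10, 4
mu, sigma = 0.7, 0.05

def likelihood(r):
    """Binomial likelihood, up to a constant that doesn't move the peak."""
    return r**m * (1 - r)**(n - m)

def prior(r):
    """Gaussian prior centered at 0.7 with sigma = 0.05."""
    return exp(-(r - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

rs = [i / 10_000 for i in range(1, 10_000)]
flat_peak = max(rs, key=likelihood)                          # uniform prior
inf_peak = max(rs, key=lambda r: likelihood(r) * prior(r))   # Gaussian prior

print(flat_peak)  # → 0.4: the data alone
print(inf_peak)   # between 0.4 and 0.7: the prior pulls the estimate upward
```

With only 10 data points, the peak lands much closer to the prior's 0.7 than to the data's 0.4; with more data, the likelihood would sharpen and overwhelm the prior.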