STAT 315: HOW TO CHOOSE A DISTRIBUTION FOR A RANDOM VARIABLE




TROY BUTLER

1. Random variables and distributions

We are often presented with descriptions of problems involving some level of uncertainty about the outcome of a physical experiment or recorded data. We find it useful to quantify the outcomes with real numbers. The function (or map, or rule) that defines which real number gets associated with which particular outcome is what we call a random variable (rv), often denoted by a capital letter such as X or Y (the generic choices).

Random variables are not random! The only thing uncertain about them is the input, which comes from a yet-to-be-performed physical experiment or from a datum recorded from a not-yet-chosen member of a population. A random variable is NOT RANDOM! IT IS NOT RANDOM! It is a well-defined function! For example, we might say that we are interested in the heights of students in this class. I would represent the recorded height as the output of the random variable X. The only reason I am unsure of the outputs of X is that I do not know who will be chosen; once a student is chosen, there is nothing random about that student's height.

Once we have settled on what the random variable is (i.e., how we map outcomes from a sample space, which is nothing more than a domain containing all the possible outcomes, to the real numbers), we are interested in the distribution of this random variable. Specifically, we want to know how to compute probabilities of events defined by sets of real numbers. An event defined in terms of the random variable belonging to some set of real numbers means nothing more than the event of all outcomes in the sample space that get mapped into this set. For example, we might want to know the probability that the height of a student in this class is less than 6 ft.
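To make the "a random variable is just a function" point concrete, here is a minimal Python sketch; the roster and the heights in it are made up for illustration:

```python
# A hypothetical sample space: a class roster, each student paired with a
# height in ft (names and heights are made up for illustration).
sample_space = {"Avery": 5.4, "Blake": 6.1, "Casey": 5.9, "Devon": 5.7}

# The random variable X is nothing but a function: outcome (student) -> real number.
def X(student):
    return sample_space[student]

# Under equally likely selection of a student, P(X < 6) is just the proportion
# of outcomes that X maps below 6.
p_less_than_6 = sum(1 for s in sample_space if X(s) < 6) / len(sample_space)
print(p_less_than_6)  # 3 of the 4 heights are below 6 ft -> 0.75
```

Nothing here is random: X is a fixed rule, and the only uncertainty is which student gets selected.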
Again, letting X denote the height of a student in this class (recorded in units of ft), we are asking about P(X < 6), which is read as "the probability of the event defined by the random variable being less than 6." We are really asking about the proportion of students in this class whose measured heights are less than 6 ft. The list of all students in this class is the list of all the outcomes defining the sample space, and we map a given student to that student's height.

As a very specific example, suppose Peyton Manning is a student in the class, he is exactly 6.47 ft tall, and no one else is this height. If we ask, what is the probability of the event that X = 6.47? then we are really asking, what is the probability that Peyton Manning will randomly be selected from the class? If we ask, what is the probability of the event that X > 6.47? then we are really asking, what is the probability that a student taller than Peyton Manning will be randomly selected from the class? Thus, questions about the probability of the rv X taking certain real-number values are really questions about the probability of certain outcomes in the sample space.

The last sentence in the paragraph above implies that if we want to determine the probability distribution of a random variable X, then we must consider the underlying probability on the sample space it acts upon! How do we determine the probabilities of the various outcomes in this sample space? In what follows, we use S (read "script S") to denote the sample space and s ∈ S to denote a particular outcome (or sample) s in this sample space. Uppercase letters denote random variables, and their lowercase counterparts represent particular real numbers; for example, X(s) = x indicates that outcome s is mapped to the real number x by the rv X.

2. Discrete random variables and their distributions

2.1. Bernoulli random variables. Consider an experiment with the following two outcomes: success (S) and failure (F). Thus, S = {S, F}. Define the rv X : S → R by X(S) = 1 and X(F) = 0. We define a Bernoulli random variable as any rv whose only possible values are 0 and 1. A Bernoulli trial is an experiment that will result in one of two outcomes, a success or a failure. The canonical example of a Bernoulli trial is a coin toss, where the coin landing heads up is a success with success probability denoted by 0 ≤ ρ ≤ 1 and landing tails up is a failure with failure probability 1 − ρ. The pmf of the Bernoulli rv X : {S, F} → {0, 1} defined above is given by p(1) = ρ and p(0) = 1 − ρ. We often write X ~ Bernoulli(ρ) to indicate that the rv X has a Bernoulli distribution with success probability ρ. Bernoulli rv's and the concept of independent, identically distributed (or i.i.d. or iid) Bernoulli trials are critical in many areas of probability theory, including the development of the binomial distribution.
Any rv (continuous or discrete) X can be used to define a Bernoulli rv simply by identifying an event of interest. For example, we can let X denote the price paid by a first-time home buyer in the greater Denver area. Clearly X is not a Bernoulli rv, as there are lots of prices that could be paid. However, if we decide that we are interested only in determining the probability that a first-time home buyer paid less than $250,000, then we have defined a brand-spanking-new Bernoulli rv that we call Y (since X is already taken). Here, Y is really a function of X, and since X is a function on the sample space defined by first-time home buyers, so is Y. If X < $250,000, then Y = 1; otherwise Y = 0. The probability of success is P(X < $250,000). All that is necessary to define a Bernoulli rv is a rule that separates the sample space into two disjoint sets, where one of those sets gets mapped to 1 and the other gets mapped to 0.
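The threshold rule above is easy to express in code. This is an illustrative sketch, not anything from the notes: the value ρ = 0.35 is made up to stand in for P(X < $250,000):

```python
import random

random.seed(0)

rho = 0.35  # hypothetical success probability standing in for P(X < $250,000)

# Y is the Bernoulli (indicator) rv built from X by the rule "price below 250k".
def Y(price):
    return 1 if price < 250_000 else 0

print(Y(198_000))  # -> 1 (a success)
print(Y(310_000))  # -> 0 (a failure)

# Simulating draws of Y directly as a Bernoulli(rho) rv:
draws = [1 if random.random() < rho else 0 for _ in range(10_000)]
print(sum(draws) / len(draws))  # long-run proportion of successes is close to rho
```

The rule Y partitions the sample space into two disjoint sets, exactly as described above.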

2.2. Binomial random variables. Let X be the sum of n i.i.d. (independent, identically distributed) Bernoulli trials with success probability ρ; then X ~ Binomial(n, ρ) with pmf

b(x; n, ρ) := (n choose x) ρ^x (1 − ρ)^(n − x) for x ∈ {0, 1, 2, ..., n}, and 0 otherwise.

What does S look like? Suppose there are 3 Bernoulli trials defining the sample space; then S := {SSS, SSF, SFS, FSS, SFF, FSF, FFS, FFF} lists all 8 distinct possible outcomes of the experiment. The rv X maps s ∈ S to the number of S's showing up in the element s (keep the s's straight here). For example, if s = SSS then X(s) = 3; if s = SFS then X(s) = 2; but s = SSF also has X(s) = 2, because the rv X does not care in which order the S's appear, only how many there are (that is the rule that defines X).

We use B(x; n, ρ) to denote the cdf of a binomial rv X. This does not give the probability that X = x (that is given by b(x; n, ρ), the pmf evaluated at x); it gives the probability of the event X ≤ x.

Given a dichotomous population (meaning a population partitioned into two disjoint sets by some rule) of size N, if we draw a sample of size n from this population without replacement, then the rv X counting the number of successes in the n draws does not have a binomial distribution. Why? The trials within the experiment are not independent. However, if n/N < 0.05, then we can reasonably approximate the distribution of X by a binomial distribution.

In the example of first-time home buyers, if we randomly sample 8 names from a list of first-time home buyers (and assume this list has N names so that 8/N < 0.05), and we want to know the probability that at least 3 of them paid less than $250,000, then we are asking a question about a rv that has a binomial distribution with n = 8 and probability of success P(X < $250,000), where X is the price paid as described previously.
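The at-least-3-of-8 question can be computed directly from the pmf once a value for the success probability is assumed. In this sketch only n = 8 comes from the example; the value ρ = 0.35 is hypothetical:

```python
from math import comb

def binom_pmf(x, n, rho):
    """b(x; n, rho) = (n choose x) * rho^x * (1 - rho)^(n - x)."""
    return comb(n, x) * rho**x * (1 - rho)**(n - x)

n = 8
rho = 0.35  # hypothetical stand-in for P(X < $250,000)

# P(at least 3 successes) = 1 - P(X <= 2) = 1 - B(2; n, rho)
p_at_least_3 = 1 - sum(binom_pmf(x, n, rho) for x in range(3))
print(round(p_at_least_3, 4))

# Sanity check: the pmf sums to 1 over its support {0, 1, ..., n}.
total = sum(binom_pmf(x, n, rho) for x in range(n + 1))
print(total)
```

Note how the complement B(2; n, ρ) of the cdf is used: summing the pmf up to 2 and subtracting from 1 is much less work than summing from 3 to 8.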
This new rv can be called Y (but if you decide to list the intermediate step of defining a Bernoulli rv and use Y to denote that Bernoulli rv, as was done previously, then you should call the binomial rv something else, like W, to avoid confusion).

2.3. Poisson random variables. The Poisson distribution is used to describe the probabilities of x events occurring in a fixed interval of time or space, where λ represents the mean frequency per unit of time or space. For example, the number of cars passing through an intersection in a fixed unit of time, the number of phone calls being routed through a cell tower in a given hour, or the number of chocolate chips per cookie baked from a big batch are often appropriately modeled by Poisson random variables.
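Poisson probabilities are simple to compute from the pmf p(x; λ) = e^(−λ) λ^x / x! given below. Here is a sketch for the cars-at-an-intersection example, with a made-up rate of λ = 4 cars per minute:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x; lam) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 4.0  # hypothetical mean number of cars per minute

# Probability of seeing exactly 0, 1, ..., 5 cars in one minute
for x in range(6):
    print(x, round(poisson_pmf(x, lam), 4))

# The pmf sums to 1 over x = 0, 1, 2, ...; truncating far into the tail
# captures essentially all of the mass.
total = sum(poisson_pmf(x, lam) for x in range(100))
print(round(total, 10))  # -> 1.0
```

Unlike the binomial, the support here is unbounded: any count x = 0, 1, 2, ... has positive probability.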

A random variable X follows the Poisson distribution with parameter λ (λ > 0) if the pmf of X is given by

p(x; λ) = e^(−λ) λ^x / x! for x ∈ {0, 1, 2, 3, ...}, and 0 otherwise.

Remark 1. Given a binomial pmf b(x; n, ρ), if we let n → ∞ and ρ → 0 such that nρ → λ > 0, then b(x; n, ρ) → p(x; λ).

The remark above implies that even though the binomial distribution might be the correct distribution to model the specific problem you are considering, it might be more computationally practical to use a Poisson distribution to approximate the answers. However, this approximation only holds in certain cases, and we use the rule of thumb that it applies when n > 50 and nρ < 5. In this case, we approximate the binomial distribution by the Poisson distribution with λ = nρ.

Theorem 1. If events occur independently over time at a mean rate of λ per unit time interval, then X = the number of events occurring in t disjoint unit time intervals follows a Poisson distribution with mean λt.

Returning again to the example of first-time home buyers, we might want to model the number of first-time home buyers in any year. We would have to know or be given data over past years from which to estimate the mean number of first-time home buyers to use as the parameter in the Poisson distribution. Suppose we have such a model distribution and the mean number of first-time home buyers in any 12-month span is 24,000; if we now want to model the number of first-time home buyers in any 6-month span, then it is reasonable (by the theorem above) to take a Poisson distribution with parameter 12,000.

2.4. Non-named distributions. When given a description of a finite (or countable) sample space and a rv X that does not conform to the types of descriptions modeled by the named distributions above, we must use the description along with the rules of probability, logic, etc., to determine the distribution of X (meaning we must determine what the pmf is).

3. Continuous random variables and their distributions

The common continuous distributions used in this class are the uniform, exponential, and normal/Student t distributions. It will almost always be immediately clear from context which one applies: terms like "uniform" or "equally likely" show up when describing the uniform distribution, and you will almost always be told whether the exponential or normal distribution is used to model the distribution of a particular rv. The exception is when we consider statistics (quick: what is a statistic?). Specifically, we often look at sample means or sample proportions as statistics, and with a large enough sample size, the distributions of these statistics are approximately normal (the Student t distribution is approximately normal) by the Central Limit Theorem (CLT). You will know which distribution to use in these cases based on the sample size and the use of either the exact or approximate standard deviation, as we discuss in Chapter 7.
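A quick simulation illustrates the CLT claim above: sample means of i.i.d. draws from a decidedly non-normal population concentrate around the population mean μ with standard deviation σ/√n. The exponential population here (with μ = σ = 1) is an arbitrary choice for illustration:

```python
import random
from math import sqrt

random.seed(1)

# Population: an exponential distribution with rate 1 (skewed, far from normal).
# Repeatedly draw samples of size n and record each sample mean.
n = 40
num_samples = 5000
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

# By the CLT, the sample mean of n i.i.d. draws with mean mu and sd sigma is
# approximately Normal(mu, sigma / sqrt(n)). Here mu = sigma = 1.
mu, sigma = 1.0, 1.0
mean_of_means = sum(sample_means) / num_samples
sd_of_means = sqrt(sum((m - mean_of_means) ** 2 for m in sample_means) / num_samples)

print(round(mean_of_means, 2))  # close to mu = 1
print(round(sd_of_means, 3))    # close to sigma / sqrt(n), about 0.158
```

A histogram of sample_means would look bell-shaped even though the underlying population is heavily skewed; that is the practical content of the CLT.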