Section 5.1 Continuous Random Variables: Introduction
Not all random variables are discrete. For example:

1. Waiting times for anything (train, arrival of a customer, production of an mRNA molecule from a gene, etc.)
2. The distance a ball is thrown.
3. The size of an antenna on a bug.

The general idea is that now the sample space is uncountable, so probability mass functions and summation no longer work.

DEFINITION: We say that X is a continuous random variable if the sample space is uncountable and there exists a nonnegative function f, defined for all x ∈ (−∞, ∞), with the property that for any set B ⊆ ℝ,

P{X ∈ B} = ∫_B f(x) dx

The function f is called the probability density function of the random variable X, and is (sort of) the analogue of the probability mass function in the discrete case. So probabilities are now found by integration rather than summation.

REMARK: We must have

1 = P{−∞ < X < ∞} = ∫_{−∞}^{∞} f(x) dx

Also, taking B = [a, b] for a < b, we have

P{a ≤ X ≤ b} = ∫_a^b f(x) dx

Note that taking a = b yields the moderately counter-intuitive

P{X = a} = ∫_a^a f(x) dx = 0

Recall: for discrete random variables the probability mass function can be reconstructed from the distribution function and vice versa, and jumps in the distribution function corresponded to where the mass of the probability was. We now have

F(t) = P{X ≤ t} = ∫_{−∞}^t f(x) dx

and so

F′(t) = f(t)

agreeing with the previous interpretation. Also,

P(X ∈ (a, b)) = P(X ∈ [a, b]) = ... = ∫_a^b f(t) dt = F(b) − F(a)

The density function does not itself represent a probability. However, its integral gives the probability of being in certain regions of ℝ, and f(a) gives a measure of the likelihood of being around a. That is,

P{a − ε/2 < X < a + ε/2} = ∫_{a−ε/2}^{a+ε/2} f(t) dt ≈ f(a)ε

when ε is small and f is continuous at a. Thus, the probability that X will be in an interval of size ε around a is approximately εf(a). So, if f(a) < f(b), then

P(a − ε/2 < X < a + ε/2) ≈ εf(a) < εf(b) ≈ P(b − ε/2 < X < b + ε/2)

In other words, f(a) is a measure of how likely it is that the random variable takes a value near a.

EXAMPLES:

1. The amount of time you must wait, in minutes, for the appearance of an mRNA molecule is a continuous random variable with density

f(t) = λe^{−3t} for t ≥ 0, and f(t) = 0 for t < 0

What is the probability that

1. you will have to wait between 1 and 2 minutes?
2. you will have to wait longer than 1/2 minute?

Solution: First, we haven't been given λ. We need

1 = ∫_0^∞ f(t) dt = ∫_0^∞ λe^{−3t} dt = −(λ/3) e^{−3t} |_{t=0}^{∞} = λ/3

Therefore λ = 3, and f(t) = 3e^{−3t} for t ≥ 0. Thus,

P{1 < X < 2} = ∫_1^2 3e^{−3t} dt = e^{−3} − e^{−6} ≈ 0.0473

P{X > 1/2} = ∫_{1/2}^∞ 3e^{−3t} dt = e^{−3/2} ≈ 0.2231
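As a quick numerical sanity check (a sketch, not part of the original notes), the two probabilities just computed can be evaluated from the CDF F(t) = 1 − e^{−3t}:

```python
import math

# Numerical check of the mRNA waiting-time example, with lambda = 3 as derived.
lam = 3.0

def F(t):
    # CDF of an Exp(lambda) random variable: F(t) = 1 - e^(-lambda*t) for t >= 0
    return 1.0 - math.exp(-lam * t)

p_12 = F(2) - F(1)        # P{1 < X < 2} = e^-3 - e^-6
p_half = 1.0 - F(0.5)     # P{X > 1/2} = e^(-3/2)
print(round(p_12, 4), round(p_half, 4))  # 0.0473 0.2231
```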

2. If X is a continuous random variable with distribution function F_X and density f_X, find the density function of Y = kX.

Solution: We'll do it two ways (as in the book). Assume k > 0. We have

F_Y(t) = P{Y ≤ t} = P{kX ≤ t} = P{X ≤ t/k} = F_X(t/k)

Differentiating with respect to t yields

f_Y(t) = d/dt F_X(t/k) = (1/k) f_X(t/k)

Another derivation is the following. We have

ε f_Y(a) ≈ P{a − ε/2 ≤ Y ≤ a + ε/2} = P{a − ε/2 ≤ kX ≤ a + ε/2} = P{a/k − ε/(2k) ≤ X ≤ a/k + ε/(2k)} ≈ (ε/k) f_X(a/k)

Dividing by ε yields the same result as before.

Returning to our previous example with f_X(t) = 3e^{−3t}: if Y = 3X, then

f_Y(t) = (1/3) f_X(t/3) = e^{−t}, t ≥ 0

What if Y = kX + b? We have

F_Y(t) = P{Y ≤ t} = P{kX + b ≤ t} = P{X ≤ (t − b)/k} = F_X((t − b)/k)

Differentiating with respect to t yields

f_Y(t) = d/dt F_X((t − b)/k) = (1/k) f_X((t − b)/k)
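The scaling rule f_Y(t) = (1/k) f_X(t/k) can be sanity-checked by simulation: here X ~ Exp(3) and Y = 3X should be a standard exponential. This Monte Carlo check is illustrative, not from the notes:

```python
import math, random

random.seed(0)
# If X ~ Exp(3) and Y = 3X, the scaling rule gives f_Y(t) = e^{-t}, i.e. Y ~ Exp(1).
# Sanity check of the CDF at t = 1 against 1 - e^{-1}:
n = 200_000
y = [3.0 * random.expovariate(3.0) for _ in range(n)]
emp = sum(v <= 1.0 for v in y) / n
print(abs(emp - (1.0 - math.exp(-1.0))) < 0.01)  # True
```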

Section 5.2 Expectation and Variance of Continuous Random Variables

DEFINITION: If X is a random variable with density function f, then the expected value is

E[X] = ∫_{−∞}^{∞} x f(x) dx

This is analogous to the discrete case:

1. Discretize X into small ranges (x_{i−1}, x_i], where x_i − x_{i−1} = h is small.
2. Now think of X as discrete with the values x_i.
3. Then

E[X] ≈ Σ_i x_i p(x_i) = Σ_i x_i P(x_{i−1} < X ≤ x_i) ≈ Σ_i x_i f(x_i) h

EXAMPLES:

1. Suppose that X has the density function

f(x) = (1/2)x, 0 ≤ x ≤ 2

Find E[X].

Solution: We have

E[X] = ∫_{−∞}^{∞} x f(x) dx = ∫_0^2 x · (1/2)x dx = (1/6) x^3 |_0^2 = 8/6 = 4/3
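The discretization argument above can be made concrete with a Riemann sum (an illustrative sketch; the step size h = 10^{−4} is an arbitrary choice):

```python
# The discretization argument for E[X], with f(x) = x/2 on [0, 2]:
# E[X] is approximated by the sum of x_i * f(x_i) * h (exact answer: 4/3).
h = 1e-4
approx = sum((i * h) * ((i * h) / 2) * h for i in range(int(2 / h)))
print(round(approx, 3))  # 1.333
```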

2. Suppose that f_X(x) = 1 for x ∈ (0, 1). Find E[e^X].

Solution: Let Y = e^X. We need to find the density of Y; then we can use the definition of expected value. Because the range of X is (0, 1), the range of Y is (1, e). For 1 ≤ x ≤ e we have

F_Y(x) = P{Y ≤ x} = P{e^X ≤ x} = P{X ≤ ln x} = F_X(ln x) = ∫_0^{ln x} f_X(t) dt = ln x

Differentiating both sides, we get f_Y(x) = 1/x for 1 ≤ x ≤ e and zero otherwise. Thus,

E[e^X] = E[Y] = ∫_{−∞}^{∞} x f_Y(x) dx = ∫_1^e dx = e − 1

As in the discrete case, there is a theorem making these computations much easier.

THEOREM: Let X be a continuous random variable with density function f. Then for any g : ℝ → ℝ we have

E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx

Note how easy this makes the previous example:

E[e^X] = ∫_0^1 e^x dx = e − 1

To prove the theorem in the special case that g(x) ≥ 0 we need the following:

LEMMA: For a nonnegative random variable Y,

E[Y] = ∫_0^∞ P{Y > y} dy

Proof: We have

∫_0^∞ P{Y > y} dy = ∫_0^∞ ∫_y^∞ f_Y(x) dx dy = ∫_0^∞ f_Y(x) ∫_0^x dy dx = ∫_0^∞ x f_Y(x) dx = E[Y]

Proof of Theorem: For any nonnegative g : ℝ → ℝ (the general case is similar) we have

E[g(X)] = ∫_0^∞ P{g(X) > y} dy = ∫_0^∞ ( ∫_{x : g(x) > y} f(x) dx ) dy
= ∫∫_{(x,y) : g(x) > y} f(x) dx dy = ∫_{x : g(x) > 0} f(x) ∫_0^{g(x)} dy dx
= ∫_{x : g(x) > 0} f(x) g(x) dx = ∫_{−∞}^{∞} f(x) g(x) dx
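The theorem can also be sanity-checked by simulation: a Monte Carlo average of e^X for uniform X should approach ∫_0^1 e^x dx = e − 1 (illustrative code, not part of the notes):

```python
import math, random

random.seed(1)
# E[e^X] for X uniform on (0, 1): the theorem gives e - 1.
n = 200_000
mc = sum(math.exp(random.random()) for _ in range(n)) / n
print(abs(mc - (math.e - 1.0)) < 0.01)  # True
```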

Of course, it immediately follows from the above theorem that for any constants a and b we have

E[aX + b] = aE[X] + b

In fact, putting g(x) = ax + b in the previous theorem, we get

E[aX + b] = ∫_{−∞}^{∞} (ax + b) f_X(x) dx = a ∫_{−∞}^{∞} x f_X(x) dx + b ∫_{−∞}^{∞} f_X(x) dx = aE[X] + b

Thus, expected values inherit their linearity from the linearity of the integral.

DEFINITION: If X is a random variable with mean µ, then the variance and standard deviation are given by

Var(X) = E[(X − µ)^2] = ∫_{−∞}^{∞} (x − µ)^2 f(x) dx

σ_X = √Var(X)

We also still have

Var(X) = E[X^2] − (E[X])^2
Var(aX + b) = a^2 Var(X)
σ_{aX+b} = |a| σ_X

The proofs are exactly the same as in the discrete case.

EXAMPLE: Consider again X with the density function

f(x) = (1/2)x, 0 ≤ x ≤ 2

Find Var(X).

Solution: Recall that E[X] = 4/3. We now find the second moment:

E[X^2] = ∫_{−∞}^{∞} x^2 f(x) dx = ∫_0^2 x^2 · (1/2)x dx = (1/8) x^4 |_0^2 = 16/8 = 2

Therefore,

Var(X) = E[X^2] − (E[X])^2 = 2 − (4/3)^2 = 18/9 − 16/9 = 2/9

Section 5.3 The Uniform Random Variable

We first consider the interval (0, 1) and say a random variable X is uniformly distributed over (0, 1) if its density function is

f(x) = 1 for 0 < x < 1, and f(x) = 0 otherwise

Note that

∫_{−∞}^{∞} f(x) dx = ∫_0^1 dx = 1

Also, as f(x) = 0 outside of (0, 1), X only takes values in (0, 1). As f is constant there, X is equally likely to be near each number: each small region of size ε is equally likely to contain X (and has probability ε of doing so). Note that for any 0 ≤ a ≤ b ≤ 1 we have

P{a ≤ X ≤ b} = ∫_a^b f(x) dx = b − a

General Case: Now consider an interval (a, b). We say that X is uniformly distributed on (a, b) if

f(t) = 1/(b − a) for a < t < b, and f(t) = 0 otherwise

so that

F(t) = 0 for t < a,  F(t) = (t − a)/(b − a) for a ≤ t < b,  F(t) = 1 for t ≥ b

We can compute the expected value and variance straightaway:

E[X] = ∫_a^b x/(b − a) dx = (b^2 − a^2)/(2(b − a)) = (b + a)/2

Similarly,

E[X^2] = ∫_a^b x^2/(b − a) dx = (b^3 − a^3)/(3(b − a)) = (b^2 + ab + a^2)/3

Therefore,

Var(X) = E[X^2] − (E[X])^2 = (b^2 + ab + a^2)/3 − ((b + a)/2)^2 = (b − a)^2/12
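The formulas E[X] = (a + b)/2 and Var(X) = (b − a)^2/12 can be checked by simulation; on the interval (−1, 7) they give mean 3 and variance 64/12 (illustrative code, not from the notes):

```python
import random

random.seed(2)
# Monte Carlo check of the uniform mean and variance on (a, b) = (-1, 7).
n = 200_000
xs = [random.uniform(-1.0, 7.0) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(abs(mean - 3.0) < 0.05, abs(var - 64.0 / 12.0) < 0.1)  # True True
```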

EXAMPLE: Consider a uniform random variable with density

f(x) = 1/8 for x ∈ (−1, 7), and f(x) = 0 otherwise

Find P{X < 2}.

Solution: The range of X is (−1, 7). Thus, we have

P{X < 2} = ∫_{−1}^2 f(x) dx = ∫_{−1}^2 (1/8) dx = 3/8

Section 5.4 Normal Random Variables

We say that X is a normal random variable, or simply that X is normal, with parameters µ and σ^2 if the density is given by

f(x) = (1/(σ√(2π))) e^{−(x−µ)^2/(2σ^2)}

This density is a bell-shaped curve symmetric around µ.

It should NOT be apparent that this is a density: we need to check that it integrates to one. Making the substitution y = (x − µ)/σ, we have

∫_{−∞}^{∞} (1/(σ√(2π))) e^{−(x−µ)^2/(2σ^2)} dx = (1/√(2π)) ∫_{−∞}^{∞} e^{−y^2/2} dy

We will show that this last integral is equal to √(2π). To do so, we actually compute the square of the integral. Define

I = ∫_{−∞}^{∞} e^{−y^2/2} dy

Then

I^2 = ∫_{−∞}^{∞} e^{−y^2/2} dy ∫_{−∞}^{∞} e^{−x^2/2} dx = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−(x^2+y^2)/2} dx dy

Now we switch to polar coordinates. That is, x = r cos θ, y = r sin θ, dx dy = r dθ dr. Thus,

I^2 = ∫_0^∞ ∫_0^{2π} e^{−r^2/2} r dθ dr = 2π ∫_0^∞ e^{−r^2/2} r dr = −2π e^{−r^2/2} |_{r=0}^{∞} = 2π

Scaling a normal random variable gives you another normal random variable.

THEOREM: Let X be Normal(µ, σ^2). Then Y = aX + b is a normal random variable with parameters aµ + b and a^2 σ^2.

Proof: We let a > 0 (the proof for a < 0 is similar). We first consider the distribution function of Y. We have

F_Y(t) = P{Y ≤ t} = P{aX + b ≤ t} = P{X ≤ (t − b)/a} = F_X((t − b)/a)

Differentiating both sides, we get

f_Y(t) = (1/a) f_X((t − b)/a) = (1/(σa√(2π))) exp{−((t − b)/a − µ)^2/(2σ^2)} = (1/(σa√(2π))) exp{−(t − b − aµ)^2/(2(aσ)^2)}

which is the density of a normal random variable with parameters aµ + b and a^2 σ^2.

From the above theorem it follows that if X is Normal(µ, σ^2), then

Z = (X − µ)/σ

is normally distributed with parameters 0 and 1. Such a random variable is called a standard normal random variable. Note that its density is simply

f_Z(x) = (1/√(2π)) e^{−x^2/2}
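The polar-coordinates computation can be confirmed numerically (a sketch; the truncation to [−10, 10] and the step size are arbitrary choices, harmless since the tails are negligible):

```python
import math

# Midpoint-rule approximation of the Gaussian integral on [-10, 10],
# which should agree with the exact value sqrt(2*pi) from the polar trick.
h = 1e-4
n = int(20 / h)
I = sum(math.exp(-((-10 + (i + 0.5) * h) ** 2) / 2) * h for i in range(n))
print(abs(I - math.sqrt(2 * math.pi)) < 1e-6)  # True
```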

By the above scaling, we see that we can compute the mean and variance for all normal random variables if we can compute them for the standard normal. In fact, we have

E[Z] = ∫_{−∞}^{∞} x f_Z(x) dx = (1/√(2π)) ∫_{−∞}^{∞} x e^{−x^2/2} dx = −(1/√(2π)) e^{−x^2/2} |_{−∞}^{∞} = 0

and

Var(Z) = E[Z^2] = (1/√(2π)) ∫_{−∞}^{∞} x^2 e^{−x^2/2} dx

Integration by parts with u = x and dv = x e^{−x^2/2} dx yields Var(Z) = 1.

Therefore, we see that if X = µ + σZ, then X is a normal random variable with parameters µ, σ^2 and

E[X] = µ,  Var(X) = σ^2 Var(Z) = σ^2

NOTATION: Typically, we denote Φ(x) = P{Z ≤ x} for a standard normal Z. That is,

Φ(x) = (1/√(2π)) ∫_{−∞}^x e^{−y^2/2} dy

Note that by the symmetry of the density we have

Φ(−x) = 1 − Φ(x)

Therefore, it is customary to provide tables with the values of Φ(x) for x ∈ (0, 3.5) in increments of 0.01 (see the table in the book). Note that if X ~ Normal(µ, σ^2), then Φ can still be used:

F_X(a) = P{X ≤ a} = P{(X − µ)/σ ≤ (a − µ)/σ} = Φ((a − µ)/σ)
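The notes use tabulated values of Φ; in code, Φ can be evaluated through the error function, since Φ(x) = (1 + erf(x/√2))/2. A quick check of the symmetry relation (illustrative, not from the notes):

```python
import math

def Phi(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Symmetry: Phi(-x) = 1 - Phi(x), and Phi(0) = 1/2.
print(abs(Phi(-1.3) - (1.0 - Phi(1.3))) < 1e-12)  # True
print(round(Phi(0.0), 1))  # 0.5
```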

EXAMPLE: Let X ~ Normal(3, 9), i.e. µ = 3 and σ^2 = 9. Find P{2 < X < 5}.

Solution: We have, where Z is a standard normal,

P{2 < X < 5} = P{(2 − 3)/3 < Z < (5 − 3)/3} = P{−1/3 < Z < 2/3} = Φ(2/3) − Φ(−1/3)
= Φ(2/3) − (1 − Φ(1/3)) ≈ Φ(0.66) − 1 + Φ(0.33) = 0.7454 − 1 + 0.6293 = 0.3747

We can use the normal random variable to approximate a binomial. This is a special case of the Central Limit Theorem.

THEOREM (The DeMoivre–Laplace limit theorem): Let S_n denote the number of successes that occur when n independent trials, each with probability p of success, are performed. Then for any a < b we have

P{a ≤ (S_n − np)/√(np(1 − p)) ≤ b} → Φ(b) − Φ(a)

A rule of thumb is that this is a good approximation when np(1 − p) ≥ 10. Note that there is no assumption that p is small, unlike in the Poisson approximation.

EXAMPLE: Suppose that a biased coin is flipped 70 times and that the probability of heads is 0.4. Approximate the probability that there are 30 heads. Compare with the actual answer. What can you say about the probability that there are between 20 and 40 heads?

Solution: Let X be the number of times the coin lands on heads. We first note that a normal random variable is continuous, whereas the binomial is discrete. Thus (and this is called the continuity correction) we do the following:

P{X = 30} = P{29.5 < X < 30.5} = P{(29.5 − 28)/√(70 · 0.4 · 0.6) < (X − 28)/√(70 · 0.4 · 0.6) < (30.5 − 28)/√(70 · 0.4 · 0.6)}
≈ P{0.366 < Z < 0.610} ≈ Φ(0.61) − Φ(0.37) ≈ 0.7291 − 0.6443 = 0.0848

whereas the exact answer is

P{X = 30} = (70 choose 30) (0.4)^{30} (0.6)^{40} ≈ 0.0853

We now evaluate the probability that there are between 20 and 40 heads:

P{20 < X < 40} = P{19.5 < X < 40.5} = P{(19.5 − 28)/√(70 · 0.4 · 0.6) < (X − 28)/√(70 · 0.4 · 0.6) < (40.5 − 28)/√(70 · 0.4 · 0.6)}
≈ P{−2.07 < Z < 3.05} ≈ Φ(3.05) − Φ(−2.07) = Φ(3.05) − (1 − Φ(2.07)) ≈ 0.9989 − (1 − 0.9808) = 0.9797
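The approximation and the exact binomial probability can be compared in code (an illustrative sketch; Φ is evaluated via the error function rather than a table, so the result differs slightly from the table-based value above):

```python
import math

# DeMoivre-Laplace with continuity correction vs. the exact binomial,
# for n = 70, p = 0.4 (np(1-p) = 16.8 > 10, so the rule of thumb is met).
n, p = 70, 0.4
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

approx = Phi((30.5 - mu) / sigma) - Phi((29.5 - mu) / sigma)
exact = math.comb(70, 30) * p ** 30 * (1 - p) ** 40
print(round(approx, 3), round(exact, 3))
```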

Section 5.5 Exponential Random Variables

Exponential random variables arise as the distribution of the amount of time until some specific event occurs. A continuous random variable with the density function

f(x) = λe^{−λx} for x ≥ 0, and zero otherwise,

for some λ > 0, is called an exponential random variable with parameter λ. The distribution function for t ≥ 0 is

F(t) = P{X ≤ t} = ∫_0^t λe^{−λx} dx = −e^{−λx} |_{x=0}^t = 1 − e^{−λt}

Computing the moments of an exponential is easy with integration by parts. For n > 0:

E[X^n] = ∫_0^∞ x^n λe^{−λx} dx = −x^n e^{−λx} |_{x=0}^∞ + ∫_0^∞ n x^{n−1} e^{−λx} dx = (n/λ) ∫_0^∞ x^{n−1} λe^{−λx} dx = (n/λ) E[X^{n−1}]

Therefore, for n = 1 and n = 2 we see that

E[X] = 1/λ,  E[X^2] = (2/λ) E[X] = 2/λ^2,  ...,  E[X^n] = n!/λ^n

and

Var(X) = 2/λ^2 − (1/λ)^2 = 1/λ^2

EXAMPLE: Suppose the length of a phone call in minutes is an exponential random variable with parameter λ = 1/10 (so the average call is 10 minutes). What is the probability that a random call will take

(a) more than 8 minutes?
(b) between 8 and 22 minutes?

Solution: Let X be the length of the call. We have

P{X > 8} = 1 − F(8) = 1 − (1 − e^{−(1/10)·8}) = e^{−0.8} ≈ 0.4493

P{8 < X < 22} = F(22) − F(8) = e^{−0.8} − e^{−2.2} ≈ 0.3385

Memoryless property: We say that a nonnegative random variable X is memoryless if

P{X > s + t | X > t} = P{X > s} for all s, t ≥ 0

Note that this property is similar to the one for the geometric random variable. Let us show that an exponential random variable satisfies the memoryless property. In fact, we have

P{X > s + t | X > t} = P{X > s + t}/P{X > t} = e^{−λ(s+t)}/e^{−λt} = e^{−λs} = P{X > s}

One can show that the exponential is the only continuous distribution with this property.
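A numerical check of the phone-call example and of the memoryless property, using the tail P{X > x} = e^{−λx} (illustrative code, not from the notes):

```python
import math

lam = 0.1  # lambda = 1/10, so the average call is 10 minutes
tail = lambda x: math.exp(-lam * x)   # P{X > x} for an exponential

print(round(tail(8), 4))              # 0.4493  -> P{X > 8}
print(round(tail(8) - tail(22), 4))   # 0.3385  -> P{8 < X < 22}

# Memoryless: P{X > s+t}/P{X > t} equals P{X > s} (here s = 5, t = 12).
s, t = 5.0, 12.0
print(abs(tail(s + t) / tail(t) - tail(s)) < 1e-12)  # True
```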

EXAMPLE: Three people are in a post office. Persons 1 and 2 are at the counter and will each leave after an Exp(λ) amount of time. Once one leaves, person 3 will go to the counter and be served after an Exp(λ) amount of time. What is the probability that person 3 is the last to leave the post office?

Solution: After one of persons 1 and 2 leaves, person 3 takes his spot. By the memoryless property, the waiting time for the remaining person is again Exp(λ), just like person 3's. By symmetry, the probability is then 1/2.

Hazard Rate Functions. Let X be a positive, continuous random variable that we think of as the lifetime of something. Let it have distribution function F and density f. Then the hazard (or failure) rate function λ(t) of the random variable is defined via

λ(t) = f(t)/F̄(t), where F̄ = 1 − F

What is this about? Note that

P{X ∈ (t, t + dt) | X > t} = P{X ∈ (t, t + dt), X > t}/P{X > t} = P{X ∈ (t, t + dt)}/P{X > t} ≈ (f(t)/F̄(t)) dt

Therefore, λ(t) represents the conditional probability intensity that a t-unit-old item will fail. Note that for the exponential distribution we have

λ(t) = f(t)/F̄(t) = λe^{−λt}/e^{−λt} = λ

The parameter λ is usually referred to as the rate of the distribution. It also turns out that the hazard function λ(t) uniquely determines the distribution of a random variable. In fact, we have

λ(t) = F′(t)/(1 − F(t))

Integrating both sides, we get

ln(1 − F(t)) = −∫_0^t λ(s) ds + k

Thus,

1 − F(t) = e^k exp(−∫_0^t λ(s) ds)

Setting t = 0 (note that F(0) = 0) shows that k = 0. Therefore

1 − F(t) = exp(−∫_0^t λ(s) ds)
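The reconstruction 1 − F(t) = exp(−∫_0^t λ(s) ds) can be sanity-checked for a constant hazard λ(s) = λ, which must return the exponential tail e^{−λt} (an illustrative sketch; the values of λ and t are arbitrary):

```python
import math

# Constant hazard lambda(s) = lam: the survival function rebuilt from the
# hazard should equal the exponential tail e^{-lam * t}.
lam, t = 2.0, 1.5
h = 1e-5
n = round(t / h)
integral = sum(lam * h for _ in range(n))   # crude quadrature of the hazard
survival = math.exp(-integral)
print(abs(survival - math.exp(-lam * t)) < 1e-6)  # True
```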

EXAMPLE: Suppose that the life distribution of an item has the hazard rate function

λ(t) = t^3, t > 0

What is the probability that

(a) the item survives to age 2?
(b) the item's lifetime is between 0.4 and 1.4?
(c) the item survives to age 1?
(d) an item that survived to age 1 will survive to age 2?

Solution: We are given λ(t). We know that

1 − F(t) = exp(−∫_0^t λ(s) ds) = exp(−∫_0^t s^3 ds) = exp(−t^4/4)

Therefore, letting X be the lifetime of the item, we get

(a) P{X > 2} = 1 − F(2) = exp(−2^4/4) = e^{−4} ≈ 0.0183

(b) P{0.4 < X < 1.4} = F(1.4) − F(0.4) = exp(−(0.4)^4/4) − exp(−(1.4)^4/4) ≈ 0.6109

(c) P{X > 1} = 1 − F(1) = exp(−1/4) ≈ 0.7788

(d) P{X > 2 | X > 1} = P{X > 2}/P{X > 1} = (1 − F(2))/(1 − F(1)) = e^{−4}/e^{−1/4} = e^{−15/4} ≈ 0.0235

EXAMPLE: A smoker is said to have a rate of death twice that of a non-smoker at every age. What does this mean?

Solution: Let λ_s(t) denote the hazard of a smoker and λ_n(t) the hazard of a non-smoker. Then

λ_s(t) = 2λ_n(t)

Let X_n and X_s denote the lifetimes of a non-smoker and a smoker, respectively. The probability that a non-smoker of age A will reach age B > A is

P{X_n > B | X_n > A} = F̄_n(B)/F̄_n(A) = exp(−∫_0^B λ_n(t) dt)/exp(−∫_0^A λ_n(t) dt) = exp(−∫_A^B λ_n(t) dt)

whereas the corresponding probability for a smoker is

P{X_s > B | X_s > A} = exp(−∫_A^B λ_s(t) dt) = exp(−2 ∫_A^B λ_n(t) dt) = [exp(−∫_A^B λ_n(t) dt)]^2 = [P{X_n > B | X_n > A}]^2

Thus, if the probability that a 50-year-old non-smoker reaches 60 is 0.7165, the corresponding probability for a smoker is 0.7165^2 ≈ 0.5134. If the probability that a 50-year-old non-smoker reaches 80 is 0.2, the probability for a smoker is 0.2^2 = 0.04.
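The four probabilities in the hazard-rate example λ(t) = t^3 can be reproduced in code, using the survival function F̄(t) = exp(−t^4/4) (illustrative, not from the notes):

```python
import math

# Survival function for the hazard lambda(t) = t^3.
Fbar = lambda t: math.exp(-t ** 4 / 4.0)

print(round(Fbar(2.0), 4))                # 0.0183  (a) P{X > 2}
print(round(Fbar(0.4) - Fbar(1.4), 4))    # 0.6109  (b) P{0.4 < X < 1.4}
print(round(Fbar(1.0), 4))                # 0.7788  (c) P{X > 1}
print(round(Fbar(2.0) / Fbar(1.0), 4))    # 0.0235  (d) P{X > 2 | X > 1}
```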

Section 5.6 Other Continuous Distributions

The Gamma distribution: A random variable X is said to have a gamma distribution with parameters (α, λ), for λ > 0 and α > 0, if its density function is

f(x) = λe^{−λx} (λx)^{α−1}/Γ(α) for x ≥ 0, and f(x) = 0 for x < 0

where Γ(α) is called the gamma function and is defined by

Γ(α) = ∫_0^∞ e^{−y} y^{α−1} dy

Note that Γ(1) = 1. For α > 1 we use integration by parts with u = y^{α−1} and dv = e^{−y} dy to show that

Γ(α) = −e^{−y} y^{α−1} |_{y=0}^∞ + (α − 1) ∫_0^∞ e^{−y} y^{α−2} dy = (α − 1)Γ(α − 1)

Therefore, for integer values n we see that

Γ(n) = (n − 1)Γ(n − 1) = (n − 1)(n − 2)Γ(n − 2) = ... = (n − 1)(n − 2)···3·2·Γ(1) = (n − 1)!

So the gamma function is a generalization of the factorial.

REMARK: One can show that

Γ(1/2) = √π,  Γ(3/2) = (1/2)√π,  Γ(5/2) = (3/4)√π,  ...,  Γ(1/2 + n) = (2n)!√π/(4^n n!) for n ≥ 0

The gamma distribution comes up a lot as the sum of independent exponential random variables of parameter λ. That is, if X_1, ..., X_n are independent Exp(λ), then T_n = X_1 + ... + X_n is gamma(n, λ). Recall that the time of the nth jump of a Poisson process N(t) is the sum of n exponential random variables of parameter λ; thus, the time of the nth event follows a gamma(n, λ) distribution. To prove this, we denote the time of the nth jump by T_n and see that

P{T_n ≤ t} = P{N(t) ≥ n} = Σ_{j=n}^∞ P{N(t) = j} = Σ_{j=n}^∞ e^{−λt} (λt)^j/j!

Differentiating both sides, we get

f_{T_n}(t) = Σ_{j=n}^∞ [ λe^{−λt} (λt)^{j−1}/(j − 1)! − λe^{−λt} (λt)^j/j! ] = λe^{−λt} (λt)^{n−1}/(n − 1)!

since the sum telescopes, which is the density of a gamma(n, λ) random variable.

REMARK: Note that, as expected, gamma(1, λ) is Exp(λ).
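The claim that a sum of independent exponentials is gamma can be sanity-checked by simulation, at least at the level of the mean n/λ and variance n/λ^2 (illustrative code with arbitrary choices n = 5, λ = 2):

```python
import random

random.seed(3)
# Sums of n_rv independent Exp(lam) variables; their distribution is
# gamma(n_rv, lam), so the mean should be n_rv/lam and the variance n_rv/lam^2.
n_rv, lam, trials = 5, 2.0, 100_000
sums = [sum(random.expovariate(lam) for _ in range(n_rv)) for _ in range(trials)]
mean = sum(sums) / trials
var = sum((s - mean) ** 2 for s in sums) / trials
print(abs(mean - n_rv / lam) < 0.02, abs(var - n_rv / lam ** 2) < 0.05)  # True True
```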

We have

E[X] = (1/Γ(α)) ∫_0^∞ x λe^{−λx} (λx)^{α−1} dx = (1/(λΓ(α))) ∫_0^∞ λe^{−λx} (λx)^α dx = Γ(α + 1)/(λΓ(α))

so

E[X] = α/λ

Recall that the expected value of Exp(λ) is 1/λ, and this ties in nicely with the sum of exponentials being a gamma. It can also be shown that

Var(X) = α/λ^2

DEFINITION: The gamma distribution with λ = 1/2 and α = n/2, where n is a positive integer, is called the χ²_n distribution with n degrees of freedom.

Section 5.7 The Distribution of a Function of a Random Variable

We formalize what we've already done.

EXAMPLE: Let X be uniformly distributed over (0, 1), and let Y = X^n. For 0 ≤ y ≤ 1 we have

F_Y(y) = P{Y ≤ y} = P{X^n ≤ y} = P{X ≤ y^{1/n}} = F_X(y^{1/n}) = y^{1/n}

Therefore, for y ∈ (0, 1) we have

f_Y(y) = (1/n) y^{1/n − 1}

and zero otherwise.

THEOREM: Let X be a continuous random variable with density f_X. Suppose g(x) is strictly monotone and differentiable. Let Y = g(X). Then

f_Y(y) = f_X(g^{−1}(y)) |d/dy g^{−1}(y)| if y = g(x) for some x, and f_Y(y) = 0 if y ≠ g(x) for all x

Proof: Suppose g′(x) > 0 (the proof in the other case is similar). Suppose y = g(x) for some x, i.e. y is in the range of Y. Then

F_Y(y) = P{g(X) ≤ y} = P{X ≤ g^{−1}(y)} = F_X(g^{−1}(y))

Differentiating both sides, we get

f_Y(y) = f_X(g^{−1}(y)) d/dy g^{−1}(y)
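The example Y = X^n can be checked by simulation: for X uniform on (0, 1), the CDF of Y is F_Y(y) = y^{1/n} (illustrative code with arbitrary choices n = 3, y = 0.5):

```python
import random

random.seed(4)
# For X uniform on (0, 1) and Y = X^3, the CDF is F_Y(y) = y^(1/3).
trials = 200_000
emp = sum(random.random() ** 3 <= 0.5 for _ in range(trials)) / trials
print(abs(emp - 0.5 ** (1 / 3)) < 0.01)  # True
```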