
PROPERTIES OF PROBABILITY

S is the sample space; A and B are arbitrary events; A′ is the complement of A.

Proposition: For any event A, P(A′) = 1 − P(A).

Proposition: If A and B are mutually exclusive, that is, A ∩ B = ∅, then P(A ∩ B) = 0.

Proposition: For any two events A and B,

    P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Definition: For any two events A and B with P(B) > 0, the conditional probability of A given that B has occurred is defined by

    P(A | B) = P(A ∩ B) / P(B).

Multiplication Rule: P(A ∩ B) = P(A | B) P(B).

The Law of Total Probability: Let A_1, A_2, ..., A_n be mutually exclusive and exhaustive events. Then for any other event B,

    P(B) = P(B | A_1)P(A_1) + ... + P(B | A_n)P(A_n) = Σ_{i=1}^{n} P(B | A_i)P(A_i).

Definition: Two events A and B are independent if P(A | B) = P(A), and are dependent otherwise.

Proposition: A and B are independent if and only if P(A ∩ B) = P(A)P(B).
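
The identities above are easy to spot-check numerically. The following is a minimal Python sketch of my own (not part of the original notes; it assumes NumPy is installed) that estimates the addition and multiplication rules for two events defined on a fair die roll.

    import numpy as np

    rng = np.random.default_rng(0)
    rolls = rng.integers(1, 7, size=100_000)   # fair six-sided die
    A = rolls % 2 == 0                         # event A: roll is even
    B = rolls >= 4                             # event B: roll is at least 4

    p_A, p_B = A.mean(), B.mean()
    p_union = (A | B).mean()
    p_inter = (A & B).mean()
    p_A_given_B = A[B].mean()                  # P(A | B): relative frequency of A within B

    # Addition rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
    print(p_union, p_A + p_B - p_inter)
    # Multiplication rule: P(A ∩ B) = P(A | B) P(B)
    print(p_inter, p_A_given_B * p_B)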

DISCRETE RANDOM VARIABLES

Definition: For a given sample space S of some experiment, a random variable (rv) is any rule that associates a number with each outcome in S.

Definition: Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.

Example: Consider a coin-tossing game, with S = {H, T}. Let X be the random variable equal to 0 if the outcome is T, and equal to 1 if the outcome is H.

Definition: A random variable is said to be discrete if its set of possible values is a discrete set, i.e., if it either consists of a finite number of elements or its elements can be listed in a sequence x_1, x_2, ..., x_n, ....

Definition: The probability mass function (pmf) of a discrete rv is defined for every real number x as p(x) = P(X = x).

Remark: For every possible value x of the random variable, the pmf specifies the probability of observing that value when the experiment is performed. Any pmf must satisfy the two conditions

    p(x) ≥ 0 and Σ_x p(x) = 1.

Definition: The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined for every number x by

    F(x) = P(X ≤ x) = Σ_{y: y ≤ x} p(y).

For any number x, F(x) is the probability that the observed value of X will be at most x.

Definition: Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X) or µ_X, is

    E(X) = µ_X = Σ_{x ∈ D} x p(x).

Proposition (Rules of Expected Value): For any constants a and b and random variables X and Y,

    E(aX + b) = a E(X) + b,    E(aX + bY) = a E(X) + b E(Y).
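
As a concrete illustration (my own toy example, not from the notes), the sketch below tabulates the pmf of X = number of heads in two fair coin tosses, then evaluates the cdf and the expected value directly from the definitions above.

    pmf = {0: 0.25, 1: 0.5, 2: 0.25}              # X = number of heads in two tosses
    assert all(p >= 0 for p in pmf.values())      # pmf condition 1: p(x) >= 0
    assert abs(sum(pmf.values()) - 1.0) < 1e-12   # pmf condition 2: probabilities sum to 1

    def F(x):
        # cdf: F(x) = P(X <= x) = sum of p(y) over all y <= x
        return sum(p for y, p in pmf.items() if y <= x)

    EX = sum(x * p for x, p in pmf.items())       # E(X) = sum of x p(x)
    print(F(1.5), EX)                             # 0.75, 1.0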

Definition: Let X have pmf p(x) and expected value µ. Then the variance of X, denoted by Var(X) or σ_X², is

    Var(X) = Σ_{x ∈ D} (x − µ)² p(x) = E[(X − µ)²].

The standard deviation (SD) of X is σ_X = √(σ_X²).

Proposition (shortcut formula):

    Var(X) = σ_X² = [Σ_{x ∈ D} x² p(x)] − µ² = E(X²) − (E(X))².

Proposition (Rules of Variance): For any constants a and b, and for two independent random variables X and Y,

    Var(aX + b) = a² σ_X²,    σ_{aX+b} = |a| σ_X,
    Var(aX + bY) = a² Var(X) + b² Var(Y),    σ_{aX+bY} = √(a² σ_X² + b² σ_Y²).
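
A quick numerical check of the shortcut formula and the rule Var(aX + b) = a² Var(X), reusing the two-coin pmf (again a sketch of mine, not the notes' own example):

    pmf = {0: 0.25, 1: 0.5, 2: 0.25}
    EX  = sum(x * p for x, p in pmf.items())
    EX2 = sum(x**2 * p for x, p in pmf.items())
    var = EX2 - EX**2                        # shortcut: E(X^2) − (E X)^2

    a, b = 3.0, 7.0
    # Var(aX + b) computed from its definition; the shift b drops out
    var_ab = sum(((a*x + b) - (a*EX + b))**2 * p for x, p in pmf.items())
    print(var, var_ab, a**2 * var)           # 0.5, 4.5, 4.5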

CONTINUOUS RANDOM VARIABLES

Definition: A random variable X is said to be continuous if its set of possible values is an entire interval of numbers; that is, for some A ≤ B, any number x between A and B is a possible value of X.

Definition: A probability density function (pdf) of a continuous rv X is a function f(x) such that for any two numbers a and b with a ≤ b,

    P(a ≤ X ≤ b) = ∫_a^b f(x) dx.

Remark: The definition above means that the probability that X takes on a value in the interval [a, b] is the area under the graph of the density function f(x) over that interval.

Proposition: For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
1. f(x) ≥ 0 for all x;
2. ∫_{−∞}^{+∞} f(x) dx = 1, that is, the area under the entire graph of f(x) is equal to 1.

Proposition: If X is a continuous rv, then for any number c, P(X = c) = 0.

Definition: The cumulative distribution function (cdf) F(x) of a continuous rv X with pdf f(x) is defined for every number x by

    F(x) = P(X ≤ x) = ∫_{−∞}^x f(y) dy.

Remark: For each x, F(x) is the area under the density curve to the left of x. In particular, F(x) increases (from 0 to 1) as x increases.

Definition: The expected value or mean value of a continuous rv X with pdf f(x) is

    E(X) = µ_X = ∫_{−∞}^{+∞} x f(x) dx.

Proposition: If X is a continuous rv with pdf f(x) and h(X) is any function of X, then

    E[h(X)] = µ_{h(X)} = ∫_{−∞}^{+∞} h(x) f(x) dx.
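
These integrals can be evaluated numerically. A minimal sketch (my own example, assuming SciPy is available) using the density f(x) = 2x on [0, 1]:

    from scipy.integrate import quad

    f = lambda x: 2 * x                        # a legitimate pdf on [0, 1]
    area, _ = quad(f, 0, 1)                    # total area under f: should be 1
    prob, _ = quad(f, 0.2, 0.5)                # P(0.2 <= X <= 0.5)
    EX, _   = quad(lambda x: x * f(x), 0, 1)   # E(X) = integral of x f(x)
    print(area, prob, EX)                      # 1.0, 0.21, 0.6667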

Definition: The variance of a continuous rv X with pdf f(x) and expected value µ is

    σ_X² = Var(X) = ∫_{−∞}^{+∞} (x − µ)² f(x) dx = E[(X − µ)²].

The standard deviation (SD) of X is σ_X = √(σ_X²).

Proposition (shortcut formula):

    Var(X) = σ_X² = [∫_{−∞}^{+∞} x² f(x) dx] − µ² = E(X²) − (E(X))².
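
Continuing the f(x) = 2x example from above, the defining integral and the shortcut formula give the same variance (a hedged sketch, assumes SciPy):

    from scipy.integrate import quad

    f  = lambda x: 2 * x
    mu = quad(lambda x: x * f(x), 0, 1)[0]                  # E(X) = 2/3
    var_def = quad(lambda x: (x - mu)**2 * f(x), 0, 1)[0]   # E[(X − µ)^2]
    EX2     = quad(lambda x: x**2 * f(x), 0, 1)[0]
    print(var_def, EX2 - mu**2)                             # both 1/18 ≈ 0.0556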

IMPORTANT DISCRETE RANDOM VARIABLES

1. Binomial Random Variable

Definition: An experiment for which the following four conditions hold is called a binomial experiment:
1. The experiment consists of a sequence of n trials, where n is fixed in advance of the experiment.
2. The trials are identical, and each trial can result in one of the same two possible outcomes, which we denote by success (S) or failure (F).
3. The trials are independent, so that the outcome of any particular trial does not influence the outcome of any other trial.
4. The probability of success is constant from trial to trial; we denote this probability by p.

Definition: Given a binomial experiment consisting of n trials, the binomial random variable X associated with this experiment is defined as X = the number of S's among the n trials.

Remark: A binomial random variable X has two parameters, n and p. We will use the notation X ~ B(n, p).

Theorem: Let X ~ B(n, p), that is, X is a binomial rv with parameters n and p. Then the pmf of X is

    f(x) = P(X = x) = (n choose x) p^x (1 − p)^{n−x} for x = 0, 1, ..., n, and 0 otherwise.

Theorem: Let X ~ B(n, p). Then E(X) = µ_X = np and Var(X) = σ_X² = np(1 − p).

2. Poisson Random Variable

Definition: A random variable X is said to have a Poisson distribution with parameter λ, written X ~ P(λ), if the pmf of X is

    P(X = x) = e^{−λ} λ^x / x!,    x = 0, 1, 2, ...,

for some λ > 0.

The value of λ is frequently a rate per unit time or unit area of occurrence of a certain event, and X denotes the number of occurrences of this event during the unit time or area.
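
The binomial pmf formula and the mean/variance theorem can be verified against scipy.stats (a sketch, not part of the notes):

    from math import comb
    from scipy import stats

    n, p = 10, 0.3
    x = 4
    pmf_formula = comb(n, x) * p**x * (1 - p)**(n - x)   # C(n, x) p^x (1−p)^(n−x)
    print(pmf_formula, stats.binom.pmf(x, n, p))         # agree
    print(stats.binom.mean(n, p), n * p)                 # E(X) = np = 3.0
    print(stats.binom.var(n, p), n * p * (1 - p))        # Var(X) = np(1−p) = 2.1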

The Poisson probability model assumes that
1. the events occur independently,
2. the probability that an event occurs does not change in time,
3. the probability that an event will occur in an interval is proportional to the length of the interval,
4. the probability of more than one event occurring at the same time is vanishingly small.

Proposition: If X has a Poisson distribution with parameter λ, then E(X) = Var(X) = λ.

Proposition: Suppose that we have a sequence of binomial rvs B(n, p), and we let n → ∞ and p → 0 in such a way that np remains fixed at a value λ > 0. Then B(n, p) → P(λ).

Remark: According to this proposition, in any binomial experiment in which n is large and p is small, B(n, p) ≈ P(λ), where λ = np. As a rule of thumb, this approximation can safely be applied if n ≥ 100, p ≤ .01, and np ≤ 20. (A numerical comparison is sketched below.)

3. Geometric Random Variable

A geometric rv and distribution are based on an experiment satisfying the following conditions:
1. The experiment consists of a sequence of independent trials.
2. Each trial can result in either a success (S) or a failure (F).
3. The probability of success is constant from trial to trial, so P(S on trial i) = p for i = 1, 2, 3, ....
4. The experiment continues (trials are performed) until the first success has been observed.

Proposition: The pmf of a geometric rv X with parameter p = P(S) is

    P(X = x) = (1 − p)^{x−1} p,    x = 1, 2, ....

Proposition: If X is a geometric rv with parameter p, then

    E(X) = 1/p,    Var(X) = (1 − p)/p².
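
The sketch below (mine, staying inside the rule-of-thumb regime) compares binomial and Poisson pmfs for large n and small p, and also checks the geometric mean and variance formulas against scipy.stats:

    from scipy import stats

    n, p = 500, 0.01                 # large n, small p; λ = np = 5
    lam = n * p
    for k in (0, 2, 5, 10):
        print(k, stats.binom.pmf(k, n, p), stats.poisson.pmf(k, lam))  # close

    # Geometric: E(X) = 1/p, Var(X) = (1 − p)/p^2
    pg = 0.2
    print(stats.geom.mean(pg), 1 / pg)              # 5.0
    print(stats.geom.var(pg), (1 - pg) / pg**2)     # 20.0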

4. Hypergeometric Random Variable

The assumptions leading to the hypergeometric distribution are as follows:
1. The population or set to be sampled consists of N individuals, objects, or elements (a finite population).
2. Each individual can be characterized as a success (S) or a failure (F), and there are M successes in the population.
3. A sample of n individuals is drawn in such a way that each subset of size n is equally likely to be chosen.

The random variable of interest is X = the number of S's in the sample. The probability distribution of X depends on the parameters n, M, and N.

Example: Suppose that a sample of size n is to be chosen randomly (without replacement) from an urn containing N balls, of which M are white and N − M are black. If X denotes the number of white balls selected, then X has a hypergeometric distribution with parameters n, M, and N.

Proposition: The pmf of a hypergeometric random variable X with parameters n, M, and N is given by

    P(X = x) = (M choose x)(N − M choose n − x) / (N choose n)

for x an integer satisfying max(0, n − N + M) ≤ x ≤ min(n, M).
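
scipy.stats implements this distribution, but note that its argument order differs from the (n, M, N) notation used here: scipy's hypergeom takes (population size, number of successes, sample size). A hedged sketch for the urn example:

    from math import comb
    from scipy import stats

    N, M, n = 20, 7, 5        # 20 balls, 7 white, draw 5 without replacement
    x = 2
    pmf_formula = comb(M, x) * comb(N - M, n - x) / comb(N, n)
    # scipy's order: hypergeom.pmf(x, population, successes, draws)
    print(pmf_formula, stats.hypergeom.pmf(x, N, M, n))   # agree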

IMPORTANT CONTINUOUS RANDOM VARIABLES

1. Normal Distribution

Definition: A continuous rv X is said to have a normal distribution with parameters µ and σ², where −∞ < µ < +∞ and σ > 0, if the pdf of X is

    f(x; µ, σ) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)},    −∞ < x < +∞.

Remark: The statement that X is normally distributed with parameters µ and σ² is abbreviated X ~ N(µ, σ²).

Definition: The normal distribution with parameter values µ = 0 and σ = 1 is called the standard normal distribution, and a random variable that has this distribution is called a standard normal random variable, denoted by Z. The pdf of Z is

    ϕ(z) = f(z; 0, 1) = (1/√(2π)) e^{−z²/2},    −∞ < z < +∞.

The cdf of Z is

    Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} ϕ(y) dy.

Notation: z_α denotes the value on the measurement axis for which α of the area under the z curve lies to the right of z_α, that is, P(Z ≥ z_α) = α.

Proposition: If X ~ N(µ, σ²), then

    Z = (X − µ)/σ

is a standard normal rv.

Empirical Rule: If the population distribution of a variable is (approximately) normal, then
1. roughly 68% of the values are within 1 SD (standard deviation) of the mean,
2. roughly 95% of the values are within 2 SDs of the mean,
3. roughly 99.7% of the values are within 3 SDs of the mean.
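
The standardization proposition and the empirical rule can both be checked with the standard normal cdf Φ (a sketch of mine, assuming SciPy):

    from scipy import stats

    mu, sigma = 100, 15
    x = 130
    # P(X <= x) for X ~ N(µ, σ²) equals Φ((x − µ)/σ)
    print(stats.norm.cdf(x, loc=mu, scale=sigma), stats.norm.cdf((x - mu) / sigma))

    # Empirical rule: P(|Z| <= k) for k = 1, 2, 3
    for k in (1, 2, 3):
        print(k, stats.norm.cdf(k) - stats.norm.cdf(-k))   # ≈ 0.683, 0.954, 0.997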

Proposition: Let X be a binomial rv based on n trials with success probability p, and let q = 1 − p. If the binomial probability histogram is not too skewed, then X has approximately a normal distribution with µ = np and σ = √(npq). In particular, for a possible value k of X (using the continuity correction),

    P(X ≤ k) ≈ Φ((k + .5 − np)/√(npq)).

In practice, the approximation is adequate provided that both np ≥ 5 and nq ≥ 5.

2. Lognormal Distribution

Definition: A nonnegative rv X is said to have a lognormal distribution if the rv Y = ln(X) has a normal distribution. The resulting pdf of a lognormal rv, when ln(X) ~ N(µ, σ²), is

    f(x; µ, σ) = (1/(xσ√(2π))) e^{−[ln(x) − µ]²/(2σ²)} if x > 0, and 0 otherwise.

Remark: Be careful here; µ and σ are not the mean and standard deviation of X but of ln(X).

Proposition: The mean and variance of X can be shown to be

    E(X) = e^{µ+σ²/2},    Var(X) = e^{2µ+σ²}(e^{σ²} − 1).

Because ln(X) has a normal distribution, the cdf of X can be expressed in terms of the cdf Φ(z) of a standard normal rv Z. For x > 0,

    F(x; µ, σ) = P(X ≤ x) = P(ln(X) ≤ ln(x)) = P(Z ≤ (ln(x) − µ)/σ) = Φ((ln(x) − µ)/σ).

Remark: Suppose that X_1 and X_2 are independent rvs from a lognormal distribution with the same parameters, and let Y_1 = ln X_1 and Y_2 = ln X_2. Then

    E[(Y_1 + Y_2)/2] = E[(ln X_1 + ln X_2)/2] = E[ln √(X_1 X_2)].

In general, if X_1, X_2, ..., X_n are independent lognormals with the same parameters and Y_i = ln X_i, i = 1, ..., n, then

    (Y_1 + ... + Y_n)/n = ln((X_1 X_2 ⋯ X_n)^{1/n}).

Thus the mean of the transformed variables corresponds to the geometric mean of the original variables.
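
Both results lend themselves to quick checks: the continuity-corrected normal approximation against the exact binomial cdf, and the lognormal mean formula against scipy.stats.lognorm (which takes σ as its shape parameter and e^µ as its scale). A hedged sketch:

    import numpy as np
    from scipy import stats

    n, p = 50, 0.4
    q, k = 1 - p, 22
    approx = stats.norm.cdf((k + 0.5 - n * p) / np.sqrt(n * p * q))
    print(stats.binom.cdf(k, n, p), approx)          # close (np = 20, nq = 30)

    mu, sigma = 1.0, 0.5
    # E(X) = e^{µ + σ²/2}; scipy's lognorm uses shape = σ, scale = e^µ
    print(stats.lognorm.mean(sigma, scale=np.exp(mu)), np.exp(mu + sigma**2 / 2))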

3. Exponential Distribution

Definition: A nonnegative rv X is said to have an exponential distribution with parameter λ if the pdf of X is

    f(x) = (1/λ) e^{−x/λ} if x ≥ 0, and 0 if x < 0.

The cdf of X is

    F(x) = P(X ≤ x) = 1 − e^{−x/λ} for x ≥ 0.

Proposition: The mean and variance of X can be shown to be

    E(X) = λ,    Var(X) = λ².
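
Note that λ here is the mean of X, which matches the scale argument in NumPy and SciPy (some texts instead use the rate 1/λ). A small simulation check of E(X) = λ and Var(X) = λ² (my own sketch):

    import numpy as np
    from scipy import stats

    lam = 2.5                                   # mean parameter, as in these notes
    rng = np.random.default_rng(1)
    x = rng.exponential(scale=lam, size=200_000)
    print(x.mean(), lam)                        # ≈ 2.5
    print(x.var(), lam**2)                      # ≈ 6.25
    # cdf: F(x) = 1 − e^{−x/λ}
    print(stats.expon.cdf(3.0, scale=lam), 1 - np.exp(-3.0 / lam))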

DISTRIBUTIONS DERIVED FROM THE NORMAL DISTRIBUTION

Definition: A random variable X with pdf

    g(x) = (λ^α/Γ(α)) x^{α−1} e^{−λx},    x ≥ 0,

has a gamma distribution with parameters α > 0 and λ > 0. The gamma function Γ(x) is defined as

    Γ(x) = ∫_0^∞ u^{x−1} e^{−u} du.

Properties of the Gamma Function:
(i) Γ(x + 1) = xΓ(x);
(ii) Γ(n + 1) = n! for any nonnegative integer n;
(iii) Γ(1/2) = √π.

Remarks:
1. Notice that an exponential rv with parameter 1/θ = λ is a special case of a gamma rv with parameters α = 1 and λ.
2. The sum of n independent identically distributed (iid) exponential rvs with parameter λ has a gamma distribution with parameters n and λ.
3. The sum of n iid gamma rvs with parameters α and λ has a gamma distribution with parameters nα and λ.

Definition: If Z is a standard normal rv, the distribution of U = Z² is called the chi-square distribution with 1 degree of freedom. The density function of U ~ χ²_1 is

    f_U(x) = (x^{−1/2}/√(2π)) e^{−x/2},    x > 0.

Remark: A χ²_1 random variable has the same density as a random variable with a gamma distribution with parameters α = 1/2 and λ = 1/2.

Definition: If U_1, U_2, ..., U_k are independent chi-square rvs, each with 1 degree of freedom, the distribution of V = U_1 + U_2 + ... + U_k is called the chi-square distribution with k degrees of freedom. By Remark 3 above and the preceding remark, a χ²_k rv follows a gamma distribution with parameters α = k/2 and λ = 1/2. Thus the density function of V ~ χ²_k is

    f_V(x) = (1/(2^{k/2} Γ(k/2))) x^{k/2−1} e^{−x/2},    x > 0.

Proposition: If V has a chi-square distribution with k degrees of freedom, then E(V) = k and Var(V) = 2k.
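
Remark 2 and the gamma/chi-square connection can be checked numerically; note that λ in the gamma pdf above is a rate, so scipy's scale argument is 1/λ. A hedged sketch:

    import numpy as np
    from scipy import stats

    # Remark 2: sum of n iid exponential(rate λ) rvs ~ gamma(n, λ)
    n, rate = 3, 2.0
    rng = np.random.default_rng(2)
    sums = rng.exponential(scale=1 / rate, size=(100_000, n)).sum(axis=1)
    print(sums.mean(), n / rate)           # gamma mean α/λ = 1.5

    # χ²_k density equals the gamma(α = k/2, λ = 1/2) density (scale = 2)
    k, x = 5, 3.7
    print(stats.chi2.pdf(x, k), stats.gamma.pdf(x, k / 2, scale=2))   # equal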

Definition: If Z ~ N(0, 1) and U ~ χ²_n, and Z and U are independent, then the distribution of

    T = Z/√(U/n)

is called the t distribution with n degrees of freedom.

Proposition: The density function of the t distribution with n degrees of freedom is

    f(t) = (Γ[(n + 1)/2]/(√(nπ) Γ(n/2))) (1 + t²/n)^{−(n+1)/2}.

Remarks: For the above density, f(t) = f(−t), so the t density is symmetric about zero. As the number of degrees of freedom approaches infinity, the t distribution tends to the standard normal distribution.

Definition: Let U and V be independent chi-square variables with m and n degrees of freedom, respectively. The distribution of

    W = (U/m)/(V/n)

is called the F distribution with m and n degrees of freedom and is denoted by F_{m,n}.

Remarks:
(i) If T ~ t_n, then T² ~ F_{1,n}.
(ii) If X ~ F_{n,m}, then X^{−1} ~ F_{m,n}.
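
Remark (i) can be verified through the cdfs, since P(T² ≤ x) = P(−√x ≤ T ≤ √x) must match the F_{1,n} cdf at x; the limiting normal behavior is also visible numerically (a sketch of mine, assuming SciPy):

    from scipy import stats

    n, x = 8, 2.3
    lhs = stats.t.cdf(x**0.5, n) - stats.t.cdf(-(x**0.5), n)   # P(T^2 <= x)
    rhs = stats.f.cdf(x, 1, n)                                 # F(1, n) cdf at x
    print(lhs, rhs)                                            # equal

    # As the degrees of freedom grow, the t density approaches the standard normal
    print(stats.t.pdf(1.0, 10_000), stats.norm.pdf(1.0))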

COVARIANCE AND CORRELATION OF RANDOM VARIABLES

Definition: Let X and Y be random variables with expected values µ_X and µ_Y, respectively. The covariance of X and Y is

    Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)],

provided that the expectation exists.

Proposition:

    Cov(X, Y) = E(XY) − E(X)E(Y).

Proof: By definition,

    Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = E(XY − Xµ_Y − Yµ_X + µ_Xµ_Y)
              = E(XY) − µ_Xµ_Y − µ_Xµ_Y + µ_Xµ_Y = E(XY) − E(X)E(Y).

Proposition:
(i) If X and Y are independent random variables, then Cov(X, Y) = 0.
(ii) If X = Y with Var(X) = Var(Y) = σ², then Cov(X, Y) = Var(X) = σ².

Definition: If X and Y are random variables whose variances and covariance exist and whose variances are nonzero, then the correlation of X and Y, denoted by ρ, is

    ρ = Cor(X, Y) = Cov(X, Y)/√(Var(X)Var(Y)).

Proposition:
(i) −1 ≤ ρ ≤ 1.
(ii) ρ = ±1 if and only if X = a + bY for some constants a and b.

Proposition: Let X and Y be arbitrary random variables whose variances and covariance exist. Then

    Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).

Proof:

    Var(X + Y) = E[(X + Y − µ_X − µ_Y)²] = E[((X − µ_X) + (Y − µ_Y))²]
               = E[(X − µ_X)² + (Y − µ_Y)² + 2(X − µ_X)(Y − µ_Y)]
               = Var(X) + Var(Y) + 2 Cov(X, Y).
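
The Var(X + Y) identity holds exactly for sample moments as well, which makes it easy to check by simulation (my own sketch, assuming NumPy; x and y are built to be correlated through a shared term):

    import numpy as np

    rng = np.random.default_rng(3)
    z = rng.standard_normal(200_000)
    y = z                                            # Y shares the component z with X
    x = z + 0.5 * rng.standard_normal(200_000)

    cov = ((x - x.mean()) * (y - y.mean())).mean()   # E[(X − µX)(Y − µY)]
    print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov)   # identical up to rounding
    print(np.corrcoef(x, y)[0, 1])                   # sample ρ, within [−1, 1]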