2 Random Variables

2.1 Random Variables

Real-valued functions defined on the sample space are known as random variables (r.v.'s):

\[ X : S \to \mathbb{R}. \]

Example. $X$ is a randomly selected number from the set $\{1, 2, 4, 5, 6, 10\}$. $Y$ is the number of heads that have occurred in tossing a coin 10 times. $V$ is the height of a randomly selected student. $U$ is a randomly selected number from the interval $(0, 1)$.

Discrete and Continuous Random Variables

Random variables may take either a finite or a countable number of possible values. Such random variables are called discrete. However, there also exist random variables that take on a continuum of possible values. These are known as continuous random variables.

Example. Let $X$ be the number of tosses needed to get the first head.

Example. Let $U$ be a number randomly selected from the interval $[0, 1]$.

Distribution Function

The cumulative distribution function (c.d.f.), or simply the distribution function, of the random variable $X$, say $F$, is defined by

\[ F(x) = P(X \le x), \quad x \in \mathbb{R}. \]

Here are some properties of the c.d.f. $F$:

(i) $F(x)$ is a nondecreasing function,
(ii) $\lim_{x \to \infty} F(x) = 1$,
(iii) $\lim_{x \to -\infty} F(x) = 0$.

All probability questions about $X$ can be answered in terms of the c.d.f. $F$. For instance,

\[ P(a < X \le b) = F(b) - F(a). \]

If we desire the probability that $X$ is strictly smaller than $b$, we may calculate it by

\[ P(X < b) = \lim_{h \to 0^+} P(X \le b - h) = \lim_{h \to 0^+} F(b - h). \]

Remark. Note that $P(X < b)$ does not necessarily equal $F(b)$.
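The remark is easy to check numerically for a discrete random variable, where $F$ has jumps. The following minimal Python sketch (an illustration, not part of the original notes) uses a fair six-sided die: $F(3) = P(X \le 3) = 1/2$, while $P(X < 3) = 1/3$.

```python
from fractions import Fraction

# A fair six-sided die: p(x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(b):
    """F(b) = P(X <= b)."""
    return sum(p for x, p in pmf.items() if x <= b)

def prob_less_than(b):
    """P(X < b) = lim_{h -> 0+} F(b - h): the mass at b itself is excluded."""
    return sum(p for x, p in pmf.items() if x < b)

print(cdf(3))             # 1/2
print(prob_less_than(3))  # 1/3 -- differs from F(3), as the remark warns
```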
2.2 Discrete Random Variables

Definition. (Discrete Random Variable) A random variable that can take on at most a countable number of possible values is said to be discrete. For a discrete random variable $X$, we define the probability mass function (or probability density function, p.d.f.) of $X$ by

\[ p(a) = P(X = a). \]

Let $X$ be a random variable that takes the values $x_1, x_2, \ldots$. Then we must have

\[ \sum_{i=1}^{\infty} p(x_i) = 1. \]

The distribution function $F$ can be expressed in terms of the mass function by

\[ F(a) = \sum_{x_i \le a} p(x_i). \]

Example. Let $X$ be a number randomly selected from the set of numbers $\{0, 1, 2, 3, 4, 5\}$. Find $P(X \le 4)$.

The Binomial Random Variable

Suppose that $n$ independent trials, each of which results in a success with probability $p$ and in a failure with probability $1 - p$, are to be performed. If $X$ represents the number of successes that occur in the $n$ trials, then $X$ is said to be a binomial random variable with parameters $(n, p)$, denoted $X \sim B(n, p)$. The probability mass function of a binomial random variable with parameters $(n, p)$ is given by

\[ p(k) = P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, 2, \ldots, n, \]

where

\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \]

Note that

\[ \sum_{k=0}^{n} p(k) = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = \bigl(p + (1-p)\bigr)^n = 1. \]

Example. According to a CNN/USA Today poll, approximately 70% of Americans believe the IRS abuses its power. Let $X$ equal the number of people who believe the IRS abuses its power in a random sample of $n = 20$ Americans. Assuming that the poll results are still valid, find the probability that (a) $X$ is at least 13, (b) $X$ is at most 11.
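With $X \sim B(20, 0.7)$, both parts are finite sums of the pmf. A short Python sketch of the computation (an illustration, not part of the original notes):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.7
at_least_13 = sum(binom_pmf(k, n, p) for k in range(13, n + 1))  # P(X >= 13)
at_most_11  = sum(binom_pmf(k, n, p) for k in range(0, 12))      # P(X <= 11)
print(at_least_13, at_most_11)  # roughly 0.772 and 0.113
```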
The Geometric Random Variable

Suppose that independent trials, each having probability $p$ of being a success, are performed until a success occurs. If we let $X$ be the number of trials required until the first success, then $X$ is said to be a geometric random variable with parameter $p$. Its probability mass function is given by

\[ p(n) = P(X = n) = (1-p)^{n-1} p, \quad n = 1, 2, \ldots \]

Note that

\[ \sum_{n=1}^{\infty} p(n) = p \sum_{n=1}^{\infty} (1-p)^{n-1} = 1. \]

Example. Let $X$ be the number of tosses needed to get the first head. Then

\[ P(X = n) = \frac{1}{2^n}, \quad n = 1, 2, 3, \ldots, \]

so the mass function of $X$ is $p(x) = 1/2^x$. Hence,

\[ \sum_{\text{all } x} p(x) = \sum_{x=1}^{\infty} \frac{1}{2^x} = 1. \]

Example. Signals are transmitted according to a Poisson process with rate $\lambda$. Each signal is successfully transmitted with probability $p$ and lost with probability $1 - p$. The fates of different signals are independent. What is the distribution of the number of signals lost before the first one is successfully transmitted?

The Poisson Random Variable

A random variable $X$, taking on one of the values $0, 1, 2, \ldots$, is said to be a Poisson random variable with parameter $\lambda$ if

\[ p(k) = P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots \]

This equation defines a probability mass function, since

\[ \sum_{k=0}^{\infty} p(k) = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{-\lambda} e^{\lambda} = 1. \]

A Poisson random variable typically arises when counting discrete events occurring in a continuous interval of time, length, or space.

Example. Suppose that the number of typographical errors on a single page of a book has a Poisson distribution with parameter $\lambda = 1$. Calculate the probability that there is at least one error on a page.
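For the typographical-errors example, $P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-1} \approx 0.632$. A minimal Python check (an illustration, not part of the original notes):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# At least one typo on a page, with lambda = 1:
# P(X >= 1) = 1 - P(X = 0) = 1 - e^{-1}.
print(1 - poisson_pmf(0, 1.0))  # about 0.632
```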
Assume that the average number of occurrences of the event per unit of time is $\lambda$, and let $N(t)$ be the number of occurrences of the event in $t$ units of time. Then $N(t)$ is a Poisson random variable with parameter $\lambda t$; that is,

\[ P(N(t) = k) = e^{-\lambda t} \frac{(\lambda t)^k}{k!}, \quad k = 0, 1, 2, \ldots \]

Example. People enter a casino at a rate of 1 for every 2 minutes. (a) What is the probability that no one enters between 12:00 and 12:05? (b) What is the probability that at least 4 people enter the casino during that time?

Theorem. Let $X \sim B(n, p)$. If $n$ is very large and $p$ is small, with $\lambda = np$, then

\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \approx e^{-\lambda} \frac{\lambda^x}{x!}. \]

In other words, $B(n, p) \approx \mathrm{Poisson}(\lambda)$, where $\lambda = np$.

Proof. Let $\lambda = np$, i.e., $p = \lambda/n$. Then

\[
P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}
= \frac{n!}{x!\,(n-x)!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}
= \frac{\lambda^x}{x!} \cdot \frac{n(n-1)\cdots(n-x+1)}{(n-\lambda)^x} \left(1 - \frac{\lambda}{n}\right)^n.
\]

As $n \to \infty$ with $\lambda$ fixed, the factor $\frac{n(n-1)\cdots(n-x+1)}{(n-\lambda)^x}$ tends to 1 and $\left(1 - \frac{\lambda}{n}\right)^n \to e^{-\lambda}$, so $P(X = x) \to e^{-\lambda} \lambda^x / x!$.

Example. Suppose that the probability that a randomly chosen item is defective is 0.01, and 800 items are shipped to a warehouse. What is the probability that there will be at most 5 defective items among the 800?
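For the defective-items example, $\lambda = np = 800 \times 0.01 = 8$, and the theorem says the Poisson tail should be close to the exact binomial one. A quick Python comparison (an illustration, not part of the original notes):

```python
from math import comb, exp, factorial

n, p = 800, 0.01
lam = n * p  # lambda = np = 8

# P(X <= 5): exact binomial sum vs. the Poisson approximation.
exact  = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(6))
approx = sum(exp(-lam) * lam**k / factorial(k) for k in range(6))
print(exact, approx)  # both close to 0.19
```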
2.3 Continuous Random Variables

Let $X$ be a random variable whose set of possible values is uncountable. As noted earlier, such a random variable is called continuous.

Definition. A random variable $X$ is continuous if there exists a nonnegative function $f(x)$, defined for all real $x \in (-\infty, \infty)$, having the property that for any set $B$ of real numbers

\[ P(X \in B) = \int_B f(x)\, dx. \]

The function $f(x)$ is called the probability density function of the random variable $X$. A density function must satisfy

\[ \int_{-\infty}^{\infty} f(x)\, dx = P\bigl(X \in (-\infty, \infty)\bigr) = 1 \]

and

\[ P(a \le X \le b) = \int_a^b f(x)\, dx. \]

The relationship between the c.d.f. $F(x)$ and the p.d.f. $f(x)$ is expressed by

\[ \frac{d}{dx} F(x) = f(x). \]

Remark. The density function is not a probability. However,

\[ P(a - \varepsilon \le X \le a + \varepsilon) = \int_{a-\varepsilon}^{a+\varepsilon} f(x)\, dx \approx 2\varepsilon f(a) \]

when $\varepsilon$ is small. From this, we see that $f(a)$ is a measure of how likely it is that the random variable will be near $a$.

The Uniform Random Variable

A random variable is said to be uniformly distributed over the interval $(0, 1)$ if its probability density function is given by

\[ f(x) = \begin{cases} 1, & 0 < x < 1 \\ 0, & \text{otherwise.} \end{cases} \]

Note that the preceding is a density function, since $f(x) \ge 0$ and

\[ \int_{-\infty}^{\infty} f(x)\, dx = \int_0^1 1\, dx = 1. \]

Moreover, for any $0 < a < b < 1$,

\[ P(a \le X \le b) = \int_a^b f(x)\, dx = \int_a^b 1\, dx = b - a. \]
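The remark that $f(a)$ measures how likely $X$ is to be near $a$ can be checked numerically. The sketch below (an illustration, not part of the original notes) uses the hypothetical density $f(x) = 2x$ on $(0, 1)$, chosen only because it is a simple valid density; it confirms $P(a - \varepsilon \le X \le a + \varepsilon) \approx 2\varepsilon f(a)$.

```python
# f(x) = 2x on (0, 1) is a valid density: it is nonnegative and integrates to 1.
def f(x):
    return 2 * x if 0 < x < 1 else 0.0

def prob_near(a, eps, steps=10_000):
    """P(a - eps <= X <= a + eps) via the midpoint rule."""
    h = 2 * eps / steps
    return sum(f(a - eps + (i + 0.5) * h) for i in range(steps)) * h

a, eps = 0.5, 1e-3
print(prob_near(a, eps))  # about 0.002
print(2 * eps * f(a))     # 2 * eps * f(a) = 0.002 -- matches
```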
In general, we say that $X$ is a uniform random variable on the interval $(\alpha, \beta)$ if its p.d.f. is given by

\[ f(x) = \begin{cases} \dfrac{1}{\beta - \alpha}, & \alpha < x < \beta \\ 0, & \text{otherwise.} \end{cases} \]

Exponential Random Variables

A continuous random variable whose p.d.f. is given, for some $\lambda > 0$, by

\[ f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases} \]

is said to be an exponential random variable with parameter $\lambda$. The c.d.f. of $X$ is

\[ F(x) = \int_0^x f(t)\, dt = \int_0^x \lambda e^{-\lambda t}\, dt = 1 - e^{-\lambda x}, \quad x \ge 0. \]
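A quick numerical sanity check of the exponential c.d.f. (an illustration, not part of the original notes; the choice $\lambda = 2$ is arbitrary): integrating the density from 0 to $x$ reproduces $1 - e^{-\lambda x}$.

```python
from math import exp

lam = 2.0  # arbitrary rate for the illustration

def exp_cdf(x):
    """Closed form: F(x) = 1 - e^{-lambda x} for x >= 0."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

def numeric_cdf(x, steps=100_000):
    """Integrate the density lambda * e^{-lambda t} from 0 to x (midpoint rule)."""
    h = x / steps
    return sum(lam * exp(-lam * (i + 0.5) * h) for i in range(steps)) * h

print(exp_cdf(1.5), numeric_cdf(1.5))  # both about 0.9502
```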
2.4 Expectation of a Random Variable

The Discrete Case

If $X$ is a discrete random variable having a probability mass function $p(x)$, then the expected value of $X$ is defined by

\[ E(X) = \sum_{\text{all } x} x\, p(x), \]

provided $\sum_{\text{all } x} |x|\, p(x) < \infty$.

Lemma. If $X$ is a non-negative integer-valued random variable, then

\[ E(X) = \sum_{k=0}^{\infty} P(X > k). \]

Example.
(a) (Expectation of a Binomial Random Variable) Let $X \sim B(n, p)$. Calculate $E(X)$.
(b) (Expectation of a Geometric Random Variable) Calculate the expectation of a geometric random variable having parameter $p$.
(c) (Expectation of a Poisson Random Variable) Calculate the expectation of a Poisson random variable having parameter $\lambda$.

The Continuous Case

The expected value of a continuous random variable is defined by

\[ E(X) = \int_{-\infty}^{\infty} x f(x)\, dx, \]

provided $\int_{-\infty}^{\infty} |x|\, f(x)\, dx < \infty$.

Lemma. If $X$ is a non-negative random variable, then

\[ E(X) = \int_0^{\infty} P(X > x)\, dx. \]

Example.
(a) (Expectation of a Uniform Random Variable) Let $X$ be uniform on $(\alpha, \beta)$. Calculate $E(X)$.
(b) (Expectation of an Exponential Random Variable) Calculate the expectation of an exponential random variable having parameter $\lambda$.
(c) (Expectation of a Normal Random Variable) Calculate the expectation of a normal random variable with parameters $\mu$ and $\sigma^2$.
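The first lemma can be verified numerically for a geometric random variable, where $P(X > k) = (1-p)^k$ and both expressions should equal $1/p$. A Python sketch (an illustration, not part of the original notes; $p = 0.3$ is an arbitrary choice):

```python
p = 0.3
N = 10_000  # truncation point; the tail beyond it is negligible here

# Direct definition: E(X) = sum over n >= 1 of n * (1-p)^(n-1) * p.
direct = sum(n * (1 - p)**(n - 1) * p for n in range(1, N))

# Tail-sum lemma: E(X) = sum over k >= 0 of P(X > k) = sum of (1-p)^k.
tail_sum = sum((1 - p)**k for k in range(N))

print(direct, tail_sum, 1 / p)  # all approximately 3.3333
```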
2.5 Expectation of a Function of a Random Variable

Now we are interested in calculating not the expected value of $X$ itself, but the expected value of some function of $X$, say $g(X)$.

Proposition 1.
(a) If $X$ is a discrete random variable with probability mass function $p(x)$, then for any real-valued function $g$,

\[ E[g(X)] = \sum_{\text{all } x} g(x)\, p(x). \]

(b) If $X$ is a continuous random variable with probability density function $f(x)$, then for any real-valued function $g$,

\[ E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\, dx. \]

Proposition 2. If $a$ and $b$ are constants, then

\[ E(aX + b) = a E(X) + b \]

and

\[ E(X + Y) = E(X) + E(Y). \]

Variance of a Random Variable

The expected value of a random variable $X$, $E(X)$, is also referred to as the mean or the first moment. The quantity $E(X^n)$, $n \ge 1$, is called the $n$th moment of $X$. The variance of $X$, denoted by $\mathrm{Var}(X)$, is defined by

\[ \mathrm{Var}(X) = E[X - E(X)]^2. \]

A useful formula to compute the variance is

\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2. \]
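Both variance expressions can be computed side by side for a fair die (an illustrative Python check, not part of the original notes); they agree, as the formula promises.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # a fair six-sided die

mean = sum(x * p for x, p in pmf.items())                 # E(X) = 7/2
var_def = sum((x - mean)**2 * p for x, p in pmf.items())  # E[X - E(X)]^2
second_moment = sum(x**2 * p for x, p in pmf.items())     # E(X^2) = 91/6
var_formula = second_moment - mean**2                     # E(X^2) - [E(X)]^2

print(var_def, var_formula)  # both 35/12
```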
2.6 Jointly Distributed Random Variables

Thus far, we have concerned ourselves with the probability distribution of a single random variable. However, we are often interested in probability statements concerning two or more random variables.

Joint Distribution Function

To deal with probabilities of two random variables $X$ and $Y$, we define the joint distribution function of $X$ and $Y$ by

\[ F_{X,Y}(a, b) = P(X \le a, Y \le b), \quad -\infty < a, b < \infty. \]

The distribution function of $X$ can be obtained from the joint c.d.f. as follows:

\[ F_X(a) = P(X \le a, Y < \infty) = F(a, \infty). \]

Similarly, the c.d.f. of $Y$ is given by

\[ F_Y(b) = P(X < \infty, Y \le b) = F(\infty, b). \]

Joint Probability Mass Function

Let $X$ and $Y$ both be discrete random variables; then the joint mass function of $X$ and $Y$ is given by

\[ p(x, y) = P(X = x, Y = y). \]

The probability mass function of $X$ may be obtained from $p(x, y)$ by

\[ p_X(x) = \sum_{\text{all } y} p(x, y), \]

and similarly, the mass function of $Y$ is

\[ p_Y(y) = \sum_{\text{all } x} p(x, y). \]
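Computing marginals is just summing the joint table along one coordinate. A minimal Python sketch (the joint pmf below is hypothetical, invented for illustration; it is not from the notes):

```python
# A hypothetical joint pmf on {0,1} x {0,1,2}; the six values sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginals: sum the joint pmf over the other variable.
p_X = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_Y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

print(p_X)  # {0: 0.4, 1: 0.6}
print(p_Y)  # {0: 0.25, 1: 0.45, 2: 0.3}
```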
Joint Probability Density Function

We say that $X$ and $Y$ are jointly continuous if there exists a function $f(x, y)$, defined for all real $x$ and $y$, having the property that for all sets $A$ and $B$ of real numbers

\[ P(X \in A, Y \in B) = \int_B \int_A f(x, y)\, dx\, dy. \]

The function $f(x, y)$ is called the joint probability density function of $X$ and $Y$. The p.d.f.'s of $X$ and $Y$ can be obtained from their joint p.d.f. by

\[ P(X \in A) = \int_A \left( \int_{-\infty}^{\infty} f(x, y)\, dy \right) dx \]

and

\[ P(Y \in B) = \int_B \left( \int_{-\infty}^{\infty} f(x, y)\, dx \right) dy. \]

The integrals

\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy \quad \text{and} \quad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx \]

are called the density functions of $X$ and $Y$, respectively.

Expectation of a Function of Two Random Variables

If $X$ and $Y$ are random variables and $g$ is a function of two variables, then

\[ E[g(X, Y)] = \sum_y \sum_x g(x, y)\, p(x, y) \quad \text{in the discrete case,} \]

\[ E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y)\, dx\, dy \quad \text{in the continuous case.} \]

For instance, if $g(X, Y) = X + Y$, then, in the continuous case,

\[
E(X + Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x + y)\, f(x, y)\, dx\, dy
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x\, f(x, y)\, dx\, dy + \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y\, f(x, y)\, dx\, dy
= E(X) + E(Y).
\]

Proposition. For any constants $a$ and $b$,

\[ E(aX + bY) = a E(X) + b E(Y). \]

Example. Let us compute the expectation of a binomial random variable with parameters $n$ and $p$, $X \sim B(n, p)$.

Solution. Write $X = X_1 + X_2 + \cdots + X_n$, where

\[ X_i = \begin{cases} 1, & \text{if the $i$th trial is a success} \\ 0, & \text{if the $i$th trial is a failure.} \end{cases} \]

Hence, $E(X_i) = 0 \cdot (1 - p) + 1 \cdot p = p$, and thus

\[ E(X) = E(X_1 + X_2 + \cdots + X_n) = np. \]

Example. At a party, $N$ men throw their hats into the center of a room. The hats are mixed up and each man randomly selects one. Find the expected number of men who select their own hat.

Solution. Let $X$ denote the number of men that select their own hats. Define $X_i$ by

\[ X_i = \begin{cases} 1, & \text{if the $i$th man selects his own hat} \\ 0, & \text{otherwise.} \end{cases} \]
Hence, $X = X_1 + X_2 + \cdots + X_N$. Since the $i$th man is equally likely to end up with any of the $N$ hats, $E(X_i) = 1/N$. Thus, $E(X) = N \cdot (1/N) = 1$. That is, no matter how many people are at the party, on average exactly one of them will select his own hat.

Example. Suppose there are 4 different types of coupons, and suppose that each time one obtains a coupon, it is equally likely to be any one of the 4 types. Compute the expected number of different types that are contained in a set of 10 coupons.

Solution. Define

\[ X_i = \begin{cases} 1, & \text{if at least one type-$i$ coupon is in the set of 10} \\ 0, & \text{otherwise.} \end{cases} \]

Hence $X = X_1 + X_2 + X_3 + X_4$. Now,

\[
E(X_i) = P(X_i = 1) = P(\text{at least one type-$i$ coupon is in the set of 10})
= 1 - P(\text{no type-$i$ coupons are in the set of 10})
= 1 - \left(\frac{3}{4}\right)^{10}.
\]

Hence,

\[ E(X) = E(X_1) + E(X_2) + E(X_3) + E(X_4) = 4\left[1 - \left(\frac{3}{4}\right)^{10}\right] \approx 3.77. \]
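Both indicator-variable answers are easy to confirm by simulation. A short Monte Carlo sketch (an illustration, not part of the original notes; $N = 20$ men is an arbitrary choice):

```python
import random

def hat_matches(N):
    """One party: the number of men who get their own hat back."""
    hats = list(range(N))
    random.shuffle(hats)
    return sum(1 for man, hat in enumerate(hats) if man == hat)

def coupon_types(n=10, types=4):
    """Number of distinct types in a set of n equally likely coupons."""
    return len({random.randrange(types) for _ in range(n)})

trials = 100_000
print(sum(hat_matches(20) for _ in range(trials)) / trials)  # close to 1
print(sum(coupon_types() for _ in range(trials)) / trials)   # close to 3.77
```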
2.7 Independent Random Variables

The random variables $X$ and $Y$ are said to be independent if, for all $a, b$,

\[ P(X \le a, Y \le b) = P(X \le a)\, P(Y \le b). \]

When $X$ and $Y$ are discrete, the condition of independence reduces to

\[ p(x, y) = p_X(x)\, p_Y(y), \]

and if $X$ and $Y$ are jointly continuous, independence reduces to

\[ f(x, y) = f_X(x)\, f_Y(y). \]

Proposition. If $X$ and $Y$ are independent, then for any functions $h$ and $g$, $g(X)$ and $h(Y)$ are independent and

\[ E[g(X)\, h(Y)] = E[g(X)]\, E[h(Y)]. \]

Remark. In general, $E[g(X)\, h(Y)] = E[g(X)]\, E[h(Y)]$ does NOT imply independence.
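A standard counterexample illustrates the remark for the particular choice $g(x) = x$, $h(y) = y$: take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. Then $E(XY) = E(X)E(Y)$ even though $Y$ is completely determined by $X$. A Python check (an illustration, not part of the original notes):

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, and Y = X^2, so Y is a function of X (dependent).
p = Fraction(1, 3)
support = [-1, 0, 1]

E_X  = sum(x * p for x in support)     # 0
E_Y  = sum(x**2 * p for x in support)  # 2/3
E_XY = sum(x**3 * p for x in support)  # E(X * Y) = E(X^3) = 0 by symmetry

print(E_XY == E_X * E_Y)  # True: the product rule holds here...
# ...yet X and Y are NOT independent:
# P(X = 1, Y = 0) = 0, while P(X = 1) * P(Y = 0) = (1/3)(1/3) = 1/9.
```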
2.8 Covariance

The covariance of any two random variables $X$ and $Y$, denoted by $\mathrm{Cov}(X, Y)$, is defined by

\[ \mathrm{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))]. \]

The following is a useful formula to compute the covariance:

\[ \mathrm{Cov}(X, Y) = E(XY) - E(X)\, E(Y). \]

Proposition. If $X$ and $Y$ are independent, then $\mathrm{Cov}(X, Y) = 0$.

Properties of Covariance

For any random variables $X$, $Y$, $Z$ and a constant $c$,

\[ \mathrm{Cov}(X, X) = \mathrm{Var}(X), \]
\[ \mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X), \]
\[ \mathrm{Cov}(cX, Y) = c\, \mathrm{Cov}(X, Y), \]
\[ \mathrm{Cov}(X, Y + Z) = \mathrm{Cov}(X, Y) + \mathrm{Cov}(X, Z). \]

Sums of Random Variables

Let $X_1, X_2, \ldots, X_n$ be a sequence of random variables. Then

\[ \mathrm{Var}\!\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \mathrm{Var}(X_i) + 2 \sum_{i=2}^n \sum_{j=1}^{i-1} \mathrm{Cov}(X_i, X_j). \]

If $X_1, X_2, \ldots, X_n$ are independent, then

\[ \mathrm{Var}\!\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \mathrm{Var}(X_i). \]
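For $n = 2$ the sum formula reads $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$, which can be verified exactly on a small joint distribution. A Python sketch (the joint pmf below is hypothetical, invented for illustration; it is not from the notes):

```python
from fractions import Fraction

# A hypothetical joint pmf on {0,1} x {0,1} with dependent coordinates.
joint = {(0, 0): Fraction(2, 8), (0, 1): Fraction(1, 8),
         (1, 0): Fraction(1, 8), (1, 1): Fraction(4, 8)}

def E(g):
    """Expectation of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

EX, EY  = E(lambda x, y: x), E(lambda x, y: y)
var_X   = E(lambda x, y: x**2) - EX**2
var_Y   = E(lambda x, y: y**2) - EY**2
cov_XY  = E(lambda x, y: x * y) - EX * EY
var_sum = E(lambda x, y: (x + y)**2) - (EX + EY)**2

print(cov_XY)                                  # 7/64, nonzero: X, Y dependent
print(var_sum == var_X + var_Y + 2 * cov_XY)   # True
```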