Lecture 5: Mathematical Expectation




Lecture 5: Mathematical Expectation
Assist. Prof. Dr. Emel YAVUZ DUMAN
MCB1007 Introduction to Probability and Statistics, İstanbul Kültür University

Outline
1 Introduction
2 The Expected Value of a Random Variable
3 Moments
4 Chebyshev's Theorem
5 Moment Generating Functions
6 Product Moments
7 Conditional Expectation


Introduction

A very important concept in probability and statistics is that of the mathematical expectation, expected value, or briefly the expectation, of a random variable. Originally, the concept of a mathematical expectation arose in connection with games of chance, and in its simplest form it is the product of the amount a player stands to win and the probability that he or she will win. For instance, if we hold one of 10,000 tickets in a raffle for which the grand prize is a trip worth $4,800, our mathematical expectation is 4,800 · (1/10,000) = $0.48. This amount has to be interpreted in the sense of an average: altogether the 10,000 tickets pay $4,800, or on the average $4,800/10,000 = $0.48 per ticket.


The Expected Value of a Random Variable

Definition 1. If X is a discrete random variable and f(x) is the value of its probability distribution at x, the expected value of X is
$$E(X) = \sum_x x\,f(x).$$
Correspondingly, if X is a continuous random variable and f(x) is the value of its probability density at x, the expected value of X is
$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx.$$
In this definition it is assumed, of course, that the sum or the integral exists; otherwise, the mathematical expectation is undefined.

Example 2. Imagine a game in which, on any play, a player has a 20% chance of winning $3 and an 80% chance of losing $1. The probability distribution of the random variable X, the amount won or lost on a single play, is

x      $3     -$1
f(x)   0.2    0.8

and so the average amount won (actually lost, since it is negative) in the long run is
$$E(X) = (\$3)(0.2) + (-\$1)(0.8) = -\$0.20.$$
What does "in the long run" mean? If you play, are you guaranteed to lose no more than 20 cents?

Solution. If you play and lose, you are guaranteed to lose $1. An expected loss of 20 cents means that if you played the game over and over and over and over again, the average of your $3 winnings and your $1 losses would be a 20 cent loss. "In the long run" means that you can't draw conclusions about one or two plays, but rather thousands and thousands of plays.
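
The long-run interpretation can also be seen by simulation. The following is a minimal sketch that plays the game many times and compares the empirical average payoff with the theoretical value of -$0.20; the number of plays is an arbitrary choice.

import random

def play_once():
    # Win $3 with probability 0.2, lose $1 with probability 0.8.
    return 3.0 if random.random() < 0.2 else -1.0

n_plays = 100_000
total = sum(play_once() for _ in range(n_plays))
print("average payoff over", n_plays, "plays:", total / n_plays)
print("theoretical expected value:", 3.0 * 0.2 + (-1.0) * 0.8)  # -0.20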

Example 3. A lot of 12 television sets includes 2 with white cords. If three of the sets are chosen at random for shipment to a hotel, how many sets with white cords can the shipper expect to send to the hotel?

Solution. Since x of the two sets with white cords and 3 - x of the 10 other sets can be chosen in $\binom{2}{x}\binom{10}{3-x}$ ways, three of the 12 sets can be chosen in $\binom{12}{3}$ ways, and these $\binom{12}{3}$ possibilities are presumably equiprobable, we find that the probability distribution of X, the number of sets with white cords shipped to the hotel, is given by
$$f(x) = \frac{\binom{2}{x}\binom{10}{3-x}}{\binom{12}{3}} \quad \text{for } x = 0, 1, 2$$

or, in tabular form,

x      0      1      2
f(x)   6/11   9/22   1/22

Now,
$$E(X) = \sum_{x=0}^{2} x\,f(x) = 0 \cdot \frac{6}{11} + 1 \cdot \frac{9}{22} + 2 \cdot \frac{1}{22} = \frac{1}{2}$$
and since half a set cannot possibly be shipped, it should be clear that the term "expect" is not used here in its colloquial sense. Indeed, it should be interpreted as an average pertaining to repeated shipments made under the given conditions.
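
As a quick numerical check, the hypergeometric probabilities and the resulting expectation can be computed directly; this sketch uses math.comb for the binomial coefficients.

from math import comb

# f(x) = C(2, x) * C(10, 3 - x) / C(12, 3) for x = 0, 1, 2
probs = {x: comb(2, x) * comb(10, 3 - x) / comb(12, 3) for x in range(3)}
print(probs)                                 # {0: 6/11, 1: 9/22, 2: 1/22} as decimals
print(sum(x * p for x, p in probs.items()))  # 0.5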

Example 4. A saleswoman has been offered a new job with a fixed salary of $290. Her records from her present job show that her weekly commissions have the following probabilities:

commission    $0     $100   $200   $300   $400
probability   0.05   0.15   0.25   0.45   0.10

Should she change jobs?

Solution. The expected value (expected income) from her present job is
$$(0.05)(\$0) + (0.15)(\$100) + (0.25)(\$200) + (0.45)(\$300) + (0.10)(\$400) = \$240.$$
If she is making a decision based only on the amount of money earned, then clearly she ought to change jobs.

Example 5. Dan, Eric, and Nick decide to play a fun game. Each player puts $2 on the table and secretly writes down either heads or tails. Then one of them tosses a fair coin. The $6 on the table is divided evenly among the players who correctly predicted the outcome of the coin toss. If everyone guessed incorrectly, then everyone takes their money back. After many repetitions of this game, Dan has lost a lot of money - more than can be explained by bad luck. What's going on?

Solution. A tree diagram for this problem is worked out below, shown here in tabular form, under the assumptions that everyone guesses correctly with probability 1/2 and everyone is correct independently.

Dan right?   Eric right?   Nick right?   Dan's payoff   Probability
Y            Y             Y             $0             1/8
Y            Y             N             $1             1/8
Y            N             Y             $1             1/8
Y            N             N             $4             1/8
N            Y             Y             -$2            1/8
N            Y             N             -$2            1/8
N            N             Y             -$2            1/8
N            N             N             $0             1/8

(Each of Dan, Eric, and Nick is right with probability 1/2, independently, so every branch has probability 1/8.)

In the payoff column, we are accounting for the fact that Dan has to put in $2 just to play. So, for example, if he guesses correctly and Eric and Nick are wrong, then he takes all $6 on the table, but his net profit is only $4. Working from this table, Dan's expected payoff is:
$$E(\text{payoff}) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{1}{8} + 4 \cdot \tfrac{1}{8} + (-2) \cdot \tfrac{1}{8} + (-2) \cdot \tfrac{1}{8} + (-2) \cdot \tfrac{1}{8} + 0 \cdot \tfrac{1}{8} = 0.$$
So the game is perfectly fair! Over time, he should neither win nor lose money. A game is called fair if the expected value of the game is zero.

Let's say that Nick and Eric are collaborating; in particular, they always make opposite guesses. So our assumption that everyone is correct independently is wrong; actually the events that Nick is correct and that Eric is correct are mutually exclusive! As a result, Dan can never win all the money on the table. When he guesses correctly, he always has to split his winnings with someone else. This lowers his overall expectation, as the corrected table below shows:

Dan right?   Eric right?   Nick right?   Dan's payoff   Probability
Y            Y             Y             $0             0
Y            Y             N             $1             1/4
Y            N             Y             $1             1/4
Y            N             N             $4             0
N            Y             Y             -$2            0
N            Y             N             -$2            1/4
N            N             Y             -$2            1/4
N            N             N             $0             0

(Since Eric and Nick always guess opposite sides, exactly one of them is right on every toss; the branches in which they are both right or both wrong now have probability 0.)

From this revised table, we can work out Dan's actual expected payoff:
$$E(\text{payoff}) = 0 \cdot 0 + 1 \cdot \tfrac{1}{4} + 1 \cdot \tfrac{1}{4} + 4 \cdot 0 + (-2) \cdot 0 + (-2) \cdot \tfrac{1}{4} + (-2) \cdot \tfrac{1}{4} + 0 \cdot 0 = -\tfrac{1}{2}.$$
So he loses an average of a half-dollar per game!
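
A short simulation makes the effect of the collusion visible. This illustrative sketch plays the game with Eric and Nick always guessing opposite sides and estimates Dan's average net payoff, which should come out close to -$0.50.

import random

def dan_payoff():
    coin = random.choice("HT")
    dan = random.choice("HT")
    eric = random.choice("HT")
    nick = "H" if eric == "T" else "T"        # colluders always guess opposite sides
    winners = [g for g in (dan, eric, nick) if g == coin]
    if not winners:                            # nobody right: everyone takes their money back
        return 0.0
    share = 6.0 / len(winners)                 # $6 pot split among the correct guessers
    return (share if dan == coin else 0.0) - 2.0   # minus Dan's $2 stake

n = 200_000
print(sum(dan_payoff() for _ in range(n)) / n)     # roughly -0.5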

Example 6. Consider the following game in which two players each spin a fair wheel in turn. Player A's wheel carries the numbers 2, 4, and 12; player B's wheel carries the numbers 5 and 6. The player that spins the higher number wins, and the loser has to pay the winner the difference of the two numbers (in dollars). Who would win more often? What is that player's expected winnings?

Solution. Assuming 3 equally likely outcomes for spinning A's wheel and 2 equally likely outcomes for B's wheel, there are then 3 · 2 = 6 possible, equally likely outcomes for this game:

Sample space    A = 2     A = 4     A = 12
B = 5           (2, 5)    (4, 5)    (12, 5)
B = 6           (2, 6)    (4, 6)    (12, 6)

We see here that player A wins in 2/6 games and player B wins in 4/6 games, so player B wins the most often. But let's compute the expected winnings for player B:

Winnings X for player B    A = 2    A = 4    A = 12
B = 5                      $3       $1       -$7
B = 6                      $4       $2       -$6

Each of these possible amounts won has the same probability (1/6). The expected winnings for player B is then simply:
$$E(X) = \$3 \cdot \tfrac{1}{6} + \$4 \cdot \tfrac{1}{6} + \$1 \cdot \tfrac{1}{6} + \$2 \cdot \tfrac{1}{6} + (-\$7) \cdot \tfrac{1}{6} + (-\$6) \cdot \tfrac{1}{6} = -\$0.50.$$
Hence, we see that even though player B is more likely to win on a single game, his expected winnings are actually negative. If he keeps on playing the game, he will eventually start losing money!
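
The six equally likely outcomes can also be enumerated in a few lines; this sketch reproduces player B's winning frequency and expected winnings under the stated assumptions about the wheels.

from itertools import product
from fractions import Fraction

wheel_a = [2, 4, 12]
wheel_b = [5, 6]

outcomes = list(product(wheel_a, wheel_b))                 # 6 equally likely pairs (a, b)
b_wins = sum(b > a for a, b in outcomes)
expected_b = Fraction(sum(b - a for a, b in outcomes), len(outcomes))
print("B wins in", b_wins, "of", len(outcomes), "games")   # 4 of 6
print("expected winnings for B:", expected_b)              # -1/2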

Example 7. Find the expected value of the random variable X whose probability density is given by
$$f(x) = \begin{cases} x & \text{for } 0 < x < 1, \\ 2 - x & \text{for } 1 \le x < 2, \\ 0 & \text{elsewhere.} \end{cases}$$
Solution. Since f(x) = 0 outside the interval (0, 2),
$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^1 x \cdot x\,dx + \int_1^2 x(2 - x)\,dx = \left[\frac{x^3}{3}\right]_0^1 + \left[x^2 - \frac{x^3}{3}\right]_1^2 = \frac{1}{3} + \left(4 - \frac{8}{3}\right) - \left(1 - \frac{1}{3}\right) = 1.$$
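
For a numerical check of this integral, one can integrate x·f(x) over the two pieces of the density; this sketch assumes scipy is available and should return a value very close to 1.

from scipy.integrate import quad

def f(x):
    # piecewise density from Example 7
    if 0 < x < 1:
        return x
    if 1 <= x < 2:
        return 2 - x
    return 0.0

value, _ = quad(lambda x: x * f(x), 0, 2, points=[1])
print(value)   # approximately 1.0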

Example 8. Certain coded measurements of the pitch diameter of threads of a fitting have the probability density
$$f(x) = \begin{cases} \dfrac{4}{\pi(1+x^2)} & \text{for } 0 < x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$
Find the expected value of this random variable.

Solution.
$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^1 x \cdot \frac{4}{\pi(1+x^2)}\,dx = \frac{2}{\pi}\int_0^1 \frac{2x}{1+x^2}\,dx = \frac{2}{\pi}\Big[\ln(x^2+1)\Big]_0^1 = \frac{2}{\pi}(\ln 2 - \ln 1) = \frac{\ln 4}{\pi} \approx 0.4412712003.$$

Theorem 9. If X is a discrete random variable and f(x) is the value of its probability distribution at x, the expected value of g(X) is given by
$$E[g(X)] = \sum_x g(x)\,f(x).$$
Correspondingly, if X is a continuous random variable and f(x) is the value of its probability density at x, the expected value of g(X) is given by
$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx.$$

Example 10. Roulette, introduced in Paris in 1765, is a very popular casino game. Suppose that Hannah's House of Gambling has a roulette wheel containing 38 numbers: zero (0), double zero (00), and the numbers 1, 2, 3, ..., 36. Let X denote the number on which the ball lands and u(x) denote the amount of money paid to the gambler, such that:
$$u(x) = \begin{cases} \$5 & \text{if } x = 0, \\ \$10 & \text{if } x = 00, \\ \$1 & \text{if } x \text{ is odd,} \\ \$2 & \text{if } x \text{ is even.} \end{cases}$$
How much would Hannah's House of Gambling have to charge each gambler to play in order to ensure that it made some money?

Solution. Assuming that the ball has an equally likely chance of landing on each number, the probability distribution of X is f(x) = 1/38 for each x = 0, 00, 1, 2, 3, ..., 36. Therefore, the expected value of u(X) is
$$E[u(X)] = \sum_x u(x)\,f(x) = \$5\left(\tfrac{1}{38}\right) + \$10\left(\tfrac{1}{38}\right) + \left[\$1\left(\tfrac{1}{38}\right)\right]\cdot 18 + \left[\$2\left(\tfrac{1}{38}\right)\right]\cdot 18 \approx \$1.82.$$

Note that the 18 that multiplies the $1 and the $2 is there because there are 18 odd and 18 even numbers on the wheel. Our calculation tells us that, in the long run, Hannah's House of Gambling would expect to pay out about $1.82 for each spin of the roulette wheel. Therefore, in order to ensure that the House made money, the House would have to charge at least $1.82 per play.
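
The break-even charge can be recomputed in a few lines; this sketch simply encodes the payout rule u(x) and averages it over the 38 equally likely outcomes.

from fractions import Fraction

def payout(x):
    if x == "0":
        return 5
    if x == "00":
        return 10
    return 1 if int(x) % 2 == 1 else 2        # $1 on odd numbers, $2 on even numbers

numbers = ["0", "00"] + [str(i) for i in range(1, 37)]
expected = sum(Fraction(payout(x), 38) for x in numbers)
print(expected, "=", float(expected))         # 69/38, about $1.82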

Example 11. If X is the number of points rolled with a balanced die, find the expected value of g(X) = 2X² + 1.

Solution. Since each possible outcome has probability 1/6, we get
$$E[g(X)] = E[2X^2 + 1] = \sum_{x=1}^{6} g(x)\,f(x) = \sum_{x=1}^{6} (2x^2 + 1)\cdot\tfrac{1}{6} = (2\cdot1^2+1)\tfrac{1}{6} + (2\cdot2^2+1)\tfrac{1}{6} + \cdots + (2\cdot6^2+1)\tfrac{1}{6} = \frac{94}{3}.$$

Example 12. If X has the probability density
$$f(x) = \begin{cases} e^{-x} & \text{for } x > 0, \\ 0 & \text{elsewhere,} \end{cases}$$
find the expected value of g(X) = e^{3X/4}.

Solution.
$$E[g(X)] = E[e^{3X/4}] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx = \lim_{c\to\infty}\int_0^c e^{3x/4}e^{-x}\,dx = \lim_{c\to\infty}\int_0^c e^{-x/4}\,dx = -4\lim_{c\to\infty}\Big[e^{-x/4}\Big]_0^c = -4\lim_{c\to\infty}\left(e^{-c/4} - e^0\right) = 4.$$

Some Theorems on Expectation

The determination of mathematical expectation can often be simplified by using the following theorems. Since the steps are essentially the same, some proofs will be given either for the discrete case or the continuous case; others are left for the reader as exercises.

Theorem 13. If a and b are any constants, then E(aX + b) = aE(X) + b.

Proof. Using Theorem 9 with g(X) = aX + b, we get
$$E(aX + b) = \int_{-\infty}^{\infty}(ax + b)\,f(x)\,dx = a\underbrace{\int_{-\infty}^{\infty} x\,f(x)\,dx}_{E(X)} + b\underbrace{\int_{-\infty}^{\infty} f(x)\,dx}_{1} = aE(X) + b.$$

If we set b = 0 or a = 0, it follows from Theorem 13 that:

Corollary 14. If a is a constant, then E(aX) = aE(X).

Corollary 15. If b is a constant, then E(b) = b.

Theorem 16. If $c_1, c_2, \ldots, c_n$ are constants, then
$$E[c_1 g_1(X) + c_2 g_2(X) + \cdots + c_n g_n(X)] = c_1 E[g_1(X)] + c_2 E[g_2(X)] + \cdots + c_n E[g_n(X)].$$

The concept of a mathematical expectation can easily be extended to situations involving more than one random variable. For instance, if Z is the random variable whose values are related to those of two random variables X and Y by means of the equation z = g(x, y), it can be shown that:

Theorem 17. If X and Y are discrete random variables and f(x, y) is the value of their joint probability distribution at (x, y), the expected value of g(X, Y) is
$$E[g(X, Y)] = \sum_x \sum_y g(x, y)\,f(x, y).$$

Theorem 18. If X and Y are any random variables, then E(X + Y) = E(X) + E(Y).

Proof. Let f(x, y) be the joint probability distribution of X and Y, assumed discrete. Then
$$E(X + Y) = \sum_x \sum_y (x + y)\,f(x, y) = \sum_x \sum_y x\,f(x, y) + \sum_x \sum_y y\,f(x, y) = E(X) + E(Y).$$
Note that the theorem is true whether or not X and Y are independent.

Theorem 19. If X and Y are independent random variables, then E(XY) = E(X)E(Y).

Proof. Let f(x, y) be the joint probability distribution of X and Y, assumed discrete. If the variables X and Y are independent, we have f(x, y) = g(x)h(y). Therefore,
$$E(XY) = \sum_x \sum_y (xy)\,f(x, y) = \sum_x \sum_y (xy)\,g(x)\,h(y) = \sum_x \left[x\,g(x)\sum_y y\,h(y)\right] = \sum_x x\,g(x)\,E(Y) = E(X)\,E(Y).$$

Note that the validity of this theorem hinges on whether f(x, y) can be expressed as a function of x multiplied by a function of y, for all x and y, i.e., on whether X and Y are independent. For dependent variables it is not true in general.

Example 20. Suppose the probability distribution of the discrete random variable X is

x      0     1     2     3
f(x)   0.2   0.1   0.4   0.3

Find E(X) and E(X²). Knowing these, what are E(4X²) and E(3X + 2X²)?

Solution.
$$E(X) = \sum_{x=0}^{3} x\,f(x) = 0\cdot0.2 + 1\cdot0.1 + 2\cdot0.4 + 3\cdot0.3 = 1.8,$$
$$E(X^2) = \sum_{x=0}^{3} x^2 f(x) = 0^2\cdot0.2 + 1^2\cdot0.1 + 2^2\cdot0.4 + 3^2\cdot0.3 = 4.4,$$
$$E(4X^2) = 4E(X^2) = 4\cdot4.4 = 17.6,$$
$$E(3X + 2X^2) = 3E(X) + 2E(X^2) = 3\cdot1.8 + 2\cdot4.4 = 14.2.$$

Example 21. If the probability density of X is given by
$$f(x) = \begin{cases} 2(1 - x) & \text{for } 0 < x < 1, \\ 0 & \text{elsewhere,} \end{cases}$$
(a) show that
$$E(X^r) = \frac{2}{(r+1)(r+2)},$$
(b) and use this result to evaluate E[(2X + 1)²].

Solution. (a) Since f(x) = 0 outside the interval (0, 1),
$$E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx = \int_0^1 x^r \cdot 2(1 - x)\,dx = 2\int_0^1 (x^r - x^{r+1})\,dx = 2\left[\frac{x^{r+1}}{r+1} - \frac{x^{r+2}}{r+2}\right]_0^1 = 2\left(\frac{1}{r+1} - \frac{1}{r+2}\right) = \frac{2}{(r+1)(r+2)}.$$

(b) Since
$$E[(2X + 1)^2] = E[4X^2 + 4X + 1] = 4E(X^2) + 4E(X) + 1$$
and substitution of r = 1 and r = 2 into the preceding formula yields
$$E(X) = \frac{2}{(2)(3)} = \frac{1}{3} \quad \text{and} \quad E(X^2) = \frac{2}{(3)(4)} = \frac{1}{6},$$
we get
$$E[(2X + 1)^2] = 4\cdot\frac{1}{6} + 4\cdot\frac{1}{3} + 1 = 3.$$
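
The general moment formula of part (a) and the value in part (b) can be verified symbolically; this sketch assumes sympy is available and checks the formula for a few values of r.

import sympy as sp

x = sp.symbols("x")
f = 2 * (1 - x)                                      # density on (0, 1)

for r in range(1, 5):                                # check E(X^r) = 2 / ((r+1)(r+2)) for a few r
    moment = sp.integrate(x**r * f, (x, 0, 1))
    assert moment == sp.Rational(2, (r + 1) * (r + 2)), (r, moment)

print(sp.integrate((2 * x + 1)**2 * f, (x, 0, 1)))   # 3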


Moments

Definition 22. The rth moment about the origin of a random variable X, denoted by $\mu_r'$, is the expected value of $X^r$; symbolically,
$$\mu_r' = E(X^r) = \sum_x x^r f(x) \quad \text{for } r = 0, 1, 2, \ldots$$
when X is discrete, and
$$\mu_r' = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx$$
when X is continuous.

1. When r = 0, we have $\mu_0' = E(X^0) = E(1) = 1$.
2. When r = 1, we have $\mu_1' = E(X^1) = E(X)$, which is just the expected value of the random variable X, and we give it a special symbol and a special name.

Definition 23. $\mu_1'$ is called the mean of the distribution of X, or simply the mean of X, and it is denoted by $\mu$.

The special moments we shall define next are of importance in statistics because they serve to describe the shape of the distribution of a random variable, that is, the shape of the graph of its probability distribution or probability density.

Definition 24. The rth moment about the mean of a random variable X, denoted by $\mu_r$, is the expected value of $(X - \mu)^r$; symbolically,
$$\mu_r = E[(X - \mu)^r] = \sum_x (x - \mu)^r f(x) \quad \text{for } r = 0, 1, 2, \ldots$$
when X is discrete, and
$$\mu_r = E[(X - \mu)^r] = \int_{-\infty}^{\infty} (x - \mu)^r f(x)\,dx$$
when X is continuous.

1. When r = 0, we have $\mu_0 = E[(X - \mu)^0] = 1$.
2. When r = 1, we have $\mu_1 = E[(X - \mu)^1] = E(X) - E(\mu) = \mu - \mu = 0$ for any random variable for which $\mu$ exists.

The second moment about the mean is of special importance in statistics because it is indicative of the spread or dispersion of the distribution of a random variable; thus, it is given a special symbol and a special name.

Definition 25. $\mu_2$ is called the variance of the distribution of X, or simply the variance of X, and it is denoted by $\sigma^2$, var(X), or V(X); $\sigma$, the positive square root of the variance, is called the standard deviation.

[Figure: histograms of the probability distributions of four random variables, each with the same mean $\mu = 5$ but with variances 5.26, 3.18, 1.66, and 0.88, showing how the variance reflects the spread or dispersion of the distribution of a random variable.]

A large standard deviation indicates that the data points are far from the mean and a small standard deviation indicates that they are clustered closely around the mean. For example, each of the three populations {0, 0, 14, 14}, {0, 6, 8, 14} and {6, 6, 8, 8} has a mean of 7. Their standard deviations are 7, 5, and 1, respectively. The third population has a much smaller standard deviation than the other two because its values are all close to 7. It will have the same units as the data points themselves. If, for instance, the data set {0, 6, 8, 14} represents the ages of a population of four siblings in years, the standard deviation is 5 years. As another example, the population {1000, 1006, 1008, 1014} may represent the distances traveled by four athletes, measured in meters. It has a mean of 1007 meters, and a standard deviation of 5 meters.
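
These three small populations can be checked numerically; the sketch below uses numpy's population standard deviation (ddof = 0) and reproduces the values 7, 5, and 1 quoted above.

import numpy as np

for data in ([0, 0, 14, 14], [0, 6, 8, 14], [6, 6, 8, 8]):
    arr = np.array(data, dtype=float)
    # np.std with the default ddof=0 is the population standard deviation
    print(data, "mean:", arr.mean(), "std:", arr.std())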

Theorem 26. $\sigma^2 = \mu_2' - \mu^2$.

Proof.
$$\sigma^2 = E[(X - \mu)^2] = E(X^2 - 2X\mu + \mu^2) = E(X^2) - 2\mu\underbrace{E(X)}_{\mu} + \mu^2 = E(X^2) - 2\mu^2 + \mu^2 = \underbrace{E(X^2)}_{\mu_2'} - \mu^2 = \mu_2' - \mu^2.$$

Example 27. Use Theorem 26 to calculate the variance of X, representing the number of points rolled with a balanced die.

Solution. First we compute
$$\mu = E(X) = 1\cdot\tfrac{1}{6} + 2\cdot\tfrac{1}{6} + 3\cdot\tfrac{1}{6} + 4\cdot\tfrac{1}{6} + 5\cdot\tfrac{1}{6} + 6\cdot\tfrac{1}{6} = \frac{7}{2}.$$
Now,
$$\mu_2' = E(X^2) = 1^2\cdot\tfrac{1}{6} + 2^2\cdot\tfrac{1}{6} + 3^2\cdot\tfrac{1}{6} + 4^2\cdot\tfrac{1}{6} + 5^2\cdot\tfrac{1}{6} + 6^2\cdot\tfrac{1}{6} = \frac{91}{6}$$
and it follows that
$$\sigma^2 = \mu_2' - \mu^2 = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{35}{12}.$$

Example 28. With reference to Example 8, find the standard deviation of the random variable X.

Solution. In Example 8 we showed that $\mu = E(X) = \frac{\ln 4}{\pi}$. Now,
$$\mu_2' = E(X^2) = \frac{4}{\pi}\int_0^1 \frac{x^2}{1+x^2}\,dx = \frac{4}{\pi}\int_0^1 \left(1 - \frac{1}{1+x^2}\right)dx = \frac{4}{\pi}\Big[x - \arctan x\Big]_0^1 = \frac{4}{\pi}\left(1 - \frac{\pi}{4}\right) = \frac{4}{\pi} - 1$$
and it follows that
$$\sigma^2 = \frac{4}{\pi} - 1 - \left(\frac{\ln 4}{\pi}\right)^2 \approx 0.078519272 \quad \text{and} \quad \sigma = \sqrt{0.078519272} \approx 0.280212905.$$

Example 29. Find $\mu$, $\mu_2'$, and $\sigma^2$ for the random variable X that has the probability distribution f(x) = 1/2 for x = -2 and x = 2.

Solution.
$$\mu = E(X) = \sum_x x\,f(x) = (-2)\cdot\tfrac{1}{2} + 2\cdot\tfrac{1}{2} = 0,$$
$$\mu_2' = E(X^2) = \sum_x x^2 f(x) = (-2)^2\cdot\tfrac{1}{2} + 2^2\cdot\tfrac{1}{2} = 4,$$
$$\sigma^2 = \mu_2 = \mu_2' - \mu^2 = 4 - 0^2 = 4,$$
or
$$\sigma^2 = \mu_2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 f(x) = (-2 - 0)^2\cdot\tfrac{1}{2} + (2 - 0)^2\cdot\tfrac{1}{2} = 4.$$

Example 30. Find $\mu$, $\mu_2'$, and $\sigma^2$ for the random variable X that has the probability density
$$f(x) = \begin{cases} x/2 & \text{for } 0 < x < 2, \\ 0 & \text{elsewhere.} \end{cases}$$
Solution.
$$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^2 x\cdot\frac{x}{2}\,dx = \left[\frac{x^3}{6}\right]_0^2 = \frac{4}{3},$$
$$\mu_2' = E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^2 x^2\cdot\frac{x}{2}\,dx = \left[\frac{x^4}{8}\right]_0^2 = 2,$$

so that
$$\sigma^2 = \mu_2 = \mu_2' - \mu^2 = 2 - \left(\frac{4}{3}\right)^2 = \frac{2}{9},$$
or, directly,
$$\sigma^2 = \mu_2 = E[(X - \mu)^2] = \int_0^2 \left(x - \frac{4}{3}\right)^2 \frac{x}{2}\,dx = \frac{1}{2}\int_0^2 \left(x^3 - \frac{8}{3}x^2 + \frac{16}{9}x\right)dx = \frac{1}{2}\left[\frac{x^4}{4} - \frac{8x^3}{9} + \frac{8x^2}{9}\right]_0^2 = \frac{2}{9}.$$

Theorem 31. If X has the variance $\sigma^2$, then $\operatorname{var}(aX + b) = a^2\sigma^2$.

Proof. Since
$$E[aX + b] = aE(X) + b \quad \text{and} \quad E[(aX + b)^2] = a^2E(X^2) + 2abE(X) + b^2,$$
we have
$$\operatorname{var}(aX + b) = a^2E(X^2) + 2abE(X) + b^2 - [aE(X) + b]^2 = a^2E(X^2) + 2abE(X) + b^2 - a^2[E(X)]^2 - 2abE(X) - b^2 = a^2\left(E(X^2) - [E(X)]^2\right) = a^2\sigma^2.$$

$\operatorname{var}(aX + b) = a^2\sigma^2$:
1. For a = 1, we find that the addition of a constant to the values of a random variable, resulting in a shift of all the values of X to the left or to the right, in no way affects the spread of its distribution.
2. For b = 0, we find that if the values of a random variable are multiplied by a constant, the variance is multiplied by the square of that constant, resulting in a corresponding change in the spread of the distribution.

Theorem 32. If X and Y are independent random variables, then
$$\operatorname{var}(X + Y) = \operatorname{var}(X) + \operatorname{var}(Y), \qquad \operatorname{var}(X - Y) = \operatorname{var}(X) + \operatorname{var}(Y).$$
Proof. The proof is left as an exercise.

Example 33. Suppose that X and Y are independent, with var(X) = 6, E(X²Y²) = 150, E(Y) = 2, and E(X) = 2. Compute var(Y).

Solution. From var(X) = 6 = E(X²) - (E(X))² = E(X²) - 2², we get E(X²) = 10. Since X and Y are independent, so are X² and Y², hence E(X²Y²) = E(X²)E(Y²). Therefore,
$$E(X^2Y^2) = 150 = E(X^2)E(Y^2) = 10\,E(Y^2) \implies E(Y^2) = 15.$$
Also, since var(Y) = E(Y²) - (E(Y))², we obtain var(Y) = 15 - 2² = 11.


Chebyshev's Theorem

Theorem 34 (Chebyshev's Theorem). Suppose that X is a random variable (discrete or continuous) having mean $\mu$ and variance $\sigma^2$, which are finite. Then, if $\varepsilon$ is any positive number,
$$P(|X - \mu| \ge \varepsilon) \le \frac{\sigma^2}{\varepsilon^2}$$
or, with $\varepsilon = k\sigma$,
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}.$$

Example 35. A random variable X has density function given by
$$f(x) = \begin{cases} 2e^{-2x} & \text{for } x \ge 0, \\ 0 & \text{for } x < 0. \end{cases}$$
(a) Find $P(|X - \mu| \ge 1)$.
(b) Use Chebyshev's theorem to obtain an upper bound on $P(|X - \mu| \ge 1)$ and compare it with the result in (a).

Solution. (a) First, we need to find $\mu$. Integrating by parts with u = x and dv = 2e^{-2x} dx (so du = dx and v = -e^{-2x}),
$$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^{\infty} 2x e^{-2x}\,dx = \lim_{c\to\infty}\left(\Big[-xe^{-2x}\Big]_0^c + \int_0^c e^{-2x}\,dx\right) = \lim_{c\to\infty}\left[-e^{-2x}\left(x + \tfrac{1}{2}\right)\right]_0^c = 0 - \left(-\tfrac{1}{2}\right) = \tfrac{1}{2} = 0.5.$$
Since $P(|X - \mu| \ge 1) = 1 - P(|X - \mu| < 1)$ and
$$|X - 0.5| < 1 \iff -1 < X - 0.5 < 1 \iff -0.5 < X < 1.5,$$
then

$$P(|X - \mu| \ge 1) = 1 - P(-0.5 < X < 1.5) = 1 - \int_{-0.5}^{1.5} f(x)\,dx = 1 - \int_0^{1.5} 2e^{-2x}\,dx = 1 + \Big[e^{-2x}\Big]_0^{1.5} = 1 + (e^{-3} - e^0) = e^{-3} \approx 0.049787068.$$
(b) If we take $\varepsilon = 1$ in Chebyshev's theorem, we obtain
$$P(|X - \mu| \ge 1) \le \frac{\sigma^2}{1^2} = \sigma^2.$$
One can show that $\sigma^2 = 0.25$. So an upper bound for the given probability is 0.25, which is quite far from the true value.
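
The comparison between the exact tail probability and the Chebyshev bound can be reproduced in a few lines; this sketch uses the values μ = 0.5 and σ² = 0.25 derived above.

import math

mu = 0.5          # E(X) for f(x) = 2 e^{-2x}, x >= 0
sigma2 = 0.25     # Var(X); E(X^2) = 1/2, so Var(X) = 1/2 - 1/4

# Exact: P(|X - mu| >= 1) = P(X >= 1.5) = e^{-2 * 1.5} = e^{-3}
exact = math.exp(-3)
bound = sigma2 / 1**2      # Chebyshev bound with epsilon = 1

print("exact probability:", exact)   # about 0.0498
print("Chebyshev bound  :", bound)   # 0.25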

Example 36. If a random variable X is such that E[(X - 1)²] = 10 and E[(X - 2)²] = 6,
(a) find the mean of X;
(b) find the standard deviation of X;
(c) find an upper bound for $P\!\left(\left|X - \tfrac{7}{2}\right| \ge \sqrt{15}\right)$.

Solution. (a) Since
$$E[(X - 1)^2] = E[X^2 - 2X + 1] = E[X^2] - 2E[X] + 1 = 10 \implies E[X^2] - 2E[X] = 9,$$
$$E[(X - 2)^2] = E[X^2 - 4X + 4] = E[X^2] - 4E[X] + 4 = 6 \implies E[X^2] - 4E[X] = 2,$$
we have $E[X^2] = 16$ and $\mu = E[X] = \tfrac{7}{2}$.
(b) $\sigma = \sqrt{\operatorname{Var}(X)} = \sqrt{E[X^2] - (E[X])^2} = \sqrt{16 - \tfrac{49}{4}} = \tfrac{\sqrt{15}}{2}$.
(c) Using Chebyshev's inequality with $\mu = \tfrac{7}{2}$, $\varepsilon = \sqrt{15}$, and $\sigma^2 = \tfrac{15}{4}$, we obtain
$$P(|X - \mu| \ge \varepsilon) \le \frac{\sigma^2}{\varepsilon^2} \implies P\!\left(\left|X - \tfrac{7}{2}\right| \ge \sqrt{15}\right) \le \frac{15/4}{15} = \frac{1}{4}.$$


Moment Generating Functions

Although the moments of most distributions can be determined directly by evaluating the necessary integrals or sums, an alternative procedure sometimes provides considerable simplification. This technique utilizes moment-generating functions.

Definition 37. The moment-generating function of a random variable X, where it exists, is given by
$$M_X(t) = E[e^{tX}] = \sum_x e^{tx} f(x)$$
when X is discrete and
$$M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$$
when X is continuous.

To explain why we refer to this function as a moment-generating function, let us substitute for $e^{tx}$ its Maclaurin series expansion, that is,
$$e^{tx} = 1 + tx + \frac{t^2x^2}{2!} + \frac{t^3x^3}{3!} + \cdots.$$
For the discrete case, we thus get
$$M_X(t) = \sum_x \left[1 + tx + \frac{t^2x^2}{2!} + \cdots + \frac{t^rx^r}{r!} + \cdots\right] f(x) = \underbrace{\sum_x f(x)}_{1} + t\underbrace{\sum_x x\,f(x)}_{E[X] = \mu_1' = \mu} + \frac{t^2}{2!}\underbrace{\sum_x x^2 f(x)}_{\mu_2'} + \cdots + \frac{t^r}{r!}\underbrace{\sum_x x^r f(x)}_{\mu_r'} + \cdots = 1 + \mu t + \mu_2'\frac{t^2}{2!} + \cdots + \mu_r'\frac{t^r}{r!} + \cdots$$
and it can be seen that in the Maclaurin series of the moment-generating function of X the coefficient of $t^r/r!$ is $\mu_r'$, the rth moment about the origin. In the continuous case, the argument is the same.

Example 38. Find the moment-generating function of the random variable whose probability density is given by
$$f(x) = \begin{cases} e^{-x} & \text{for } x > 0, \\ 0 & \text{elsewhere,} \end{cases}$$
and use it to find an expression for $\mu_r'$.

Solution. By definition,
$$M_X(t) = E(e^{tX}) = \int_0^{\infty} e^{tx}e^{-x}\,dx = \int_0^{\infty} e^{-x(1-t)}\,dx = \lim_{c\to\infty}\int_0^c e^{-x(1-t)}\,dx = -\frac{1}{1-t}\lim_{c\to\infty}\Big[e^{-x(1-t)}\Big]_0^c = -\frac{1}{1-t}\lim_{c\to\infty}\left(e^{-c(1-t)} - e^0\right) = \frac{1}{1-t}$$
for t < 1. As is well known, when |t| < 1 the Maclaurin series for this moment-generating function is
$$M_X(t) = \frac{1}{1-t} = 1 + t + t^2 + t^3 + \cdots + t^r + \cdots = 1 + 1!\cdot\frac{t}{1!} + 2!\cdot\frac{t^2}{2!} + 3!\cdot\frac{t^3}{3!} + \cdots + r!\cdot\frac{t^r}{r!} + \cdots$$
and hence $\mu_r' = r!$ for $r = 0, 1, 2, \ldots$.

Theorem 39.
$$\left.\frac{d^r M_X(t)}{dt^r}\right|_{t=0} = \mu_r'.$$
This follows from the fact that if a function is expanded as a power series in t, the coefficient of $t^r/r!$ is the rth derivative of the function with respect to t at t = 0.

Example 40. Given that X has the probability distribution $f(x) = \frac{1}{8}\binom{3}{x}$ for x = 0, 1, 2, and 3, find the moment-generating function of this random variable and use it to determine $\mu_1'$ and $\mu_2'$.

Solution. Since
$$M_X(t) = E(e^{tX}) = \sum_{x=0}^{3} e^{tx}\cdot\frac{1}{8}\binom{3}{x} = \frac{1}{8}\left(1 + 3e^t + 3e^{2t} + e^{3t}\right) = \frac{1}{8}(1 + e^t)^3,$$
then, by Theorem 39,
$$\mu_1' = M_X'(t)\Big|_{t=0} = \frac{3}{8}(1 + e^t)^2 e^t\Big|_{t=0} = \frac{3}{2}$$
and
$$\mu_2' = M_X''(t)\Big|_{t=0} = \left[\frac{3}{4}(1 + e^t)e^{2t} + \frac{3}{8}(1 + e^t)^2 e^t\right]_{t=0} = 3.$$
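
Theorem 39 can be checked on this example with sympy (assumed available): differentiating M_X(t) = (1/8)(1 + e^t)^3 and evaluating at t = 0 recovers the moments 3/2 and 3.

import sympy as sp

t = sp.symbols("t")
M = sp.Rational(1, 8) * (1 + sp.exp(t))**3   # moment-generating function

mu1 = sp.diff(M, t, 1).subs(t, 0)   # first moment about the origin
mu2 = sp.diff(M, t, 2).subs(t, 0)   # second moment about the origin
print(mu1, mu2)                     # 3/2 3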

Often the work involved in using moment-generating functions can be simplified by making use of the following theorem.

Theorem 41. If a and b are constants, then
(a) $M_{X+a}(t) = E[e^{(X+a)t}] = e^{at}M_X(t)$;
(b) $M_{bX}(t) = E(e^{bXt}) = M_X(bt)$;
(c) $M_{\frac{X+a}{b}}(t) = E\!\left[e^{\left(\frac{X+a}{b}\right)t}\right] = e^{\frac{a}{b}t}\,M_X\!\left(\frac{t}{b}\right)$.

Proof. The proof is left as an exercise.


Product Moments

Definition 42. The rth and sth product moment about the origin of the random variables X and Y, denoted by $\mu_{r,s}'$, is the expected value of $X^rY^s$; symbolically,
$$\mu_{r,s}' = E(X^rY^s) = \sum_x \sum_y x^r y^s f(x, y)$$
for $r = 0, 1, 2, \ldots$ and $s = 0, 1, 2, \ldots$ when X and Y are discrete. The double summation extends over the entire joint range of the two random variables. Note that $\mu_{1,0}' = E(X)$, which we denote here by $\mu_X$, and that $\mu_{0,1}' = E(Y)$, which we denote here by $\mu_Y$.

Definition 43. The rth and sth product moment about the means of the random variables X and Y, denoted by $\mu_{r,s}$, is the expected value of $(X - \mu_X)^r(Y - \mu_Y)^s$; symbolically,
$$\mu_{r,s} = E[(X - \mu_X)^r(Y - \mu_Y)^s] = \sum_x \sum_y (x - \mu_X)^r (y - \mu_Y)^s f(x, y)$$
for $r = 0, 1, 2, \ldots$ and $s = 0, 1, 2, \ldots$ when X and Y are discrete. In statistics, $\mu_{1,1}$ is of special importance because it is indicative of the relationship, if any, between the values of X and Y; thus it is given a special symbol and a special name.

Definition 44. $\mu_{1,1}$ is called the covariance of X and Y, and it is denoted by $\sigma_{XY}$, cov(X, Y), or C(X, Y).

Observe that if there is a high probability that large values of X will go with large values of Y and small values of X with small values of Y, the covariance will be positive; if there is a high probability that large values of X will go with small values of Y, and vice versa, the covariance will be negative. It is in this sense that the covariance measures the relationship, or association, between the values of X and Y.

Theorem 45. $\sigma_{XY} = \mu_{1,1}' - \mu_X\mu_Y$.

Proof. Using the various theorems about expected values, we can write
$$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E[XY - X\mu_Y - Y\mu_X + \mu_X\mu_Y] = E(XY) - \mu_Y E(X) - \mu_X E(Y) + \mu_X\mu_Y = E(XY) - \mu_Y\mu_X - \mu_X\mu_Y + \mu_X\mu_Y = \mu_{1,1}' - \mu_X\mu_Y.$$

Example 46. We know that the joint and marginal probabilities of X and Y, the numbers of aspirin and sedative caplets among two caplets drawn at random from a bottle containing three aspirin, two sedative, and four laxative caplets, were recorded as follows:

        x = 0    x = 1    x = 2    h(y)
y = 0   6/36     12/36    3/36     21/36
y = 1   8/36     6/36     0        14/36
y = 2   1/36     0        0        1/36
g(x)    15/36    18/36    3/36     1

Find the covariance of X and Y.

Solution. Referring to the joint probabilities given above, we get
$$\mu_{1,1}' = E(XY) = \sum_{x=0}^{2}\sum_{y=0}^{2} xy\,f(x, y) = 0\cdot0\cdot\tfrac{6}{36} + 1\cdot0\cdot\tfrac{12}{36} + 2\cdot0\cdot\tfrac{3}{36} + 0\cdot1\cdot\tfrac{8}{36} + 1\cdot1\cdot\tfrac{6}{36} + 0\cdot2\cdot\tfrac{1}{36} = \frac{6}{36}$$

and, using the marginal probabilities,
$$\mu_X = E(X) = \sum_{x=0}^{2} x\,g(x) = 0\cdot\tfrac{15}{36} + 1\cdot\tfrac{18}{36} + 2\cdot\tfrac{3}{36} = \frac{24}{36}$$
and
$$\mu_Y = E(Y) = \sum_{y=0}^{2} y\,h(y) = 0\cdot\tfrac{21}{36} + 1\cdot\tfrac{14}{36} + 2\cdot\tfrac{1}{36} = \frac{16}{36}.$$

It follows that
$$\sigma_{XY} = \mu_{1,1}' - \mu_X\mu_Y = \frac{6}{36} - \frac{24}{36}\cdot\frac{16}{36} = -\frac{7}{54}.$$
The negative result suggests that the more aspirin caplets we get, the fewer sedative caplets we will get, and vice versa, and this, of course, makes sense.
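
The covariance follows directly from the joint table; the sketch below stores f(x, y) as a dictionary of exact fractions and reproduces σ_XY = -7/54.

from fractions import Fraction as F

# joint probabilities f(x, y) from Example 46 (omitted entries are zero)
f = {(0, 0): F(6, 36), (1, 0): F(12, 36), (2, 0): F(3, 36),
     (0, 1): F(8, 36), (1, 1): F(6, 36),
     (0, 2): F(1, 36)}

E_XY = sum(x * y * p for (x, y), p in f.items())
E_X = sum(x * p for (x, _), p in f.items())
E_Y = sum(y * p for (_, y), p in f.items())
print(E_XY - E_X * E_Y)   # -7/54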

Example 47. A quarter is bent so that the probabilities of heads and tails are 0.4 and 0.6. If it is tossed twice, what is the covariance of Z, the number of heads obtained on the first toss, and W, the total number of heads obtained in the two tosses of the coin?

Solution.
f(0, 0) = P(T, T) = 0.6 · 0.6 = 0.36,
f(0, 1) = P(T, H) = 0.6 · 0.4 = 0.24,
f(1, 1) = P(H, T) = 0.4 · 0.6 = 0.24,
f(1, 2) = P(H, H) = 0.4 · 0.4 = 0.16.

        w = 0    w = 1    w = 2    g(z)
z = 0   0.36     0.24     0        0.60
z = 1   0        0.24     0.16     0.40
h(w)    0.36     0.48     0.16     1

$$\mu_{1,1}' = E(ZW) = \sum_{z=0}^{1}\sum_{w=0}^{2} zw\,f(z, w) = 0\cdot0\cdot0.36 + 0\cdot1\cdot0.24 + 1\cdot1\cdot0.24 + 1\cdot2\cdot0.16 = 0.56,$$
$$\mu_Z = E(Z) = \sum_{z=0}^{1} z\,g(z) = 0\cdot0.60 + 1\cdot0.40 = 0.40,$$
$$\mu_W = E(W) = \sum_{w=0}^{2} w\,h(w) = 0\cdot0.36 + 1\cdot0.48 + 2\cdot0.16 = 0.80,$$
and therefore
$$\sigma_{ZW} = \mu_{1,1}' - \mu_Z\mu_W = E(ZW) - E(Z)E(W) = 0.56 - 0.40\cdot0.80 = 0.24.$$

As far as the relation between X and Y is concerned, observe that if X and Y are independent, their covariance is zero; symbolically:

Theorem 48. If X and Y are independent, then $\sigma_{XY} = 0$.

Proof. If X and Y are independent, then we know that E(XY) = E(X)E(Y). Since
$$\sigma_{XY} = \mu_{1,1}' - \mu_X\mu_Y = E(XY) - E(X)E(Y),$$
we thus have
$$\sigma_{XY} = E(X)E(Y) - E(X)E(Y) = 0.$$

It is of interest to note that the independence of two random variables implies a zero covariance, but a zero covariance does not necessarily imply their independence. This is illustrated by the following example.

Example 49. If the joint distribution of X and Y is given by

         x = -1    x = 0    x = 1    h(y)
y = -1   1/6       2/6      1/6      4/6
y = 0    0         0        0        0
y = 1    1/6       0        1/6      2/6
g(x)     2/6       2/6      2/6      1

show that their covariance is zero even though the two random variables are not independent.

Solution. Using the probabilities shown in the margins, we get
$$\mu_X = \sum_{x=-1}^{1} x\,g(x) = (-1)\cdot\tfrac{2}{6} + 0\cdot\tfrac{2}{6} + 1\cdot\tfrac{2}{6} = 0,$$
$$\mu_Y = \sum_{y=-1}^{1} y\,h(y) = (-1)\cdot\tfrac{4}{6} + 0\cdot0 + 1\cdot\tfrac{2}{6} = -\frac{2}{6},$$
$$\mu_{1,1}' = \sum_{x=-1}^{1}\sum_{y=-1}^{1} xy\,f(x, y) = (-1)(-1)\tfrac{1}{6} + (1)(-1)\tfrac{1}{6} + (-1)(1)\tfrac{1}{6} + (1)(1)\tfrac{1}{6} = 0.$$
Thus, $\sigma_{XY} = 0 - 0\cdot\left(-\tfrac{2}{6}\right) = 0$: the covariance is zero, but the two random variables are not independent. For instance, $f(x, y) \neq g(x)h(y)$ for x = -1 and y = -1.

Theorem 50.
$$\operatorname{Var}(X \pm Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) \pm 2\operatorname{Cov}(X, Y), \quad \text{or} \quad \sigma_{X\pm Y}^2 = \sigma_X^2 + \sigma_Y^2 \pm 2\sigma_{XY}.$$
Proof.
$$\operatorname{Var}(X \pm Y) = E\left[\big((X \pm Y) - (\mu_X \pm \mu_Y)\big)^2\right] = E\left[\big((X - \mu_X) \pm (Y - \mu_Y)\big)^2\right] = E[(X - \mu_X)^2] + E[(Y - \mu_Y)^2] \pm 2E[(X - \mu_X)(Y - \mu_Y)] = \operatorname{Var}(X) + \operatorname{Var}(Y) \pm 2\operatorname{Cov}(X, Y).$$


Conditional Expectation

Definition 51. If X is a discrete random variable and f(x | y) is the value of the conditional probability distribution of X given Y = y at x, the conditional expectation of u(X) given Y = y is
$$E[u(X) \mid y] = \sum_x u(x)\,f(x \mid y).$$
Similar expressions based on the conditional probability distribution of Y given X = x define the conditional expectation of v(Y) given X = x.

Definition 52. If we let u(X) = X in Definition 51, we obtain the conditional mean of the random variable X given Y = y, which we denote by
$$\mu_{X|y} = E(X \mid y) = \sum_x x\,f(x \mid y).$$
Definition 53. Correspondingly, the conditional variance of X given Y = y is
$$\sigma_{X|y}^2 = E[(X - \mu_{X|y})^2 \mid y] = E(X^2 \mid y) - \mu_{X|y}^2,$$
where $E(X^2 \mid y)$ is given by Definition 51 with $u(X) = X^2$.

Example 54. With reference to Example 46, find the conditional mean of X given Y = 1.

Solution. Making use of the results obtained before, that is, f(0 | 1) = 4/7, f(1 | 1) = 3/7, and f(2 | 1) = 0, we get
$$E(X \mid 1) = \sum_{x=0}^{2} x\,f(x \mid 1) = 0\cdot\frac{4}{7} + 1\cdot\frac{3}{7} + 2\cdot0 = \frac{3}{7}.$$

Example 55. Suppose that we have the following joint probability distribution:

        x = 1    x = 2    x = 3    x = 4    h(y)
y = 1   0.01     0.07     0.09     0.03     0.20
y = 2   0.20     0        0.05     0.25     0.50
y = 3   0.09     0.03     0.06     0.12     0.30
g(x)    0.30     0.10     0.20     0.40     1

(a) Find the conditional expectation of u(X) = X² given Y = 2.
(b) Find the conditional mean of X given Y = 2.
(c) Find the conditional variance of X given Y = 2.

Solution. Since f(x | y) = f(x, y)/h(y), we have f(x | 2) = f(x, 2)/h(2) = f(x, 2)/0.5 = 2f(x, 2), and therefore
f(1 | 2) = 2f(1, 2) = 0.40,
f(2 | 2) = 2f(2, 2) = 0.00,
f(3 | 2) = 2f(3, 2) = 0.10,
f(4 | 2) = 2f(4, 2) = 0.50.

(a) The conditional expectation of u(X) = X² given Y = 2 is
$$E[X^2 \mid Y = 2] = \sum_{x=1}^{4} x^2 f(x \mid 2) = 1^2 f(1 \mid 2) + 2^2 f(2 \mid 2) + 3^2 f(3 \mid 2) + 4^2 f(4 \mid 2) = 1(0.40) + 4(0.00) + 9(0.10) + 16(0.50) = 9.30.$$

(b) The conditional mean of X given Y = 2 is
$$E[X \mid Y = 2] = \mu_{X|2} = \sum_{x=1}^{4} x\,f(x \mid 2) = 1\,f(1 \mid 2) + 2\,f(2 \mid 2) + 3\,f(3 \mid 2) + 4\,f(4 \mid 2) = 1(0.40) + 2(0.00) + 3(0.10) + 4(0.50) = 2.70.$$

(c) The conditional variance of X given Y = 2 is
$$\sigma_{X|2}^2 = E[(X - \mu_{X|2})^2 \mid 2] = \sum_{x=1}^{4} (x - 2.7)^2 f(x \mid 2) = (1 - 2.7)^2 f(1 \mid 2) + (2 - 2.7)^2 f(2 \mid 2) + (3 - 2.7)^2 f(3 \mid 2) + (4 - 2.7)^2 f(4 \mid 2) = 2.89(0.40) + 0.49(0.00) + 0.09(0.10) + 1.69(0.50) = 2.01,$$
or, more simply,
$$\sigma_{X|2}^2 = E(X^2 \mid Y = 2) - \mu_{X|2}^2 = 9.30 - (2.70)^2 = 2.01.$$
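
The conditional computations above can be reproduced in a few lines; this sketch builds f(x | 2) from the row y = 2 of the joint table and prints the conditional mean and variance.

row_y2 = {1: 0.20, 2: 0.00, 3: 0.05, 4: 0.25}       # f(x, 2) from the table
h2 = sum(row_y2.values())                            # marginal h(2) = 0.50

f_cond = {x: p / h2 for x, p in row_y2.items()}      # f(x | 2)
mean = sum(x * p for x, p in f_cond.items())
second = sum(x**2 * p for x, p in f_cond.items())
print(mean, second - mean**2)                        # 2.7 and about 2.01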

Thank You!!!