PHP 2510
Expectation, variance, covariance, correlation

Expectation
    Discrete RV: weighted average
    Continuous RV: use an integral to take the weighted average
Variance
    Variance is the average of \( (X - \mu)^2 \)
    Standard deviation
Covariance and correlation
    Covariance is the average of \( (X - \mu_X)(Y - \mu_Y) \)
    Correlation is a scaled version of covariance
Lots of examples

PHP 2510, Oct 8, 2008
Expected value

Synonyms for expected value: average, mean.

The expectation or expected value of a random variable X is a weighted average of its possible outcomes.

For a discrete random variable, each outcome is weighted by its probability of occurrence, using the mass function:

\[ E(X) = \sum_i x_i \, P(X = x_i) = \sum_i x_i \, p(x_i) \]

For a continuous random variable, each outcome is weighted by the relative frequency of its occurrence, using the density function:

\[ E(X) = \int x \, f(x) \, dx \]
Examples: Discrete random variables

Example 1. Let X denote the number of boys in a family with three children. Assume the probability of having a boy is 0.5.

Step 1: Compute the mass function

    k    p(k)
    0    0.125
    1    0.375
    2    0.375
    3    0.125

Step 2: Compute the weighted average

\[ E(X) = \sum_{k=0}^{3} k \, p(k) = (0)(0.125) + (1)(0.375) + (2)(0.375) + (3)(0.125) = 1.5 \]
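The two steps of Example 1 can be sketched in a few lines of Python. This is only an illustration: the variable names are made up here, and the mass function is taken to be binomial with n = 3 and success probability 0.5, which is what the table of p(k) values corresponds to.

```python
from math import comb

n_children = 3
p_boy = 0.5

# Step 1: mass function p(k) = C(n, k) * p^k * (1 - p)^(n - k)
pmf = {
    k: comb(n_children, k) * p_boy**k * (1 - p_boy) ** (n_children - k)
    for k in range(n_children + 1)
}

# Step 2: weighted average of the outcomes, weighted by their probabilities
expected_boys = sum(k * p for k, p in pmf.items())

print(pmf)
print(expected_boys)
```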
Example 2: Roulette. In roulette, a ball is tossed on a spinning wheel, and it lands on one of 38 numbers (each of 1 to 36, plus 0 and 00). If you bet $1 on a particular number, the payoff for winning is $36.

Suppose you bet $1 on the number 12. Define the random variable X to be your winnings on one play of the roulette wheel. Then

\[ X = \begin{cases} 36 & \text{if the number is 12} \\ -1 & \text{if the number is not 12} \end{cases} \]

Find E(X), your expected winnings.
Step 1: Compute the mass function

    k     p(k)
    36    1/38
    -1    37/38

Step 2: Compute E(X) as the weighted average of outcomes

\[ E(X) = \sum_{k \in \{-1,\, 36\}} k \, p(k) = (-1)\left(\frac{37}{38}\right) + (36)\left(\frac{1}{38}\right) = -\frac{1}{38} \approx -0.026 \]

Question: What is the expected return in 100 plays of roulette?
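One way to check the roulette arithmetic, and to approach the question about 100 plays, is sketched below. It assumes the 100 plays are identical $1 bets and uses the rule, stated later in these notes, that the expected value of a sum is the sum of the expected values. The variable names are illustrative only.

```python
from fractions import Fraction

# Mass function for one $1 bet on a single number (38 equally likely pockets).
pmf = {36: Fraction(1, 38), -1: Fraction(37, 38)}

# Expected winnings on one play: weighted average of the outcomes.
expected_one_play = sum(k * p for k, p in pmf.items())

# Assumption: 100 identical bets, so the expectations simply add up.
expected_100_plays = 100 * expected_one_play

print(expected_one_play, float(expected_one_play))  # -1/38, about -0.026
print(float(expected_100_plays))                    # about -2.63
```

Exact fractions are used so that the result -1/38 is visible before rounding.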
Expected value for common discrete RVs

Binomial. If X has the binomial distribution with parameters n and π, then E(X) = nπ.

Example: Toss a coin 50 times, and let X denote the number of heads. Then

\[ E(X) = n\pi = 50 \times 0.5 = 25 \]

Example: The proportion of individuals with coronary artery disease (CAD) is 0.3. In a sample of 45 individuals, what is the expected number of cases of CAD?

\[ E(X) = n\pi = 45 \times 0.3 = 13.5 \]

Suppose one person is selected from the population. Define a random variable Y such that Y = 1 if the person has CAD and Y = 0 if not. Then

\[ E(Y) = n\pi = 1 \times 0.3 = 0.3 \]
Poisson. If X has the Poisson distribution with rate parameter λ, then E(X) = λ. This is because

\[ E(X) = \sum_{k=0}^{\infty} k \left( \frac{e^{-\lambda} \lambda^k}{k!} \right) = \lambda \]

The mean of a Poisson RV is the number of events you expect to observe.
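The slide states the value of the sum without the intermediate steps. One standard way to fill them in, using only the series \( e^{\lambda} = \sum_{j \ge 0} \lambda^j / j! \), is sketched here; the index j = k - 1 is introduced only for this derivation.

```latex
\begin{align*}
E(X) &= \sum_{k=0}^{\infty} k \, \frac{e^{-\lambda} \lambda^{k}}{k!}
      = \sum_{k=1}^{\infty} \frac{e^{-\lambda} \lambda^{k}}{(k-1)!}
      && \text{(the $k = 0$ term is zero)} \\
     &= \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}
      = \lambda e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^{j}}{j!}
      && \text{(substitute $j = k - 1$)} \\
     &= \lambda e^{-\lambda} e^{\lambda} = \lambda .
\end{align*}
```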
Geometric. If X has the geometric distribution with success probability π, then E(X) = 1/π. This is because

\[ E(X) = \sum_{k=1}^{\infty} k \left\{ (1 - \pi)^{k-1} \pi \right\} = \frac{1}{\pi} \]

The mean of a geometric RV is the number of trials you expect to need in order to observe the first success, counting the successful trial itself. Hence if the success probability π is low, E(X) will be high, and vice versa.

Example. If you roll two dice, the probability of rolling a total of 3 is 2/36, or about 0.056. Let X denote the number of rolls until a 3 comes up. What is E(X)? (Ans: 18)
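The answer of 18 can be checked numerically by truncating the infinite sum. The sketch below is not a proof: it assumes 10,000 terms are enough (the omitted tail is negligible for this value of π), and the function name is made up for illustration.

```python
def geometric_mean_truncated(pi, n_terms=10_000):
    """Approximate E(X) = sum over k >= 1 of k * (1 - pi)^(k - 1) * pi."""
    return sum(k * (1 - pi) ** (k - 1) * pi for k in range(1, n_terms + 1))


pi_three = 2 / 36  # probability that two dice total 3
approx_mean = geometric_mean_truncated(pi_three)

print(approx_mean)   # close to 18
print(1 / pi_three)  # the closed form 1/pi
```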
Expected value for continuous RV

Let X be a continuous random variable defined on an interval A. Then the expected value is a weighted average of outcomes, weighted by the relative frequency of each outcome. The weighted average is computed using an integral,

\[ E(X) = \int_A x \, f(x) \, dx \]
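Example. Suppose X is a uniform random variable on the interval [1, 4]. Find E(X).

Step 1: Recall that \( f(x) = \frac{1}{4 - 1} = \frac{1}{3} \), and that the interval A is [1, 4]. So the appropriate integral is

\[ \int_1^4 x \, f(x) \, dx = \int_1^4 x \cdot \frac{1}{3} \, dx \]

Step 2: Evaluate the integral

\[ \int_1^4 x \cdot \frac{1}{3} \, dx = \frac{1}{3} \left[ \frac{x^2}{2} \right]_1^4 = \frac{1}{3} \left( \frac{16}{2} - \frac{1}{2} \right) = 2.5 \]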
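The same integral can be approximated numerically, which makes the "weighted average" reading of the integral concrete. The sketch below uses a plain midpoint rule. The function names and the number of slices are choices made for this illustration, not part of the lecture.

```python
def expected_value_midpoint(density, lower, upper, n_slices=10_000):
    """Approximate the integral of x * density(x) over [lower, upper]."""
    width = (upper - lower) / n_slices
    total = 0.0
    for i in range(n_slices):
        x_mid = lower + (i + 0.5) * width  # midpoint of the i-th slice
        total += x_mid * density(x_mid) * width
    return total


def uniform_density(x):
    """Density of the uniform distribution on [1, 4], evaluated on its support."""
    return 1 / 3


mean_uniform = expected_value_midpoint(uniform_density, 1.0, 4.0)
print(mean_uniform)  # close to 2.5
```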
Expected values for common continuous RVs

Normal. If X has a normal distribution with parameters µ and σ, then E(X) = µ.

Exponential. If X has the exponential distribution with parameter θ, then E(X) = θ. In this case, θ is the expected waiting time until an event occurs, and 1/θ is called the event rate.
Some properties of expected values

1. Linear combinations. If a and b are constants, then

\[ E(aX + b) = a \, E(X) + b \]

2. Sums of random variables. The expected value of a sum of random variables is the sum of the expected values.

\[ E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n) \]
Example. Suppose X is a Poisson random variable denoting the number of lottery winners per week. Its expected value is E(X) = 2. What is the expected number of winners over 4 weeks?

\[ E(4X) = 4 \, E(X) = 4 \times 2 = 8 \]

Example. Let X denote the daily low temperature for each day in September, and let E(X) denote its average. Suppose E(X) = 65, measured in degrees Fahrenheit. What is the mean temperature in degrees Celsius?

To convert X from Fahrenheit to Celsius, define a new random variable

\[ Y = \frac{5}{9} X - \frac{160}{9} \]

Then, using the rule about linear combinations,

\[ E(Y) = \frac{5}{9} E(X) - \frac{160}{9} = \frac{5}{9}(65) - \frac{160}{9} \approx 18.3 \]
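The linear-combination rule in the temperature example can be checked on a small made-up distribution. In the sketch below, the three-point mass function is hypothetical and chosen only so that its mean is 65. The point is that converting every outcome and then averaging gives the same answer as converting the average.

```python
def fahrenheit_to_celsius(temp_f):
    """Y = (5/9) X - 160/9, the conversion used in the example."""
    return 5 / 9 * temp_f - 160 / 9


# Hypothetical mass function for X (degrees Fahrenheit) with mean 65.
pmf_f = {55: 0.25, 65: 0.50, 75: 0.25}

mean_f = sum(x * p for x, p in pmf_f.items())

# Route 1: convert each outcome, then take the weighted average.
mean_c_direct = sum(fahrenheit_to_celsius(x) * p for x, p in pmf_f.items())

# Route 2: apply E(aX + b) = a E(X) + b to the mean itself.
mean_c_rule = fahrenheit_to_celsius(mean_f)

print(mean_f, mean_c_direct, mean_c_rule)
```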
Computing means from a sample of data

Loosely speaking, for a sample of observed data \( x_1, x_2, \ldots, x_n \), each of the individual \( x_i \) can be thought of as having associated probability mass \( p(x_i) = 1/n \). So the sample mean is

\[ \bar{x} = \sum_{i=1}^{n} x_i \, p(x_i) = \sum_{i=1}^{n} x_i \, (1/n) = \frac{1}{n} \sum_{i=1}^{n} x_i \]

Simply put, take the sum of the observations and divide by n.

Sample means are not expected values! They are random variables.

We will discuss sample means later on...
Variance of a random variable

Variance measures the dispersion of a random variable's distribution. It is just an average: the average squared deviation of a random variable from its mean.

To make notation simple, let µ = E(X). Then

\[ \mathrm{var}(X) = E\{(X - \mu)^2\} \]

In other words, it is the average value of \( (X - \mu)^2 \).

For a discrete random variable,

\[ \mathrm{var}(X) = \sum_i (x_i - \mu)^2 \, p(x_i) \]

For a continuous random variable,

\[ \mathrm{var}(X) = \int (x - \mu)^2 \, f(x) \, dx \]
Example 1 (consumers of alcohol). In a certain population, the proportion of those consuming alcohol is 0.65. Select a person at random, with X = 1 if the person is a consumer of alcohol and X = 0 if not.

In this example, E(X) = µ = 0.65.

\[ \mathrm{var}(X) = E\{(X - 0.65)^2\} = \sum_i (x_i - 0.65)^2 \, p(x_i) = (1 - 0.65)^2 (0.65) + (0 - 0.65)^2 (0.35) \approx 0.228 \]

Example 2. Suppose instead the probability was 0.1. What then is var(X)? Ans = 0.09.

Pattern: For a binomial random variable X with n = 1 and success probability π,

\[ \mathrm{var}(X) = \pi (1 - \pi) \]
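Both examples, and the pattern π(1 − π), can be reproduced directly from the definition of variance for a discrete random variable. The helper below is an illustrative sketch with a made-up name. It sums the squared deviations over the two outcomes 1 and 0.

```python
def bernoulli_variance(pi):
    """Variance of X where P(X = 1) = pi and P(X = 0) = 1 - pi."""
    mean = pi  # E(X) = 1 * pi + 0 * (1 - pi)
    pmf = {1: pi, 0: 1 - pi}
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


var_alcohol = bernoulli_variance(0.65)  # Example 1
var_tenth = bernoulli_variance(0.1)     # Example 2

print(round(var_alcohol, 4), round(var_tenth, 4))
```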
Properties of variance

If a and b are constants, then

\[ \mathrm{var}(aX + b) = a^2 \, \mathrm{var}(X) \]

(Why is b not included?)

If \( X_1, X_2, \ldots, X_n \) are independent random variables, then

\[ \mathrm{var}(X_1 + X_2 + \cdots + X_n) = \mathrm{var}(X_1) + \mathrm{var}(X_2) + \cdots + \mathrm{var}(X_n) \]
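One way to see why b drops out is to apply the definition of variance to aX + b, using E(aX + b) = aµ + b from the properties of expected values. A short sketch of that derivation, with µ = E(X) as before:

```latex
\begin{align*}
\mathrm{var}(aX + b)
  &= E\left\{ \big( (aX + b) - (a\mu + b) \big)^2 \right\} \\
  &= E\left\{ \big( a (X - \mu) \big)^2 \right\}
     && \text{(the $b$ terms cancel)} \\
  &= a^2 \, E\left\{ (X - \mu)^2 \right\}
   = a^2 \, \mathrm{var}(X) .
\end{align*}
```

Adding a constant shifts every outcome and the mean by the same amount, so the deviations from the mean are unchanged.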
Computing variances from a sample of data

As with the sample mean, for a sample of observed data \( x_1, x_2, \ldots, x_n \), each of the individual \( x_i \) can be thought of as having associated probability mass \( p(x_i) = 1/n \).

To calculate the sample variance, we take an average of \( (x_i - \bar{x})^2 \). The sample variance is

\[ S^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 \, p(x_i) = \sum_{i=1}^{n} (x_i - \bar{x})^2 \, (1/n) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

It is more common to use \( \frac{1}{n-1} \) instead of \( \frac{1}{n} \). We will discuss reasons for this later. For now, you should think of variance as an average.
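The sample mean and both versions of the sample variance can be computed in a few lines. The data below are hypothetical, chosen only to keep the arithmetic simple. Python's standard `statistics` module provides `pvariance` (divides by n) and `variance` (divides by n - 1), which are used here as a cross-check.

```python
from statistics import pvariance, variance

# Hypothetical sample, chosen only so the arithmetic is easy to follow.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
x_bar = sum(sample) / n  # sample mean: sum of observations divided by n
sum_sq_dev = sum((x - x_bar) ** 2 for x in sample)

var_divide_by_n = sum_sq_dev / n              # the plain average of squared deviations
var_divide_by_n_minus_1 = sum_sq_dev / (n - 1)  # the more common convention

print(x_bar, var_divide_by_n, var_divide_by_n_minus_1)
print(pvariance(sample), variance(sample))  # standard-library equivalents
```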
Standard deviation

The standard deviation measures, roughly, the typical distance of a random variable X from its mean.

By definition, \( SD(X) = \sqrt{\mathrm{var}(X)} \).

The logic goes like this:

1. because var(X) measures the average squared deviation between X and its mean; and
2. because \( SD(X) = \sqrt{\mathrm{var}(X)} \); then
3. SD(X) is approximately equal to the average absolute deviation between X and its mean
Example. In September in Providence, the noontime temperature has mean 65 and variance 100.

What is the SD of the temperatures?

Select a day at random. What does the SD tell us about the temperature on that day, relative to the average temperature?

Suppose noontime temperatures are normally distributed. Should a noontime temperature of 85 be considered unusual? Why or why not?
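One possible way to approach these questions numerically is sketched below. It relies on the normality assumption stated in the example and on `statistics.NormalDist` from the Python standard library. Whether a given tail probability counts as "unusual" is a judgment call that the code does not settle.

```python
from math import sqrt
from statistics import NormalDist

mean_temp = 65.0
sd_temp = sqrt(100.0)  # SD is the square root of the variance

# How many SDs above the mean is a reading of 85?
z_score = (85.0 - mean_temp) / sd_temp

# Upper-tail probability, assuming temperatures are normally distributed.
prob_at_least_85 = 1.0 - NormalDist(mu=mean_temp, sigma=sd_temp).cdf(85.0)

print(sd_temp, z_score)
print(round(prob_at_least_85, 4))  # a little over 2 percent
```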
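Mean and variance for some common RVs

    Random variable    Mass or density function                                               E(X)    var(X)
    -----------------+---------------------------------------------------------------------------------------------
    Binomial(n, π)     \( \binom{n}{x} \pi^x (1 - \pi)^{n - x} \)                             nπ      nπ(1 - π)
    Poisson(λ)         \( e^{-\lambda} \lambda^x / x! \)                                      λ       λ
    Geometric(π)       \( (1 - \pi)^{x - 1} \pi \)                                            1/π     (1 - π)/π^2
    Normal(µ, σ^2)     \( \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)} \)      µ       σ^2
    Exponential(θ)     \( (1/\theta) e^{-x/\theta} \)                                         θ       θ^2

In the normal density, the π under the square root is the constant 3.14159..., not a success probability. The exponential row uses the same parameterization as earlier in these notes, where θ is the expected waiting time, so E(X) = θ.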
Correlation and covariance

Correlation and covariance are one way to measure association between two random variables that are observed at the same time on the same unit.

Example: height and weight measured on the same person
Example: years of education and income
Example: two successive measures of weight, taken on the same person but one year apart
Covariance

Covariance measures the degree to which two variables vary together around their means. It is an average:

\[ \mathrm{cov}(X, Y) = E\{(X - \mu_X)(Y - \mu_Y)\} \]

cov(X, Y) > 0 means that X and Y tend to vary in the same direction relative to their means (both higher or both lower). They have a positive association. Example: height and weight

cov(X, Y) < 0 means that X and Y tend to vary in opposite directions relative to their means (when one is higher, the other is lower). They have a negative association. Example: weight and minutes of exercise per day

cov(X, Y) = 0 generally means that X and Y are not linearly associated.
Example: mean arterial pressure and body mass index during pregnancy

SUMMARY STATISTICS

    Variable |   Obs        Mean    Std. Dev.
    ---------+---------------------------------
       map24 |   326    76.55951    7.351673
         bmi |   326    25.10736    6.217994

Give an interpretation for SD here.
[Figure: plot of map24 (vertical axis) against bmi (horizontal axis)]
Computing covariance

For individual i, let \( m_i \) denote MAP and let \( b_i \) denote BMI. In this table, prod represents \( (m_i - \bar{m})(b_i - \bar{b}) \).

Recall \( \bar{m} = 76.6 \) and \( \bar{b} = 25.1 \).

To compute the covariance, we take the average (sample mean) of the products (following pages).

DATA EXCERPT

          map24 (m_i)    bmi (b_i)         prod
    ---------------------------------------------
      1.     72.7          15.9        35.53593
      2.     69.3          16.3        63.9371
      3.     81            16.3       -39.10899
      4.     63.7          16.3       113.2583
      5.     74            16.6        21.77467
      6.     73.3          16.6        27.7298
      7.     69.3          16.9        59.58139
      8.     74.7          16.9        15.26169
      9.     82.7          17         -49.78313
     10.     73            17.2        28.14632
     11.     66.3          17.2        81.12561
     12.     74            17.8        18.70326
     13.     73            17.8        26.01062
     14.     84.3          17.9       -55.78852
     15.     68.3          17.9        59.52924
     16.     70.3          18          44.48857

SUMMARY STATISTICS

    Variable |   Obs        Mean
    ---------+-----------------------
        prod |   326     13.2753
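The prod column can be reproduced from the displayed data and the sample means in the summary statistics. The sketch below does this for the first three rows of the excerpt. Because the displayed map24 and bmi values are rounded, the results are expected to agree with the table only to a few decimal places.

```python
# Sample means as reported in the summary statistics (all 326 observations).
MEAN_MAP = 76.55951
MEAN_BMI = 25.10736

# First three rows of the data excerpt: (map24, bmi), as displayed.
rows = [(72.7, 15.9), (69.3, 16.3), (81.0, 16.3)]

# prod_i = (m_i - mean of m) * (b_i - mean of b)
products = [(m - MEAN_MAP) * (b - MEAN_BMI) for m, b in rows]

for (m, b), prod in zip(rows, products):
    print(m, b, round(prod, 3))
```

Averaging the full prod column over all 326 individuals is what gives the reported mean of 13.2753.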
Computing covariance from a sample

Like mean and variance, covariance is an average. In a sample of pairs \( (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n) \), we can assume each pair is observed with probability \( p(x_i, y_i) = 1/n \).

Then the sample covariance is a weighted average of \( (x_i - \bar{x})(y_i - \bar{y}) \):

\[ \widehat{\mathrm{cov}}(X, Y) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \, p(x_i, y_i) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \]
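The formula translates almost line for line into code. The function below is a minimal sketch with a made-up name. The optional switch to an n - 1 divisor is an addition for comparison with common software output and is not part of the formula on this slide. The example data are hypothetical.

```python
def sample_covariance(xs, ys, divide_by_n_minus_1=False):
    """Average of (x_i - x_bar) * (y_i - y_bar) over paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (n - 1 if divide_by_n_minus_1 else n)


# Hypothetical paired data in which y rises with x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

cov_n = sample_covariance(xs, ys)
cov_n_minus_1 = sample_covariance(xs, ys, divide_by_n_minus_1=True)
print(cov_n, cov_n_minus_1)
```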
Correlation is a standardized covariance

\[ \mathrm{corr}(X, Y) = \frac{\mathrm{cov}(X, Y)}{SD(X) \, SD(Y)} \]

Always between -1 and 1

Measures the degree of linear relationship (if the relationship is not linear, correlation is not an appropriate measure of association)

Pearson's sample correlation plugs in sample estimates for the quantities in the formula above:

\[ \widehat{\mathrm{corr}}(X, Y) = \frac{(1/n) \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{S_x \, S_y} \]
SUMMARY STATISTICS

    Variable |   Obs        Mean    Std. Dev.         Min         Max
    ---------+----------------------------------------------------------
        prod |   326     13.2753    53.69735   -131.3067    391.1627
       map24 |   326    76.55951    7.351673          55       101.3
         bmi |   326    25.10736    6.217994        15.9        57.2

CORRELATION COEFFICIENT (obs=326)

             |      bmi
    ---------+------------------
       map24 |   0.2913

Using the numbers in the table above, how would you obtain the correlation coefficient?
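One possible route from the table to the reported coefficient is sketched below. It takes the mean of prod as the 1/n covariance and divides by the two standard deviations. The second calculation rests on an assumption not stated in the slides: that the reported Std. Dev. values use the n - 1 divisor, so the covariance is rescaled by n/(n - 1) to match.

```python
n_obs = 326
mean_prod = 13.2753  # (1/n) * sum of products, i.e. the 1/n covariance
sd_map = 7.351673
sd_bmi = 6.217994

# Direct plug-in: covariance divided by the product of the SDs.
corr_plug_in = mean_prod / (sd_map * sd_bmi)

# Assumption: the SDs were computed with an n - 1 divisor, so put the
# covariance on the same footing before taking the ratio.
corr_matched = (n_obs / (n_obs - 1)) * mean_prod / (sd_map * sd_bmi)

print(round(corr_plug_in, 4))
print(round(corr_matched, 4))
```

With 326 observations the two versions differ only in the third decimal place, and the rescaled one agrees with the reported 0.2913.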