CHAPTER 7 Online Supplement

Covariance and Correlation for Measuring Dependence

We have discussed the notion of probabilistic dependence above and indicated that dependence is defined in terms of conditional distributions. In some cases, though, conditional distributions can be difficult to use, and another approach to measuring dependence is worthwhile. Covariance is a quantity that is closely related to the idea of variance. Covariance and its close relative, correlation, can be used to measure certain kinds of dependence.

The covariance between two uncertain quantities X and Y is calculated mathematically by:

$$
\begin{aligned}
\mathrm{Cov}(X, Y) &= [x_1 - E(X)][y_1 - E(Y)]\,P(X = x_1 \text{ and } Y = y_1) + \cdots + [x_n - E(X)][y_m - E(Y)]\,P(X = x_n \text{ and } Y = y_m) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{m} [x_i - E(X)][y_j - E(Y)]\,P(X = x_i \text{ and } Y = y_j) \\
&= E\{[X - E(X)][Y - E(Y)]\}
\end{aligned}
$$

Although this is a complicated formula, with a little interpretation we can get some insight into it. First, it really is similar to the formula for variance. There we calculated an average of squared deviations of X from its expected value E(X). Here, instead of squaring the deviations, we multiply X's deviation by Y's deviation. This is sometimes called a cross product because it is a product of deviations for two different quantities (X and Y).

Although it may not be evident on the surface, the covariance can be either positive or negative. Suppose that large values of X tend to occur with large values of Y, and small with small. In other words, the probability that high values of X occur together with low values of Y (and vice versa) is low. Now consider what happens with the cross products in the formula. When both X and Y are high, they will both be above their corresponding expected values, making both deviations and their cross product positive. When both quantities are low, they will both be below their expected values. Both deviations will be negative, but the cross product will again be positive. When one is high and one is low (that is, one above
its expected value and one below), the cross product will be negative. If large X's tend to go with large Y's, and small with small, then the positive cross products will get more weight (higher probabilities) than the negative cross products in the formula, and the overall calculation will yield a positive covariance. On the other hand, if large X's tend to occur with small Y's, and vice versa, then the negative cross products will get more weight, resulting in a negative covariance. Thus, a positive covariance reflects quantities that tend to move in the same direction, and this is called a positive or direct relationship. Likewise, a negative covariance indicates that the quantities tend to move in opposite directions, which is called a negative or indirect relationship.

A simple example will help clarify the idea of covariance and its calculation. Suppose an investor is considering purchasing shares in American Rivets Corporation (ARC). The investor already has shares of Sundance Solar Power (SSP). One of the things the investor would like to accomplish is to stabilize the rate of return of the portfolio; ideally, when the return on one stock goes down, the other would go up. On the other hand, a positive relationship would be bad because the returns would tend to go up and down together, making the overall return on the portfolio vary considerably.

A simple model of the returns on the two stocks involves only two possible outcomes for each one. ARC could have returns of 10% or -5%. SSP, on the other hand, could have returns of 12% or -8%. The probabilities of the possible outcomes are

P(ARC = 10% and SSP = 12%) = 0.35
P(ARC = 10% and SSP = -8%) = 0.10
P(ARC = -5% and SSP = 12%) = 0.15
P(ARC = -5% and SSP = -8%) = 0.40

You can see that the two stocks tend to move in the same direction. There is a 75% chance that they are both high or both low, and only a 25% chance that one is high and the other low. Thus, we expect the covariance to be positive.
Calculating the covariance requires first calculating the expected value for each stock. Considering ARC first, we can use the law of total probability:

P(ARC = 10%) = P(ARC = 10% and SSP = 12%) + P(ARC = 10% and SSP = -8%) = 0.35 + 0.10 = 0.45

Thus, P(ARC = -5%) = 1 - P(ARC = 10%) = 0.55, and we can calculate

E(ARC) = 0.45(10%) + 0.55(-5%) = 1.75%

Likewise, we can calculate E(SSP):

P(SSP = 12%) = P(ARC = 10% and SSP = 12%) + P(ARC = -5% and SSP = 12%) = 0.35 + 0.15 = 0.50
P(SSP = -8%) = 1 - P(SSP = 12%) = 0.50

Therefore:

E(SSP) = 0.50(12%) + 0.50(-8%) = 2%
Now we can calculate the covariance between ARC and SSP:

Cov(ARC, SSP) = [10% - 1.75%][12% - 2%] P(ARC = 10% and SSP = 12%)
  + [10% - 1.75%][-8% - 2%] P(ARC = 10% and SSP = -8%)
  + [-5% - 1.75%][12% - 2%] P(ARC = -5% and SSP = 12%)
  + [-5% - 1.75%][-8% - 2%] P(ARC = -5% and SSP = -8%)
  = (8.25%)(10%)(0.35) + (8.25%)(-10%)(0.10) + (-6.75%)(10%)(0.15) + (-6.75%)(-10%)(0.40)
  = 37.5 (% squared)

As expected, the covariance is positive. The problem, however, is that the magnitude of the covariance is not very meaningful because it depends on the range of variation in the two quantities. Also, as with the variance, the covariance carries units that are not meaningful. In the case of the two stock returns, the units are percentage squared. But suppose we wanted to calculate the covariance between hemline height and stock market return; the calculation would involve multiplying inches by percentage points of return, and so the units would be percentage-inches. What in the world is a percentage-inch?

To solve these two problems, we often transform the covariance to get a standardized measure of dependence. This standardized measure is called the correlation coefficient, and the Greek symbol ρ (rho) is used to represent it. To calculate ρ, divide the covariance of X and Y by the standard deviations of these two uncertain quantities:

$$
\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}
$$
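The expected values and covariance for the ARC and SSP example can be checked with a short script. The following is a minimal Python sketch, not part of the original text; the names `joint`, `expected_values`, and `covariance` are illustrative choices, and returns are expressed in percentage points.

```python
# Joint distribution from the ARC/SSP example.
# Keys are (ARC return in %, SSP return in %); values are joint probabilities.
joint = {
    (10.0, 12.0): 0.35,
    (10.0, -8.0): 0.10,
    (-5.0, 12.0): 0.15,
    (-5.0, -8.0): 0.40,
}


def expected_values(joint):
    """Return (E[X], E[Y]) for a discrete joint distribution."""
    ex = sum(x * p for (x, _), p in joint.items())
    ey = sum(y * p for (_, y), p in joint.items())
    return ex, ey


def covariance(joint):
    """Probability-weighted sum of cross products of deviations."""
    ex, ey = expected_values(joint)
    return sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())


e_arc, e_ssp = expected_values(joint)
cov = covariance(joint)
print(f"E(ARC) = {e_arc:.2f}%, E(SSP) = {e_ssp:.2f}%")
print(f"Cov(ARC, SSP) = {cov:.1f} (% squared)")
```

Summing over the joint outcomes directly gives the same expected values as first computing the marginal probabilities with the law of total probability; it simply skips the intermediate step.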
The correlation ρ_XY, or simply ρ, has a number of useful properties. First, it ranges between +1 (perfect positive dependence) and -1 (perfect negative dependence). A correlation of zero suggests no relationship, although certain kinds of dependence are possible even though the correlation is zero. Complete Exercise 7S.1 and calculate the correlation to see an example. Also, ρ has no units. In the hemline and stock market example, we divide the covariance, which is in percentage-inches, by the standard deviation of return, which is in percentage, and the standard deviation of hemline height, which is in inches, and the units cancel each other out. As a result, the correlation is a unitless measure. An implication is that the correlation is useful for comparing the strength of the relationship in one case with the strength of the relationship in another that involves different variables altogether.

To continue the example, we calculate the correlation between the returns for stocks ARC and SSP. To do this calculation, we must first calculate the standard deviation for each of the two individual stocks. These are σ_ARC = 7.46% and σ_SSP = 10%. Thus,

$$
\rho_{ARC,SSP} = \frac{\mathrm{Cov}(ARC, SSP)}{\sigma_{ARC}\,\sigma_{SSP}} = \frac{37.5}{(7.46)(10)} \approx 0.50
$$

This correlation of approximately 0.50 gives the investor an indication of the extent to which the returns of the two stocks are related to each other. By comparing the correlations of different pairs of stocks, the investor can try to locate those with lower (or even negative) correlations in order to accomplish the objective of stabilizing the return of the overall portfolio.

A portfolio of two assets can have a lower risk (variation) than the individual risk of either asset. When two assets are negatively correlated, increases in one asset tend to occur with decreases in the other, thereby stabilizing the returns. This lowers the risk of the portfolio without sacrificing return because the expected return is still the weighted sum of the two assets' returns.
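The correlation calculation for ARC and SSP, and the remark that a zero correlation does not rule out dependence, can both be illustrated with a minimal Python sketch. The function names are illustrative, and the second distribution (Y equal to X squared, with X taking the values -1, 0, and 1 with equal probability) is a hypothetical example introduced here, not one taken from the text.

```python
import math


def moments(joint):
    """Return (Var[X], Var[Y], Cov[X, Y]) for a discrete joint distribution."""
    ex = sum(x * p for (x, _), p in joint.items())
    ey = sum(y * p for (_, y), p in joint.items())
    var_x = sum((x - ex) ** 2 * p for (x, _), p in joint.items())
    var_y = sum((y - ey) ** 2 * p for (_, y), p in joint.items())
    cov = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())
    return var_x, var_y, cov


def correlation(joint):
    """Covariance divided by the product of the two standard deviations."""
    var_x, var_y, cov = moments(joint)
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))


# ARC/SSP joint distribution from the text (returns in percentage points).
arc_ssp = {
    (10.0, 12.0): 0.35,
    (10.0, -8.0): 0.10,
    (-5.0, 12.0): 0.15,
    (-5.0, -8.0): 0.40,
}
var_arc, var_ssp, _ = moments(arc_ssp)
sd_arc, sd_ssp = math.sqrt(var_arc), math.sqrt(var_ssp)
rho_arc_ssp = correlation(arc_ssp)

# Hypothetical distribution (an assumption for illustration): Y = X**2,
# so Y is completely determined by X, yet the relationship is not monotonic.
parabola = {(-1.0, 1.0): 1 / 3, (0.0, 0.0): 1 / 3, (1.0, 1.0): 1 / 3}
rho_parabola = correlation(parabola)

print(f"sd(ARC) = {sd_arc:.2f}%, sd(SSP) = {sd_ssp:.2f}%")
print(f"rho(ARC, SSP) = {rho_arc_ssp:.2f}")
print(f"rho for Y = X^2 example = {rho_parabola:.2f}")
```

In the hypothetical example, the positive and negative cross products cancel exactly, so the correlation is zero even though knowing X tells you Y with certainty.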
But what happens when the portfolio has more than two assets? Then the number of dependent relations grows quickly. With two assets, there is only one correlation to monitor, with three assets there are three correlations, and with 30 assets there are over 400 correlations! Clearly, we need a formula that keeps track of the interplay among all the correlations and precisely relates the risk of a portfolio to the risk of each asset.
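The counts quoted above follow from the number of distinct pairs that can be formed from a set of assets. As a short worked equation (the symbol n for the number of assets is introduced here for illustration):

$$
\binom{n}{2} = \frac{n(n-1)}{2}, \qquad \binom{2}{2} = 1, \quad \binom{3}{2} = 3, \quad \binom{30}{2} = \frac{30 \cdot 29}{2} = 435.
$$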
As in the text, we will denote Portfolio P's return by R_P. To derive the formula for the variance, Var(R_P), we will use the expected-value operator. We have already noted that the variance can be written as the expected value of squared differences: Var(X) = E[(X - E(X))^2]. In a similar way, the covariance is an expected value, this time the expected value of the cross product: Cov(X, Y) = E[(X - E(X))(Y - E(Y))]. Using the properties of expected value, we have:

$$
\begin{aligned}
\mathrm{Var}(R_P) &= E[(R_P - E(R_P))^2] \\
&= E[((w_{AB}R_{AB} + w_{CD}R_{CD} + w_{EF}R_{EF}) - E(w_{AB}R_{AB} + w_{CD}R_{CD} + w_{EF}R_{EF}))^2] \\
&= E[((w_{AB}R_{AB} - E(w_{AB}R_{AB})) + (w_{CD}R_{CD} - E(w_{CD}R_{CD})) + (w_{EF}R_{EF} - E(w_{EF}R_{EF})))^2] \\
&= E[(w_{AB}R_{AB} - E(w_{AB}R_{AB}))^2] + E[(w_{CD}R_{CD} - E(w_{CD}R_{CD}))^2] + E[(w_{EF}R_{EF} - E(w_{EF}R_{EF}))^2] \\
&\quad + E[2(w_{AB}R_{AB} - E(w_{AB}R_{AB}))(w_{CD}R_{CD} - E(w_{CD}R_{CD}))] \\
&\quad + E[2(w_{AB}R_{AB} - E(w_{AB}R_{AB}))(w_{EF}R_{EF} - E(w_{EF}R_{EF}))] \\
&\quad + E[2(w_{CD}R_{CD} - E(w_{CD}R_{CD}))(w_{EF}R_{EF} - E(w_{EF}R_{EF}))] \\
&= \mathrm{Var}(w_{AB}R_{AB}) + \mathrm{Var}(w_{CD}R_{CD}) + \mathrm{Var}(w_{EF}R_{EF}) \\
&\quad + 2\,\mathrm{Cov}(w_{AB}R_{AB}, w_{CD}R_{CD}) + 2\,\mathrm{Cov}(w_{AB}R_{AB}, w_{EF}R_{EF}) + 2\,\mathrm{Cov}(w_{CD}R_{CD}, w_{EF}R_{EF}) \\
&= w_{AB}^2\,\mathrm{Var}(R_{AB}) + w_{CD}^2\,\mathrm{Var}(R_{CD}) + w_{EF}^2\,\mathrm{Var}(R_{EF}) \\
&\quad + 2w_{AB}w_{CD}\,\mathrm{Cov}(R_{AB}, R_{CD}) + 2w_{AB}w_{EF}\,\mathrm{Cov}(R_{AB}, R_{EF}) + 2w_{CD}w_{EF}\,\mathrm{Cov}(R_{CD}, R_{EF}).
\end{aligned}
$$

Thus, the variance of a portfolio is the sum of the squared weights times the variance of each asset, plus twice the product of the weights times the covariance for each pair of assets. It is this complicated formula that makes managing portfolios both difficult and interesting.

To better understand this formula, let's simplify by considering the two-asset portfolio:

$$
\mathrm{Var}(R_P) = w_{AB}^2\,\mathrm{Var}(R_{AB}) + w_{CD}^2\,\mathrm{Var}(R_{CD}) + 2w_{AB}w_{CD}\,\mathrm{Cov}(R_{AB}, R_{CD}).
$$
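The variance formula above can be sketched in code for any number of assets. The following minimal Python sketch is an illustration rather than the text's own method; the function names are hypothetical, the covariance matrix is passed as a nested list, and the inputs are the rounded variances for AB and CD (0.0019 and 0.0003) used in the numerical example that follows, with equal weights of 0.5.

```python
import math


def portfolio_variance(weights, cov):
    """Squared weights times variances, plus twice each pairwise covariance term.

    cov[i][j] is Cov(R_i, R_j); the diagonal entries cov[i][i] are the variances.
    """
    n = len(weights)
    total = 0.0
    for i in range(n):
        total += weights[i] ** 2 * cov[i][i]
        for j in range(i + 1, n):
            total += 2.0 * weights[i] * weights[j] * cov[i][j]
    return total


def two_asset_cov_matrix(var_a, var_b, rho):
    """Build a 2x2 covariance matrix from two variances and a correlation."""
    c = rho * math.sqrt(var_a) * math.sqrt(var_b)
    return [[var_a, c], [c, var_b]]


# Rounded variances quoted in the text for assets AB and CD.
VAR_AB, VAR_CD = 0.0019, 0.0003

portfolio_sd = {}
for rho in (0.75, -0.75):
    cov = two_asset_cov_matrix(VAR_AB, VAR_CD, rho)
    portfolio_sd[rho] = math.sqrt(portfolio_variance([0.5, 0.5], cov))
    print(f"rho = {rho:+.2f}: portfolio standard deviation = {portfolio_sd[rho]:.2%}")
```

Because the inputs are rounded, the results should be read as approximations; they land close to, but not necessarily exactly on, the portfolio standard deviations quoted in the text.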
Problem 7.7 asks you to calculate Var(R_AB), Var(R_CD), and Var(R_EF). Using Var(R_AB) = 0.0019 and Var(R_CD) = 0.0003, the variance of an equally weighted portfolio of AB and CD is:

$$
\begin{aligned}
\mathrm{Var}(R_P) &= (0.5)^2(0.0019) + (0.5)^2(0.0003) + 2(0.5)(0.5)\,\mathrm{Cov}(R_{AB}, R_{CD}) \\
&= 0.00055 + 0.5\,\mathrm{Cov}(R_{AB}, R_{CD}) \\
&= 0.00055 + 0.5\,\sigma_{AB}\,\sigma_{CD}\,\rho_{AB,CD}.
\end{aligned}
$$

Now we can compare the portfolio's risk to that of the individual assets. The standard deviation of AB is 4.3% and the standard deviation of CD is 1.8%. If the correlation between AB and CD were 0.75, then the standard deviation of the portfolio would be 2.89%, which is less than AB's standard deviation but larger than CD's. If, however, the stocks were negatively correlated, say -0.75, then the portfolio's risk would drop to 1.62%, below that of both assets, demonstrating the power of portfolio diversification.

One final warning is in order before leaving the ideas of covariance and correlation. Although these measures of dependence are widely used, they only provide insight into a certain kind of dependence. That is, as long as the relationship is such that an increase in one variable suggests an increase (or a decrease) in the other, then the covariance and correlation will reflect this relationship. If the relationship is more complex, however, it may not be adequately reflected in the covariance and correlation. For example, as X increases up to a certain point, Y might be expected to increase. If, after that point, Y is expected to decrease as X continues to increase, then we call this a nonmonotonic relationship. Covariance and correlation should only be used for monotonic relationships because they reflect nonmonotonicity poorly.

Stochastic Dominance and Multiple Attributes

Now we are prepared to follow up the discussion of stochastic dominance with multiple attributes in Chapter 4. In the case of multiple attributes, one must consider the joint distribution for all of the attributes together. To develop this, we need to introduce some notation.
Figure 7S.1 CDFs for three investment alternatives. Investment B stochastically dominates Investment A.

First, let F(x) denote the CDF for a variable X. That is,

F(x) = P(X ≤ x)

For example, in the case of yearly profit for the duplex, F($6000) = P(Yearly Profit ≤ $6000). Now we can write an algebraic formula for stochastic dominance in terms of the F's. Considering the investments in Figure 7S.1, B dominates A because F_B(x) ≤ F_A(x) for all values of x on the horizontal axis. This condition asserts that the CDF for B must lie to the right of the CDF for A. Again, we warn the reader that it is easy to reverse dominance. Note that B dominates A when B's CDF F_B(x) is always less than or equal to A's CDF F_A(x).

When there are more attributes, the CDF must encompass all of the attributes. For example, recall the summer-job example from Chapter 4, in which we discussed uncertainty about both salary and summer fun. We would have to look at the CDF for both uncertain quantities. The CDF would be denoted by

F(x_s, x_f) = P(Salary ≤ x_s and Fun ≤ x_f).

An alternative B (a specific job like the in-town job) dominates alternative A if

F_B(x_s, x_f) ≤ F_A(x_s, x_f)

for all values of x_s and x_f and is strictly less for some x_s and x_f values. If we could draw the graph in three-dimensional space, you could see that this means that the CDF for B must be entirely below the CDF for A and shifted toward larger values for both Salary and Fun.

In Chapter 4, we made the claim that if the uncertain quantities are independent, then stochastic dominance on each of the individual attributes implies overall stochastic dominance. To see this, consider
the summer-job example further. Stochastic dominance requires that F_B(x_s, x_f) ≤ F_A(x_s, x_f) for all values of x_s and x_f, which is the same as

P_B(Salary ≤ x_s and Fun ≤ x_f) ≤ P_A(Salary ≤ x_s and Fun ≤ x_f)

for all values of x_s and x_f. If Salary and Fun are independent for both alternatives A and B, we now know that the joint probabilities in this condition can be rewritten as the product of the individual (or marginal) probabilities:

P_B(Salary ≤ x_s) P_B(Fun ≤ x_f) ≤ P_A(Salary ≤ x_s) P_A(Fun ≤ x_f)

Now, suppose that B dominates A individually on each attribute. This means that

P_B(Salary ≤ x_s) ≤ P_A(Salary ≤ x_s) and P_B(Fun ≤ x_f) ≤ P_A(Fun ≤ x_f)

If this is true, the overall stochastic-dominance condition is certainly met: because all of these probabilities are nonnegative, the product P_B(Salary ≤ x_s) P_B(Fun ≤ x_f) must be less than or equal to P_A(Salary ≤ x_s) P_A(Fun ≤ x_f).

A final word of caution is in order here. The reasoning above only goes in one direction. That is, if the attributes are independent and if the individual stochastic-dominance conditions are met, then the overall stochastic-dominance condition is also met. In other words, we have identified sufficient conditions for overall stochastic dominance. However, it is possible for overall stochastic dominance to exist even though the uncertain quantities are not independent or do not display stochastic dominance in the individual attributes. In some cases, then, you might have to go back to the definition of overall stochastic dominance [F_B(x_1, ..., x_n) ≤ F_A(x_1, ..., x_n) for all x_1, ..., x_n] in order to determine whether B dominates A.

Covariance and Correlation: The Continuous Case

Covariance and correlation also have counterparts when the uncertain quantities are continuous. As with expected value and variance, the definition of covariance uses an integral sign instead of a summation:
$$
\mathrm{Cov}(X, Y) = \int_{x^-}^{x^+} \int_{y^-}^{y^+} [x - E(X)][y - E(Y)]\, f(x, y)\, dy\, dx
$$

As before, the correlation ρ_XY is calculated by dividing Cov(X, Y) by σ_X σ_Y. The double integral in the formula above replaces the double summation in the previous formula for the covariance of two discrete uncertain quantities. The term f(x, y) refers to the joint density function for uncertain quantities X and Y. This joint density function is a natural extension of the density function for a single variable; it can be interpreted as a function that indicates the relative likelihood of different (x, y) pairs occurring, and the probability that X and Y fall into any given region can be calculated from f(x, y).

Problems

7S.1 Consider the following probabilities:

P(X = 2) = 0.3
P(X = 4) = 0.7
P(Y = 10 | X = 2) = 0.9
P(Y = 20 | X = 2) = 0.1
P(Y = 10 | X = 4) = 0.25
P(Y = 20 | X = 4) = 0.75

Calculate the covariance and correlation between X and Y.

7S.2 Consider the following joint probability distribution for uncertain quantities X and Y:

P(X = -2 and Y = 2) = 0.2
P(X = -1 and Y = 1) = 0.2
P(X = 0 and Y = 0) = 0.2
P(X = 1 and Y = 1) = 0.2
P(X = 2 and Y = 2) = 0.2

Calculate the covariance and correlation between X and Y.