On Correlation Coefficient. The correlation coefficient indicates the degree of linear dependence of two random variables.


C. Candan, EE3/53 METU

It is defined as

$$\rho_{xy} = \frac{E\{(x-\mu_x)(y-\mu_y)\}}{\sigma_x \sigma_y}$$

Properties:

1. $|\rho_{xy}| \le 1$. (See the appendix for the proof of this property.)

2. If $\rho_{xy} = 0$, then $x$ and $y$ are called uncorrelated random variables. (Note that two independent variables are guaranteed to be uncorrelated; but the reverse is not true in general. So there can be two random variables which are uncorrelated, but dependent.)

3. $|\rho_{xy}| = 1 \iff y = ax + b$. Here $a$ and $b$ are nonrandom parameters, i.e. scalars, with $a \neq 0$. This relation shows that when $|\rho_{xy}| = 1$, the random variable $y$ is linearly related to $x$ through $y = ax + b$. If $|\rho_{xy}| = 1$, knowing $x$ or $y$ is sufficient to determine the other. So knowing one of the two random variables is as good as knowing both of them. (See the appendix for the proof of this property.)

4. In many applications, we can estimate the correlation coefficient between two random variables by conducting experiments. In practice we use the correlation coefficient to predict the value of $x$ (something of interest) when we can only observe $y$. Let's say we are interested in $x$, but in many applications we do not have the luck to observe $x$ directly. We have only $y$, and we know the correlation coefficient between $x$ and $y$. If $|\rho_{xy}| \approx 1$, then $x$ and $y$ are closely related, and we may expect that we can reliably predict $x$ from $y$. You will learn in some other courses that we can predict $x$ as follows:

$$\hat{x} = \rho_{xy}\frac{\sigma_x}{\sigma_y}(y-\mu_y) + \mu_x$$

This is the best linear prediction of $x$ in the mean square sense. (You will also hear about the mean square sense in those courses.)
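The prediction rule of item 4 can be tried on simulated data. The following MATLAB sketch is only an illustration: the Gaussian model $y = x + n$ and every numeric value in it are assumptions, not part of these notes. It estimates the means, standard deviations and correlation coefficient from the samples, forms $\hat{x}$, and compares its mean square error with that of simply guessing $\mu_x$.

```matlab
% Sketch: predict x from y with xhat = rho*(sigmax/sigmay)*(y - muy) + mux.
% The model and all numbers below are illustrative assumptions.
N = 1e5;                              % number of simulated (x, y) pairs
x = 2 + 3*randn(1, N);                % quantity of interest (assumed Gaussian)
y = x + 4*randn(1, N);                % what we observe: x plus independent noise

% Estimate the required statistics from the experiment
mux = mean(x);  muy = mean(y);
sigmax = std(x, 1);  sigmay = std(y, 1);      % normalised by N
rho = mean((x - mux).*(y - muy))/(sigmax*sigmay);

xhat = rho*(sigmax/sigmay)*(y - muy) + mux;   % linear prediction of x from y

mse_linear = mean((x - xhat).^2);     % error of the linear predictor
mse_mean   = mean((x - mux).^2);      % error when y is ignored and mux is guessed
fprintf('rho = %.3f, MSE(linear) = %.3f, MSE(mean only) = %.3f\n', ...
        rho, mse_linear, mse_mean);
```

With these assumed values the true correlation coefficient is 0.6, so the linear predictor's error should come out near $9(1 - 0.36) = 5.76$, against about 9 for the mean-only guess.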
Remember that we have noted in item 3 the following: if $|\rho_{xy}| = 1$, then knowing $x$ or $y$ is as good as knowing both of them. Therefore we expect to have zero prediction error in this case. For other $\rho_{xy}$ values, the value of the prediction error is not immediately clear.

The graph given below shows the mean square error (approximation error) for a general value of $\rho_{xy}$. As expected, the mean square error is zero when $|\rho_{xy}| = 1$, and as the magnitude of the correlation coefficient decreases, the error increases. The error reaches its maximum when the two random variables are uncorrelated.

[Figure: Mean Square Error of the prediction, $\sigma_x^2(1-\rho_{xy}^2)$, plotted against $\rho_{xy}$.]

    rho = linspace(-1,1,101);          % correlation coefficient values
    sigmax2 = 1;                       % variance of x (assumed value)
    plot(rho,sigmax2*(1-rho.^2));      % mean square error of the prediction
    grid on; xlabel('\rho_{xy}');
    title('Mean Square Error of the prediction \sigma_x^2(1-\rho^2_{xy})');

[For more info: Hayes, Statistical Digital Signal Processing and Modeling, p. 7]

Examples with Scatter Plots:

Let's say that we want to learn $x$; but we can only observe $y$. Let the observation model be given as $y = x + n$.
Here $n$ in $y = x + n$ is the effect of noise. (You can assume zero-mean noise without any harm or loss of generality.) The correlation coefficient between $x$ and $y$ can be calculated as

$$\rho_{xy} = \frac{\sigma_x}{\sqrt{\sigma_x^2 + \sigma_n^2}}$$

Let's start with the case of little noise. When the noise is little, i.e. the variance of the noise is small, $\rho_{xy}$ is close to 1.

[Figure: scatter plot for (x, y) pairs, y = x + n; n is the random noise on x.]

The plot given above is called a scatter plot, and it is drawn by randomly generating $x$ and $n$ and calculating $y$ through $y = x + n$. If there were no noise, $y = x$; but unfortunately there is noise in any observation. The scatter plot is drawn by putting a cross mark at the position of each randomly generated $x$ and calculated $y$ on the $(x, y)$ plane, one cross per generated pair.

So we conclude from this figure that when there is little noise, knowing $y$ can be as good as knowing $x$, which is wonderful.
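The expression for $\rho_{xy}$ in this model follows in a few steps. The derivation below is a sketch under the assumption, not spelled out in the text, that the noise $n$ is zero mean and uncorrelated with $x$:

```latex
\begin{aligned}
E\{(x-\mu_x)(y-\mu_y)\} &= E\{(x-\mu_x)(x-\mu_x+n)\}
                         = \sigma_x^2 + E\{(x-\mu_x)\,n\} = \sigma_x^2,\\
\sigma_y^2 &= \sigma_x^2 + \sigma_n^2,\\
\rho_{xy} &= \frac{\sigma_x^2}{\sigma_x\sqrt{\sigma_x^2+\sigma_n^2}}
           = \frac{\sigma_x}{\sqrt{\sigma_x^2+\sigma_n^2}}.
\end{aligned}
```

Solving for the noise variance gives $\sigma_n^2 = (1/\rho_{xy}^2 - 1)\,\sigma_x^2$, which is the noise level needed to produce a scatter plot with any desired $\rho_{xy}$.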
Below we have some other scatter plots. The noise level is higher in these plots, therefore there is a bigger spread around the line.

[Figures: further scatter plots for (x, y) pairs, y = x + n; n is the random noise on x.]
So as a conclusion, the correlation coefficient shows how much two random variables are related to each other in a linear way.

Matlab code for scatter plots:

    x = 10*rand(1,100);                % x uniform on [0,10]; sample count assumed
    rho = .5;
    sigmax = 10^2/12;                  % variance of x
    % rho = sqrt(sigmax/(sigmax+sigman))
    sigman = (1/rho^2-1)*sigmax,       % noise variance giving the chosen rho
    y = x + sqrt(sigman)*randn(size(x));
    plot(x,y,'x');
    title(['scatter plot for (x,y) pairs' char(10) 'y = x + n; n is the random noise on x,' char(10) ' \rho_{xy}=' num2str(rho)]);
    xlabel('x'),ylabel('y');

Non-Linearly Related Random Variables and Correlation Coefficient

In the previous section, we have tried to interpret the correlation coefficient for a linear observation model. A linear observation model means that the signal of interest is mapped to the output through a linear function. In the example presented in the previous section, the model is an extremely simple (but useful) one, $y = x + n$. In this section, we elaborate further on the same topic; but we switch to nonlinear observation models such as $y = f(x) + n$ with a nonlinear $f$.

As in the previous example, let's assume that $x$ is uniformly distributed in $[0, \Delta]$. Then $E\{x\} = \Delta/2$, $E\{x^2\} = \Delta^2/3$, $E\{x^3\} = \Delta^3/4$ and so on. We will construct a nonlinear function of $x$ in the form $y = f(x) = ax^2 + bx$ such that the correlation coefficient of $x$ and $y$ is zero! (Note that we are not adding any noise to the observations. The correlation is zero in the absence of noise!)

The correlation coefficient is expressed as follows:

$$\rho_{xy} = \frac{E\{(x-\mu_x)(y-\mu_y)\}}{\sigma_x \sigma_y} = \frac{E\{xy\} - \mu_x\mu_y}{\sigma_x \sigma_y}$$

It is clear that for the correlation coefficient to be zero, we need $E\{xy\} = \mu_x\mu_y$.
Let's calculate $E\{y\}$:

$$\mu_y = E\{y\} = E\{ax^2 + bx\} = aE\{x^2\} + bE\{x\} = a\frac{\Delta^2}{3} + b\frac{\Delta}{2}$$

Similarly we can calculate $E\{xy\}$ as follows:

$$E\{xy\} = E\{ax^3 + bx^2\} = a\frac{\Delta^3}{4} + b\frac{\Delta^2}{3}$$

Now we are ready to evaluate $E\{xy\} - \mu_x\mu_y$:

$$E\{xy\} - \mu_x\mu_y = a\frac{\Delta^3}{4} + b\frac{\Delta^2}{3} - \frac{\Delta}{2}\left(a\frac{\Delta^2}{3} + b\frac{\Delta}{2}\right) = \frac{\Delta^2}{12}(a\Delta + b)$$

From the last relation, we can conclude that when $b = -a\Delta$, for instance $a = -1/\Delta$ and $b = 1$, the correlation coefficient of $x$ and $y$ is zero.

The following figure presents the scatter plots for the linear observation model and the nonlinear observation model. We are assuming that both observation models have the same noise, i.e. the noise is zero-mean Gaussian with identical variance. For the plot given in the top part of the figure, the correlation coefficient for the linear model is set to 0.99. For the nonlinear observation model, it is equal to 0 since we have set $a = -1/\Delta$ and $b = 1$.

From these figures, we can see that the correlation coefficient of two random variables having a nonlinear relation between them should be treated with care. It is clear that when the effect of noise is little, it is possible to say something about $x$ given the observation $y$ for both models. At least, it is possible to reduce the set of possibilities for the unknown $x$ given $y$. Unfortunately, the correlation coefficient of the nonlinear observation model is equal to zero irrespective of the noise level corrupting the observations. Hence the correlation coefficient and related ideas are especially useful for linear observation models.
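The zero-correlation condition can be checked by simulation. The MATLAB sketch below assumes $\Delta = 1$ and a sample count chosen only for illustration; it draws noise-free samples of $y = ax^2 + bx$ and estimates the correlation coefficient, once with $b = -a\Delta$ and once with $b = 0$ for contrast. The helper `corr_est` is a name introduced here, not part of the notes.

```matlab
% Sketch: noise-free y = a*x.^2 + b*x with x uniform on [0, Delta].
% Delta, N and the contrast case b = 0 are illustrative assumptions.
Delta = 1;
N = 1e6;                                  % number of samples
x = Delta*rand(1, N);

% Sample correlation coefficient of two row vectors u and v
corr_est = @(u, v) mean((u - mean(u)).*(v - mean(v))) / (std(u, 1)*std(v, 1));

a = -1/Delta;
y_zero = a*x.^2 + (-a*Delta)*x;           % b = -a*Delta: (Delta^2/12)*(a*Delta + b) = 0
y_corr = a*x.^2;                          % b = 0: covariance is nonzero

rho_zero = corr_est(x, y_zero);
rho_corr = corr_est(x, y_corr);
fprintf('b = -a*Delta: rho = %.4f,  b = 0: rho = %.4f\n', rho_zero, rho_corr);
```

The first estimate should be close to zero even though $y$ is a deterministic function of $x$, while the second should be close to $-\sqrt{15}/4 \approx -0.97$.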
[Figure: two scatter plots for (x, y) pairs, $y = x + n$ (blue) and $y = -\frac{1}{\Delta}x^2 + x + n$ (red).]
The following is the Matlab code generating the figure presented above.

    Delta = 1; MCnum = 1e3;            % Delta value assumed
    x = Delta*rand(1,MCnum);           % x uniform on [0,Delta]
    rho = .9; sigmax = Delta^2/12;     % target rho for the linear model, variance of x
    % rho = sqrt(sigmax/(sigmax+sigman))
    sigman = (1/rho^2-1)*sigmax,       % noise variance giving the chosen rho
    y1 = x + sqrt(sigman)*randn(size(x));                   % linear model
    y2 = -1/Delta*x.^2 + x + sqrt(sigman)*randn(size(x));   % nonlinear model
    plot(x,y1,'bx'); hold on; plot(x,y2,'rx'); hold off;
    title(['scatter plot for (x,y) pairs' char(10) 'y = x + n (blue)'...
           ' and y = -1/\Delta x^2 + x + n (red) ' char(10)...
           ' \rho_{xy_1}=' num2str(rho)...
           ' \rho_{xy_2}=0 ' ]);
    xlabel('x'),ylabel('y');
    corr_coef1 = 1/MCnum*sum((x-mean(x)).*(y1-mean(y1)))/sqrt(var(x)*var(y1)),
    corr_coef2 = 1/MCnum*sum((x-mean(x)).*(y2-mean(y2)))/sqrt(var(x)*var(y2)),

The last two lines of the Matlab code generate an estimate of the correlation coefficient. The estimate is produced by estimating the mean, variance and cross-correlation of the random variables from the experimental data. When the script is run with a large MCnum, corr_coef1 comes out almost equal to 0.9 (as expected) and corr_coef2 comes out close to zero. This result shows that there is indeed very little correlation between $x$ and $y$ for the nonlinear model.
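Appendix: Proof of $|\rho_{xy}| = 1 \iff y = ax + b$

It can be noted from the definition of $\rho_{xy}$ that $\rho_{xy}$ does not depend on the mean values of $x$ and $y$. Without loss of any generality, we assume that $x$ and $y$ are zero mean. Then the result to be proved and the definition of $\rho_{xy}$ reduce to

$$|\rho_{xy}| = 1 \iff y = ax, \qquad \rho_{xy} = \frac{E\{xy\}}{\sqrt{E\{x^2\}E\{y^2\}}}$$

respectively, for zero-mean random variables $x$ and $y$.

i. Proof of $y = ax \Rightarrow |\rho_{xy}| = 1$

If $y = ax$, then $E\{xy\} = a\sigma_x^2$, $E\{y^2\} = a^2\sigma_x^2$ and $E\{x^2\} = \sigma_x^2$; then

$$\rho_{xy} = \frac{a\sigma_x^2}{\sqrt{\sigma_x^2 \cdot a^2\sigma_x^2}} = \frac{a}{|a|} = \pm 1, \quad \text{so } |\rho_{xy}| = 1.$$

ii. Proof of $|\rho_{xy}| = 1 \Rightarrow y = ax$

Let $P(\psi)$ be a quadratic polynomial in $\psi$, defined as follows:

$$P(\psi) = E\{(y - \psi x)^2\} = E\{x^2\}\psi^2 - 2E\{xy\}\psi + E\{y^2\} = a\psi^2 + b\psi + c$$

(Here $a$, $b$, $c$ denote the coefficients of the polynomial, not the parameters of $y = ax + b$.) It is clear that $P(\psi) \ge 0$ for all $\psi$ values. Then the discriminant of $P(\psi)$, i.e. $b^2 - 4ac$, should be either zero or negative valued. The discriminant can be calculated as

$$b^2 - 4ac = 4(E\{xy\})^2 - 4E\{x^2\}E\{y^2\}$$

Since $b^2 - 4ac \le 0$, we have $(E\{xy\})^2 \le E\{x^2\}E\{y^2\}$ and then

$$\rho_{xy}^2 = \frac{(E\{xy\})^2}{E\{x^2\}E\{y^2\}} \le 1,$$

which is the first property.

If $|\rho_{xy}| = 1$, then $(E\{xy\})^2 = E\{x^2\}E\{y^2\}$, i.e. the discriminant is zero. Therefore there is a specific $\psi$ value, called $\psi_0$, for which $P(\psi_0) = 0$. This leads to $P(\psi_0) = E\{(y - \psi_0 x)^2\} = 0$. The last relation shows that $y = \psi_0 x$ (with probability one), i.e. $y = ax$ with $a = \psi_0$.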
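Part i of the proof can also be seen numerically: for samples related exactly by $y = ax + b$, the estimated correlation coefficient is $+1$ or $-1$ according to the sign of $a$. The MATLAB sketch below uses arbitrary, assumed values of $a$ and $b$.

```matlab
% Sketch: an exact linear relation y = a*x + b gives |rho| = 1.
% The values of a, b and the sample count are arbitrary assumptions.
x = randn(1, 1000);
a_values = [2.5, -0.3];                  % one positive and one negative slope
b = 7;
rho_est = zeros(size(a_values));
for k = 1:numel(a_values)
    y = a_values(k)*x + b;
    xc = x - mean(x);  yc = y - mean(y);                  % remove the means
    rho_est(k) = mean(xc.*yc)/sqrt(mean(xc.^2)*mean(yc.^2));
end
disp(rho_est)                            % +1 and -1, up to rounding error
```

Subtracting the sample means first mirrors the zero-mean assumption made at the start of the appendix; the offset $b$ then has no effect on the estimate.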
