Multi Factors Model

Daniel Herlemont

March 31, 2009

Contents

1 Introduction
2 Estimating using Ordinary Least Squares regression
3 Multicollinearity
4 Estimating Fundamental Factor Models by Orthogonal Regression
5 References

1 Introduction

The objective of this practical work is to provide an empirical case study of factor decomposition using historical prices of two stocks (Nokia and Vodafone) and four fundamental factors:

- a broad market index, the New York Stock Exchange (NYSE) composite index,
- an industry factor, a Mutual Communication fund,
- a growth style factor, the Riverside growth fund, and
- a large-cap factor, the AFBA Five Star Large Cap fund.

Source: Carol Alexander, see [1], case study II.1.4.
Download the data at /downloads/alexander-case-study-ii-1-4.csv to your working directory and read it with the command:

quotes=read.csv("alexander-case-study-ii-1-4.csv")

This work can also be performed under Excel (download the package /downloads/matrix.zip).

Use the following code to read the data and plot the prices:

> dates = as.Date(quotes[, 1], "%d/%m/%y")
> prices = quotes[, -1]
> prices = apply(prices, 2, function(p) p/p[1])
> n = ncol(prices)
> matplot(dates, prices, type = "l", col = 1:n, lty = 1:n, xaxt = "n")
> axis.Date(1, dates)
> legend(min(dates), max(prices), colnames(prices), col = 1:n,
+     lty = 1:n, cex = 0.7)
[Figure: normalized prices of Vodafone, Nokia, NYSE.Index, Communications, Growth and Large.Cap, 2001-2006]

Using regression to build a multi-factor model with these factors gives rise to some econometric problems. The main problem is multicollinearity. The proposed solution is to use orthogonal regression.

2 Estimating using Ordinary Least Squares regression

The following commands compute the returns and convert them to a data frame to facilitate regression in R:

> r = apply(prices, 2, function(p) diff(p)/p[-length(p)])
> r = data.frame(r)
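The R idiom diff(p)/p[-length(p)] is terse; for reference, the same simple-return computation in a minimal Python/NumPy sketch (the prices below are made up, not the case-study data):

```python
import numpy as np

# Hypothetical price series (not the case-study data)
p = np.array([100.0, 102.0, 99.96, 101.0])

# Simple (arithmetic) returns: r_t = (P_t - P_{t-1}) / P_{t-1},
# the same computation as diff(p)/p[-length(p)] in R.
# The first return here is (102 - 100)/100 = 0.02.
r = np.diff(p) / p[:-1]
print(r)
```

Each return divides the price change by the previous price, so a series of N prices yields N-1 returns.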
Then we can perform a regression of each stock against the risk factors:

> reg.vodafone = lm(Vodafone ~ NYSE.Index + Communications + Growth +
+     Large.Cap, data = r)
> summary(reg.vodafone)

Call:
lm(formula = Vodafone ~ NYSE.Index + Communications + Growth +
    Large.Cap, data = r)

Residuals:
      Min        1Q    Median        3Q       Max
-0.110331 -0.009820 -0.000308  0.009155  0.131810

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    -7.16e-05   5.32e-04   -0.13   0.8930
NYSE.Index      8.69e-01   1.47e-01    5.91  4.4e-09 ***
Communications  1.44e-01   5.14e-02    2.81   0.0051 **
Growth          2.04e-01   1.19e-01    1.71   0.0869 .
Large.Cap       1.01e-02   1.35e-01    0.07   0.9403

Residual standard error: 0.0194 on 1326 degrees of freedom
Multiple R-squared: 0.348, Adjusted R-squared: 0.346
F-statistic: 177 on 4 and 1326 DF, p-value: <2e-16

> reg.nokia = lm(Nokia ~ NYSE.Index + Communications + Growth +
+     Large.Cap, data = r)
> summary(reg.nokia)

Call:
lm(formula = Nokia ~ NYSE.Index + Communications + Growth + Large.Cap,
    data = r)

Residuals:
      Min        1Q    Median        3Q       Max
-0.175062 -0.009665 -0.000142  0.008843  0.217256

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)     0.000217   0.000620    0.35     0.73
NYSE.Index     -0.260330   0.171240   -1.52     0.13
Communications  0.265789   0.059836    4.44  9.7e-06 ***
Growth          0.209248   0.138489    1.51     0.13
Large.Cap       1.142582   0.157037    7.28  5.9e-13 ***

Residual standard error: 0.0226 on 1326 degrees of freedom
Multiple R-squared: 0.468, Adjusted R-squared: 0.467
F-statistic: 292 on 4 and 1326 DF, p-value: <2e-16

Todo: comment on the results.

Suppose we build a portfolio with $3 million of Nokia and $1 million of Vodafone. Todo: compute the following:

- the volatility of the portfolio,
- the betas of the portfolio with respect to the factors,
- the variance explained by the factors.

Expected results:

> w = c(0.25, 0.75)
> rptf = 0.75 * r[, "Nokia"] + 0.25 * r[, "Vodafone"]
> covfactors = cov(r[, c("NYSE.Index", "Communications", "Growth",
+     "Large.Cap")])
> beta = 0.75 * reg.nokia$coef[-1] + 0.25 * reg.vodafone$coef[-1]
> var.explained = t(beta) %*% covfactors %*% beta
> var.total = sd(rptf)^2
> sigma.total = sd(rptf) * sqrt(252) * 100
> sigma.explained = sqrt(var.explained) * sqrt(252) * 100
- the total variance of the portfolio is 0.00072 and the total (annualized) volatility is 42.6%,
- the betas are:

  NYSE.Index Communications Growth Large.Cap
      0.0220         0.2354 0.2079    0.8595

- the variance explained by the factors is 0.000375 and the explained (annualized) volatility is 30.7%.

Comments?

3 Multicollinearity

Multicollinearity refers to correlation between the explanatory variables in a regression model: if one or more explanatory variables are highly correlated, then it is difficult to estimate their regression coefficients. The multicollinearity problem becomes apparent when the estimated coefficients change considerably as another (collinear) variable is added to the regression. When high multicollinearity is present, confidence intervals for coefficients tend to be very wide and t-statistics tend to be very small. Coefficients will have to be larger in order to be statistically significant, i.e. it will be harder to reject the null when multicollinearity is present. There is no statistical test for multicollinearity, but a useful rule of thumb is that a model will suffer from it if the square of the pairwise correlation between two explanatory variables is greater than the multiple R^2 of the regression.

Todo: perform regressions of Nokia and Vodafone using

- one factor: NYSE.Index,
- two factors: NYSE.Index and Communications,
- three factors: NYSE.Index, Communications and Growth,
- four factors: NYSE.Index, Communications, Growth and Large.Cap.

Explain the results, using the correlation matrix of the factors:

> r.factors = r[, c("NYSE.Index", "Communications", "Growth", "Large.Cap")]
> cor.factors = cor(r.factors)
> cor.factors
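As an aside, the squared-correlation rule of thumb above can be demonstrated numerically; a minimal Python/NumPy sketch, using simulated collinear factors (all names and numbers below are made up, not the case-study data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Two highly collinear "factors" (hypothetical, not the case-study data)
f1 = rng.normal(size=n)
f2 = 0.95 * f1 + 0.05 * rng.normal(size=n)    # almost a copy of f1
y = 0.5 * f1 + 0.5 * f2 + rng.normal(size=n)  # noisy response

# Ordinary least squares of y on an intercept and both factors
X = np.column_stack([np.ones(n), f1, f2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

resid = y - X @ beta
r2 = 1 - resid.var() / y.var()                # multiple R^2 of the regression
pairwise_r2 = np.corrcoef(f1, f2)[0, 1] ** 2  # squared pairwise correlation

# Rule of thumb: multicollinearity is a concern when pairwise_r2 > r2
print(pairwise_r2 > r2)
```

Here the squared pairwise correlation (close to 1) exceeds the regression R^2, so the rule of thumb flags the model as suffering from multicollinearity.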
               NYSE.Index Communications Growth Large.Cap
NYSE.Index          1.000          0.689  0.844     0.909
Communications      0.689          1.000  0.880     0.834
Growth              0.844          0.880  1.000     0.892
Large.Cap           0.909          0.834  0.892     1.000

4 Estimating Fundamental Factor Models by Orthogonal Regression

The best solution to a multicollinearity problem is to apply principal component analysis to the factors and then use the principal components as explanatory variables. We apply principal component analysis to the covariance matrix of the factor returns:

> pca = prcomp(r.factors)
> pca
Standard deviations:
[1] 0.031355 0.008992 0.004167 0.002782

Rotation:
                  PC1     PC2     PC3     PC4
NYSE.Index     0.2588 -0.6099 -0.0966  0.7427
Communications 0.7963  0.5640 -0.1407  0.1674
Growth         0.3915 -0.2687  0.8447 -0.2472
Large.Cap      0.3817 -0.4875 -0.5074 -0.5993

> summary(pca)
Importance of components:
                         PC1   PC2   PC3   PC4
Standard deviation      0.03 0.009 0.004 0.003
Proportion of Variance  0.90 0.074 0.016 0.007
Cumulative Proportion   0.90 0.977 0.993 1.000

> plot(pca)
[Figure: scree plot of the PCA variances]

Alternatively we can use eigen(cov(r.factors)).

Todo: using the first component (and perhaps also the second), compute the variance explained by the components. Conclusions?
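The mechanics of orthogonal regression can be sketched end to end in a few lines. The following Python/NumPy illustration uses simulated factors (all names and numbers are hypothetical, not the case-study data): it builds the principal components by eigendecomposition of the factor covariance, regresses the response on the component scores, and rotates the coefficients back with the loadings matrix, which with all components retained reproduces the ordinary least-squares betas exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated correlated factor returns (hypothetical, not the case-study data)
base = rng.normal(size=n)
F = np.column_stack([base + 0.1 * rng.normal(size=n) for _ in range(4)])
y = F @ np.array([0.8, 0.2, 0.1, 0.3]) + 0.5 * rng.normal(size=n)

# PCA of the centered factors: columns of W are the component loadings
Fc = F - F.mean(axis=0)
eigval, W = np.linalg.eigh(np.cov(Fc, rowvar=False))  # ascending eigenvalues
W = W[:, ::-1]                                        # largest variance first

# Orthogonal regression: regress y on the principal-component scores
scores = Fc @ W
gamma, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), scores]), y,
                            rcond=None)

# Rotating back gives betas in terms of the original factors: beta = W @ gamma
beta_pca = W @ gamma[1:]

# With all components kept, this equals the ordinary least-squares betas
beta_ols, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Fc]), y,
                               rcond=None)
print(np.allclose(beta_pca, beta_ols[1:]))  # True
```

In practice only the first one or two components are kept, which stabilizes the estimates when the factors are collinear, at the cost of a small loss of explained variance.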
Solutions:

> pc1 = pca$rotation[, 1]
> pc2 = pca$rotation[, 2]
> pc3 = pca$rotation[, 3]
> pc4 = pca$rotation[, 4]
> pc1r = apply(r.factors, 1, function(x) sum(x * pc1))
> pc2r = apply(r.factors, 1, function(x) sum(x * pc2))
> pc3r = apply(r.factors, 1, function(x) sum(x * pc3))
> pc4r = apply(r.factors, 1, function(x) sum(x * pc4))
> summary(lm(r[, "Nokia"] ~ pc1r))

Call:
lm(formula = r[, "Nokia"] ~ pc1r)

Residuals:
      Min        1Q    Median        3Q       Max
-0.182175 -0.009307 -0.000295  0.008892  0.201183

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000275   0.000628    0.44     0.66
pc1r        0.662287   0.020043   33.04   <2e-16 ***

Residual standard error: 0.0229 on 1329 degrees of freedom
Multiple R-squared: 0.451, Adjusted R-squared: 0.451
F-statistic: 1.09e+03 on 1 and 1329 DF, p-value: <2e-16

> summary(lm(r[, "Nokia"] ~ pc1r + pc2r))

Call:
lm(formula = r[, "Nokia"] ~ pc1r + pc2r)

Residuals:
      Min        1Q    Median        3Q       Max
-0.181130 -0.009391 -0.000152  0.008528  0.212437
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.000178   0.000624    0.28     0.78
pc1r         0.662287   0.019907   33.27  < 2e-16 ***
pc2r        -0.304551   0.069417   -4.39  1.2e-05 ***

Residual standard error: 0.0228 on 1328 degrees of freedom
Multiple R-squared: 0.459, Adjusted R-squared: 0.458
F-statistic: 563 on 2 and 1328 DF, p-value: <2e-16

> summary(lm(r[, "Vodafone"] ~ pc1r))

Call:
lm(formula = r[, "Vodafone"] ~ pc1r)

Residuals:
      Min        1Q    Median        3Q       Max
-0.112669 -0.010215 -0.000164  0.009569  0.126809

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000140   0.000548    0.26      0.8
pc1r        0.423424   0.017470   24.24   <2e-16 ***

Residual standard error: 0.02 on 1329 degrees of freedom
Multiple R-squared: 0.307, Adjusted R-squared: 0.306
F-statistic: 587 on 1 and 1329 DF, p-value: <2e-16

> summary(lm(r[, "Vodafone"] ~ pc1r + pc2r))

Call:
lm(formula = r[, "Vodafone"] ~ pc1r + pc2r)
Residuals:
      Min        1Q    Median        3Q       Max
-0.111048 -0.009771 -0.000363  0.009244  0.132099

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000022   0.000534   -0.04     0.97
pc1r         0.423424   0.017013   24.89   <2e-16 ***
pc2r        -0.508226   0.059325   -8.57   <2e-16 ***

Residual standard error: 0.0195 on 1328 degrees of freedom
Multiple R-squared: 0.343, Adjusted R-squared: 0.342
F-statistic: 346 on 2 and 1328 DF, p-value: <2e-16

5 References

[1] Alexander, C. Market Risk Analysis: Practical Financial Econometrics. Wiley, 2008.