Factor Analysis - Introduction


1 Factor Analysis - Introduction

Factor analysis is used to draw inferences about unobservable quantities, such as intelligence, musical ability, patriotism, or consumer attitudes, that cannot be measured directly. The goal of factor analysis is to describe the correlations among $p$ measured traits in terms of variation in a few underlying, unobservable factors. Changes across subjects in the values of one or more unobserved factors can affect an entire subset of measured traits and cause them to be highly correlated.

2 Factor Analysis - Example

A marketing firm wishes to determine how consumers choose to patronize certain stores. Customers at various stores were asked to complete a survey with about $p = 80$ questions. Marketing researchers postulate that consumer choices are based on a few underlying factors, such as friendliness of personnel, level of customer service, store atmosphere, product assortment, product quality, and general price level. A factor analysis would use the correlations among responses to the 80 questions to determine whether the questions can be grouped into six sub-groups that reflect variation in the six postulated factors.

3 Orthogonal Factor Model

$\mathbf{X} = (X_1, X_2, \ldots, X_p)'$ is a $p$-dimensional vector of observable traits distributed with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. The factor model postulates that $\mathbf{X}$ can be written as a linear combination of a set of $m$ common factors $F_1, F_2, \ldots, F_m$ and $p$ additional unique factors $\epsilon_1, \epsilon_2, \ldots, \epsilon_p$, so that

$$\begin{aligned}
X_1 - \mu_1 &= \ell_{11}F_1 + \ell_{12}F_2 + \cdots + \ell_{1m}F_m + \epsilon_1 \\
X_2 - \mu_2 &= \ell_{21}F_1 + \ell_{22}F_2 + \cdots + \ell_{2m}F_m + \epsilon_2 \\
&\;\;\vdots \\
X_p - \mu_p &= \ell_{p1}F_1 + \ell_{p2}F_2 + \cdots + \ell_{pm}F_m + \epsilon_p
\end{aligned}$$

where $\ell_{ij}$ is called a factor loading: the loading of the $i$-th trait on the $j$-th factor.

4 Orthogonal Factor Model

In matrix notation:

$$(\mathbf{X} - \boldsymbol{\mu})_{p \times 1} = \mathbf{L}_{p \times m}\,\mathbf{F}_{m \times 1} + \boldsymbol{\epsilon}_{p \times 1},$$

where $\mathbf{L}$ is the matrix of factor loadings and $\mathbf{F}$ is the vector of values for the $m$ unobservable common factors. Notice that the model looks very much like an ordinary linear model. Since we do not observe anything on the right-hand side, however, we cannot do anything with this model unless we impose some more structure. The orthogonal factor model assumes that

$$E(\mathbf{F}) = \mathbf{0}, \quad \mathrm{Var}(\mathbf{F}) = E(\mathbf{FF}') = \mathbf{I}, \quad E(\boldsymbol{\epsilon}) = \mathbf{0}, \quad \mathrm{Var}(\boldsymbol{\epsilon}) = E(\boldsymbol{\epsilon}\boldsymbol{\epsilon}') = \boldsymbol{\Psi} = \mathrm{diag}\{\psi_i\}, \; i = 1, \ldots, p,$$

and that $\mathbf{F}$ and $\boldsymbol{\epsilon}$ are independent, so that $\mathrm{Cov}(\mathbf{F}, \boldsymbol{\epsilon}) = \mathbf{0}$.
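To make the model concrete, the following R sketch simulates data from an orthogonal factor model with hypothetical dimensions ($p = 5$, $m = 2$) and made-up loadings, and checks that the sample covariance matrix approaches $\mathbf{LL}' + \boldsymbol{\Psi}$:

set.seed(1)
p <- 5; m <- 2; n <- 10000
L <- matrix(c(0.8, 0.7, 0.6, 0.1, 0.2,           # made-up loadings, not from
              0.1, 0.2, 0.1, 0.8, 0.7), p, m)    #   any example in these notes
Psi <- diag(c(0.3, 0.4, 0.5, 0.3, 0.4))          # specific variances
F   <- matrix(rnorm(n * m), n, m)                # common factors, Var(F) = I
eps <- matrix(rnorm(n * p), n, p) %*% sqrt(Psi)  # unique factors, Var = Psi
X   <- F %*% t(L) + eps                          # mu = 0 for simplicity
round(cov(X) - (L %*% t(L) + Psi), 2)            # should be close to zero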

5 Orthogonal Factor Model

Assuming that the variances of the factors are all one is not a restriction, as it can be achieved by properly scaling the factor loadings. Assuming that the common factors are uncorrelated and the unique factors are uncorrelated are the defining restrictions of the orthogonal factor model. The assumptions of the orthogonal factor model have implications for the structure of $\boldsymbol{\Sigma}$. If

$$(\mathbf{X} - \boldsymbol{\mu})_{p \times 1} = \mathbf{L}_{p \times m}\,\mathbf{F}_{m \times 1} + \boldsymbol{\epsilon}_{p \times 1},$$

then it follows that

$$(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' = (\mathbf{LF} + \boldsymbol{\epsilon})(\mathbf{LF} + \boldsymbol{\epsilon})' = (\mathbf{LF} + \boldsymbol{\epsilon})\left((\mathbf{LF})' + \boldsymbol{\epsilon}'\right) = \mathbf{LFF}'\mathbf{L}' + \boldsymbol{\epsilon}\mathbf{F}'\mathbf{L}' + \mathbf{LF}\boldsymbol{\epsilon}' + \boldsymbol{\epsilon}\boldsymbol{\epsilon}'.$$

6 Orthogonal Factor Model

Taking expectations of both sides of the equation, we find that

$$\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' = \mathbf{L}E(\mathbf{FF}')\mathbf{L}' + E(\boldsymbol{\epsilon}\mathbf{F}')\mathbf{L}' + \mathbf{L}E(\mathbf{F}\boldsymbol{\epsilon}') + E(\boldsymbol{\epsilon}\boldsymbol{\epsilon}') = \mathbf{LL}' + \boldsymbol{\Psi},$$

since $E(\mathbf{FF}') = \mathrm{Var}(\mathbf{F}) = \mathbf{I}$ and $E(\boldsymbol{\epsilon}\mathbf{F}') = \mathrm{Cov}(\boldsymbol{\epsilon}, \mathbf{F}) = \mathbf{0}$. Also,

$$(\mathbf{X} - \boldsymbol{\mu})\mathbf{F}' = \mathbf{LFF}' + \boldsymbol{\epsilon}\mathbf{F}',$$

so

$$\mathrm{Cov}(\mathbf{X}, \mathbf{F}) = E[(\mathbf{X} - \boldsymbol{\mu})\mathbf{F}'] = \mathbf{L}E(\mathbf{FF}') + E(\boldsymbol{\epsilon}\mathbf{F}') = \mathbf{L}.$$

7 Orthogonal Factor Model

Under the orthogonal factor model, therefore,

$$\mathrm{Var}(X_i) = \sigma_{ii} = \ell_{i1}^2 + \ell_{i2}^2 + \cdots + \ell_{im}^2 + \psi_i$$
$$\mathrm{Cov}(X_i, X_k) = \sigma_{ik} = \ell_{i1}\ell_{k1} + \ell_{i2}\ell_{k2} + \cdots + \ell_{im}\ell_{km}.$$

The portion of the variance $\sigma_{ii}$ that is contributed by the $m$ common factors is called the communality, and the portion that is not explained by the common factors is called the uniqueness (or the specific variance):

$$\sigma_{ii} = \underbrace{\ell_{i1}^2 + \ell_{i2}^2 + \cdots + \ell_{im}^2}_{h_i^2} + \psi_i = h_i^2 + \psi_i.$$
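A minimal sketch of this decomposition in R, reusing the hypothetical L and Psi from the simulation above:

h2  <- rowSums(L^2)                  # communalities h_i^2
psi <- diag(L %*% t(L) + Psi) - h2   # uniquenesses recovered from Sigma
cbind(communality = h2, uniqueness = psi)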

8 Orthogonal Factor Model

The model assumes that the $p(p+1)/2$ variances and covariances of $\mathbf{X}$ can be reproduced from the $pm$ factor loadings and the variances of the $p$ unique factors, a total of $pm + p$ parameters. Factor analysis works best in situations where $m$ is small relative to $p$. If, for example, $p = 12$ and $m = 2$, the 78 elements of $\boldsymbol{\Sigma}$ can be reproduced from $(12)(2) + 12 = 36$ parameters in the factor model. Not all covariance matrices can be factored as $\mathbf{LL}' + \boldsymbol{\Psi}$.

9 Example: No Proper Solution

Let $p = 3$ and $m = 1$, and suppose that the covariance matrix of $X_1, X_2, X_3$ is

$$\boldsymbol{\Sigma} = \begin{pmatrix} 1 & 0.90 & 0.70 \\ 0.90 & 1 & 0.40 \\ 0.70 & 0.40 & 1 \end{pmatrix}.$$

The orthogonal factor model requires that $\boldsymbol{\Sigma} = \mathbf{LL}' + \boldsymbol{\Psi}$. Under the one-factor model assumption, we get:

$$1 = \ell_{11}^2 + \psi_1, \qquad 0.90 = \ell_{11}\ell_{21}, \qquad 0.70 = \ell_{11}\ell_{31},$$
$$1 = \ell_{21}^2 + \psi_2, \qquad 0.40 = \ell_{21}\ell_{31}, \qquad 1 = \ell_{31}^2 + \psi_3.$$

10 Example: No Proper Solution

From the equations above, we see that $\ell_{21} = (0.4/0.7)\,\ell_{11}$. Since $0.90 = \ell_{11}\ell_{21}$, substituting for $\ell_{21}$ implies that $\ell_{11}^2 = 0.90\,(0.7/0.4) = 1.575$, or $\ell_{11} = \pm 1.255$. Here is where the problem starts. Since (by assumption) $\mathrm{Var}(F_1) = 1$ and also $\mathrm{Var}(X_1) = 1$, and since therefore $\mathrm{Cov}(X_1, F_1) = \mathrm{Corr}(X_1, F_1) = \ell_{11}$, we notice that we have a solution inconsistent with the assumptions of the model, because a correlation cannot be larger than 1 or smaller than $-1$.

11 Example: No Proper Solution

Further,

$$1 = \ell_{11}^2 + \psi_1 \quad \Rightarrow \quad \psi_1 = 1 - \ell_{11}^2 = 1 - 1.575 = -0.575,$$

which cannot be true because $\psi_1 = \mathrm{Var}(\epsilon_1)$ must be nonnegative. Thus, for $m = 1$ we get a numerical solution that is not consistent with the model or with the interpretation of its parameters.
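The arithmetic of this example is easy to verify numerically; a short R sketch using the covariance entries above:

Sigma <- matrix(c(1.0, 0.9, 0.7,
                  0.9, 1.0, 0.4,
                  0.7, 0.4, 1.0), nrow = 3)
l11  <- sqrt(Sigma[1,2] * Sigma[1,3] / Sigma[2,3])  # 1.255: an impossible correlation
psi1 <- 1 - l11^2                                   # -0.575: an impossible variance
c(l11 = l11, psi1 = psi1)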

12 Rotation of Factor Loadings

When $m > 1$, there is no unique set of loadings, and thus there is ambiguity associated with the factor model. Consider any $m \times m$ orthogonal matrix $\mathbf{T}$ such that $\mathbf{TT}' = \mathbf{T}'\mathbf{T} = \mathbf{I}$. We can rewrite our model:

$$\mathbf{X} - \boldsymbol{\mu} = \mathbf{LF} + \boldsymbol{\epsilon} = \mathbf{LTT}'\mathbf{F} + \boldsymbol{\epsilon} = \mathbf{L}^*\mathbf{F}^* + \boldsymbol{\epsilon},$$

with $\mathbf{L}^* = \mathbf{LT}$ and $\mathbf{F}^* = \mathbf{T}'\mathbf{F}$.

13 Rotation of Factor Loadings

Since $E(\mathbf{F}^*) = \mathbf{T}'E(\mathbf{F}) = \mathbf{0}$ and $\mathrm{Var}(\mathbf{F}^*) = \mathbf{T}'\mathrm{Var}(\mathbf{F})\mathbf{T} = \mathbf{T}'\mathbf{T} = \mathbf{I}$, it is impossible to distinguish between the loadings $\mathbf{L}$ and $\mathbf{L}^*$ from just a set of data, even though in general they will be different. Notice that the two sets of loadings generate the same covariance matrix $\boldsymbol{\Sigma}$:

$$\boldsymbol{\Sigma} = \mathbf{LL}' + \boldsymbol{\Psi} = \mathbf{LTT}'\mathbf{L}' + \boldsymbol{\Psi} = \mathbf{L}^*\mathbf{L}^{*\prime} + \boldsymbol{\Psi}.$$
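This invariance is easy to demonstrate numerically. A sketch using the hypothetical L from the earlier simulation and an arbitrary 2 x 2 rotation:

theta <- pi / 6
Tmat  <- matrix(c(cos(theta), -sin(theta),
                  sin(theta),  cos(theta)), 2, 2)  # orthogonal: Tmat %*% t(Tmat) = I
Lstar <- L %*% Tmat
max(abs(L %*% t(L) - Lstar %*% t(Lstar)))          # ~0: same implied covariance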

14 Rotation of Factor Loadings

How do we resolve this ambiguity? Typically, we first obtain the matrix of loadings (recognizing that it is not unique) and then rotate it by multiplying by an orthogonal matrix, chosen according to some desired criterion. For example, a varimax rotation of the factor loadings results in a set of loadings with maximum variability. Other criteria for arriving at a unique set of loadings have also been proposed. We consider rotations in more detail in a little while.

15 Estimation in Orthogonal Factor Models

We begin with a sample of size $n$ of $p$-dimensional vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ and (based on our knowledge of the problem) choose a small number $m$ of factors. For the chosen $m$, we want to estimate the factor loading matrix $\mathbf{L}$ and the specific variances in the model

$$\boldsymbol{\Sigma} = \mathbf{LL}' + \boldsymbol{\Psi}.$$

We use $\mathbf{S}$ as an estimator of $\boldsymbol{\Sigma}$ and first investigate whether the correlations among the $p$ variables are large enough to justify the analysis. If $r_{ij} \approx 0$ for all $i \neq j$, the unique factors will dominate, and we will not be able to identify common underlying factors.

16 Estimation in Orthogonal Factor Models

Common estimation methods are:

- The principal component method
- The iterative principal factor method
- Maximum likelihood estimation (assumes normality)

The last two methods focus on using variation in common factors to describe correlations among measured traits. Principal component analysis gives more attention to variances. Estimated factor loadings from any of these methods can be rotated, as explained later, to facilitate interpretation of results.

17 The Principal Component Method

Let $(\lambda_i, \mathbf{e}_i)$ denote the eigenvalues and eigenvectors of $\boldsymbol{\Sigma}$, and recall the spectral decomposition, which establishes that

$$\boldsymbol{\Sigma} = \lambda_1\mathbf{e}_1\mathbf{e}_1' + \lambda_2\mathbf{e}_2\mathbf{e}_2' + \cdots + \lambda_p\mathbf{e}_p\mathbf{e}_p'.$$

Use $\mathbf{L}$ to denote the $p \times p$ matrix with columns equal to $\sqrt{\lambda_i}\,\mathbf{e}_i$, $i = 1, \ldots, p$. Then the spectral decomposition of $\boldsymbol{\Sigma}$ is given by

$$\boldsymbol{\Sigma} = \mathbf{LL}' + \mathbf{0} = \mathbf{LL}',$$

and corresponds to a factor model in which there are as many factors as variables ($m = p$) and where the specific variances $\psi_i = 0$. The loadings on the $j$-th factor are just the coefficients of the $j$-th principal component multiplied by $\sqrt{\lambda_j}$.

18 The Principal Component Method

The principal component solution just described is not interesting, because we have as many factors as we have variables. We really want $m < p$, so that we can explain the covariance structure in the measurements using just a small number of underlying factors. If the last $p - m$ eigenvalues are small, we can ignore the last $p - m$ terms in the spectral decomposition and write

$$\boldsymbol{\Sigma} \approx \mathbf{L}_{p \times m}\mathbf{L}_{m \times p}'.$$

19 The Principal Component Method

The communality of the $i$-th observed variable is the amount of its variance that can be attributed to variation in the $m$ factors:

$$h_i^2 = \sum_{j=1}^m \ell_{ij}^2, \qquad i = 1, 2, \ldots, p.$$

The variances $\psi_i$ of the specific factors can then be taken to be the diagonal elements of the difference matrix $\boldsymbol{\Sigma} - \mathbf{LL}'$, where $\mathbf{L}$ is $p \times m$. That is,

$$\psi_i = \sigma_{ii} - \sum_{j=1}^m \ell_{ij}^2, \qquad i = 1, \ldots, p, \quad \text{or} \quad \boldsymbol{\Psi} = \mathrm{diag}\left(\boldsymbol{\Sigma} - \mathbf{L}_{p \times m}\mathbf{L}_{m \times p}'\right).$$

20 The Principal Component Method

Note that using $m < p$ factors produces an approximation $\mathbf{L}_{p \times m}\mathbf{L}_{m \times p}' + \boldsymbol{\Psi}$ to $\boldsymbol{\Sigma}$ that exactly reproduces the variances of the $p$ measured traits but only approximates the covariances (or correlations). If variables are measured on very different scales, we work with the standardized variables, as we did when extracting PCs. This is equivalent to modeling the correlation matrix $\mathbf{P}$ rather than the covariance matrix $\boldsymbol{\Sigma}$.

21 Principal Component Estimation

To implement the PC method, we use $\mathbf{S}$ to estimate $\boldsymbol{\Sigma}$ (or $\mathbf{R}$ if the observations are standardized) and use $\bar{\mathbf{x}}$ to estimate $\boldsymbol{\mu}$. The principal component estimate of the loading matrix for the $m$-factor model is

$$\tilde{\mathbf{L}} = \left[\sqrt{\hat{\lambda}_1}\,\hat{\mathbf{e}}_1 \;\; \sqrt{\hat{\lambda}_2}\,\hat{\mathbf{e}}_2 \;\; \cdots \;\; \sqrt{\hat{\lambda}_m}\,\hat{\mathbf{e}}_m\right],$$

where $(\hat{\lambda}_i, \hat{\mathbf{e}}_i)$ are the eigenvalues and eigenvectors of $\mathbf{S}$ (or of $\mathbf{R}$ if the observations are standardized).
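A minimal implementation of this estimator in R for a correlation matrix; pc_loadings is our own helper name, not a library function:

pc_loadings <- function(R, m) {
  e <- eigen(R)
  L <- e$vectors[, 1:m, drop = FALSE] %*% diag(sqrt(e$values[1:m]), m)
  psi <- diag(R) - rowSums(L^2)          # specific variances
  list(loadings = L, uniquenesses = psi)
}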

22 Principal Component Estimation

The estimated specific variances are given by the diagonal elements of $\mathbf{S} - \tilde{\mathbf{L}}\tilde{\mathbf{L}}'$, so

$$\tilde{\boldsymbol{\Psi}} = \mathrm{diag}(\tilde{\psi}_1, \tilde{\psi}_2, \ldots, \tilde{\psi}_p), \qquad \tilde{\psi}_i = s_{ii} - \sum_{j=1}^m \tilde{\ell}_{ij}^2.$$

Communalities are estimated as

$$\tilde{h}_i^2 = \tilde{\ell}_{i1}^2 + \tilde{\ell}_{i2}^2 + \cdots + \tilde{\ell}_{im}^2.$$

If variables are standardized, then we substitute $\mathbf{R}$ for $\mathbf{S}$ and 1 for each $s_{ii}$.

23 Principal Component Estimation

In many applications of factor analysis, $m$, the number of factors, is decided prior to the analysis. If we do not know $m$, we can try to determine the best $m$ by looking at the results from fitting the model with different values of $m$. Examine how well the off-diagonal elements of $\mathbf{S}$ (or $\mathbf{R}$) are reproduced by the fitted model $\tilde{\mathbf{L}}\tilde{\mathbf{L}}' + \tilde{\boldsymbol{\Psi}}$; by the definition of $\tilde{\psi}_i$, the diagonal elements of $\mathbf{S}$ are reproduced exactly, but the off-diagonal elements are not. The chosen $m$ is appropriate if the residual matrix $\mathbf{S} - (\tilde{\mathbf{L}}\tilde{\mathbf{L}}' + \tilde{\boldsymbol{\Psi}})$ has small off-diagonal elements.
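Continuing the sketch above, the residual matrix for a candidate m can be inspected as follows, again using the hypothetical pc_loadings helper and a sample correlation matrix R:

fit   <- pc_loadings(R, m = 2)
resid <- R - (fit$loadings %*% t(fit$loadings) + diag(fit$uniquenesses))
max(abs(resid[upper.tri(resid)]))   # largest off-diagonal residual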

24 Principal Component Estimation

Another approach to deciding on $m$ is to examine the contribution of each potential factor to the total variance. The contribution of the $k$-th factor to the sample variance of the $i$-th trait, $s_{ii}$, is estimated as $\tilde{\ell}_{ik}^2$. The contribution of the $k$-th factor to the total sample variance $s_{11} + s_{22} + \cdots + s_{pp}$ is estimated as

$$\tilde{\ell}_{1k}^2 + \tilde{\ell}_{2k}^2 + \cdots + \tilde{\ell}_{pk}^2 = \left(\sqrt{\hat{\lambda}_k}\,\hat{\mathbf{e}}_k\right)'\left(\sqrt{\hat{\lambda}_k}\,\hat{\mathbf{e}}_k\right) = \hat{\lambda}_k.$$

25 Principal Component Estimation

As in the case of PCs, in general,

$$\left(\begin{array}{c}\text{Proportion of total sample}\\ \text{variance due to the } j\text{-th factor}\end{array}\right) = \frac{\hat{\lambda}_j}{s_{11} + s_{22} + \cdots + s_{pp}},$$

or $\hat{\lambda}_j / p$ if factors are extracted from $\mathbf{R}$. Thus, a reasonable number of factors is indicated by the minimum number of PCs that explain a suitably large proportion of the total variance.

26 Strategy for PC Factor Analysis

- First, center the observations (and perhaps standardize them).
- If $m$ is determined by subject-matter knowledge, fit the $m$-factor model by extracting the $p$ PCs from $\mathbf{S}$ or from $\mathbf{R}$, then constructing the $p \times m$ matrix of loadings $\tilde{\mathbf{L}}$ by keeping the PCs associated with the largest $m$ eigenvalues of $\mathbf{S}$ (or $\mathbf{R}$).
- If $m$ is not known a priori, examine the estimated eigenvalues to determine the number of factors that account for a suitably large proportion of the total variance, and examine the off-diagonal elements of the residual matrix.

27 Example: Stock Price Data

The data are weekly gains in stock prices over 100 consecutive weeks for five companies: Allied Chemical, Du Pont, Union Carbide, Exxon, and Texaco. Note that the first three are chemical companies and the last two are oil companies. The data are first standardized, and the sample correlation matrix $\mathbf{R}$ is computed. We fit an $m = 2$ orthogonal factor model.

28 Example: Stock Price Data

The sample correlation matrix $\mathbf{R}$ of the five standardized return series is computed, along with its eigenvalues $\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \cdots \geq \hat{\lambda}_5$ and the eigenvectors $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$ associated with the two largest eigenvalues.

29 Example: Stock Price Data

Recall that the method of principal components yields factor loadings $\tilde{\ell}_{ij} = \sqrt{\hat{\lambda}_j}\,\hat{e}_{ij}$. For each of the five stocks (Allied Chemical, Du Pont, Union Carbide, Exxon, Texaco), this gives loadings $\tilde{\ell}_{i1}$ and $\tilde{\ell}_{i2}$ on the two factors and specific variances $\tilde{\psi}_i = 1 - \tilde{h}_i^2$. The proportions of total variance accounted for by the first and second factors are $\hat{\lambda}_1/p$ and $\hat{\lambda}_2/p$, respectively.

30 Example: Stock Price Data

The first factor appears to be a market-wide effect on weekly stock price gains, whereas the second factor reflects industry-specific effects on chemical and oil stock price returns. The residual matrix $\mathbf{R} - (\tilde{\mathbf{L}}\tilde{\mathbf{L}}' + \tilde{\boldsymbol{\Psi}})$ is then examined. By construction, the diagonal elements of the residual matrix are zero. Are the off-diagonal elements small enough?

31 Example: Stock Price Data

In this example, most of the residuals appear to be small, with the exception of the $\{4, 5\}$ element and perhaps also the $\{1, 2\}$, $\{1, 3\}$, and $\{2, 3\}$ elements. Since the $\{4, 5\}$ element of the residual matrix is negative, we know that $\tilde{\mathbf{L}}\tilde{\mathbf{L}}'$ produces a correlation between Exxon and Texaco that is larger than the observed value. When the off-diagonal elements of the residual matrix are not small, we might consider changing the number $m$ of factors to see whether we can better reproduce the correlations between the variables. In this example, we would probably not change the model.

32 SAS code: Stock Price Data

/* This program is posted as stocks.sas. It performs
   factor analyses for the weekly stock return data from
   Johnson & Wichern. The data are posted as stocks.dat */

data set1;
  infile "c:\stat501\data\stocks.dat";
  input x1-x5;
  label x1 = "ALLIED CHEM"
        x2 = "DUPONT"
        x3 = "UNION CARBIDE"
        x4 = "EXXON"
        x5 = "TEXACO";
run;

33

/* Compute principal components */

proc factor data=set1 method=prin scree nfactors=2
            simple corr ev res msa nplot=2
            out=scorepc outstat=facpc;
  var x1-x5;
run;

The resulting Factor Pattern table lists the loadings of each variable (x1 ALLIED CHEM, x2 DUPONT, x3 UNION CARBIDE, x4 EXXON, x5 TEXACO) on Factor1 and Factor2.

34 R code: Stock Price Data

# This code creates scatter plot matrices and does
# factor analysis for 100 consecutive weeks of gains
# in prices for five stocks. This code is posted as
#
#     stocks.r
#
# The data are posted as stocks.dat
#
# There is one line of data for each week and the
# weekly gains are represented as
#   x1 = ALLIED CHEMICAL
#   x2 = DUPONT
#   x3 = UNION CARBIDE
#   x4 = EXXON
#   x5 = TEXACO

35

stocks <- read.table("c:/stat501/data/stocks.dat", header=F,
                     col.names=c("x1", "x2", "x3", "x4", "x5"))

# Print the first six rows of the data file
stocks[1:6, ]   # (output: first six rows of x1-x5)

36

# Create a scatter plot matrix of the standardized data
stockss <- scale(stocks, center=T, scale=T)
pairs(stockss, labels=c("Allied", "Dupont", "Carbide", "Exxon", "Texaco"),
      panel=function(x, y){panel.smooth(x, y)
                           abline(lsfit(x, y), lty=2)})

37

[Scatter plot matrix of the standardized weekly gains for Allied, Dupont, Carbide, Exxon, and Texaco]

38

# Compute principal components from the correlation matrix
s.cor <- var(stockss)
s.cor   # (output: 5 x 5 correlation matrix of x1-x5)

s.pc <- prcomp(stocks, scale=T, center=T)

# List component coefficients

39

s.pc$rotation   # (output: coefficients of PC1-PC5 for x1-x5)

40

# Estimate proportion of variation explained by each
# principal component, and cumulative proportions
s <- var(s.pc$x)
pvar <- round(diag(s)/sum(diag(s)), digits=6)
pvar    # (output: proportions for PC1-PC5)

cpvar <- round(cumsum(diag(s))/sum(diag(s)), digits=6)
cpvar   # (output: cumulative proportions for PC1-PC5)

41

# Plot component scores
par(fin=c(5,5))
plot(s.pc$x[,1], s.pc$x[,2], xlab="PC1: Overall Market",
     ylab="PC2: Oil vs. Chemical", type="p")

42

[Scatter plot of component scores: PC1 (Overall Market) on the horizontal axis, PC2 (Oil vs. Chemical) on the vertical axis]

43 Principal Factor Method

The principal factor method for estimating factor loadings can be viewed as an iterative modification of the principal component method that allows greater focus on explaining correlations among observed traits. The principal factor approach begins with a guess about the communalities and then iteratively updates those estimates until some convergence criterion is satisfied. (Try several different sets of starting values.) The principal factor method chooses factor loadings that more closely reproduce correlations. Principal components are more heavily influenced by accounting for variances and will explain a higher proportion of the total variance.

44 Principal Factor Method

Consider the estimated model for the correlation matrix,

$$\mathbf{R} \approx \mathbf{L}^*(\mathbf{L}^*)' + \boldsymbol{\Psi}^*.$$

The estimated loading matrix should provide a good approximation to all of the correlations and to part of the variances:

$$\mathbf{L}^*(\mathbf{L}^*)' \approx \mathbf{R} - \boldsymbol{\Psi}^* = \begin{pmatrix} (h_1^*)^2 & r_{12} & r_{13} & \cdots & r_{1p} \\ r_{21} & (h_2^*)^2 & r_{23} & \cdots & r_{2p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ r_{p1} & r_{p2} & r_{p3} & \cdots & (h_p^*)^2 \end{pmatrix},$$

where $(h_i^*)^2 = 1 - \psi_i^*$ is the estimated communality for the $i$-th variable. Find a factor loading matrix $\mathbf{L}^*$ so that $\mathbf{L}^*(\mathbf{L}^*)'$ is a good approximation to $\mathbf{R} - \boldsymbol{\Psi}^*$, rather than trying to make $\mathbf{L}^*(\mathbf{L}^*)'$ a good approximation to $\mathbf{R}$.

45 Principal Factor Method

Start by obtaining initial estimates of the communalities, $(h_1^*)^2, (h_2^*)^2, \ldots, (h_p^*)^2$. The estimated loading matrix should then provide a good approximation to the reduced correlation matrix $\mathbf{R} - \boldsymbol{\Psi}^*$ displayed above, which has $(h_i^*)^2 = 1 - \psi_i^*$ on the diagonal. Use the spectral decomposition of $\mathbf{R} - \boldsymbol{\Psi}^*$ to find that approximation.

46 Principal Factor Method

Use the initial estimates to compute the reduced correlation matrix $\mathbf{R} - \boldsymbol{\Psi}^*$. The estimated loading matrix is obtained from the eigenvalues and eigenvectors of this matrix as

$$\mathbf{L}^* = \left[\sqrt{\lambda_1^*}\,\mathbf{e}_1^* \;\; \cdots \;\; \sqrt{\lambda_m^*}\,\mathbf{e}_m^*\right].$$

Then update the specific variances:

$$\psi_i^* = 1 - \sum_{j=1}^m \lambda_j^*\,(e_{ij}^*)^2.$$

47 Principal Factor Method

Use the updated communalities to re-evaluate the reduced correlation matrix $\mathbf{R} - \boldsymbol{\Psi}^*$ and compute a new estimate of the loading matrix,

$$\mathbf{L}^* = \left[\sqrt{\lambda_1^*}\,\mathbf{e}_1^* \;\; \cdots \;\; \sqrt{\lambda_m^*}\,\mathbf{e}_m^*\right].$$

Repeat this process until it converges.
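A compact sketch of this iteration in R, assuming a correlation matrix R and using squared multiple correlations as starting communalities; this mirrors, but is not identical to, SAS's method=prinit with priors smc:

principal_factor <- function(R, m, tol = 1e-6, max_iter = 100) {
  h2 <- 1 - 1 / diag(solve(R))       # initial communalities: SMCs
  for (iter in 1:max_iter) {
    Rr <- R
    diag(Rr) <- h2                   # reduced correlation matrix R - Psi
    e <- eigen(Rr)
    L <- e$vectors[, 1:m, drop = FALSE] %*%
         diag(sqrt(pmax(e$values[1:m], 0)), m)  # guard negative eigenvalues
    h2_new <- rowSums(L^2)
    if (max(abs(h2_new - h2)) < tol) break
    h2 <- h2_new
  }
  list(loadings = L, communalities = h2_new, uniquenesses = 1 - h2_new)
}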

48 Principal Factor Method

Note that $\mathbf{R} - \boldsymbol{\Psi}^*$ is generally not positive definite, and some eigenvalues can be negative. The results are sensitive to the choice of the number of factors $m$. If $m$ is too large, some communalities can be larger than one, which would imply that variation in factor values accounts for more than 100 percent of the variation in the measured traits. SAS has two options to deal with this:

- HEYWOOD: Set any estimated communality larger than one equal to one and continue iterations with the remaining variables.
- ULTRAHEYWOOD: Continue iterations with all of the variables and hope that the iterations eventually bring you back into the allowable parameter space.

49 Stock Price Example

The estimated loadings from the principal factor method give, for each of the five stocks, loadings $\ell_{i1}^*$ and $\ell_{i2}^*$ on the two factors, specific variances $\psi_i^* = 1 - (h_i^*)^2$, and communalities $(h_i^*)^2$. The proportion of total variance accounted for by the first factor is $\lambda_1^*/p = 0.45$, with a smaller proportion for the second factor.

50 Stock Price Example

The interpretations are similar. The first factor appears to be a market-wide effect on weekly stock price gains, whereas the second factor reflects industry-specific effects on chemical and oil stock price returns. The loadings tend to be smaller than those obtained from principal component analysis. Why?

51 Stock Price Example

The residual matrix $\mathbf{R} - [\mathbf{L}^*(\mathbf{L}^*)' + \boldsymbol{\Psi}^*]$ is computed as before. Since the principal factor method gives greater emphasis to describing correlations, the off-diagonal elements of this residual matrix tend to be smaller than the corresponding elements for principal component estimation.

52

/* SAS code for the principal factor method */

proc factor data=set1 method=prinit scree nfactors=2
            simple corr ev res msa nplot=2
            out=scorepf outstat=facpf;
  var x1-x5;
  priors smc;
run;

To obtain the initial communality for the $i$-th variable, smc in the priors statement uses the $R^2$ value for the regression of the $i$-th variable on the other $p - 1$ variables in the analysis.

53 Maximum Likelihood Estimation

To implement the ML method, we need additional assumptions about the distributions of the $p$-dimensional vector $\mathbf{X}_j$ and the $m$-dimensional vector $\mathbf{F}_j$:

$$\mathbf{X}_j \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}), \qquad \mathbf{F}_j \sim N_m(\mathbf{0}, \mathbf{I}_m), \qquad \boldsymbol{\epsilon}_j \sim N_p(\mathbf{0}, \boldsymbol{\Psi}_{p \times p}),$$

where $\mathbf{X}_j - \boldsymbol{\mu} = \mathbf{L}\mathbf{F}_j + \boldsymbol{\epsilon}_j$, $\boldsymbol{\Sigma} = \mathbf{LL}' + \boldsymbol{\Psi}$, $\mathbf{F}_j$ is independent of $\boldsymbol{\epsilon}_j$, and $\boldsymbol{\Psi}$ is a diagonal matrix. Because the $\mathbf{L}$ that maximizes the likelihood for this model is not unique, we need another restriction that leads to a unique maximum of the likelihood function:

$$\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L} = \boldsymbol{\Delta},$$

where $\boldsymbol{\Delta}$ is a diagonal matrix.

54 Maximum Likelihood Estimation

Maximizing the likelihood function with respect to $(\boldsymbol{\mu}, \mathbf{L}, \boldsymbol{\Psi})$ is not easy, because the likelihood depends on the parameters in a very non-linear fashion. Recall that $(\mathbf{L}, \boldsymbol{\Psi})$ enter the likelihood through $\boldsymbol{\Sigma}$, which in turn enters the likelihood as an inverse matrix in a quadratic form and also through its determinant. Efficient numerical algorithms exist to maximize the likelihood iteratively and obtain ML estimates

$$\hat{\mathbf{L}}_{p \times m} = \{\hat{\ell}_{ij}\}, \qquad \hat{\boldsymbol{\Psi}} = \mathrm{diag}\{\hat{\psi}_i\}, \qquad \hat{\boldsymbol{\mu}} = \bar{\mathbf{x}}.$$

55 Maximum Likelihood Estimation

The MLEs of the communalities for the $p$ variables (with $m$ factors) are

$$\hat{h}_i^2 = \sum_{j=1}^m \hat{\ell}_{ij}^2, \qquad i = 1, \ldots, p,$$

and the proportion of the total variance accounted for by the $j$-th factor is given by

$$\frac{\sum_{i=1}^p \hat{\ell}_{ij}^2}{\mathrm{trace}(\mathbf{S})}$$

if using $\mathbf{S}$, or the same numerator divided by $p$ if using $\mathbf{R}$.

56 Maximum Likelihood Estimation: Stock Prices

The ML results give, for each of the five stocks (Allied Chemical, Du Pont, Union Carbide, Exxon, Texaco), the loadings on factors 1 and 2 and the specific variances, together with the proportion of variance explained by each factor.

57 Maximum Likelihood Estimation: Stock Prices

The interpretation of the two factors remains more or less the same (although Exxon hardly loads on the second factor now). What has improved considerably is the residual matrix $\mathbf{R} - (\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}})$, since most of its entries are now essentially zero.

58 Likelihood Ratio Test for the Number of Factors

We wish to test whether the $m$-factor model appropriately describes the covariances among the $p$ variables. We test

$$H_0: \boldsymbol{\Sigma}_{p \times p} = \mathbf{L}_{p \times m}\mathbf{L}_{m \times p}' + \boldsymbol{\Psi}_{p \times p}$$

versus

$$H_a: \boldsymbol{\Sigma} \text{ is any positive definite matrix}.$$

A likelihood ratio test of $H_0$ can be constructed as the ratio of the maximum of the likelihood function under $H_0$ to the maximum of the likelihood function under no restrictions (i.e., under $H_a$).

59 Likelihood Ratio Test for the Number of Factors

Under $H_a$, the multivariate normal likelihood is maximized at $\hat{\boldsymbol{\mu}} = \bar{\mathbf{x}}$ and

$$\hat{\boldsymbol{\Sigma}} = \frac{1}{n}\sum_{i=1}^n (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})'.$$

Under $H_0$, the multivariate normal likelihood is maximized at $\hat{\boldsymbol{\mu}} = \bar{\mathbf{x}}$ and $\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}$, the MLE of $\boldsymbol{\Sigma}$ for the orthogonal factor model with $m$ factors.

60 Likelihood Ratio Test for the Number of Factors

It is easy to show that the maximized likelihood under $H_0$ is proportional to

$$\left|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}\right|^{-n/2} \exp\left(-\frac{n}{2}\,\mathrm{tr}\!\left[(\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}})^{-1}\hat{\boldsymbol{\Sigma}}\right]\right).$$

Then the log-likelihood ratio statistic for testing $H_0$ is

$$-2\ln\Lambda = n\ln\left(\frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\hat{\boldsymbol{\Sigma}}|}\right) + n\left[\mathrm{tr}\!\left((\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}})^{-1}\hat{\boldsymbol{\Sigma}}\right) - p\right].$$

Since the second term in the expression above can be shown to be zero, the test statistic simplifies to

$$-2\ln\Lambda = n\ln\left(\frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\hat{\boldsymbol{\Sigma}}|}\right).$$

61 Likelihood Ratio Test for the Number of Factors

The degrees of freedom for the large-sample chi-square approximation to this test statistic are

$$\frac{1}{2}\left[(p - m)^2 - p - m\right].$$

Where does this come from? Under $H_a$ we estimate $p$ means and $p(p+1)/2$ elements of $\boldsymbol{\Sigma}$. Under $H_0$, we estimate $p$ means, $p$ diagonal elements of $\boldsymbol{\Psi}$, and $mp$ elements of $\mathbf{L}$. An additional set of $m(m-1)/2$ restrictions is imposed on the $mp$ elements of $\mathbf{L}$ by the identifiability restriction $\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L} = \boldsymbol{\Delta}$.

62 Likelihood Ratio Test for the Number of Factors

Putting it all together, we have

$$df = \left[p + \frac{p(p+1)}{2}\right] - \left[p + pm + p - \frac{m(m-1)}{2}\right] = \frac{1}{2}\left[(p - m)^2 - p - m\right]$$

degrees of freedom. Bartlett showed that the $\chi^2$ approximation to the sampling distribution of $-2\ln\Lambda$ can be improved if the test statistic is computed as

$$-2\ln\Lambda = \left(n - 1 - \frac{2p + 4m + 5}{6}\right)\ln\frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\hat{\boldsymbol{\Sigma}}|}.$$

63 Likelihood Ratio Test for the Number of Factors

We reject $H_0$ at level $\alpha$ if

$$-2\ln\Lambda = \left(n - 1 - \frac{2p + 4m + 5}{6}\right)\ln\frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\hat{\boldsymbol{\Sigma}}|} > \chi^2_{df,\alpha},$$

with $df = \frac{1}{2}[(p - m)^2 - p - m]$, for large $n$ and large $n - p$. To have $df > 0$, we must have

$$m < \frac{1}{2}\left(2p + 1 - \sqrt{8p + 1}\right).$$

If the data are not a random sample from a multivariate normal distribution, this test tends to indicate the need for too many factors.
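A sketch of the Bartlett-corrected statistic in R; factor_lrt is our own helper name, taking the ML estimates and the unrestricted estimate of Sigma (or R):

factor_lrt <- function(Lhat, Psihat, Sigmahat, n) {
  p <- nrow(Lhat); m <- ncol(Lhat)
  stat <- (n - 1 - (2 * p + 4 * m + 5) / 6) *
          log(det(Lhat %*% t(Lhat) + Psihat) / det(Sigmahat))
  df <- ((p - m)^2 - p - m) / 2
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df, lower.tail = FALSE))
}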

64 Stock Price Data

We carry out the log-likelihood ratio test at level $\alpha = 0.05$ to see whether the two-factor model is appropriate for these data. We estimated the factor loadings and the unique factors using $\mathbf{R}$ rather than $\mathbf{S}$, but the exact same test statistic (with $\mathbf{R}$ in place of $\hat{\boldsymbol{\Sigma}}$) can be used for the test. Since we have evaluated $\hat{\mathbf{L}}$, $\hat{\boldsymbol{\Psi}}$, and $\mathbf{R}$, we can easily compute

$$\frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\mathbf{R}|} = 1.0065.$$

65 Stock Price Data

Then the test statistic is

$$\left[100 - 1 - \frac{2(5) + 4(2) + 5}{6}\right]\ln(1.0065) = 0.58.$$

Since $0.58 < \chi^2_{1,0.05} = 3.84$, we fail to reject $H_0$ and conclude that the two-factor model adequately describes the correlations among the five stock prices (p-value = 0.4482). Another way to say it is that the data do not contradict the two-factor model.

66 Stock Price Data

We can also test the null hypothesis that one factor is sufficient, using

$$\left[100 - 1 - \frac{2(5) + 4(1) + 5}{6}\right]\ln\frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\hat{\boldsymbol{\Sigma}}|}$$

with 5 degrees of freedom. Since the resulting p-value is very small, we reject the null hypothesis and conclude that a one-factor model is not sufficient to describe the correlations among the weekly returns for the five stocks.

67 Maximum Likelihood Estimation: Stock Prices

We implemented the ML method for the stock price example. The same SAS program can be used, but now with method=ml instead of prinit in the proc factor statement.

proc factor data=set1 method=ml scree nfactors=2
            simple corr ev res msa nplot=2
            out=scorepf outstat=facpf;
  var x1-x5;
  priors smc;
run;

68 Initial Factor Method: Maximum Likelihood

The SAS output reports the Prior Communality Estimates (SMC) for x1-x5 and the Preliminary Eigenvalues (total, average, differences, proportions, and cumulative proportions), together with the note that 2 factors will be retained by the NFACTOR criterion.

69

The iteration history lists, for each iteration, the criterion, ridge value, maximum change, and current communalities, ending with: Convergence criterion satisfied.

70

Significance Tests Based on 100 Observations

Test                               DF   Chi-Square   Pr > ChiSq
H0: No common factors                                   <.0001
HA: At least one common factor
H0: 2 Factors are sufficient
HA: More factors are needed

The output also reports the Chi-Square without Bartlett's Correction, Akaike's Information Criterion, Schwarz's Bayesian Criterion, Tucker and Lewis's Reliability Coefficient, and the squared canonical correlations for Factor1 and Factor2.

71

The output continues with the Eigenvalues of the Weighted Reduced Correlation Matrix (total, average, differences, proportions, cumulative proportions) and the Factor Pattern, listing the Factor1 and Factor2 loadings for x5 TEXACO, x2 DUPONT, x1 ALLIED CHEM, x3 UNION CARBIDE, and x4 EXXON.

72

Variance Explained by Each Factor: weighted and unweighted values for Factor1 and Factor2.

Final Communality Estimates and Variable Weights: total communality (weighted and unweighted), plus the communality and weight for each of x1-x5.

73

Residual Correlations With Uniqueness on the Diagonal: a 5 x 5 table for x1 ALLIED CHEM, x2 DUPONT, x3 UNION CARBIDE, x4 EXXON, and x5 TEXACO, followed by the Root Mean Square Off-Diagonal Residuals, overall and by variable.

74 Maximum Likelihood Estimation with R

stocks.fac <- factanal(stocks, factors=2, method="mle",
                       scale=T, center=T)
stocks.fac

Call:
factanal(x = stocks, factors = 2, method = "mle", scale = T, center = T)

Uniquenesses: reported for x1 through x5.

75

Loadings: Factor1 and Factor2 loadings for x1-x5, followed by the SS loadings, Proportion Var, and Cumulative Var for each factor.

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 0.58 on 1 degree of freedom.
The p-value is 0.446

76 Measure of Sampling Adequacy

In constructing a survey or another type of instrument to examine factors such as political attitudes, social constructs, mental abilities, or consumer confidence that cannot be measured directly, it is common to include a set of questions or items that should vary together as the values of the factor vary across subjects. In a job attitude assessment, for example, you might include questions of the following type in the survey:

1. How well do you like your job?
2. How eager are you to go to work in the morning?
3. How professionally satisfied are you?
4. What is your level of frustration with assignment to meaningless tasks?

77 Measure of Sampling Adequacy

If the responses to these and other questions have strong positive or negative correlations, you may be able to identify a job satisfaction factor. In marketing, education, and behavioral research, the existence of highly correlated responses to a battery of questions is used to defend the notion that an important factor exists and to justify the use of the survey form or measurement instrument. Measures of sampling adequacy are based on correlations and partial correlations among responses to different questions (or items).

78 Measure of Sampling Adequacy

Kaiser (1970, Psychometrika) proposed a statistic called the Measure of Sampling Adequacy (MSA). The MSA compares the sizes of the pairwise correlations to the sizes of the partial correlations between all pairs of variables:

$$MSA = \frac{\sum_j \sum_{k \neq j} r_{jk}^2}{\sum_j \sum_{k \neq j} r_{jk}^2 + \sum_j \sum_{k \neq j} q_{jk}^2},$$

where $r_{jk}$ is the marginal sample correlation between variables $j$ and $k$, and $q_{jk}$ is the partial correlation between the two variables after accounting for all other variables in the set.

79 Measure of Sampling Adequacy

If the $r$'s are relatively large and the $q$'s are relatively small, the MSA tends to 1. This indicates that more than two variables are changing together as the values of the factor vary across subjects. Side note: if $p = 3$, for example, $q_{12}$ is given by

$$q_{12} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}.$$
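A compact way to compute the overall MSA in R, assuming a nonsingular sample correlation matrix R; the partial correlations are obtained from the inverse correlation matrix, as in the usual KMO computation (msa is our own helper name):

msa <- function(R) {
  Q <- -cov2cor(solve(R))   # partial correlations in the off-diagonal elements
  diag(Q) <- 0
  R0 <- R
  diag(R0) <- 0
  sum(R0^2) / (sum(R0^2) + sum(Q^2))
}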

80 Measure of Sampling Adequacy

The MSA can take values between 0 and 1, and Kaiser proposed the following guidelines:

MSA range    Interpretation
0.9 to 1.0   Marvelous data
0.8 to 0.9   Meritorious data
0.7 to 0.8   Middling
0.6 to 0.7   Mediocre data
0.5 to 0.6   Miserable
0.0 to 0.5   Unacceptable

SAS computes MSAs for each of the $p$ variables, along with the overall value:

Kaiser's Measure of Sampling Adequacy: Overall MSA, plus one MSA for each of x1-x5.

81 Tucker and Lewis Reliability Coefficient

Tucker and Lewis (1973), Psychometrika, 38. From the MLEs for $m$ factors, compute:

- $\hat{\mathbf{L}}_m$, the estimates of the factor loadings
- $\hat{\boldsymbol{\Psi}}_m = \mathrm{diag}(\mathbf{R} - \hat{\mathbf{L}}_m\hat{\mathbf{L}}_m')$
- $\mathbf{G}_m = \hat{\boldsymbol{\Psi}}_m^{-1/2}(\mathbf{R} - \hat{\mathbf{L}}_m\hat{\mathbf{L}}_m')\hat{\boldsymbol{\Psi}}_m^{-1/2}$

82 Tucker and Lewis Reliability Coefficient

The element $g_{m(ij)}$ of $\mathbf{G}_m$ is called the partial correlation between variables $X_i$ and $X_j$ controlling for the $m$ common factors. The sum of the squared partial correlations controlling for the $m$ common factors is

$$F_m = \sum_{i<j} g_{m(ij)}^2,$$

and the corresponding mean square is

$$M_m = \frac{F_m}{df_m},$$

where $df_m = \frac{1}{2}[(p - m)^2 - p - m]$ are the degrees of freedom for the likelihood ratio test of the null hypothesis that $m$ factors are sufficient.

83 Tucker and Lewis Reliability Coefficient

The mean square for the model with zero common factors is

$$M_0 = \frac{\sum_{i<j} r_{ij}^2}{p(p-1)/2}.$$

The reliability coefficient is

$$\hat{\rho}_m = \frac{M_0 - M_m}{M_0 - n_m^{-1}},$$

where $n_m = (n - 1) - \frac{2p + 5}{6} - \frac{2m}{3}$ is a Bartlett correction factor. For the weekly stock gains data, SAS reports Tucker and Lewis's Reliability Coefficient in the output.

84 Cronbach's Alpha

A set of observed variables $\mathbf{X} = (X_1, X_2, \ldots, X_p)$ that all measure the same latent trait should have high positive pairwise correlations. Define

$$\bar{r} = \frac{\frac{1}{p(p-1)/2}\sum_{i<j}\widehat{\mathrm{Cov}}(X_i, X_j)}{\frac{1}{p}\sum_{i=1}^p \widehat{\mathrm{Var}}(X_i)} = \frac{2\sum_{i<j} S_{ij}}{(p-1)\sum_{i=1}^p S_{ii}}.$$

Cronbach's alpha is

$$\alpha = \frac{p\,\bar{r}}{1 + (p-1)\bar{r}} = \frac{p}{p-1}\left[1 - \frac{\sum_i \mathrm{Var}(X_i)}{\mathrm{Var}\left(\sum_i X_i\right)}\right].$$

85 Cronbach's Alpha

For standardized variables, $\sum_{i=1}^p \mathrm{Var}(X_i) = p$. If $\rho_{ij} = 1$ for all $i \neq j$, then

$$\mathrm{Var}\left(\sum_i X_i\right) = \sum_i \mathrm{Var}(X_i) + 2\sum_{i<j}\sqrt{\mathrm{Var}(X_i)\,\mathrm{Var}(X_j)} = \left[\sum_i \sqrt{\mathrm{Var}(X_i)}\right]^2 = p^2.$$

86 Cronbach's Alpha

In the extreme case where all pairwise correlations are 1, we have

$$\frac{\sum_i \mathrm{Var}(X_i)}{\mathrm{Var}\left(\sum_i X_i\right)} = \frac{p}{p^2} = \frac{1}{p} \qquad \text{and} \qquad \alpha = \frac{p}{p-1}\left[1 - \frac{1}{p}\right] = 1.$$

When $\bar{r} = 0$, then $\alpha = 0$. Be sure that the scores for all items are oriented in the same direction, so that all correlations are positive.
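A direct R translation of the variance-ratio form of alpha, for a data matrix X with items in columns (cronbach_alpha is our own helper name):

cronbach_alpha <- function(X) {
  p <- ncol(X)
  (p / (p - 1)) * (1 - sum(apply(X, 2, var)) / var(rowSums(X)))
}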

87 Factor Rotation

As mentioned earlier, multiplying a matrix of factor loadings by any orthogonal matrix leads to the same approximation to the covariance (or correlation) matrix. This means that, mathematically, it does not matter whether we estimate the loadings as $\hat{\mathbf{L}}$ or as $\hat{\mathbf{L}}^* = \hat{\mathbf{L}}\mathbf{T}$, where $\mathbf{TT}' = \mathbf{T}'\mathbf{T} = \mathbf{I}$. The estimated residual matrix also remains unchanged:

$$\mathbf{S}_n - \hat{\mathbf{L}}\hat{\mathbf{L}}' - \hat{\boldsymbol{\Psi}} = \mathbf{S}_n - \hat{\mathbf{L}}\mathbf{TT}'\hat{\mathbf{L}}' - \hat{\boldsymbol{\Psi}} = \mathbf{S}_n - \hat{\mathbf{L}}^*\hat{\mathbf{L}}^{*\prime} - \hat{\boldsymbol{\Psi}}.$$

The specific variances $\hat{\psi}_i$, and therefore the communalities, also remain unchanged.

88 Factor Rotation

Since only the loadings change under rotation, we rotate factors to see whether we can better interpret the results. There are many choices for a rotation matrix $\mathbf{T}$; to choose one, we first establish a mathematical criterion and then find the $\mathbf{T}$ that best satisfies it. One possible objective is to have each of the $p$ variables load highly on only one factor and have moderate to negligible loadings on all other factors. It is not always possible to achieve this type of result. This objective can be pursued with a varimax rotation.

89 Varimax Rotation

Define $\tilde{\ell}_{ij}^* = \hat{\ell}_{ij}^*/\hat{h}_i$ as the scaled loading of the $i$-th variable on the $j$-th rotated factor. Compute the variance of the squared scaled loadings for the $j$-th rotated factor:

$$V_j = \frac{1}{p}\sum_{i=1}^p\left[(\tilde{\ell}_{ij}^*)^2 - \frac{1}{p}\sum_{k=1}^p(\tilde{\ell}_{kj}^*)^2\right]^2 = \frac{1}{p}\sum_{i=1}^p(\tilde{\ell}_{ij}^*)^4 - \frac{1}{p^2}\left[\sum_{k=1}^p(\tilde{\ell}_{kj}^*)^2\right]^2.$$

The varimax procedure finds the orthogonal transformation of the loading matrix that maximizes the sum of these variances across all $m$ rotated factors. After rotation, each of the $p$ variables should load highly on at most one of the rotated factors.
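In R, this criterion is available through stats::varimax; a sketch, assuming the stocks data frame from the earlier code:

fit <- factanal(stocks, factors = 2, rotation = "none")
vm  <- varimax(loadings(fit))   # Kaiser-normalized varimax by default
vm$loadings                     # rotated loadings
vm$rotmat                       # the orthogonal rotation matrix T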

90 Maximum Likelihood Estimation: Stock Prices

Varimax rotation of the $m = 2$ factor solution: the output lists, for each of the five stocks, the MLE loadings on factors 1 and 2 beside the rotated loadings on factors 1 and 2, together with the proportion of variance explained by each factor.

91 Quartimax Rotation

The varimax rotation will destroy an overall factor. The quartimax rotation tries to:

1. Preserve an overall factor, such that each of the $p$ variables has a high loading on that factor.
2. Create other factors such that each of the $p$ variables has a high loading on at most one other factor.

92 Quartimax Rotation

Define $\tilde{\ell}_{ij}^* = \hat{\ell}_{ij}^*/\hat{h}_i$ as the scaled loading of the $i$-th variable on the $j$-th rotated factor. Compute the variance of the squared scaled loadings for the $i$-th variable:

$$W_i = \frac{1}{m}\sum_{j=1}^m\left[(\tilde{\ell}_{ij}^*)^2 - \frac{1}{m}\sum_{k=1}^m(\tilde{\ell}_{ik}^*)^2\right]^2 = \frac{1}{m}\sum_{j=1}^m(\tilde{\ell}_{ij}^*)^4 - \frac{1}{m^2}\left[\sum_{k=1}^m(\tilde{\ell}_{ik}^*)^2\right]^2.$$

The quartimax procedure finds the orthogonal transformation of the loading matrix that maximizes the sum of these variances across all $p$ variables.

93 Maximum Likelihood Estimation: Stock Prices

Quartimax rotation of the $m = 2$ factor solution: as before, the output lists the MLE loadings beside the quartimax-rotated loadings for the five stocks, with the proportion of variance for each factor.

94 PROMAX Transformation

The varimax and quartimax rotations produce uncorrelated factors. PROMAX is a non-orthogonal (oblique) transformation that:

1. is not a rotation, and
2. can produce correlated factors.

95 PROMAX Transformation

1. First perform a varimax rotation to obtain loadings $\mathbf{L}^*$.
2. Construct another $p \times m$ matrix $\mathbf{Q}$ such that

$$q_{ij} = \begin{cases} |\ell_{ij}^*|^{k-1}\,\ell_{ij}^* & \text{for } \ell_{ij}^* \neq 0 \\ 0 & \text{for } \ell_{ij}^* = 0, \end{cases}$$

where $k > 1$ is selected by trial and error, usually $k < 5$.
3. Find a matrix $\mathbf{U}$ such that each column of $\mathbf{L}^*\mathbf{U}$ is close to the corresponding column of $\mathbf{Q}$. Choose the $j$-th column of $\mathbf{U}$ to minimize $(\mathbf{q}_j - \mathbf{L}^*\mathbf{u}_j)'(\mathbf{q}_j - \mathbf{L}^*\mathbf{u}_j)$. This yields

$$\mathbf{U} = \left[(\mathbf{L}^*)'(\mathbf{L}^*)\right]^{-1}(\mathbf{L}^*)'\mathbf{Q}.$$

96 PROMAX Transformation

4. Rescale $\mathbf{U}$ so that the transformed factors have unit variance: compute $\mathbf{D}^2 = \mathrm{diag}[(\mathbf{U}'\mathbf{U})^{-1}]$ and $\mathbf{M} = \mathbf{UD}$.
5. The PROMAX factors are obtained from

$$\mathbf{L}^*\mathbf{F}^* = \mathbf{L}^*\mathbf{M}\mathbf{M}^{-1}\mathbf{F}^* = \mathbf{L}_P\mathbf{F}_P.$$

The PROMAX transformation yields factors with loadings $\mathbf{L}_P = \mathbf{L}^*\mathbf{M}$.
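A sketch of these steps using base R's promax, which implements this Procrustean construction; here we assume promax's rotmat plays the role of M, so the factor correlations follow the $(\mathbf{M}'\mathbf{M})^{-1}$ formula on the next slide:

pm <- promax(vm$loadings, m = 3)   # power k = 3, matching the SAS run below
pm$loadings                        # oblique PROMAX loadings L_P
solve(crossprod(pm$rotmat))        # factor correlations, (M'M)^{-1}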

97 PROMAX Transformation

The correlation matrix of the transformed factors is $(\mathbf{M}'\mathbf{M})^{-1}$. Derivation:

$$\mathrm{Var}(\mathbf{F}_P) = \mathrm{Var}(\mathbf{M}^{-1}\mathbf{F}^*) = \mathbf{M}^{-1}\mathrm{Var}(\mathbf{F}^*)(\mathbf{M}^{-1})' = \mathbf{M}^{-1}\mathbf{I}(\mathbf{M}^{-1})' = (\mathbf{M}'\mathbf{M})^{-1}.$$

98 Maximum Likelihood Estimation: Stock Prices

PROMAX transformation of the $m = 2$ factor solution: the output lists the MLE loadings beside the PROMAX loadings for the five stocks, along with the estimated correlation between the two factors.

99

/* Varimax Rotation */

proc factor data=set1 method=ml nfactors=2 res msa
            mineigen=0 maxiter=75 conv=.0001
            outstat=facml2 rotate=varimax reorder;
  var x1-x5;
  priors smc;
run;

proc factor data=set1 method=prin nfactors=3 mineigen=0
            res msa maxiter=75 conv=.0001
            outstat=facml3 rotate=varimax reorder;
  var x1-x5;
  priors smc;
run;

100

The FACTOR Procedure
Rotation Method: Varimax

The output shows the Orthogonal Transformation Matrix and the Rotated Factor Pattern, listing the Factor1 and Factor2 loadings for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x5 TEXACO, and x4 EXXON.

101

Variance Explained by Each Factor: weighted and unweighted values for Factor1 and Factor2.

Final Communality Estimates and Variable Weights: total communality (weighted and unweighted), plus the communality and weight for each of x1-x5.

102

/* Apply other rotations to the two factor solution */

proc factor data=facml2 nfactors=2 conv=.0001
            rotate=quartimax reorder;
  var x1-x5;
  priors smc;
run;

103

The FACTOR Procedure
Rotation Method: Quartimax

The output shows the Orthogonal Transformation Matrix and the Rotated Factor Pattern, listing the Factor1 and Factor2 loadings for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x4 EXXON, and x5 TEXACO.

104

Variance Explained by Each Factor: weighted and unweighted values for Factor1 and Factor2.

Final Communality Estimates and Variable Weights: total communality (weighted and unweighted), plus the communality and weight for each of x1-x5.

105

proc factor data=facml2 nfactors=2 conv=.0001
            rotate=promax reorder score;
  var x1-x5;
  priors smc;
run;

106

The FACTOR Procedure
Prerotation Method: Varimax

The output shows the Orthogonal Transformation Matrix and the Rotated Factor Pattern for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x5 TEXACO, and x4 EXXON.

107

Variance Explained by Each Factor: weighted and unweighted values for Factor1 and Factor2.

Final Communality Estimates and Variable Weights: total communality (weighted and unweighted), plus the communality and weight for each of x1-x5.

108

The FACTOR Procedure
Rotation Method: Promax (power = 3)

The output shows the Target Matrix for the Procrustean Transformation (Factor1 and Factor2 values for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x5 TEXACO, x4 EXXON) and the Procrustean Transformation Matrix.

109

The output continues with the Normalized Oblique Transformation Matrix and the Inter-Factor Correlations between Factor1 and Factor2.

110

Rotated Factor Pattern (Standardized Regression Coefficients): Factor1 and Factor2 coefficients for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x5 TEXACO, and x4 EXXON.

111

Reference Structure (Semipartial Correlations): Factor1 and Factor2 values for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x5 TEXACO, and x4 EXXON, followed by the Variance Explained by Each Factor Eliminating Other Factors (weighted and unweighted).

112

Factor Structure (Correlations): Factor1 and Factor2 correlations for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x5 TEXACO, and x4 EXXON, followed by the Variance Explained by Each Factor Ignoring Other Factors (weighted and unweighted).

113

Final Communality Estimates and Variable Weights: total communality (weighted and unweighted), plus the communality and weight for each of x1-x5.

114

Scoring Coefficients Estimated by Regression: the Squared Multiple Correlations of the Variables with Each Factor (Factor1, Factor2) and the Standardized Scoring Coefficients for x2 DUPONT, x3 UNION CARBIDE, x1 ALLIED CHEM, x5 TEXACO, and x4 EXXON.

115 R code posted in stocks.r

# Compute maximum likelihood estimates for factors
stocks.fac <- factanal(stocks, factors=2, method="mle",
                       scale=T, center=T)
stocks.fac

Call:
factanal(x = stocks, factors = 2, method = "mle", scale = T, center = T)

Uniquenesses: reported for x1 through x5.

116

Loadings: Factor1 and Factor2 loadings for x1-x5, followed by the SS loadings, Proportion Var, and Cumulative Var for each factor.

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 0.58 on 1 degree of freedom.
The p-value is 0.446

117

# Apply the varimax rotation (this was the default)
stocks.fac <- factanal(stocks, factors=2, rotation="varimax",
                       method="mle", scores="regression")
stocks.fac

Call:
factanal(x = stocks, factors = 2, scores = "regression",
         rotation = "varimax", method = "mle")

Uniquenesses: reported for x1 through x5.

118

Loadings: Factor1 and Factor2 loadings for x1-x5, followed by the SS loadings, Proportion Var, and Cumulative Var for each factor.

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 0.58 on 1 degree of freedom.
The p-value is 0.446

119

# Compute the residual matrix
pred <- stocks.fac$loadings %*% t(stocks.fac$loadings) +
        diag(stocks.fac$uniqueness)
resid <- s.cor - pred
resid   # (output: a 5 x 5 residual matrix with entries of order 1e-3 or smaller)

120

# List factor scores
stocks.fac$scores   # (output: Factor1 and Factor2 scores for all 100 weeks)

121

# You could use the following code in Splus to apply
# the quartimax rotation, but this is not available in R
#
# stocks.fac <- factanal(stocks, factors=2, rotation="quartimax",
#                        method="mle", scores="regression")
# stocks.fac

# Apply the Promax rotation
promax(stocks.fac$loadings, m=3)

122

Loadings: Factor1 and Factor2 loadings for x1-x5, followed by the SS loadings, Proportion Var, and Cumulative Var for each factor.

$rotmat: the 2 x 2 rotation matrix.

123

# You could try to apply the promax rotation with the
# factanal function. It selects the power as m=4

stocks.fac <- factanal(stocks, factors=2, rotation="promax",
                       method="mle", scores="regression")
stocks.fac

Call:
factanal(x = stocks, factors = 2, scores = "regression",
         rotation = "promax", method = "mle")

Uniquenesses: reported for x1 through x5.

124

Loadings: Factor1 and Factor2 loadings for x1-x5, followed by the SS loadings, Proportion Var, and Cumulative Var for each factor.

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 0.58 on 1 degree of freedom.
The p-value is 0.446

125 Example: Test Scores

The sample correlation matrix $\mathbf{R}$ of $p = 6$ test scores collected from $n = 220$ students was computed, and an $m = 2$ factor model was fitted to this correlation matrix using ML.

126 Example: Test Scores

Estimated factor loadings and communalities: the table lists, for each test (Gaelic, English, History, Arithmetic, Algebra, Geometry), the loadings on factor 1 and factor 2 and the communality $\hat{h}_i^2$.

127 Example: Test Scores

All variables load highly on the first factor; we call that a general intelligence factor. Half of the loadings on the second factor are positive and half are negative. The positive loadings correspond to the verbal scores and the negative loadings to the math scores. Correlations between scores on verbal tests and math tests tend to be lower than correlations among scores on math tests or among scores on verbal tests, so this is a math-versus-verbal factor. We plot the six pairs of loadings $(\hat{\ell}_{i1}, \hat{\ell}_{i2})$ on the original coordinate system and also on a rotated set of coordinates chosen so that one axis goes through the loadings $(\hat{\ell}_{41}, \hat{\ell}_{42})$ of the fourth variable on the two factors.

128 Factor Rotation for Test Scores

[Plot of the six factor loadings in the original and rotated coordinate systems]

129 Varimax Rotation for Test Scores

Loadings for the rotated factors using the varimax criterion are tabulated for each test (Gaelic, English, History, Arithmetic, Algebra, Geometry), with loadings on $F_1^*$ and $F_2^*$ and communalities $\hat{h}_i^2$. $F_1^*$ is primarily a mathematics ability factor and $F_2^*$ is a verbal ability factor.

130 PROMAX Rotation for Test Scores

After the PROMAX transformation, $F_1^*$ is more clearly a mathematics ability factor and $F_2^*$ is a verbal ability factor; the loadings and communalities are tabulated as before. These factors are correlated. Code is posted as Lawley.sas and Lawley.R.

131 Factor Scores

Sometimes we require estimates of the factor values for each respondent in the sample. For $j = 1, \ldots, n$, the estimated $m$-dimensional vector of factor scores is

$$\hat{\mathbf{f}}_j = \text{estimate of the values } \mathbf{f}_j \text{ attained by } \mathbf{F}_j.$$

To estimate $\mathbf{f}_j$, we act as if $\hat{\mathbf{L}}$ (or $\mathbf{L}^*$) and $\hat{\boldsymbol{\Psi}}$ (or $\boldsymbol{\Psi}^*$) were the true values. Recall the model $(\mathbf{X}_j - \boldsymbol{\mu}) = \mathbf{L}\mathbf{F}_j + \boldsymbol{\epsilon}_j$.
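One common choice is the regression (Thomson) method, the same method requested by scores="regression" in the factanal calls above. Treating the estimates as true values and using $\mathrm{Cov}(\mathbf{X}, \mathbf{F}) = \mathbf{L}$, it computes, for each subject,

$$\hat{\mathbf{f}}_j = \hat{\mathbf{L}}'\left(\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}\right)^{-1}(\mathbf{x}_j - \bar{\mathbf{x}}).$$

A minimal R sketch (factor_scores is our own helper name):

factor_scores <- function(X, L, Psi) {
  Xc <- scale(X, center = TRUE, scale = FALSE)  # rows are x_j - xbar
  Sigma_hat <- L %*% t(L) + Psi                 # implied covariance matrix
  Xc %*% solve(Sigma_hat) %*% L                 # n x m matrix of scores f_hat_j
}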


More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

More information

Introduction to Principal Component Analysis: Stock Market Values

Introduction to Principal Component Analysis: Stock Market Values Chapter 10 Introduction to Principal Component Analysis: Stock Market Values The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

Exploratory Factor Analysis: rotation. Psychology 588: Covariance structure and factor models

Exploratory Factor Analysis: rotation. Psychology 588: Covariance structure and factor models Exploratory Factor Analysis: rotation Psychology 588: Covariance structure and factor models Rotational indeterminacy Given an initial (orthogonal) solution (i.e., Φ = I), there exist infinite pairs of

More information

Topic 10: Factor Analysis

Topic 10: Factor Analysis Topic 10: Factor Analysis Introduction Factor analysis is a statistical method used to describe variability among observed variables in terms of a potentially lower number of unobserved variables called

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

Partial Least Squares (PLS) Regression.

Partial Least Squares (PLS) Regression. Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Statistics for Business Decision Making

Statistics for Business Decision Making Statistics for Business Decision Making Faculty of Economics University of Siena 1 / 62 You should be able to: ˆ Summarize and uncover any patterns in a set of multivariate data using the (FM) ˆ Apply

More information

Solution to Homework 2

Solution to Homework 2 Solution to Homework 2 Olena Bormashenko September 23, 2011 Section 1.4: 1(a)(b)(i)(k), 4, 5, 14; Section 1.5: 1(a)(b)(c)(d)(e)(n), 2(a)(c), 13, 16, 17, 18, 27 Section 1.4 1. Compute the following, if

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

Poisson Models for Count Data

Poisson Models for Count Data Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

More information

5.2 Customers Types for Grocery Shopping Scenario

5.2 Customers Types for Grocery Shopping Scenario ------------------------------------------------------------------------------------------------------- CHAPTER 5: RESULTS AND ANALYSIS -------------------------------------------------------------------------------------------------------

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Extending the debate between Spearman and Wilson 1929: When do single variables optimally reproduce the common part of the observed covariances?

Extending the debate between Spearman and Wilson 1929: When do single variables optimally reproduce the common part of the observed covariances? 1 Extending the debate between Spearman and Wilson 1929: When do single variables optimally reproduce the common part of the observed covariances? André Beauducel 1 & Norbert Hilger University of Bonn,

More information

Multivariate Normal Distribution

Multivariate Normal Distribution Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

More information

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

1 Solving LPs: The Simplex Algorithm of George Dantzig

1 Solving LPs: The Simplex Algorithm of George Dantzig Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

More information

Multivariate Analysis

Multivariate Analysis Table Of Contents Multivariate Analysis... 1 Overview... 1 Principal Components... 2 Factor Analysis... 5 Cluster Observations... 12 Cluster Variables... 17 Cluster K-Means... 20 Discriminant Analysis...

More information

Research Methodology: Tools

Research Methodology: Tools MSc Business Administration Research Methodology: Tools Applied Data Analysis (with SPSS) Lecture 02: Item Analysis / Scale Analysis / Factor Analysis February 2014 Prof. Dr. Jürg Schwarz Lic. phil. Heidi

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

Factor Rotations in Factor Analyses.

Factor Rotations in Factor Analyses. Factor Rotations in Factor Analyses. Hervé Abdi 1 The University of Texas at Dallas Introduction The different methods of factor analysis first extract a set a factors from a data set. These factors are

More information

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

More information

[1] Diagonal factorization

[1] Diagonal factorization 8.03 LA.6: Diagonalization and Orthogonal Matrices [ Diagonal factorization [2 Solving systems of first order differential equations [3 Symmetric and Orthonormal Matrices [ Diagonal factorization Recall:

More information

Notes on Orthogonal and Symmetric Matrices MENU, Winter 2013

Notes on Orthogonal and Symmetric Matrices MENU, Winter 2013 Notes on Orthogonal and Symmetric Matrices MENU, Winter 201 These notes summarize the main properties and uses of orthogonal and symmetric matrices. We covered quite a bit of material regarding these topics,

More information

INTRODUCTION TO MULTIPLE CORRELATION

INTRODUCTION TO MULTIPLE CORRELATION CHAPTER 13 INTRODUCTION TO MULTIPLE CORRELATION Chapter 12 introduced you to the concept of partialling and how partialling could assist you in better interpreting the relationship between two primary

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis ERS70D George Fernandez INTRODUCTION Analysis of multivariate data plays a key role in data analysis. Multivariate data consists of many different attributes or variables recorded

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

SOLVING LINEAR SYSTEMS

SOLVING LINEAR SYSTEMS SOLVING LINEAR SYSTEMS Linear systems Ax = b occur widely in applied mathematics They occur as direct formulations of real world problems; but more often, they occur as a part of the numerical analysis

More information

Notes on Applied Linear Regression

Notes on Applied Linear Regression Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

Factor Analysis and Structural equation modelling

Factor Analysis and Structural equation modelling Factor Analysis and Structural equation modelling Herman Adèr Previously: Department Clinical Epidemiology and Biostatistics, VU University medical center, Amsterdam Stavanger July 4 13, 2006 Herman Adèr

More information

Linear Programming. March 14, 2014

Linear Programming. March 14, 2014 Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

A Basic Introduction to Missing Data

A Basic Introduction to Missing Data John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item

More information

PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA

PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA ABSTRACT The decision of whether to use PLS instead of a covariance

More information

Notes on Determinant

Notes on Determinant ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

More information