Learning Objectives (4/26/2016)
1. Develop the ability to assess the quality of measurement instruments
2. Deepen your knowledge of the measure development process
3. Gain fundamental knowledge of the SPSS software application
4. Perform an exploratory factor analysis
5. Inspect the exploratory factor analysis output
6. Construct factor solutions and check the reliability of the scales
Learning objectives
What to do (i.e. what to do when we have some responses):
1. Develop the ability to assess the quality of measurement instruments
2. Deepen knowledge of the measure development process
3. Gain foundational knowledge of the application of SPSS software
How to do (i.e. how to understand the structure of measurement constructs and assess their quality):
4. Perform exploratory factor analysis
5. Inspect the output from exploratory factor analysis
6. Construct factor solutions and check the reliability of the scales
Research process
- Research problem / question
- What might be the answer? Hypotheses & theoretical model
- Where and how to collect data?
- How and when to measure the variables?
- Evaluation of data and measure quality
- Testing the hypotheses
- Reporting the findings
- Answering the research question
Part 1: Measuring a latent construct
Why be concerned about measurement scales?
1. The GIGO (garbage in, garbage out) rule
2. Statistical properties => meaningful theories
3. A good measure is half the success in developing a good theory
Attributes of a good measure
1. Reliability: independent but comparable measures of the same construct agree, i.e. the variance in scores is not attributable to random error (Xr = 0).
2. Validity: differences in observed scores reflect true differences in the phenomenon (latent variable) we measure (Xo = Xt).
3. Generalizability: the measured effect is not sample-specific and can be applied to other contexts.
Why is a measure not good enough?
- Random error: a lack of consistency of repeated measurements
- Systematic error: a constant defect in measuring
Classroom exercise I
Step 1: In groups of 2-3 people, generate a list of potential antecedents of measurement error.
Step 2: Go through your list and divide the antecedents into systematic and random sources of error.
Step 3: Think about potential ways to mitigate the effect of the error sources.
Part 2: Measure development process
Procedure for developing measures
Step: recommended techniques
1. Specify domain of construct: literature search
2. Generate sample of items: literature search, stimulating examples, critical incidents, focus groups
3. Collect data: pilot survey => content validity
4. Purify measure: factor analysis (EFA), coefficient alpha (Cronbach alpha)
5. Collect data
6. Assess reliability: coefficient alpha, split-half reliability, composite reliability CR (CFA)
7. Assess validity: construct validity (convergent/discriminant), predictive validity
8. Develop norms: average and other statistics summarizing the distribution of scores
Source: (Churchill 1979; Lee & Hooley 2005)
What should I check to ensure reliability & validity?
(Criteria / Type: Meaning. Procedure.)
Reliability
- Stability: the extent to which the test scores correlate when measured at two different points in time. Procedure: test-retest.
- Equivalence: internal consistency of the items included in the scale (common variance). Procedure: split-half reliability, Cronbach alpha, composite reliability.
Validity
- Predictive (criterion): how well does a measure predict a criterion (dependent variable)? Procedure: correlations.
- Content (face validity): adequacy with which the domain of the characteristic is captured by the measure. Procedure: evaluate the procedure of measure development.
- Construct:
  1) Convergent: a measure correlates with other measures designed to evaluate the same construct. Procedure: correlations.
  2) Discriminant: a measure does not correlate too highly with measures designed to evaluate different constructs. Procedure: Fornell & Larcker (1981) test of discriminant validity.
  3) Nomological: a measure behaves as theoretically expected with other constructs. Procedure: correlation and regression analysis.
Assess reliability
Reliability = absence of random error. Types:
- Test-retest reliability (stability): test the same construct 2 times
- Split-half/parallel-forms reliability (consistency): split the items into 2 parts
- Internal consistency reliability (homogeneity): average inter-item correlation (coefficient alpha)
- Inter-rater reliability (concordance): the same construct is tested by two researchers (judges)
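Internal consistency is usually summarized with coefficient alpha. As a minimal sketch outside SPSS, the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score) can be computed as follows. The function name `cronbach_alpha` and the demo responses are hypothetical and only serve as an illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a list of item columns.

    items: list of k lists, each holding one item's scores for the
    same respondents in the same order (no missing values assumed).
    """
    k = len(items)
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    # Total (summated) score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical data: 3 items answered by 5 respondents.
demo_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
demo_alpha = cronbach_alpha(demo_items)
print(round(demo_alpha, 3))
```

Because the items in the demo move closely together, alpha comes out high; weakly related items would push the item variance sum towards the total variance and alpha towards zero.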
Assess validity
1. External validity of findings = generalizability
2. Internal validity of findings = whether x actually causes y
3. Validity of measurement scales = absence of systematic error (bias) in measurement
   - Predictive / criterion validity
   - Content / face validity
   - Construct validity (convergent, discriminant, nomological)
   = Are we actually measuring what we were supposed to measure?
Part 3: Exploratory factor analysis (EFA)
When and why to use exploratory factor analysis?
1. To understand (explore) the structure of a set of variables;
2. To reduce a data set to a more manageable size while retaining as much of the original information as possible.
Factor (or latent variable) = an explanatory construct represented by a number of observed variables that are highly correlated with each other and share common variance explained by the latent variable.
Interdependence of five variables
In factor analysis we look to reduce the R-matrix (correlation matrix) to a smaller set of uncorrelated dimensions.
Interdependence of five variables
- Common variance, factor 1
- Common variance, factor 2
- All variance shared = communality equals 1
- No variance shared = communality equals 0
Steps
1. Select variables
2. Check assumptions
3. Select factoring (extraction) method
4. Decide on the number of factors
5. Rotate
6. Interpret
7. Validate
8. Proceed to further analyses with new variables
Variable selection
- Continuous variables, correlation must make sense
- Interdependent but not causally related
- More observations than variables: recommended cases/variable, min. 50 cases, normally around 100 cases
- Approximately normally distributed, no outliers
Assumptions
1. Kaiser-Meyer-Olkin measure of sampling adequacy (KMO)
- Overall KMO is a measure of the correlation matrix's suitability for factor analysis
- KMO receives a high value when partial correlations are small
- Kaiser's guidelines for interpreting KMO:
  0.9 marvelous
  0.8 meritorious
  0.7 middling
  0.6 mediocre
  0.5 miserable
  < 0.5 unacceptable
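To make the link between KMO and partial correlations concrete, here is a minimal sketch of the overall KMO computed from a correlation matrix: the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations. The helper names (`invert`, `kmo`, `kaiser_label`) and the demo matrix are hypothetical, and reading Kaiser's labels as lower bounds of each band is an assumption.

```python
def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix (list of lists)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Overall KMO from a correlation matrix."""
    inv = invert(corr)
    n = len(corr)
    r2 = p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # Partial correlation of i and j given all other variables.
                partial = -inv[i][j] / (inv[i][i] * inv[j][j]) ** 0.5
                r2 += corr[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)


def kaiser_label(value):
    """Kaiser's verbal label, treating each cut-off as a lower bound."""
    bands = [(0.9, "marvelous"), (0.8, "meritorious"), (0.7, "middling"),
             (0.6, "mediocre"), (0.5, "miserable")]
    for cutoff, label in bands:
        if value >= cutoff:
            return label
    return "unacceptable"


# Hypothetical correlation matrix: three variables, all correlated at .50.
demo_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
demo_kmo = kmo(demo_corr)
print(round(demo_kmo, 3), kaiser_label(demo_kmo))
```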
Assumptions (Cont'd)
2. Bartlett's test of sphericity
- Is our correlation matrix significantly different from an identity matrix, i.e. are the correlation coefficients different from zero?
- Bartlett's test will almost always be significant because of the sample size
3. Multicollinearity and singularity
- Correlations higher than .80 may cause multicollinearity problems
- Check that correlation determinant is >
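As a rough illustration of what Bartlett's test and the determinant check compute, the sketch below uses the standard approximation chi2 = -((n - 1) - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom, where p is the number of variables and n the number of observations. The function names and the demo matrix are hypothetical; the p-value is left out and would be looked up in a chi-square table or taken from the SPSS output.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom, determinant)."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix is singular or not positive definite")
    chi2 = -((n_obs - 1) - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return chi2, df, det


# Hypothetical example: three variables correlated at .50, 100 respondents.
demo_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
demo_chi2, demo_df, demo_det = bartlett_sphericity(demo_corr, 100)
print(round(demo_chi2, 2), demo_df, round(demo_det, 3))
```

An identity matrix has determinant 1, so its statistic is zero; the further the determinant falls towards 0, the larger the statistic, but a determinant too close to 0 signals multicollinearity or singularity.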
Extraction method
- Total variance of variables: principal components
- Common variance of variables: principal axis factoring, maximum likelihood
Extraction method (Cont'd)
1. Total variance of variables (principal components)
- Linear combinations are formed in order to reduce the number of variables
- Number of factors: 1 to k
  F1 = a*x1 + b*x2 + ...
  F2 = c*x1 + d*x2 + ...
- The first factor accounts for the most variance, and the last factor for the least
- Retain only factors that account for more variance than a single variable
Extraction method (Cont'd)
2. Common variance of variables (principal axis; maximum likelihood)
- We should know in advance the number and nature of the latent dimensions
- The latent dimensions cause the variation in the variables
- The variation in each variable can be divided into two components: common variance + unique (error) variance
- Each variable is modelled as a linear combination of the common factors + error
  x1 = a*f1 + b*f2 + ... + e1
  x2 = c*f1 + d*f2 + ... + e2
Number of factors
- Theoretical reasons
- Eigenvalue (latent root criterion) greater than 1
- Scree plot => cut where the plot levels off
- Percentage of variance explained, e.g. 60%
- Meaningful interpretation
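The eigenvalue and variance-explained criteria above can be sketched in a few lines once the eigenvalues are available (e.g. copied from the Total Variance Explained table). The function names and the eigenvalues below are hypothetical illustration values, not results from the course data.

```python
def kaiser_retained(eigenvalues):
    """Eigenvalues that pass the latent root criterion (greater than 1)."""
    return [ev for ev in eigenvalues if ev > 1.0]


def cumulative_variance(eigenvalues):
    """Cumulative % of total variance, largest eigenvalue first."""
    total = sum(eigenvalues)
    running = 0.0
    result = []
    for ev in sorted(eigenvalues, reverse=True):
        running += ev
        result.append(100 * running / total)
    return result


# Hypothetical eigenvalues for five standardized variables (they sum to 5).
demo_eigenvalues = [2.5, 1.2, 0.6, 0.4, 0.3]
demo_retained = kaiser_retained(demo_eigenvalues)
demo_cumulative = cumulative_variance(demo_eigenvalues)
print(len(demo_retained), [round(c, 1) for c in demo_cumulative])
```

In this made-up case two factors pass the eigenvalue criterion and together explain roughly three quarters of the variance, so both criteria would point to the same solution; when they disagree, theory and interpretability decide.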
Rotation
- Orthogonal: factors are rotated but kept independent: varimax, quartimax, equamax
- Oblique: factors are allowed to correlate: direct oblimin, promax
Interpretation: Factor loadings
- Interpretation of the results of factor analysis is based on factor loadings
- Loading = correlation between factor and variable
- Range: -1 to 1
- The squared loading indicates what percentage of the variance of the variable the factor explains
- Ideally, each variable has a high (> .4) loading on one factor and low loadings on other factors
- Rotation makes interpretation easier
- The common factor manifests an underlying latent dimension
Interpretation: Loadings significance
- Substantive significance of the loading: min. .30, preferably > .50
- Statistical significance (loading + number of observations needed to make it significant at the 5% level):
  Loading = .30 => n = 350
  Loading = .40 => n = 200
  Loading = .50 => n = 120
  Loading = .60 => n = 85
  Loading = .70 => n = 60
Interpretation: Communality
- Computed for each variable; indicates what percentage of the variable's variance is explained by the extracted factors = sum of squared loadings
- Range: 0 to 1
- Should exceed .50
- Small values indicate that a variable has little in common with the other variables and should be removed
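A small worked example of the "sum of squared loadings" rule, assuming an orthogonal solution (with correlated factors the simple sum no longer applies). The loadings and item names are hypothetical.

```python
def communalities(loadings):
    """Communality of each variable = sum of its squared factor loadings.

    loadings: one row per variable, one column per (orthogonal) factor.
    """
    return [sum(value * value for value in row) for row in loadings]


# Hypothetical loadings of three items on two factors.
demo_names = ["A1", "A2", "A3"]
demo_loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.3],
]
demo_h2 = communalities(demo_loadings)
# Items whose communality falls below the .50 guideline.
demo_low = [name for name, h2 in zip(demo_names, demo_h2) if h2 < 0.5]
print([round(h2, 2) for h2 in demo_h2], demo_low)
```

Here only the third item shares too little variance with the factors and would be a candidate for removal.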
Validation
i.e. how stable and generalizable the solution is
- Randomly split your cases into two samples and run the same analysis for each subsample
- Try to rerun with different extraction and rotation methods
- Check that the factors are related to other external variables in the way they should be
Further analysis
- Factor scores, summated scales, weighted scales
- Factor scores can be saved and used as any continuous, normally distributed variable, e.g. in t tests, correlations or regression analyses
- Factor scores are standardized to zero mean and unit variance; therefore you cannot compare the overall level of factors with each other
Part 4: Let's get started with SPSS
Data sample
- 305 companies located in Europe
- Employees > 50
- Manufacturing and service companies
- 46% response rate
- Data collected in 2014
- Original questionnaire in English, back-translated into three other European languages
Getting started
- Open the IBM SPSS Statistics program
- Choose: New dataset => OK
- Choose: File => Open => Data => open the PP_2604.sav file (available from the MyCourses page) => Open
Click: Analyze => Dimension Reduction => Factor
Select the variables to include in the analysis and transfer them to the Variables box
Descriptives:
- Check the boxes Univariate descriptives and Initial solution
- Check also Coefficients and Significance levels
- Determinant (singularity) and KMO
- Reproduced and Anti-image matrices
=> Continue
Extraction:
- Principal axis factoring
- Check the box Correlation matrix
- Display Unrotated factor solution and Scree plot
- Extract based on Eigenvalue > 1
=> Continue
Rotation:
- Orthogonal rotation => Varimax
- Display Rotated solution
- Maximum iterations for convergence: 25
=> Continue
Scores:
- Save scores as variables
- If you want to ensure that factor scores are uncorrelated => Anderson-Rubin
- If correlation between factor scores is acceptable => Regression
=> Continue
Options:
- Exclude cases listwise (if you have missing values)
- Sort coefficients by size
- Suppress small coefficients (< .30)
=> Continue => OK (run the analysis)
Interpreting the output
Univariate descriptives:
- Means
- Standard deviations
- N of observations (we chose to exclude cases with missing values listwise, so we have 282 observations with no missing values)
Correlation Matrix
.30 < correlation coefficients < .90
Correlation Matrix
- All correlations are significant
- Check the determinant of the correlation matrix (> )
- Consider eliminating the variables that may cause multicollinearity
KMO & Bartlett's test
KMO:
- Well above the minimum of .50
- Falls into the range of meritorious
- Check also the diagonal of the anti-image correlation matrix (> .50)
Bartlett's test:
- Correlation coefficients are significantly different from zero. Perfect.
- If it is not significant, you certainly have a big problem!
Anti-image Matrix
- Check also the diagonal of the anti-image correlation matrix (KMO values of individual items): > .50
- Off-diagonal elements are partial correlations; we want them to be small (high partial correlations indicate some diffusion in the pattern of correlations)
Communalities, Variance explained
- Check that the communalities for each item are > .50, i.e. the extracted factors explain at least 50% of the variance in the item
- Cumulative % of variance explained should be above approx. 60%
Scree plot
- Cut where the eigenvalue is < 1 (alternatively, one point before the plot levels off)
- Consider also the amount of variance explained by each factor (see the table Total Variance Explained)
Initial and Rotated solution
- Check that items load high (> .40) on only one dimension and low (< .40) on the others
- Consider eliminating items with loadings < .40
Reproduced correlation matrix
- The matrix reproduces the correlations between the items based on the factor model, i.e. r(observed) - r(from model) = residual (needs to be < .05)
- E.g. residual A1-A2 = .298
- The diagonal displays communalities
Reproduced correlation matrix
- The matrix reproduces the correlations between the items based on the factor model, i.e. r(observed) - r(from model) = residual (needs to be < .05)
- Check the percentage of residuals with an absolute value higher than .05 (if 50% of them are higher, be concerned!)
- There are communalities on the diagonal
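The residual check can be sketched as follows, assuming an orthogonal factor solution in which the model-implied correlation between two items is the sum of the products of their loadings. The function names, loadings and observed correlations are hypothetical illustration values.

```python
def reproduced_correlations(loadings):
    """Model-implied correlations; the diagonal holds the communalities."""
    n = len(loadings)
    return [[sum(a * b for a, b in zip(loadings[i], loadings[j]))
             for j in range(n)]
            for i in range(n)]


def large_residual_share(observed, loadings, cutoff=0.05):
    """Share of non-redundant residuals whose absolute value exceeds cutoff."""
    reproduced = reproduced_correlations(loadings)
    n = len(observed)
    # Only pairs above the diagonal: each residual is counted once.
    residuals = [observed[i][j] - reproduced[i][j]
                 for i in range(n) for j in range(i + 1, n)]
    large = sum(1 for r in residuals if abs(r) > cutoff)
    return large / len(residuals)


# Hypothetical loadings (three items, two factors) and observed correlations.
demo_loadings = [[0.8, 0.1], [0.7, 0.2], [0.2, 0.6]]
demo_observed = [
    [1.00, 0.60, 0.30],
    [0.60, 1.00, 0.25],
    [0.30, 0.25, 1.00],
]
demo_reproduced = reproduced_correlations(demo_loadings)
demo_share = large_residual_share(demo_observed, demo_loadings)
print(round(demo_reproduced[0][1], 2), round(demo_share, 2))
```

In this made-up case one of the three residuals is larger than .05, so about a third of the residuals are large, which stays below the 50% warning level mentioned on the slide.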
Factor correlation matrix
Try another rotation method, e.g. oblique rotation (direct oblimin), and allow the factors to correlate
Factor scores
Now that we have calculated the factor scores, you can find them added to your data file as three new columns:
- AR factor score 1 for analysis 1
- AR factor score 2 for analysis 1
- AR factor score 3 for analysis 1
The scores are standardized: they have a mean of zero and unit variance.
The scores of the items that load on a specific factor are usually summated and used as a summated scale (e.g. the sum of five items / 5).
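A summated scale is simply the mean of the items that load on the factor, computed per respondent. The sketch below is a plain-Python illustration; the item names A1 to A5 and the responses are hypothetical and stand in for whichever items load on the factor in your own solution.

```python
def summated_scale(responses, item_names):
    """Mean of the named items for each respondent.

    responses: list of dicts mapping item name -> score.
    item_names: the items that load on the factor of interest.
    """
    return [sum(row[name] for name in item_names) / len(item_names)
            for row in responses]


# Hypothetical responses of two companies on five items of one subscale.
demo_responses = [
    {"A1": 4, "A2": 5, "A3": 4, "A4": 3, "A5": 4},
    {"A1": 2, "A2": 2, "A3": 3, "A4": 1, "A5": 2},
]
demo_scale = summated_scale(demo_responses, ["A1", "A2", "A3", "A4", "A5"])
print(demo_scale)
```

Unlike standardized factor scores, the summated scale stays on the original response scale, so the overall levels of different subscales can be compared.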
Classroom exercise II
1. Based on our earlier results and the provided guidelines for EFA, improve the factor structure of the Purchasing Performance construct.
2. Find an optimal (in your opinion) factor solution.
3. Do not forget to validate your factor solution by splitting the sample into two parts.
Reliability assessment
Reliability = a measure consistently reflects the construct that we are measuring
Computed for each subscale (factor) individually:
- Subscale 1 (Quality): items 1-5
- Subscale 2 (Cost): items 6-9
- Subscale 3 (Flexibility): items
Analyze => Scale => Reliability Analysis
Reliability assessment
In Statistics, check Scale if item deleted & Inter-item correlations => OK
Reliability assessment
- α (Subscale 1) = .856
- α (Subscale 2) = .781
- α (Subscale 3) = .671
- The accepted value of α is above .60 (exploratory) and .70 (theory testing)
- Remove items with an item-total correlation of less than .50
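The item-total rule can be illustrated with a corrected item-total correlation, i.e. the Pearson correlation between one item and the sum of the remaining items of the subscale. The sketch assumes that this corrected version is the one meant on the slide; the function names and the data are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def corrected_item_total(items):
    """Correlation of each item with the sum of all other items."""
    result = []
    for index, column in enumerate(items):
        # Total score of each respondent without the item itself.
        rest = [sum(scores) - scores[index] for scores in zip(*items)]
        result.append(pearson(column, rest))
    return result


# Hypothetical subscale: three consistent items and one unrelated item.
demo_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
    [3, 1, 4, 2, 3],
]
demo_item_total = corrected_item_total(demo_items)
# Indices of items below the .50 guideline (candidates for removal).
demo_flagged = [i for i, r in enumerate(demo_item_total) if r < 0.5]
print([round(r, 2) for r in demo_item_total], demo_flagged)
```

Only the last item is flagged here; after removing such an item, alpha should be recomputed for the shortened subscale.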
Reporting the results
- Assumptions: KMO; overall and range of inter-item correlations
- Factor extraction method
- How did you decide on the number of factors?
- Rotation method
Classroom exercise III
Calculate the scale reliability for your optimal factor solution. Did the constructs' reliability become better or worse?
Critical assessment of measures
1. Reliability and coefficient alpha
- Coefficient alpha increases when the number of items increases
- Internal consistency (potentially equals redundancy) is not necessarily good for validity
- Items are highly correlated to reach high alpha;
- Items need to predict a true latent score to a high degree (the more they are correlated, the less is their predictive power);
- Be aware that coefficient alpha does not ensure unidimensionality (use FA)
2. FA is not PCA!
- FA accounts for common + unique variance
- PCA creates a linear combination of items (index); it does not account for random error
- Thus, it is also not generalizable to other samples
Critical assessment of measures (2)
3. Selecting a number of factors
- Scree plot and Kaiser criterion, but
- Kaiser criterion was developed for PCA
- In FA low communality => low eigenvalue (Kaiser criterion) => reason for elimination
So what do I do?
- Check several factor solutions (one more factor, one less, two less)
- If there is a single item with high unique variance (low communality), think about whether it is a poorly represented new construct
Critical assessment of measures (3)
4. Dealing with factor rotation
- No constructs in the real world are completely uncorrelated (orthogonal rotation)
- And if they were, then a rotation method that allows correlation (oblique rotation) would return an uncorrelated solution
- Orthogonal (statistical simplicity): no multicollinearity, BUT potentially sample-specific
- Oblique (theoretical rigor): potential multicollinearity, but correctness of the pattern discovered and constancy from one sample to another
So what do I do?
- Start with oblique rotation, use orthogonal when appropriate or necessary
- Whatever rotation method is selected, it should be justified conceptually
So what did we learn today?
1. To develop a measurement construct
   - Borrow smart
   - Use with theoretical rigor
2. To understand and interpret construct quality
3. To use SPSS for factor analysis
