Introduction to Structural Equation Modeling with Amos


1 Introduction to Structural Equation Modeling with Amos Dr. Lluís Coromina (University of Girona, Spain) 2nd and 3rd October 2014

2 Introduction Outline Basic concepts. Types of variables. Basic composition Intuitive explanation of the basics of SEM o Path analysis. The regression analysis model o Indirect effects. Equations. Degrees of freedom. Specification errors Measurement errors in regression models Full SEM model Confirmatory Factor Analysis (CFA) o Scale Reliability and Validity of a Construct SEM and modeling stages. o Model specification o Model identification o Model estimation o Fit diagnostics and model modification Results and interpretation Model modification 1

3 Introduction To introduce models that relate variables measured with error. To introduce Structural Equation Models with latent variables (SEM). To learn all stages of fitting these models. To become familiar with the Amos software. To enable participants to critically read articles in which these models are applied. 2

4 History SEM make it possible to: o Fit linear relationships among a large number of variables. Possibly more than one is dependent. o Validate a questionnaire as a measurement instrument. Quantify measurement error and prevent its biasing effect. o Freely specify, constrain and test each possible relationship using theoretical knowledge, testing hypotheses. In their most recent and advanced versions, SEM enable researchers to: o Analyze non-normal data. o Treat missing values by maximum likelihood. o Treat complex sample data. 3

5 History of models for the study of causality Analysis of variance ( ): decomposition of the variance of a dependent variable in order to identify the part contributed by an explanatory variable. Control of third variables (experimental design). Macroeconometric models (1940-50): dependence analysis of non-experimental data. All variables must be included in the model. Path analysis (1920-70): analysis of correlations. Otherwise similar to econometric models. Factor analysis (1900-1970): analysis of correlations among multiple indicators of the same variable. Measurement quality evaluation. SEM (1970): Econometric models, path analysis and factor analysis are joined together. Relationships among variables measured with error, on non-experimental data, from an interdependence analysis perspective.

6 History of models for the study of causality SEM are nowadays very popular because they make it possible to (5 Cs, see Batista & Coenders 2000): Work with Constructs/factors/latent variables measured through indicators/observed variables/manifest variables, and evaluate measurement quality. Consider the true Complexity of phenomena, thus abandoning uni- and bivariate statistics. Conjointly consider measurement and prediction, factor and path analysis, and thus obtain estimates of relationships among variables that are free of measurement error bias. Introduce a Confirmatory perspective in statistical modelling. Prior to estimation, the researcher must specify a model according to theory. Decompose observed Covariances, and not only variances, from an interdependence analysis perspective.

7 Basic Concepts Latent variables (theoretical concepts that cannot be observed directly) = unobserved = unmeasured Observed variables (indicators of the underlying construct which they are presumed to represent)= manifest = measured 6

8 Basic Concepts Exogenous (Independent) vs Endogenous (dependent) latent variables. F1 causes F2 Changes in the values of the exogenous variables are not explained by the model. Rather, they are considered to be influenced by other factors external to the model (background variables such as gender, age, etc.). Fluctuations in the endogenous variables are said to be explained by the model because all latent variables that influence them are included in the model specification.

9 Statistical Modeling Models explain how the observed and latent variables are related to one another. Diagram Equations Specification: Model based on the researcher's knowledge of the related theory Testing on sample data Goodness of fit between the hypothesized model and sample data. Testing how well the observed data fit the restricted structure. Observed data - Hypothesized model = Residual DATA = MODEL + RESIDUAL

10 Types of variables Observed variables Unobserved latent factors Measurement error associated with an observed variable (ei): reflects their adequacy in measuring the related unobserved (underlying) factors. Residual error (disturbance) in the prediction of an unobserved factor

11 Covariance or correlation Path coefficient for regression of one factor onto another factor. Direct relationship Path coefficient for regression of an observed variable onto an unobserved latent variable (or factor). Direct relationship Spurious relationship: both have a common cause 1

12 Indirect relationship: both are related by an intervening variable v3 Joint effect. The difference between the spurious and the joint-effect case is that in the latter v1 and v3 are both exogenous, so that it is not clear whether v3 contributes to the covariance between v1 and v2 through an indirect or a spurious mechanism.

13 Factor analytic model Factor Analysis: analysis of covariances among observed variables in order to obtain information about the underlying latent factors. EFA: Exploratory Factor Analysis. Example: design of a new instrument to measure satisfaction with life. CFA: Confirmatory Factor Analysis. Measurement model in Structural Equation Modeling (SEM). Example: Knowledge of the theory. Hypothesis testing. Factor loadings: Regression paths from the factors to the observed variables.

14 Full latent variable model Allows specification of regression structure among latent variables. Testing of hypotheses about the impact of one latent construct on another in the modeling of causal direction. Full model = measurement model + structural model Recursive full model: causal influence flows in one direction only Nonrecursive full model: allows for reciprocal or feedback effects.

15 Example: European Social Survey (ESS) Estonian data from the ESS. Year: 2012 Ppltrst = Most people can be trusted or you can't be too careful (0 = You can't be too careful; 10 = Most people can be trusted) Pplfair = Most people try to take advantage of you, or try to be fair (0 = Most people try to take advantage of me; 10 = Most people try to be fair) Pplhlp = Most of the time people are helpful or mostly looking out for themselves (0 = People mostly look out for themselves; 10 = People mostly try to be helpful) Trstprl = Trust in country's parliament (0 = No trust at all; 10 = Complete trust) Trstplt = Trust in politicians (0 = No trust at all; 10 = Complete trust) Trstlgl = Trust in the legal system (0 = No trust at all; 10 = Complete trust)

16 Sample Covariances
         ppltrst  pplfair  pplhlp   trstprl  trstplt  trstlgl
ppltrst  4,956
pplfair  2,698    4,98
pplhlp   2,55     2,225    5,93
trstprl  1,668    1,584    1,369    6,2
trstplt  1,44     1,3      1,13     3,977    5,8
trstlgl  1,75     1,62     1,355    4,153    3,37     6,24

Sample Correlations
         ppltrst  pplfair  pplhlp   trstprl  trstplt  trstlgl
ppltrst  1,
pplfair  ,543     1,
pplhlp   ,49      ,442     1,
trstprl  ,35      ,289     ,247     1,
trstplt  ,28      ,258     ,217     ,719     1,
trstlgl  ,38      ,288     ,241     ,68      ,6       1,

17 The path model * The term ei is measurement error (random measurement error and systematic or non-random error). * Residual (d1) terms represent error in the prediction of endogenous (Political Trust) factors from exogenous (Political Satisfaction) factors. * All dependent variables have an error assigned (a measurement error if the variable is observed and a disturbance if it is latent).

18 Basic composition Measurement model: relations between observed and unobserved variables. CFA: pattern by which each measure loads on a particular factor. Structural model: relations between unobserved variables. A particular latent variable directly or indirectly influences ('causes') changes in the values of certain other latent variables in the model.

19 Examples and basic concepts. Simple linear regression model. Introduction to interdependence analysis The specification of a SEM consists in a set of assumptions regarding the behaviour of the variables involved. Substantive part: it requires translating verbal theories into equations. Statistical part: it is needed for the eventual estimation and testing of the model. The assumptions regard the distribution of the variables involved. 18

20 Substantive assumptions: v2 = β21 v1 + d2 Linearity. β21: effect. By how much will the expected value of v2 increase following a unit increase in v1? Standardized β21: by how many standard deviations will the expected value of v2 increase following a standard deviation increase in v1? d2 collects the effect of omitted explanatory variables, measurement error in v2 and the random and unpredictable part of v2 (disturbance). v1 is assumed to be free of measurement error.

21 Statistical assumptions regarding the joint distribution of the sources of variation: (v1, d2)' ~ N(0, diag(φ11, ψ22)) Two additional parameters: the variances of v1 (φ11) and d2 (ψ22). Bivariate normal joint distribution of v1 and d2. Variables are mean-centred. Uncorrelation of v1 and d2 (inclusion of all relevant variables). If this holds, the variance of v2 can be additively decomposed into explained variance and disturbance variance. R2 is the explained percentage. The equations exhaustively describe the joint distribution of v1 and v2 as a function of 3 parameters.

22 In order to derive the structural equation system Σ = Σ(π) we can apply path analysis: σ11 = φ11; σ21 = β21 φ11; σ22 = β21² φ11 + ψ22. For a model with k observed variables, the number of distinct elements in Σ is (k+1)k/2. π = (φ11, ψ22, β21) Determination coefficient R2 = 1 - (ψ22/σ22)

23 It is possible to solve the system Σ(π) = Σ as it contains an equal number of equations (distinct elements of Σ) and unknowns (elements of π): the model is exactly identified. We can estimate Σ from a sample covariance matrix S with elements s11, s21, s22, and estimate p = (φ̂11, ψ̂22, β̂21) by solving the system Σ(p) = S: φ̂11 = s11; β̂21 = s21/s11; ψ̂22 = s22 - s21²/s11

24 Example: trstprl (v2) can be explained by level of trust in others (v1 = ppltrst): trstprl = β21*ppltrst + d2. From the sample covariance matrix: s11 = 4.956, s21 = 1.668, s22 = 6.02. β̂21 = s21/s11 = 1.668/4.956 = .34 φ̂11 = s11 = 4.956 ψ̂22 = s22 - s21²/s11 = 6.02 - (1.668²/4.956) = 5.458
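The same arithmetic can be run as a minimal Python sketch (a sketch only: the variable names are arbitrary, and the value 6.02 is assumed for the trstprl variance):

```python
# Path-analysis estimates for the exactly identified simple regression model,
# obtained by solving Sigma(p) = S element by element.
s11 = 4.956   # var(ppltrst)
s21 = 1.668   # cov(ppltrst, trstprl)
s22 = 6.02    # var(trstprl), assumed value

phi11_hat = s11                          # sigma11 = phi11
beta21_hat = s21 / s11                   # sigma21 = beta21 * phi11
psi22_hat = s22 - beta21_hat**2 * s11    # sigma22 = beta21^2 * phi11 + psi22
r_squared = 1 - psi22_hat / s22

print(beta21_hat, psi22_hat, r_squared)  # approx. 0.337, 5.458, 0.093
```

These values agree with the Amos and SPSS output shown on the following slides.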

25 β̂21 is identical to the ordinary least squares estimate (dependence analysis). In statistical analysis, a function of residuals (e.g. the sum of squares) is used as: A criterion function to minimize during estimation. A goodness of fit measure. In a dependence analysis, a residual = v2 - β̂21 v1. In an interdependence analysis, residuals are differences between the covariances fitted by the model parameters Σ(p) and the sample covariances S. They are arranged in the S - Σ(p) residual matrix. In an exactly identified model they are zero, as S = Σ(p) has a solution.

26 Notes for Group (Group number 1) The model is recursive. Sample size = 233 AMOS output for trsrprl = 21 *ppltrst +d 2 Variable Summary (Group number 1) Your model contains the following variables (Group number 1) Observed, endogenous variables trstprl Observed, exogenous variables ppltrst Unobserved, exogenous variables d2 Variable counts (Group number 1) Number of variables in your model: 3 Number of observed variables: 2 Number of unobserved variables: 1 Number of exogenous variables: 2 25

27 Number of endogenous variables: 1 Parameter Summary (Group number 1) Weights Covariances Variances Means Intercepts Total Fixed 1 1 Labeled Unlabeled Total Sample Moments (Group number 1) Sample Covariances (Group number 1) ppltrst trstprl ppltrst 4,956 trstprl 1,668 6,2 Sample Correlations (Group number 1) ppltrst trstprl ppltrst 1, trstprl,35 1, 26

28 Notes for Model (Default model) Computation of degrees of freedom (Default model) Number of distinct sample moments: 3 Number of distinct parameters to be estimated: 3 Degrees of freedom (3-3): 0 Result (Default model) Minimum was achieved Chi-square = ,000 Degrees of freedom = 0 Probability level cannot be computed

29 Estimates (Group number 1 - Default model) Scalar Estimates (Group number 1 - Default model) Maximum Likelihood Estimates Regression Weights: (Group number 1 - Default model) Estimate S.E. C.R. P Label trstprl <--- ppltrst,337,22 15,479 *** par_1 Critical Ratio= Dividing the regression weight estimate by the estimate of its standard error gives z =,337/,22 = 15,479. Sometimes it is called t-value. Standardized Regression Weights: (Group number 1 - Default model) Estimate trstprl <--- ppltrst,35 Variances: (Group number 1 - Default model) Estimate S.E. C.R. P Label ppltrst 4,956,145 34,125 *** par_2 d2 5,458,16 34,125 *** par_3 The exogenous variance (ppltrst) is trivially equal to the sample variance

30 Squared Multiple Correlations: (Group number 1 - Default model) Estimate trstprl ,93 ### R2 ### Standardized Residual Covariances (Group number 1 - Default model) ppltrst trstprl ppltrst , trstprl , , In an exactly identified model they are zero, as S = Σ(p) has a solution

31 Simple Regression with SPSS: trstprl = β21*ppltrst + d2
Model Summary
Model   R     R squared   Adjusted R squared
1       ,35   ,93         ,93
Coefficients
Model 1                                                  B      s.e.   Beta   t        Sig.
(Constant)                                               2,93   ,129          16,238   ,
Most people can be trusted or you can't be too careful   ,337   ,22    ,35    15,476   ,
a. Dependent variable: Trust in country's parliament

32 Model with two dependent variables and an indirect effect. Identification, goodness of fit and specification errors v2 = β21 v1 + d2 v3 = β32 v2 + d3 (v1, d2, d3)' ~ N(0, diag(φ11, ψ22, ψ33)) Σ is 3×3 and contains 4·3/2 = 6 nonduplicated elements. π has 5 elements (φ11, ψ22, ψ33, β21, β32). The difference is the number of degrees of freedom (df) of the model.

33 Structural equation system: EXERCISE: Derive the Σ(π) equations using path analysis.

34 AMOS OUTPUT Notes for Group (Group number 1) The model is recursive. Sample size = 233 Variable Summary (Group number 1) Your model contains the following variables (Group number 1) Observed, endogenous variables trstprl stfeco Observed, exogenous variables ppltrst Unobserved, exogenous variables d2 d3 33

35 Variable Summary (Group number 1) Your model contains the following variables (Group number 1) Observed, endogenous variables trstprl stfeco Observed, exogenous variables ppltrst Unobserved, exogenous variables d2 d3 Variable counts (Group number 1) Number of variables in your model: 5 Number of observed variables: 3 Number of unobserved variables: 2 Number of exogenous variables: 3 Number of endogenous variables: 2 34

36 Parameter Summary (Group number 1) Weights Covariances Variances Means Intercepts Total Fixed 2 2 Labeled Unlabeled Total Sample Covariances (Group number 1) ppltrst trstprl stfeco ppltrst 4,956 trstprl 1,668 6,2 stfeco 1,382 3,7 4,885 Sample Correlations (Group number 1) ppltrst trstprl stfeco ppltrst 1, trstprl,35 1, stfeco,281,554 1, 35

37 Notes for Model (Default model) Computation of degrees of freedom (Default model) Number of distinct sample moments: 6 Number of distinct parameters to be estimated: 5 Degrees of freedom (6-5): 1 Result (Default model) Minimum was achieved Chi-square = 46,528 Degrees of freedom = 1 Probability level =, Estimates (Group number 1 - Default model) Maximum Likelihood Estimates Regression Weights: (Group number 1 - Default model) trstprl <--- ppltrst stfeco <--- trstprl Estimate S.E. C.R. P Label,337,22 15,479 *** par_1,499,16 32,156 *** par_2 36

38 Standardized Regression Weights: (Group number 1 - Default model) Estimate trstprl <--- ppltrst,35 stfeco <--- trstprl,554 Variances: (Group number 1 - Default model) ppltrst d2 d3 Estimate S.E. C.R. P Label 4,956,145 34,125 *** par_3 5,458,16 34,125 *** par_4 3,383,99 34,125 *** par_5 Squared Multiple Correlations: (Group number 1 - Default model) ### R 2 ### Estimate trstprl,93 stfeco,37 Standardized Residual Covariances (Group number 1 - Default model) ppltrst trstprl stfeco ppltrst, trstprl,, stfeco 5,33,, 37

39 Degrees of freedom introduce restrictions in the covariance space. From the equations σ21 = β21 φ11, σ32 = β32 σ22 and σ31 = β32 β21 φ11, the model implies the restriction σ31 = σ21 σ32 / σ22. This derives from many explicit or implicit restrictions of the model. The existence of degrees of freedom implies higher parsimony. It is a true model in the scientific sense, which is a simplification of reality.
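As an illustration of this restriction, a small Python sketch (parameter estimates taken from the Amos output above; treat them as approximate) rebuilds the implied covariances by path analysis. The only element the model cannot reproduce is the ppltrst-stfeco covariance, which is exactly what the single degree of freedom tests:

```python
# Parameter estimates reported by Amos for the indirect-effect model
beta21, beta32 = 0.337, 0.499            # ppltrst -> trstprl, trstprl -> stfeco
phi11, psi22, psi33 = 4.956, 5.458, 3.383

# Implied covariances by path analysis for v2 = b21*v1 + d2, v3 = b32*v2 + d3
sigma11 = phi11
sigma21 = beta21 * phi11                 # reproduces the sample covariance 1.668
sigma22 = beta21**2 * phi11 + psi22      # reproduces the sample variance of trstprl
sigma31 = beta32 * beta21 * phi11        # about 0.83, versus the sample value 1.382
sigma32 = beta32 * sigma22
sigma33 = beta32**2 * sigma22 + psi33

print(round(sigma31, 3), round(sigma32, 3), round(sigma33, 3))
```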

40 The existence of degrees of freedom affects estimation. In general, no p vector of estimates will exactly satisfy Σ(p) = S. Estimation consists in finding a p vector that leads to an S - Σ(p) matrix with small values. A function of all elements in S - Σ(p), called the fit function, is minimized (F min). The existence of degrees of freedom makes it possible to test the model fit. A model with df = 0 leads to a p vector that always fulfils Σ(p) = S, or S - Σ(p) = 0, and thus perfectly fits any data set. In a correct model with df > 0, Σ(π) = Σ in the population and Σ(p) ≠ S in the sample. If S - Σ(p) contains large values, we can say that some of the restrictions are false. If assumptions are fulfilled and under H0 (null hypothesis: the model contains all necessary parameters), a transformation of the minimum value of the fit function follows a χ², which makes it possible to test the model restrictions (significance of omitted parameters). Note that standard testing procedures in statistical modelling (e.g. t-values) test the parameters which are present in the model.

41 Specification errors Errors such as the omission of important explanatory variables, the omission of model parameters, or the inclusion of wrong restrictions are known as specification errors. Specification errors are frequent. In general, a specification error can bias any parameter estimate. If the model is incorrect because v3 receives a direct effect from v1: v2 = β21 v1 + d2 v3 = β31 v1 + β32 v2 + d3 and we apply path analysis, then we observe that the new parameter affects σ31 and σ33:

42 If we fit the model to these covariances, we find σ31 to be affected by the absent β31 parameter but fitted only by the present parameters β21 and β32. β̂21 and β̂32 will be biased.

43 Attempts must be made to detect specification errors by all means, both statistical and theoretical: Specification errors are undetectable in any model with df = 0. They are also undetectable if they involve variables that are NOT in the model. It can happen that many models with different interpretations have a similarly good fit, even an exactly equal fit (equivalent models). The following model has a completely different causal interpretation: v1 = β12 v2 + d1 v2 = β23 v3 + d2 (v3, d1, d2)' ~ N(0, diag(φ33, ψ11, ψ22)) EXERCISE: Derive the Σ(π) system for this model (equivalent to the previous model).

44 If we estimate a general model: v1 = β12 v2 + d1 v2 = β21 v1 + β23 v3 + d2 v3 = β32 v2 + d3 (d1, d2, d3)' ~ N(0, diag(ψ11, ψ22, ψ33)) then the parameter vector includes 7 elements, π = (ψ11, ψ22, ψ33, β12, β21, β23, β32), versus 6 (3·4/2) equations: infinite number of solutions (underidentified model).

45 Identification of the model Degrees of freedom (df) = elements of the matrix S - parameters to be estimated = [p(p+1)]/2 - parameters, where p is the number of observed variables. Just identified model (df = 0): the number of data variances and covariances equals the number of parameters. The model yields a unique solution for all parameters, but scientifically it is not interesting because without degrees of freedom it can never be rejected. No goodness of fit assessment of the model is possible. Overidentified model (df > 0): It allows rejecting the model. It allows analyzing the discrepancy between S and Σ(p), thereby rendering it of scientific use. The aim in SEM is then to specify overidentified models. Underidentified model: infinite number of solutions. Not useful at all
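This counting rule is easy to script; a minimal sketch (function and variable names are illustrative):

```python
def degrees_of_freedom(p_observed, n_free_parameters):
    """df = non-duplicated sample moments minus free parameters."""
    moments = p_observed * (p_observed + 1) // 2
    return moments - n_free_parameters

# Counts taken from the slides:
print(degrees_of_freedom(2, 3))    # simple regression: 3 - 3 = 0 (just identified)
print(degrees_of_freedom(3, 5))    # indirect-effect model: 6 - 5 = 1 (overidentified)
print(degrees_of_freedom(9, 20))   # full SEM example later: 45 - 20 = 25
```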

46 Simple regression model with errors in the explanatory variable. Introduction to models with measurement error The observed explanatory variable (v1) is measured with error (e1). The unobservable error-free value f1 is called factor or latent variable. f2 is observed because e2 is for the moment assumed to be zero. Two equation types: Relating factors to one another: f2 = β21 f1 + d2 Relating factors to observed variables or indicators: v1 = f1 + e1 v2 = f2

47 Assumptions: Measurement errors are uncorrelated with factors (as in factor analysis). Disturbances are uncorrelated with the explanatory factor (as in regression). (f1, e1, d2)' ~ N(0, diag(φ11, θ11, ψ22)) These assumptions make it possible to decompose the variance of observed variables into true score variance (explained by factors) and measurement error variance. R2 is called measurement quality and is represented as κ.

48 The structural equations become: σ11 = φ11 + θ11; σ21 = β21 φ11; σ22 = β21² φ11 + ψ22. Underidentified model: 4 parameters (φ11, θ11, β21, ψ22) and three variances and covariances (only those of observed variables count). The OLS estimator assumes that θ11 = 0, which is a specification error and leads to bias. The probability limit of the OLS estimator is: β̂21 = s21/s11 → β21 φ11/(φ11 + θ11) = β21 κ1 and is thus biased unless κ1 = 1.
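A quick simulation illustrates this attenuation (a sketch with arbitrary illustrative parameter values, not the course data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta21, phi11, theta11, psi22 = 0.5, 1.0, 0.5, 1.0   # illustrative values

f1 = rng.normal(0.0, np.sqrt(phi11), n)              # error-free factor
v1 = f1 + rng.normal(0.0, np.sqrt(theta11), n)       # indicator with measurement error
f2 = beta21 * f1 + rng.normal(0.0, np.sqrt(psi22), n)

C = np.cov(v1, f2)
ols_slope = C[0, 1] / C[0, 0]                        # OLS regression of f2 on v1
kappa1 = phi11 / (phi11 + theta11)                   # reliability of v1
print(ols_slope, beta21 * kappa1)                    # both close to 0.333
```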

49 Simple linear regression model with multiple indicators The solution to measurement error bias in SEM involves the use of multiple indicators, at least of the explanatory latent variables. The equations relating factors to indicators become: f 2 =β 21 f 1 +d 2 v 1 =1* f 1 +e 1 v 2 =f 2 v 3 =λ 31 *f 1 +e 3 48

50 The equation includes a loading λ31 (L31) which relates the scales of f1 and v3: The researcher must fix the latent variable scale, usually by anchoring it to the measurement units of an indicator whose λ equals 1. Standardized instead of raw loadings are usually interpreted. If there is only one factor per indicator, they lie within -1 and +1 and equal the square root of κ. New assumption of uncorrelated measurement errors of different indicators: (f1, e1, e3, d2)' ~ N(0, diag(φ11, θ11, θ33, ψ22))

51 Addition of v 3 in the structural equations. This is an exactly identified model, all of whose parameters can be solved, even those related to unobservable variables. The extent to which multiple indicators of the same construct converge (correlate) provides information to estimate the parameters

52 Applied example f2 = v2 = trstprl is explained by Social Trust: f1 = SocT, measured by its two indicators (v1 = ppltrst and v3 = pplhlp): ppltrst = 1*SocT + e1 pplhlp = λ31*SocT + e3 v2 = trstprl = f2 trstprl = β21*SocT + d2

53 The equivalence between the observed and latent dependent variable makes it possible to simplify the path diagram and equations as: ppltrst = 1*SocT + e1 pplhlp = λ31*SocT + e3 trstprl = β21*SocT + d2 We define a latent variable called Social Trust, measured by ppltrst and pplhlp. The loading of ppltrst (first indicator) is constrained to 1 in order to fix the scale of the latent variable. Each indicator automatically receives a θ (error variance, ei) parameter. The regression is of trstprl (observed, dependent) on Social Trust (latent, explanatory). This automatically defines a β (regression weight), a φ (variance of the independent variable) and a ψ (disturbance) parameter.

54 AMOS OUTPUT 53

55 The model is recursive. Sample size = 233 Variable Summary (Group number 1) Your model contains the following variables (Group number 1) Observed, endogenous variables ppltrst pplhlp trstprl Unobserved, exogenous variables SocialTrust e3 e1 d2 Variable counts (Group number 1) Number of variables in your model: 7 Number of observed variables: 3 Number of unobserved variables: 4 Number of exogenous variables: 4 Number of endogenous variables: 3 54

56 Parameter Summary (Group number 1) Weights Covariances Variances Means Intercepts Total Fixed 4 4 Labeled Unlabeled Total Sample Moments (Group number 1) Sample Covariances (Group number 1) trstprl pplhlp ppltrst trstprl 6,2 pplhlp 1,369 5,93 ppltrst 1,668 2,55 4,956 Sample Correlations (Group number 1) trstprl pplhlp ppltrst trstprl 1, pplhlp,247 1, ppltrst,35,49 1, 55

57 Notes for Model (Default model) Computation of degrees of freedom (Default model) Number of distinct sample moments: 6 Number of distinct parameters to be estimated: 6 Degrees of freedom (6-6): 0 Result (Default model) Minimum was achieved Chi-square = ,000 Degrees of freedom = 0 Probability level cannot be computed

58 Estimates (Group number 1 - Default model) Scalar Estimates (Group number 1 - Default model) Maximum Likelihood Estimates Regression Weights: (Group number 1 - Default model) Estimate S.E. C.R. P Label ppltrst <--- SocialTrust 1, fixed at 1 pplhlp <--- SocialTrust,821,69,972 *** par_1 free 31 trstprl <--- SocialTrust,666,56,954 *** par_2 Standardized Regression Weights: Estimate ppltrst <--- SocialTrust,7 pplhlp <--- SocialTrust,575 trstprl <--- SocialTrust,43 Squared Multiple Correlations: (Group number 1 - Default model) Estimate We can compute R 2 trstprl,185,475 2 =,185 pplhlp,331,575 2 =,331 ppltrst,55,7 2 =.55 57

59 Variances: (Group number 1 - Default model) Estimate S.E. C.R. P Label SocialTrust 2,55,239 1,496 *** par_3 e3 3,47,169 2,161 *** par_4 33 e1 2,452,215,41 *** par_5 d2 4,99,17 28,942 *** par_6 22 Matrices (Group number 1 - Default model) Implied (for all variables) Covariances (Group number 1 - Default model) SocialTrust trstprl pplhlp ppltrst SocialTrust 2,55 trstprl 1,668 6,2 pplhlp 2,55 1,369 5,93 ppltrst 2,55 1,668 2,55 4,956 Implied (for all variables) Correlations (Group number 1 - Default model) SocialTrust trstprl pplhlp ppltrst SocialTrust 1, trstprl,43 1, pplhlp,575,247 1, ppltrst,7,35,49 1, 58

60 Full SEM model Example: Independent variables (8): 6 errors: e1, e2, e3, e4, e5, e6 1 disturbance: d1 1 latent variable: SocialTrust Dependent variables (7): 6 represent observed variables: ppltrst; pplfair; pplhlp; trstprl; trstplt; trstlgl 1 represents an unobserved variable (or factor Political Trust). 59

61 Definition of the model: Model Equations Political Trust =? * SocialTrust + d1 ppltrst = 1* SocialTrust + e1 pplfair =? * SocialTrust + e2 pplhlp =? * SocialTrust + e3 trstprl = 1*Political Trust + e4 trstplt =? * Political Trust + e5 trstlgl =? * Political Trust + e6 Structural model Social Trust measurement scale model Political trust measurement scale model Variances of independent variables e1 =? ; e2 =?; e3 =?; e4=?; e5 =?; e6 =?; SocialTrust =?; d1 =? 6

62 Rules for determining the model parameters Rule 1: All the variances of the independent variables are parameters Rule 2: All covariances between independent variables are parameters Rule 3: All factor loadings between latent variables and their indicators are parameters Rule 4: All regression coefficients between observed or latent variables are parameters Rule 5: (i) The variances of dependent variables, (ii) the covariances between dependent variables and (iii) the covariances between dependent and independent variables are never parameters (they are explained by other parameters of the model) Rule 6: For each latent variable its metric must be set: For an independent latent variable, two ways: Set its variance to a constant (usually 1) Fix a factor loading (λ) between the latent variable and one of its indicators (usually to 1)

63 Determining the model parameters: For a latent dependent variable there is only one way: fix a coefficient between it and one of its observed variables to a constant (usually 1) An equation for each variable (latent or observed) that receives a one-way arrow (dependent variables) (7) As many variances as independent variables (8) As many covariances as two-way arrows (0)

64 Computation of degrees of freedom (Default model) Number of distinct sample moments: 21 = (6·7)/2 Number of distinct parameters to be estimated: 13 8 variances of independent variables 4 coefficients of latent factors with indicators 1 regression coefficient Degrees of freedom (21-13): 8

65 Determining the model parameters: Adding a covariance between e5 and e6, we introduce a parameter, and we lose one degree of freedom. Computation of degrees of freedom (Default model) Number of distinct sample moments: 21 Number of distinct parameters to be estimated: 14 Degrees of freedom (21-14): 7 64

66 Confirmatory Factor Analysis CFA. Introduction to reliability and validity assessment (Path diagram: two correlated factors, φ12, with loadings λ21, λ32, λ42 and error variances θ11-θ44)

67 This model does not contain equations relating factors to one another but only covariances. All factors are exogenous. No β or ψ parameters, only λ, θ and φ. At least three indicators are needed for models with one factor and two for models with more factors. In CFA models it is possible to standardize factors to unit variances instead of fixing a loading to 1. Then the φ parameters are factor correlations. For 2 factors and 2 indicators each we have the following equations: v1 = λ11 f1 + e1 v2 = λ21 f1 + e2 v3 = λ32 f2 + e3 v4 = λ42 f2 + e4

68 The model has df=1. φ11 = φ22 = 1: (f1, f2, e1, e2, e3, e4)' multivariate normal, with Var(f1) = Var(f2) = 1, Cov(f1, f2) = φ12, and mutually uncorrelated errors with variances θ11, θ22, θ33, θ44.

69 The correlation between two indicators of the same factor depends on their loadings (λ), and the correlation between two indicators of different factors is attenuated with respect to the correlation between factors (effect of measurement error). A CFA model is likely to fit the data only if items of the same factor correlate highly, and more highly than items of different factors. We advise researchers to carefully examine the correlation matrix prior to fitting a CFA model.
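The attenuation is easy to see from the implied correlation matrix of a standardized two-factor CFA; a small numpy sketch (the loadings and the factor correlation are illustrative values, not the course data):

```python
import numpy as np

lam = np.array([0.9, 0.8, 0.85, 0.75])          # standardized loadings, sqrt(kappa)
pattern = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
Lambda = pattern * lam[:, None]                 # v1, v2 load on f1; v3, v4 on f2
Phi = np.array([[1.0, 0.6], [0.6, 1.0]])        # factor correlation phi12 = .6
Theta = np.diag(1.0 - lam**2)                   # unique (error) variances

R_implied = Lambda @ Phi @ Lambda.T + Theta
print(np.round(R_implied, 3))
# same-factor pair (v1, v2):  .9 * .8       = .720
# cross-factor pair (v1, v3): .9 * .85 * .6 = .459, attenuated below phi12 = .6
```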

70 Reliability for each item: trstprl κ1 = .735 trstlgl κ2 = .629 stfgov κ3 = .789 stfdem κ4 = ??? Correlations (the factor correlation is .862): ρ21 = ???? ρ32 = ???? ρ34 = ???? ρ42 = ????

71 Amos Output Variable Summary (Group number 1) Your model contains the following variables (Group number 1) Observed, endogenous variables trstlgl trstprl stfdem stfgov Unobserved, exogenous variables e2 e1 e4 SATCNTRY e3 PolTrust Variable counts (Group number 1) Number of variables in your model: 1 Number of observed variables: 4 Number of unobserved variables: 6 Number of exogenous variables: 6 Number of endogenous variables: 4 7

72 Sample Covariances (Group number 1)
         stfgov   stfdem   trstprl  trstlgl
stfgov   5,534
stfdem   3,792    5,182
trstprl  3,861    3,152    6,2
trstlgl  3,446    3,34     4,153    6,24
Sample Correlations (Group number 1)
         stfgov   stfdem   trstprl  trstlgl
stfgov   1,
stfdem   ,78      1,
trstprl  ,669     ,564     1,
trstlgl  ,588     ,583     ,68      1,
Notes for Model (Default model) Computation of degrees of freedom (Default model) Number of distinct sample moments: 10 Number of distinct parameters to be estimated: 9 Degrees of freedom (10-9): 1

73 Estimates (Group number 1 - Default model) Scalar Estimates (Group number 1 - Default model) Maximum Likelihood Estimates Regression Weights: (Group number 1 - Default model) stfgov <--- SATCNTRY stfdem <--- SATCNTRY trstlgl <--- PolTrust trstprl <--- PolTrust Estimate S.E. C.R. P Label 2,89,42 49,72 *** par_1 1,815,42 43,216 *** par_3 1,976,46 42,68 *** par_4 2,12,45 46,963 *** par_5 Standardized Regression Weights: Estimate stfgov <--- SATCNTRY,888 stfdem <--- SATCNTRY,797 trstlgl <--- PolTrust,793 trstprl <--- PolTrust,857 λ i 72

74 Squared Multiple Correlations: (1 - λi² = θi) Estimate stfgov ,789 1-,789=,211 stfdem ,635 1-,635=,365 trstlgl ,629 1-,629=,371 trstprl ,735 1-,735=,265 Covariances: (Group number 1 - Default model) SATCNTRY <--> PolTrust Estimate S.E. C.R. P Label ,862 , 76,185 *** par_2 Correlations: (Group number 1 - Default model) Estimate SATCNTRY <--> PolTrust ,862 The same value as the covariance. This is because the variance of the latent factors is fixed to 1: φ11 = φ22 = 1

75 Variances: (Group number 1 - Default model) Estimate S.E. C.R. P Label PolTrust 1, SATCNTRY 1, e2 2,3,98 23,458 *** par_6 e1 1,61,93 17,164 *** par_7 e4 1,887,79 23,765 *** par_8 e3 1,169,83 14,13 *** par_9 Implied (for all variables) Correlations (Group number 1 - Default model) PolTrust SATCNTRY stfgov stfdem trstprl trstlgl PolTrust 1, SATCNTRY,862 1, stfgov,766,888 1, stfdem,688,797,78 1, trstprl,857,739,656,589 1, trstlgl,793,684,68,545,68 1, 74

76 Purification of the measures Total item correlation serves as a criterion for initial assessment and purification. Various cut-off points are adopted: .3 by Cristobal et al. (2007) .4 by Loiacono et al. (2002) .5 by Francis and White (2002) and Kim and Stoel (2004) Wolfinbarger and Gilly (2003) are rigorous in retaining only items that load at .5 or more on a factor, do not load at more than .5 on two factors, and have an item-total correlation of more than .4
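A corrected item-total correlation (each item against the sum of the remaining items) can be computed with a few lines of numpy; a sketch, with an arbitrary respondents-by-items data matrix X as input:

```python
import numpy as np

def corrected_item_total_correlations(X):
    """Correlation of each item (column) with the sum of the remaining items."""
    X = np.asarray(X, dtype=float)
    r = []
    for j in range(X.shape[1]):
        rest = np.delete(X, j, axis=1).sum(axis=1)
        r.append(np.corrcoef(X[:, j], rest)[0, 1])
    return np.array(r)

# Usage: items with values below the chosen cut-off (.3, .4 or .5) are
# candidates for removal.
```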

77 Reliability and Validity A measure is reliable to the extent that independent but comparable measures of the same trait or construct of a given object agree. Reliability depends on how much of the variation in scores is attributable to random or chance errors. If a measure is perfectly reliable, XR = 0. A measure is valid when the differences in observed scores reflect true differences on the characteristic one is attempting to measure and nothing else, that is, XO = XT. a) Reliability of individual items: o loadings greater than .5 on the respective construct (Hulland, 1999; White et al., 2003; Ribbink et al., 2004) o exhibit loadings with the intended construct of .7 or more, and are statistically significant (Ledden, 2007)

78 b) Reliability of a construct or internal consistency of the scale It allows checking the internal consistency of all indicators measuring the concept (the thoroughness with which all indicators measure the same thing) Internal homogeneity of a set of items: o Composite Reliability (CR) greater than .7 (Anderson and Gerbing, 1988; Bagozzi and Yi, 1988) o Correlation between each item and its construct > .5 and correlations among items from the same construct > .3. Cronbach's α greater than .7 (Fornell and Larcker, 1981; Nunnally and Bernstein, 1994)

79 c) Convergent validity: the extent to which a set of items assumed to represent a construct does in fact converge on the same construct: o Average variance extracted (AVE - the amount of variance that a construct obtains from its indicators relative to the amount of variance due to measurement error) greater than .5 (Fornell and Larcker, 1981; Chin and Newsted, 1999; Gounaris and Dimitriadis, 2003) o Factor loadings greater than .5 (Grewal et al., 1998). d) Discriminant validity (when there are several scales in the model): the extent to which measures of theoretically unrelated constructs do not correlate with one another: inter-factor correlations are less than the square root of the average variance extracted (AVE) (Fornell and Larcker, 1981)

80 Numerical example: Reliability of a construct Reliability (SATCNTRY) = (Σλ)² / [(Σλ)² + Σθ] = (.888 + .797)² / [(.888 + .797)² + (.211 + .365)] = .831 Reliability (PolTrust) = (.857 + .793)² / [(.857 + .793)² + (.265 + .371)] = .8

81 Convergent validity 1st) Factor loadings are significant and greater than .5 2nd) Average Variance Extracted (AVE) for each of the factors > .5. Squared Multiple Correlations: Estimate AVE stfgov ,789 stfdem ,635 (,789 + ,635)/2 = ,712 SATCNTRY trstlgl ,629 trstprl ,735 (,629 + ,735)/2 = ,682 PolTrust

82 Discriminant Validity of constructs Average variance extracted (AVE). For this, a construct must share more variance with its indicators than with other constructs of the model. Discriminant validity holds when the square root of the AVE for each pair of factors > estimated correlation between those factors StfCntry PolTrust StfCntry .843 PolTrust sqrt(.712)=.832 sqrt(.682)=.823 Correlations: Estimate SATCNTRY <--> PolTrust ,862
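These reliability and validity figures follow directly from the standardized loadings; a minimal numpy sketch (function names are illustrative) reproduces them:

```python
import numpy as np

def composite_reliability(std_loadings):
    lam = np.asarray(std_loadings, dtype=float)
    theta = 1.0 - lam**2                     # error variances of standardized items
    return lam.sum()**2 / (lam.sum()**2 + theta.sum())

def average_variance_extracted(std_loadings):
    lam = np.asarray(std_loadings, dtype=float)
    return float(np.mean(lam**2))

# Standardized loadings from the CFA output above
print(composite_reliability([0.888, 0.797]))         # SATCNTRY, about .83
print(composite_reliability([0.857, 0.793]))         # PolTrust, about .81
print(average_variance_extracted([0.888, 0.797]))    # about .71
print(average_variance_extracted([0.857, 0.793]))    # about .68
```

Discriminant validity is then checked by comparing the square root of each AVE with the estimated factor correlation (.862 here).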

83 Exploratory Factor Analysis (EFA) 82

84 Comparison with exploratory factor analysis (EFA) v1 = λ11 f1 + λ12 f2 + e1 v2 = λ21 f1 + λ22 f2 + e2 v3 = λ31 f1 + λ32 f2 + e3 v4 = λ41 f1 + λ42 f2 + e4 with (f1, f2, e1, e2, e3, e4)' multivariate normal. In EFA, which items measure which dimensions is the outcome; in CFA it is the input. In EFA questionnaire items aim globally at a broad concept; in CFA each questionnaire item is designed to tackle a specific dimension of the concept.

85 Random and systematic error. Reliability and validity assessment with CFA Reliability: Extent to which a measurement procedure would yield the same result upon several independent trials under identical conditions. In other words, low random measurement error (any systematic error would replicate). Random measurement error is a problem for OLS regression but not for SEM with multiple indicators, because it is accounted for by the θ parameters. Validity: Extent to which a measurement procedure measures what it is intended to measure and only what it is intended to measure, except for random measurement error. In other words, absence of systematic error. Assuming the validity of v, its reliability is the percentage of variance explained by f. Always follow this golden rule: Estimate reliability after validity has been diagnosed. Test the specification of measurement equations in a CFA model prior to specifying equations relating factors. Otherwise, relationships among factors might be biased (specification errors) or even meaningless (invalidity).

86 Construct validation: Estimate a CFA model that assumes validity... All items load on the factor they are supposed to measure (a second loading is a sign of measuring another factor which is in the model). No error correlations are specified (error correlations contain common unknown variance, a sign of measuring an unknown factor which is not in the model)....and diagnose its goodness of fit. You can never be certain of validity, but a CFA model can help detect signs of invalidity such as: It does not correctly reproduce the covariance matrix (additional loadings or error correlations are needed, thus revealing mixed items, additional necessary dimensions). Some variables have a κ that is too low to be attributed solely to random error (convergent invalidity). Some factors have correlations very close to unity (discriminant invalidity). Some factors have correlations of unexpected signs or magnitudes (nomological invalidity).

87 Modelling stages in SEM Verbal theories 1) SPECIFICATION: model (equations and assumptions) 2) IDENTIFICATION: estimable model 3) DATA COLLECTION: exploratory data analysis; computation of S 4) ESTIMATION: methods to fit Σ(p) to S 5) FIT DIAGNOSTICS: discrepancies between Σ(p) and S. ADEQUATE? If NO: MODIFICATION and back to specification. If YES: 6) UTILIZATION: theory validation, validity and reliability assessment...

88 1) Specification Theoretical and statistical grounds Formal establishment of a statistical model: set of statistical and substantive assumptions that structure the data according to a theory. Equations: one or two of the following systems of equations: Relating factors or error-free variables to one another (structural equations). Relating factors to indicators with error (measurement equations). Parameters: two types: o Free (unknown and freely estimated). o Fixed (known and constrained to a given value, usually 0 or 1). The amount of the researcher's prior knowledge will affect the modelling strategy: If this knowledge is exhaustive and detailed, it will be easily translated into a model specification. The researcher's aim will simply be to use the data to estimate and confirm or reject the model (confirmatory strategy). If this knowledge is less exhaustive and detailed, the fixed or free character of a number of parameters will be dubious. This will lead to a model modification process by repeatedly going through the modelling stages (exploratory strategy).

89 Full SEM model: 88

90 Equations: StfCntry = β31 * SocialTrust + d3 PolTrust = β21 * SocialTrust + d2 ppltrst = 1*SocialTrust + e1 pplfair = λ21 * SocialTrust + e2 pplhlp = λ31 * SocialTrust + e3 trstprl = 1*Political Trust + e4 trstplt = λ52 * Political Trust + e5 trstlgl = λ62 * Political Trust + e6 stfeco = λ73 * StfCntry + e7 stfgov = λ83 * StfCntry + e8 stfdem = 1 * StfCntry + e9 Total number of parameters = 20 Variances of independent variables e1 = θ11; e2 = θ22; e3 = θ33; e4 = θ44; e5 = θ55; e6 = θ66; e7 = θ77; e8 = θ88; e9 = θ99; SocialTrust = φ11; d2 = ψ22; d3 = ψ33
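In matrix form, the covariance matrix implied by this kind of full SEM model can be written as Σ(p) = Λ(I - B)⁻¹ Ψ (I - B)⁻¹' Λ' + Θ (all-y notation). A numpy sketch with placeholder values (the numbers are not the fitted estimates) shows how the 20 free parameters map onto the 45 sample moments:

```python
import numpy as np

# Sketch of Sigma(p) = Lambda (I-B)^-1 Psi (I-B)^-1' Lambda' + Theta.
# All numerical values below are placeholders, not the fitted estimates.
k, m = 9, 3                                  # observed variables, factors

Lambda = np.zeros((k, m))                    # rows: ppltrst, pplfair, pplhlp, trstprl,
Lambda[0, 0] = 1.0                           #   trstplt, trstlgl, stfeco, stfgov, stfdem
Lambda[1, 0] = Lambda[2, 0] = 0.9            # free loadings on SocialTrust (lambda21, lambda31)
Lambda[3, 1] = 1.0
Lambda[4, 1] = Lambda[5, 1] = 0.9            # free loadings on Political Trust (lambda52, lambda62)
Lambda[8, 2] = 1.0
Lambda[6, 2] = Lambda[7, 2] = 0.9            # free loadings on StfCntry (lambda73, lambda83)

B = np.zeros((m, m))                         # regressions among factors
B[1, 0] = 2.0                                # PolTrust <- SocialTrust (beta21)
B[2, 0] = 1.6                                # StfCntry <- SocialTrust (beta31)

Psi = np.diag([0.9, 0.8, 0.6])               # var(SocialTrust), var(d2), var(d3)
Theta = np.diag(np.full(k, 2.0))             # measurement error variances theta11..theta99

inv = np.linalg.inv(np.eye(m) - B)
Sigma = Lambda @ inv @ Psi @ inv.T @ Lambda.T + Theta
print(Sigma.shape)                           # (9, 9): 45 moments for 20 free parameters
```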

91 2) Identification Can model parameters be derived from variances and covariances? Identification must be studied prior to data collection If a model is not identified: o Seek more restrictive specifications with additional constraints (if theoretically justifiable). o Add more indicators or more exogenous factors. Identification conditions o Underidentification (df < 0): infinite number of solutions that make S equal to Σ(p). o Possibly identified (df = 0): there may be a unique solution that makes S equal to Σ(p). This type of model is less interesting in that its restrictions are not testable. o Possibly overidentified (df > 0): there may be a unique solution that minimizes discrepancies between S and Σ(p). Only these models, more precisely their restrictions, can be tested from the data.

92 Example Full SEM model: 9 observed variables lead to (9·10/2) = 45 variances and covariances: possibly overidentified model. Total number of parameters = 20 Degrees of Freedom = 45 - 20 = 25 > 0 The model fulfils enough sufficient conditions: 1) Equations relating factors are recursive 2) Disturbances are uncorrelated 3) All factors have at least two pure indicators.

93 3) Data collection and exploratory analyses Valid sampling methods In their standard form, SEM assume simple random sampling. Extensions to stratified and cluster samples have been recently developed. In any case, they must be random samples. Sample size Sample sizes in the 200-500 range are usually enough. Sample requirements increase: For smaller R2 and percentages of explained variance. When collinearity is greater. For smaller numbers of indicators per factor (especially fewer than 3). Under non-normality, the required sample size is larger (in the 400-800 range). Outlier and non-linearity detection As before any other type of statistical modeling, outliers and non-linear relationships must be detected by means of exploratory data analysis.

94 4) Estimation First estimate the sample variances and covariances (S) and then find the best fitting p parameter values. A fit function related to the size of the residuals in S - Σ(p) is minimized. Each choice of fit function results in an alternative estimation method. One of these choices leads to the maximum likelihood estimator (ML), which is the most often used. Estimation assumes that a covariance matrix is analyzed. Estimates obtained from a correlation matrix are correct only under very specific conditions.

95 Normality assumed: ML and GLS Normality not assumed: ULS, Scale-free LS, ADF The two most commonly used estimation techniques are maximum likelihood (ML) and normal theory generalized least squares (GLS). ML and GLS: large sample size, continuous data, and assumption of multivariate normality Unweighted least squares (ULS): scale dependent. Asymptotically distribution free (ADF) (weighted least squares, WLS): serious departure from normality.
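For reference, the normal-theory ML fit function that is minimized has a closed form in S and Σ(p); a sketch, assuming the usual (N - 1)·F_min chi-square statistic:

```python
import numpy as np

def ml_fit_function(S, Sigma_p):
    """Normal-theory ML discrepancy between sample and implied covariance matrices."""
    k = S.shape[0]
    return (np.log(np.linalg.det(Sigma_p)) + np.trace(S @ np.linalg.inv(Sigma_p))
            - np.log(np.linalg.det(S)) - k)

def chi_square(S, Sigma_p, n):
    """Test statistic: (N - 1) times the minimized fit function."""
    return (n - 1) * ml_fit_function(S, Sigma_p)
```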

96 Examining the ordered correlation matrix Let us look at the correlations as well and spot low correlations between items measuring the same dimension or large correlations between items measuring different dimensions:
         stfeco  stfgov  stfdem  ppltrst pplfair pplhlp  trstprl trstplt trstlgl
stfeco   1,
stfgov   ,718    1,
stfdem   ,653    ,78     1,
ppltrst  ,281    ,276    ,276    1,
pplfair  ,285    ,273    ,275    ,543    1,
pplhlp   ,298    ,277    ,262    ,49     ,442    1,
trstprl  ,554    ,669    ,564    ,35     ,289    ,247    1,
trstplt  ,515    ,642    ,53     ,28     ,258    ,217    ,719    1,
trstlgl  ,526    ,588    ,583    ,38     ,288    ,241    ,68     ,6      1,

97 Amos output Variable Summary (Group number 1) Your model contains the following variables (Group number 1) Observed, endogenous variables trstlgl trstplt trstprl pplhlp pplfair ppltrst stfdem stfgov stfeco Unobserved, endogenous variables PolTrust StfCntry Unobserved, exogenous variables e6 e5 e4 SocialTrust e3 e2 e1 e9 e8 e7 d3 d2 96

98 Variable counts (Group number 1) Number of variables in your model: 23 Number of observed variables: 9 Number of unobserved variables: 14 Number of exogenous variables: 12 Number of endogenous variables: 11 Notes for Model (Default model) Computation of degrees of freedom (Default model) Number of distinct sample moments: 45 Number of distinct parameters to be estimated: 20 Degrees of freedom (45-20): 25 Result (Default model) Minimum was achieved Chi-square = 197,457 Degrees of freedom = 25 Probability level = ,000

99 Estimates (Group number 1 - Default model) Maximum Likelihood Estimates Regression Weights: (Group number 1 - Default model) Estimate S.E. C.R. P Label PolTrust <--- SocialTrust 1,985, 19,721 *** par_7 StfCntry <--- SocialTrust 1,67,87 19,2 *** par_8 trstlgl <--- PolTrust,897,21 43,529 *** par_1 trstplt <--- PolTrust,845,18 46,8 *** par_2 trstprl <--- PolTrust 1, pplhlp <--- SocialTrust,9,64 14,299 *** par_3 pplfair <--- SocialTrust,984,65 15,135 *** par_4 ppltrst <--- SocialTrust 1, stfdem <--- StfCntry 1, stfgov <--- StfCntry 1,155,24 47,27 *** par_5 stfeco <--- StfCntry,975,23 42,54 *** par_6 98

100 Standardized Regression Weights: Estimate PolTrust <--- SocialTrust,94 StfCntry <--- SocialTrust,899 trstlgl <--- PolTrust,775 trstplt <--- PolTrust,86 trstprl <--- PolTrust,877 pplhlp <--- SocialTrust,396 pplfair <--- SocialTrust,432 ppltrst <--- SocialTrust,44 stfdem <--- StfCntry,8 stfgov <--- StfCntry,894 stfeco <--- StfCntry,83 Variances: Estimate S.E. C.R. P Label SocialTrust,96,93 1,32 *** par_9 d3,636,76 8,417 *** par_1 d2,841,16 7,927 *** par_ e6 2,478,9 27,622 *** par_12 e5 1,778,68 25,983 *** par_13 e4 1,393,71 19,718 *** par_14 e3 4,296,13 32,982 *** par_15 e2 4,5,124 32,75 *** par_16 e1 3,996,122 32,639 *** par_17 e9 1,868,69 26,999 *** par_18 e8 1,5,6 18,45 *** par_19 e7 1,737,65 26,838 *** par_2 99

101 Squared Multiple Correlations: Estimate StfCntry,88 PolTrust,818 stfeco,644 stfgov,799 stfdem,64 ppltrst,194 pplfair,187 pplhlp,156 trstprl,769 trstplt,65 trstlgl,61 1

102 Implied (for all variables) Correlations (Group number 1 - Default model) SocialTrust StfCntry PolTrust stfeco stfgov stfdem ppltrst pplfair pplhlp trstprl trstplt trstlgl SocialTrust 1, StfCntry,899 1, PolTrust,94,813 1, stfeco,722,83,653 1, stfgov,83,894,727,717 1, stfdem,719,8,65,642,715 1, ppltrst,44,396,398,318,354,316 1, pplfair,432,389,391,312,347,3,19 1, pplhlp,396,356,358,285,318,284,174,171 1, trstprl,793,713,877,572,637,57,349,343,314 1, trstplt,729,655,86,526,586,524,321,315,288,77 1, trstlgl,71,63,775,56,563,54,39,33,277,679,625 1,

103 5) Fit diagnostics Interpretation does not proceed until the goodness of fit has been assessed. The fit diagnostics attempt to determine if the model is correct and useful. o Correct model: its restrictions are true in the population. Relationships are correctly specified without the omission of relevant parameters. o In a correct model, the differences between S and Σ(p) are small and random. o Correctness must not be strictly understood. A model must be an approximation of reality, not an exact copy of it. o Thus, a good model will be a compromise between parsimony and approximation. Diagnostics will usually do well at distinguishing really badly fitting models from fairly well fitting models. Many models will fit fairly well (even exactly equally well if equivalent) and will be hard to distinguish statistically; they can only be distinguished theoretically.

104 The χ² goodness of fit statistic Null hypothesis: the model is correct, without omitted relevant parameters: H0: Σ = Σ(π) The χ² goodness of fit statistic follows a χ² distribution with g degrees of freedom. Rejection implies concluding that some relevant parameters have been omitted. Sample size and power of the test are often high. Researchers are usually willing to accept approximately correct models with small misspecifications, which are rejected due to the high power. Quantifying the degree of misfit is more useful than testing the hypothesis of exact fit.

105 5.1 Global diagnostics First look for serious problems (common for small samples, very badly fitting models, and models with two indicators per factor): o Lack of convergence of the estimation algorithm. o Underidentification. Notes for Model (Default model) Computation of degrees of freedom (Default model) Number of distinct sample moments: 10 Number of distinct parameters to be estimated: 12 Degrees of freedom (10-12): -2 Result (Default model) The model is probably unidentified. In order to achieve identifiability, it will probably be necessary to impose 1 additional constraint. o Inadmissible estimates (e.g. negative variances, correlations larger than 1...).

106 Fix negative non-significant variances to zero. Revise the model if there are significant negative variances. Merge into a single factor pairs of factors with correlations larger than 1 or not significantly lower than 1.

107 The Tucker and Lewis (1973) index (TLI) and Bentler's (1990) comparative fit index (CFI) introduce the degrees of freedom of the base (g_b) and researcher (g) models to account for parsimony. They will increase after adding parameters only if the χ² statistic decreases more substantially than g. TLI = (χ²_b/g_b - χ²/g) / (χ²_b/g_b - 1) CFI = min[ ((χ²_b - g_b) - (χ² - g)) / (χ²_b - g_b) ; 1 ] Root mean squared error of approximation (RMSEA) (Steiger, 1990): RMSEA = sqrt( max(χ² - g; 0) / (g·N) ) Values below .05 are considered acceptable. The sampling distribution is known, which makes it possible to compute confidence intervals and test the hypothesis of approximate fit. If both extremes of the interval are larger than .05, a very bad fit can be concluded. If both extremes are below .05, a very good fit can be concluded.
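A sketch of these indices as Python functions (chi2_b and df_b refer to the baseline/independence model, n is the sample size; the formulas follow the definitions above):

```python
import math

def tli(chi2, df, chi2_b, df_b):
    return (chi2_b / df_b - chi2 / df) / (chi2_b / df_b - 1.0)

def cfi(chi2, df, chi2_b, df_b):
    return min(((chi2_b - df_b) - (chi2 - df)) / (chi2_b - df_b), 1.0)

def rmsea(chi2, df, n):
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))   # values below .05: close fit
```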

108 5.2. Detailed diagnostics Are standardized estimated values reasonable and of the expected sign? Are there significant residuals that suggest the addition of parameters? (To estimate them in Amos: Analysis properties \ Output \ Residual moments). The values are t-values. Each residual covariance has been divided by an estimate of its standard error. In sufficiently large samples, these standardized residual covariances have a standard normal distribution if the model is correct. So, if the model is correct, most of them should be less than two in absolute value. Are there low R2 values suggesting the omission of explanatory variables, or low κ values suggesting a lack of validity? (To estimate them in Amos: Analysis properties \ Output \ Squared Multiple Correlations)

109 The modification index is an individual significance test of omitted parameters (H0: the omitted parameter is zero in the population). o Reject the hypothesis above the critical χ² value with 1 df for a type I risk of 5%. o Always consider the expected standardized estimated parameter and its sign: if power is high, parameters of a substantially insignificant value can be statistically significant. Only add parameters of a substantial size. Residuals and modification indices can suggest the addition of parameters in order to improve fit. A model can also be improved by dropping irrelevant parameters (parsimony principle). The usual t statistic tests the significance of included parameters (H0: the included parameter is zero in the population). o Non-significant disturbance covariances and measurement error covariances should be dropped from the model. Non-significant parameters may be dropped from the model if their theoretical argumentation is weak. Non-significant λ parameters reveal invalidity.

110 Standardized Residual Covariances (Group number 1 - Default model) stfeco stfgov stfdem ppltrst pplfair pplhlp trstprl trstplt trstlgl stfeco, stfgov,37, stfdem,442 -,255, ppltrst -1,694-3,55-1,881, pplfair -1,257-3,363-1,656 16,729, pplhlp,569-1,851-1,39,168 12,884, trstprl -,739 1,37 -,236-1,988-2,438-3,59, trstplt -,483 2,35,262-1,888-2,613-3,316,487, trstlgl,888 1,53 3,394 -,47 -,685-1,682,1-1,4, 19

111 Modification Indices Covariances: (Group number 1 - Default model) M.I. Par Change d2 <--> d3 37,757,224 e7 <--> d2 1,661 -,152 e8 <--> d2 59,293,325 e1 <--> d3 68,71 -,454 e1 <--> d2 21,676 -,34 e1 <--> e8 44,484 -,38 e2 <--> d3 55,293 -,41 e2 <--> d2 4,987 -,421 e2 <--> e8 42,665 -,374 e2 <--> e1 474,885 1,882 e3 <--> d3 9,736 -,177 e3 <--> d2 71,245 -,57 e3 <--> e7 7,221,17

112 Model Fit Summary RMR, GFI Model RMR GFI AGFI PGFI Default model,417,894,89,496 Saturated model, 1, Independence model 2,251,366,27,293 Baseline Comparisons Model NFI RFI IFI TLI Delta1 rho1 Delta2 rho2 CFI Default model,897,852,899,854,899 Saturated model 1, 1, 1, Independence model,,,,, RMSEA Model RMSEA LO 9 HI 9 PCLOSE Default model,136,129,143, Independence model,356,35,361, 1

113 5.3. Model modification. Capitalization on chance Frequently models fail to pass the diagnostics. Which modifications should be introduced, and in which order? o Introduce modifications one at a time, and carefully examine results before introducing the next. One modification can change the need for another. o First improve fit (add parameters). Then improve parsimony (drop parameters). o Disregard high modification indices with very small expected estimates. o Consider models with good descriptive fit indices, even if the χ² test rejects them (parsimony-approximation compromise). o Avoid adding theoretically uninterpretable parameters, no matter how significant. o Make few modifications. o The selected model must pass the diagnostics and be theoretically relevant and useful. o Modified models can be compared with CFI and RMSEA.

114 Model modification has some undesirable statistical consequences, especially if modifications are blindly done using only statistics, that is, without theory. Even if model modification has been done carefully, modifications are based on a particular sample. Have we reached a model that fits the population? Bias of estimates and significance tests: only large and significant parameters have been considered to be candidates for addition. The introduction of modifications that improve the fit to the sample but not to the population is known as capitalization on chance. The only solution is to check that the model fits well beyond the particular sample used: o Crossvalidation: estimation and goodness of fit test of the model on an independent sample of the same population. If only one sample is available, it can be split: the first half is used for model modification and the second for validation. Crossvalidation is successful if the model fits the second sample reasonably well. 3

115 Complete Example Modeling Stages: CFA model 4

116 trstprl = 1*Political Trust + e4 trstplt = λ52 * Political Trust + e5 trstlgl = λ62 * Political Trust + e6 stfeco = λ73 * StfCntry + e7 stfgov = λ83 * StfCntry + e8 stfdem = 1 * StfCntry + e9 Variances of independent variables e4 = θ44; e5 = θ55; e6 = θ66; e7 = θ77; e8 = θ88; e9 = θ99; Cov(StfCntry, PolTrust) = φ23; Var(StfCntry) = φ33; Var(PolTrust) = φ22

117 The model has 6·7/2 = 21 variances and covariances, and 13 parameters (6 error variances, 2 factor variances, 1 factor covariance, 4 loadings): 8 degrees of freedom. Each factor has at least 2 pure indicators: the measurement part is identified. In the complete model with β parameters, the factors are related in a recursive system without error covariances: it is identified. This model only has measurement equations. The loading of the first variable in each factor is equal to 1. The remaining loadings are free. Each observed variable also has an error variance. By default all factor variances and covariances are free. To constrain factors to be uncorrelated one would add the constrained parameter with a value of 0 in the Covariance:

118 Estimation Your model contains the following variables (Group number 1) Observed, endogenous variables trstlgl trstplt trstprl stfdem stfgov stfeco Unobserved, exogenous variables PolTrust e6 e5 e4 StfCntry e9 e8 e7 Variable counts (Group number 1) Number of variables in your model: 14 Number of observed variables: 6 Number of unobserved variables: 8 Number of exogenous variables: 8 Number of endogenous variables: 6 7

119 Parameter Summary (Group number 1) Weights Covariances Variances Means Intercepts Total Fixed 8 8 Labeled Unlabeled Total Sample Moments (Group number 1) Sample Covariances (Group number 1) stfeco stfgov stfdem trstprl trstplt trstlgl stfeco 4,885 stfgov 3,734 5,534 stfdem 3,284 3,792 5,182 trstprl 3,7 3,861 3,152 6,2 trstplt 2,565 3,45 2,721 3,977 5,8 trstlgl 2,898 3,446 3,34 4,153 3,37 6,24 8

120 Sample Correlations (Group number 1) stfeco stfgov stfdem trstprl trstplt trstlgl stfeco 1, stfgov,718 1, stfdem,653,78 1, trstprl,554,669,564 1, trstplt,515,642,53,719 1, trstlgl,526,588,583,68,6 1, Notes for Model (Default model) Computation of degrees of freedom (Default model) Number of distinct sample moments: 21 Number of distinct parameters to be estimated: 13 Degrees of freedom (21-13): 8 Result (Default model) Minimum was achieved Chi-square = 121,665 Degrees of freedom = 8 Probability level =, 9

121 Estimates (Group number 1 - Default model). Maximum Likelihood Estimates Regression Weights: (Group number 1 - Default model) Estimate S.E. C.R. P Label trstlgl <--- PolTrust,893,21 43,329 *** par_1 ### Significant### trstplt <--- PolTrust,848,18 46,413 *** par_2 ### Significant### trstprl <--- PolTrust 1, stfdem <--- StfCntry 1, stfgov <--- StfCntry 1,174,25 47,43 *** par_3 ### Significant### stfeco <--- StfCntry,97,23 41,41 *** par_4 ### Significant### Standardized Regression Weights: Estimate We can compute R 2 trstlgl <--- PolTrust,771,771 2 =,594 trstplt <--- PolTrust,81,81 2 =,656 trstprl <--- PolTrust,877 ### Large values ###,877 2 =,769 stfdem <--- StfCntry,795,795 2 =,632 stfgov <--- StfCntry,93,93 2 =,815 stfeco <--- StfCntry,795,795 2 =,632 12

122 Squared Multiple Correlations: (Group number 1 - Default model). This is R²
         Estimate
stfeco   ,631  = ,795²
stfgov   ,816  = ,903²
stfdem   ,632  = ,795²
trstprl  ,769  = ,877²
trstplt  ,656  = ,81²
trstlgl  ,595  = ,771²
Covariances: (Group number 1 - Default model)
                        Estimate  S.E.  C.R.    P    Label
PolTrust <--> StfCntry  3,281     ,129  25,485  ***  par_5   ### Significant ###
Correlations: (Group number 1 - Default model)

123
                        Estimate
PolTrust <--> StfCntry  ,843   ### Smaller than 1 ###
Variances: (Group number 1 - Default model)
           Estimate  S.E.   C.R.    P    Label
PolTrust   4,627     ,181   25,579  ***  par_6
StfCntry   3,275     ,147   22,256  ***  par_7
e6         2,513     ,9     27,853  ***  par_8
e5         1,75      ,68    25,858  ***  par_9   ### Positive ###
e4         1,393     ,7     19,843  ***  par_10
e9         1,97      ,7     27,4    ***  par_11
e8         1,2       ,59    17,269  ***  par_12
e7         1,8       ,66    27,421  ***  par_13
Implied (for all variables) Correlations (Group number 1 - Default model)
          StfCntry  PolTrust  stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
StfCntry  1,
PolTrust  ,843      1,

124
          StfCntry  PolTrust  stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco    ,795      ,67       1,
stfgov    ,93       ,761      ,718    1,
stfdem    ,795      ,67       ,632    ,718    1,
trstprl   ,739      ,877      ,587    ,667    ,587    1,
trstplt   ,682      ,81       ,542    ,616    ,542    ,71      1,
trstlgl   ,65       ,771      ,516    ,587    ,517    ,676     ,624     1,
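Both the factor correlation of ,843 and the implied correlations between indicators of different factors can be reproduced from the estimates above; for example, trstprl with stfdem:

import numpy as np

cov_f, var_pol, var_stf = 3.281, 4.627, 3.275          # factor covariance and factor variances
r_factors = cov_f / np.sqrt(var_pol * var_stf)
print(round(r_factors, 3))                              # ~0.843, the factor correlation

std_trstprl, std_stfdem = 0.877, 0.795                  # standardized loadings
print(round(std_trstprl * r_factors * std_stfdem, 3))   # ~0.588, matching the implied ,587 up to rounding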

125 FIT DIAGNOSTICS
Model Fit Summary
CMIN
Model                NPAR   CMIN       DF   P      CMIN/DF
Default model        13     121,665    8    ,000   15,208
Saturated model      21     ,000       0
Independence model   6      8726,415   15   ,000   581,761
RMR, GFI
Model                RMR    GFI    AGFI   PGFI
Default model        ,3     ,982   ,953   ,374
Saturated model      ,000   1,000
Independence model   2,88   ,342   ,78    ,244
Baseline Comparisons
Model                NFI Delta1   RFI rho1   IFI Delta2   TLI rho2   CFI
Default model        ,986         ,974       ,987         ,976       ,987
Saturated model      1,000                   1,000                   1,000
Independence model   ,000         ,000       ,000         ,000       ,000

126 RMSEA Model RMSEA LO 9 HI 9 PCLOSE Default model,78,66,91, Independence model,499,491,58, AIC Model AIC BCC BIC CAIC Default model 147, , , ,462 Saturated model 42, 42, , ,826 Independence model 8738, , , ,
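The incremental and absolute fit indices above follow from the chi-square values of the target and independence models with the standard formulas. A sketch of the arithmetic (the sample size is not printed on these slides, so n below is an assumed value chosen to be consistent with the reported RMSEA):

import numpy as np

chisq, df = 121.665, 8                 # default (target) model
chisq_b, df_b = 8726.415, 15           # independence (baseline) model
n = 2336                               # assumed sample size, not taken from the slides

cmin_df = chisq / df                                               # ~15.21
nfi = 1 - chisq / chisq_b                                          # ~0.986
cfi = 1 - max(chisq - df, 0) / max(chisq_b - df_b, 0)              # ~0.987
tli = ((chisq_b / df_b) - (chisq / df)) / ((chisq_b / df_b) - 1)   # ~0.976
rmsea = np.sqrt(max(chisq - df, 0) / (df * (n - 1)))               # ~0.078
print(round(cmin_df, 3), round(nfi, 3), round(cfi, 3), round(tli, 3), round(rmsea, 3))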

127 Matrices (Group number 1 - Default model)
Residual Covariances (Group number 1 - Default model)
         stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco   ,
stfgov   ,3      ,
stfdem   ,16     -,53    ,
trstprl  -,177   ,1      -,128   ,
trstplt  -,136   ,138    -,62    ,52      ,
trstlgl  ,55     ,7      ,374    ,21      -,136    ,
Standardized Residual Covariances (Group number 1 - Default model)   ### Values < 2 ### (only trstlgl-stfdem, at 2,827, exceeds this)
         stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco   ,
stfgov   ,23     ,
stfdem   ,86     -,387   ,
trstprl  -1,356  ,72     -,956   ,
trstplt  -1,161  1,66    -,515   ,367     ,
trstlgl  ,426    ,47     2,827   ,14      -,99     ,
Modification Indices (Group number 1 - Default model)

128 Covariances: (Group number 1 - Default model)
                    M.I.    Par Change
e7 <--> StfCntry    8,471   ,1
e7 <--> PolTrust    12,56   -,164
e8 <--> PolTrust    5,674   ,96
e9 <--> e7          1,516   ,146
e9 <--> e8          6,49    -,96
e4 <--> StfCntry    8,131   -,16
e4 <--> PolTrust    5,47    ,99
e4 <--> e7          5,2     -,
e4 <--> e9          12,892  -,164
e5 <--> e7          7,757   -,125
e5 <--> e8          22,745  ,193
e5 <--> e9          6,924   -,122
e5 <--> e4          4,317   ,88
e6 <--> StfCntry    8,9     ,135
e6 <--> PolTrust    6,25    -,134
e6 <--> e8          19,99   -,2
e6 <--> e9          62,688  ,427
e6 <--> e5          13,862  -,193
If you repeat the analysis treating the covariance between e7 and StfCntry as a free parameter, its estimate will become larger by approximately ,1 than it is in the present analysis.
PROBLEM: the modification index reports the expected change in a covariance; the corresponding change in the correlation is not known!
Regression Weights: (Group number 1 - Default model)

129
                      M.I.    Par Change
stfeco <--- trstprl   4,641   -,27
stfeco <--- trstplt   6,938   -,36
stfgov <--- trstplt   ,171    ,41
stfdem <--- trstlgl   22,12   ,59
trstprl <--- stfeco   5,41    -,31
trstprl <--- stfdem   8,843   -,4
trstplt <--- trstlgl  4,962   -,27
trstlgl <--- stfdem   28,944  ,85

130 Model modification: the largest modification index (62,688, for e6 <--> e9) suggests freeing the covariance between the errors of trstlgl and stfdem. The model is re-estimated with this extra parameter (path diagram shown on the slide; a syntax sketch follows below).
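In lavaan-style syntax the same modification amounts to one extra line freeing the covariance between the residuals of trstlgl and stfdem (their errors e6 and e9). Continuing the hypothetical semopy sketch from before (the package and the file name remain assumptions):

import pandas as pd
import semopy                          # assumed installed, as in the earlier sketch

desc_modified = """
PolTrust =~ trstprl + trstplt + trstlgl
StfCntry =~ stfdem + stfgov + stfeco
PolTrust ~~ StfCntry
trstlgl ~~ stfdem
"""

df = pd.read_csv("ess_politics.csv")   # hypothetical file, as in the earlier sketch
model2 = semopy.Model(desc_modified)   # the trstlgl ~~ stfdem line frees the e6 <--> e9 error covariance
model2.fit(df)
print(model2.inspect())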

131 Variable Summary (Group number 1)
Your model contains the following variables (Group number 1)
Observed, endogenous variables: trstlgl, trstplt, trstprl, stfdem, stfgov, stfeco
Unobserved, exogenous variables: PolTrust, e6, e5, e4, StfCntry, e9, e8, e7
Variable counts (Group number 1)
Number of variables in your model: 14
Number of observed variables: 6
Number of unobserved variables: 8
Number of exogenous variables: 8
Number of endogenous variables: 6

132 Sample Moments (Group number 1)
Sample Covariances (Group number 1)
         stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco   4,885
stfgov   3,734   5,534
stfdem   3,284   3,792   5,182
trstprl  3,7     3,861   3,152   6,2
trstplt  2,565   3,45    2,721   3,977   5,8
trstlgl  2,898   3,446   3,34    4,153   3,37     6,24
Sample Correlations (Group number 1)
         stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco   1,
stfgov   ,718    1,
stfdem   ,653    ,78     1,
trstprl  ,554    ,669    ,564    1,
trstplt  ,515    ,642    ,53     ,719    1,
trstlgl  ,526    ,588    ,583    ,68     ,6       1,

133 Notes for Model (Default model)
Computation of degrees of freedom (Default model)
Number of distinct sample moments: 21
Number of distinct parameters to be estimated: 14
Degrees of freedom (21-14): 7
Result (Default model)
Minimum was achieved
Chi-square = 56,72
Degrees of freedom = 7
Probability level = ,000
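Because the modified model differs from the original CFA by a single freed parameter, the two chi-squares can be compared with a likelihood-ratio (chi-square difference) test; a sketch using scipy:

from scipy.stats import chi2

delta_chisq = 121.665 - 56.72          # original CFA (df = 8) vs. model with e6 <--> e9 freed (df = 7)
delta_df = 8 - 7
print(round(delta_chisq, 3), chi2.sf(delta_chisq, delta_df))   # ~64.9 on 1 df, p < .001
# The drop of ~64.9 is close to the modification index of 62,688 reported for e6 <--> e9.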

134 Estimates (Group number 1 - Default model) Maximum Likelihood Estimates
Regression Weights: (Group number 1 - Default model)
                        Estimate  S.E.  C.R.    P    Label
trstlgl <--- PolTrust   ,882      ,2    43,145  ***  par_1
trstplt <--- PolTrust   ,846      ,18   46,79   ***  par_2
trstprl <--- PolTrust   1,
stfdem  <--- StfCntry   1,
stfgov  <--- StfCntry   1,196     ,26   46,64   ***  par_3
stfeco  <--- StfCntry   ,979      ,24   41,131  ***  par_4
Standardized Regression Weights: (Group number 1 - Default model)
                        Estimate
trstlgl <--- PolTrust   ,765
trstplt <--- PolTrust   ,812
trstprl <--- PolTrust   ,881
stfdem  <--- StfCntry   ,787
stfgov  <--- StfCntry   ,91
stfeco  <--- StfCntry   ,

135 Covariances: (Group number 1 - Default model)
                        Estimate  S.E.  C.R.    P    Label
PolTrust <--> StfCntry  3,229     ,128  25,315  ***  par_5
e6 <--> e9              ,448      ,57   7,815   ***  par_6
Correlations: (Group number 1 - Default model)
                        Estimate
PolTrust <--> StfCntry  ,835
e6 <--> e9              ,199
Variances: (Group number 1 - Default model)
           Estimate  S.E.   C.R.    P    Label
PolTrust   4,674     ,182   25,727  ***  par_7
StfCntry   3,22      ,146   21,86   ***  par_8
e6         2,581     ,92    28,8    ***  par_9
e5         1,732     ,68    25,635  ***  par_10
e4         1,346     ,71    19,5    ***  par_11
e9         1,969     ,72    27,27   ***  par_12
e8         ,955      ,6     15,814  ***  par_13
e7         1,816     ,67    27,214  ***  par_14

136 Squared Multiple Correlations: (Group number 1 - Default model)
         Estimate
stfeco   ,628
stfgov   ,827
stfdem   ,619
trstprl  ,776
trstplt  ,659
trstlgl  ,585
Matrices (Group number 1 - Default model)
Implied (for all variables) Covariances (Group number 1 - Default model)
          StfCntry  PolTrust  stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
StfCntry  3,22
PolTrust  3,229     4,674
stfeco    3,134     3,161     4,885
stfgov    3,828     3,862     3,748   5,534
stfdem    3,22      3,229     3,134   3,828   5,171
trstprl   3,229     4,674     3,161   3,862   3,229   6,2
trstplt   2,733     3,956     2,676   3,268   2,733   3,956    5,8
trstlgl   2,847     4,121     2,787   3,45    3,295   4,121    3,488    6,

137 Implied (for all variables) Correlations (Group number 1 - Default model)
          StfCntry  PolTrust  stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
StfCntry  1,
PolTrust  ,835      1,
stfeco    ,793      ,662      1,
stfgov    ,91       ,759      ,721    1,
stfdem    ,787      ,657      ,624    ,716    1,
trstprl   ,736      ,881      ,583    ,669    ,579    1,
trstplt   ,678      ,812      ,537    ,616    ,533    ,715     1,
trstlgl   ,638      ,765      ,56     ,581    ,581    ,674     ,621     1,
Implied Covariances (Group number 1 - Default model)
         stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco   4,885
stfgov   3,748   5,534
stfdem   3,134   3,828   5,171
trstprl  3,161   3,862   3,229   6,2
trstplt  2,676   3,268   2,733   3,956    5,8
trstlgl  2,787   3,45    3,295   4,121    3,488    6,

138 Implied Correlations (Group number 1 - Default model)
         stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco   1,
stfgov   ,721    1,
stfdem   ,624    ,716    1,
trstprl  ,583    ,669    ,579    1,
trstplt  ,537    ,616    ,533    ,715    1,
trstlgl  ,56     ,581    ,581    ,674    ,621    1,
Standardized Residual Covariances (Group number 1 - Default model)
         stfeco  stfgov  stfdem  trstprl  trstplt  trstlgl
stfeco   ,
stfgov   -,15    ,
stfdem   1,223   -,268   ,72
trstprl  -1,188  -,1     -,576   ,
trstplt  -,949   1,56    -,12    ,148    ,
trstlgl  ,865    ,293    ,6      ,215    -,861   -,58

139 Modification Indices (Group number 1 - Default model)
Covariances: (Group number 1 - Default model)
                    M.I.    Par Change
e7 <--> StfCntry    5,54    ,85
e7 <--> PolTrust    7,595   -,129
e8 <--> StfCntry    4,415   -,66
e8 <--> PolTrust    7,139   ,17
e9 <--> e7          13,443  ,164
e4 <--> StfCntry    5,354   -,85
e4 <--> e7          5,321   -,12
e5 <--> e7          8,455   -,131
e5 <--> e8          13,574  ,148
e6 <--> e7          7,879   ,146
e6 <--> e5          8,934   -,153
Variances: (Group number 1 - Default model)
                    M.I.    Par Change

140 Regression Weights: (Group number 1 - Default model)
                      M.I.    Par Change
stfeco <--- stfdem    6,196   ,33
stfeco <--- trstplt   6,1     -,33
stfgov <--- trstplt   8,429   ,35
stfdem <--- stfeco    4,332   ,29
trstprl <--- stfeco   4,489   -,29
trstlgl <--- stfeco   4,35    ,32
Model Fit Summary
CMIN
Model                NPAR   CMIN       DF   P      CMIN/DF
Default model        14     56,72      7    ,000   8,1
Saturated model      21     ,000       0
Independence model   6      8726,415   15   ,000   581,761

141 RMR, GFI
Model                RMR    GFI    AGFI   PGFI
Default model        ,74    ,992   ,976   ,331
Saturated model      ,000   1,000
Independence model   2,88   ,342   ,78    ,244
Baseline Comparisons
Model                NFI Delta1   RFI rho1   IFI Delta2   TLI rho2   CFI
Default model        ,994         ,986       ,994         ,988       ,994
Saturated model      1,000                   1,000                   1,000
Independence model   ,000         ,000       ,000         ,000       ,000
RMSEA
Model                RMSEA   LO 90   HI 90   PCLOSE
Default model        ,055    ,042    ,069    ,252
Independence model   ,499    ,491    ,508    ,000


143 Amos Graphics
(The remaining slides, 144-149, show the Amos Graphics interface and contain no text.)
