ASSESSING UNIDIMENSIONALITY OF PSYCHOLOGICAL SCALES: USING INDIVIDUAL AND INTEGRATIVE CRITERIA FROM FACTOR ANALYSIS. SUZANNE LYNN SLOCUM


ASSESSING UNIDIMENSIONALITY OF PSYCHOLOGICAL SCALES: USING INDIVIDUAL AND INTEGRATIVE CRITERIA FROM FACTOR ANALYSIS

by

SUZANNE LYNN SLOCUM

M.A., Boston College (1999)
B.A. (Honors Economics), Albion College (1995)

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES

Measurement, Evaluation and Research Methodology Program

THE UNIVERSITY OF BRITISH COLUMBIA

May 2005

© Suzanne Lynn Slocum, 2005

Abstract

Whenever one uses a composite scale score from item responses, one is tacitly assuming that the scale is dominantly unidimensional. Investigating the unidimensionality of item response data is an essential component of construct validity. Yet, there is no universally accepted technique or set of rules to determine the number of factors to retain when assessing the dimensionality of item response data. Typically, factor analysis is used with the eigenvalues-greater-than-one rule, the ratio of first-to-second eigenvalues, parallel analysis (PA), the root-mean-square-error-of-approximation (RMSEA), or hypothesis-testing approaches involving chi-square tests from Maximum Likelihood (ML) or Generalized Least Squares (GLS) estimation. The purpose of this study was to investigate how these various procedures perform individually and in combination when assessing the unidimensionality of item response data, via a computer-simulated design. Conditions such as sample size, magnitude of communality, distribution of item responses, proportion of communality on the second factor, and the number of items with nonzero loadings on the second factor were varied. Results indicate that no one individual decision-making method identified unidimensionality under all conditions manipulated. All individual decision-making methods failed to detect unidimensionality when sample size was small, magnitude of communality was low, and item distributions were skewed. In addition, combination methods performed better than any one individual decision-making rule under certain sets of conditions. A set of guidelines and a new statistical methodology are provided for researchers. A future program of research is also outlined.

Table of Contents

ABSTRACT ... ii
TABLE OF CONTENTS ... iii
LIST OF TABLES ... vi
LIST OF FIGURES ... viii
ACKNOWLEDGEMENTS ... x

CHAPTER I ... 1
  INTRODUCTION TO THE PROBLEM ... 1
  BACKGROUND ... 1
  PROBLEM UNDER INVESTIGATION ... 3
  APPLICATION OF THE DECISION-MAKING RULES AND INDICES ... 6
  OBJECTIVES AND PURPOSE OF THE STUDY ... 11
CHAPTER II ... 13
  INTRODUCTION TO THE FACTOR ANALYTIC MODEL ... 13
  DIMENSIONALITY ... 15
    Strict and Essential Unidimensionality ... 17
  FACTOR ANALYSIS PROCEDURES AND CONCEPTS ... 17
    Factor Analytic Model ... 18
    Communality ... 21
    Assumptions of the Common Factor Model ... 23
    Correlated and Uncorrelated Factor Models ... 23
    Eigenvalues and Eigenvectors ... 25
CHAPTER III ... 30
  REVIEW OF THE RELEVANT LITERATURE ... 30
  METHODS FOR DETERMINING THE NUMBER OF FACTORS TO RETAIN ... 30
    Definition of Decision-Making Rules and Indices ... 32
  ITEM RESPONSE THEORY ... 35
    Investigating the Dimensionality of Test Item Data: The Role of IRT ... 37
    Simulation Studies ... 38
  INVESTIGATING THE RECOVERY OF POPULATION FACTOR STRUCTURE ... 41
    Theoretical Development of the Sources of Error in Factor Analysis ... 41
    Overdetermination ... 44
    Simulation Studies ... 45
  PREVIOUS RESEARCH FINDINGS ON DECISION-MAKING RULES AND INDICES ... 49
  RESEARCH QUESTIONS ... 53

    Research Question One ... 54
    Research Question Two ... 54
    Research Question Three ... 55
  RATIONALE AND GENERAL PREDICTIONS OF THIS STUDY ... 55
  MEANINGFUL AND APPROPRIATE NUMBER OF FACTORS: FINDINGS AND RECOMMENDATIONS ... 58
    Appropriate Number of Factors to Retain in This Dissertation ... 64
  SIGNIFICANCE OF THE DISSERTATION ... 66
    Scope of the Study ... 66
  CURRENT RESEARCH PRACTICES ... 67
CHAPTER IV ... 68
  AN INVESTIGATION OF THE CURRENT RESEARCH PRACTICES OF ASSESSING DIMENSIONALITY VIA EXPLORATORY FACTOR ANALYSIS ... 68
  INTRODUCTION TO THE PROBLEM ... 68
  PURPOSE OF THE INVESTIGATION ... 69
  PREVIOUS SURVEYS ... 70
  METHODS ... 71
    Criteria Recorded ... 72
  RESULTS ... 75
    Results from Recorded Criteria ... 75
    Summary ... 80
  CONCLUSION ... 82
CHAPTER V ... 83
  COMPUTER SIMULATION METHODOLOGY ... 83
  STRICT UNIDIMENSIONALITY ... 83
    Conditions for the Magnitude of Communality ... 85
    Conditions for the Sample Size ... 85
    Conditions for Distributions ... 86
  ESSENTIAL UNIDIMENSIONALITY ... 87
    Conditions for the Proportion of Communality on Secondary Factor ... 88
    Conditions for the Number of Test Items with Non-Zero Loadings on the Second Factor ... 90
  FACTORS HELD CONSTANT ... 91
  DEPENDENT VARIABLES: DECISION-MAKING RULES AND INDICES ... 92
  PROCEDURES ... 93
  DATA ANALYSES ... 98
    Research Questions ... 98
    Criteria for Best and Optimal Performance ... 101
  SOFTWARE ... 104

CHAPTER VI ... 105
  RESULTS ... 105
  DATA CHECK ... 105
  RESEARCH QUESTION 1A ... 107
    Strict Unidimensionality ... 107
    Essential Unidimensionality ... 111
  RESEARCH QUESTION 1B ... 122
    Strict Unidimensionality ... 123
    Essential Unidimensionality ... 133
  RESEARCH QUESTIONS 2A AND 2B ... 147
    Strict Unidimensionality ... 148
    Combinations for Strict Unidimensionality ... 151
    Essential Unidimensionality ... 153
    Combinations for Essential Unidimensionality ... 156
  RESEARCH QUESTIONS 3A AND 3B ... 158
    Strict Unidimensionality ... 158
    Essential Unidimensionality ... 163
  SUMMARY FOR STRICT UNIDIMENSIONALITY ... 172
  SUMMARY FOR ESSENTIAL UNIDIMENSIONALITY ... 173
CHAPTER VII ... 176
  DISCUSSION ... 176
  GENERAL PREDICTIONS OF THE STUDY ... 176
  CONTRIBUTIONS ... 179
    Guidelines ... 180
  LIMITATIONS AND FUTURE PROGRAM OF RESEARCH ... 187
REFERENCES ... 193
APPENDIX A ... 203
  DETAILS OF CRITERIA REPORTED IN JOURNALS ... 203

List of Tables

Table 1: Decision-Making Rules and Indices with the Students in my Classroom Scale ... 11
Table 3: Summary of Criteria Reported in Journals ... 74
Table 4: Example of the Relationship between Communality and Factor Loadings ... 89
Table 5: Example of Population Pattern Matrix: Essential Unidimensional ... 94
Table: Strict Unidimensional Results of the Mean Accuracy Rates for Rules and Indices
Table: Essential Unidimensional Results of the Mean Accuracy Rates for Rules and Indices (Skewness of 0.0 and Communality of 0.20) ... 112
Table: Essential Unidimensional Results of the Mean Accuracy Rates for Rules and Indices (Skewness of 0.00 and Communality of 0.90) ... 114
Table: Essential Unidimensional Results of the Mean Accuracy Rates for Rules and Indices (Skewness of 2.50 and Communality of 0.20) ... 115
Table: Essential Unidimensional Results of the Mean Accuracy Rates for Rules and Indices (Skewness of 2.50 and Communality of 0.90) ... 117
Table: Binary Logistic Regression Results for Main Effects: Strict Unidimensionality ... 130
Table: Binary Logistic Regression Results for Main Effects: Essential Unidimensionality ... 143
Table: List of New Combination Rules for Strict Unidimensionality ... 152
Table: List of New Combination Rules for Essential Unidimensionality ... 157
Table: Strict Unidimensional Results of the Mean Accuracy Rates for the Four Main Groups ... 159
Table: Strict Unidimensional Results of the Mean Accuracy Rates for the New Rules ... 160
Table: Essential Unidimensional Results of the Mean Accuracy Rates for the Four Main Groups ... 163
Table: Essential Unidimensional Results of the Mean Accuracy Rates for the New Rules ... 167
Table: Recommended Combination Rules for Strict Unidimensionality ... 182

Table: Recommended Individual Rules for Strict Unidimensionality ... 183
Table: Recommended Combination Rules for Essential Unidimensionality ... 184
Table: Recommended Individual Rules for Essential Unidimensionality ... 185

List of Figures

Figure 1: Students in my Classroom Scale ... 7
Figure 2: Distribution of the Total Score for the Students in My Classroom Scale ... 8
Figure 3: Combined Scree Plots for the Continuous PA, Ordinal PA, and Student Data ... 10
Figure 4: Flowchart of Simulation Methodology ... 97
Figure: Chi-square Statistic for Strict Unidimensionality: Effect of Sample Size ... 123
Figure: Chi-square Statistic for Strict Unidimensionality: Effect of Skewness ... 124
Figure: Chi-square Statistic for Strict Unidimensionality: Effect of the Magnitude of Communality
Figure: Eigenvalue Rules for Strict Unidimensionality: Effect of Sample Size ... 125
Figure: Eigenvalue Rules for Strict Unidimensionality: Effect of Skewness ... 125
Figure: Eigenvalue Rules for Strict Unidimensionality: Effect of the Magnitude of Communality ... 126
Figure: Parallel Analysis for Strict Unidimensionality: Effect of Sample Size ... 126
Figure: Parallel Analysis for Strict Unidimensionality: Effect of Skewness ... 127
Figure: Parallel Analysis for Strict Unidimensionality: Effect of the Magnitude of Communality ... 127
Figure: RMSEA for Strict Unidimensionality: Effect of Sample Size ... 128
Figure: RMSEA for Strict Unidimensionality: Effect of Skewness ... 128
Figure: RMSEA for Strict Unidimensionality: Effect of the Magnitude of Communality ... 129
Figure 17: Chi-square Statistic for Essential Unidimensionality: Effect of Sample Size ... 133
Figure: Chi-square Statistic for Essential Unidimensionality: Effect of Skewness ... 133
Figure: Chi-square Statistic for Essential Unidimensionality: Effect of the Magnitude of Communality ... 134
Figure: Chi-square Statistic for Essential Unidimensionality: Effect of the Proportion of Communality on the Second Factor ... 134
Figure: Chi-square Statistic for Essential Unidimensionality: Effect of the Number of Items Loading on the Second Factor ... 135
Figure: Eigenvalue Rules for Essential Unidimensionality: Effect of Sample Size ... 135
Figure 23: Eigenvalue Rules for Essential Unidimensionality: Effect of Skewness ... 136
Figure: Eigenvalue Rules for Essential Unidimensionality: Effect of the Magnitude of Communality ... 136
Figure 25: Eigenvalue Rules for Essential Unidimensionality: Effect of the Proportion of Communality on the Second Factor ... 137
Figure: Eigenvalue Rules for Essential Unidimensionality: Effect of the Number of Items Loading on the Second Factor ... 137
Figure: Parallel Analysis for Essential Unidimensionality: Effect of Sample Size ... 138
Figure: Parallel Analysis for Essential Unidimensionality: Effect of Skewness ... 138
Figure: Parallel Analysis for Essential Unidimensionality: Effect of the Magnitude of Communality
Figure: Parallel Analysis for Essential Unidimensionality: Effect of the Proportion of Communality on Second Factor ... 139
Figure: Parallel Analysis for Essential Unidimensionality: Effect of the Number of Items Loading on the Second Factor ... 140
Figure: RMSEA for Essential Unidimensionality: Effect of Sample Size ... 140
Figure: RMSEA for Essential Unidimensionality: Effect of Skewness ... 141
Figure: RMSEA for Essential Unidimensionality: Effect of the Magnitude of Communality ... 141
Figure: RMSEA for Essential Unidimensionality: Effect of the Proportion of Communality on the Second Factor ... 142
Figure: RMSEA for Essential Unidimensionality: Effect of the Number of Items Loading on the Second Factor ... 142

Acknowledgements

I would like to acknowledge and thank my academic and dissertation supervisor, Dr. Bruno D. Zumbo. His immeasurable support with my research and professional growth is greatly valued and appreciated. I would also like to acknowledge my supervisory committee members, Dr. Bruno D. Zumbo, Dr. Anita Hubley, and Dr. Susan James, for being incredibly thorough and flexible. I thank each of you from the bottom of my heart. Finally, I would like to thank my parents, Frederick and Jill Slocum. My parents have been extraordinarily supportive and giving throughout my education, and I don't know what I would have done without them.

Chapter I

Introduction to the Problem

The purpose of this chapter is to provide a general introduction to the research problem investigated in this dissertation. In doing so, the current chapter presents the background to the research questions and highlights the purpose and objectives of the study. In addition, the research problem is set in context through an example.

Background

Millions of people are tested every year. Inferences made from the results of these tests are used to make high-stakes decisions in education and social-welfare systems, as well as for diagnosis and treatment in health care systems. The test scores, in particular, are also used for research purposes. The inferences made from these test scores are highly dependent upon the accuracy and validity of the interpretation of the test results. According to the Standards for Educational and Psychological Testing (APA, AERA, & NCME, 1999), validity refers to "the appropriateness, meaningfulness, and usefulness of the specific inferences made from test scores" (p. 9). As Zumbo and Hubley (1996) state, a test score is not necessarily directly linked to the construct being measured, but should instead be considered one of various indicators of the construct. Furthermore, Messick (1975) and Guion (1977) claim that in order to adequately assess the meaningfulness of an inference, one should have some kind of confirmation of what the test score itself actually exhibits. Therefore, in order to properly assess

the appropriateness, meaningfulness, and usefulness of the inferences made from test scores, one needs to take construct validity into consideration. Construct validity seeks agreement between a theoretical concept and a specific measurement (e.g., a test). Messick (1998) has claimed that one major concern with the validity of inferences is the unanticipated or negative consequences of test score interpretation, which can often be traced to construct under-representation or construct-irrelevant variance. If a measurement instrument excludes essential facets of a construct, or if the instrument includes facets that are irrelevant, the inferences made from the test scores could involve a variety of inappropriate consequences (e.g., social, political, or health). For example, if an instrument intended to measure depression actually included secondary facets of anxiety, the inferences made from the test scores could lead to inappropriate diagnosis and treatment. A patient may be prescribed a medication that treats only depression, whereas a more appropriate treatment may be a different medication or therapy. One way, among many, to prevent inappropriate consequences from test score interpretation is to focus on the theoretical dimensions of the construct a test is intending to measure. Investigating the dimensionality (i.e., the structure of a specific phenomenon) of tests is an essential component of construct validity. In fact, construct validity has often been referred to as factorial validity (Thompson & Daniel, 1996). Furthermore, when the items of a test are summed to one total score, there is a tacit assumption that the test is measuring one dimension and that this dimension is measured on a continuum. If, for example, a measure is actually multi-dimensional yet is being treated as unidimensional, interpretation of the test scores can be

invalid and entail harmful consequences, as suggested in the previous example of measuring depression.

Problem Under Investigation

Investigating the dimensionality of test data is one of the most commonly encountered activities in day-to-day research. The structure (i.e., dimensionality) of item response data is composed of a certain number of factors. The decision of how many factors to retain in a factor analysis not only determines how the structure of the underlying phenomenon is explained, but also how the relationships between the test items are accounted for. This decision is critical in that it tends to be one of the most common problems facing researchers who assess the dimensionality of measures (Crawford, 1975; Fabrigar, Wegener, MacCallum, & Strahan, 1999). When assessing dimensionality, the goal is to obtain factor solutions that are reliable and mirror the population factor structure. However, several researchers have noted that this decision can lead researchers to settle on an erroneous number of factors (e.g., Crawford, 1975; Fabrigar et al., 1999; Russell, 2002). Moreover, the consequences of over- and under-extracting the number of factors can include inappropriate inferences and decisions (e.g., high-stakes measurement decisions or inappropriate interventions). Yet, there is no universally accepted technique or set of rules to determine the number of factors to retain when assessing the dimensionality of item response data. Previous research has shown that many of the decision-making rules and indices that are commonly used to identify dimensionality do not work efficiently under certain conditions (e.g.,

small sample sizes), over- and under-extract the number of factors, and have a limited range of accuracy (Gorsuch, 1983; Fabrigar et al., 1999; Russell, 2002). Most importantly, however, observations made by Hattie (1985) and Lord (1980) still hold today. Specifically, Hattie noted that there has been no empirical study examining the efficiency of combinations of these rules, indices, and methods. Lord declared that there is a need for an index or rule to define unidimensionality in particular. Furthermore, it has been recommended that researchers apply multiple criteria when deciding on the appropriate number of factors to retain (Gessaroli & De Champlain, 1996; Fabrigar et al., 1999; Davison & Sireci, 2000). In fact, as demonstrated in Chapter IV, researchers are actually utilizing multiple criteria, but the rationale behind the combinations is unclear and the selected combinations do not necessarily prove to be effective. Although there are numerous rules and indices that researchers can utilize in order to determine the dimensionality of item response data, a select few have been suggested in the literature. These rules have been noted as being commonly used and, under certain conditions, successful in determining dimensionality (Hattie, 1984; Fabrigar et al., 1999; Russell, 2002). Such rules and indices include (1) the chi-square statistic from Maximum Likelihood (ML) factor analysis, (2) the chi-square statistic from Generalized Least Squares (GLS) factor analysis, (3) the eigenvalues-greater-than-one rule, (4) the ratio-of-first-to-second-eigenvalues-greater-than-three rule, (5) the ratio-of-first-to-second-eigenvalues-greater-than-four rule, (6) parallel analysis (PA) using continuous data, (7) parallel analysis (PA) using ordinal or rating scale (Likert) data, (8) the Root Mean Square Error of Approximation (RMSEA) index from ML

estimation, and (9) the Root Mean Square Error of Approximation (RMSEA) index from GLS estimation. These nine decision-making rules and indices are described in Chapter III. Although there are inherent differences between decision-making rules and decision-making methods when determining the dimensionality of measures, this study uses the terms interchangeably. There are numerous methods, procedures, approaches, statistics, rules, and indices that are used by researchers in order to assess the dimensionality of measures. For example, the eigenvalues-greater-than-one rule rests on a mathematical foundation that actually makes this rule an index (Kaiser, 1960). Parallel analysis is often considered a procedure, whereas the RMSEA is an index. The chi-square test produces a statistic, and a scree plot is used as an approach (Fabrigar et al., 1999). In summary, while there are theoretical and practical differences among these terms, in order to establish a common vocabulary, 'decision-making rules and indices' will be used throughout the dissertation to refer to the various techniques that are used to determine dimensionality. As mentioned previously, the validity of inferences based on test scores is grounded in the dimensional structure of a test. The scoring of item response data rests on an implicit, but grounding, principle: when test items are summed to one total scale score, there is a tacit assumption that the test is unidimensional (i.e., has one factor). Because these total scale scores are often used for interpretation in the measurement field, it is especially important to focus on this assumption. If items are summed to a total scale score, inferences are made in regard to the one construct (i.e., factor). Lord's (1980) claim that there is a need for an index or rule to define unidimensionality is one that is critical to the measurement process of assessing the

dimensionality of tests. Because there is not a universal and sound index or methodology to assess unidimensional measures, one may question the validity of the inferences made from total scale scores. This dissertation focused on developing a (universal) methodology with which to assess unidimensionality. In doing so, researchers could have more assurance that the inferences made from test scores are valid. Multidimensional measures are also used in the assessment field. However, in order to develop a baseline methodology for assessing the structure of tests, there is a need to start at the unidimensional level. Subsequently, the methodology can be generalized and enhanced (i.e., built upon) to assess multidimensional measures with confidence and accuracy. Unidimensional measures, in particular, can have a variety of different factor structures, ranging from structures with one-and-only-one factor to measures with one dominant factor that include secondary minor dimensions. As mentioned above, this dissertation focuses on nine decision-making rules and indices that are commonly used to assess unidimensional measures. An example is provided below in order to put the use of these nine individual decision-making rules and indices in context.

Application of the Decision-Making Rules and Indices

Students in My Classroom (Battistich, Solomon, Kim, Watson, & Schaps, 1995) is a 14-item scale that measures the degree to which students feel their classmates are supportive (i.e., it measures the students' sense of the school as a community). A total scale score is computed by summing the 14 items, and each item is rated on a 5-point scale. Items 5, 9, and 10 are reverse coded. The scale can be seen in Figure 1 below.
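The scoring just described (summing 14 five-point items, with items 5, 9, and 10 reverse-coded) can be sketched in a few lines of Python. The function name and the list-based input format are illustrative choices, not part of the original scale documentation.

```python
# Sketch of the total-score computation described above: 14 items rated
# 1-5, with items 5, 9, and 10 reverse-coded before summing.
# Names and the input format are illustrative.

REVERSE_CODED = {5, 9, 10}  # 1-based positions of reverse-coded items

def total_score(responses):
    """Sum 14 item ratings (each 1..5), reverse-coding items 5, 9, and 10."""
    if len(responses) != 14:
        raise ValueError("expected 14 item responses")
    total = 0
    for position, rating in enumerate(responses, start=1):
        # On a 1-5 scale, reverse coding maps a rating r to 6 - r
        total += (6 - rating) if position in REVERSE_CODED else rating
    return total

print(total_score([3] * 14))  # a respondent answering "3" everywhere scores 42
```

Reverse coding ensures that a higher total always means a more supportive classroom, regardless of whether an item is worded positively or negatively.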

Figure 1

Students in my Classroom Scale

Directions: For the following sayings, think about yourself and people your age when you answer. For each sentence, circle the number that describes how true it is for you.

1 = Disagree a lot; 2 = Disagree a little; 3 = Don't agree or disagree; 4 = Agree a little; 5 = Agree a lot

Item key words:
1. ...willing to go out of their way to help
2. ...care about my work
3. ...like a family
4. ...really care about each other
5. ...put others down
6. ...help each other learn
7. ...help each other
8. ...get along together very well
9. ...just look out for themselves
10. ...mean to each other
11. ...trouble with my school work
12. ...treat each other with respect
13. ...work together to solve problems
14. ...everyone in the class feels good

Note: Due to copyright restrictions, the complete scale is not reproduced. Instead, key words are provided to help the reader understand the context of the questions.

The scale was administered to 450 students in 28 different rural and urban schools in British Columbia, Canada. The students' ages ranged from 8 to 14 years, with an average age of 11.3 years and a median age of 11 years. The sample consisted of 227 boys (50.4%) and 223 girls (49.6%). The scale's internal consistency (i.e., Cronbach's alpha) for these data was high. The distributions of the items were slightly skewed (i.e., looking across the items, the skewness

indices ranged from to 0.357). The total scale score ranged from 22 to 70, with an average of 43.4. The distribution of the total scale score can be seen in Figure 2 below.

Figure 2

Distribution of the Total Score for the Students in My Classroom Scale

[Figure: histogram of the total scale score; x-axis: Total Scale Score]

The nine decision-making rules and indices were applied to these data in order to determine the number of factors to retain. Previous research has shown the Students in My Classroom Scale to be a unidimensional measure (Roberts, Horn, & Battistich, 1995). When conducting a principal components analysis (PCA) on the current data set, the first eigenvalue (6.25) accounted for 44.67% of the total variance, the second eigenvalue (1.52) accounted for % of the total variance, and the third eigenvalue (0.96) accounted for 6.85% of the

total variance. The results of applying all nine decision-making rules and indices indicate that two methods identified the Students in My Classroom Scale as unidimensional: the ratio-of-first-to-second-eigenvalues-greater-than-three rule and the ratio-of-first-to-second-eigenvalues-greater-than-four rule. First, as stated above, the third eigenvalue is 0.958, the second eigenvalue is 1.52, and the first is 6.25. According to the eigenvalues-greater-than-one rule, two factors would be retained. In addition, PA for continuous data and PA for ordinal data generated random data whose first two eigenvalues were less than the first two eigenvalues of the real data (i.e., the student data), while the third eigenvalue from both the PA continuous and PA ordinal data was greater than the third eigenvalue of the student data. Therefore, if using PA for continuous data or PA for ordinal data as approaches for determining the number of factors to retain, two factors would be retained. The scree plot for the student data, random continuous PA data, and random ordinal PA data can be seen in Figure 3 below. As shown, the point at which the eigenvalues from the random data fall below the eigenvalues from the real data (i.e., the student data) indicates the number of factors to retain, which was two factors in this context.
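The PA retention logic just described can be sketched as follows. This is a minimal illustration using random normal (i.e., continuous) data and the mean random eigenvalue as the comparison criterion; published implementations vary (some use the 95th percentile of the random eigenvalues, and ordinal PA generates Likert-type rather than continuous random data). Function and variable names are illustrative.

```python
# Minimal sketch of parallel analysis for continuous data: compare each
# real-data eigenvalue with the mean eigenvalue, at the same position, of
# correlation matrices computed from random normal data of the same shape.
import numpy as np

def parallel_analysis(real_eigenvalues, n_persons, n_items,
                      n_replications=100, seed=0):
    rng = np.random.default_rng(seed)
    random_eigs = np.empty((n_replications, n_items))
    for r in range(n_replications):
        data = rng.standard_normal((n_persons, n_items))
        corr = np.corrcoef(data, rowvar=False)
        # eigvalsh returns ascending eigenvalues; reverse to descending
        random_eigs[r] = np.sort(np.linalg.eigvalsh(corr))[::-1]
    mean_random = random_eigs.mean(axis=0)
    # Retain factors as long as the real eigenvalue exceeds the mean
    # random eigenvalue at the same position
    n_retain = 0
    for real, rand in zip(real_eigenvalues, mean_random):
        if real <= rand:
            break
        n_retain += 1
    return n_retain, mean_random

# First three eigenvalues from the student data (N = 450, 14 items)
n_factors, random_means = parallel_analysis([6.254, 1.521, 0.958], 450, 14)
print(n_factors)
```

With the student data's first three eigenvalues, this sketch retains two factors, matching the PA result reported above.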

Figure 3

Combined Scree Plots for the Continuous PA, Ordinal PA, and Student Data

[Figure: scree plots of eigenvalues for the student data, the continuous PA data, and the ordinal PA data]

Furthermore, the ML and GLS chi-square tests were statistically significant, which indicates that a unidimensional model does not fit the data. Finally, the values for the ML RMSEA (0.11) and GLS RMSEA (0.09) were above 0.05, which indicates that a unidimensional model is not a good fit (Fabrigar et al., 1999). Therefore, a unidimensional model did not fit the data according to both of the RMSEA indices. On the other hand, the ratio of the first-to-second eigenvalues was 4.11, which is greater than both three and four. Therefore, both the ratio-of-first-to-second-eigenvalues-greater-than-three rule and the ratio-of-first-to-second-eigenvalues-greater-than-four rule identified this scale as unidimensional. Table 1 below shows the results from applying the nine decision-making rules and indices to this sample data. Not only is it important to recognize the difference in the performance of these nine decision-making rules and indices, but also to acknowledge the

inconclusive findings regarding the number of factors to retain. Which of the nine decision-making rules is most reliable? In other words, which of the rules should be used in determining the number of factors to retain? It is examples like the Students in my Classroom Scale that provided the motivation for the investigations conducted in this dissertation.

Table 1

Decision-Making Rules and Indices with the Students in my Classroom Scale

Rule | Decision Criteria | Decision of Unidimensionality
Chi-square ML | Chi-square = 429.97, df = 77, p = .0001 | no
Chi-square GLS | Chi-square = 290.11, df = 77, p = .0001 | no
Eigenvalue > 1 Rule | λ1 = 6.254, λ2 = 1.521 | no
Eigenvalue Ratio > 3 Rule | 6.254/1.521 = 4.11 | yes
Eigenvalue Ratio > 4 Rule | 6.254/1.521 = 4.11 | yes
PA Continuous Method | λ1(random) = 1.209, λ1(real data) = 6.254 | no
PA Likert Method | λ1(random) = 1.207, λ1(real data) = 6.254 | no
RMSEA ML | 0.11 | no
RMSEA GLS | 0.09 | no

Objectives and Purpose of the Study

The purpose of this dissertation is to extend previous research by investigating how the nine decision-making rules and indices perform individually and in combination under varying conditions (e.g., different sample sizes or magnitudes of communality) when assessing the underlying unidimensionality of item response data that is often found in psychological, educational, and health measurement (e.g., subjective well-being, depression, motivation). The overall objective of the present study is to provide guidelines to assist the social and behavioral science researcher in the decision-making process of retaining factors in an assessment of

unidimensionality. With an eye toward addressing the research goals, the factor analytic model is reviewed in Chapter II, which includes a theoretical and statistical perspective on dimensionality. Relevant psychometric research literature on the assessment of dimensionality is reviewed in Chapter III. Given that a focus of this dissertation is informing day-to-day research practice, a short note summarizing how researchers currently conduct day-to-day factor analytic investigations of the dimensionality of test items is presented in Chapter IV. Chapters III and IV informed the methodology presented in Chapter V. Results are described in Chapter VI, and the discussion is presented in Chapter VII.
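Several of the decisions reported for the Students in My Classroom example can be recomputed directly. The sketch below applies the eigenvalue-based rules and derives RMSEA from the model chi-square using one common formula, sqrt(max(χ² − df, 0) / (df(N − 1))); the software used in the dissertation may use a slightly different variant, so the computed RMSEA values land near, but not exactly at, the reported 0.11 and 0.09.

```python
# Recomputing several Table 1 decisions for the Students in My Classroom
# data. The RMSEA-from-chi-square formula below is one common variant.
import math

eigenvalues = [6.254, 1.521, 0.958]  # first three PCA eigenvalues
n, df = 450, 77                      # sample size; df of the 1-factor model

# Eigenvalues-greater-than-one rule: count of factors retained
kaiser = sum(ev > 1.0 for ev in eigenvalues)

# Ratio-of-first-to-second-eigenvalues rules
ratio = eigenvalues[0] / eigenvalues[1]
unidim_by_ratio3 = ratio > 3
unidim_by_ratio4 = ratio > 4

def rmsea(chi2, df, n):
    """RMSEA from a model chi-square (a common approximation formula)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

rmsea_ml = rmsea(429.97, df, n)    # from the reported ML chi-square
rmsea_gls = rmsea(290.11, df, n)   # from the reported GLS chi-square

print(kaiser, round(ratio, 2), unidim_by_ratio3, unidim_by_ratio4)
print(round(rmsea_ml, 2), round(rmsea_gls, 2))
```

The Kaiser rule retains two factors, both ratio rules flag the scale as unidimensional, and both RMSEA values exceed the 0.05 cutoff, reproducing the mixed verdict shown in Table 1.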

Chapter II

Introduction to the Factor Analytic Model

The purpose of this chapter is to describe the well-known common factor model. The remainder of this dissertation requires an understanding of the statistical and theoretical assumptions and implications of factor analytic procedures. Thus, at this point, it is appropriate to provide a brief introduction to the statistical foundation of factor analysis. The chapter begins by defining exploratory factor analysis (EFA) and its purpose. Dimensionality is then defined by integrating several perspectives, and strict and essential unidimensionality are distinguished. Lastly, the factor model is presented. Various researchers and statisticians have documented the factor analytic model and its methods. This chapter closely follows the work of Comrey and Lee (1973; 1992), Gorsuch (1983), and Tabachnick and Fidell (2001). The equations presented are taken from Gorsuch (1983) and Comrey and Lee (1973; 1992).

Principal Components Analysis and Factor Analysis

It is important to note that Principal Components Analysis (PCA) and Factor Analysis (FA) are two distinct methods, but they are often categorized together or treated as the same process (i.e., PCA has been presented as a kind of FA) (Fabrigar et al., 1999). Traditionally, the purpose of both FA and PCA has been to account for variance, which both of them do. FA, however, is often used to reproduce the population covariance matrix, something PCA alone cannot do. PCA attempts to account for all the variance of each variable, and therefore it does not differentiate

between common and unique variance (i.e., it is assumed that all the variance is relevant). FA adds an error component to the model (i.e., DATA = MODEL + ERROR), and thus the error that is not accounted for is separated out. Additionally, PCA does not produce latent constructs, but rather generates components, which are defined as linear composites of the original measured variables. Therefore, it is not conceptually correct to equate factors with components. PCA provides a complete set of eigenvalues for the correlation matrix (eigenvalues are defined below). As is common practice, these eigenvalues will be used in this dissertation in order to determine the number of factors to retain (e.g., the ratio of first-to-second eigenvalues). Likewise, FA will be used to generate the chi-square statistic, df, and p values from ML and GLS estimation methods in order to determine the number of factors to retain (e.g., the ML chi-square decision-making method, the GLS RMSEA). PCA and FA are mathematically and statistically different: FA is represented in Equation 10, whereas PCA is represented in Equation 11 (Equation 10 is on page 24 and Equation 11 is on page 26). Factor analysis (FA), in particular, is a statistical procedure that is applied to a set of observed variables (e.g., test items) in order to determine which observed variables form individual subsets or clusters that ultimately combine into factors or dimensions. These factors are considered to mirror the underlying phenomenon that creates the correlation among the observed variables (Tabachnick & Fidell, 2001). Exploratory factor analysis (EFA), specifically, is used to gain theoretical insight into a given set of test items by identifying the underlying latent dimension(s). The primary purpose of EFA is to explain the relationships among a set of observed variables (i.e., test items) when little to no prior information on the data structure is

available. Thus, EFA is thought of as a tool for generating theories, and it has often been utilized as an initial assessment tool. In common test development practice, however, an EFA is conducted to validate instruments when developing or revising a scale. For example, researchers may utilize an EFA to determine the dimensionality of an instrument, and subsequently use this information to construct composite scores for either hypothesis testing or for making inferences.

There are fundamental measurement theories and practices that are vital to making such inferences. One assumption is that the sampling process and the development of test items were conducted with as little error as possible. The process of sampling has been discussed in the literature, as has the question of which statistical procedures are most appropriate for certain sampling designs (Korn & Graubard, 1999). Additionally, the development of test items can introduce error such as culturally biased language, grammatical errors, and, more importantly, theory that is not related to the set of test items. These kinds of errors are often reflected in the statistical output, and sampling and item-writing errors can confound the interpretation of variance accounted for and of other statistical results. Most of these issues can be screened out if a researcher conducts the initial stages of research appropriately.

Dimensionality

One of the most important assumptions of measurement theory is that the test items (i.e., observed variables) of an instrument all measure just one thing. In order to make psychological sense of ordering persons on an attribute, describing individual differences, or grouping people by

ability, the underlying latent variable of a composite score must be unidimensional (Hattie, 1985). One important goal of assessing unidimensionality, in particular, is to summarize the pattern of correlations among the observed variables (Tabachnick & Fidell, 2001). This is often done by reducing the variables to the smallest number that can account for the underlying phenomenon. The underlying phenomenon is considered to be the reason why the observed variables are correlated in the first place, and it can reflect one or more dimensions.

Dimensionality refers to the structure of a specific phenomenon (Pett, Lackey, & Sullivan, 2003). Unidimensionality, in particular, refers to one dominant latent variable or phenomenon. Composite scale scores are often used to make inferences in the social and behavioral sciences, and unidimensionality is assumed when using these composite scores. There are several statistical procedures that provide a structural analysis of a selected set of observed variables (e.g., factor analysis or multidimensional scaling). Ideally, these procedures obtain an appropriate number of dimensions to justify the use of composite scores and to explain the pattern of correlations among the observed variables. Dimensions (i.e., latent variables) are constructed variables that are prior to the observed variables (i.e., test items). That is, it is assumed that if two test items are correlated, they have something unobserved in common (i.e., the latent dimension). Despite the significance of the theoretical and procedural aspects of assessing the dimensionality of measures via FA, there is no generally accepted index to represent the unidimensionality of a set of test items.

Strict and Essential Unidimensionality

Psycho-educational and health measures often utilize composite scale scores in order to make inferences. Although some of these measures meet strict unidimensionality, most are actually essentially unidimensional. According to Humphreys (1952, 1962), in order to measure any psychological latent variable of interest (e.g., subjective well-being), the inclusion of numerous minor latent variables is not merely desirable but unavoidable. The presence of secondary minor latent variables is often referred to as 'essential unidimensionality'. Strict unidimensionality, on the other hand, is defined as one dominant latent variable with no secondary minor dimensions.

Both essential and strict unidimensionality were investigated in this dissertation. Strict unidimensionality was investigated via manipulation of the magnitude of the communalities (h²). Essential unidimensionality was also examined by varying the magnitude of the communalities (h²), but in addition the proportion of communality on the second factor and the number of test items with non-zero loadings on the second factor were manipulated. As shown in Chapter V, the methodology of this dissertation (informally) provided an operational definition for essential unidimensionality: manipulation of the magnitude of communality, the proportion of communality on the second factor, and the number of test items with non-zero loadings on the second factor.

Factor Analysis Procedures and Concepts

FA aims to summarize the interrelationships among the test items in a
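The essential-unidimensionality conditions described above (magnitude of communality, proportion of communality on the second factor, and number of items loading on the second factor) can be sketched in Python as follows. This is an illustrative reconstruction with made-up parameter values, not the dissertation's actual simulation code.

```python
import numpy as np

rng = np.random.default_rng(1)

n, items = 500, 10
h2 = 0.49       # communality per item (a manipulated condition)
prop2 = 0.2     # proportion of communality on the second factor
k2 = 4          # items with non-zero loadings on the second factor

# Dominant factor loadings, with part of the communality of the first
# k2 items diverted to a minor second factor.
load1 = np.full(items, np.sqrt(h2))
load2 = np.zeros(items)
load1[:k2] = np.sqrt(h2 * (1 - prop2))
load2[:k2] = np.sqrt(h2 * prop2)

f = rng.standard_normal((n, 2))       # two uncorrelated common factors
u = rng.standard_normal((n, items))   # unique factors
x = f[:, [0]] * load1 + f[:, [1]] * load2 + u * np.sqrt(1 - h2)

eigvals = np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]
print(np.round(eigvals[:3], 2))       # one dominant eigenvalue expected
```

Because the second factor carries only a small share of the communality on a few items, the data remain essentially unidimensional: the first eigenvalue dominates the rest.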

concise but accurate manner (i.e., reducing the number of variables down to a few factors or dimensions). Specifically, FA generates several linear combinations of the observed variables, and each of these combinations represents a factor. These factors explain the system of correlations in the observed correlation matrix, and ultimately they can be used to reproduce the observed correlation matrix (Tabachnick & Fidell, 2001).

In order to employ FA appropriately, a set of essential steps must be conducted: selecting and measuring (i.e., administering to a set of individuals) a set of observed variables, generating a correlation matrix, extracting a set of factors from this matrix, determining the number of factors, rotating the factors (this is optional), and interpreting the results. Various other statistical techniques need to be considered in order to conduct these steps, but the interpretation of the results is an essential part of FA. Interpreting and labeling factors depends on the theoretical foundation of the observed variables that form a particular factor as well as on the purpose of the test, and construct validity plays a large role here. In general, a factor is more interpretable when its set of observed variables correlates highly with that factor and does not correlate with the other factors (Tabachnick & Fidell, 2001). This is often referred to as simple structure.

Factor Analytic Model

In FA, one starts with a matrix of correlations among the observed variables. Ultimately, a matrix of factor loadings is interpreted as the correlations between the observed variables and the specific hypothetical factors (i.e., dimensions). Therefore, factor loadings reflect
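The extraction and number-of-factors steps listed above can be sketched as follows, using an eigendecomposition of the correlation matrix to obtain loadings (a simplified stand-in for a full extraction method; the data and the loading value are hypothetical, and rotation is omitted).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 300 respondents, 6 items, one factor with 0.7 loadings
f = rng.standard_normal((300, 1))
x = 0.7 * f + rng.standard_normal((300, 6)) * np.sqrt(0.51)

r = np.corrcoef(x, rowvar=False)             # step: correlation matrix
vals, vecs = np.linalg.eigh(r)               # step: extraction
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

m = int(np.sum(vals > 1.0))                  # step: number of factors
loadings = vecs[:, :m] * np.sqrt(vals[:m])   # step: loading matrix
print(m, np.round(np.abs(loadings[:, 0]), 2))
```

All six items load substantially on the single retained factor, which is the simple-structure pattern the interpretation step looks for.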

quantitative relationships between the observed variables and the dimensions. In other words, the loadings are the degree of generalizability found between the observed variable and the factor: the farther a loading is from zero, the more one can generalize from the factor to the variable (Gorsuch, 1983). Several arbitrary cut-off values are presented in the psychometric literature, which provide a boundary for whether a loading is worthy of being generalized. For example, loadings that are greater than 0.30 (or even 0.40 in some contexts) are considered significant (Crawford & Hamburger, 1967).

With this introduction of factor loadings, it seems appropriate to discuss how these loadings are actually generated. The common factor model is presented here. In comparison, a full component model is based on a perfect calculation of the variables from components. In other words, the full component model does not include error terms, and thus it is assumed that this model would lead to perfect reproduction of the variables, with any observed error being a reflection of the model itself (Gorsuch, 1983). It is seldom in empirical research that observed data are obtained in a perfect manner; therefore, the full component model is not used in this context.

The common factor model, on the other hand, consists of two different components: common factors and unique factors. Common factors are factors that contribute to two or more observed variables. (The psychometric literature presents various cut-off points for the number of observed variables needed to define a factor; the cut-off of two or more is taken from Gorsuch, 1983.) The unique factor represents the non-common factor variance for each observed variable, and this variance needs to be accounted for in order to complete the prediction of that variable (Gorsuch, 1983). The unique factor does not include error of the model; rather, it includes random error of measurement such as biases, unknown sampling error, or administration or test discrepancies. It has been recommended that large samples be used

for FA so that the sampling error can be ignored (Gorsuch, 1983; Comrey & Lee, 1973, 1992). As defined by Comrey and Lee (1992) and Gorsuch (1983), the traditional common factor model is defined by the following equation:

x_ik = w_i1 F_1k + ... + w_if F_fk + w_iu U_ik + w_ie E_ik    (1)

where:

x_ik = the standard score for person k on observed variable i
w_i1 = the factor loading for observed variable i on common factor 1
F_1k = the standard score for person k on common factor 1
w_if = the factor loading for observed variable i on common factor f
F_fk = the standard score for person k on common factor f
w_iu = the factor loading for observed variable i on unique factor u
U_ik = the standard score for person k on unique factor i
w_ie = the factor loading for observed variable i on error factor e
E_ik = the standard score for person k on error factor e

The x, F, U, and E scores in Equation 1 are all standard scores with a mean (M) of zero and a standard deviation (σ) of 1.00 (Comrey & Lee, 1992). Each w value is a factor loading and falls between -1.00 and +1.00. The x score on the left-hand side of the equation is empirically obtained from the data collection process (i.e., it is the participant's response to the test item). The U will change for each variable because each variable's unique factor affects only that variable. Equation 1 can then be represented in matrix form for all possible values of i and k (i.e., observed variables and people) simultaneously (for a detailed description of the matrix algebra specific to factor analysis and to Equation 2, see Comrey & Lee, 1992, or Gorsuch, 1983):

Z = F P + U D    (2)

where:

Z = the standard-score data matrix for k individuals on the observed variables
F = the standardized factor score matrix for the same k individuals on f factors
P = the factor-by-variable weight matrix
U = the matrix of unique factor scores, with one unique factor for each variable, for k individuals
D = a diagonal matrix giving each unique factor's weight for reproducing the variable with which the weight is associated

In summary, U and D are the unique factor scores and weights, whereas F and P are the common factor scores and weights. It is important to state that the unique factor score and weight represent the part of a variable that is not predictable from the set of other observed variables and are assumed to be uncorrelated with the common factors and with the other unique factors. The unique factors increase the variance of the variables (Gorsuch, 1983). With this in mind, it is essential to present additional components of the factor analysis process to further explain the common factor model.

Communality

The communality of a variable is the proportion of its variance that is accounted for by the common factors only. More specifically, the communality is the sum of squared loadings for a variable across factors, and it can be considered the squared multiple correlation of the variable as predicted from the factors (Tabachnick & Fidell, 2001). The following equation shows the communality for variable i:

h_i² = Σ_f w_if² + Σ_j Σ_k w_ij w_ik r_jk   (where j ≠ k)    (3)

Given Equation 3, the variable's uniqueness, u², is defined as the proportion of variance not including the variance from the common factors. From one perspective, the uniqueness is equal to the sum of the squared loadings on the specific and error factors (Comrey & Lee, 1973). It can also be computed as follows:

u² = 1 - h²    (4)

To further divide u², it is important to consider nonrandom variance. The reliability coefficient, r_xx, is a measure of the variable's variance that is nonrandom and an estimate of the true score variance (Gorsuch, 1983). (A true score is a hypothetical score that represents an assessment result entirely free of error; it is sometimes thought of as the average score of an infinite series of assessments with the same or exactly equivalent instruments, but with no practice effect or change in the person being assessed across the series.) It is important to note here that a communality cannot be greater than the true score variance. A specific factor's proportion of variance, s², is the difference between the proportion of nonerror variance, r_xx, and the variance accounted for by the common factors, h². (A specific factor is a composite of all of an observed variable's reliable variance that does not overlap with a factor in the analysis.)

s² = r_xx - h²    (5)

The error contribution, excluding the specific factor, is calculated as follows:

e² = 1 - r_xx    (6)

With these equations in mind, the following hold true:

u² = s² + e²    (7)

1 = h² + s² + e²    (8)
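Equations 3 through 8 can be verified numerically for the orthogonal case (where the cross-product term of Equation 3 vanishes because r_jk = 0); the loadings and reliability below are made-up values chosen only for illustration.

```python
import numpy as np

# Hypothetical loadings for one variable on two uncorrelated common factors,
# and a hypothetical reliability coefficient.
loadings = np.array([0.6, 0.4])   # w_i1, w_i2
r_xx = 0.80                       # reliability (proportion of nonrandom variance)

h2 = np.sum(loadings**2)          # communality (Eq. 3, orthogonal factors)
u2 = 1 - h2                       # uniqueness (Eq. 4)
s2 = r_xx - h2                    # specific variance (Eq. 5)
e2 = 1 - r_xx                     # error variance (Eq. 6)

assert np.isclose(u2, s2 + e2)          # Eq. 7
assert np.isclose(h2 + s2 + e2, 1.0)    # Eq. 8
print(round(h2, 2), round(u2, 2), round(s2, 2), round(e2, 2))
```

Note that the communality (0.52 here) stays below the reliability (0.80), consistent with the constraint that a communality cannot exceed the true score variance.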

Finally, the proportion of total variance contributed by the common factors is obtained as follows:

Proportion of total variance = (Σ_i h_i²) / v   (where v is the number of variables)    (9)

Assumptions of the Common Factor Model

According to Gorsuch (1983), the following is assumed to be true for a given population of individuals:

- The observed variables are estimated from the common factors by multiplying each factor by the appropriate weight and summing across factors.
- Both unique and common factors are included in the model.
- The common factors correlate zero with the unique factors.
- The unique factors correlate zero with each other.
- All factors have a mean of zero and a standard deviation of 1.0.

Correlated and Uncorrelated Factor Models

There are two forms of the common factor model: the more general case, in which the common factors are correlated with each other, and the uncorrelated models. In the former case, the correlation can range up to an absolute value of 1.0 (Gorsuch, 1983). There is an assumption when conducting EFA that factors will be rotated and selected so as to exhibit meaningful dimensions. Highly overlapping factors are not of great theoretical interest (Gorsuch, 1983); if factors are "too" correlated, researchers are more likely to rerun the analysis or select a factor model with one less factor. Gorsuch (1983) presents a challenge for many researchers: "If a

procedure has factors that correlate too high, then it should be rejected. If a procedure has factors that correlate too low, then it should also be rejected." The definition of what is too high or too low is extremely subjective and differs depending on the theoretical foundation of the factors. In general, the correlations among the variables themselves should indicate the level of correlation between the factors: certain variables often cluster together (i.e., correlated variables) and will therefore "load" onto a specific factor. As mentioned previously, FA starts with a matrix of correlations among the observed variables, and ultimately a matrix of factor loadings is generated, which can be interpreted as the correlations between the observed variables and the specific hypothetical factors. The common factor correlation matrix for a population is as follows:

R_vv = P_vf R_ff P'_fv + U_vv    (10)

Here R_ff is the correlation matrix of the factors, P_vf is the variable-by-factor weight matrix, P'_fv is its transpose, and U_vv is a diagonal matrix of the squared unique factor loadings. (A transpose of a matrix is obtained by interchanging the rows and columns of the original matrix; Comrey & Lee, 1973.)

The second form of the common factor model is that which involves orthogonal rotation methods (i.e., methods for uncorrelated factors). These models have simplified equations that are easier to compute. Uncorrelated models assume the factors are independent of each other, and the factor structure is equal to the factor pattern in these models (Gorsuch, 1983). Orthogonal models imply that the factor matrix, when multiplied by its transpose, gives a diagonal matrix. Accordingly, Gorsuch (1983, p. 55) notes: "Because the factor scores are assumed to be
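Equation 10 can be illustrated numerically with hypothetical weights: two correlated factors reproduce a model-implied correlation matrix whose diagonal is restored to 1.0 by the squared unique loadings.

```python
import numpy as np

# Hypothetical variable-by-factor weight matrix P_vf (4 variables, 2 factors)
P = np.array([[0.7, 0.0],
              [0.6, 0.0],
              [0.0, 0.7],
              [0.0, 0.6]])
# Factor correlation matrix R_ff, with a made-up correlation of 0.3
R_ff = np.array([[1.0, 0.3],
                 [0.3, 1.0]])

common = P @ R_ff @ P.T                   # common part of Equation 10
U_vv = np.diag(1.0 - np.diag(common))     # squared unique loadings (diagonal)
R = common + U_vv                         # model-implied correlation matrix
print(np.round(R, 3))
```

Within-factor correlations come from the products of loadings (e.g., 0.7 × 0.6 = 0.42), while cross-factor correlations are attenuated by the factor correlation (e.g., 0.7 × 0.3 × 0.7 = 0.147).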
(An identity matrix is a matrix of correlations between factor scores containing the elements 0 and 1, depending on whether the correlation is that of a factor score with itself or with another factor. Multiplying a matrix by an identity matrix leaves it unchanged; Comrey & Lee, 1973.)


More information

Multivariate Analysis (Slides 13)

Multivariate Analysis (Slides 13) Multivariate Analysis (Slides 13) The final topic we consider is Factor Analysis. A Factor Analysis is a mathematical approach for attempting to explain the correlation between a large set of variables

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis Principle Component Analysis: A statistical technique used to examine the interrelations among a set of variables in order to identify the underlying structure of those variables.

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

Descriptive Statistics and Measurement Scales

Descriptive Statistics and Measurement Scales Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

More information

Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Validity, Fairness, and Testing

Validity, Fairness, and Testing Validity, Fairness, and Testing Michael Kane Educational Testing Service Conference on Conversations on Validity Around the World Teachers College, New York March 2012 Unpublished Work Copyright 2010 by

More information

Applications of Structural Equation Modeling in Social Sciences Research

Applications of Structural Equation Modeling in Social Sciences Research American International Journal of Contemporary Research Vol. 4 No. 1; January 2014 Applications of Structural Equation Modeling in Social Sciences Research Jackson de Carvalho, PhD Assistant Professor

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

Topic 10: Factor Analysis

Topic 10: Factor Analysis Topic 10: Factor Analysis Introduction Factor analysis is a statistical method used to describe variability among observed variables in terms of a potentially lower number of unobserved variables called

More information

X = T + E. Reliability. Reliability. Classical Test Theory 7/18/2012. Refers to the consistency or stability of scores

X = T + E. Reliability. Reliability. Classical Test Theory 7/18/2012. Refers to the consistency or stability of scores Reliability It is the user who must take responsibility for determining whether or not scores are sufficiently trustworthy to justify anticipated uses and interpretations. (AERA et al., 1999) Reliability

More information

LOGISTIC REGRESSION ANALYSIS

LOGISTIC REGRESSION ANALYSIS LOGISTIC REGRESSION ANALYSIS C. Mitchell Dayton Department of Measurement, Statistics & Evaluation Room 1230D Benjamin Building University of Maryland September 1992 1. Introduction and Model Logistic

More information

Reliability Analysis

Reliability Analysis Measures of Reliability Reliability Analysis Reliability: the fact that a scale should consistently reflect the construct it is measuring. One way to think of reliability is that other things being equal,

More information

Factor Analysis. Advanced Financial Accounting II Åbo Akademi School of Business

Factor Analysis. Advanced Financial Accounting II Åbo Akademi School of Business Factor Analysis Advanced Financial Accounting II Åbo Akademi School of Business Factor analysis A statistical method used to describe variability among observed variables in terms of fewer unobserved variables

More information

DATA ANALYSIS AND INTERPRETATION OF EMPLOYEES PERSPECTIVES ON HIGH ATTRITION

DATA ANALYSIS AND INTERPRETATION OF EMPLOYEES PERSPECTIVES ON HIGH ATTRITION DATA ANALYSIS AND INTERPRETATION OF EMPLOYEES PERSPECTIVES ON HIGH ATTRITION Analysis is the key element of any research as it is the reliable way to test the hypotheses framed by the investigator. This

More information

Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

More information

Glossary of Terms Ability Accommodation Adjusted validity/reliability coefficient Alternate forms Analysis of work Assessment Battery Bias

Glossary of Terms Ability Accommodation Adjusted validity/reliability coefficient Alternate forms Analysis of work Assessment Battery Bias Glossary of Terms Ability A defined domain of cognitive, perceptual, psychomotor, or physical functioning. Accommodation A change in the content, format, and/or administration of a selection procedure

More information

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

More information

Exploratory Factor Analysis

Exploratory Factor Analysis Introduction Principal components: explain many variables using few new variables. Not many assumptions attached. Exploratory Factor Analysis Exploratory factor analysis: similar idea, but based on model.

More information

Missing data in randomized controlled trials (RCTs) can

Missing data in randomized controlled trials (RCTs) can EVALUATION TECHNICAL ASSISTANCE BRIEF for OAH & ACYF Teenage Pregnancy Prevention Grantees May 2013 Brief 3 Coping with Missing Data in Randomized Controlled Trials Missing data in randomized controlled

More information

Constructing a TpB Questionnaire: Conceptual and Methodological Considerations

Constructing a TpB Questionnaire: Conceptual and Methodological Considerations Constructing a TpB Questionnaire: Conceptual and Methodological Considerations September, 2002 (Revised January, 2006) Icek Ajzen Brief Description of the Theory of Planned Behavior According to the theory

More information

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives. The Normal Distribution C H 6A P T E R The Normal Distribution Outline 6 1 6 2 Applications of the Normal Distribution 6 3 The Central Limit Theorem 6 4 The Normal Approximation to the Binomial Distribution

More information

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions. Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course

More information

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA) UNDERSTANDING ANALYSIS OF COVARIANCE () In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design

More information

Factor analysis. Angela Montanari

Factor analysis. Angela Montanari Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

More information

PSYCHOLOGY Vol. II - The Construction and Use of Psychological Tests and Measures - Bruno D. Zumbo, Michaela N. Gelin, Anita M.

PSYCHOLOGY Vol. II - The Construction and Use of Psychological Tests and Measures - Bruno D. Zumbo, Michaela N. Gelin, Anita M. THE CONSTRUCTION AND USE OF PSYCHOLOGICAL TESTS AND MEASURES Bruno D. Zumbo, Michaela N. Gelin and Measurement, Evaluation, and Research Methodology Program, Department of Educational and Counselling Psychology,

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89. by Joseph Collison

SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89. by Joseph Collison SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89 by Joseph Collison Copyright 2000 by Joseph Collison All rights reserved Reproduction or translation of any part of this work beyond that permitted by Sections

More information

How to Get More Value from Your Survey Data

How to Get More Value from Your Survey Data Technical report How to Get More Value from Your Survey Data Discover four advanced analysis techniques that make survey research more effective Table of contents Introduction..............................................................2

More information

interpretation and implication of Keogh, Barnes, Joiner, and Littleton s paper Gender,

interpretation and implication of Keogh, Barnes, Joiner, and Littleton s paper Gender, This essay critiques the theoretical perspectives, research design and analysis, and interpretation and implication of Keogh, Barnes, Joiner, and Littleton s paper Gender, Pair Composition and Computer

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical

More information

Data analysis process

Data analysis process Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis

More information

Factor Analysis and Structural equation modelling

Factor Analysis and Structural equation modelling Factor Analysis and Structural equation modelling Herman Adèr Previously: Department Clinical Epidemiology and Biostatistics, VU University medical center, Amsterdam Stavanger July 4 13, 2006 Herman Adèr

More information

Canonical Correlation Analysis

Canonical Correlation Analysis Canonical Correlation Analysis Lecture 11 August 4, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #11-8/4/2011 Slide 1 of 39 Today s Lecture Canonical Correlation Analysis

More information

LINEAR EQUATIONS IN TWO VARIABLES

LINEAR EQUATIONS IN TWO VARIABLES 66 MATHEMATICS CHAPTER 4 LINEAR EQUATIONS IN TWO VARIABLES The principal use of the Analytic Art is to bring Mathematical Problems to Equations and to exhibit those Equations in the most simple terms that

More information

4. Multiple Regression in Practice

4. Multiple Regression in Practice 30 Multiple Regression in Practice 4. Multiple Regression in Practice The preceding chapters have helped define the broad principles on which regression analysis is based. What features one should look

More information

Chapter 7 Factor Analysis SPSS

Chapter 7 Factor Analysis SPSS Chapter 7 Factor Analysis SPSS Factor analysis attempts to identify underlying variables, or factors, that explain the pattern of correlations within a set of observed variables. Factor analysis is often

More information

Standard Deviation Estimator

Standard Deviation Estimator CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

More information

Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003

Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003 Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003 FA is not worth the time necessary to understand it and carry it out. -Hills, 1977 Factor analysis should not

More information

Multivariate Statistical Inference and Applications

Multivariate Statistical Inference and Applications Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim

More information

Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

More information

Statistical estimation using confidence intervals

Statistical estimation using confidence intervals 0894PP_ch06 15/3/02 11:02 am Page 135 6 Statistical estimation using confidence intervals In Chapter 2, the concept of the central nature and variability of data and the methods by which these two phenomena

More information

Exploratory Factor Analysis: rotation. Psychology 588: Covariance structure and factor models

Exploratory Factor Analysis: rotation. Psychology 588: Covariance structure and factor models Exploratory Factor Analysis: rotation Psychology 588: Covariance structure and factor models Rotational indeterminacy Given an initial (orthogonal) solution (i.e., Φ = I), there exist infinite pairs of

More information

Crime and cohesive communities

Crime and cohesive communities Crime and cohesive communities Dr Elaine Wedlock Home Office Online Report 19/06 The views expressed in this report are those of the authors, not necessarily those of the Home Office (nor do they reflect

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information