FACTOR ANALYSIS OF CATEGORICAL DATA IN SAS
Lei Han, Torsten B. Neilands, M. Margaret Dolcini
University of California, San Francisco

ABSTRACT

Data analysts frequently employ factor analysis to extract latent factors from sets of survey items. Often these items are not continuous scales; instead, they are either polytomous (e.g., Likert scaled) or dichotomous (e.g., "yes/no") items. The FACTOR procedure in SAS computes Pearson product-moment correlations from raw data as its default input matrix. This approach may not be optimal for polytomous or dichotomous input data. Polychoric and tetrachoric correlation coefficients for polytomous and dichotomous items, respectively, may be a better choice. This paper illustrates the use of the SAS Institute polychor.sas macro program to compute polychoric and tetrachoric correlation matrices; these matrices are subsequently analyzed using PROC FACTOR. Limitations and benefits of this approach are discussed.

INTRODUCTION

Survey research questionnaires often contain ordered polytomous variables such as Likert scale items and dichotomous variables ("yes/no" items) to measure respondent attitudes, beliefs, or preferences. Researchers frequently employ factor analysis techniques to ascertain the presence of underlying latent traits that govern responses to survey items. Factor analysis methods implemented in common statistical software programs such as SAS, however, have been developed for the analysis of continuous variables via factoring of the Pearson correlations among the observed items. The basic factor analysis model assumes a linear relationship between a set of observed variables and latent trait variables, namely common factors. However, if survey items are related to a common factor via a nonlinear function, which is particularly likely for dichotomous variables, the linear factor analysis model may produce mathematical artifacts (McDonald, 1985). For example, several researchers have noted that factor loadings for binary items tend to be highly correlated with item means (i.e., the proportion of "1" responses in a 0/1 dichotomy), suggesting that something other than an underlying trait is represented in such cases (Gorsuch, 1974; McDonald, 1985; Waltman & Dunbar, 1994). The correct interpretation of a factor's meaning depends on appropriate model selection and input data. In the present illustration we focus mainly on one input data issue, the correlation matrix to be factored, and one model selection issue, the choice of a factor extraction method. Specifically, given the choice of correlation matrices available from SAS, which correlation coefficients are most appropriate to describe the relationships among ordered polytomous variables or dichotomous variables and serve as the input matrix for factor analysis? Likert scaled items usually feature four to seven score points indicating level of agreement, importance, or frequency. Binary data, by definition, have two points. When Pearson correlation coefficients are computed for dichotomous or polytomous variables, the magnitudes of these correlations shrink due to range restriction. This attenuation is more severe when two variables differ substantially in their means or marginal distributions. Polychoric (for polytomous variables) or tetrachoric (for dichotomous variables) correlation coefficients have been proposed to overcome this limitation (Gorsuch, 1974; Cohen and Cohen, 1983).
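To see this attenuation concretely for a single pair of dichotomous items, the Pearson correlation reported by PROC CORR can be contrasted with the polychoric correlation (equal to the tetrachoric correlation for a 2 x 2 table) that PROC FREQ reports when the PLCORR option is requested. This is only a minimal sketch with a hypothetical data set and item names (survey, x1, x2); it handles a single pair of items, whereas the macro approach described later builds the full matrix needed for factor analysis.

proc corr data=survey pearson;
  var x1 x2;              /* Pearson (phi) correlation for two 0/1 items */
run;

proc freq data=survey;
  tables x1*x2 / plcorr;  /* polychoric correlation; tetrachoric for a 2x2 table */
run;

In typical applications the tetrachoric value is noticeably larger in absolute magnitude than the Pearson value, which is the pattern the examples below illustrate.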
Factor analysis results are also determined by the factor extraction method used. One typically employed method is principal axis factoring (PAF), in which the factor analysis begins with the squared multiple correlation (SMC) of each item with the remaining items in the main diagonal of the correlation matrix to be factored. Thus, the SMC values represent initial estimates of communalities, the proportion of variance within each item accounted for by the factor analysis solution. A contrasting method, unweighted least squares (ULS), employs a least-squares approach to minimize item uniquenesses (residuals) and maximize factor loading values. ULS is one recommended extraction method for the factor analysis of polychoric and tetrachoric correlation coefficients. The present paper addresses two practical questions. First, do factor analysis results differ substantially when dichotomous and Likert scaled variables are treated as categorical rather than as continuous variables? Second, which factor extraction procedure, principal axis factoring (PAF) or unweighted least squares (ULS), works best for dichotomous and Likert scale items (i.e., yields simple structure)?

METHODS

The data used in this illustration come from a longitudinal study of adolescent neighborhood friendships. Respondents are African American youth living in a single neighborhood in a West Coast city.
Probability sampling and multiple-phase subject recruitment procedures are described in detail in Dolcini et al. (2001). In the current analysis we use only cross-sectional data from the baseline interview (N = 201). In a face-to-face interview, participants responded to items assessing health-related attitudes and behaviors. Each respondent also identified three to five close friends. A total of 742 friends ranging in age from 13 to 21 were nominated, and relationship data were obtained.

Measures

We use two measures included in the baseline interview questionnaire in the present analyses for illustrative purposes. One measure, composed of seven dichotomous (yes/no) items, is an inventory designed by Dolcini et al. (2001) to measure the quality of friendship. The second instrument encompasses eight Likert scale items designed to assess degree of depression. These items are drawn from the Center for Epidemiological Studies Depression Scale (CESD; Radloff, 1977) short form. Items on this inventory were scored on a four-point Likert format: 1 = "Rarely or none of the time (< 1 day)"; 2 = "Some or a little of the time (1-2 days)"; 3 = "Occasionally or a moderate amount of the time (3-4 days)"; and 4 = "Most or all of the time (5-7 days)". Higher scores indicate higher levels of depression during the past week. The items for the friendship scale and the CESD appear in Table 1 and Table 2, respectively.

Procedure

Prior to factor analysis, the distributions of the two sets of items were examined and descriptive statistics were computed. SAS PROC FACTOR was performed on both Pearson r and polychoric correlation matrices for both the binary and the Likert data sets. The maximum number of factors to be extracted was limited to two because the number of items for each instrument is relatively small. The two factor extraction methods described above, iterated principal factor estimation (PAF) and unweighted least squares (ULS), were conducted on the polychoric correlation matrix. Only PAF was performed on the Pearson r because the distributions of all items are skewed in the same direction. Factor analysis of a polychoric correlation matrix using SAS proceeds in two steps: (1) computing the polychoric correlation matrix and (2) submitting the computed polychoric correlation matrix to SAS PROC FACTOR for factor extraction. The SAS macro polychor.sas was used to compute polychoric and tetrachoric correlations. (The tetrachoric correlation is the polychoric correlation between two dichotomous items; the macro automatically computes tetrachoric correlations for dichotomous items.) The macro program can be downloaded from the SAS Institute's Web site. Following computation of the polychoric correlations, the data analyst then factor analyzes the polychoric correlation matrix using PROC FACTOR, as shown in the sample syntax below.

libname data 'c:\my documents\nacs\baseline\data';

data friend;
  infile 'c:\my documents\nacs\baseline\data\base7var.txt';
  input TRUST1 HOLD1 BACKUP1 MONEY1 PROBLEM1 TROUBLE1 BUSINES1;
run;

proc means; run;

%inc 'c:\my documents\nacs\sas\prg\polychor.sas';

%polychor(data=friend,
          var=TRUST1 HOLD1 BACKUP1 MONEY1 PROBLEM1 TROUBLE1 BUSINES1,
          out=tetcorb, type=corr);

proc print; run;

proc factor data=tetcorb method=prinit priors=smc scree residual rotate=promax;
  var TRUST1 HOLD1 BACKUP1 MONEY1 PROBLEM1 TROUBLE1 BUSINES1;
run;
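The Procedure section also mentions a ULS run on the polychoric (here tetrachoric) matrix and a PAF run on the Pearson correlations, and the next paragraph describes computing coefficient alpha for the resulting subscales. A minimal sketch of those remaining steps, assuming the friend data set and tetcorb matrix created above (the four-item subscale in the final step is illustrative only), might look like this:

proc factor data=tetcorb method=uls priors=smc scree residual rotate=promax;
  var TRUST1 HOLD1 BACKUP1 MONEY1 PROBLEM1 TROUBLE1 BUSINES1;  /* ULS on the tetrachoric matrix */
run;

proc factor data=friend method=prinit priors=smc scree residual rotate=promax;
  var TRUST1 HOLD1 BACKUP1 MONEY1 PROBLEM1 TROUBLE1 BUSINES1;  /* PAF on Pearson correlations computed from the raw data */
run;

proc corr data=friend alpha nomiss;
  var TRUST1 HOLD1 BACKUP1 MONEY1;  /* Cronbach's alpha for an illustrative subscale */
run;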
Based on the obtained factor loadings, items were grouped into subscales. Finally, Cronbach's coefficient alpha was computed for each subscale as an index of internal consistency.

RESULTS

Descriptive Statistics

The proportions of respondents who endorsed each of the friendship items are listed in Table 1. Descriptive statistics for the eight Likert variables from the CESD are listed in Table 2. Within each set of variables, all items are skewed in the same direction. For the seven friendship variables, we asked only about participants' 3-5 closest friends, so the restricted range in means (from 0.82 to 0.97) is not surprising. The overall mean for the eight depression variables is 1.55, indicating that participants had minimal levels of depression at the time of interview.

Table 1. Descriptive Statistics for Seven Binary Items from the Friendship Inventory (N = 742).
Do you trust your [friend]?
Would you let [friend] hold something for you?
Would [friend] ever back you up?
Would you lend [friend] money if you had it?
Would you talk to [friend] about personal problems?
If you were in trouble, would you turn to [friend]?
Would you tell [friend] your business?
Total P = 0.89. Note: P is the percent of respondents who endorsed the item.
Table 2. Distributions of the CESD Items (N = 201).
Items: Shake off the blues; Feel depressed; My life is a failure; Fearful; Restless; Feel lonely; Crying spells; Feel sad. For each item the table reports the proportion (%) of responses in each frequency category (< 1 day, 1-2 days, 3-4 days, 5-7 days), the mean, and the SD.

Correlation Coefficients

For the seven dichotomous friendship items, the Pearson correlations ranged from 0.19 to 0.48, while the corresponding tetrachoric correlations ranged from 0.41 to 0.75, almost twice as large as the Pearson r values (see Table 3). The last three items correlate more highly than the first four items. For the eight Likert scaled CESD items, r ranges from 0.12 to 0.63; the corresponding polychoric values range from 0.25 to 0.73 (see Table 4). In general, the discrepancy between the polychoric and Pearson r values is larger for binary data than for Likert variables. This finding is reasonable because, as the number of categories increases, the items behave more like continuous variables. This result suggests that Pearson correlation coefficients may not be a poor choice for Likert scale items with larger numbers of categories, provided that research participants use the full scale (i.e., that sufficient variance exists for all items employed in the analysis).

Table 3. Pearson r and Tetrachoric Correlation Coefficients for Seven Binary Items from the Friendship Inventory (N = 742). Note: Underlined coefficients are Pearson product-moment correlations; non-underlined values are polychoric (tetrachoric) correlation coefficients.

Table 4. Pearson r and Polychoric Correlation Coefficients for Eight Likert CESD Items (N = 201). Note: Underlined coefficients are Pearson product-moment correlations; non-underlined values are polychoric correlation coefficients.

Factor Analyses: Friendship Inventory

The principal axis factor analysis results for the binary data under the two types of correlation coefficients, Pearson r and tetrachoric, are summarized in Table 5 and Table 6, respectively. The tables report the squared multiple correlation (SMC) of each item with all other items in the analysis.
The tables also show the first few positive eigenvalues, each expressed as a proportion of the sum of all eigenvalues of the reduced correlation matrix (Prop). The tables also display the factor loadings for the one- and two-factor solutions, the latter obtained via promax rotation. Finally, the tables report the root mean square residual (RMS) for each solution, an index of badness of fit of the factor analysis model to the input data. Smaller RMS values indicate superior factor analysis solutions. If the data analyst does not specify the number of factors to extract, SAS uses a proportion-of-variance criterion to determine the number of factors to retain. Similar to Kaiser's eigenvalue-greater-than-one rule, this criterion retains factors until their cumulative sum of squared factor loadings, expressed as a proportion of the total sum of squared loadings for all factors, reaches the criterion value. From a practical perspective, this means that the sum of the Prop values should exceed 1.00 to ensure that a sufficient number of factors is extracted. For example, in Table 5 the first factor's proportion value is 1.15, which exceeds the threshold value of 1.00, so the analyst would retain a single factor from this solution. By contrast, in Table 6 the sum of the first two factors' proportion values is required to exceed the 1.00 cutoff value, so two factors are retained from this solution.

Table 5. Factor Loadings for Seven Binary Friendship Inventory Items Using Pearson Correlation Coefficients (N = 742). Columns report each item's SMC, Prop*, and loadings for the one- and two-factor solutions (F1; F1, F2), with RMS and coefficient alpha at the foot of the table. * Proportion of each eigenvalue to the sum of all eigenvalues.

Table 6. Factor Loadings for Seven Binary Friendship Inventory Items Using Tetrachoric Correlation Coefficients (N = 742). Columns as in Table 5. * Proportion of each eigenvalue to the sum of all eigenvalues.

Several differences between these two solutions are evident. First, the factor loadings in the solution based on Pearson correlations are attenuated. Interestingly, this attenuation is mirrored in the factor intercorrelation obtained in the two-factor solutions, which is smaller for the solution based on Pearson correlations than for the solution based on tetrachoric correlations (r = 0.67). Second, an investigator using Pearson correlations exclusively would extract a single factor via the proportion criterion, whereas an analyst using the tetrachoric correlation matrix as the input to the factor analysis would extract two factors via the proportion criterion. The reliability for the scale derived from the single-factor solution is 0.73, whereas alpha is 0.64 and 0.69 for the subscales based on the first four and last three items, respectively, in the two-factor solution.

Factor Analyses: CESD

The factor analysis results for the Likert scale items are summarized in Table 7 and Table 8, respectively.

Table 7. Factor Loadings for Eight CESD Likert Items Using Pearson Correlation Coefficients (N = 201). Columns as in Table 5. * Proportion of each eigenvalue to the sum of all eigenvalues.

Table 8. Factor Loadings for Eight CESD Likert Items Using Polychoric Correlation Coefficients (N = 201). Columns as in Table 5. * Proportion of each eigenvalue to the sum of all eigenvalues.
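An analyst who prefers not to rely on the proportion criterion described above can request the number of factors explicitly. A minimal sketch, reusing the tetcorb matrix produced by the earlier %polychor call, of forcing the two-factor promax-rotated solution reported in Table 6:

proc factor data=tetcorb method=prinit priors=smc nfactors=2
            scree residual rotate=promax;   /* nfactors=2 overrides the proportion criterion */
  var TRUST1 HOLD1 BACKUP1 MONEY1 PROBLEM1 TROUBLE1 BUSINES1;
run;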
As was the case in the analyses of the friendship inventory, the factor analysis based on Pearson correlations returns a one-factor solution whereas the analysis based on polychoric correlations yields a two-factor solution. Unlike the analysis of dichotomous items, however, the differences between the factor loadings based on the Pearson r and polychoric correlation matrices are negligible. Furthermore, the factor intercorrelation values for the two-factor solutions derived from the Pearson and the polychoric correlation input matrices are identical. Alpha is 0.82 for the eight-item single-factor scale and 0.76 and 0.74 for the subscales derived from the two-factor solution.

Factor Extraction Methods

To compare differences among factor extraction methods, unweighted least-squares extraction was performed on the correlation matrices for each survey instrument. The results are not listed in tables due to limited space. However, ULS extraction produced results highly similar to those of the PAF method with both the binary and the Likert scale data. Interestingly, the ULS method proved more vulnerable to Heywood cases (i.e., negative residual variance estimates) with the Likert scale polychoric correlations; removing items 4 and 5, those with the weakest communality estimates, resolved this problem.

DISCUSSION

Many measurable qualities of interest to researchers are dichotomous. This paper provides two empirical examples comparing factor analysis results from two different methods of analysis. SAS PROC FACTOR analyzes a Pearson product-moment correlation matrix by default. While this approach may not perform poorly for the analysis of Likert scaled data where research participants endorse a sufficiently wide range of scale points, our examples illustrate the serious attenuation of correlation coefficients, and the attendant reduction of factor loading and factor intercorrelation values, when factor analyzing a Pearson product-moment correlation matrix derived from binary items. This in turn can lead to misinterpretation of the dimensionality of the solution and possibly the omission of valuable survey items due to artificially low factor loading values for those items in a factor analysis based on Pearson correlations. Fortunately, SAS Institute provides the readily available and easily employed polychor.sas macro program to compute polychoric and tetrachoric correlation coefficients. These coefficients may in turn be factor analyzed by SAS PROC FACTOR, as our example syntax demonstrates. This option affords SAS users a convenient method of comparing the results derived from the standard Pearson correlation-based factor analysis with those obtained from tetrachoric or polychoric correlations, a valuable analysis tool. SAS PROC FACTOR gives the data analyst the flexibility to extract factors from correlation matrices using a number of extraction methods, including principal axis factoring and unweighted least squares. Some extraction methods may perform better with Pearson correlations whereas other methods may perform better with polychoric and tetrachoric correlations. The interaction of extraction method and input correlation matrix type is an area deserving further exploration and research. The use of polychoric and tetrachoric correlations is not without limitations, however.
Polychoric and tetrachoric correlations are assumed to be derived from a set of underlying normally distributed latent variables, an untestable assumption that may not be true in the population from which the researcher draws her data. In addition, because each polychoric correlation coefficient is computed separately, a matrix of these coefficients may not be positive definite, i.e., it may be non-Gramian (Gorsuch, 1974; Cohen and Cohen, 1983). Special software programs designed for the factor analysis of categorical and dichotomous outcome data also exist and may be of use for extensive analyses involving such data (e.g., Mplus; Muthen & Muthen, 2001). These programs compute tetrachoric and polychoric correlation matrices internally as part of the factor analysis procedure, thereby saving the data analyst the two-step process outlined above. Furthermore, the Mplus program generates several goodness-of-fit statistics that may help data analysts choose the number of factors to retain from any given factor analysis solution. Nonetheless, despite the advantages of special-purpose software programs and the limitations inherent in the SAS-based approach documented above, we believe that the use of tetrachoric and polychoric correlation coefficients in conjunction with PROC FACTOR in SAS enables data analysts to obtain reasonable answers to research questions involving the factor analysis of dichotomous and polytomous survey items.

REFERENCES

Cohen, J. & Cohen, P. (1983). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc., Publishers.

Dolcini, M. M., Harper, G., Watson, S., Han, L., Ellen, J., & Catania, J. (2001). The structure and quality of adolescent friendships in an urban African American neighborhood. Paper presented at the biannual meeting of the Society for Research in Child Development, Minneapolis, MN.

Gorsuch, R. L. (1974). Factor Analysis. Philadelphia, PA: W. B. Saunders Company.

McDonald, R. P. (1985). Factor Analysis and Related Methods. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc., Publishers.
Muthen, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika.

Muthen, L. & Muthen, B. (2001). Mplus User's Guide (Version 2). Los Angeles, CA: Muthen & Muthen.

Uebersax, J. S. (2000). Estimating a latent trait model by factor analysis of tetrachoric correlations. Web site.

Waltman, K. K. & Dunbar, S. B. (1994). Dimensions of content and difficulty in binary test items. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans.

CONTACT INFORMATION

Lei Han, Ph.D.
Statistician
Center for AIDS Prevention Studies
University of California, San Francisco
74 New Montgomery St., Suite 600
San Francisco, CA
Phone: (415)
Fax: (415)
E-mail Address: [email protected]

Tor Neilands, Ph.D.
Specialist/Senior Statistician

Margaret Dolcini, Ph.D.
Principal Investigator

Center for AIDS Prevention Studies
University of California, San Francisco
74 New Montgomery St., Suite 600
San Francisco, CA

AUTHOR NOTES

The authors are grateful to Dr. Lance Pollack and Melissa Krone for an earlier review of a draft of this article.