EXPLORATORY FACTOR ANALYSIS
ORIGINALLY PRESENTED BY: DAWN HUBER FOR THE COE FACULTY RESEARCH CENTER
MODIFIED AND UPDATED FOR EPS 624/725 BY: ROBERT A. HORN & WILLIAM MARTIN (SP. 08)

The purpose of this lesson on exploratory factor analysis is to understand and apply statistical techniques to a single set of variables when the researcher is interested in discovering which variables in the set form coherent subsets that are relatively independent of one another. Variables that are correlated with one another but largely independent of other subsets of variables are combined into factors. Factors are thought to reflect underlying processes that have created the correlations among variables.

INTRODUCTION

The dataset (FACTOR.sav) that we will be using is part of a larger data set from Tabachnick and Fidell (2007). The study involved 369 middle-class, English-speaking women between the ages of 21 and 60 who completed the Bem Sex Role Inventory (BSRI). Respondents attribute traits to themselves by assigning numbers between 1 (never or almost never true of me) and 7 (always or almost always true of me) to each of the items. Forty-four items from the BSRI were selected for this research example.

DATA SCREENING

SAMPLE SIZE

A general rule of thumb is to have at least 300 cases for factor analysis. Solutions that have several high-loading marker variables (> .80) do not require such large sample sizes (about 150 cases should be sufficient) as solutions with lower loadings (Tabachnick & Fidell, 2007, p. 613). Our data set has an adequate sample size of 369 cases. Bryant and Yarnold (1995) state that one's sample "should be at least five times the number of variables. The subjects-to-variables ratio should be 5 or greater. Furthermore, every analysis should be based on a minimum of 100 observations regardless of the subjects-to-variables ratio" (p. 100).
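As a quick check, the rules of thumb above can be verified with simple arithmetic; the variable names below are ours, not part of the SPSS procedure:

```python
# Quick arithmetic check of the sample-size rules of thumb quoted above.
# The numbers come from the worked example; the names are illustrative.
n_cases = 369      # women who completed the BSRI
n_variables = 44   # BSRI items retained for the analysis

ratio = n_cases / n_variables                         # subjects-to-variables ratio
meets_bryant_yarnold = ratio >= 5 and n_cases >= 100  # Bryant & Yarnold (1995)
meets_300_rule = n_cases >= 300                       # general rule of thumb
```

With 369 cases and 44 items the ratio is about 8.4, so the data comfortably satisfy both guidelines.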
MISSING DATA

To check for missing data:
Click Analyze, Descriptive Statistics, then Frequencies
Click over all 44 items to the Variable(s): box (except Subno)
De-select [ ] Display frequency tables (this will produce a warning message; simply click OK)
Click OK
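Outside SPSS, the same per-item missing-value counts that the Frequencies table reports can be obtained with pandas. This is a sketch on a tiny hypothetical frame standing in for FACTOR.sav:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for FACTOR.sav: two items, one missing response.
df = pd.DataFrame({
    "helpful": [5, 6, np.nan, 7],
    "loyal":   [7, 7, 6, 5],
})

# Per-item missing counts, as in the first table of the SPSS output.
missing_per_item = df.isna().sum()
total_missing = int(missing_per_item.sum())
```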

The first table of the output identifies missing values for each item. Scrolling across the output, you will notice that there are no missing values for this set of data. If there were missing data, you would choose one of the usual options: estimate the missing values, delete cases or variables, or analyze a pairwise-deletion correlation matrix. If the pattern of missingness is nonrandom, or the sample size is small, consider estimation, but be aware that it can lead to overfitting the data, resulting in correlations that are too high. Please refer to Tabachnick and Fidell (2007) for more information about deleting and otherwise dealing with missing data.

DETECTING MULTIVARIATE OUTLIERS

For the sake of this training, we will start with an assessment of multivariate outliers. However, we would usually begin by screening for univariate outliers and checking assumptions. Many statistical methods are sensitive to outliers, so it is important to identify them and make decisions about what to do with them. Recall that a multivariate outlier is a case with an extreme combination of scores on two or more variables.

REASONS FOR OUTLIERS (TABACHNICK & FIDELL, 2007)

1. Incorrect data entry.
2. Failure to specify missing values in the computer syntax, so missing values are read as real data.
3. The outlier is not a member of the population that you intended to sample.
4. The outlier is representative of the population you intended to sample, but the population has more extreme scores than a normal distribution.

To check for multivariate outliers:
Click Analyze, Regression, then Linear
Dependent: Subno
Independent(s): all remaining 44 items
Click Save
Under Distances, check [ ] Mahalanobis
Click Continue
Click OK
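Under the hood, the Save, Mahalanobis option computes each case's squared Mahalanobis distance from the variable centroid. A sketch of that computation in Python, using random data as a stand-in for the 44 BSRI items, is:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_cases, n_vars = 369, 44
X = rng.normal(size=(n_cases, n_vars))   # hypothetical stand-in for the 44 items

# Squared Mahalanobis distance of each case from the variable centroid,
# the quantity SPSS saves as MAH_1.
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - mu
mah = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

# Critical value: chi-square at alpha = .001 with df = number of variables.
critical = stats.chi2.ppf(1 - 0.001, df=n_vars)
n_outliers = int((mah > critical).sum())
```

For 44 variables the critical value comes out to about 78.75, which is the cutoff the handout applies below.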

An output page will be produced. Minimize the output page and go to the Data View page. Once there, scroll over to the last column to see the Mahalanobis distances computed across all 44 variables. To detect whether a case is a multivariate outlier, one must know the critical value that the Mahalanobis distance must exceed. Using the criterion of α = .001 with 44 df (the number of variables), the critical Χ² = 78.75. According to Tabachnick and Fidell (2007), we are not using N − 1 for df because "Mahalanobis distance is evaluated as Χ² with degrees of freedom equal to the number of variables" (p. 99). Thus, all Mahalanobis distance values must be examined to see whether they exceed the critical value of Χ² = 78.75.

Due to the large number of Mahalanobis distance values to examine for the 44 items, an easy way to analyze them all is to:
Click Data, then Sort Cases
Scroll down the variable list, highlight the Mahalanobis Distance variable (MAH_1), and click it over to the Sort by: box
Under Sort Order, select Descending
Click OK

We can also sort by moving the cursor over the variable of interest (e.g., MAH_1), right-clicking the mouse, and clicking Sort Descending. The values in the Mahalanobis (MAH_1) column will then be arranged in descending order from highest to lowest. On the Data View page, examine the top values and determine how many cases meet the criterion for a multivariate outlier (i.e., > 78.75). For this set of data there should be 25 cases that are considered multivariate outliers, leaving 344 non-outlying cases, still an acceptable number of cases. We are opting to delete the 25 outlying cases. To delete the cases, highlight the gray numbers 1 through 25 (on the left of the screen) and then press the Delete key. Save As the modified data set: FACTORMINUSMVOUTLIERS.

OPTIONS FOR DEALING WITH OUTLIERS (TABACHNICK & FIDELL, 2007)

1. Delete a variable that may be responsible for many outliers, especially if it is highly correlated with other variables in the analysis.
2. If you decide that cases with extreme scores are not part of the population you sampled, then delete them.

3. If cases with extreme scores are considered part of the population you sampled, then one way to reduce the influence of a univariate outlier is to transform the variable to change the shape of the distribution to be more normal. Tukey said you are merely reexpressing what the data have to say in other terms (Howell, 2007).
4. Another strategy for dealing with a univariate outlier is to assign the outlying case(s) "a raw score on the offending variable that is one unit larger (or smaller) than the next most extreme score in the distribution" (Tabachnick & Fidell, 2007, p. 77).
5. Univariate transformations and score alterations often help reduce the impact of multivariate outliers, but they can still be a problem. These cases are usually deleted (Tabachnick & Fidell, 2007). All transformations, changes to scores, and deletions are reported in the results section with the rationale and with citations.

MULTICOLLINEARITY AND SINGULARITY

Multicollinearity occurs when the IVs are highly correlated. Singularity occurs when you have redundant variables. To test for multicollinearity and singularity, use the following SPSS commands:
Click Analyze, Regression, then Linear
Click Reset
Dependent: Subno
Independent(s): all 44 items (be sure not to include MAH_1)
Click Statistics
Check [ ] Collinearity diagnostics
Click Continue
Click OK

This will produce an output page. If the determinant of R and the eigenvalues associated with some factors approach 0, multicollinearity or singularity may exist. To investigate further, look at the SMCs for each variable where it serves as DV with all other variables as IVs (Tabachnick & Fidell, 2007, p. 614).
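The Tolerance column SPSS prints, and the SMC = 1 − Tolerance conversion, can both be reproduced from the inverse of the correlation matrix. This is a sketch on three hypothetical items, one of them nearly redundant so that it shows a low tolerance:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Hypothetical items: item3 is nearly a duplicate of item1, so it should
# show a low tolerance (high SMC); item2 is unrelated to the others.
item1 = rng.normal(size=n)
item2 = rng.normal(size=n)
item3 = item1 + 0.05 * rng.normal(size=n)
X = np.column_stack([item1, item2, item3])

R = np.corrcoef(X, rowvar=False)
R_inv = np.linalg.inv(R)

# Standard identity: SMC_i = 1 - 1/[R^-1]_ii, so tolerance_i = 1/[R^-1]_ii.
tolerance = 1.0 / np.diag(R_inv)   # the Tolerance column SPSS reports
smc = 1.0 - tolerance              # squared multiple correlations
```

The near-duplicate item shows an SMC near 1 (a multicollinearity flag), while the unrelated item keeps a tolerance near 1.0.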

Looking at the output page on the following page, under Collinearity Statistics, look at the Tolerance values for each item on the test. We want the Tolerance values to be high, closer to 1.0. Next, we want to explore the SMCs (squared multiple correlations) of each variable where it serves as DV with the rest as IVs in multiple correlation (Tabachnick & Fidell, 2007). Many programs, including SPSS, convert the SMC values for each variable to tolerance (1 − SMC) and report tolerance instead of SMC. Thus, we have to calculate the SMCs ourselves. Turn to the next page of this handout and, next to the tolerance values, calculate the SMCs for the first ten items (1 − Tolerance). We want the SMCs to be low, closer to .00. If any of the SMCs are one (1), then singularity is present. If any of the SMCs are very large (i.e., near one), then multicollinearity is present (Tabachnick & Fidell, 2007). The tolerance and SMC values were fine for this group of data. However, if the tolerance values are too low, we would want to scroll down to the next table and examine the Condition Index for each item. According to Tabachnick and Fidell (2007), we do not want the Condition Index values to be greater than 30. Examine the Condition Index for all 44 items. As you can see, the last 25 dimensions have Condition Indexes that are greater than 30. Because of these high Condition Indexes, you would next need to examine the Variance Proportions for those high Condition Index dimensions, which are located next to the Condition Index. According to Tabachnick and Fidell (2007), we do not want two Variance Proportions to be greater than .50 for any dimension. To explain further, look at the Variance Proportions of Dimension 45. Scroll across the page and see if there are two items with Variance Proportions that are greater than .50 for Dimension 45. Next, you have to make some decisions about multicollinearity.
Because we did not find evidence of any Variance Proportions that are greater than .50, we may decide that we do not have evidence of multicollinearity. However, one can also combine evidence (explore the SMCs, Tolerance values, Condition Indexes, and Variance Proportions) and decide whether there is combined evidence of multicollinearity. Generally, if the Condition Index and Variance Proportion values are high, then there is evidence of multicollinearity. For this set of data we have no evidence that multicollinearity or singularity exists. Save the output as MULTICOLLINEARITY.

[SPSS output: Coefficients table for the regression with Subject identification as the dependent variable and the 44 BSRI items (helpful through gentle) as predictors, showing Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig., and Collinearity Statistics (Tolerance, VIF). The numeric values did not survive transcription.]

NORMALITY

If principal factor analysis is used descriptively, then assumptions about distributions are not essential. However, normality of variables enhances the solution (Tabachnick & Fidell, 2007). When the number of factors is determined using statistical inference, multivariate normality is assumed. "Normality among single variables is assessed by skewness and kurtosis" (Tabachnick & Fidell, 2007, p. 613), and as such, the distributions of the 44 variables need to be examined for skewness and kurtosis. To obtain the skewness and kurtosis of the 44 variables:
Click Analyze, Descriptive Statistics, then Frequencies
Click Reset
Click over all 44 items to the Variable(s): box (be sure not to include Subno and MAH_1)
Click Statistics
Under Dispersion, check all; under Central Tendency, check all; under Distribution, check all
Click Continue
Click Charts
Select Histograms and check [ ] With normal curve
Click Continue
De-select [ ] Display frequency tables
Click OK

An output will be produced. Scroll to the top of the output to Frequencies. You will see the skewness values and their standard error values for all 44 items.

Skewness: "A distribution that is not symmetric but has more cases (more of a 'tail') toward one end of the distribution than the other is said to be skewed" (Norusis, 1994).
Value of 0 = normal
Positive value = positive skew (tail going out to the right)
Negative value = negative skew (tail going out to the left)

Divide the skewness statistic by its standard error. We want to know if this standard score significantly departs from normality. Concern arises when the skewness statistic divided by its standard error is greater than z = 3.29 (p < .001, two-tailed test) (Tabachnick & Fidell, 2007). To illustrate, calculate the standardized skewness of the item labeled helpful and provide the information asked for below. Keep in mind that you would do this for each of the 44 items.

helpful
Skewness Value = ______    Std. Error = ______
Skewness Standard Score = ______
Direction of the Skewness: ______
Significant Departure? (yes, no): ______

Scroll to the top of the output to Frequencies. You will see the kurtosis values and their standard error values for all 44 items.

Kurtosis: The relative concentration of scores in the center, the upper and lower ends (tails), and the shoulders (between the center and the tails) of a distribution (Norusis, 1994).
Value of 0 = mesokurtic (normal, symmetric)
Positive value = leptokurtic (shape is more narrow, peaked)
Negative value = platykurtic (shape is more broad, widely dispersed, flat)

Divide the kurtosis statistic by its standard error. We want to know if this standard score significantly departs from normality. Concern arises when the kurtosis statistic divided by its standard error is greater than z = 3.29 (p < .001, two-tailed test) (Tabachnick & Fidell, 2007). To illustrate, calculate the standardized kurtosis of the item labeled helpful and provide the information asked for below. Keep in mind that you would do this for each of the 44 items.

helpful
Kurtosis Value = ______    Std. Error = ______
Kurtosis Standard Score = ______
Direction of the Kurtosis: ______
Significant Departure? (yes, no): ______
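The standardized skewness and kurtosis calculations above can be sketched in Python. The item below is hypothetical, and the standard errors use the common large-sample approximations rather than SPSS's exact small-sample formulas (they agree closely at n = 369):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 369
item = rng.exponential(size=n)   # hypothetical, clearly positively skewed item

skew = stats.skew(item)          # the Skewness statistic
kurt = stats.kurtosis(item)      # excess kurtosis, so 0 = mesokurtic

# Approximate standard errors (SPSS reports slightly more exact values).
se_skew = np.sqrt(6.0 / n)
se_kurt = np.sqrt(24.0 / n)

z_skew = skew / se_skew          # compare against z = 3.29 (p < .001)
z_kurt = kurt / se_kurt
skew_significant = abs(z_skew) > 3.29
```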

Overall, many of the variables are negatively skewed and a few are positively skewed. However, "because the BSRI is already published and in use, no deletion of variables or transformations of them is performed" (Tabachnick & Fidell, 2007, p. 652). Save the output as NORMALITY.

LINEARITY

"Multivariate normality implies linearity, so linearity among pairs of variables is assessed through inspection of scatterplots" (Tabachnick & Fidell, 2007, p. 613). With 44 variables, however, examination of all pairwise scatterplots (about 1,000 plots) is impractical. Therefore, to spot-check for linearity, we will examine loyal (with strong negative skewness) and masculin (with strong positive skewness). To create a scatterplot:
Click Graphs, Legacy Dialogs, then Scatter/Dot
Click Simple Scatter (this should be the default)
Click Define
Y-Axis: masculin
X-Axis: loyal
Click OK

An output (graph) will then be produced. Save the output as LINEARITY. The scatterplot should show a balanced spread of scores. According to Tabachnick and Fidell (2007), if bivariate scatterplots are oval-shaped, the variables are normally distributed and linearly related. "Although the plot is far from pleasing, and shows departure from linearity as well as the possibility of outliers, there is no evidence of true curvilinearity. And again, transformations are viewed with disfavor considering the variable set and the goals of analysis" (Tabachnick & Fidell, 2007, p. 652).

CONDUCTING A PRINCIPAL FACTOR ANALYSIS

Click Analyze, Data Reduction, then Factor
Highlight all 44 items and click them over to the Variable(s): box (be sure not to include Subno and MAH_1)
Click Descriptives
Under Statistics, check [ ] Univariate descriptives and [ ] Initial solution (default)

Under Correlation Matrix, check [ ] Coefficients, [ ] Determinant, and [ ] KMO and Bartlett's test of sphericity
Click Continue
Click Extraction
Change Method to Principal axis factoring
Under Display, check [ ] Unrotated factor solution (default) and [ ] Scree plot
Click Continue
Click OK

An output will then be produced.

INTERPRETATION OF THE EXPLORATORY FACTOR ANALYSIS

To review the study, a sample of 369 middle-class, English-speaking women between the ages of 21 and 60 completed the Bem Sex Role Inventory (BSRI), and 44 items (variables) were used in the analysis. The research question is: Will the factor structure of the BSRI be similar to previous research indicating the presence of between three and five factors underlying the items of the BSRI for this sample of women?

The purpose of factor analysis is to study a set of variables and discover subsets of variables that are relatively independent from one another. The subsets of variables that correlate with each other are combined as factors (linear combinations of observed variables) and are thought to reflect underlying processes (latent variables) that have created the correlations among the observed variables.

Principal components analysis (PCA) uses the total variance (common variance + unique variance + error variance) to derive components (Hair et al., 2006). PCA is an empirical summary of the data set: it aggregates the correlated variables, and the variables produce the components. Common variance is variance in a variable that is shared with all other variables in the analysis. A variable's communality is the estimate of such shared variance. Unique variance is variance associated only with a specific variable and is not explained by correlations with other variables. Error variance cannot be explained by correlations with other variables either, but it is due to unreliability in data gathering, measurement error, or random selection.

Factor analysis (FA) focuses only on the common variance (covariance, communality) that each observed variable shares with the other observed variables. FA excludes unique and error variance, which confuse the understanding of underlying processes (latent variables). FA is the choice if a theoretical solution of factors is thought to cause or produce scores on the variables. The steps of the analysis are (1) selecting and measuring variables, (2) preparing the correlation matrix, (3) determining the factorability of R, (4) assessing the adequacy of extraction and determining the number of factors, (5) extracting and rotating the factors to increase interpretability, and (6) interpreting the results. Once a final solution is selected, validation continues using cross-validation, confirmatory factor analysis, and criterion validation methods (Tabachnick & Fidell, 2007).

FACTORABILITY OF R: There are several sources of information to determine whether the R matrix is likely to produce linear combinations of variables as factors. Look at the Correlation Matrix (R) produced on the output page. A matrix that is factorable should include several sizable correlations. The expected size depends, to some extent, on N (larger sample sizes tend to produce smaller correlations), but "if no correlation exceeds .30, use of FA is questionable because there is probably nothing to factor analyze" (Tabachnick & Fidell, 2007, p. 614). We want the correlations between items to be greater than .30. Interpret the correlation matrix: "High bivariate correlations, however, are not ironclad proof that the correlation matrix contains factors. It is possible that the correlations are between only two variables and do not reflect underlying processes that are simultaneously affecting several variables. For this reason, it is helpful to examine matrices of partial correlations where pairwise correlations are adjusted for effects of all other variables" (Tabachnick & Fidell, 2007, p. 614).
To examine partial correlations, look on the output page at the KMO. The Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy is a ratio: the numerator is the sum of all the squared correlation coefficients, and the denominator is the sum of all the squared correlation coefficients plus the sum of all the squared partial correlation coefficients (Norusis, 2003). A partial correlation is a value that measures the strength of the relationship between a dependent variable and a single independent variable when the effects of the other independent variables are held constant (Hair et al., 2006).

The following criteria are used to assess and describe sampling adequacy (Kaiser, 1974):
.90s = Marvelous
.80s = Meritorious
.70s = Middling
.60s = Mediocre
.50s = Miserable
Below .50 = Unacceptable

If the KMO is small, it is a good idea not to do a factor analysis. Please interpret the KMO below:
KMO Value: ______
Sampling Adequacy Criteria Rating: ______

Next, look at Bartlett's Test of Sphericity on the output page. Bartlett's (1954) Test of Sphericity is a notoriously sensitive test of the hypothesis that the correlations in a correlation matrix are zero. According to Tabachnick and Fidell (2007), "the test is likely to be significant with samples of substantial size even if correlations are very low. Therefore, use of the test is recommended only if there are fewer than, say, five cases per variable" (p. 614). Overall, we want Bartlett's Test of Sphericity to be significant so that we can reject the hypothesis that the correlations are zero. Interpret Bartlett's Test of Sphericity by providing the information asked for below.

Bartlett's Test of Sphericity
Approx. Chi-Square: ______
Significance: ______
What was your decision about the null hypothesis? ______

ADEQUACY OF EXTRACTION AND NUMBER OF FACTORS: An initial factor analysis is run using principal axis factoring with an unrotated factor solution, with the purpose of determining the adequacy of extraction and identifying the likely number of factors in the solution. PCA is often used for the same initial purpose.
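The factorability checks in this section (sizable correlations, the KMO, and Bartlett's test) can all be sketched directly from a correlation matrix. The data below are hypothetical items sharing one common factor, and the KMO and Bartlett formulas follow the definitions quoted above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, p = 369, 6
common = rng.normal(size=(n, 1))
# Hypothetical items sharing one common factor, so R should be factorable.
X = common + 0.8 * rng.normal(size=(n, p))

R = np.corrcoef(X, rowvar=False)
off_diag = np.abs(R[np.triu_indices(p, k=1)])
n_sizable = int((off_diag > 0.30).sum())     # correlations worth factoring

# KMO: squared correlations over squared correlations plus squared
# partial (anti-image) correlations, obtained from the inverse of R.
R_inv = np.linalg.inv(R)
d = np.sqrt(np.diag(R_inv))
partial = -R_inv / np.outer(d, d)
mask = ~np.eye(p, dtype=bool)
kmo = (R[mask] ** 2).sum() / ((R[mask] ** 2).sum() + (partial[mask] ** 2).sum())

# Bartlett's test of sphericity: H0 says R is an identity matrix.
chi2_stat = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
df = p * (p - 1) // 2
p_value = stats.chi2.sf(chi2_stat, df)
```

With a genuine common factor behind the items, every pairwise correlation exceeds .30, the KMO lands well above .50, and Bartlett's test is significant.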

Look at the communalities in the output. A communality of a variable is the proportion of its variance explained by the common factors. The initial communalities are the SMCs of each variable as DV with the others in the sample as IVs. Extraction communalities are SMCs between each variable as DV and the factors as IVs. Communalities range from 0 to 1, where 0 means that the factors don't explain any of the variance and 1 means that all of the variance is explained by the factors. Variables with small extraction communalities cannot be predicted by the factors, and you should consider eliminating them if the values are too small (< .20). How many extraction communalities are below .20? ______

A first check of the number of factors is obtained from the sizes of the eigenvalues reported as part of an initial run with principal axis factoring extraction. An eigenvalue (latent root) represents the amount of variance accounted for by a factor. Because the variance that each standardized variable contributes to a principal factor extraction is 1, a factor with an eigenvalue less than 1 is not as important, from a variance perspective, as an observed variable. Look at the output under the heading Total Variance Explained, and then under the heading Initial Eigenvalues. Under Total, examine how many factors have eigenvalues above one (1). How many factors are above an initial eigenvalue of 1.0? ______ There should be 11 factors above one. However, having 11 factors is not parsimonious. Thus, you may use eigenvalues over two (2) as the criterion in specifying which factors are the most worthy of further exploration. Tabachnick and Fidell (2007) say, "Eigenvalues for the first four factors are all larger than two, and, after the sixth factor, changes in successive eigenvalues are small. This is taken as evidence that there are probably between 4 and 6 factors" (p. 657). A second criterion is the scree test of eigenvalues plotted against factors.
Factors, in descending order, are arranged along the abscissa with eigenvalues as the ordinate. Usually the scree plot is negatively decreasing: the eigenvalue is highest for the first factor and moderate but decreasing for the next few factors before reaching small values for the last several factors. Examine the Scree Plot on your output page. According to Norusis (2003), the plot most often will show a distinct break between the steep slope of the large factors and the gradual trailing off of the rest of the factors, the "scree" that forms at the foot of a mountain. One should use only the factors before the scree begins. According to Hair et al. (2006), starting with the first factor, the plot slopes steeply downward initially and then slowly becomes an approximately horizontal line; "the point at which the curve first begins to straighten out is considered to indicate the maximum number of factors to extract" (p. 120). You look for the point where the line drawn through the points changes slope.
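Both the eigenvalue-greater-than-1 rule and the input to the scree plot come from the eigenvalues of R. A sketch on hypothetical two-factor data:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 369
# Hypothetical data: two independent common factors behind six items.
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack([f1 + rng.normal(size=n) for _ in range(3)] +
                    [f2 + rng.normal(size=n) for _ in range(3)])

R = np.corrcoef(X, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending, as SPSS lists them

n_above_one = int((eigenvalues > 1.0).sum())         # Kaiser's eigenvalue > 1 rule
```

The eigenvalues sum to the number of variables (the trace of R), and with two genuine factors exactly two eigenvalues exceed 1.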

Unfortunately, the scree test is not exact; it involves judgment of where the discontinuity in eigenvalues occurs, and researchers are not perfectly reliable judges (Tabachnick & Fidell, 2007). In the example, a single straight line can comfortably fit the first four eigenvalues. After that, another line, with a noticeably different slope, best fits the remaining eight points. Therefore, there appear to be about four (4) factors in the data. Once you have determined the number of factors by these criteria, it is important to look at the rotated loading matrix to determine the number of variables that load on each factor.

CREATING 4 FACTORS:

Click Analyze, Data Reduction, then Factor
Click Reset
Highlight all 44 items and click them over to the Variable(s): box (be sure not to include Subno and MAH_1)
Click Extraction
Change Method to Principal axis factoring
Under Extract, select Number of factors: and type in the number 4 (four)
Click Continue
Click Rotation
Under Method, select Varimax
Click Continue
Click OK

An output should be produced.

EXTRACTION AND ROTATING THE FACTORS TO INCREASE INTERPRETABILITY: We are now looking for the most parsimonious final solution of factors representing the R matrix and the theory of the problem related to the presence of between three and five factors underlying the items of the BSRI. We specified 4 factors for the run. Again, we will use principal axis factoring, which maximizes variance extracted by orthogonal

factors. It estimates communalities to attempt to eliminate unique and error variance from the variables (Tabachnick & Fidell, 2007). Principal axis factoring is the most commonly used FA extraction method and is often the beginning method used by researchers. There are several other extraction procedures available (see Tabachnick & Fidell, 2007, p. 633). It is common to use other procedures, perhaps varying the number of factors, communality estimates, and rotational methods with each run. "Analysis terminates when the researcher decides on the preferred solution" (Tabachnick & Fidell, 2007, p. 634). You want to increase the variance explained by the most parsimonious set of factors.

An extraction procedure is usually accompanied by rotation to improve the interpretability and scientific utility of the solution. The purpose of rotation is to achieve a simple structure in which each factor has large loadings in absolute value for only some of the variables, making it easier to identify (Norusis, 2003). If effective, rotation amplifies high loadings (correlations) of variables with factors and reduces low loadings. A geometric illustration of rotation is on page 641 of Tabachnick and Fidell (2007). Orthogonal factor rotation is used when each factor is independent (orthogonal) of all other factors; the factors are extracted so that their axes are maintained at 90 degrees. Oblique factor rotation is used when the extracted factors are correlated with each other, and it identifies the extent to which the factors are correlated. We chose to use the most common orthogonal rotation method, known as varimax. A varimax rotation minimizes the complexity of factors by maximizing the variance of loadings on each factor (Tabachnick & Fidell, 2007). Again, there are several rotational techniques for both orthogonal and oblique solutions, and they are identified on page 639 of Tabachnick and Fidell (2007).
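Varimax rotation itself is a small algorithm. The sketch below is a common textbook implementation, not the SPSS routine (SPSS additionally applies Kaiser normalization, which is omitted here); it mixes a perfect simple structure and checks that varimax recovers it:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Plain varimax rotation of a loading matrix (standard SVD-based
    algorithm; SPSS adds Kaiser normalization on top of this)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # SVD step that maximizes the variance of squared loadings per factor.
        u, s, vt = np.linalg.svd(
            L.T @ (rotated ** 3 -
                   rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return L @ rotation

# Hypothetical demonstration: mix a simple structure by 30 degrees,
# then check that varimax rotates it back.
simple = np.array([[0.80, 0.00], [0.70, 0.00], [0.60, 0.00],
                   [0.00, 0.75], [0.00, 0.65], [0.00, 0.55]])
theta = np.pi / 6
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
recovered = varimax(simple @ mix)
```

After rotation each row again has one large loading and one near zero, which is exactly the simple structure the handout describes.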
As with extraction methods, it is acceptable and common to experiment with various extraction and rotation procedures before deciding upon the preferred solution (Tabachnick & Fidell, 2007). We are using principal axis factoring extraction and varimax rotation with Kaiser normalization. Kaiser normalization is automatically a part of the analysis; it rescales the rotated matrix to restore the original row sums of squares.

INTERPRETING THE RESULTS: Next, we will interpret the results. In actuality, we might run several different FAs using differing numbers of factors, extractions, and rotations to find the most parsimonious solution. Moreover, for development of a new instrument especially, it is likely that cases and variables will be deleted as you make several FA runs. Deletion of variables is done when underlying assumptions are not met, and is decided by looking at communalities, loadings, inter-correlations, and coefficient alphas. But here we show the final solution chosen. If cases or variables are deleted, then the KMO, Bartlett's Test of Sphericity, and the communalities should be assessed for each run in which something changes. Look at the Communalities table on your output. Under the Extraction heading, we want values to be greater than .20. Looking at the output, you can see that there are several variables below .20. Identify the number of extraction communalities below .20: ______

Having many extraction communalities less than .20 indicates that those items are not loading well on the factors. However, Tabachnick and Fidell (2007) explain that factorial purity was not a consideration in the development of the BSRI, which means that when the BSRI was developed there was no concern with items loading on certain factors.

Next, examine the table labeled Total Variance Explained on your output. Under Rotation Sums of Squared Loadings, you can see that the four factors have eigenvalues greater than two (2).

Factor    Total    % of Variance    Cumulative %
______    _____    _____________    ____________

Finally, examine the Rotated Factor Matrix table on your output. Factors are interpreted through their factor loadings. Factor loadings are the correlations between the original variables and the factors. Squaring a loading indicates what percentage of the variance in an original variable is explained by a factor. Tabachnick and Fidell (2007) decided to use a loading of .45 (20% variance overlap between variable and factor). Factors appear as columns and items appear as rows. Tabachnick and Fidell also recommend a minimum factor loading of .32. The greater the loading, the more the variable is a pure measure of the factor. Comrey and Lee (1992) suggest that loadings in excess of .71 (50% overlapping variance) are considered excellent, .63 (40% overlapping variance) very good, .55 (30% overlapping variance) good, .45 (20% overlapping variance) fair, and .32 (10% overlapping variance) poor. Choice of the cutoff for the size of loading to be interpreted is a matter of researcher preference (Tabachnick & Fidell, 2007). Look at the output for the Rotated Factor Matrix. For each factor column (there should be four of them), circle the values that exceed .45. There should be twelve (12) items circled for Factor 1, six (6) under Factor 2, five (5) under Factor 3, and three (3) under Factor 4. Examine the items circled and label the factors accordingly.
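The percentage figures in the Comrey and Lee (1992) scale are simply the squared loadings, rounded:

```python
# The Comrey and Lee (1992) labels follow directly from squaring each
# loading to get the variance overlap between variable and factor.
labels = {"excellent": 0.71, "very good": 0.63, "good": 0.55,
          "fair": 0.45, "poor": 0.32}
overlap = {name: round(loading ** 2, 2) for name, loading in labels.items()}
```

For example, a loading of .45 squared gives .20, i.e., the 20% variance overlap that the handout's cutoff corresponds to.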

[SPSS output: Rotated Factor Matrix for the 44 BSRI items (helpful through gentle) across the four factors. Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization. Rotation converged in 9 iterations. The loading values did not survive transcription.]

LABELING FACTORS:

One of the most important reasons for naming a factor is to communicate with others. The name should capsulize the substantive nature of the factor and enable others to grasp its meaning (Rummel, 1970). The choice of factor names should be related to the basic purpose of the factor analysis. If the goal is to describe or simplify the complex interrelationships in the data, a descriptive factor label can be applied. The descriptive approach to factor naming involves selecting a label that best reflects the substance of the variables that load highly, and near zero, on a factor. The factors are classificatory, and names to define each category are sought (Rummel, 1970). There are a number of considerations involved in descriptively naming factors:

1. Variables with zero or near-zero loadings are unrelated to the factor. In interpreting a factor, these unrelated variables should also be taken into consideration: the name should reflect what is not involved in a factor as well as what is (Rummel, 1970).

2. The squared loading gives the variance of a variable explained by an orthogonal factor. Squaring the loadings on a factor helps determine the relative weight the variables should have in interpreting that factor (Rummel, 1970).

3. The naming of factors with high positive and high negative loadings should reflect this bipolarity. One term may be appropriate, as is temperature for a hot-cold bipolar factor. Alternatively, each pole may be interpreted separately and the factor named by its opposites, e.g., hot versus cold (Rummel, 1970).

Review the names of the variables with circled loadings for each factor, look for a theme among the variable names, and choose a name to represent each factor.

Factor     Name (Label)
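Picking out the items that define each factor (the circling exercise above) amounts to a simple cutoff filter. A Python sketch with made-up loadings (the real values come from your Rotated Factor Matrix output):

```python
# Hypothetical loadings on four factors, for illustration only; the real
# numbers come from the Rotated Factor Matrix on your SPSS output.
loadings = {
    "warm":               [0.12, 0.67, 0.05, 0.10],
    "leadership ability": [0.71, 0.08, 0.11, 0.02],
    "moody":              [0.05, 0.02, 0.60, 0.14],
    "self reliant":       [0.52, 0.15, 0.09, 0.20],
}

def items_defining(loadings, factor, cutoff=0.45):
    """Items whose absolute loading on the given factor meets the cutoff."""
    return [item for item, row in loadings.items() if abs(row[factor]) >= cutoff]

print(items_defining(loadings, 0))  # ['leadership ability', 'self reliant']
```

Whatever theme the returned items share is a candidate name for that factor.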

INTERNAL CONSISTENCY OF FACTORS

Click Analyze, then Scale
Click Reliability Analysis
Click over the 44 items into the Items: box
Be sure not to include Subno and MAH_1
For the Model: box, be sure that Alpha is selected
Click OK

Cronbach's coefficient alpha is a measure of the internal consistency of the items of a total test, or of the scales of a test, based upon the scores of the particular sample. Values range from 0 to 1. Values at the higher end of the scale (> .70) suggest that the items of the total test or scale are measuring the same thing.

Interpret Cronbach's alpha by providing the information asked for below:

Cronbach's Alpha for all 44 items: ______     N of items: ______
Interpretation:

FOR EACH FACTOR (SCALE)

Next, examine the internal consistency of the items that have high factor loadings (i.e., > .45) on each of the four factors. These are the item loadings you circled for each of the four factors in the Rotated Factor Matrix.

Click Analyze, then Scale
Click Reliability Analysis
Click Reset
Click over the items for that factor into the Items: box
For the Model: box, be sure that Alpha is selected
Click OK
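Cronbach's alpha is computed from the item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / variance of totals), where k is the number of items. SPSS does this for you; the small Python sketch below is only to make the formula concrete:

```python
def cronbach_alpha(scores):
    """Coefficient alpha; scores holds one list of item responses per respondent."""
    k = len(scores[0])              # number of items

    def var(xs):                    # population variance
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / len(xs)

    item_vars = [var([resp[j] for resp in scores]) for j in range(k)]
    total_var = var([sum(resp) for resp in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Three respondents answering two items with perfect consistency:
print(round(cronbach_alpha([[1, 1], [2, 2], [3, 3]]), 3))  # 1.0
```

When the items move together across respondents, the total-score variance dominates the summed item variances and alpha approaches 1.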

Cronbach's Alpha for Factor 1: ______     N of items: ______
Interpretation:

Do the same procedure for the next three factors and interpret Cronbach's alpha by providing the information asked for below:

Cronbach's Alpha for Factor 2: ______     N of items: ______
Interpretation:

Cronbach's Alpha for Factor 3: ______     N of items: ______
Interpretation:

Cronbach's Alpha for Factor 4: ______     N of items: ______
Interpretation:

References

Bartlett, M. S. (1954). A note on the multiplying factors for various chi square approximations. Journal of the Royal Statistical Society, 16(Series B).

Bryant, F. B., & Yarnold, P. R. (1995). Principal-components analysis and exploratory and confirmatory factor analysis. In L. G. Grimm & P. R. Yarnold (Eds.), Reading and understanding multivariate statistics. Washington, DC: American Psychological Association.

Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Hair, J. F., Jr., Black, W. C., Babin, B. J., Anderson, R. E., & Tatham, R. L. (2006). Multivariate data analysis. Upper Saddle River, NJ: Pearson Prentice Hall.

Howell, D. C. (2007). Statistical methods for psychology (6th ed.). Belmont, CA: Thomson Wadsworth.

Kaiser, H. F. (1974). An index of factorial simplicity. Psychometrika, 39.

Norusis, M. J. (2003). SPSS 12.0 statistical procedures companion. Upper Saddle River, NJ: Prentice Hall.

Norusis, M. J. (1994). SPSS advanced statistics 6.1. Chicago, IL: SPSS Inc.

Rummel, R. J. (1970). Applied factor analysis. Evanston, IL: Northwestern University Press.

Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). Boston: Allyn and Bacon.


More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011

SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011 SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011 Statistical techniques to be covered Explore relationships among variables Correlation Regression/Multiple regression Logistic regression Factor analysis

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

Validation of the Core Self-Evaluations Scale research instrument in the conditions of Slovak Republic

Validation of the Core Self-Evaluations Scale research instrument in the conditions of Slovak Republic Validation of the Core Self-Evaluations Scale research instrument in the conditions of Slovak Republic Lenka Selecká, Jana Holienková Faculty of Arts, Department of psychology University of SS. Cyril and

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

DATA ANALYSIS AND INTERPRETATION OF EMPLOYEES PERSPECTIVES ON HIGH ATTRITION

DATA ANALYSIS AND INTERPRETATION OF EMPLOYEES PERSPECTIVES ON HIGH ATTRITION DATA ANALYSIS AND INTERPRETATION OF EMPLOYEES PERSPECTIVES ON HIGH ATTRITION Analysis is the key element of any research as it is the reliable way to test the hypotheses framed by the investigator. This

More information

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA) UNDERSTANDING ANALYSIS OF COVARIANCE () In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Descriptive Statistics

Descriptive Statistics Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

More information

Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

More information

Multivariate Analysis

Multivariate Analysis Table Of Contents Multivariate Analysis... 1 Overview... 1 Principal Components... 2 Factor Analysis... 5 Cluster Observations... 12 Cluster Variables... 17 Cluster K-Means... 20 Discriminant Analysis...

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

SPSS TUTORIAL & EXERCISE BOOK

SPSS TUTORIAL & EXERCISE BOOK UNIVERSITY OF MISKOLC Faculty of Economics Institute of Business Information and Methods Department of Business Statistics and Economic Forecasting PETRA PETROVICS SPSS TUTORIAL & EXERCISE BOOK FOR BUSINESS

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics

More information

Understanding Power and Rules of Thumb for Determining Sample Sizes

Understanding Power and Rules of Thumb for Determining Sample Sizes Tutorials in Quantitative Methods for Psychology 2007, vol. 3 (2), p. 43 50. Understanding Power and Rules of Thumb for Determining Sample Sizes Carmen R. Wilson VanVoorhis and Betsy L. Morgan University

More information

Multivariate Analysis (Slides 13)

Multivariate Analysis (Slides 13) Multivariate Analysis (Slides 13) The final topic we consider is Factor Analysis. A Factor Analysis is a mathematical approach for attempting to explain the correlation between a large set of variables

More information