The Not-Even-Remotely Close to Being a Complete Guide to SPSS / PASW Syntax. (For SPSS / PASW v.18+)


Dr. Bryan R. Burnham
Department of Psychology
University of Scranton

Table of Contents

What is SPSS / PASW?
    Where is it Available?
    Finding and Opening PASW
    Three Types of PASW Files
    Data Files (Data View)
    Defining and Adjusting Variables in Data Files (Variable View)
    Basic Structure of Output Files
    Data Files Associated with this Guide
The Syntax Editor
    Why Syntax? Because it's Better!
    Some Syntax Basics...It's Easy?
    Opening .sav files with Syntax
    Opening Microsoft Excel (.xls) Files with Syntax
    Opening Text (.txt) files with Syntax
Syntax for Basic Statistical Needs
    Variable Labels
    Value Labels
    Frequencies
    Descriptive Statistics
    SORT CASES
    SPLIT FILE
Correlation & Regression
    Pearson Correlations (Bivariate)
    Pearson Correlations (Partial)
    Univariate Regression (one regressor)
t-tests
    One-Sample t-test
    Independent Groups t-tests
    Correlated Samples (Paired Samples) t-tests
Analysis of Variance
    Oneway Analysis of Variance (via GLM)
    Between Subjects Factorial ANOVA (via GLM)
    Repeated Measures ANOVA (via GLM)
Chi Square
    Cross-Tabulation Procedure (Factorial Chi-Square)
    Oneway Chi-Square Goodness of Fit Test
    Alternative Method for Goodness of Fit Test

1. What is SPSS / PASW?

The Statistical Package for the Social Sciences (SPSS) is a software tool for analyzing sets of data. I have absolutely no idea what the acronym PASW stands for. I wish it were PAWS, because it would be easier to say. Anyway, PASW is just the newest version of SPSS (currently version 18). SPSS/PASW operates like a spreadsheet program, such as Microsoft Excel, and the data files look a lot like Excel files. Unlike Excel, PASW/SPSS is designed specifically for manipulating and analyzing data.

As part of your course requirements, you will gain a basic understanding of how to use PASW. Indeed, most statistical analyses are performed with PASW or some other software. So why do we teach you this stuff by hand? Why not just use PASW? Simply put, it's because without conceptual knowledge of where the results of an analysis done with PASW come from, they're just a bunch of numbers in a computer file! Thus, we teach you what the variance of a set of data is and where it comes from by showing you how it's calculated. This way, variance should make sense when using PASW. If my logic doesn't make sense, drop out of the course and preferably out of college. :-)

1.1 Where is it Available?

At the University of Scranton, SPSS / PASW is available in the Weinberg Memorial Library (WML) on the 1st floor and in group study rooms, Brennan Hall (BRN) rooms 102 and 201, McGurrin Hall (MGH) room 110, Hyland (HYL) Café and room 102 (where statistics classes are held), and Alumni Memorial Hall (AMH) rooms 214 and 202. It may be available in the PT/OT lab in the basement of Leahy Hall and in the Nursing Lab and the Stout Lab in McGurrin Hall. [1]

1.2 Finding and Opening PASW

From the Start Menu: All Programs → SPSS Inc. → PASW Statistics 18 → PASW Statistics 18 (the red icon with the gray sigma symbol, Σ).

1.3 Three Types of PASW Files

There are three main files associated with PASW (and SPSS):

1. Data Files contain data to be analyzed, and have the extension '.sav'. Data files look a lot like a Microsoft Excel spreadsheet, with columns, rows and cells. Columns represent variables, with an abbreviated name of the variable at the top of each column. Rows represent cases, or research subjects. That is, each row/case could be the data associated with an individual, or a sample. The cells and values within the file are the data. (See Figures 1 & 2)

2. Syntax Files are used to request that PASW conduct an analysis, and have the extension '.sps'. Hence, syntax files are command files that tell PASW what to do with data. I admit that most analyses and procedures in PASW can be obtained through the pull-down menus in the data file; but syntax is better, for reasons given later. Syntax files work like text files: you type in text-based commands for PASW to interpret and, hopefully, PASW runs your requested analyses on the data. (See Figure 5)

3. Output Files are generated in response to PASW running an analysis on a set of data, and have the extension '.spv' (in SPSS the extension is '.spo'). Importantly, if something was written incorrectly in the syntax file, PASW will produce a Warning, usually with no additional output. Most of an output file is in table format, with the exception of graphs and charts. (See Figure 3)

[1] Thanks to Dr. Barry Kuhle (University of Scranton) for compiling this list.

1.4 Data Files (Data View)

There are two different 'views' of a PASW data file:

1. Data View, where your data can be entered by hand, and where you can view the actual values of the working data file.

2. Variable View, where you can define parameters of your variables, such as how many decimals are showing, whether the variable is a string, a date, or a numeric variable, etc.

The figure below is a screen shot of the Data View in a blank PASW data file:

Figure 1: Data View of a blank PASW data file.

You can toggle between the Data View and the Variable View by clicking on the appropriate tab at the bottom left-hand corner of any data file. You can also toggle back and forth by double-clicking on any variable name; that is, by double-clicking a column heading in Data View or a row in Variable View. I will assume that you can figure out how to insert values into a data file, so I will not cover that here.

1.5 Defining and Adjusting Variables in Data Files (Variable View)

It is good to define the parameters of your variables first, so that when you run an analysis the output of any tables and graphs will be complete and understandable. Below is a screen shot of the Variable View in a blank PASW data file:

Figure 2: Variable View of a blank PASW data file.

Below, I've listed each of the parameters that can be seen at the top of each column in Variable View, with a brief description of what each parameter does:

NAME      The short name of the variable, which you enter. It must begin with a letter.
TYPE      Indicates whether a variable is numeric, a string, a date, etc. Clicking TYPE opens a dialogue box, in which you can specify the type of data contained in a variable.
WIDTH     How many digits or characters are allowed in a value of the variable.
DECIMAL   The number of decimal places displayed for numeric variables.
LABEL     Allows you to assign a longer, descriptive label to an abbreviated variable name. That is, you could 'name' a variable STAI, but 'label' the variable State Trait Anxiety Inventory at time 1. The abbreviated name appears under NAME, and the longer LABEL will appear on any tables or graphs in the output.
VALUES    Allows you to assign labels to the dummy-codes of a variable. For example, if your data file contains the variable Sex, a 0 could refer to males and a 1 could refer to females. But 0's and 1's are arbitrary unless they are defined. This packet will show you how to assign labels using syntax.
MISSING   Specifies which values PASW should treat as missing data.
COLUMNS   How wide the variable's column appears in the Data View. Normally this is set to eight.
ALIGN     Allows you to have the values in each column left-justified, right-justified, or centered.
MEASURE   Relevant to numeric variables. Indicates the measurement scale of a variable. It allows three levels: nominal, ordinal, and scale, where scale covers both interval and ratio data.

Most of these parameters are irrelevant for the time being. Later, you'll learn how to assign longer, more descriptive labels to a variable name, as well as how to label the values of a dummy-coded variable.

1.6 Basic Structure of Output Files

After you have opened a data file, written syntax commands to request an analysis, and then run that analysis, PASW will produce an output file, like that below:

Figure 3: Example output file.

The output file is what we are trying to get PASW to provide us. It presents, in table or graph form, the descriptive and/or inferential statistics requested. As you can see in Figure 3, the output contains a single table with a listing of several descriptive statistics (N, Minimum, Maximum, Mean, Standard Deviation), for two different variables (SAT_CR and SAT_M). Don't worry about the variable names right now; trust me, you'll know what they are in a bit. Later, when you have PASW run an analysis on a set of data, I will not include whole screen shots of the output. Rather, I'll simply paste the output tables into the document. (Gotta conserve megabytes!)

1.7 Data Files Associated with this Guide

The data file that will be used throughout most of this packet is 'GRE Therapy Data File.sav', and is available on my statistics course website, on the course files page. There are actually three data files with the same name but different formats ('GRE Therapy Data File.sav'; 'GRE Therapy Data File.xls'; and 'GRE Therapy Data File.txt'). I'll show you how to open each of these types of data files using syntax, so download each file. Here's a screen shot of a portion of the data file:

Figure 4: A portion of the data file used in this packet.

The file contains a set of data from a fictitious study that examined the influence of a new Study Drug and different Types of Tutoring on student scores on the Graduate Record Examination (GRE). The GREs are a set of standardized examinations, like the Scholastic Aptitude Tests (SATs). Most graduate school programs require applicants to report their GRE scores. The GREs contain three sections, like the SATs: (1) quantitative reasoning, (2) verbal reasoning, and (3) analytical writing.

In this fictitious study, researchers investigated whether two independent variables (Study Drug and Type of Tutoring) improved scores on each section of the GREs. For the independent variable Study Drug, subjects were given nothing (control group), a placebo (placebo group), or one of two different dosages of the drug (100 mg/day or 200 mg/day). For the independent variable Type of Tutoring, subjects were not tutored (control group), were tutored with other students in small groups (Group Tutoring), or were tutored one-on-one (Individual Tutoring). Subjects were tested at the beginning of the study during a pretest phase (before the independent variables were administered), and were tested several months later during a posttest phase (after the independent variables should have had an influence).

In addition to scores on each of the three sections of the GREs, there are a number of other variables included in the data set. Each subject's SAT scores were collected, their heights and weights were measured, and each subject was measured on their level of Trait Anxiety (enduring level of anxiety) and State Anxiety (temporary, situational anxiety). Trait and State anxieties were assessed using the State Trait Anxiety Inventory (STAI), during both the pretest and posttest phases. The table below lists the abbreviated NAME for each variable, along with a brief description of each variable.

Variable NAME   Description of Variable
ID              Identification number assigned to each subject.
Sex             Each subject's biological sex; dummy-coded, where 1 = male and 2 = female.
Coll_Class      Each subject's current year in college; dummy-coded, where 1 = Freshmen, 2 = Sophomore, 3 = Junior, and 4 = Senior.
Coll_Maj        Each subject's primary major; dummy-coded, where 1 = Psychology, 2 = History, 3 = Biology, 4 = Communications, 5 = English, and 6 = Mathematics.
Height_cm       Each subject's height, measured to the nearest 0.1 cm.
Weight_kg       Each subject's weight, measured to the nearest 0.1 kg.
SAT_CR          Each subject's score on the Critical Reading (CR) section of the SATs.
SAT_M           Each subject's score on the Mathematics (M) section of the SATs.
SAT_V           Each subject's score on the Verbal (V) section of the SATs.
SAT_Tot         Each subject's summed SAT score (SAT_CR + SAT_M + SAT_V).
GPA             Each subject's current cumulative GPA.
Drug_Group      Level of the independent variable Drug Group, into which the subject was assigned; dummy-coded, where 1 = Control Group (no drug given), 2 = Placebo Group, 3 = 100-mg of Drug/Day, and 4 = 200-mg of Drug/Day.
Tutor_Group     Level of the independent variable Tutor Group, into which the subject was assigned; dummy-coded, where 1 = Control Group (no tutoring), 2 = Group Tutoring, 3 = Individual Tutoring.
Pre_STAIt       Each subject's trait anxiety (t) during the pretest phase; measured using the State Trait Anxiety Inventory (STAI).
Pre_STAIs       Each subject's state anxiety (s) during the pretest phase; measured using the State Trait Anxiety Inventory (STAI).
Pre_GREv        Each subject's score on the Verbal Reasoning (v) section of the GREs, during the pretest phase.
Pre_GREq        Each subject's score on the Quantitative Reasoning (q) section of the GREs, during the pretest phase.
Pre_GREa        Each subject's score on the Analytical Writing (a) section of the GREs, during the pretest phase.
Post_STAIt      Each subject's trait anxiety (t) during the posttest phase; measured using the State Trait Anxiety Inventory (STAI).
Post_STAIs      Each subject's state anxiety (s) during the posttest phase; measured using the State Trait Anxiety Inventory (STAI).
Post_GREv       Each subject's score on the Verbal Reasoning (v) section of the GREs, during the posttest phase.
Post_GREq       Each subject's score on the Quantitative Reasoning (q) section of the GREs, during the posttest phase.
Post_GREa       Each subject's score on the Analytical Writing (a) section of the GREs, during the posttest phase.

Table 1: Variable NAMES and brief descriptions.
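Once the data file is open, it can be handy to check the variables against Table 1. The short sketch below is one way to do that, assuming the .sav file sits on the Desktop (the path is only an illustration; substitute your own). DISPLAY DICTIONARY asks PASW to list every variable's name, label, and value labels in the output file.

```spss
* Open the data file (the path is an assumed location, not a required one).
GET FILE='C:\Documents and Settings\burnhamb2\Desktop\GRE Therapy Data File.sav'.

* List each variable with its label, format, and value labels.
DISPLAY DICTIONARY.
```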

2. The Syntax Editor

The Syntax Editor looks and works like a text editor (Text Pad, Note Pad, Word Pad). You type in what you want PASW to do, in the correct sequence and using PASW's language, and PASW does what you asked it to do (hopefully). If you have ever done a little computer programming (C, C++, Matlab, etc.), then this is just like writing code; albeit much simpler code! PASW Syntax files have the file extension *.sps. Here's an example of what the syntax editor looks like:

Figure 5: Example PASW syntax editor.

Note, if you use SPSS, then you won't have the various colors and the numbers for each line. The inclusion of different colors for different syntax statements in PASW is a huge improvement over SPSS.

From here on out, I won't be pasting in screen shots of the syntax that we'll be using. Rather, I'll just be writing the syntax that you need to include in order to run a specific analysis or procedure. For example, rather than including a screen shot like Figure 5, I'll type out the syntax (with the appropriate colors and line numbers). Note that you do not have to type out the line numbers. Thus, the syntax in Figure 5 will appear as:

    1 GET DATA
    2 /TYPE=XLS
    3 /FILE='C:\Documents and Settings\burnhamb2\My Documents\Class Materials\PSYC 210'+
    4 'Statistics\SPSS Assignments\SPSS-PASW Packet\GRE Therapy Data File.xls'
    5 /SHEET=name 'Sheet1'
    6 /CELLRANGE=full
    7 /READNAMES=on
    8 /ASSUMEDSTRWIDTH=.
    9 DATASET NAME DataSet2 WINDOW=FRONT.

Don't worry about what all of this means right now; it will make sense in a little while. :-)

2.1 Why Syntax? Because it's Better!

There are two methods that can be used to have PASW do stuff: (1) using pull-down menus, and (2) telling PASW what to do by writing syntax commands. (I'll refer to these as the wrong-way and right-way, respectively.) Is the syntax-method easier? No, but it's much more useful, for a variety of reasons.

First, you can do more within one syntax file and in a shorter time than with the pull-down menu method. Specifically, you can plan out all of the stuff you need PASW to do, write the appropriate syntax for everything, and then run it all at once. In contrast, with pull-down menus you have to do one thing at a time.

Second, you can do more with syntax. There are certain procedures that are simply not possible with the pull-down menus, but that are possible with syntax.

Third (and certainly not finally), if you go to grad school, especially in the sciences, you'll need to learn programming. I'm giving you a head start. You're welcome!

2.2 Some Syntax Basics...It's Easy?

PASW syntax is not case-sensitive, and that includes variable names: PASW preserves the capitalization of a name for display, but Sex, sex, and SEX all refer to the same variable. (Text inside quotes, such as a file path or a label, is kept exactly as you type it.) A variable name does have to be spelled correctly, or the syntax will not run. I suggest typing variable names as they appear in the data file and writing commands and sub-commands in CAPS, to help distinguish between commands and variables. This will allow you to parse the syntax quickly.
Syntax commands and sub-commands should be entered on separate lines. Not every syntax line has to end with a period (.), just the overall procedure. That is, if you look at the syntax in Figure 5, there is a period only on Line 8. This is because Lines 1-8 are, collectively, asking PASW to retrieve a data file; hence, these eight lines encompass one whole procedure.

Sub-commands within a command procedure must start with a forward slash (/), not a backslash. PASW will not know what to do with such sub-commands if the forward slash is not entered. For example, if you look at Figure 5, you can see a forward slash beginning Lines 2, 3, 5, 6, 7, and 8 (there is no slash in Line 4, because Line 4 is a continuation of Line 3).

It is good to enter 'EXECUTE.' at the end of a command procedure. Some commands will not take effect without it: data transformations (for example, COMPUTE and RECODE) are held as pending until EXECUTE, or a procedure that reads the data, is run. Statistical procedures such as FREQUENCIES run without it, but an extra EXECUTE does no harm.
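To make these rules concrete, here is a minimal sketch using variables from Table 1. The new variable name SAT_Check is hypothetical and only there for illustration; everything else follows the rules just described (sub-commands start with a forward slash, one period ends the whole procedure, and a transformation is followed by EXECUTE).

```spss
* One procedure with one sub-command. Only the final line ends with a period.
DESCRIPTIVES VARIABLES=SAT_CR SAT_M
  /STATISTICS=MEAN STDDEV MIN MAX.

* A transformation stays pending until EXECUTE or a procedure reads the data.
* SAT_Check is a made-up name for this illustration.
COMPUTE SAT_Check = SAT_CR + SAT_M + SAT_V.
EXECUTE.
```

If the data file matches its description, SAT_Check should equal SAT_Tot for every subject.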

Once your syntax is written, you need to run it in order to generate an output file. Highlight the syntax that you want to run and hit Ctrl+R to run the procedures. Or, instead of hitting Ctrl+R, click the Run Button on the toolbar. The Run Button is the green rightward-pointing arrow in the middle.

2.3 Opening .sav files with Syntax

I admit that if you have a PASW data file already created, you can really just locate that file and double-click it to open it. Nonetheless, here's how to open a PASW data file using syntax (notes follow):

    1 GET
    2 FILE='C:\Documents and Settings\burnhamb2\Desktop\GRE Therapy Data File.sav'.
    3 DATASET NAME DataSet1 WINDOW=FRONT.

The file directory address in Line 2 will differ, depending on where the file is placed on your hard drive. In this case, I placed the file on the Desktop for easy access. Note that the directory address for the file must be contained in single quotes ('). DATASET NAME on Line 3 should just be set to DataSet1 as listed.

An output file will be generated when you run any syntax. When opening a data set, the output file will contain only the commands that led to the opening of the file. You can delete that output file.

2.4 Opening Microsoft Excel (.xls) Files with Syntax

Below is an example of the syntax needed to open a data file saved as a Microsoft Excel spreadsheet:

    1 GET DATA
    2 /TYPE=XLS
    3 /FILE='C:\Documents and Settings\burnhamb2\Desktop\GRE Therapy Data File.xls'
    4 /SHEET=name 'Sheet1'
    5 /CELLRANGE=full
    6 /READNAMES=on
    7 /ASSUMEDSTRWIDTH=.
    8 DATASET NAME DataSet1 WINDOW=FRONT.

Notice that Line 1 here (GET DATA) is similar to Line 1 for opening a PASW data file (GET): GET retrieves a PASW data file, while GET DATA reads data stored in another format. You can think of this statement as the 'major command' that you are asking PASW to perform; all of the additional lines are sub-commands.
When opening an Excel spreadsheet, special care must be taken that you are asking PASW to open the correct sheet within the workbook (usually Sheet1), that you are asking for the correct cells in the worksheet, and that you have asked PASW to read in any variable names in the spreadsheet.

The sub-command on Line 2 (/TYPE) lists XLS, which is the file extension for Microsoft Excel files.

On Line 4 (/SHEET=name), the name between the single quotes ('Sheet1') is the name of the worksheet within the Excel workbook where the data are located. If the data sheet in the workbook has a different name or number, this needs to be changed here.

Line 5 (/CELLRANGE=full) refers to which cells within the named worksheet are to be imported into PASW. If all of the cells with data are to be imported, just use 'full', but if only some of the cells are to be imported, this should be indicated here (e.g., A1:B200).

Line 6 (/READNAMES=on) tells PASW that the first row of the Excel sheet contains the names of the variables, and that these should be treated as variable names. If the Excel sheet does not include variable names, then 'off' should be substituted for 'on'.

2.5 Opening Text (.txt) files with Syntax

Below is an example of the syntax necessary to open a data file that is saved as a text file:

    1 GET DATA
    2 /TYPE=TXT
    3 /FILE="C:\Documents and Settings\burnhamb2\Desktop\GRE Therapy Data File.txt"
    4 /DELCASE=LINE
    5 /DELIMITERS="\t"
    6 /ARRANGEMENT=DELIMITED
    7 /FIRSTCASE=2
    8 /IMPORTCASE=ALL
    9 /VARIABLES=
    10 ID F
    11 Sex F
    12 Coll_Class F
    13 Coll_Maj F
    14 Height_cm F
    15 Weight_kg F
    16 SAT_CR F
    17 SAT_M F
    18 SAT_V F
    19 SAT_Tot F
    20 GPA F
    21 Drug_Group F
    22 Tutor_Group F
    23 Pre_STAIt F
    24 Pre_STAIs F
    25 Pre_GREv F
    26 Pre_GREq F
    27 Pre_GREa F
    28 Post_STAIt F
    29 Post_STAIs F
    30 Post_GREv F
    31 Post_GREq F
    32 Post_GREa F
    33 CACHE.
    34 EXECUTE.
    35 DATASET NAME DataSet4 WINDOW=FRONT.

First thing, I have no idea why the lines are not colored; I was surprised myself. This set of syntax is a bit longer, mainly because you need to tell PASW to read in each variable name from the text file (Lines 10-32). As with the PASW syntax for importing data from an Excel spreadsheet, you need to be careful to include certain commands.

On Line 2 (/TYPE=TXT), TXT is the file extension for text files.

Line 4 (/DELCASE=LINE) tells PASW that each new case (i.e., each subject) is on a different line (row) within the text file.

On Line 5 (/DELIMITERS="\t"), 'delimiters' define the boundaries between adjacent entries, that is, between data points in a data file. The \t tells PASW that the boundaries are defined by TABS.

Line 7 (/FIRSTCASE=2) tells PASW that the data in the text file actually begin on line 2; that is, the first case (subject) is on line 2 of the data file.

Line 8 (/IMPORTCASE=ALL) tells PASW to import all of the data. This can be changed if you only want to import some of the data file.

Lines 10-32 list the name of each variable in the data set. These variable names actually appear on line 1 of the text file.

Once you have opened a data set, you should save it as a PASW data file to be used in the future. Then, you can just double-click it to open it.

Throughout the remainder of this packet, when I am providing syntax examples or the output of a procedure, I am not going to provide too much commentary. I'd rather you explore the output and the syntax on your own to get a feel for everything.
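Two of the points above, importing only part of an Excel sheet and saving an imported data set as a PASW data file, can be sketched as follows. This is only an illustration: the output file name 'GRE Therapy Subset.sav' is hypothetical (chosen so that the original .sav file is not overwritten), and the Desktop path should be replaced with your own.

```spss
* Import only cells A1 through B200 of Sheet1. The first row holds variable names.
GET DATA
  /TYPE=XLS
  /FILE='C:\Documents and Settings\burnhamb2\Desktop\GRE Therapy Data File.xls'
  /SHEET=NAME 'Sheet1'
  /CELLRANGE=RANGE 'A1:B200'
  /READNAMES=ON.

* Save the imported data as a PASW data file under a new, made-up name.
SAVE OUTFILE='C:\Documents and Settings\burnhamb2\Desktop\GRE Therapy Subset.sav'.
```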

3. Syntax for Basic Statistical Needs

3.1 Variable Labels

In the data file, the NAME given to each variable is a short acronym. For example, 'ID' stands for 'Identification Number', 'Coll_Maj' stands for 'College Major', 'SAT_CR' stands for 'Critical Reading Score on the SATs', etc. So that you do not have to memorize each of these acronyms, it's a good idea to assign a LABEL to each variable. These VARIABLE LABELS do not show up in the data file, but will show up in an output file. Here is how to use the VARIABLE LABELS syntax to assign the label 'SAT Critical Reading Score' to SAT_CR (remember, you do not type the number at the beginning):

    1 VARIABLE LABEL SAT_CR 'SAT Critical Reading Score'.

All that you need to do is list the variable NAME (SAT_CR) followed by the LABEL you wish to assign (SAT Critical Reading Score). Be sure that the label is in single quotes. You can also assign labels to more than one variable at a time:

    1 VARIABLE LABEL SAT_CR 'SAT Critical Reading Score' SAT_M 'SAT Math Score'.

3.2 Value Labels

For independent variables that have several levels/groups, it is best to dummy-code those groups in the data file. That is, in the data file, male subjects and female subjects will not be called 'male' and 'female'; rather, they will be assigned arbitrary numbers. In the data file for this packet, for the variable 'Sex', males are assigned 1 and females are assigned 2. The numbers can be anything, as long as all males have the same number and all females have the same number. The reason is that if you want to compare levels/groups of an independent variable, PASW requires that they have numeric codes. The downside is that if you run an analysis that involves those groups/levels, only the arbitrary numbers will appear in the output. You'd have to memorize what the code 1 means for the variable Sex, versus what the code 1 means for another independent variable. But you can assign LABELS to the dummy-coded VALUES assigned to groups.
These VALUE LABELS will not show in the data file, but do show in output. Here is an example of how to use the VALUE LABELS syntax to assign labels to the dummy-coded males and females for the variable Sex:

    1 VALUE LABEL Sex 1 'Males' 2 'Females'.

If you want to assign labels to more than one independent variable at a time, it is best to use several individual commands:

    1 VALUE LABEL Sex 1 'Males' 2 'Females'.
    2 VALUE LABEL Coll_Class 1 'Freshmen' 2 'Sophomore' 3 'Junior' 4 'Senior'.
    3 VALUE LABEL Coll_Maj 1 'Psychology' 2 'History' 3 'Biology' 4 'Communications' 5 'English' 6 'Mathematics'.
    4 VALUE LABEL Drug_Group 1 'Control Group (no drug)' 2 'Placebo Group' 3 '100 mg/day Group' 4 '200 mg/day Group'.
    5 VALUE LABEL Tutor_Group 1 'Control Group (no tutoring)' 2 'Group Tutoring' 3 'Individual Tutoring'.

In the data file, I have assigned VALUE LABELS to each independent variable. Hence, when output is presented later in this packet, the groups will not show their dummy-codes; they will show the labels assigned by the syntax above.

3.3 Frequencies

The FREQUENCIES command is used to obtain a frequency table for a variable. The syntax below asks PASW to determine the frequency for each group within the variables Sex and Coll_Class. Note that the variable names have to be entered just as they appear at the top of the columns in the data file. Also, note that you can request frequencies for several variables at once. This is typical for most PASW commands: you can request a procedure for several variables simultaneously:

    1 FREQUENCIES VARIABLES=Sex Coll_Class
    2 /ORDER=ANALYSIS.

The syntax above provides the following output (comments were added by me):

[Output: "Statistics" table listing N (Valid and Missing) for Sex, Coll_Class, and Coll_Maj. Comment: how many cases (subjects) contribute to each of the variables.]

[Output: "Frequency Table" for Sex (rows Males, Females, Total) and for Coll_Class (rows Freshmen, Sophomore, Junior, Senior, Total), each with the columns Frequency, Percent, Valid Percent, and Cumulative Percent. Comment: each group that contributes to each variable is listed to the left.]
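As a variation on the FREQUENCIES command above, the sketch below requests a frequency table for a single variable (Coll_Maj), along with its mode and a bar chart of the frequencies. The choice of variable and of the extra sub-commands is mine, purely for illustration.

```spss
* Frequency table, mode, and bar chart for college major.
FREQUENCIES VARIABLES=Coll_Maj
  /STATISTICS=MODE
  /BARCHART FREQ
  /ORDER=ANALYSIS.
```

Because Coll_Maj is dummy-coded, the mode is reported as a code (1 through 6), so the value labels from Section 3.2 are what make the table and chart readable.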

3.4 Descriptive Statistics

Although descriptive statistics can be requested as a sub-command within many PASW commands, there is a specific DESCRIPTIVES command. As with the FREQUENCIES command, you can request descriptive statistics for several variables at the same time. In the syntax below, I asked PASW to compute descriptive statistics on the variables Height_cm and Weight_kg:

    1 DESCRIPTIVES VARIABLES=Height_cm Weight_kg
    2 /STATISTICS=MEAN SUM STDDEV VARIANCE RANGE MIN MAX SEMEAN KURTOSIS
    3 SKEWNESS.

You can request a variety of descriptive statistics. On Lines 2 and 3, I listed each descriptive statistic that can be requested; most should be self-explanatory, except for SEMEAN, which stands for standard error of the mean, and KURTOSIS and SKEWNESS, which refer to the peakedness of a distribution and the skewness of a distribution, respectively. In the output that follows, I did not request the KURTOSIS and the SKEWNESS statistics:

[Output: "Descriptive Statistics" table with one row each for Height_cm and Weight_kg and the columns N, Range, Minimum, Maximum, Sum, Mean (Statistic and Std. Error), Std. Deviation, and Variance; Valid N (listwise) = 240. Comments: each variable is listed in the far left column; each requested statistic is listed in a different column.]

3.5 SORT CASES

If you want to sort all of the cases in the data file in ascending or descending order, based on a certain variable, use the SORT CASES command. The syntax below asks PASW to arrange the data file in ascending order (A) based on the variable Coll_Class. In the data file, freshmen will appear first, then sophomores, followed by juniors, and finally seniors. If you want to sort in descending order, use (D) in place of (A). (There is no output for this syntax command.)

    1 SORT CASES BY Coll_Class(A).
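SORT CASES also accepts more than one sorting variable. As a small sketch (the choice of GPA as the second variable is only an example), the following sorts by college class in ascending order and, within each class, by GPA in descending order:

```spss
* Freshmen first, seniors last. Within each class, highest GPA first.
SORT CASES BY Coll_Class (A) GPA (D).
```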

3.6 SPLIT FILE

In Section 3.4 above, where PASW was asked to calculate descriptive statistics, each statistic was based on the n = 240 subjects in the data file. There is nothing wrong with this, but what if you wanted to look at the means and descriptive statistics for different groups? For example, you may want to look at students' mean weights and mean heights for each college class. But the output in Section 3.4 combines data from all four college classes. Luckily, PASW has a SPLIT FILE command that asks PASW to calculate statistics separately for the different groups within some independent variable. For example, say you wanted to examine the descriptive statistics by college class. First, you need to use the following syntax to 'split' the output file into different groups:

    1 SORT CASES BY Coll_Class.
    2 SPLIT FILE SEPARATE BY Coll_Class.

(The variable by which you want the output 'split' into different groups is listed after BY on Line 2.)

Next, run the same DESCRIPTIVES syntax as in Section 3.4:

    1 DESCRIPTIVES VARIABLES=Height_cm Weight_kg
    2 /STATISTICS=MEAN SUM STDDEV VARIANCE RANGE MIN MAX SEMEAN KURTOSIS
    3 SKEWNESS.

You will get the following output, which is the descriptive statistics performed on each group within the variable Coll_Class:

[Output: "Descriptive Statistics" table for Coll_Class = Freshmen, with rows Height_cm and Weight_kg and the columns N, Range, Minimum, Maximum, Sum, Mean (Statistic and Std. Error), Std. Deviation, and Variance; Valid N (listwise) = 57.]

[Output: "Descriptive Statistics" table for Coll_Class = Sophomore, with the same rows and columns.]

[Output: "Descriptive Statistics" table for Coll_Class = Junior, with rows Height_cm and Weight_kg and the columns N, Range, Minimum, Maximum, Sum, Mean (Statistic and Std. Error), Std. Deviation, and Variance; Valid N (listwise) = 63.]

[Output: "Descriptive Statistics" table for Coll_Class = Senior, with the same rows and columns; Valid N (listwise) = 55.]

When you're done using the SPLIT FILE command, don't forget to turn it off, or else all of your later output will be separated into different groups:

    1 SPLIT FILE OFF.
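For comparison, here is a sketch of the whole sequence in one run, using the LAYERED keyword in place of SEPARATE. With LAYERED, PASW places the groups together in a single table rather than producing a separate table per group. The shorter list of statistics is just my choice for the example.

```spss
* Sort first, then split. LAYERED puts all four classes in one table.
SORT CASES BY Coll_Class.
SPLIT FILE LAYERED BY Coll_Class.

DESCRIPTIVES VARIABLES=Height_cm Weight_kg
  /STATISTICS=MEAN STDDEV MIN MAX.

* Turn the split off so that later analyses use all subjects again.
SPLIT FILE OFF.
```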

4. Correlation & Regression

4.1 Pearson Correlations (Bivariate)

PASW can measure the statistical association between two variables in a variety of ways (e.g., Pearson correlation, Spearman correlation, Chi-Square, gamma coefficients). For the data in our file, we'll be dealing with how PASW can calculate the Pearson correlation between two variables. The CORRELATIONS syntax below asks PASW to calculate the Pearson correlation between the variables SAT_CR (SAT Critical Reading Score) and SAT_M (SAT Math Score). All that you need to do is list on Line 2 the variables between which you want the Pearson correlation measured:

    1 CORRELATIONS
    2 /VARIABLES= SAT_CR SAT_M
    3 /PRINT=TWOTAIL NOSIG
    4 /MISSING=PAIRWISE.

On Line 3, the TWOTAIL keyword tells PASW to run the inferential test on the Pearson correlation as a non-directional, two-tailed test. NOSIG asks PASW to flag the correlations that are statistically significant with an asterisk (*).

On Line 4, the /MISSING=PAIRWISE sub-command tells PASW what to do with any missing data points. (In this data file, there are no missing data.) If you have a missing data point, PASW must know what to do with that subject's data. You have two options: handle missing data PAIRWISE or LISTWISE. If you choose LISTWISE, any subject who has a missing data point for any variable will be excluded from all correlations. If you choose PAIRWISE, a subject will be excluded from only those correlations where the subject is missing a data point.

When you run the syntax above, you get the following output:

[Output: "Correlations" table with a row block for each of SAT_CR and SAT_M (Pearson Correlation, Sig. (2-tailed), N) and a column for each variable; Sig. (2-tailed) = .340 for the correlation between SAT_CR and SAT_M.]

Each variable is listed in its own column and its own row. To find the Pearson correlation between two variables, cross-reference one variable in the columns with the other variable in the rows. The Sig. (2-tailed) value under the Pearson correlation is the p-value for that correlation.
It is the exact alpha-level (α) associated with that size correlation (r = -.062) based on that sample size (n = 240). To interpret a p-value: if the listed p-value is less than your chosen alpha-level, which is generally α =.05 or less, then the correlation is significant. In this case, the Pearson correlation is not significant, because the p-value (p =.340) is greater than of 49
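As a variation on the syntax above, here is a minimal sketch of the same request with missing data handled LISTWISE instead of PAIRWISE. It assumes the same data file and variable names used above. With only two variables, the two options exclude exactly the same subjects, so the output should not change; the choice starts to matter once three or more variables are listed.

```spss
* Same Pearson correlation, but drop any subject missing a score on either variable.
CORRELATIONS
  /VARIABLES=SAT_CR SAT_M
  /PRINT=TWOTAIL NOSIG
  /MISSING=LISTWISE.
```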

It is also possible to calculate several Pearson correlations at the same time. The more variables that you list on the /VARIABLES sub-command line, the more correlations will be calculated. For example, in the syntax below, I have listed three variables (SAT_CR, SAT_M, and SAT_V). When I run this syntax, PASW will generate the Pearson correlation between each pair of variables:

1 CORRELATIONS
2 /VARIABLES= SAT_CR SAT_M SAT_V
3 /PRINT=TWOTAIL NOSIG
4 /MISSING=PAIRWISE.

[Output table: Correlations among SAT_CR, SAT_M, and SAT_V, reporting the Pearson Correlation, Sig. (2-tailed), and N for each pair; the SAT_V row shows .481**.]

You can see in the output above that, in addition to the correlation between SAT_CR and SAT_M that was calculated earlier, PASW also calculated the correlation between SAT_CR and SAT_V (r =.541), and between SAT_M and SAT_V (r = -0.48).

PASW also has a sub-command that allows you to request descriptive statistics for each variable, as well as the sums of squares, variances, sums of cross products, and covariances. On Line 4 of the syntax below, the DESCRIPTIVES keyword requests the means and standard deviations for each variable, and the XPROD keyword requests the variability and covariability measures:

1 CORRELATIONS
2 /VARIABLES=SAT_CR SAT_M SAT_V
3 /PRINT=TWOTAIL NOSIG
4 /STATISTICS DESCRIPTIVES XPROD
5 /MISSING=PAIRWISE.

Here is the output from the last set of syntax. The first table includes the descriptive statistics for each variable, and the second table includes the Pearson correlations, measures of variability, and measures of co-variability:

[Output table: Descriptive Statistics (Mean, Std. Deviation, N) for SAT_CR, SAT_M, and SAT_V.]

[Output table: Correlations among SAT_CR, SAT_M, and SAT_V. For each pair, the rows are Pearson Correlation, Sig. (2-tailed), Sum of Squares and Cross-products, Covariance, and N.]

In the row Sum of Squares and Cross-products: between two different variables, this is the sum of cross products; between a variable and itself, this is the sum of squares. In the row Covariance: between two different variables, this is the covariance; between a variable and itself, this is the variance.

4.2 Pearson Correlations (Partial)

Having PASW calculate the partial correlation between two variables (the correlation between two variables with the influence of other variables factored out from both variables) is not much different from asking PASW to calculate a raw (zero-order) correlation. For example, say you want to calculate the partial correlation between GPA and Pre_GREv scores (Pretest GRE Verbal Reasoning Scores), while factoring out the SAT_CR scores (Critical Reasoning Scores on the SAT) from both variables.

In the syntax below, on the /VARIABLES sub-command line, the two variables listed before the BY (GPA and Pre_GREv) are the variables between which we want to calculate a partial correlation. The variable that comes after the BY (SAT_CR) is the variable we want factored out of the other variables. Please note that you can ask PASW to factor out more than one variable:

1 PARTIAL CORR
2 /VARIABLES=GPA Pre_GREv BY SAT_CR
3 /SIGNIFICANCE=TWOTAIL
4 /STATISTICS=DESCRIPTIVES CORR
5 /MISSING=LISTWISE.

On Line 3, /SIGNIFICANCE=TWOTAIL asks PASW to run the inferential test on the partial correlation as a non-directional, two-tailed test. You have the option of selecting a ONETAIL test as well. On Line 4, the /STATISTICS sub-command asks PASW to calculate the descriptive statistics (DESCRIPTIVES) for each variable. The CORR keyword asks PASW to provide the raw Pearson correlations between each pair of variables, in addition to the partial correlation between GPA and Pre_GREv.

Here is the output from the syntax above. The first table reports the descriptive statistics, and the second table reports the correlations and partial correlations. The areas in yellow are the raw Pearson correlations, and the areas in green are the partial correlations:

[Output table: Descriptive Statistics (Mean, Std. Deviation, N) for GPA, Pre_GREv, and SAT_CR.]

[Output table: Correlations. Under Control Variables "-none-": the zero-order Correlation, Significance (2-tailed), and df for each pair of GPA, Pre_GREv, and SAT_CR. Under Control Variable SAT_CR: the partial Correlation, Significance (2-tailed), and df between GPA and Pre_GREv; Significance (2-tailed) = .074.]
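To illustrate the note that more than one variable can be factored out, here is a minimal sketch that lists two control variables after the BY. The choice of SAT_M as the second control variable is only an assumption for illustration; any numeric variable in the data file could be listed there.

```spss
* Partial correlation between GPA and Pre_GREv, controlling for both SAT_CR and SAT_M.
PARTIAL CORR
  /VARIABLES=GPA Pre_GREv BY SAT_CR SAT_M
  /SIGNIFICANCE=TWOTAIL
  /MISSING=LISTWISE.
```

Every variable listed after the BY is factored out of both GPA and Pre_GREv at the same time.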

4.3 Univariate Regression (one regressor)

There is a mountain of stuff that you can do with PASW's REGRESSION procedure, including controlling how a regression analysis is performed and what statistics are requested. Below, I am performing a 'barebones' REGRESSION analysis to keep things simple. The analysis below will regress (predict) GPA on the Summed SAT Scores (SAT_Tot). Hence, GPA is the dependent variable (Y) and SAT_Tot is the predictor variable (X).

In the syntax below, PASW is being asked to regress GPA on SAT_Tot. The DEPENDENT (predicted, or regressed) variable is listed on Line 6. The predictor (independent, or regressor) variable is listed on Line 7 after the /METHOD sub-command. A few notes on Line 7: First, if you have more than one predictor, each predictor would be entered here. In this example we have only one predictor (SAT_Tot). Second, there are a number of methods that you can use to have PASW conduct the analysis (ENTER, STEPWISE, etc.), but this is beyond the scope of this packet. Just use METHOD=ENTER:

1 REGRESSION
2 /MISSING LISTWISE
3 /STATISTICS COEFF OUTS R ANOVA
4 /CRITERIA=PIN(.05) POUT(.10)
5 /NOORIGIN
6 /DEPENDENT GPA
7 /METHOD=ENTER SAT_Tot.

The /STATISTICS sub-command on Line 3 is where you can ask PASW to provide various statistics and inferential tests as part of the regression analysis. COEFF requests the slope and intercept coefficients in the regression model. OUTS asks PASW to list any predictors that were entered into the regression model but were not included because they did not meet the criteria specified on Line 4. R asks for the R and R² values of the regression model. ANOVA asks for the analysis of variance to be conducted on the overall regression model.

On Line 4, /CRITERIA=PIN(.05) POUT(.10) sets the inclusion and exclusion criteria for each regressor coefficient that is initially entered into the model. Basically, if a regressor coefficient does not meet these criteria, which are based on the t-tests for the coefficients, it is not included in the final regression model. These values can be adjusted, but .05 and .10 are used by default.

When you run the syntax above, you get the following output:

[Output table: Variables Entered/Removed. Model 1: Variables Entered = SAT_Tot; Method = Enter.] This table simply lists the predictor variables that are being entered into the regression analysis.

[Output table: Model Summary (R, R Square, Adjusted R Square, Std. Error of the Estimate).] This table provides the R and R² values. The R² is the proportion of explained variance.

[Output table: ANOVA (Sum of Squares, df, Mean Square, F, Sig.) with rows Regression, Residual, and Total.] The ANOVA is the overall analysis of the regression model.

[Output table: Coefficients (Unstandardized Coefficients B and Std. Error, Standardized Coefficients Beta, t, Sig.) with rows (Constant) and SAT_Tot.] This table provides the values of the coefficients in the regression equation, as well as t-tests on each coefficient.

You can also ask PASW to report descriptive statistics for each variable, correlations between variables, and a host of other information. In the syntax below, I added a /DESCRIPTIVES sub-command on Line 2 that asks for the MEAN and standard deviation (STDDEV) for each variable, the Pearson correlation (CORR) between each pair of variables, a significance test (SIG) on each correlation, and the number of subjects (N) contributing to each variable and to each correlation:

1 REGRESSION
2 /DESCRIPTIVES MEAN STDDEV CORR SIG N
3 /MISSING LISTWISE
4 /STATISTICS COEFF OUTS R ANOVA ZPP
5 /CRITERIA=PIN(.05) POUT(.10)
6 /NOORIGIN
7 /DEPENDENT GPA
8 /METHOD=ENTER SAT_Tot.

I also added ZPP to the /STATISTICS sub-command on Line 4. This asks PASW to calculate the zero-order, partial, and semi-partial (part) correlations between each predictor and the dependent variable. In this case, because no variable is being factored out of the relationship between GPA and SAT_Tot, each of these correlations will be the same. The output from this syntax appears below:

[Output table: Descriptive Statistics (Mean, Std. Deviation, N) for GPA and SAT_Tot.] The requested descriptive statistics for each variable.

[Output table: Correlations between GPA and SAT_Tot (Pearson Correlation, Sig. (1-tailed), N); Sig. (1-tailed) = .000.] This table lists the requested correlations and p-values.

[Output table: Variables Entered/Removed. Model 1: Variables Entered = SAT_Tot; Method = Enter.]

[Output table: Model Summary (R, R Square, Adjusted R Square, Std. Error of the Estimate).]

[Output table: ANOVA (Sum of Squares, df, Mean Square, F, Sig.) with rows Regression, Residual, and Total.]

[Output table: Coefficients (Unstandardized Coefficients B and Std. Error, Standardized Coefficients Beta, t, Sig., and Correlations: Zero-order, Partial, Part) with rows (Constant) and SAT_Tot.] Here are the requested zero-order, partial, and semi-partial correlations.
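As noted for the /METHOD sub-command, additional predictors are simply listed after ENTER. The sketch below is one possible illustration, assuming you want to predict GPA from SAT_CR and SAT_M separately instead of from their sum; that choice of predictors is mine and is not an analysis from this guide.

```spss
* Regress GPA on two predictors that are entered into the model together.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT GPA
  /METHOD=ENTER SAT_CR SAT_M.
```

With more than one predictor, the zero-order, partial, and part correlations requested by ZPP will generally no longer be identical, because each predictor's relationship with GPA is now adjusted for the other predictor.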

5. t-tests

5.1 One-Sample t-test

There are three t-tests PASW can perform on a set of data: the one-sample t-test, the independent-groups t-test (independent-samples t-test), and the correlated-samples t-test (paired-samples t-test). However, the statistics that can be requested and the test parameters that you can control are very limited.

The syntax below asks PASW to run a one-sample t-test. The dependent variable is GPA, which is entered on the /VARIABLES sub-command on Line 4:

1 T-TEST
2 /TESTVAL=3
3 /MISSING=ANALYSIS
4 /VARIABLES=GPA
5 /CRITERIA=CI(.95).

Importantly, for the one-sample t-test, you must state a value to which the mean of the dependent variable is compared. This value is entered after the /TESTVAL sub-command on Line 2. In this case, PASW is being asked to compare the mean GPA to a value of 3, which coincides with a grade of 'B'.

The /CRITERIA sub-command on Line 5 is pretty much all you have control over, besides the /TESTVAL on Line 2. The CI value tells PASW what size confidence interval and what alpha-level to use in the t-test. In this case, .95 corresponds to the 95% confidence interval and an alpha level of .05. If you run the syntax above, you get the following in the output file:

[Output table: One-Sample Statistics (N, Mean, Std. Deviation, Std. Error Mean) for GPA.] This table presents the descriptive statistics for the dependent variable.

[Output table: One-Sample Test, Test Value = 3 (t, df, Sig. (2-tailed), Mean Difference, 95% Confidence Interval of the Difference: Lower, Upper) for GPA.] This table presents the results of the inferential, one-sample t-test.

In the table for the One-Sample Test above, the Sig. (2-tailed) value is the p-value used as a basis for determining statistical significance. If it is less than your chosen alpha level (α = .05, or less), then the difference between the mean and the test value (3) is significant. In this case, the difference is not significant, because .803 > .05. The values underneath the heading 95% Confidence Interval of the Difference are the lower and upper boundaries of the 95% confidence interval around the difference between the mean and the test value.

As another example, the syntax below asks PASW to compare the mean pretest score on the Analytical Writing section of the GREs (Pre_GREa) to a test value of 4.9. This test value of 4.9 is actually the national mean score on that section of the GREs:

1 T-TEST
2 /TESTVAL=4.9
3 /MISSING=ANALYSIS
4 /VARIABLES=Pre_GREa
5 /CRITERIA=CI(.95).

Running this syntax, we get the following in the output file:

[Output table: One-Sample Statistics (N, Mean, Std. Deviation, Std. Error Mean) for Pre_GREa.]

[Output table: One-Sample Test, Test Value = 4.9 (t, df, Sig. (2-tailed), Mean Difference, 95% Confidence Interval of the Difference: Lower, Upper) for Pre_GREa.]

In this case, the One-Sample Test indicates that the mean difference (-.785) is statistically significant, because the p-value in the Sig. (2-tailed) column is less than the conventional alpha-level of α = .05.

5.2 Independent Groups t-tests

The syntax below illustrates how to conduct an independent groups t-test. Note that when comparing two different groups or levels within a between-subjects independent variable, you must be sure that the groups/levels of that independent variable have been dummy-coded; that is, assigned numeric values in the data file. PASW will not run the independent groups t-test if the groups have been assigned descriptive (string) labels in the data file.

Say that we want to compare the mean posttest score on the Verbal Reasoning Section of the GREs between different levels of the independent variable Tutor_Group. Specifically, we want to compare mean performance between the group of subjects who did not receive tutoring (Control Group) and the group of subjects who received individual tutoring (Individual Tutoring Group). Recall that, within the independent variable Tutor_Group, the group that did not receive tutoring was dummy-coded with 1 and the group that received individual tutoring was dummy-coded with 3 (the group that received group tutoring was dummy-coded with 2).

In the syntax below, after the T-TEST command, the GROUPS sub-command is listed. In the parentheses, the 1 and 3 are the values that were assigned to the no tutoring group and the individual tutoring group, respectively. The dependent variable (Post_GREv) is listed after the /VARIABLES sub-command on Line 3:

1 T-TEST GROUPS=Tutor_Group(1 3)
2 /MISSING=ANALYSIS
3 /VARIABLES=Post_GREv
4 /CRITERIA=CI(.95).

When you run this syntax, you get the following output:

[Output table: Group Statistics (N, Mean, Std. Deviation, Std. Error Mean) for Post_GREv in the Control Group (no tutoring) and the Individual Tutoring group.] This table presents the descriptive statistics on the dependent variable for each group within the independent variable.

[Output table: Independent Samples Test for Post_GREv, with rows Equal variances assumed and Equal variances not assumed. Columns: Levene's Test for Equality of Variances (F, Sig.) and t-test for Equality of Means (t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference: Lower, Upper).] This table presents the results of the independent groups t-test that is comparing the means in the table above.

The table Independent Samples Test lists a lot of information, some of which is relevant, some of which is less relevant. First, you will almost always assume equal variances, so be sure to use information from those rows. Second, Levene's Test for Equality of Variances is a test for whether the variances of the groups being compared are statistically equivalent. If Levene's Test is not significant, which is the case here, then we can assume that the variances are indeed equal.

The information under the heading t-test for Equality of Means is relevant to the independent groups t-test on the data, and most of the terms should be self-explanatory. Importantly, the Sig. (2-tailed) value is the p-value used for determining statistical significance. If it is less than a chosen alpha level (α = .05, or less), then the mean difference is significant, which is the case here. Please note that the mean difference is negative because of how the groups were entered into the t-test in the syntax. That is, the no tutoring group was entered first in the syntax and the individual tutoring group was entered second. This means that PASW will subtract the individual tutoring mean from the no tutoring mean. Thus, this value is negative only because of how the groups are being entered; it has nothing to do with any hypotheses.

A nice feature of the PASW independent groups t-test procedure is that you can run several t-tests that compare performance between the same two groups. For example, let's say we also want to compare the mean posttest score on the Analytical Writing Section of the GREs between the no tutoring group and the individual tutoring group. All that you have to do is add this dependent variable on the /VARIABLES sub-command on Line 3:

1 T-TEST GROUPS=Tutor_Group(1 3)
2 /MISSING=ANALYSIS
3 /VARIABLES=Post_GREv Post_GREa
4 /CRITERIA=CI(.95).

Running this syntax, we get the following output:

[Output table: Group Statistics (N, Mean, Std. Deviation, Std. Error Mean) for Post_GREv and Post_GREa in the Control Group (no tutoring) and the Individual Tutoring group.]

[Output table: Independent Samples Test for Post_GREv and Post_GREa, each with rows Equal variances assumed and Equal variances not assumed. Columns: Levene's Test for Equality of Variances (F, Sig.) and t-test for Equality of Means (t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference: Lower, Upper).]
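To compare a different pair of groups, only the values in the parentheses need to change. As a minimal sketch, using the dummy codes described above (1 = no tutoring, 2 = group tutoring), the following would compare the no tutoring group with the group tutoring group on the same dependent variable:

```spss
* Compare the no tutoring group (1) with the group tutoring group (2).
T-TEST GROUPS=Tutor_Group(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=Post_GREv
  /CRITERIA=CI(.95).
```

As before, PASW subtracts the mean of the second group listed from the mean of the first, so the sign of the mean difference depends on the order of the values in the parentheses.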

5.3 Correlated Samples (Paired Samples) t-tests

Recall that the correlated samples t-test is used to compare performance on some dependent variable across levels of a within-subjects independent variable. In PASW, the correlated samples t-test is called the paired samples t-test. As was the case for the one-sample t-test and independent groups t-tests, there is not much control over what you can request for the paired samples t-test.

Say we want to compare trait-anxiety levels between the pretest and posttest periods. Recall that, in the hypothetical study, the researchers measured each subject's state anxiety and trait anxiety using the State-Trait Anxiety Inventory (STAI), and these types of anxiety were measured during the pretest and posttest periods. If we are interested, specifically, in the change in trait anxiety between the pretest and posttest periods, we're going to want to compare the Pre_STAIt mean with the Post_STAIt mean.

The syntax below asks PASW to compare Pre_STAIt with Post_STAIt scores. On the T-TEST command line, the PAIRS sub-command tells PASW to run a paired samples t-test. The levels of the variable being compared come before and after the WITH. Thus, Line 1 is basically telling PASW to compare Pre_STAIt scores WITH Post_STAIt scores using a PAIRED samples t-test:

1 T-TEST PAIRS=Pre_STAIt WITH Post_STAIt (PAIRED)
2 /CRITERIA=CI(.95)
3 /MISSING=ANALYSIS.

Running this syntax, you get the following output:

[Output table: Paired Samples Statistics (Mean, N, Std. Deviation, Std. Error Mean) for Pair 1: Pre_STAIt and Post_STAIt.]

[Output table: Paired Samples Correlations (N, Correlation, Sig.) for Pair 1: Pre_STAIt & Post_STAIt.]

[Output table: Paired Samples Test for Pair 1: Pre_STAIt - Post_STAIt. Paired Differences (Mean, Std. Deviation, Std. Error Mean, 95% Confidence Interval of the Difference: Lower, Upper), t, df, Sig. (2-tailed).]

In the table Paired Samples Test, most of the statistics should be familiar and straightforward. Mean is the mean difference in the dependent variable between the levels of the independent variable. Note that this value is positive because of how PASW entered the levels of the independent variable into the t-test. In the syntax, Pre_STAIt was entered before WITH and Post_STAIt was entered after WITH; hence, the Post_STAIt mean was subtracted from the Pre_STAIt mean. Thus, it is positive only because of how the levels were entered. In this example, the mean difference (.175) is not statistically significant, because the p-value (.163) is greater than .05.

You can request several paired samples t-tests at the same time. For example, in addition to comparing the Pre_STAIt mean with the Post_STAIt mean, say we also want to compare the pretest and posttest scores from the Quantitative Reasoning section of the GREs (Pre_GREq compared to Post_GREq). In the syntax below, two variables are listed before WITH (Pre_STAIt and Pre_GREq) and two variables are listed after WITH (Post_STAIt and Post_GREq):

1 T-TEST PAIRS=Pre_STAIt Pre_GREq WITH Post_STAIt Post_GREq (PAIRED)
2 /CRITERIA=CI(.95)
3 /MISSING=ANALYSIS.

When the syntax is run, PASW will compare the mean of the first variable before WITH (Pre_STAIt) with the mean of the first variable after WITH (Post_STAIt), and PASW will compare the mean of the second variable before WITH (Pre_GREq) with the mean of the second variable after WITH (Post_GREq). Thus, it is critical to enter the variables on each side of the WITH in the appropriate order when running several paired-samples t-tests. Running this syntax provides the following output:

[Output table: Paired Samples Statistics (Mean, N, Std. Deviation, Std. Error Mean) for Pair 1: Pre_STAIt and Post_STAIt, and Pair 2: Pre_GREq and Post_GREq.]

[Output table: Paired Samples Correlations (N, Correlation, Sig.) for Pair 1: Pre_STAIt & Post_STAIt, and Pair 2: Pre_GREq & Post_GREq.]

[Output table: Paired Samples Test for Pair 1: Pre_STAIt - Post_STAIt, and Pair 2: Pre_GREq - Post_GREq. Paired Differences (Mean, Std. Deviation, Std. Error Mean, 95% Confidence Interval of the Difference: Lower, Upper), t, df, Sig. (2-tailed).]
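Because the sign of the mean difference depends only on the order of the variables around WITH, one way to see this for yourself is to swap them. The sketch below is just the first paired samples example with the two variables reversed:

```spss
* Same paired samples t-test, but with the posttest entered before WITH.
T-TEST PAIRS=Post_STAIt WITH Pre_STAIt (PAIRED)
  /CRITERIA=CI(.95)
  /MISSING=ANALYSIS.
```

Here the Pre_STAIt mean is subtracted from the Post_STAIt mean, so the mean difference and the t-value should change sign, while their magnitudes and the p-value should stay the same.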

6. Analysis of Variance

6.1 Oneway Analysis of Variance (via GLM)

Analysis of Variance (ANOVA) is used, among other reasons, to compare performance on a dependent variable across two or more levels of one or more independent variables. Oh the things I could say about ANOVA and experimental design! Alas, we do not have time. The PASW procedure for ANOVA is the General Linear Model (GLM). Don't worry about what it means; just know that it calculates F-tests for single-factor and factorial designs.

ANOVA can be used when the levels of an independent variable are manipulated (experimental design) or naturally occurring (quasi-experimental design). Critically, setting up ANOVA in PASW requires you to think about the design: Is there one independent variable, or more? How many levels of each independent variable are there? Do the levels of the independent variables differ between-subjects or within-subjects? I don't want to get technical, so I'll be as simple as possible.

From the data set, say we want to compare the Posttest GRE Verbal Reasoning Scores (Post_GREv) across the four groups within the independent variable Drug_Group. Thus, we have a oneway ANOVA; that is, one independent variable and one dependent variable. The syntax below presents the minimal set of sub-commands needed to run a oneway ANOVA. This syntax is used only if the independent variable is between-subjects (within-subjects variables require a repeated measures GLM):

1 UNIANOVA Post_GREv BY Drug_Group
2 /METHOD=SSTYPE(3)
3 /INTERCEPT=INCLUDE
4 /CRITERIA=ALPHA(.05)
5 /DESIGN=Drug_Group.

The variable before BY (Post_GREv) is always the dependent variable and the variable after BY (Drug_Group) is always the independent variable. If you have a factorial design, the additional independent variables would be entered here. On Line 2, the /METHOD sub-command tells PASW how the sums of squares should be calculated (SSTYPE), which is usually set to 3. On Line 4, the /CRITERIA sub-command tells PASW what alpha level to use. Finally, on Line 5, the /DESIGN sub-command is where you build the effects to be examined in the ANOVA. In the case of a oneway design, there is only one independent variable to influence the dependent variable; hence, you list that independent variable. When you are using ANOVA to analyze a factorial design, additional factors can be included. Running the syntax above gives you the following:

[Output table: Between-Subjects Factors (Value Label, N) for Drug_Group: 1 = Control Group (no drug), 2 = Placebo Group, 3 = 100 mg/day Group, 4 = 200 mg/day Group; N = 60 is shown for the Control Group.] This table lists each level of the independent variable, as well as the number of subjects (N) contributing to each level.

[Output table: Tests of Between-Subjects Effects, Dependent Variable: Post_GREv. Columns: Source, Type III Sum of Squares, df, Mean Square, F, Sig. Rows: Corrected Model, Intercept, Drug_Group, Error, Total (Sum of Squares = 4.586E7, df = 240), Corrected Total.] This table is the ANOVA summary table. The sums of squares, degrees of freedom, mean squares, F-tests, and p-values are listed here.

The ANOVA summary table (Tests of Between-Subjects Effects) contains a lot of information, some of it unnecessary for our present purpose. I have highlighted relevant portions of the table in yellow. The terms associated with between group variance (variability due to the independent variable) are in the row labeled Drug_Group, which is the independent variable. The terms associated with the within group variance are in the row labeled Error. Most values in each column should be straightforward: sums of squares for each source of variance are in the second column, degrees of freedom are in the third column, mean squares come next, followed by the F-test, and finally p-values. In this case, the F-test on the independent variable is not statistically significant, because the p-value (.075) is greater than the chosen alpha-level (.05). Let's assume the test was significant, so we can do post-hoc tests.

If you have a statistically significant F-test, you need to know between which levels of the independent variable there is a significant difference in the dependent variable: we need post-hoc tests. The syntax below includes additional sub-commands. First, the /POSTHOC sub-command on Line 4 asks PASW to compare levels of the independent variable Drug_Group using Fisher's Least Significant Difference test (LSD). You have several options for which post-hoc test to use (TUKEY, BONFERRONI), but we'll stick with LSD for now. On Lines 5 and 6, the /EMMEANS sub-command asks PASW to calculate the estimated mean of the dependent variable at each level of the independent variable. Specifically, Line 5 asks for the grand mean (OVERALL), and Line 6 asks for the estimated mean for each level of Drug_Group. Finally, the /PRINT sub-command on Line 7 asks PASW to include additional items in the output. Specifically, ETASQ requests the eta-squared measure of effect size, and DESCRIPTIVE asks for the descriptive statistics. There are many additional items that you can ask PASW to 'print' in the output, but we'll stick with these.

1 UNIANOVA Post_GREv BY Drug_Group
2 /METHOD=SSTYPE(3)
3 /INTERCEPT=INCLUDE
4 /POSTHOC=Drug_Group(LSD)
5 /EMMEANS=TABLES(OVERALL)
6 /EMMEANS=TABLES(Drug_Group)
7 /PRINT=ETASQ DESCRIPTIVE
8 /CRITERIA=ALPHA(.05)
9 /DESIGN=Drug_Group.

When you run the syntax, you get the following output:

[Output table: Between-Subjects Factors (Value Label, N) for Drug_Group: 1 = Control Group (no drug), 2 = Placebo Group, 3 = 100 mg/day Group, 4 = 200 mg/day Group; N = 60 is shown for the first and last groups.]

[Output table: Descriptive Statistics, Dependent Variable: Post_GREv. Mean, Std. Deviation, and N for each level of Drug_Group and for the Total.] This table comes from requesting DESCRIPTIVE as part of the /PRINT sub-command.

[Output table: Tests of Between-Subjects Effects, Dependent Variable: Post_GREv. Columns: Source, Type III Sum of Squares, df, Mean Square, F, Sig., Partial Eta Squared. Rows: Corrected Model, Intercept, Drug_Group, Error, Total (Sum of Squares = 4.586E7, df = 240), Corrected Total.]

Estimated Marginal Means

Estimated marginal means come from the /EMMEANS sub-commands. Table 1 comes from the OVERALL request on Line 5 of the syntax, and Table 2 comes from Line 6.

[Output table: 1. Grand Mean, Dependent Variable: Post_GREv (Mean, Std. Error, 95% Confidence Interval: Lower Bound, Upper Bound).]

[Output table: 2. Drug_Group, Dependent Variable: Post_GREv (Mean, Std. Error, 95% Confidence Interval: Lower Bound, Upper Bound) for each level of Drug_Group.]

Post Hoc Tests

Drug_Group

[Output table: Multiple Comparisons (LSD), Dependent Variable: Post_GREv. Columns: (I) Drug_Group, (J) Drug_Group, Mean Difference (I-J), Std. Error, Sig., 95% Confidence Interval (Lower Bound, Upper Bound). Each level of Drug_Group appears under (I) with the other three levels under (J). Mean differences flagged with * are those between the Control Group and the 200 mg/day Group and between the Placebo Group and the 200 mg/day Group.] This table presents all of the pairwise comparisons between levels of the independent variable; that is, all of the post hoc comparisons.

In the output above, the estimated marginal means and the descriptive statistics table provide more or less the same information: the means of each level of the independent variable Drug_Group. The table under Post Hoc Tests tells you which differences between levels of the independent variable are statistically significant.

To read the Post Hoc Tests (Multiple Comparisons) table: There are two columns (I and J), both of which are labeled with the independent variable (Drug_Group). Under column I, one level of the independent variable is listed, and in column J each of the other three levels of that independent variable is listed in a separate row. For example, the first level of the independent variable listed in column I is Control Group (no drug), and each of the other three levels of the independent variable is listed under column J: Placebo Group, 100 mg/day Group, 200 mg/day Group.

You should see a mean difference next to each of the groups in column J. This is the mean difference in the dependent variable between the level of the independent variable listed in column I and the level listed in column J (I minus J). Thus, the mean difference in Posttest Verbal Reasoning GRE scores between the Control Group and the Placebo Group is listed in the first row of the table, and the mean difference between the Control Group and the 100 mg/day Group is listed in the second row. (Note, they are negative only because of the direction PASW is subtracting.)

To determine whether a mean difference is statistically significant, look at the column labeled Sig. This column lists the p-value that can be used to determine whether the mean difference is significant. If the p-value is less than a chosen alpha level (α = .05, or less), then the mean difference is significant. In this data set, the only statistically significant mean differences are between the Control Group and the 200 mg/day Group (-26.00, p = .027) and between the Placebo Group and the 200 mg/day Group (-27.00, p = .021). But, it should be noted that because the F-test was not significant, these post-hoc, pairwise comparisons are meaningless.
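The other post-hoc tests mentioned earlier are requested the same way, by changing the keyword in the parentheses on the /POSTHOC sub-command. As a minimal sketch (the /EMMEANS lines are left out only to keep it short), the following asks for Tukey's test in place of Fisher's LSD:

```spss
* Oneway ANOVA on Post_GREv with Tukey post-hoc comparisons among the Drug_Group levels.
UNIANOVA Post_GREv BY Drug_Group
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /POSTHOC=Drug_Group(TUKEY)
  /PRINT=ETASQ DESCRIPTIVE
  /CRITERIA=ALPHA(.05)
  /DESIGN=Drug_Group.
```

The Multiple Comparisons table is read in the same way; only the method used to compute the p-values and confidence intervals changes.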

36 6.2 Between Subjects Factorial ANOVA (via GLM) Factorial designs examine the influence of two or more independent variables on a dependent variable, and several possible effects can be significant (or not) in a factorial ANOVA: main effects and interactions. (I assume you know what these are.) The PASW procedure for requesting a factorial ANOVA is not very different from requesting a oneway ANOVA. In the syntax the follows, we will cover how to request factorial ANOVA in PASW with two between-subjects independent variables. Say that we want to examine the influence of the independent variables Drug_Group and Tutor_Group on Posttest GRE Verbal Reasoning Scores (Post_GREv). Recall that Drug_Group has four levels (Control, Placebo, 100 mg/day, and 200 mg/day), and Tutor_Group has three levels (Control, Group Tutoring, and Individual Tutoring). Thus, we have a 4 (Drug_Group) x 3 (Tutor_Group) factorial design. The set of syntax, below, which we not actually run, includes minimum sub-commands needed to have PASW run a factorial ANOVA. On the UNIANOVA command line (Line 1), before the BY, the dependent variable (Post_GREv) is listed. After the BY, both independent variables are listed (Drug_Group and Tutor_Group). The inclusion of the second independent variable is one difference from the oneway ANOVA in Section 6.1. Lines 2 4 are exactly the same at the oneway ANOVA performed in Section 6.1, and need no additional commentary. 1 UNIANOVA Post_GREq BY Drug_Group Tutor_Group 2 /METHOD=SSTYPE(3) 3 /INTERCEPT=INCLUDE 4 /CRITERIA=ALPHA(.05) 5 /DESIGN=Drug_Group Tutor_Group Drug_Group*Tutor_Group. The /DESIGN sub-command on Line 5 is where you request effects to be included in the overall ANOVA design. Remember, in factorial designs there is the potential of a main effect of each independent variable, and the potential for interactions between independent variables. Thus, each main effect and interaction that should be included in the analysis should be listed here. 
To include a main effect in the design, list the name of that independent variable. In the syntax above, the inclusion of Drug_Group and Tutor_Group on Line 5 asks PASW to conduct F-Tests for those main effects. To include an interaction, list the independent variables that are part of the desired interaction with an asterisk (*) between them. In the syntax above, the inclusion of Drug_Group*Tutor_Group asks PASW to conduct an F-Test on that interaction. Because we have only two independent variables, this is the only possible interaction. With three or more independent variables, additional interactions could be listed here.

So that's it! Again, we won't run this syntax; I'll include some more stuff before presenting any output. Below, I have listed the minimum syntax for a oneway ANOVA alongside the minimum syntax for a factorial ANOVA, for comparison:

    Line  Oneway ANOVA                       Line  Factorial ANOVA
    1     UNIANOVA Post_GREv BY Drug_Group   1     UNIANOVA Post_GREq BY Drug_Group Tutor_Group
    2     /METHOD=SSTYPE(3)                  2     /METHOD=SSTYPE(3)
    3     /INTERCEPT=INCLUDE                 3     /INTERCEPT=INCLUDE
    4     /CRITERIA=ALPHA(.05)               4     /CRITERIA=ALPHA(.05)
    5     /DESIGN=Drug_Group.                5     /DESIGN=Drug_Group Tutor_Group Drug_Group*Tutor_Group.
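Returning to the point that three or more independent variables allow additional interactions on the /DESIGN sub-command, here is a minimal sketch of what that could look like. It assumes, purely for illustration, that college class (Coll_Class, a variable in the data file) is treated as a third between-subjects independent variable; this is a hypothetical analysis, not one that is run in this guide.

```spss
* Hypothetical three-factor ANOVA: using Coll_Class as a third factor is an assumption.
UNIANOVA Post_GREq BY Drug_Group Tutor_Group Coll_Class
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /CRITERIA=ALPHA(.05)
  /DESIGN=Drug_Group Tutor_Group Coll_Class
    Drug_Group*Tutor_Group Drug_Group*Coll_Class Tutor_Group*Coll_Class
    Drug_Group*Tutor_Group*Coll_Class.
```

With three factors there are three main effects, three two-way interactions, and one three-way interaction, so seven terms appear on /DESIGN. The rest of this section returns to the two-factor design.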

The syntax below (output follows) builds on the minimum factorial ANOVA syntax above. The /POSTHOC sub-command on Line 4 requests Fisher's LSD tests for the main effects of Drug_Group and Tutor_Group. Post hoc tests for interactions are usually done by way of a simple main effects analysis, or t-tests, but this is beyond the scope of this packet for now. The /EMMEANS sub-commands on Lines 5-8 ask PASW to calculate the grand mean (OVERALL, Line 5), the mean for each level of Drug_Group (Line 6), the mean for each level of Tutor_Group (Line 7), and the mean for each cell in the Drug_Group by Tutor_Group design (Line 8). Finally, the /PRINT sub-command on Line 9 asks PASW to provide descriptive statistics (DESCRIPTIVE) and the eta-squared measure of effect size (ETASQ, reported as Partial Eta Squared) for each F-Test:

    1   UNIANOVA Post_GREq BY Drug_Group Tutor_Group
    2     /METHOD=SSTYPE(3)
    3     /INTERCEPT=INCLUDE
    4     /POSTHOC=Drug_Group Tutor_Group(LSD)
    5     /EMMEANS=TABLES(OVERALL)
    6     /EMMEANS=TABLES(Drug_Group)
    7     /EMMEANS=TABLES(Tutor_Group)
    8     /EMMEANS=TABLES(Drug_Group*Tutor_Group)
    9     /PRINT=ETASQ DESCRIPTIVE
    10    /CRITERIA=ALPHA(.05)
    11    /DESIGN=Drug_Group Tutor_Group Drug_Group*Tutor_Group.

Running the syntax above, you get the following output:

    Between-Subjects Factors
                       Value Label                     N
    Drug_Group    1    Control Group (no drug)         60
                  2    Placebo Group
                  3    100 mg/day Group
                  4    200 mg/day Group                60
    Tutor_Group   1    Control Group (no tutoring)     80
                  2    Group Tutoring                  80
                  3    Individual Tutoring             80

This table lists each independent variable (far left) and each level of each independent variable (under Value Label), along with the number of subjects at each level.

Descriptive Statistics

This table lists the descriptive statistics (mean and std. deviation) for each level of each independent variable, as well as for each combination of the levels of the independent variables.

[Table: Descriptive Statistics. Dependent Variable: Post_GREq. Columns: Drug_Group, Tutor_Group, Mean, Std. Deviation, N. Rows: for each Drug_Group level (Control Group (no drug), Placebo Group, 100 mg/day Group, 200 mg/day Group, Total), one row per Tutor_Group level (Control Group (no tutoring), Group Tutoring, Individual Tutoring) plus a Total row.]

Tests of Between-Subjects Effects

This table is the ANOVA summary table. The highlighted sections are relevant for the F-Tests. In this case, only the main effect of Tutor_Group was significant.

[Table: Tests of Between-Subjects Effects. Dependent Variable: Post_GREq. Columns: Source, Type III Sum of Squares, df, Mean Square, F, Sig., Partial Eta Squared. Rows: Corrected Model, Intercept, Drug_Group, Tutor_Group, Drug_Group * Tutor_Group, Error, Total (Type III Sum of Squares = 8.538E7, df = 240), Corrected Total.]

Estimated Marginal Means

1. Grand Mean

[Table: Grand Mean. Dependent Variable: Post_GREq. Columns: Mean, Std. Error, 95% Confidence Interval (Lower Bound, Upper Bound).]

2. Drug_Group

[Table: Estimates. Dependent Variable: Post_GREq. Columns: Drug_Group, Mean, Std. Error, 95% Confidence Interval (Lower Bound, Upper Bound). Rows: Control Group (no drug), Placebo Group, 100 mg/day Group, 200 mg/day Group.]

3. Tutor_Group

[Table: Estimates. Dependent Variable: Post_GREq. Columns: Tutor_Group, Mean, Std. Error, 95% Confidence Interval (Lower Bound, Upper Bound). Rows: Control Group (no tutoring), Group Tutoring, Individual Tutoring.]

4. Drug_Group * Tutor_Group

[Table: Drug_Group * Tutor_Group. Dependent Variable: Post_GREq. Columns: Drug_Group, Tutor_Group, Mean, Std. Error, 95% Confidence Interval (Lower Bound, Upper Bound). Rows: one row per Tutor_Group level within each Drug_Group level (12 cells).]

Post Hoc Tests

Drug_Group

[Table: Multiple Comparisons (LSD). Dependent Variable: Post_GREq. Columns: (I) Drug_Group, (J) Drug_Group, Mean Difference (I-J), Std. Error, Sig., 95% Confidence Interval (Lower Bound, Upper Bound). Rows: every pairing of Control Group (no drug), Placebo Group, 100 mg/day Group, and 200 mg/day Group.]

Tutor_Group

[Table: Multiple Comparisons (LSD). Dependent Variable: Post_GREq. Columns: (I) Tutor_Group, (J) Tutor_Group, Mean Difference (I-J), Std. Error, Sig., 95% Confidence Interval (Lower Bound, Upper Bound). Rows: every pairing of Control Group (no tutoring), Group Tutoring, and Individual Tutoring; an asterisk (*) marks the difference between Control Group (no tutoring) and Individual Tutoring.]

The majority of the output above is not all that different from the output of the oneway ANOVA performed in Section 6.1, and needs no elaboration. As stated in the comment on the ANOVA summary table, only the main effect of Tutor_Group was significant (p = .016). Exploring this main effect in the Multiple Comparisons table, which contains the post hoc tests between the levels of Tutor_Group, the only statistically significant mean difference is between the Control Group (no tutoring) and the Individual Tutoring group (p = .004). The mean difference between the Group Tutoring group and the Individual Tutoring group was nearly significant (p = .091). I encourage the reader to explore the output more thoroughly.

6.3 Repeated Measures ANOVA (via GLM)

Sections 6.1 and 6.2 showed how to request ANOVAs when the levels of an independent variable differ between subjects. In this section, I briefly introduce how to request an ANOVA when the levels of an independent variable differ within subjects (repeated measures ANOVA). This section covers only how to request a oneway repeated measures ANOVA, because the data file includes only a single independent variable that can be considered to differ 'within subjects': the pretest versus posttest period. The PASW GLM procedure for within-subjects variables is referred to as the 'repeated measures GLM'.

Let's say that we want to compare the mean score on the Verbal Reasoning section of the GRE between the pretest and posttest periods (Pre_GREv vs. Post_GREv).
Thus, we have one independent variable (Pretest vs. Posttest) with two levels. The syntax below lists the minimum set of sub-commands needed to perform this oneway repeated measures ANOVA:

    1  GLM Pre_GREv Post_GREv
    2    /WSFACTOR=Pretest_Posttest 2 Difference
    3    /METHOD=SSTYPE(3)
    4    /CRITERIA=ALPHA(.05)
    5    /WSDESIGN=Pretest_Posttest.

On the GLM command line (Line 1), the levels of the within-subjects variable are listed (Pre_GREv and Post_GREv). If there were three or more levels of the independent variable, they would be listed here as well. The order in which the levels are entered is critically important for repeated measures factorial ANOVAs, but is less of a concern for oneway repeated measures ANOVAs.

On the /WSFACTOR sub-command line (Line 2), the independent variable is listed. This independent variable does not actually appear in the data set; rather, it is a name that you give to the independent variable. In this example, because we are comparing GRE Verbal Reasoning scores between the pretest and posttest periods, I have called the independent variable Pretest_Posttest (PASW does not allow spaces in the name). On this line, the 2 indicates how many levels the independent variable has. Finally, the 'Difference' request tells PASW how to compare the levels of that independent variable. This is akin to requesting a post hoc test. The 'Difference' request tells PASW to compare each level (except the first) with the mean of the preceding levels; with only two levels, this is simply Level 2 versus Level 1, much like a single Fisher's LSD comparison.

Lines 3 and 4 should be familiar from the between-subjects ANOVAs performed in Sections 6.1 and 6.2. The last line, /WSDESIGN, lists each factor that should be included in the analysis. In this case, because we have only one independent variable, it should be the only factor listed.

The syntax above will output the results of only the F-Test; it will not provide any descriptive information. The syntax below includes the /PRINT sub-command on Line 4, which asks PASW to provide the DESCRIPTIVE statistics as well as the eta-squared measure of effect size (ETASQ). Descriptive statistics can also be requested through the /EMMEANS sub-command:

    1  GLM Pre_GREv Post_GREv
    2    /WSFACTOR=Pretest_Posttest 2 Difference
    3    /METHOD=SSTYPE(3)
    4    /PRINT=DESCRIPTIVE ETASQ
    5    /CRITERIA=ALPHA(.05)
    6    /WSDESIGN=Pretest_Posttest.

If you run the syntax above, you get the following output:

    Within-Subjects Factors
    Measure: MEASURE_1
    Pretest_Posttest    Dependent Variable
    1                   Pre_GREv
    2                   Post_GREv

[Table: Descriptive Statistics. Columns: Mean, Std. Deviation, N. Rows: Pre_GREv, Post_GREv.]

Multivariate Tests

This table is not relevant for our purposes.

[Table: Multivariate Tests. Columns: Effect, Value, F, Hypothesis df, Error df, Sig., Partial Eta Squared. Rows for Pretest_Posttest: Pillai's Trace, Wilks' Lambda, Hotelling's Trace, Roy's Largest Root.]

Mauchly's Test of Sphericity

This table lists the outcome of a 'sphericity' test, which is similar to a test of homogeneity of variance. If sphericity is violated, it can be an issue.

[Table: Mauchly's Test of Sphericity. Measure: MEASURE_1. Columns: Within Subjects Effect, Mauchly's W, Approx. Chi-Square, df, Sig., Epsilon (Greenhouse-Geisser, Huynh-Feldt, Lower-bound). Row: Pretest_Posttest.]

Tests of Within-Subjects Effects

This is the ANOVA summary table.

[Table: Tests of Within-Subjects Effects. Measure: MEASURE_1. Columns: Source, Type III Sum of Squares, df, Mean Square, F, Sig., Partial Eta Squared. Rows: Pretest_Posttest and Error(Pretest_Posttest), each under Sphericity Assumed, Greenhouse-Geisser, Huynh-Feldt, and Lower-bound.]

Tests of Within-Subjects Contrasts

This table lists the results of the post hoc tests between levels of the independent variable.

[Table: Tests of Within-Subjects Contrasts. Measure: MEASURE_1. Columns: Source, Pretest_Posttest, Type III Sum of Squares, df, Mean Square, F, Sig. Rows: Pretest_Posttest (Level 2 vs. Level 1) and Error(Pretest_Posttest) (Level 2 vs. Level 1).]

Tests of Between-Subjects Effects

This table lists ANOVA results for any between-subjects factors, which we did not have in this analysis.

[Table: Tests of Between-Subjects Effects. Measure: MEASURE_1. Transformed Variable: Average. Columns: Source, Type III Sum of Squares, df, Mean Square, F, Sig., Partial Eta Squared. Rows: Intercept, Error.]

The Tests of Within-Subjects Effects table is the ANOVA summary table. I have highlighted the relevant portions of the table in yellow. The terms associated with the effect of the independent variable (between group variability) are in the rows headed by Pretest_Posttest. The terms associated with the error variance are in the rows headed by Error(Pretest_Posttest). The information in most of the columns should be self-explanatory. To determine whether the influence of the independent variable is statistically significant, look at the column labeled Sig. This is the p-value. If this value is less than your chosen alpha level (α = .05 or less), then the independent variable had a statistically significant influence on the dependent variable, which is the case here (p < .001).

Because the influence of the independent variable is statistically significant, you can conclude that the mean difference between Pre_GREv and Post_GREv is statistically significant. You can find the mean for each level of the independent variable in the Descriptive Statistics table. The Tests of Within-Subjects Contrasts table lists the post hoc test results for each comparison between levels of the independent variable. Because there are only two levels of the independent variable, there is only one possible comparison (Level 2 vs. Level 1).
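Section 6.3 mentions that descriptive information can also be requested through /EMMEANS. A minimal sketch of that route follows. It adds a single /EMMEANS line to the repeated measures syntax used in this section; the COMPARE ADJ(LSD) keywords, which ask for pairwise comparisons with no adjustment, are an addition for illustration rather than something shown elsewhere in this guide.

```spss
* Sketch: estimated marginal means for the within-subjects factor.
* COMPARE ADJ(LSD) is an illustrative addition, not part of the guide's examples.
GLM Pre_GREv Post_GREv
  /WSFACTOR=Pretest_Posttest 2 Difference
  /METHOD=SSTYPE(3)
  /EMMEANS=TABLES(Pretest_Posttest) COMPARE ADJ(LSD)
  /PRINT=DESCRIPTIVE ETASQ
  /CRITERIA=ALPHA(.05)
  /WSDESIGN=Pretest_Posttest.
```

The resulting Estimates table should give the mean, standard error, and 95% confidence interval for each level of Pretest_Posttest, in the same layout as the Estimated Marginal Means tables in Section 6.2.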

7. Chi Square

7.1 Cross-Tabulation Procedure (Factorial Chi-Square)

A factorial chi-square analysis (a chi-square analysis with two or more independent variables in the design) is requested by way of PASW's CROSSTABS (cross-tabulation) procedure. Cross-tabulation is the process of creating a contingency table from two or more independent variables. You can also have PASW create a contingency table for several independent variables without actually conducting the chi-square analysis.

From the data set, let's say that we want to know whether the n = 240 subjects in the study are equally distributed across the levels of the independent variables college class (Coll_Class) and college major (Coll_Maj). The syntax below presents the basic set of sub-commands needed to have PASW carry out a chi-square analysis through the cross-tabulation procedure:

    1  CROSSTABS
    2    /TABLES=Coll_Class BY Coll_Maj
    3    /FORMAT=AVALUE TABLES
    4    /STATISTICS=CHISQ PHI
    5    /CELLS=COUNT EXPECTED.

The /TABLES sub-command on Line 2 lists the independent variables being set up in the contingency table. One independent variable has to go before the BY and the other goes after the BY, but it is not terribly important which one goes where. The /FORMAT sub-command on Line 3 tells PASW in what format the output should be presented: in this case, in tabled form (TABLES), with the entries in the table appearing in ascending order (AVALUE). The /STATISTICS sub-command on Line 4 is where you request the chi-square analysis (CHISQ); if you do not include this sub-command, PASW will not perform the analysis. I have also included the PHI request, which has PASW calculate Cramer's V and the Phi coefficient as measures of effect size. The /CELLS sub-command on Line 5 tells PASW what information to include in each cell of the cross-tabulation table. In this case, PASW is being told to include the observed frequency (COUNT) and the expected frequency (EXPECTED).

If you run the syntax above, you get the following output:

    Case Processing Summary
                                                Cases
                             Valid             Missing           Total
                             N     Percent     N     Percent     N     Percent
    Coll_Class * Coll_Maj    240   100.0%      0     0.0%        240   100.0%

This table tells you the total number of subjects/cases included (240), and whether any appear to be missing.

Coll_Class * Coll_Maj Crosstabulation

This is the cross-tabulation table, with college majors listed in columns and college classes in rows. The values in each cell are the observed and expected frequencies.

[Table: Coll_Class * Coll_Maj Crosstabulation. Columns (Coll_Maj): Psychology, History, Biology, Communications, English, Mathematics, Total. Rows (Coll_Class): Freshmen, Sophomore, Junior, Senior, Total, each with a Count and an Expected Count.]

Chi-Square Tests

This table reports the results of the chi-square analysis. You use the terms in the Pearson Chi-Square row.

[Table: Chi-Square Tests. Columns: Value, df, Asymp. Sig. (2-sided). Rows: Pearson Chi-Square (16.617, df = 15, p = .342), Likelihood Ratio, Linear-by-Linear Association, N of Valid Cases (240). Footnote a: 0 cells (.0%) have expected count less than 5.]

Symmetric Measures

This table reports Cramer's V and the Phi coefficient measures of effect size.

[Table: Symmetric Measures. Columns: Value, Approx. Sig. Rows (Nominal by Nominal): Phi, Cramer's V; N of Valid Cases (240).]

The Chi-Square Tests table above includes the information relevant for determining whether there is a significant difference between the observed and expected frequencies. Use the information in the row labeled Pearson Chi-Square. The number in the Value column (16.617) is the chi-square statistic. The value under the df column (15) is the degrees of freedom in the cross-tabulation table. The value under the Asymp. Sig. (2-sided) column (.342) is the p-value used to determine significance. If this value is less than your chosen alpha level (α = .05 or less), then there is a significant difference between the observed and the expected frequencies. In this case, because .342 > .05, there is not a significant difference between the observed and the expected frequencies.
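Section 7.1 notes that PASW can build the contingency table without conducting the chi-square analysis. A minimal sketch of that, assuming the same two variables: leave out the /STATISTICS sub-command and request only the observed counts.

```spss
* Sketch: contingency table only, with no chi-square test requested.
CROSSTABS
  /TABLES=Coll_Class BY Coll_Maj
  /FORMAT=AVALUE TABLES
  /CELLS=COUNT.
```

Adding ROW or COLUMN to the /CELLS sub-command would also print row or column percentages in each cell.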

7.2 Oneway Chi-Square

If you have only one independent variable and want to know whether a set of observed frequencies across the levels of that variable differs from the frequencies that are expected, you do not use the CROSSTABS procedure from Section 7.1. The CROSSTABS procedure is used only when there are two or more independent variables. There is a separate chi-square procedure within PASW's non-parametric tests (NPAR TESTS) for dealing with one independent variable.

Say that we want to determine whether the observed frequencies across the four college classes differ from a set of frequencies expected by chance. The syntax below lists the sub-commands needed to run a oneway chi-square test to determine whether the observed frequency in each college class differs from the frequency expected for that college class:

    1  NPAR TESTS
    2    /CHISQUARE=Coll_Class
    3    /EXPECTED=EQUAL
    4    /MISSING ANALYSIS.

The /CHISQUARE sub-command on Line 2 tells PASW to perform a chi-square test across the levels of the independent variable listed after the equal sign (Coll_Class). The /EXPECTED sub-command on Line 3 tells PASW how to calculate the expected frequencies. In this case, the choice of EQUAL asks PASW to assume the expected frequency should be equal for each college class. Hence, with 240 students and four college classes, the expected frequency for each college class should be 240/4 = 60. Finally, the /MISSING sub-command on Line 4 tells PASW how to handle missing data, which is usually set to either ANALYSIS or LISTWISE.

When you run the syntax above, you get the following output:

[Table: Coll_Class. Columns: Observed N, Expected N, Residual. Rows: Freshmen, Sophomore, Junior, Senior, Total (240).]

This table lists each of the levels of the independent variable, the observed frequencies, the expected frequencies, and the difference between them.

[Table: Test Statistics for Coll_Class. Rows: Chi-square, df (3), Asymp. Sig. (.769). Footnote a: 0 cells (.0%) have expected frequencies less than 5.]

This table lists the outcome of the chi-square test between the expected and observed frequencies.

From the Test Statistics table you can determine whether the observed frequencies significantly differ from the expected frequencies by examining the p-value in the Asymp. Sig. row. If this value is less than your chosen alpha level (generally α = .05 or less), then there is a significant difference between the observed and the expected frequencies. In this case, because .769 > .05, there is not a significant difference between the observed and the expected frequencies.

7.3 Goodness of Fit Test

Requesting a goodness of fit test is virtually identical to requesting a chi-square analysis for one independent variable. Assume that in most research studies performed using college students, freshmen are most likely to participate, sophomores are second-most likely, juniors are third-most likely, and seniors are least likely. Thus, we may expect that 50% (.5) of the subjects in a study are freshmen, 25% (.25) are sophomores, 15% (.15) are juniors, and 10% (.1) are seniors. We want to run a goodness of fit test to determine whether the frequencies observed in each college class are consistent with these expected percentages.

The syntax below lists the sub-commands needed to run a goodness of fit test to determine whether the frequencies observed in each college class are congruent with the predicted percentages above:

    1  NPAR TESTS
    2    /CHISQUARE=Coll_Class
    3    /EXPECTED=.5 .25 .15 .1
    4    /MISSING ANALYSIS.

Notice that Lines 1, 2, and 4 are identical to the oneway chi-square conducted in Section 7.2; the only difference is the /EXPECTED sub-command on Line 3. The numbers after the equal sign are the expected proportions of freshmen (.5), sophomores (.25), juniors (.15), and seniors (.1) from above. For the goodness of fit test, you can use proportions, as done here, or expected frequencies, which would need to be determined.

Importantly, the order of the proportions must coincide with the dummy-codes assigned to the levels of the independent variable. That is, whichever level was dummy-coded as 1 would have its expected proportion presented first, whichever level was dummy-coded as 2 would have its expected proportion presented second, etc. In the data file, freshmen were coded 1, sophomores were coded 2, etc.

When you run the syntax above, you get the following output:

[Table: Coll_Class. Columns: Observed N, Expected N, Residual. Rows: Freshmen, Sophomore, Junior, Senior, Total (240).]

This table lists each of the levels of the independent variable, the observed frequencies, the expected frequencies, and the difference between them. The expected frequencies are obtained by multiplying the sample size (240) by the expected proportion of each level.
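The other option mentioned in this section, listing expected frequencies instead of proportions, can be sketched as follows. The frequencies are simply the sample size multiplied by each expected proportion (240 x .5 = 120, 240 x .25 = 60, 240 x .15 = 36, and 240 x .1 = 24). This is an illustrative variant of the syntax above, and it should request the same test as the proportions version.

```spss
* Sketch: goodness of fit test with expected frequencies (they sum to 240).
* Order follows the dummy-codes: freshmen, sophomores, juniors, seniors.
NPAR TESTS
  /CHISQUARE=Coll_Class
  /EXPECTED=120 60 36 24
  /MISSING ANALYSIS.
```

As with proportions, the order of the values must match the dummy-codes of the levels of Coll_Class.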

[Table: Test Statistics for Coll_Class. Rows: Chi-square, df (3), Asymp. Sig. (.000). Footnote a: 0 cells (.0%) have expected frequencies less than 5.]

This table lists the outcome of the chi-square test between the expected and observed frequencies.

From the Test Statistics table you can determine whether the observed frequencies significantly differ from the expected frequencies by examining the p-value in the Asymp. Sig. row. If this value is less than your chosen alpha level (generally α = .05 or less), then there is a significant difference between the observed and the expected frequencies. In this case, because p < .001, there is a significant difference between the observed and the expected frequencies.

7.4 Alternative Method for Goodness of Fit Test

There is an alternative procedure for requesting a goodness of fit test. Say that you know the numbers of freshmen (57), sophomores (66), juniors (63), and seniors (56) in the study and want to run a goodness of fit test on those numbers, but have not set up an entire data file with all 240 subject cases. The screenshot below shows a data file with the independent variable Coll_Class (1 = freshmen; 2 = sophomores; 3 = juniors; 4 = seniors) and the Observed frequencies in each class. We could also run a goodness of fit test (or a oneway chi-square) when data are set up in this manner.

Figure 6: Observed frequencies in each college class.
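One way this could be done is sketched below. It is an assumption about how to proceed, not syntax taken from this guide. The sketch enters the four class codes and their observed frequencies directly, assumes the frequency variable is named Observed (as in Figure 6), and uses WEIGHT BY so that each row counts as that many cases.

```spss
* Sketch: enter the summary data (class code, observed frequency).
DATA LIST FREE / Coll_Class Observed.
BEGIN DATA
1 57
2 66
3 63
4 56
END DATA.

* Treat each row as 'Observed' cases, then run the oneway chi-square.
WEIGHT BY Observed.
NPAR TESTS
  /CHISQUARE=Coll_Class
  /EXPECTED=EQUAL
  /MISSING ANALYSIS.

* Turn weighting off again so later analyses are unaffected.
WEIGHT OFF.
```

For the goodness of fit test from Section 7.3, EQUAL would be replaced with the expected proportions (.5 .25 .15 .1).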

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Simple Linear Regression, Scatterplots, and Bivariate Correlation 1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

More information

Psych. Research 1 Guide to SPSS 11.0

Psych. Research 1 Guide to SPSS 11.0 SPSS GUIDE 1 Psych. Research 1 Guide to SPSS 11.0 I. What is SPSS: SPSS (Statistical Package for the Social Sciences) is a data management and analysis program. It allows us to store and analyze very large

More information

Directions for using SPSS

Directions for using SPSS Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

More information

January 26, 2009 The Faculty Center for Teaching and Learning

January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

An introduction to using Microsoft Excel for quantitative data analysis

An introduction to using Microsoft Excel for quantitative data analysis Contents An introduction to using Microsoft Excel for quantitative data analysis 1 Introduction... 1 2 Why use Excel?... 2 3 Quantitative data analysis tools in Excel... 3 4 Entering your data... 6 5 Preparing

More information

When to use Excel. When NOT to use Excel 9/24/2014

When to use Excel. When NOT to use Excel 9/24/2014 Analyzing Quantitative Assessment Data with Excel October 2, 2014 Jeremy Penn, Ph.D. Director When to use Excel You want to quickly summarize or analyze your assessment data You want to create basic visual

More information

SPSS (Statistical Package for the Social Sciences)

SPSS (Statistical Package for the Social Sciences) SPSS (Statistical Package for the Social Sciences) What is SPSS? SPSS stands for Statistical Package for the Social Sciences The SPSS home-page is: www.spss.com 2 What can you do with SPSS? Run Frequencies

More information

SPSS Manual for Introductory Applied Statistics: A Variable Approach

SPSS Manual for Introductory Applied Statistics: A Variable Approach SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All

More information

Multiple Regression. Page 24

Multiple Regression. Page 24 Multiple Regression Multiple regression is an extension of simple (bi-variate) regression. The goal of multiple regression is to enable a researcher to assess the relationship between a dependent (predicted)

More information

Data Analysis in SPSS. February 21, 2004. If you wish to cite the contents of this document, the APA reference for them would be

Data Analysis in SPSS. February 21, 2004. If you wish to cite the contents of this document, the APA reference for them would be Data Analysis in SPSS Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Heather Claypool Department of Psychology Miami University

More information

IBM SPSS Statistics 20 Part 1: Descriptive Statistics

IBM SPSS Statistics 20 Part 1: Descriptive Statistics CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 1: Descriptive Statistics Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the

More information

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations.

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations. EXCEL Tutorial: How to use EXCEL for Graphs and Calculations. Excel is powerful tool and can make your life easier if you are proficient in using it. You will need to use Excel to complete most of your

More information

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the

More information

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.

More information

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices: Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:

More information

How to Use a Data Spreadsheet: Excel

How to Use a Data Spreadsheet: Excel How to Use a Data Spreadsheet: Excel One does not necessarily have special statistical software to perform statistical analyses. Microsoft Office Excel can be used to run statistical procedures. Although

More information

Introduction to Using SPSS Command Files

Introduction to Using SPSS Command Files Introduction to Using SPSS Command Files Joel P. Wiesen, Ph.D. jwiesen@appliedpersonnelresearch.com 31th Annual IPMAAC Conference St. Louis, MO June 13, 2007 Wiesen (2007), IPMAAC Conference 1 Outline

More information

SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list

More information
