Introduction Course in SPSS - Evening 1




Seminar für Statistik, ETH Zürich

All data used during the course can be downloaded from the following FTP server: ftp://stat.ethz.ch/u/sfs/spsskurs/

1 Statistical Data Analysis

Statistical data analysis usually consists of three steps:

1. Data Preparation: After reading in the raw data, we need to clean and prepare it for analysis, i.e. check values, select the variables of interest and (possibly) construct new variables.
2. Explorative Data Analysis: Visualizing the data (scatterplots, boxplots, barplots etc.) and computing characteristic values (mean, min, max or standard deviation). This step is very important for getting an overview of the data and for recognizing irregularities.
3. Inferential Statistics: Testing hypotheses. Examples: t-test, regression, analysis of variance, survival analysis.

Be careful: All statistical methods rest on assumptions, which should be satisfied before the test result can be trusted.

2 Getting Started with SPSS

2.1 Menus

First, a short overview of the menus:

File: Data files are read in (Open, Open Database, Read Text Data); later you can Save or Print the data or the output
Edit: Inserting new variables or cases, or searching for them (Insert Variable, Insert Cases, Find, Go to Case, Go to Variable), and all kinds of other SPSS Options (font, font size, pivot tables, language etc.)
View: Display of the data in the Data View: Grid Lines, Value Labels
Data: Data preparation (Sort Cases, Restructure, Select Cases)

Transform: Construct new variables (Compute Variable) or modify existing ones (Recode into Same Variables)
Analyze: Statistical methods
Graphs: Graphics: interactive (Chart Builder) and static (Legacy Dialogs)
Window: Management of SPSS windows
Help: Topics (search), Tutorials (manual), Case Studies (explanation of statistical menus and interpretation of outputs), Command Syntax Reference (all syntax commands)

Manuals can be found on the SPSS homepage or via Google.

2.2 Data View, Variable View

When SPSS is started, it first shows a data table: the so-called Data Editor. The Data Editor consists of two sheets: Data View and Variable View.

For demonstration, first download the file demo.sav from the course homepage. We can open this SPSS data file (file extension: .sav) via File / Open / Data. The file contains an artificial data set from a company that sends monthly advertisements to potential customers. The variable response denotes whether a customer reacts to the promotion. In addition, there is a lot of personal and demographic information about the customers.

The Data View shows the data. It is the default view. SPSS is row oriented like Excel, i.e. there is one observation per row and every column represents a variable.

The Variable View shows information about the variables (variable type, length etc.). Every row lists one variable and its properties:

Name: Special characters aren't allowed in names (e.g. exclamation mark, question mark or space)
Type: Type of variable: numeric (numbers), date, dollar and string (words) are the most important types
Width: Maximum number of characters (only useful for string variables)
Decimals: Maximum number of decimals (only useful for numeric variables)
Label: Full name of the variable (name used in graphics)
Values: Here you can save the coding of factors (e.g. gender: 0 = male, 1 = female)
Missing: How did you code missing values? By default, SPSS treats only empty cells as missing. If you coded missing values with, e.g., 99, you have to specify this here
Columns: Width of the column
Align: Alignment of content in cells (right, left, centered)

Measure: There are three measurement levels for variables: scale (continuous variables like age or body weight), nominal (categorical variables, e.g. treatment groups) and ordinal (categorical variables with a natural ordering like income class or age groups)

In both sheets, information about your data can be added, modified and deleted. After reading in data, you should always check that all information in the Variable View is correctly specified. This is very important for the statistical analysis.

2.3 Output Window

The output window functions as a logbook. Every action and analysis result is printed in the form of a table, plot or syntax. The output can be edited (menu Insert): you can add titles, change fonts etc. At the end of an analysis you can save the output (File / Save). Tables can be exported to Excel or as .pdf; similarly, graphics can be exported as .jpg or .pdf (right click and Export).

3 Reading Data

Very often data is not stored directly as a .sav file (the data format of SPSS). You may have saved your data in Excel, as a text file or in a database format. Therefore, we now discuss how to read in data stored in a non-SPSS format.

3.1 .dat or .txt Data

File / Read Text Data...
Example file: sprintbiometr.dat

First select the directory where the data is saved. Be aware that SPSS shows only .sav files by default, so you have to change the file type to All Files. A dialog window opens. Step by step, SPSS asks about the structure of the data in the file, e.g. "Are the variable names stored in the first row of the file?" and "Which delimiter is used between columns?". A preview window helps you decide whether you entered the properties of the file correctly. It is important that every column represents one variable and each row one observation.

3.2 .xls Data

File / Open...
Example file: ozon.xls

In the second dialog, enter the name of the Excel worksheet and the range of columns and rows which you want to analyze in SPSS. There are some rules for .xls data if you do not want to encounter any problems:

Variable names should only be placed in the first line of the Excel worksheet.
The first line of data should start directly after the line with the variable names.
The use of formulae in the worksheet is discouraged.
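The two questions the text import dialog asks in Section 3.1 (are the variable names in the first row, and which delimiter separates the columns) can be made concrete with a small sketch outside SPSS. The function name `read_text_data`, the fallback names V1, V2, ... and the sample text are assumptions made for this illustration; the sample is not the content of sprintbiometr.dat.

```python
import csv
import io

# Made-up stand-in for a small text data file; NOT the course file.
# The first row holds the variable names, columns are separated by tabs.
raw_text = "name\tsex\tzeit1\nA\tm\t12.9\nB\tf\t13.4\n"


def read_text_data(handle, delimiter="\t", names_in_first_row=True):
    """Return (variable_names, rows); each row is one observation."""
    reader = csv.reader(handle, delimiter=delimiter)
    rows = [row for row in reader if row]  # skip empty lines
    if names_in_first_row:
        names, rows = rows[0], rows[1:]
    else:
        # Hypothetical fallback names chosen for this sketch.
        names = [f"V{i + 1}" for i in range(len(rows[0]))]
    # Every row must supply exactly one value per variable.
    for row in rows:
        if len(row) != len(names):
            raise ValueError(f"expected {len(names)} columns, got {len(row)}")
    return names, rows


names, rows = read_text_data(io.StringIO(raw_text))
print(names)
print(rows)
```

The column-count check plays the role of the preview window: a wrong delimiter usually shows up as rows with the wrong number of columns.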

3.3 Enter Data in SPSS

It is also possible to enter your data directly in SPSS:

File / New / Data

After entering the data, you can save it as a .sav file via File / Save... If you enter data directly in SPSS, you have to specify all the variable properties manually. There is a dialog window which simplifies this task:

Data / Define Variable Properties

4 Data Preparation

4.1 Menu Data

Once the raw data is in SPSS, it should be prepared for analysis. SPSS provides several tools in the menu Data.

Example: sprintzeit.sav

Sort Cases
Sorts the observations, here by name and run:

Data / Sort Cases
   name, lauf

Restructure
Restructure is a very powerful tool which can create new columns out of rows and rows out of columns. It is particularly useful for repeated measures data. Before restructuring, the data should be saved! Example of restructuring: the variable time and the index variable run should be restructured into two variables which show the time of run 1 and the time of run 2.

Data / Restructure
   Restructure selected cases into variables
   identifier: name
   index: lauf
   no further options

Merge Files
With this tool, data from several files can be combined. For example, there could be one file with pollution measurements and one file with meteorological measurements. In order to combine the files correctly, you need a variable which appears in both files, e.g. ID or date. Example in Exercise.

Aggregate
The goal of aggregating data is to summarize information at subgroup level. To this end, characteristic values are calculated from the observations within each subgroup. The new data file will only contain information at subgroup level. For the aggregation, SPSS provides several functions like mean, standard deviation, max, min etc. Example: Aggregate the sprint data for boys and girls by calculating the average sprint time (sprint.sav).
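To make the idea of aggregation concrete, here is a minimal sketch outside SPSS of what "mean of zeit1 and zeit2 per level of sex" computes. The helper name `aggregate_mean`, the suffix `_mean` and the three rows are assumptions made for this illustration; the rows are made up and are not the values in sprint.sav.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration; NOT the values in sprint.sav.
cases = [
    {"sex": "f", "zeit1": 13.0, "zeit2": 14.0},
    {"sex": "f", "zeit1": 15.0, "zeit2": 14.0},
    {"sex": "m", "zeit1": 12.0, "zeit2": 13.0},
]


def aggregate_mean(cases, break_var, summary_vars):
    """Return one row per level of break_var with the mean of each summary variable."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[break_var]].append(case)
    result = []
    for level in sorted(groups):
        row = {break_var: level}
        for var in summary_vars:
            # Hypothetical naming scheme for the aggregated columns.
            row[var + "_mean"] = mean(c[var] for c in groups[level])
        result.append(row)
    return result


aggregated = aggregate_mean(cases, "sex", ["zeit1", "zeit2"])
for row in aggregated:
    print(row)
```

The result has one row per subgroup instead of one row per observation, which is exactly why the aggregated data set is smaller than the original one.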

Data / Aggregate
   break = sex
   summaries: zeit1, zeit2
   function = mean
   create a new data set: aggr.sav

Split File
Split File allows you to analyse the data separately for groups. The output is sorted by groups. Example: Calculation of the mean separately for girls and boys.

Data / Split File...
   Organize Output by group
Analyze / Descriptive Statistics / Frequencies...
   Statistics: mean

If the split option is in use, you can see "Split" in the bottom right corner of SPSS's Data Editor. All analyses will be split by gender as long as you do not remove this split:

Data / Split File...
   Analyze all cases

Select Cases
Selects data according to some conditions. Example: We want to compute the mean only for persons older than 16 (the variable alter holds the age).

Data / Select Cases...
   If condition is satisfied: alter > 16

To select the matching cases, SPSS computes a filter variable (0 = not selected, 1 = selected), which is added as a new column to your data. Furthermore, all non-selected observations are crossed out in the Data View. If you want to exclude single observations, you can use $Casenum (the case number) in the condition.

Assign Weights to Cases
Weighting of observations. Example in Exercise: Chi-Squared Test.

4.2 Menu Transform

Compute
New variables can be constructed. Example: Average time or best running time (sprint.sav):

Transform / Compute Variable...
   meanTime = (zeit1+zeit2)/2
Transform / Compute Variable...
   minTime = min(zeit1,zeit2)

Recode
With Recode you can rename the levels of existing categorical variables or transform a scale variable into a categorical one. Example: Construct a new variable Speed Cat with three levels (<13, 13-14, >14).
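The Compute formulas and the three Speed Cat levels can be mirrored in a few lines outside SPSS. This is only a sketch under stated assumptions: the rows are made up (not the values in sprint.sav), the key `SpeedCat` and the function `speed_category` are hypothetical names, and "13-14" is read as including both boundary values.

```python
# Made-up sprint times for illustration; NOT the values in sprint.sav.
cases = [
    {"name": "A", "zeit1": 12.0, "zeit2": 13.0},
    {"name": "B", "zeit1": 13.5, "zeit2": 14.5},
    {"name": "C", "zeit1": 15.0, "zeit2": 14.5},
]


def speed_category(mean_time):
    """Map a mean sprint time to one of the three levels <13, 13-14, >14."""
    if mean_time < 13:
        return "Fast"
    if mean_time <= 14:  # assumption: 13 and 14 both belong to the middle level
        return "Medium"
    return "Slow"


for case in cases:
    # Same formulas as in the Compute example.
    case["meanTime"] = (case["zeit1"] + case["zeit2"]) / 2
    case["minTime"] = min(case["zeit1"], case["zeit2"])
    # Recode into a different variable: the scale variable stays untouched.
    case["SpeedCat"] = speed_category(case["meanTime"])

for case in cases:
    print(case["name"], case["meanTime"], case["minTime"], case["SpeedCat"])
```

Writing the categories into a new key rather than overwriting meanTime corresponds to recoding into a different variable, so the original values remain available for later checks.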

Transform / Recode into Different Variables
   Input: meanTime
   Output: Speed Cat
   Old Value: <13     New Value: Fast
   Old Value: 13-14   New Value: Medium
   Old Value: >14     New Value: Slow

Visual Binning
Transformation of a scale variable into a categorical variable. The ranges for the new categories can be determined by hand (histogram) or with predefined functions like quantiles and fixed interval length. Example: see Exercise.

5 Descriptive Analysis and Graphics

Variables measured on a nominal or ordinal scale can be nicely summarized in frequency tables. Example: Frequency table of sex (sprint.sav)

Analyze / Descriptive Statistics / Frequencies...
   Variables: Sex
   Charts: Simple Bar

Typical characteristic values for scale variables are the mean, the standard deviation and quantiles. You can also find them in the menu Analyze / Descriptive Statistics.

Analyze / Descriptive Statistics / Frequencies...
   Variables: Alter, zeit1, zeit2
   Statistics: mean, variance, quartiles
   Uncheck: Display frequency tables

Another typical question of descriptive analysis concerns correlation. For example, we want to analyze whether there is any correlation between time1 and time2 (variables zeit1 and zeit2). Be careful when interpreting a correlation without a graphical illustration. On the course slides there are examples of point clouds with very different shapes which nevertheless all have a Pearson correlation of 0.7. Therefore, we also draw a scatterplot of time1 vs. time2.

In SPSS there are two ways to produce graphics: you can use either the new interactive graphics menu or the old legacy dialogs.

Interactive Menu: Graphs / Chart Builder
The Gallery shows a preview of the various graphics which can be generated in that menu. The preview does not show the real data. You only see the actual plot in the SPSS output window after finishing the layout.

Drag a graphic from the gallery to the chart preview on top (here: Scatter/Dot)
Drag the variables from the left side into the chart preview and place them on the x or y axis (here: x = zeit1 and y = zeit2)

In addition to the main window, another window appears, which is called Element Properties. Here you can change the bar style, the limits of the axes etc.

Legacy plots:

Graphs / Legacy Dialogs / Scatter/Dot / Simple Scatter
   X-Axis: zeit1
   Y-Axis: zeit2

Edit graphics: The graphic can be modified at any time with a double click. A chart editor with a lot of options opens. In addition, many more options become available after double clicking an element (element = x-axis, points, lines, titles etc.). For every element a separate property window opens. The options differ from element to element. The chosen element is highlighted in yellow in the chart editor. After closing the chart editor, all your modifications are used to update the original graphic in the output window. Example: after a double click on the x-axis, we can change the thickness, style and color of the line.

The graphic can be exported as .jpg or .pdf by right click and Export.

After examining our two variables graphically, we now calculate the correlation:

Analyze / Correlate / Bivariate...
   Variables: zeit1, zeit2
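For readers who want to see what the Pearson correlation measures, the following sketch computes it by hand outside SPSS. The function name `pearson_r` and the four pairs of times are assumptions made for this illustration; the values are made up and are not taken from sprint.sav.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant sample gives sxx or syy equal to 0 and the
    # correlation is undefined (division by zero).
    return sxy / sqrt(sxx * syy)


# Made-up times for illustration; NOT the values in sprint.sav.
zeit1 = [12.0, 13.0, 14.0, 15.0]
zeit2 = [12.0, 14.0, 13.0, 15.0]

r = pearson_r(zeit1, zeit2)
print(round(r, 3))
```

The coefficient condenses the whole point cloud into a single number, which is why it should always be read together with the scatterplot.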