Using Excel for Statistical Analysis



Similar documents
Bill Burton Albert Einstein College of Medicine April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

A Quick Guide to Using Excel 2007 s Regression Analysis Tool

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations.

Chapter 7: Simple linear regression Learning Objectives

Creating a Gradebook in Excel

Data exploration with Microsoft Excel: analysing more than one variable

Data Analysis Tools. Tools for Summarizing Data

Directions for using SPSS

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Scatter Plots with Error Bars

Univariate Regression

Simple Predictive Analytics Curtis Seare

Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information.

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers

Card sort analysis spreadsheet

Basic Statistical Analysis in Excel NICAR 15 / Norm Lewis, University of Florida

An introduction to using Microsoft Excel for quantitative data analysis

Multiple Regression. Page 24

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

How To Run Statistical Tests in Excel

When to use Excel. When NOT to use Excel 9/24/2014

Getting started in Excel

Linear Models in STATA and ANOVA

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Module 3: Correlation and Covariance

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

Chapter 13 Introduction to Linear Regression and Correlation Analysis

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

To do a factor analysis, we need to select an extraction method and a rotation method. Hit the Extraction button to specify your extraction method.

Dealing with Data in Excel 2010

Our goal, as journalists, is to look for some patterns and trends in this information.

Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data

Spreadsheet software for linear regression analysis

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

How to Get More Value from Your Survey Data

This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

Getting started manual

STC: Descriptive Statistics in Excel Running Descriptive and Correlational Analysis in Excel 2013

Instructions for Using Excel as a Grade Book

Business Valuation Review

2013 MBA Jump Start Program. Statistics Module Part 3

TIPS FOR DOING STATISTICS IN EXCEL

Microsoft Excel. Qi Wei

Excel: Analyze PowerSchool Data

Data exploration with Microsoft Excel: univariate analysis

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Using MS Excel to Analyze Data: A Tutorial

Predicting Defaults of Loans using Lending Club s Loan Data

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Figure 1. An embedded chart on a worksheet.

Moderation. Moderation

Scatter Plot, Correlation, and Regression on the TI-83/84

Instructions for applying data validation(s) to data fields in Microsoft Excel

Regression step-by-step using Microsoft Excel

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation

SPSS Explore procedure

Copyright 2013 by Laura Schultz. All rights reserved. Page 1 of 7

SPSS Tests for Versions 9 to 13

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

SPSS Tutorial, Feb. 7, 2003 Prof. Scott Allard

Simple Linear Regression

Homework 8 Solutions

0 Introduction to Data Analysis Using an Excel Spreadsheet

SECTION 2-1: OVERVIEW SECTION 2-2: FREQUENCY DISTRIBUTIONS

"Excel with Excel 2013: Pivoting with Pivot Tables" by Venu Gopalakrishna Remani. October 28, 2014

Using Excel for Statistical Analysis

Tutorial on Using Excel Solver to Analyze Spin-Lattice Relaxation Time Data


A Picture Really Is Worth a Thousand Words

Statistical Data analysis With Excel For HSMG.632 students

Data Analysis. Using Excel. Jeffrey L. Rummel. BBA Seminar. Data in Excel. Excel Calculations of Descriptive Statistics. Single Variable Graphs

Assignment objectives:

hp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines

MTH 140 Statistics Videos

STATISTICA Formula Guide: Logistic Regression. Table of Contents

Using Excel for Data Manipulation and Statistical Analysis: How-to s and Cautions

Step 3: Go to Column C. Use the function AVERAGE to calculate the mean values of n = 5. Column C is the column of the means.

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Chapter 23. Inferences for Regression

Analyzing calorimetry data using pivot tables in Excel

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

To create a histogram, you must organize the data in two columns on the worksheet. These columns must contain the following data:

Working with Spreadsheets

Lin s Concordance Correlation Coefficient

IBM SPSS Statistics 20 Part 1: Descriptive Statistics

The Dummy s Guide to Data Analysis Using SPSS

EXCEL Analysis TookPak [Statistical Analysis] 1. First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it:

Psychology 205: Research Methods in Psychology

Course Advanced Life Science 7th Grade Curriculum Extension

Chapter 5 Analysis of variance SPSS Analysis of variance

MICROSOFT EXCEL FORECASTING AND DATA ANALYSIS

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

How to Make the Most of Excel Spreadsheets

An analysis of the 2003 HEFCE national student survey pilot data.

430 Statistics and Financial Mathematics for Business

5 Correlation and Data Exploration

Session 10. Laboratory Works

SPSS Guide: Regression Analysis

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Transcription:

Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure you have your Data Analysis Tool pack installed. You should see Data Analysis on the far right of your tool bar. If you don t see it go to FILE OPTION ADD INS and add the Analysis Tool. Learn about your data One nice thing about the Data Analysis tool is that it can do several things at once. If you want a quick overview of your data, it will give you a list of descriptives that explain your data. That information can be helpful for other types of analyses. Let s use the COMM07.xls file (test scores for communication in 7 th grade in Missouri Schools) If we wanted to get a quick overview of the SCORE variable, we can use the DESCRIPTIVE STATISTICS tool. Go to the DATA table and click on the DATA ANALYSIS tool. From the list of tools, choose DESCRIPTIVE STATISTICS: Highlight the column containing the SCORE data (but not the header). Check SUMMARY STATISTICS:

This looks like a lot. But some of these variables can be helpful. For example, when you do a regression, you want the Mean (average) and Median (middle value) to be fairly close together. You want your Standard Deviation to be less than the Mean. So in the above table, our Mean and Median are close together. The standard deviation is about 29 which means that about 70 percent of the schools scored between 155 and 213. Correlation Another good overview of your data is a correlation Matrix, which gives you an overview of what variables tend to go up and down together and in what direction. It s a good first past at relationships in your data before delving into regression. The correlation is measured by a variable called Pearson s R, which ranges between - 1 (indirect relationship) and 1 (perfect relationship). Go to the DATA table and the DATA ANALYSIS tool and choose CORRELATIONS. Choose the range of all the columns (less headers) that you want to compare: You get a table that matches each variable to all other variables. Because the relationship between two columns is the same, no matter which direction you compare, Excel gives you the value just once. Below you see that the correlation between Column 3 and Column 1 is -.43348. It would be the same between Column 1 and Column 3. This shows me that the variables with the strongest relationship are Column 3 (PCT_POOR) and Column 4 (SCORE). Correlation provides a general indicator of the linear relationship between two variables, but it doesn t allow you to let you predict one variable based on another. To do that, you need linear regression, also known as ordinary least squares regression.

Some characteristics help predict others. For example, people growing up in a lower- income family are more likely to score lower on standardized tests than those from higher- income families. Regression helps us see that connection and even say about how much characteristics affect another. A note about Excel: It is picky 1. Your data should be numeric. 2. You should not have any empty cells. To start, you need to determine which characteristics are what we call independent variables. These are the predictors. Next, the characteristics they help predict are the dependent variables. When you run a regression, you get a result called an R- square. That will help you see how much the independent variable predicts the dependent variable. Let s Regress Go to the DATA tab and click on the DATA ANALYSIS tool: Choose Regression from the list of statistics. Next, Excel will ask you to highlight the range of cells for the X and Y ranges. The Y is your DEPENDENT variable and X is your INDEPENDENT variable. Check Confidence Level and leave at 95 percent the common level used for social science research. Check New Worksheet Ply so the output goes into a new worksheet. Leave everything else blank for now.

There s a lot in the output. But for now, let s pay attention to a couple of variables: Adjusted R Square and Significance: R Square tells you how much of the change in your DEPENDENT variable can be explained by your INDEPENDENT variable. In this case, only about 16 percent of the change in SCORE can be explained by PCT TESTED. An R Square is between 0 and 1 the closer to 1, the stronger the relationship. The SIGNIFICANCE variable is 0.000 - - - if this is less than.05 it means your results are significant or that they didn t just occur by chance. Let s check another variable. Run a regression with PCTPOOR as you INDEPENDENT variable and SCORE as your DEPENDENT variable.

So there is a much stronger relationship with PCTPOOR. It explains 80 percent of the variation. The R Square doesn t tell you everything. Another important variable is what is labeled above as X VARIABLE 1, which is - 0.84291 This is the SLOPE of the line. What this tells you is that for every 10 point increase in PCTPOOR, SCORE goes down about 8 points. Using this information, Excel can tell us whether a school is doing better or worse than it should given the percent of poor students. Under RESIDUALS, check RESIDUALS, STANDARDIZED RESIDUALS AND LINE FIT PLOTS The residuals tell you how well a school did: The first school had a score of 210.5, which is 6.77 points better than it should given it s poverty level. The STANDARDIZED RESIDUAL tells you the same information in terms of standard deviation. The plot gives you a graphic image of your model: Practice Use the file strong_mayor.xlsx to see if there is any relationship between population and election results

A note of caution: This class is a quick overview of some statistics that can be tricky sometimes. Be sure to invoke expert opinions before running with the results of a regression.