# Introduction to Regression and Data Analysis

Save this PDF as:

Size: px
Start display at page:

Download "Introduction to Regression and Data Analysis"

## Transcription

1 Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008

3 If you are interested in whether one variable differs among possible groups, for instance, then regression isn t necessarily the best way to answer that question. Often you can find your answer by doing a t-test or an ANOVA. The flow chart shows you the types of questions you should ask yourselves to determine what type of analysis you should perform. Regression will be the focus of this workshop, because it is very commonly used and is quite versatile, but if you need information or assistance with any other type of analysis, the consultants at the Statlab are here to help. II. Regression: An Introduction: A. What is regression? Regression is a statistical technique to determine the linear relationship between two or more variables. Regression is primarily used for prediction and causal inference. In its simplest (bivariate) form, regression shows the relationship between one independent variable (X) and a dependent variable (Y), as in the formula below: The magnitude and direction of that relation are given by the slope parameter ( 1 ), and the status of the dependent variable when the independent variable is absent is given by the intercept parameter ( 0 ). An error term (u) captures the amount of variation not predicted by the slope and intercept terms. The regression coefficient (R 2 ) shows how well the values fit the data. Regression thus shows us how variation in one variable co-occurs with variation in another. What regression cannot show is causation; causation is only demonstrated analytically, through substantive theory. For example, a regression with shoe size as an independent variable and foot size as a dependent variable would show a very high regression coefficient and highly significant parameter estimates, but we should not conclude that higher shoe size causes higher foot size. All that the mathematics can tell us is whether or not they are correlated, and if so, by how much. It is important to recognize that regression analysis is fundamentally different from ascertaining the correlations among different variables. Correlation determines the strength of the relationship between variables, while regression attempts to describe that relationship between these variables in more detail. B. The linear regression model (LRM) The simple (or bivariate) LRM model is designed to study the relationship between a pair of variables that appear in a data set. The multiple LRM is designed to study the relationship between one variable and several of other variables. In both cases, the sample is considered a random sample from some population. The two variables, X and Y, are two measured outcomes for each observation in the data set. For 3

4 example, let s say that we had data on the prices of homes on sale and the actual number of sales of homes: Price(thousands of \$) Sales of new homes x y This data is found in the file house sales.jmp. We want to know the relationship between X and Y. Well, what does our data look like? We will use the program JMP (pronounced jump ) for our analyses today. Start JMP, look in the JMP Starter window and click on the Open Data Table button. Navigate to the file and open it. In the JMP Starter, click on Basic in the category list on the left. Now click on Bivariate in the lower section of the window. Click on y in the left window and then click the Y, Response button. Put x in the X, Regressor box, as illustrated above. Now click OK to display the following scatterplot. 4

5 We need to specify the population regression function, the model we specify to study the relationship between X and Y. This is written in any number of ways, but we will specify it as: where Y is an observed random variable (also called the response variable or the lefthand side variable). X is an observed non-random or conditioning variable (also called the predictor or right-hand side variable). 1 is an unknown population parameter, known as the constant or intercept term. 2 is an unknown population parameter, known as the coefficient or slope parameter. u is an unobserved random variable, known as the error or disturbance term. Once we have specified our model, we can accomplish 2 things: Estimation: How do we get good estimates of 1 and 2? What assumptions about the LRM make a given estimator a good one? Inference: What can we infer about 1 and 2 from sample information? That is, how do we form confidence intervals for 1 and 2 and/or test hypotheses about them? The answer to these questions depends upon the assumptions that the linear regression model makes about the variables. The Ordinary Least Squres (OLS) regression procedure will compute the values of the parameters 1 and 2 (the intercept and slope) that best fit the observations. 5

6 Obviously, no straight line can exactly run through all of the points. The vertical distance between each observation and the line that fits best the regression line is called the residual or error. The OLS procedure calculates our parameter values by minimizing the sum of the squared errors for all observations. Why OLS? It has some very nice mathematical properties, and it is compatible with Normally distributed errors, a very common situation in practice. However, it requires certain assumptions to be valid. C. Assumptions of the linear regression model 1. The proposed linear model is the correct model. Violations: Omitted variables, nonlinear effects of X on Y (e.g., area of circle = *radius 2 ) 2. The mean of the error term (i.e. the unobservable variable) does not depend on the observed X variables. 3. The error terms are uncorrelated with each other and exhibit constant variance that does not depend on the observed X variables. Violations: Variance increases as X or Y increases. Errors are positive or negative in bunches called heteroskedasticity. 4. No independent variable exactly predicts another. Violations: Including monthly precipitation for 12 months, and annual precipitation in the same model. 5. Independent variables are either random or fixed in repeated sampling If the five assumptions listed above are met, then the Gauss-Markov Theorem states that the Ordinary Least Squares regression estimator of the coefficients of the model is the Best Linear Unbiased Estimator of the effect of X on Y. Essentially this means that it is the most accurate estimate of the effect of X on Y. III. Deriving OLS estimators The point of the regression equation is to find the best fitting line relating the variables to one another. In this enterprise, we wish to minimize the sum of the squared deviations (residuals) from this line. OLS will do this better than any other process as long as these conditions are met. 6

7 The best fit line associated with the n points (x 1, y 1 ), (x 2, y 2 ),..., (x n, y n ) has the form where y = mx + b slope = m = intercept = b = So we can take our data from above and substitute in to find our parameters: Price(thousands of \$) Sales of new homes X y xy x ,160 25, ,540 32, ,400 40, ,500 48, ,680 57, ,400 67, ,600 78,400 Sum Slope = m = Intercept = b = 7

8 To have JMP calculate this for you, click on the small red triangle just to the left of Bivariate Fit of y By x. From the drop down menu, choose Fit Line to produce the following output. Thus our least squares line is y = x 8

9 IV. Interpreting data Coefficient: (Parameter Estimate) for every one unit increase in the price of a house, fewer houses are sold. Std Error: if the regression were performed repeatedly on different datasets (that contained the same variables), this would represent the standard deviation of the estimated coefficients t-ratio: the coefficient divided by the standard error, which tells us how large the coefficient is relative to how much it varies in repeated sampling. If the coefficient varies a lot in repeated sampling, then its t-statistic will be smaller, and if it varies little in repeated sampling, then its t-statistic will be larger. Prob> t : the p-value is the result of the test of the following null hypothesis: in repeated sampling, the mean of the estimated coefficient is zero. E.g., if p = 0.001, the probability of observing an estimate of that is at least as extreme as the observed estimate is 0.001, if the true value of is zero. In general, a p-value less than some threshold, like 0.05 or 0.01, will mean that the coefficient is statistically significant. Confidence interval: the 95% confidence interval is the set of values that lie within 1.96 standard deviations of the estimated Rsquare: this statistic represents the proportion of variation in the dependent variable that is explained by the model (the remainder represents the proportion that is present in the error). It is also the square of the correlation coefficient (hence the name). V. Multivariate regression models. Now let s do it again for a multivariate regression, where we add in number of red cars in neighborhood as an additional predictor of housing sales: Price(thousands of \$) Sales of new homes Number of red cars x y cars

10 Our general regression equation this time is this: To calculate a multiple regression, go the JMP Starter window and select the Model category. Click on the top button, Fit Model for the following dialog: Click on y in the left window, then on the Y button. Now click on x and then Add in the Construct Model Effects section. Repeat with cars then click on Run Model to produce the following window. (I ve hidden a few results to save space. Click on the blue triangles to toggle between displayed and hidden.) 10

12 Other violations of the regression assumptions are addressed below. In all of these cases, be aware that Ordinary Least Squares regression, as we have discussed it today, gives biased estimates of parameters and/or standard errors. Non-linear relationship between X and Y: use Non-linear Least Squares or Maximum Likelihood Estimation Endogeneity: use Two-Stage Least Squares or a lagged dependent variable Autoregression: use a Time-Series Estimator (ARIMA) Serial Correlation: use Generalized Least Squares Heteroskedasticity: use Generalized Least Squares How will you know if the assumptions are violated? You usually will NOT find the answer from looking at your regression results alone. Consider the four scatter plots below: 12

13 Clearly they show widely different relationships between x and y. However, in each of these four cases, the regression results are exactly the same: the same best fit line, the same t-statistics, the same R

14 In the first case, the assumptions are satisfied, and linear regression does what we would expect it to. In the second case, we clearly have a non-linear (in fact, a quadratic) relationship. In the third and fourth cases, we have heteroskedastic errors: the errors are much greater for some values of x than for others. The point is, the assumptions are crucial, because they determine how well the regression model fits the data. If you don t look closely at your data and make sure a linear regression is appropriate for it, then the regression results will be meaningless. VI. Additional resources If you believe that the nature of your data will force you to use a more sophisticated estimation technique than Ordinary Least Squares, you should first consult the resources listed on the Statlab Web Page at You may also find that Google searches of these regression techniques often find simple tutorials for the methods that you must use, complete with help on estimation, programming, and interpretation. Finally, you should also be aware that the Statlab consultants are available throughout regular Statlab hours to answer your questions (see the areas of expertise for Statlab consultants at 14

15 Purpose of statistical analysis Summarizing univariate data Exploring relationships between variables Testing significance of differences Descriptive statistics (mean, standard deviation, variance, etc) Form of data Number of groups Frequencies Measurements One: mean compared to a specified value Two Multiple Number of variables Number of variables One-sample t-test Independent samples Related samples Independent samples Related samples One: compared to theoretical distribution Two: tested for association Two: degree of relationship Multiple: effect of 2+ predictors on a dependant variable Form of data Form of data One independent variable Multiple independent variables Repeatedmeasures ANOVA Chi-square goodness-of-fit test Chi-square test for association Level of measurement Multiple regression Ordinal Interval Ordinal Interval Ordinal Interval Mann-Whitney U test Independentsamples t-test One-way ANOVA Multifactorial ANOVA Spearman's rho Pearson's correlation cooeficient Wilcoxon matchedpairs test Pairedsamples t-test Source: Corston and Colman (2000) Figure 9. Choosing an appropriate statistical procedure

### 6. An Introduction to Statistical Package for the Social Sciences

6. An Introduction to Statistical Package for the Social Sciences 53 Nick Emtage and Stephen Duthy This module provides an introduction to statistical analysis, particularly in regard to survey data. Some

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } One-Sample Chi-Square Test

### Simple Linear Regression in SPSS STAT 314

Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year s sales records in Virginia Beach, Virginia. The following data were obtained,

### SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### 7. Tests of association and Linear Regression

7. Tests of association and Linear Regression In this chapter we consider 1. Tests of Association for 2 qualitative variables. 2. Measures of the strength of linear association between 2 quantitative variables.

### , then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients (

Multiple regression Introduction Multiple regression is a logical extension of the principles of simple linear regression to situations in which there are several predictor variables. For instance if we

### SPSS: Descriptive and Inferential Statistics. For Windows

For Windows August 2012 Table of Contents Section 1: Summarizing Data...3 1.1 Descriptive Statistics...3 Section 2: Inferential Statistics... 10 2.1 Chi-Square Test... 10 2.2 T tests... 11 2.3 Correlation...

### Simple Predictive Analytics Curtis Seare

Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

### SPSS Explore procedure

SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

### 2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

### IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA

CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

### where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

### The scatterplot indicates a positive linear relationship between waist size and body fat percentage:

STAT E-150 Statistical Methods Multiple Regression Three percent of a man's body is essential fat, which is necessary for a healthy body. However, too much body fat can be dangerous. For men between the

### Directions for using SPSS

Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

### Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

### January 26, 2009 The Faculty Center for Teaching and Learning

THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Basic Data Analysis Using JMP in Windows Table of Contents:

Basic Data Analysis Using JMP in Windows Table of Contents: I. Getting Started with JMP II. Entering Data in JMP III. Saving JMP Data file IV. Opening an Existing Data File V. Transforming and Manipulating

### Regression analysis in practice with GRETL

Regression analysis in practice with GRETL Prerequisites You will need the GNU econometrics software GRETL installed on your computer (http://gretl.sourceforge.net/), together with the sample files that

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

NEW YORK UNIVERSITY ROBERT F. WAGNER GRADUATE SCHOOL OF PUBLIC SERVICE Course Syllabus Spring 2016 Statistical Methods for Public, Nonprofit, and Health Management Section Format Day Begin End Building

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### Statistical Significance and Bivariate Tests

Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions,

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### Lesson Lesson Outline Outline

Lesson 15 Linear Regression Lesson 15 Outline Review correlation analysis Dependent and Independent variables Least Squares Regression line Calculating l the slope Calculating the Intercept Residuals and

### Practice 3 SPSS. Partially based on Notes from the University of Reading:

Practice 3 SPSS Partially based on Notes from the University of Reading: http://www.reading.ac.uk Simple Linear Regression A simple linear regression model is fitted when you want to investigate whether

### Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce

Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

### SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

### AMS7: WEEK 8. CLASS 1. Correlation Monday May 18th, 2015

AMS7: WEEK 8. CLASS 1 Correlation Monday May 18th, 2015 Type of Data and objectives of the analysis Paired sample data (Bivariate data) Determine whether there is an association between two variables This

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

### Chapter 14: Analyzing Relationships Between Variables

Chapter Outlines for: Frey, L., Botan, C., & Kreps, G. (1999). Investigating communication: An introduction to research methods. (2nd ed.) Boston: Allyn & Bacon. Chapter 14: Analyzing Relationships Between

### CRJ Doctoral Comprehensive Exam Statistics Friday August 23, :00pm 5:30pm

CRJ Doctoral Comprehensive Exam Statistics Friday August 23, 23 2:pm 5:3pm Instructions: (Answer all questions below) Question I: Data Collection and Bivariate Hypothesis Testing. Answer the following

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### 1. The Classical Linear Regression Model: The Bivariate Case

Business School, Brunel University MSc. EC5501/5509 Modelling Financial Decisions and Markets/Introduction to Quantitative Methods Prof. Menelaos Karanasos (Room SS69, Tel. 018956584) Lecture Notes 3 1.

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

### Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

### Module 5: Statistical Analysis

Module 5: Statistical Analysis To answer more complex questions using your data, or in statistical terms, to test your hypothesis, you need to use more advanced statistical tests. This module reviews the

### An SPSS companion book. Basic Practice of Statistics

An SPSS companion book to Basic Practice of Statistics SPSS is owned by IBM. 6 th Edition. Basic Practice of Statistics 6 th Edition by David S. Moore, William I. Notz, Michael A. Flinger. Published by

### Algebra 1 Course Information

Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### IBM SPSS Statistics 23 Part 4: Chi-Square and ANOVA

IBM SPSS Statistics 23 Part 4: Chi-Square and ANOVA Winter 2016, Version 1 Table of Contents Introduction... 2 Downloading the Data Files... 2 Chi-Square... 2 Chi-Square Test for Goodness-of-Fit... 2 With

### Regression Analysis: Basic Concepts

The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance

### Class 6: Chapter 12. Key Ideas. Explanatory Design. Correlational Designs

Class 6: Chapter 12 Correlational Designs l 1 Key Ideas Explanatory and predictor designs Characteristics of correlational research Scatterplots and calculating associations Steps in conducting a correlational

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

### Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

Regression in SPSS Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology John P. Bentley Department of Pharmacy Administration University of

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### Common Core Unit Summary Grades 6 to 8

Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity- 8G1-8G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations

### 12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Understand linear regression with a single predictor Understand how we assess the fit of a regression model Total Sum of Squares

### MTH 140 Statistics Videos

MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative

### Perform hypothesis testing

Multivariate hypothesis tests for fixed effects Testing homogeneity of level-1 variances In the following sections, we use the model displayed in the figure below to illustrate the hypothesis tests. Partial

### Introduction to Statistics and Quantitative Research Methods

Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.

### For example, enter the following data in three COLUMNS in a new View window.

Statistics with Statview - 18 Paired t-test A paired t-test compares two groups of measurements when the data in the two groups are in some way paired between the groups (e.g., before and after on the

### Moderation. Moderation

Stats - Moderation Moderation A moderator is a variable that specifies conditions under which a given predictor is related to an outcome. The moderator explains when a DV and IV are related. Moderation

### Introduction to Quantitative Methods

Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Multiple Regression in SPSS STAT 314

Multiple Regression in SPSS STAT 314 I. The accompanying data is on y = profit margin of savings and loan companies in a given year, x 1 = net revenues in that year, and x 2 = number of savings and loan

### 4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4

4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Non-linear functional forms Regression

### Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### MAT 12O ELEMENTARY STATISTICS I

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 12O ELEMENTARY STATISTICS I 3 Lecture Hours, 1 Lab Hour, 3 Credits Pre-Requisite:

T O P I C 1 2 Techniques and tools for data analysis Preview Introduction In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these

### Practical. I conometrics. data collection, analysis, and application. Christiana E. Hilmer. Michael J. Hilmer San Diego State University

Practical I conometrics data collection, analysis, and application Christiana E. Hilmer Michael J. Hilmer San Diego State University Mi Table of Contents PART ONE THE BASICS 1 Chapter 1 An Introduction

### 1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

### Course Description. Learning Objectives

STAT X400 (2 semester units in Statistics) Business, Technology & Engineering Technology & Information Management Quantitative Analysis & Analytics Course Description This course introduces students to

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

### A Primer on Forecasting Business Performance

A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.

### Vertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th

Vertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th Standard 3: Data Analysis, Statistics, and Probability 6 th Prepared Graduates: 1. Solve problems and make decisions that depend on un

### 11/20/2014. Correlational research is used to describe the relationship between two or more naturally occurring variables.

Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### BIOSTATISTICS QUIZ ANSWERS

BIOSTATISTICS QUIZ ANSWERS 1. When you read scientific literature, do you know whether the statistical tests that were used were appropriate and why they were used? a. Always b. Mostly c. Rarely d. Never

### How to choose a statistical test. Francisco J. Candido dos Reis DGO-FMRP University of São Paulo

How to choose a statistical test Francisco J. Candido dos Reis DGO-FMRP University of São Paulo Choosing the right test One of the most common queries in stats support is Which analysis should I use There

### Getting Correct Results from PROC REG

Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS s implementation of linear regression, is often used to fit a line without checking