# Regression III: Dummy Variable Regression

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Regression III: Dummy Variable Regression Tom Ilvento FREC 408 Linear Regression Assumptions about the error term Mean of Probability Distribution of the Error term is zero Probability Distribution of Error Has Constant Variance = F 2 Probability Distribution of Error is Normal Errors are Independent they are uncorrelated with each other (page553) Regression Applications - we will look at Dummy variable regression as an alternative to Multiple Regression Dummy Variable Regression Regression using a dichotomus independent variable or set of variables Regression will estimate the same relationship as There will be a few important changes in Excel The Data need to be in columns with matched data for all the variables no missing values for any variable For Any Categorical Variable I can represent any categorical variable with j classes With j-1 dummy variables, coded as 0 and 1 For example, Sex has 2 classes male and Female Represent as one variable coded 1 if female and 0 if male Dummy Variables Example Religion (Protestant, Catholic, Jewish) Dummy 1 (X1) = 1 if Protestant, 0 if not Dummy 2 (X2) = 1 if Catholic, 0 if not If you are Jewish, then you will have a value of zero on Dummy 1 (X1) and Dummy 2 (X2) The other class is called the reference category and is captured in the intercept term 1

2 Example Problem Examines the Sorption Rate of three different hazardous organic solvents Aromatics Chloroalkanes Esters Asks if there are differences among the three? Sample of 32 sorption rates across the three classes of organic hazardous solvents The dependent variable is Sorption Rate Anova: Single Factor Results SUMMARY Groups Count Sum Average Variance Aromatics Cloro Esters Source of Variation SS df MS F P-value F crit Between Groups Within Groups Total Hypothesis Test for the Factor Rejection Region H 0 : : 1 =: 2 = : 3 H a : At least two means differ Equal variances, normal distribution F* = F.05, 2, 29 d.f. = F* > F > Reject H 0 : : 1 =: 2 = : 3 There are differences across the organic chemicals Regression Approach For Excel, reorganize the data Dependent variable is in a single column Classes are coded as 0/1 in contiguous columns Run Tools, Data Analysis, Regression and pick two of the three classes to be included in the model Let s look at Excel Regression Output with Esters as the reference category Multiple R R Square Adjusted R Square Standard Error Observations 32 df SS MS F Signif. F Regression Residual Total

3 Anova: Single Factor Table is the same! SUMMARY Groups Count Sum Average Variance Aromatics Cloro Esters Source of Variation SS df MS F P-value F crit Between Groups Within Groups Regression gives us Coefficients Coefficients Std Error t Stat P-value Intercept Aromatics Cloro ) Y = ( Aromatics) ( Cloro) Total Estimated values from our equation Since our independent variables are dummy variables, it is easy to solve the equation ) Y = ( Aromatics) ( Cloro) When Aromatics = 1 = (1) (0) =.9422 When Chloroalkanes = 1 = (0) (1) = When Aromatics and Chloroalkanes =0 = (0) (0) =.3300 This represents Esters! The model estimated the mean levels for each solvent Aromatics Cloro Esters Mean Standard Error Median Mode #N/A #N/A Standard Deviation Sample Variance The t-test for the dummy coefficients Hypothesis Test for a slope coefficient for Cloro, Coefficients Std Error t Stat P-value Intercept Aromatics Cloro ) Y = ( Aromatics) ( Cloro) The test-test for Aromatics and Cloro represent a test if each are significantly different from Esters, i.e., a difference of means test!! Calculation P-value H 0 : \$ 2 = 0 H a : \$ 2 0 two-tailed test Large sample, normal t* = )/.1137 t*= P =.000 Reject H 0 : \$ 2 = 0 3

4 Regression Output with Aromatics as the reference category Regression Output with Aromatics as the reference category Multiple R R Square Adjusted R Square Standard Error Observations 32 df SS MS F Signif F Regression Residual Total Coefficients Std Error t Stat P-value Intercept Cloro Esters The coefficients reflects the difference between Cloro and Esters from Aromatics Notice that the t-test for Cloro shows that it is not significantly different from Aromatics MN Apartment Sales Example A real estate agent wants to use regression analysis to explore the relationship between the sale prices of apartment buildings and various characteristics of the apartments, including: # of apartments age of structure lot size # of on-site parking spaces gross building area condition of apartment building We will focus on Condition Price Condition \$79,300 F \$90,300 F \$93,600 F \$108,750 G \$110,000 G \$134,400 E \$155,700 E \$157,500 G \$162,500 G This is just part of the data. We need to create the dummy variables Condition has 3 levels or categories Fair Good Excellent MN Apartment Sales Example - dummy variables First, look at the relationship between PRICE and Condition of apartment building There are three categories Excellent, Good and Fair. Need (3-1) = 2 dummy variables 1, if condition is Excellent X 1 = 0, if condition is NOT excellent X 2 = 1, if condition is Good 0, if condition is NOT good MN Apartment Sales Example - dummy variables The last category Fair is the reference category When X 1 =0 and X 2 =0, the condition of apartment building is Fair I will label the two dummy variables (X 1 and X 2 ) as Excellent and Good 4

5 Regression Analysis In Excel, we want the regression of Price on Conditions by using two dummy variables X 1 and X 2 I will label them as Excellent and Good Reorganize the data Dependent variable Price is in a single column Excellent and Good are coded as 0/1 Run Tools, Data Analysis, Regression and pick Excellent and Good to be included in the model I create two dummy variables to represent Condition Price Condition Excellent Good \$79,300 F 0 0 \$90,300 F 0 0 \$93,600 F 0 0 \$108,750 G 0 1 \$110,000 G 0 1 \$134,400 E 1 0 \$155,700 E 1 0 \$157,500 G 0 1 \$162,500 G 0 1 There does seem to be differences across the groups There also is a lot of spread across the groups Graph of Price by Condition Mean Apartment Price By Condition \$400, \$300, Fair \$200, \$100, Good \$0.00 Mean Level Excellent Excellent Good Fair Regression Output with two dummy variables Multiple R R Square Adjusted R Square Standard Error Observations 25 df SS MS F Sig F Regression Residual Total Intercept Excellent Good Regression Output with two dummy variables Multiple R R Square Adjusted R Square Standard Error Observations 25 R 2 is only Is this number too small? Is our model is a good one? We re going to do some tests on it later. 5

6 Regression gives us Coefficients and Estimated Model between Price and Condition Intercept Excellent Good Estimated equation/model: Yˆ = (Excellent) (Good) Estimated values from our equation When Excellent = 1 Price = (1) (0) = \$350,250 When Good = 1 Price= (0) (1) = \$305,581 When Excellent and Good =0, Fair Price= (0) (0) = \$176,940 Regression/ Output Multiple R R Square Adjusted R Square Standard Error Observations 25 df SS MS F Sig F Regression Residual Total Intercept Excellent Good F Test for the estimated model H 0 : : 1 =: 2 =0 H a : At least one mean differs from 0 Equal variances, normal distribution F* = F.05, 2, 22 d.f. = 3.44 Rejection Region F* < F < 3.44 Can t reject H 0 : : 1 =: 2 =0 It seems the conditions have no statistically significant compact on apartment price. Is that true in real life? The t-test for the dummy coefficients Intercept Excellent Good Y ˆ = ( X1) ( X 2) The t-tests for X1 (Excellent) and X2 (Good) represent a test if each are significantly different from Fair Hypothesis Test for a slope coefficient for Excellent Calculation P-value H 0 : \$ 1 = 0 H a : \$ 1 0 two-tailed test small sample, normal t*=(173310)/ t*= P =.190 Can t reject H 0 : \$ 1 = 0 Based on our model and t-test, it indicates that the condition of the apartment makes no difference on Price! 6

7 A multivariate Example This focuses on whether females mid-level managers have lower salaries than males. The data set contains the following variables for 220 mid-level managers of firms (we will only focus on these four variables): SALARY Dependent Variable Base annual salary in \$1,000s SEX 1= Female; 0 = Male POSITION An index of the position of the employee in the firm, based on the number of employees supervised, size of budget and so forth. A higher number means higher level in the company YEARS EXP The number of years of experience in the company Difference of Means Test t-test: Two-Sample Assuming Equal Variances Males Females Mean Variance Observations Pooled Variance Hypothesized Mean Difference 0 df 218 t Stat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail Conclusion: We have evidence that males earn more Dummy Variable Regression gives the same result Multiple R R Square Adjusted R Square Standard Error Observations 220 df SS MS F Sig F Regression Residual Total Intercept Sex Multiple Regression Output??? Multiple R R Square Adjusted R Square Standard Error Observations 220 df SS MS F Signif F Regression Residual Total Coeff. Std Error t Stat P-value Intercept Sex Position YearsExper

### Difference of Means and ANOVA Problems

Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### When to use Excel. When NOT to use Excel 9/24/2014

Analyzing Quantitative Assessment Data with Excel October 2, 2014 Jeremy Penn, Ph.D. Director When to use Excel You want to quickly summarize or analyze your assessment data You want to create basic visual

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### EXCEL Analysis TookPak [Statistical Analysis] 1. First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it:

EXCEL Analysis TookPak [Statistical Analysis] 1 First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it: a. From the Tools menu, choose Add-Ins b. Make sure Analysis

### SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

### Introduction to Stata

Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

### Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged

### Regression step-by-step using Microsoft Excel

Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480

1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500

### Multiple Linear Regression

Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

### EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM

EPS 6 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### Simple Linear Regression in SPSS STAT 314

Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year s sales records in Virginia Beach, Virginia. The following data were obtained,

### DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

### Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### Simple Methods and Procedures Used in Forecasting

Simple Methods and Procedures Used in Forecasting The project prepared by : Sven Gingelmaier Michael Richter Under direction of the Maria Jadamus-Hacura What Is Forecasting? Prediction of future events

### Bivariate Analysis. Correlation. Correlation. Pearson's Correlation Coefficient. Variable 1. Variable 2

Bivariate Analysis Variable 2 LEVELS >2 LEVELS COTIUOUS Correlation Used when you measure two continuous variables. Variable 2 2 LEVELS X 2 >2 LEVELS X 2 COTIUOUS t-test X 2 X 2 AOVA (F-test) t-test AOVA

### Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

### UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates

UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally

### Data analysis. Data analysis in Excel using Windows 7/Office 2010

Data analysis Data analysis in Excel using Windows 7/Office 2010 Open the Data tab in Excel If Data Analysis is not visible along the top toolbar then do the following: o Right click anywhere on the toolbar

### Chapter 9. Section Correlation

Chapter 9 Section 9.1 - Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

### Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

### Directions for using SPSS

Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

### MULTIPLE REGRESSION WITH CATEGORICAL DATA

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

### Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

### Lesson Lesson Outline Outline

Lesson 15 Linear Regression Lesson 15 Outline Review correlation analysis Dependent and Independent variables Least Squares Regression line Calculating l the slope Calculating the Intercept Residuals and

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### Regression Analysis. Data Calculations Output

Regression Analysis In an attempt to find answers to questions such as those posed above, empirical labour economists use a useful tool called regression analysis. Regression analysis is essentially a

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

NEW YORK UNIVERSITY ROBERT F. WAGNER GRADUATE SCHOOL OF PUBLIC SERVICE Course Syllabus Spring 2016 Statistical Methods for Public, Nonprofit, and Health Management Section Format Day Begin End Building

### 2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

### SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS)

SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS) State of the course address: The Final exam is Aug 9, 3:30pm 6:30pm in B9201 in the Burnaby Campus. (One

### REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

### MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

### Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,

### Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### ANOVA - Analysis of Variance

ANOVA - Analysis of Variance ANOVA - Analysis of Variance Extends independent-samples t test Compares the means of groups of independent observations Don t be fooled by the name. ANOVA does not compare

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

### SPSS: Descriptive and Inferential Statistics. For Windows

For Windows August 2012 Table of Contents Section 1: Summarizing Data...3 1.1 Descriptive Statistics...3 Section 2: Inferential Statistics... 10 2.1 Chi-Square Test... 10 2.2 T tests... 11 2.3 Correlation...

### Module 6: Introduction to Time Series Forecasting

Using Statistical Data to Make Decisions Module 6: Introduction to Time Series Forecasting Titus Awokuse and Tom Ilvento, University of Delaware, College of Agriculture and Natural Resources, Food and

### SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

### 1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

### Guide to Microsoft Excel for calculations, statistics, and plotting data

Page 1/47 Guide to Microsoft Excel for calculations, statistics, and plotting data Topic Page A. Writing equations and text 2 1. Writing equations with mathematical operations 2 2. Writing equations with

### Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

### Linear Models in STATA and ANOVA

Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

### Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

### Simple Predictive Analytics Curtis Seare

Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

### Once saved, if the file was zipped you will need to unzip it.

1 Commands in SPSS 1.1 Dowloading data from the web The data I post on my webpage will be either in a zipped directory containing a few files or just in one file containing data. Please learn how to unzip

### SPSS BASICS. (Data used in this tutorial: General Social Survey 2000 and 2002) Ex: Mother s Education to eliminate responses 97,98, 99;

SPSS BASICS (Data used in this tutorial: General Social Survey 2000 and 2002) How to do Recoding Eliminating Response Categories Ex: Mother s Education to eliminate responses 97,98, 99; When we run a frequency

### How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

### Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015

Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field

### An Introduction to Statistical Tests for the SAS Programmer Sara Beck, Fred Hutchinson Cancer Research Center, Seattle, WA

ABSTRACT An Introduction to Statistical Tests for the SAS Programmer Sara Beck, Fred Hutchinson Cancer Research Center, Seattle, WA Often SAS Programmers find themselves in situations where performing

### Factor B: Curriculum New Math Control Curriculum (B (B 1 ) Overall Mean (marginal) Females (A 1 ) Factor A: Gender Males (A 2) X 21

1 Factorial ANOVA The ANOVA designs we have dealt with up to this point, known as simple ANOVA or oneway ANOVA, had only one independent grouping variable or factor. However, oftentimes a researcher has

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

### An SPSS companion book. Basic Practice of Statistics

An SPSS companion book to Basic Practice of Statistics SPSS is owned by IBM. 6 th Edition. Basic Practice of Statistics 6 th Edition by David S. Moore, William I. Notz, Michael A. Flinger. Published by

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Lecture 5 Hypothesis Testing in Multiple Linear Regression

Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall

### ANOVA Analysis of Variance

ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,

### Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases:

Profile Analysis Introduction Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases: ) Comparing the same dependent variables

### COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

### Using Microsoft Excel to Analyze Data from the Disk Diffusion Assay

Using Microsoft Excel to Analyze Data from the Disk Diffusion Assay Entering and Formatting Data Open Excel. Set up the spreadsheet page (Sheet 1) so that anyone who reads it will understand the page (Figure

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Statistical Functions in Excel

Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

### An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10- TWO-SAMPLE TESTS Practice

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing