12.1 Inference for Linear Regression

Similar documents
Correlation and Regression

Simple linear regression

2013 MBA Jump Start Program. Statistics Module Part 3

Regression Analysis: A Complete Example

MTH 140 Statistics Videos

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

MULTIPLE REGRESSION EXAMPLE

Chapter 7: Simple linear regression Learning Objectives

Exercise 1.12 (Pg )

STAT 350 Practice Final Exam Solution (Spring 2015)

2. Simple Linear Regression

Simple Regression Theory II 2010 Samuel L. Baker

17. SIMPLE LINEAR REGRESSION II

DATA INTERPRETATION AND STATISTICS

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Lecture 1: Review and Exploratory Data Analysis (EDA)

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Simple Linear Regression Inference

Chapter 23. Inferences for Regression

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Simple Linear Regression

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

Statistics 151 Practice Midterm 1 Mike Kowalski

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Recall this chart that showed how most of our course would be organized:

Diagrams and Graphs of Statistical Data

August 2012 EXAMINATIONS Solution Part I

Multiple Linear Regression

Factors affecting online sales

Final Exam Practice Problem Answers

AP Physics 1 and 2 Lab Investigations

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

Chapter 7 Section 7.1: Inference for the Mean of a Population

AP Statistics: Syllabus 1

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

An analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression

Correlation and Simple Linear Regression

Premaster Statistics Tutorial 4 Full solutions

Chapter 7 Section 1 Homework Set A

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

The importance of graphing the data: Anscombe s regression examples

How To Write A Data Analysis

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

Univariate Regression

Geostatistics Exploratory Analysis

12: Analysis of Variance. Introduction

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

a) Find the five point summary for the home runs of the National League teams. b) What is the mean number of home runs by the American League teams?

Summarizing and Displaying Categorical Data

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Correlation key concepts:

Statistics Review PSY379

Unit 26: Small Sample Inference for One Mean

Estimation of σ 2, the variance of ɛ

Scatter Plots with Error Bars

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

List of Examples. Examples 319

Name: Date: Use the following to answer questions 3-4:

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Fairfield Public Schools

Module 4: Data Exploration

Chapter 7. One-way ANOVA

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

1.5 Oneway Analysis of Variance

Predictor Coef StDev T P Constant X S = R-Sq = 0.0% R-Sq(adj) = 0.

POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Analysis of Variance ANOVA

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

11. Analysis of Case-control Studies Logistic Regression

Simple Linear Regression, Scatterplots, and Bivariate Correlation


USING A TI-83 OR TI-84 SERIES GRAPHING CALCULATOR IN AN INTRODUCTORY STATISTICS CLASS

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Data Analysis Tools. Tools for Summarizing Data

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Relationships Between Two Variables: Scatterplots and Correlation

South Carolina College- and Career-Ready (SCCCR) Probability and Statistics

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares

Statistical Models in R

Example G Cost of construction of nuclear power plants

Section 1: Simple Linear Regression

Elementary Statistics Sample Exam #3

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

10. Analysis of Longitudinal Studies Repeat-measures analysis

An analysis method for a quantitative outcome and two categorical explanatory variables.

Exploratory Data Analysis

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section:

MORE ON LOGISTIC REGRESSION

STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model

Using R for Linear Regression

Transcription:

12.1 Inference for Linear Regression Least Squares Regression Line y = a + bx You might want to refresh your memory of LSR lines by reviewing Chapter 3! 1

Sample Distribution of b p740 Shape Center Spread 2

Conditions for Regression Inference Suppose we have n observations over an explanatory variable x and a response variable y. Our goal is to study or predict the behavior of y for given values of x. Linear:The actual relationship between x and y is lenear. For any fixed value of x, the mean response μ y = α +βx. the slope β and the intercept α ar usually unknown parameters. Independent: Individual observations are independent of each other. Normal: For any fixed value of x, the response y varies according to a Normal distribution. Equal variance: The standard deviation of y (call it σ) is the same for all values of x. The common standard deviation σ is usually an unknown parameter. Random: The data come from a well designed random sample or randomized experiment. The acronym LINER should help you remember the conditions for inference about regression. Note that the symbols α and β here refer to the intercept and slope respectively, of the population regression line. These are not related to Type I or Type II errors. 3

How to Check the Conditions for Regression Inference Linear: Examine the scatterplot to check that the overall pattern is roughly linear. Look for curved patterns in the residual plot. check to see that the residuals center on the "residual =0" line at each x value in the residual plot. Independent Look at how the data were produced. Random sampling and random assignment help ensure the independence of individual observations. If sampling is done without replacement, remember to check that the population is at least 10 times as large as the sample (10% condition). Normal Make a stemplot, histogram, or Normal probability plot of the residuals and check for clear skewness or other major departures from Normality. Equal variance Look at the scatter of the residuals above and below the "residual =0" line in the residual plot. The amount of scatter should be roughly the same from the smallest to the largest x value. Random: See if the data were produced by random sampling or a randomized experiment. 4

Ex: The Helicopter Experiment p743 Check the Conditions: LINER Linear: The scatterplot shows a clear linear form. For each drop, the residuals are centered on the horizontal line 0. The residual plot shows a random scatter about the horizontal line. Independent: Released in a random order and none used twice. Normal: The histogram of the residuals is single peaked, unimodal, and somewhat bell shaped. The normal prob. plot is close to linear. Equal Variance: The residual plot shows a similar amount of scatter about the residual = 0 line. Random: Random assignment. 5

Estimating Parameters Ex: The Helicopter p745 The least squares regression line for these data is: flight time =.03761 +.0057244(drop height) You will be asked to estimate from the above data α β σ 6

Constructing a Confidence Interval for the Slope t Interval for the Slope of a Least Squares Regression Line When the conditions for regression inference are met, a level C confidence interval for the slope β of the population (true) regression line is b ± t*se b In this formula, the standard error of the slope is SE b = s/(s x n 1) and t* is the critical value for the t distribution with df = n 2 having area C between t* and t* 7

Ex: p747 The Helicopter A confidence interval for β a) Identify the standard error of the slope SEb.0002018 from SE coef column. If we repeated the random assignment many times, the slope of the sampe regression line would typically vary by about.0002 from the slope of the true regression line. b) Find the critical value for a 95% CI for the slope of the true regression line. Conditions are met. df = n 2 = 70 2 = 68. Using Table B and the conservative df = 60 t* = 2.000. The 95% CI = b ± t*seb =.0057244 ±2.000(.0002018) =.0057244 ±.0004036 = (.0053208,.0061280) This interval is slightly narrower due to the more precise t*critical value. c) We are 95% confident that the interval (.0053208,.0061280) seconds per cm captures the slope of the true regression line relating the flight time y and the drop height x of paper helicopters. d) If we repeat the experiment many, many times, and use the method in part a to construct a conf. interval each time, about 95% of the resulting intervals will capture the slope of the true regression line relating flight time y and drop height x of paper helicopters. 8

Ex: Does Fidgeting Keep you Slim? p749 9

Performing a Significance Test for the Slope t Test for the Slope of the Population Regression Line Suppose the conditions for inference are met. To test the hypothesis H 0 : β = hypothesized value, compute the test statistic t = (b β 0 )/SE b Find the P value by calculating the probability of getting a t statistic this large or larger in the direction specified by the alternative hypothesis H a. Use the t distribution with df = n 2 H a : β > hypothesized value H a : β < hypothesized value H a : β hypothesized value 10

Ex: Crying and IQ p752 11

Ex: Magnetic Fields p755 Technology Corner p756 Regression Inference on the calculator. 12

Homework: P759 1 11 Odd P761 13 19 odd 13