Stat 20: A guide to estimating regression parameters

B. M. Bolstad, bolstad@stat.berkeley.edu
November 24, 2003

The goal of this document is to outline the steps that you should go through to estimate regression parameters in this class. This main text should be used in connection with the flow diagram, which gives you a decision guide for the process of estimating regression coefficients.

1 Writing the regression model in the general form

1.1 Theory

We can write the general regression model in the form

$$\mu(\mathbf{x}_i) = \beta_0 g_0(\mathbf{x}_i) + \beta_1 g_1(\mathbf{x}_i) + \cdots + \beta_p g_p(\mathbf{x}_i) \tag{1}$$

The $\beta_0, \ldots, \beta_p$ are called the regression parameters. They are unknown and we will estimate them as part of the regression process. The $g_0(\mathbf{x}_i), \ldots, g_p(\mathbf{x}_i)$ are called the basis functions. They represent general functions of the $\mathbf{x}$ data. In practice we will have a regression model and will be able to identify the basis functions by looking at the model.

1.2 Some examples

Consider the case where $\mathbf{x} = x$ is a single value. The following table shows some regression models and identifies the basis functions.

Regression model                                          Basis functions
$\mu(x) = \beta_0 + \beta_1 x$                            $g_0(x) = 1$, $g_1(x) = x$
$\mu(x) = \beta_0 + \beta_1 x + \beta_2 x^2$              $g_0(x) = 1$, $g_1(x) = x$, $g_2(x) = x^2$
$\mu(x) = \beta_0 + \beta_1 \frac{1}{x} + \beta_2 x$      $g_0(x) = 1$, $g_1(x) = \frac{1}{x}$, $g_2(x) = x$
$\mu(x) = \beta_0 + \beta_1 \frac{1}{x} + \beta_2 \sin(x)$   $g_0(x) = 1$, $g_1(x) = \frac{1}{x}$, $g_2(x) = \sin(x)$

Rather than just a single $x$ value, let's consider $\mathbf{x} = (x_1, x_2, x_3)$. The following table shows some regression models and identifies the basis functions.

Regression model                                                              Basis functions
$\mu(\mathbf{x}) = \beta_0 + \beta_1 x_1$                                     $g_0(\mathbf{x}) = 1$, $g_1(\mathbf{x}) = x_1$
$\mu(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$         $g_0(\mathbf{x}) = 1$, $g_1(\mathbf{x}) = x_1$, $g_2(\mathbf{x}) = x_2$, $g_3(\mathbf{x}) = x_3$
$\mu(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \beta_2 x_1 x_2 + \beta_3 \cos(x_3)$   $g_0(\mathbf{x}) = 1$, $g_1(\mathbf{x}) = x_1$, $g_2(\mathbf{x}) = x_1 x_2$, $g_3(\mathbf{x}) = \cos(x_3)$
$\mu(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \beta_2 x_1 x_2 + \beta_3 x_1 x_2 x_3$   $g_0(\mathbf{x}) = 1$, $g_1(\mathbf{x}) = x_1$, $g_2(\mathbf{x}) = x_1 x_2$, $g_3(\mathbf{x}) = x_1 x_2 x_3$
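As a rough illustration of the general form, the sketch below encodes the basis functions of the model with terms 1, x1, x1*x2 and cos(x3) as small Python functions and evaluates the regression function at one point. The names `BASIS`, `basis_values` and `mu`, and the numeric values of the parameters, are made up for this illustration and are not part of the course material.

```python
import math

# Hypothetical encoding of the basis functions for the model
#   mu(x) = b0 + b1*x1 + b2*x1*x2 + b3*cos(x3)
# Each function takes the tuple x = (x1, x2, x3).
BASIS = [
    lambda x: 1.0,               # g0(x) = 1
    lambda x: x[0],              # g1(x) = x1
    lambda x: x[0] * x[1],       # g2(x) = x1 * x2
    lambda x: math.cos(x[2]),    # g3(x) = cos(x3)
]

def basis_values(x, basis):
    """Evaluate every basis function at the point x."""
    return [g(x) for g in basis]

def mu(x, betas, basis):
    """General form: mu(x) = sum over j of beta_j * g_j(x)."""
    return sum(b * v for b, v in zip(betas, basis_values(x, basis)))

# Made-up point and parameter values, for illustration only.
x = (2.0, 3.0, 0.0)
betas = [1.0, 2.0, 3.0, 4.0]

print(basis_values(x, BASIS))  # [1.0, 2.0, 6.0, 1.0]
print(mu(x, betas, BASIS))     # 27.0
```

Writing the model as a list of basis functions makes it easy to swap in any of the other models from the tables: only the list changes, not the evaluation code.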

2 Checking that the basis is orthogonal

2.1 Theory

Given any two basis functions $g_j(\mathbf{x})$, $g_k(\mathbf{x})$ we say that they are orthogonal if and only if

$$\sum_{i=1}^{n} g_j(\mathbf{x}_i)\, g_k(\mathbf{x}_i) = 0 \tag{2}$$

We say a set of basis functions is orthogonal if $g_j(\mathbf{x})$, $g_k(\mathbf{x})$ are orthogonal for all possible $j, k$ with $j \neq k$.

2.2 Some examples

Consider the basis functions $1, x$. To check whether the basis is orthogonal you need to check whether $\sum (1)(x_i) = \sum x_i = 0$ for your data set. For instance, suppose that $x = (1, 0, -1, 2, 0, -2, 3, 0, -3)$. Then

$$\sum x_i = 1 + 0 - 1 + 2 + 0 - 2 + 3 + 0 - 3 = 0$$

so the basis $1$ and $x$ is orthogonal for our data.

If our basis functions were $1$ and $x^2$ then we would need to check that $\sum (1)(x_i^2) = \sum x_i^2 = 0$. Assuming we had the same data, then

$$\sum x_i^2 = 1 + 0 + 1 + 4 + 0 + 4 + 9 + 0 + 9 = 28$$

which is not equal to 0, so the functions $1$ and $x^2$ are not an orthogonal basis.

Now for a more complicated example. Consider the following data:

x_1   x_2
  1    -1
  0    -1
 -1    -1
  1     0
  0     0
 -1     0
  1     1
  0     1
 -1     1

If our basis functions are $1, x_1, x_2$ then to check that the basis is orthogonal we check that

$$\sum (1)(x_{i1}) = \sum x_{i1} = 1 + 0 - 1 + 1 + 0 - 1 + 1 + 0 - 1 = 0,$$

$$\sum (1)(x_{i2}) = \sum x_{i2} = -1 - 1 - 1 + 0 + 0 + 0 + 1 + 1 + 1 = 0$$

and finally

$$\sum x_{i1} x_{i2} = (1)(-1) + (0)(-1) + (-1)(-1) + (1)(0) + (0)(0) + (-1)(0) + (1)(1) + (0)(1) + (-1)(1) = 0.$$

So the basis is orthogonal.

Using the same data, suppose instead that the basis functions are $1, x_1, x_2 - 4$. Then to check if the basis is orthogonal we check that

$$\sum (1)(x_{i1}) = \sum x_{i1} = 1 + 0 - 1 + 1 + 0 - 1 + 1 + 0 - 1 = 0,$$

$$\sum (1)(x_{i2} - 4) = (-1-4) + (-1-4) + (-1-4) + (0-4) + (0-4) + (0-4) + (1-4) + (1-4) + (1-4) = -5 - 5 - 5 - 4 - 4 - 4 - 3 - 3 - 3 = -36$$

and so $1$ and $x_2 - 4$ are not orthogonal, therefore the basis is not orthogonal.
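The pairwise check in equation (2) is mechanical, so it can be sketched in a few lines of Python. The helper name `is_orthogonal` and the tolerance argument are assumptions made for this sketch; the data are the ones used in the examples of this section.

```python
from itertools import combinations

def is_orthogonal(basis, data, tol=1e-12):
    """Return True if every pair of distinct basis functions satisfies
    sum_i g_j(x_i) * g_k(x_i) = 0 on the given data (up to tol)."""
    for gj, gk in combinations(basis, 2):
        if abs(sum(gj(x) * gk(x) for x in data)) > tol:
            return False
    return True

# Single-variable data from the first example.
xs = [1, 0, -1, 2, 0, -2, 3, 0, -3]
print(is_orthogonal([lambda x: 1, lambda x: x], xs))       # True  (sum of x_i is 0)
print(is_orthogonal([lambda x: 1, lambda x: x ** 2], xs))  # False (sum of x_i^2 is 28)

# Two-variable data from the second example, stored as (x1, x2) pairs.
pairs = [(1, -1), (0, -1), (-1, -1),
         (1, 0), (0, 0), (-1, 0),
         (1, 1), (0, 1), (-1, 1)]
one = lambda p: 1
x1 = lambda p: p[0]
x2 = lambda p: p[1]
x2_minus_4 = lambda p: p[1] - 4

print(is_orthogonal([one, x1, x2], pairs))          # True
print(is_orthogonal([one, x1, x2_minus_4], pairs))  # False (sum of x_i2 - 4 is -36)
```

Note that orthogonality is a property of the basis together with the data: the same basis can be orthogonal on one data set and not on another.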

3 Estimating the parameters of a regression model: The least squares method

3.1 Theory

The goal of the least squares method is to choose the values of $\beta_0, \beta_1, \ldots, \beta_p$ which minimize the error sum of squares (SSE)

$$\sum_{i=1}^{n} \left( Y_i - \beta_0 g_0(\mathbf{x}_i) - \beta_1 g_1(\mathbf{x}_i) - \cdots - \beta_p g_p(\mathbf{x}_i) \right)^2$$

We do this by differentiating with respect to each of the $\beta_0, \beta_1, \ldots, \beta_p$, giving us a system of $p + 1$ equations, each of which we set equal to zero. This set of $p + 1$ equations is called the normal equations. Note that in general you can show, by differentiating the SSE above with respect to each of $\beta_0, \beta_1, \ldots, \beta_p$, that the general form for the normal equations is given by

$$\hat\beta_0 \sum_{i=1}^{n} g_j(\mathbf{x}_i)\, g_0(\mathbf{x}_i) + \cdots + \hat\beta_p \sum_{i=1}^{n} g_j(\mathbf{x}_i)\, g_p(\mathbf{x}_i) = \sum_{i=1}^{n} g_j(\mathbf{x}_i)\, Y_i \qquad \text{for } j = 0, \ldots, p$$

The least squares estimates $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_p$ are then found by solving the system of equations.

3.2 Some examples

The least squares method was used to derive formulas for the regression model $\mu(x) = \beta_0 + \beta_1 x$; see the notes from Nov 10.

Consider the regression model $\mu(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$. The sum of squared errors for this model will be

$$\sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} \right)^2$$

We want the values of $\beta_0$, $\beta_1$ and $\beta_2$ which minimize this sum of squares. Call these values $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$. To find these we must differentiate the sum of squared errors with respect to $\beta_0$, with respect to $\beta_1$ and with respect to $\beta_2$. This gives three equations which we set equal to zero (these are called the normal equations) and then solve for $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$.

Differentiating the sum of squared errors with respect to $\beta_0$ and setting equal to zero we get

$$\sum_{i=1}^{n} 2 \left( y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} \right) (-1) = 0$$

(we put the hat on each $\beta$ because we set the equation equal to 0). After simplifying a little we get the first normal equation

$$\sum y_i - n \hat\beta_0 - \hat\beta_1 \sum x_{i1} - \hat\beta_2 \sum x_{i2} = 0$$

Differentiating the sum of squared errors with respect to $\beta_1$ and setting equal to zero we get

$$\sum_{i=1}^{n} 2 \left( y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} \right) (-x_{i1}) = 0$$

After simplifying a little we get the second normal equation

$$\sum x_{i1} y_i - \hat\beta_0 \sum x_{i1} - \hat\beta_1 \sum x_{i1}^2 - \hat\beta_2 \sum x_{i1} x_{i2} = 0$$

Finally, differentiating with respect to $\beta_2$ and setting equal to zero we get

$$\sum_{i=1}^{n} 2 \left( y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} \right) (-x_{i2}) = 0$$

After simplifying a little we get the third normal equation

$$\sum x_{i2} y_i - \hat\beta_0 \sum x_{i2} - \hat\beta_1 \sum x_{i1} x_{i2} - \hat\beta_2 \sum x_{i2}^2 = 0$$

The three equations can then be solved for $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$. Rather than solving these equations to get general formulas for this model, we should instead evaluate the summary statistics using the data, since this will make manipulating the equations much easier. Suppose that we have the following data:

 y    x_1   x_2
21      1    -1
15      0    -1
13     -1    -1
20      1     0
25      0     1
18     -1     0
29      1     1
22      0     1
21     -1     1

And so we see that $n = 9$, $\sum y_i = 184$, $\sum x_{i1} = 0$, $\sum x_{i2} = 1$, $\sum x_{i1} y_i = 18$, $\sum x_{i2} y_i = 48$, $\sum x_{i1} x_{i2} = 0$, $\sum x_{i1}^2 = 6$ and $\sum x_{i2}^2 = 7$. Thus, the normal equations become

$$184 - 9\hat\beta_0 - \hat\beta_2 = 0$$
$$18 - 6\hat\beta_1 = 0$$
$$48 - \hat\beta_0 - 7\hat\beta_2 = 0$$

Now we just solve the three equations for $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$. The second equation gives $\hat\beta_1 = 18/6 = 3$. From the first equation we get $\hat\beta_2 = 184 - 9\hat\beta_0$. Substituting this into the third equation we get

$$48 - \hat\beta_0 - 7(184 - 9\hat\beta_0) = 0$$

and so we get

$$-1240 + 62\hat\beta_0 = 0$$

and so $\hat\beta_0 = 1240/62 = 20$ and $\hat\beta_2 = 184 - 9(20) = 4$.
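The worked example can be double-checked numerically. The sketch below builds the normal equations from the basis functions 1, x1, x2 and the nine data rows, then solves them by Gauss-Jordan elimination using exact fractions. The function names `normal_equations` and `solve` are hypothetical helpers written for this illustration; elimination is just one convenient way to solve the system and is not the method used in the hand calculation above.

```python
from fractions import Fraction

def normal_equations(basis, xs, ys):
    """Build the system A * beta = b with
    A[j][k] = sum_i g_j(x_i) * g_k(x_i) and b[j] = sum_i g_j(x_i) * y_i."""
    A = [[Fraction(sum(gj(x) * gk(x) for x in xs)) for gk in basis]
         for gj in basis]
    b = [Fraction(sum(gj(x) * y for x, y in zip(xs, ys))) for gj in basis]
    return A, b

def solve(A, b):
    """Solve a small square linear system by Gauss-Jordan elimination.
    Assumes the system has a unique solution."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [v / scale for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Data from the example: (x1, x2) pairs and the responses y.
xs = [(1, -1), (0, -1), (-1, -1), (1, 0), (0, 1),
      (-1, 0), (1, 1), (0, 1), (-1, 1)]
ys = [21, 15, 13, 20, 25, 18, 29, 22, 21]

basis = [lambda x: 1, lambda x: x[0], lambda x: x[1]]

A, b = normal_equations(basis, xs, ys)
beta = solve(A, b)
print([int(v) for v in b])     # [184, 18, 48]
print([int(v) for v in beta])  # [20, 3, 4]
```

Exact fractions are used so that the result matches the hand calculation without rounding error; for larger problems a numerical linear algebra library would be the more usual choice.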

3.3 The special case of the model $\mu(x) = \beta_0 + \beta_1 x$

For the special case of the simple linear regression model $\mu(x) = \beta_0 + \beta_1 x$ we showed (see 10th Nov), using the least squares approach, that for this particular model we could estimate $\hat\beta_0$ and $\hat\beta_1$ using

$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} \tag{3}$$

and

$$\hat\beta_1 = \frac{\sum x_i y_i - n \bar{y} \bar{x}}{\sum x_i^2 - n \bar{x}^2}$$

Alternatively, we showed on Nov 1 that the formula given by your textbook

$$\hat\beta_1 = r \frac{s_y}{s_x} \tag{4}$$

could be used in place of the formula above (and that the two agree completely). Note that in this case $r$ is the correlation, $s_y$ is the sample standard deviation of the $y$ data and $s_x$ is the sample standard deviation of the $x$ data.

4 Estimating the parameters of the regression model if the basis is orthogonal

4.1 Theory

If all the basis functions $g_0(\mathbf{x}), \ldots, g_p(\mathbf{x})$ are orthogonal, then we can use the following formula to estimate the regression parameters $\beta_0, \ldots, \beta_p$:

$$\hat\beta_j = \frac{\sum_{i=1}^{n} g_j(\mathbf{x}_i)\, y_i}{\sum_{i=1}^{n} g_j(\mathbf{x}_i)^2} \qquad \text{for } j = 0, \ldots, p$$

4.2 Some examples

Suppose that we have used our data and, via the methods discussed in section 2, have shown that the basis $1, x, x^2 - 5$ is orthogonal for our data. Then in this case when we fit the regression model $\mu(x) = \beta_0 + \beta_1 x + \beta_2 (x^2 - 5)$ we get the following parameter estimates:

$$\hat\beta_0 = \frac{\sum (1) y_i}{\sum (1)^2} = \frac{\sum y_i}{n} = \bar{y}$$

$$\hat\beta_1 = \frac{\sum x_i y_i}{\sum x_i^2}$$

$$\hat\beta_2 = \frac{\sum (x_i^2 - 5)\, y_i}{\sum (x_i^2 - 5)^2}$$

Consider a different dataset, where we show that $1$, $x_1$, $x_2$ and $x_1 x_2$ is an orthogonal basis using the methods of section 2. Then when it comes to fitting the regression model $\mu(x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$ we would get the following parameter estimates:

$$\hat\beta_0 = \frac{\sum (1) y_i}{\sum (1)^2} = \frac{\sum y_i}{n} = \bar{y}$$

$$\hat\beta_1 = \frac{\sum x_{i1} y_i}{\sum x_{i1}^2}$$

$$\hat\beta_2 = \frac{\sum x_{i2} y_i}{\sum x_{i2}^2}$$

$$\hat\beta_3 = \frac{\sum x_{i1} x_{i2}\, y_i}{\sum (x_{i1} x_{i2})^2}$$