Linear Regression and Multilinear Regression


1 Linear Regression

For this section, assume that we are given data points in the form $(x_i, y_i)$ for $i = 1, 2, \ldots, N$ and that we desire to fit a line to these data points. So, we need to find the slope and $y$-intercept that make a line fit these data best. The approach used in linear regression is to minimize the sum of the squares of the differences between the data and a line; that is, find the values of $a$ and $b$ that minimize the sum

$$R = \sum_{i=1}^{N} (a x_i + b - y_i)^2. \qquad (1)$$

Minimizing this sum is called least squares minimization. The minimum of the sum above will occur at a critical point. By the first derivative test from calculus, we can find critical points by solving the system

$$\frac{\partial R}{\partial a} = 0, \qquad \frac{\partial R}{\partial b} = 0. \qquad (2)$$

Specifically, finding the partial derivatives and rewriting gives

$$a \sum x_i^2 + b \sum x_i = \sum x_i y_i, \qquad a \sum x_i + N b = \sum y_i. \qquad (3)$$

Since the values $\sum x_i$, $\sum x_i^2$, $\sum x_i y_i$, and $\sum y_i$ can be computed from the data, the above is a system of 2 equations in 2 unknowns. These equations are known as the normal equations.

For example, consider performing linear regression on the data points $(1, 10.1)$, $(2, 10.4)$, $(3, 10.9)$, $(4, 10.8)$, $(5, 11.0)$. Then, the normal equations in Eq. (3) become

$$55a + 15b = 161.8, \qquad 15a + 5b = 53.2.$$

Solving this system gives $a = 0.22$, $b = 9.98$. So, the regression line (the line of "best fit") for the above data is $y = 0.22x + 9.98$.
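To make the arithmetic concrete, here is a minimal Python sketch (illustrative only, not part of the original notes; it assumes NumPy is available) that forms the sums appearing in Eq. (3) for the example data and solves the resulting 2-by-2 system. It reproduces $a = 0.22$ and $b = 9.98$.

```python
import numpy as np

# Example data from the text: (1, 10.1), (2, 10.4), (3, 10.9), (4, 10.8), (5, 11.0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.1, 10.4, 10.9, 10.8, 11.0])
N = len(x)

# Sums appearing in the normal equations, Eq. (3)
Sxx = np.sum(x * x)   # 55
Sx  = np.sum(x)       # 15
Sxy = np.sum(x * y)   # 161.8
Sy  = np.sum(y)       # 53.2

# Solve  [Sxx Sx; Sx N] [a; b] = [Sxy; Sy]
a, b = np.linalg.solve([[Sxx, Sx], [Sx, N]], [Sxy, Sy])
print(a, b)  # 0.22  9.98
```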

The normal equations in Eq. (3) can be solved to obtain general expressions for $a$ and $b$. These are the nasty formulas that are often taught in traditional, elementary statistics courses. The formulas are:

$$a = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - \left(\sum x_i\right)^2}, \qquad b = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{N \sum x_i^2 - \left(\sum x_i\right)^2}.$$

Using matrices makes dealing with the normal equations a little easier. For example, we can write the normal equations in Eq. (3) as

$$\begin{pmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & N \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum x_i y_i \\ \sum y_i \end{pmatrix}.$$

To solve this system, we must invert the 2-by-2 coefficient matrix and multiply it on both sides to get

$$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & N \end{pmatrix}^{-1} \begin{pmatrix} \sum x_i y_i \\ \sum y_i \end{pmatrix}.$$

So, the real work in solving this system is in inverting (if possible) the coefficient matrix. Most applications of linear regression involve a large number of data points, and hence, a simple way of computing the coefficient matrix is useful. The standard approach is to create the coefficient matrix from another matrix which is defined using the data as follows:

$$A = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ x_3 & 1 \\ \vdots & \vdots \\ x_N & 1 \end{pmatrix}.$$

Then, we can write

$$A^t A = \begin{pmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & N \end{pmatrix}, \qquad A^t Y = \begin{pmatrix} \sum x_i y_i \\ \sum y_i \end{pmatrix}.$$

(Be sure to check by hand that this is true.) Thus, the normal equations can be written

$$A^t A \, W = A^t Y \qquad (4)$$

with

$$W = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{pmatrix}.$$
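The identities for $A^t A$ and $A^t Y$ are easy to verify numerically. The following Python sketch (again illustrative, not from the original notes) builds $A$ for the earlier example and checks that $A^t A$ and $A^t Y$ match the sums used in the normal equations, then solves Eq. (4).

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.1, 10.4, 10.9, 10.8, 11.0])

# Design matrix: a column of x values and a column of ones
A = np.column_stack([x, np.ones_like(x)])

AtA = A.T @ A          # [[sum(x^2), sum(x)], [sum(x), N]]
AtY = A.T @ y          # [sum(x*y), sum(y)]
print(AtA)             # [[55. 15.] [15.  5.]]
print(AtY)             # [161.8  53.2]

# Normal equations A^t A W = A^t Y, Eq. (4)
W = np.linalg.solve(AtA, AtY)
print(W)               # [0.22  9.98]
```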

The solution to this matrix equation is then

$$W = (A^t A)^{-1} A^t Y, \qquad (5)$$

if $(A^t A)^{-1}$ exists.

Although the definition for $A$ appears to be arbitrarily chosen, there is a good reason behind it. Remember that for linear regression, we want to find $a$, $b$ so that we can reproduce $y_i$ from $x_i$ for each data point. Specifically, we want

$$a x_1 + b = y_1, \qquad a x_2 + b = y_2, \qquad a x_3 + b = y_3, \qquad \ldots, \qquad a x_N + b = y_N.$$

This system of equations can also be written using our definition of $A$, $W$, and $Y$ as $AW = Y$. To solve this system, we would like to multiply by $A^{-1}$ on both sides. BUT, $A$ is not a square matrix and cannot be inverted. So, we need to rewrite the system so that we have a square matrix involved. To do this, multiply both sides of the matrix equation by $A^t$ to obtain the matrix version of the normal equations, $A^t A \, W = A^t Y$. So, the definition of $A$ makes sense and works!
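In practice, Eq. (5) is usually evaluated without explicitly forming the inverse. The short Python sketch below (an illustration, not part of the original notes) computes $W$ for the example data both from Eq. (5) with an explicit inverse and with a least-squares solver applied directly to $AW = Y$; the two answers agree.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.1, 10.4, 10.9, 10.8, 11.0])
A = np.column_stack([x, np.ones_like(x)])

# Eq. (5): W = (A^t A)^{-1} A^t Y, written here with an explicit inverse
W_normal = np.linalg.inv(A.T @ A) @ (A.T @ y)

# The same least-squares problem A W ~ Y solved directly
W_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(W_normal)  # [0.22  9.98]
print(W_lstsq)   # [0.22  9.98]
```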

2 Multilinear Regression

With multilinear regression, we assume that the dependent data, $y_i$, depends linearly on several independent variables, $x_1, x_2, \ldots, x_k$. For the purposes of this discussion, assume that the given data depends only on two independent variables. So, data points are of the form $(x_{11}, x_{21}, y_1)$, $(x_{12}, x_{22}, y_2)$, $(x_{13}, x_{23}, y_3)$, $\ldots$, $(x_{1N}, x_{2N}, y_N)$. The goal is to minimize the sum

$$R = \sum_{i=1}^{N} (a_1 x_{1i} + a_2 x_{2i} + b - y_i)^2. \qquad (6)$$

Ideally, we want to find $a_1$, $a_2$, and $b$ so that

$$a_1 x_{11} + a_2 x_{21} + b = y_1, \qquad a_1 x_{12} + a_2 x_{22} + b = y_2, \qquad \ldots, \qquad a_1 x_{1N} + a_2 x_{2N} + b = y_N.$$

Rewrite this system as $AW = Y$, where

$$A = \begin{pmatrix} x_{11} & x_{21} & 1 \\ x_{12} & x_{22} & 1 \\ x_{13} & x_{23} & 1 \\ \vdots & \vdots & \vdots \\ x_{1N} & x_{2N} & 1 \end{pmatrix}, \qquad W = \begin{pmatrix} a_1 \\ a_2 \\ b \end{pmatrix}, \qquad Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{pmatrix}.$$

To solve this system, we would like to multiply by $A^{-1}$ on both sides. BUT, $A$ is not a square matrix and cannot be inverted. So, we need to rewrite the system so that we have a square matrix involved. To do this, multiply both sides of the matrix equation by $A^t$ to obtain

$$A^t A \, W = A^t Y.$$

Note that this equation is the same as Eq. (4); the difference lies only in how the coefficient matrix $A$ is created. Indeed, if you take the partial derivatives $\partial R/\partial a_1$, $\partial R/\partial a_2$, and $\partial R/\partial b$ as we did in linear regression, you will find that the equation above represents the normal equations in multilinear regression.

Now, $A^t A$ is a square matrix. We can multiply by its inverse on both sides of our system, if it exists. Thus, the solution for multilinear regression is

$$W = (A^t A)^{-1} A^t Y, \qquad (7)$$

if $(A^t A)^{-1}$ exists. The fact that the matrix equations for linear and multilinear regression appear the same makes this matrix approach very appealing. Also, there are lots of numerical methods for finding $(A^t A)^{-1}$ accurately.

Finally, consider the data set below as an example:

$$(0, 0.30, 10.14), \quad (0.69, 0.60, 11.93), \quad (1.10, 0.90, 13.57), \quad (1.39, 1.20, 14.17), \quad (1.61, 1.50, 15.25), \quad (1.79, 1.80, 16.15).$$

Then,

$$A = \begin{pmatrix} 0 & 0.30 & 1 \\ 0.69 & 0.60 & 1 \\ 1.10 & 0.90 & 1 \\ 1.39 & 1.20 & 1 \\ 1.61 & 1.50 & 1 \\ 1.79 & 1.80 & 1 \end{pmatrix}, \qquad Y = \begin{pmatrix} 10.14 \\ 11.93 \\ 13.57 \\ 14.17 \\ 15.25 \\ 16.15 \end{pmatrix},$$

and so

$$A^t A = \begin{pmatrix} 9.41 & 8.71 & 6.58 \\ 8.71 & 8.19 & 6.30 \\ 6.58 & 6.30 & 6.00 \end{pmatrix}.$$

Then,

$$W = \begin{pmatrix} a_1 \\ a_2 \\ b \end{pmatrix} = (A^t A)^{-1} A^t Y \approx \begin{pmatrix} 2.09 \\ 1.50 \\ 9.69 \end{pmatrix}.$$

Thus, the plane that best fits the data is $y = 2.09 x_1 + 1.50 x_2 + 9.69$.
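As a numerical check, here is a Python sketch (illustrative only, not part of the original notes; the data values are taken from the example above and should be treated as approximate) that builds $A$ for the two-variable example and solves the normal equations, Eq. (7).

```python
import numpy as np

# Example data (x1, x2, y) from the text
x1 = np.array([0.00, 0.69, 1.10, 1.39, 1.61, 1.79])
x2 = np.array([0.30, 0.60, 0.90, 1.20, 1.50, 1.80])
y  = np.array([10.14, 11.93, 13.57, 14.17, 15.25, 16.15])

# Design matrix with a column for each variable and a column of ones
A = np.column_stack([x1, x2, np.ones_like(x1)])

print(np.round(A.T @ A, 2))   # approx [[9.41 8.71 6.58] [8.71 8.19 6.30] [6.58 6.30 6.00]]

# Eq. (7): W = (A^t A)^{-1} A^t Y
W = np.linalg.solve(A.T @ A, A.T @ y)
print(np.round(W, 2))         # approx [2.08 1.49 9.69]
```

The coefficients printed here differ from those quoted above only in the last digit; the two predictor columns are strongly correlated, so the solution is sensitive to rounding in the data and in $A^t A$.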