# The Bivariate Normal Distribution

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 The Bivariate Normal Distribution This is Section 4.7 of the st edition (2002) of the book Introduction to Probability, by D. P. Bertsekas and J. N. Tsitsiklis. The material in this section was not included in the 2nd edition (2008). Let U and V be two independent normal random variables, and consider two new random variables X and Y of the form X = au + bv, Y = cu + dv, where a, b, c, d, are some scalars. Each one of the random variables X and Y is normal, since it is a linear function of independent normal random variables. Furthermore, because X and Y are linear functions of the same two independent normal random variables, their joint PDF takes a special form, known as the bivariate normal PDF. The bivariate normal PDF has several useful and elegant properties and, for this reason, it is a commonly employed model. In this section, we derive many such properties, both qualitative and analytical, culminating in a closed-form expression for the joint PDF. To keep the discussion simple, we restrict ourselves to the case where X and Y have zero mean. Jointly Normal Random Variables Two random variables X and Y are said to be jointly normal if they can beexpressedintheform X = au + bv, Y = cu + dv, where U and V are independent normal random variables. Note that if X and Y are jointly normal, then any linear combination Z = s X + s 2 Y For the purposes of this section, we adopt the following convention. A random variable which is always equal to a constant will also be called normal, with zero variance, even though it does not have a PDF. With this convention, the family of normal random variables is closed under linear operations. That is, if X is normal, then ax + b is also normal, even if a =0.

2 2 The Bivariate Normal Distribution has a normal distribution. The reason is that if we have X = au + bv and Y = cu + dv for some independent normal random variables U and V,then Z = s (au + bv )+s 2 (cu + dv )=(as + cs 2 )U +(bs + ds 2 )V. Thus, Z is the sum of the independent normal random variables (as + cs 2 )U and (bs + ds 2 )V, and is therefore normal. A very important property of jointly normal random variables, and which will be the starting point for our development, is that zero correlation implies independence. Zero Correlation Implies Independence If two random variables X and Y are jointly normal and are uncorrelated, then they are independent. This property can be verified using multivariate transforms, as follows. Suppose that U and V are independent zero-mean normal random variables, and that X = au + bv and Y = cu + dv,sothatx and Y are jointly normal. We assume that X and Y are uncorrelated, and we wish to show that they are independent. Our first step is to derive a formula for the multivariate transform M X,Y (s,s 2 ) associated with X and Y. Recall that if Z is a zero-mean normal random variable with variance σz 2, the associated transform is E[e sz ]=M Z (s) =e σ2 Z s2 /2, which implies that E[e Z ]=M Z () = e σ2 Z /2. Let us fix some scalars s, s 2,andletZ = s X + s 2 Y. The random variable Z is normal, by our earlier discussion, with variance σ 2 Z = s2 σ2 X + s2 2 σ2 Y. This leads to the following formula for the multivariate transform associated with the uncorrelated pair X and Y : M X,Y (s,s 2 )=E [e s X+s 2Y ] = E[e Z ] = e (s2 σ2 X +s2 2 σ2 Y )/2. Let now X and Y be independent zero-mean normal random variables with the same variances σx 2 and σ2 Y as X and Y, respectively. Since X and Y are independent, they are also uncorrelated, and the preceding argument yields M X,Y (s,s 2 )=e (s2 σ2 X +s2 2 σ2 Y )/2..

3 The Bivariate Normal Distribution 3 Thus, the two pairs of random variables (X, Y )and(x,y ) are associated with the same multivariate transform. Since the multivariate transform completely determines the joint PDF, it follows that the pair (X, Y ) has the same joint PDF as the pair (X,Y ). Since X and Y are independent, X and Y must also be independent, which establishes our claim. The Conditional Distribution of X Given Y We now turn to the problem of estimating X given the value of Y. To avoid uninteresting degenerate cases, we assume that both X and Y have positive variance. Let us define ˆX = ρ σ X Y, X = X ˆX, where ρ = E[XY ] σ X is the correlation coefficient of X and Y.SinceX and Y are linear combinations of independent normal random variables U and V, it follows that Y and X are also linear combinations of U and V.Inparticular,Y and X are jointly normal. Furthermore, E[Y X] =E[YX] E[Y ˆX] =ρσ X ρ σ X σ 2 Y =0. Thus, Y and X are uncorrelated and, therefore, independent. Since ˆX is a scalar multiple of Y, it follows that ˆX and X are independent. We have so far decomposed X into a sum of two independent normal random variables, namely, X = ˆX + X = ρ σ X Y + X. We take conditional expectations of both sides, given Y,toobtain E[X Y ]=ρ σ X E[Y Y ]+E[ X Y ]=ρ σ X Y = ˆX, where we have made use of the independence of Y and X to set E[ X Y ]=0. We have therefore reached the important conclusion that the conditional expectation E[X Y ] is a linear function of the random variable Y. Using the above decomposition, it is now easy to determine the conditional PDF of X. Given a value of Y, the random variable ˆX = ρσ X Y/ becomes Comparing with the formulas in the preceding section, it is seen that ˆX is defined to be the linear least squares estimator of X, and X is the corresponding estimation error, although these facts are not needed for the argument that follows.

4 4 The Bivariate Normal Distribution a known constant, but the normal distribution of the random variable X is unaffected, since X is independent of Y. Therefore, the conditional distribution of X given Y is the same as the unconditional distribution of X, shiftedby ˆX. Since X is normal with mean zero and some variance σ 2 X, we conclude that the conditional distribution of X is also normal with mean ˆX and the same variance σ 2 X. The variance of X can be found with the following calculation: σ 2 X = E [ ( X ρ σ ) ] 2 X Y = σx 2 2ρσ X ρσ X + ρ σ 2 σ2 X Y =( ρ 2 )σ 2 X, σy 2 σy 2 wherewehavemadeuseofthepropertye[xy]=ρσ X. We summarize our conclusions below. Although our discussion used the zero-mean assumption, these conclusions also hold for the non-zero mean case and we state them with this added generality; see the end-of-chapter problems. Properties of Jointly Normal Random Variables Let X and Y be jointly normal random variables. X and Y are independent if and only if they are uncorrelated. The conditional expectation of X given Y satisfies E[X Y ]=E[X]+ρ σ X ( Y E[Y ] ). It is a linear function of Y and has a normal PDF. The estimation error X = X E[X Y ] is zero-mean, normal, and independent of Y, with variance σ 2 X =( ρ 2 )σ 2 X. The conditional distribution of X given Y is normal with mean E[X Y ] and variance σ 2 X. TheFormoftheBivariateNormalPDF Having determined the parameters of the PDF of X and of the conditional PDF of X, we can give explicit formulas for these PDFs. We keep assuming that

5 The Bivariate Normal Distribution 5 X and Y have zero means and positive variances. Furthermore, to avoid the degenerate where X is identically zero, we assume that ρ <. We have f X( x) =f X Y ( x y) = 2π ρ2 σ X e x2 /2σ 2 X, and f X Y (x y) = ( x ρ σ ) 2 X y /2σ 2 e X, 2π ρ2 σ X where σ 2 X =( ρ 2 )σx 2. UsingalsotheformulaforthePDFofY, f Y (y) = 2πσY e y2 /2σ 2 Y, and the multiplication rule f X,Y (x, y) = f Y (y)f X Y (x y), we can obtain the joint PDF of X and Y. This PDF is of the form where the normalizing constant is f X,Y (x, y) =ce q(x,y), c = 2π ρ 2 σ X. The exponent term q(x, y) is a quadratic function of x and y, ( x ρ σ ) 2 X y q(x, y) = y2 2σY 2 + 2( ρ 2 )σx 2, which after some straightforward algebra simplifies to q(x, y) = x 2 σx 2 2ρ xy σ X 2( ρ 2 ) + y2 σ 2 Y. An important observation here is that the joint PDF is completely determined by σ X,, and ρ. In the special case where X and Y are uncorrelated (ρ = 0), the joint PDF takes the simple form f X,Y (x, y) = x2 e 2σ 2 y2 X 2σY 2, 2πσ X

6 6 The Bivariate Normal Distribution which is just the product of two independent normal PDFs. We can get some insight into the form of this PDF by considering its contours, i.e., sets of points at which the PDF takes a constant value. These contours are described by an equation of the form x 2 σx 2 + y2 σ 2 Y = constant, and are ellipses whose two axes are horizontal and vertical. In the more general case where X and Y are dependent, a typical contour is described by x 2 σx 2 2ρ xy σ X + y2 σ 2 Y = constant, and is again an ellipse, but its axes are no longer horizontal and vertical. Figure 4. illustrates the contours for two cases, one in which ρ is positive and one in which ρ is negative. y y x x Figure 4.: Contours of the bivariate normal PDF. The diagram on the left (respectively, right) corresponds to a case of positive (respectively, negative) correlation coefficient ρ. Example Suppose that X and Z are zero-mean jointly normal random variables, such that σx 2 =4,σZ 2 =7/9, and E[XZ] = 2. We define a new random variable Y =2X 3Z. We wish to determine the PDF of Y, the conditional PDF of X given Y,andthejointPDFofX and Y. As noted earlier, a linear function of two jointly normal random variables is also normal. Thus, Y is normal with variance σ 2 Y = E [ (2X 3Z) 2] =4E[X 2 ]+9E[Z 2 ] 2E[XZ]= =9. Hence, Y has the normal PDF f Y (y) = 2π 3 e y2 /8.

7 The Bivariate Normal Distribution 7 We next note that X and Y are jointly normal. The reason is that X and Z are linear functions of two independent normal random variables (by the definition of joint normality), so that X and Y are also linear functions of the same independent normal random variables. The covariance of X and Y is equal to E[XY ]=E [ X(2X 3Z) ] =2E[X 2 ] 3E[XZ] = =2. Hence, the correlation coefficient of X and Y, denoted by ρ, isequalto ρ = E[XY ] = 2 σ X 2 3 = 3. The conditional expectation of X given Y is E[X Y ]=ρ σx Y = Y = 2 9 Y. The conditional variance of X given Y (which is the same as the variance of X = X E[X Y ]) is ( σ 2 X =( ρ 2 )σx 2 = ) 4= , so that σ X = 32/3. Hence, the conditional PDF of X given Y is f X Y (x y) = 3 e 2π 32 ( x (2y/9) ) /9. Finally, the joint PDF of X and Y is obtained using either the multiplication rule f X,Y (x, y) =f X(x)f X Y (x y), or by using the earlier developed formula for the exponent q(x, y), and is equal to f X,Y (x, y) = 2π 32 e xy 2 3 2( (/9)). y x2 We end with a cautionary note. If X and Y are jointly normal, then each random variable X and Y is normal. However, the converse is not true. Namely, if each of the random variables X and Y is normal, it does not follow that they are jointly normal, even if they are uncorrelated. This is illustrated in the following example. Example Let X have a normal distribution with zero mean and unit variance. Let Z be independent of X, withp(z =)=P(Z = ) = /2. Let

8 8 The Bivariate Normal Distribution Y = ZX, which is also normal with zero mean. The reason is that conditioned on either value of Z, Y has the same normal distribution, hence its unconditional distribution is also normal. Furthermore, E[XY ]=E[ZX 2 ]=E[Z] E[X 2 ]=0 =0, so X and Y are uncorrelated. On the other hand X and Y are clearly dependent. (For example, if X =,theny must be either or.) IfX and Y were jointly normal, we would have a contradiction to our earlier conclusion that zero correlation implies independence. It follows that X and Y are not jointly normal, even though both marginal distributions are normal. The Multivariate Normal PDF The development in this section generalizes to the case of more than two random variables. For example, we can say that the random variables X,...,X n are jointly normal if all of them are linear functions of a set U,...,U n of independent normal random variables. We can then establish the natural extensions of the results derived in this section. For example, it is still true that zero correlation implies independence, that the conditional expectation of one random variable given some of the others is a linear function of the conditioning random variables, and that the conditional PDF of X,...,X k given X k+,...,x n is multivariate normal. Finally, there is a closed-form expression for the joint PDF. Assuming that none of the random variables is a deterministic function of the others, we have f X,...,X n = ce q(x,...,x n), where c is a normalizing constant and where q(x,...,x n ) is a quadratic function of x,...,x n that increases to infinity as the magnitude of the vector (x,...,x n ) tends to infinity. Multivariate normal models are very common in statistics, econometrics, signal processing, feedback control, and many other fields. However, a full development falls outside the scope of this text. Solved Problems on The Bivariate Normal Distribution Problem. Let X and X 2 be independent standard normal random variables. Define the random variables Y and Y 2 by Y =2X + X 2, Y 2 = X X 2. Find E[Y ], E[Y 2], cov(y,y 2), and the joint PDF f Y,Y 2. Solution. The means are given by E[Y ]=E[2X + X 2]=E[2X ]+E[X 2]=0, E[Y 2]=E[X X 2]=E[X ] E[X 2]=0.

9 The Bivariate Normal Distribution 9 The covariance is obtained as follows: cov(y,y 2)=E[Y Y ] E[Y ]E[Y 2] = E [ (2X + X 2) (X X ] 2) = E [ ] 2X 2 X X 2 X 2 2 =. The bivariate normal is determined by the means, the variances, and the correlation coefficient, so we need to calculate the variances. We have Similarly, σ 2 Y = E[Y 2 ] µ 2 Y = E[4X 2 +4X X 2 + X 2 2 ]=5. σ 2 Y 2 = E[Y 2 2 ] µ 2 Y 2 =5. Thus ρ(y cov(y,y2),y 2)= = 2 5. To write the joint PDF of Y and Y 2, we substitute the above values into the formula for the bivariate normal density function. Problem 2. The random variables X and Y are described by a joint PDF of the form f X,Y (x, y) =ce 8x2 6xy 8y 2. Find the means, variances, and the correlation coefficient of X and Y. Also, find the value of the constant c. Solution. We recognize this as a bivariate normal PDF, with zero means. By comparing 8x 2 +6xy +8y 2 with the exponent q(x, y) = of the bivariate normal, we obtain x 2 σx 2 2ρ xy σ X 2( ρ 2 ) + y2 σ 2 Y σ 2 X( ρ 2 )=/4, σ 2 Y ( ρ 2 )=/9, ( ρ 2 )σ X = ρ/3. From the first two equations, we have ( ρ 2 )σ X =/6, which implies that ρ = /2. Thus, σ 2 X =/3, and σ 2 Y =4/27. Finally, c = 2π ρ 2 σ X = 27 π.

10 0 The Bivariate Normal Distribution Problem 3. Suppose that X and Y are independent normal random variables with the same variance. Show that X Y and X + Y are independent. Solution. It suffices to show that the zero-mean jointly normal random variables X Y E[X Y ]andx + Y E[X + Y ] are independent. We can therefore, without loss of generality, assume that X and Y have zero mean. To prove independence, under the zero-mean assumption, it suffices to show that the covariance of X Y and X + Y is zero. Indeed, cov(x Y,X + Y )=E [ (X Y )(X + Y ) ] = E[X 2 ] E[Y 2 ]=0, since X and Y were assumed to have the same variance. Problem 4. The coordinates X and Y of a point are independent zero-mean normal random variables with common variance σ 2. Given that the point is at a distance of at least c from the origin, find the conditional joint PDF of X and Y. Solution. Let C denote the event that X 2 + Y 2 >c 2. The probability P(C) canbe calculated using polar coordinates, as follows: Thus, for (x, y) C, f X,Y C (x, y) = P(C) = 2πσ 2 2π 0 = re r2 /2σ 2 dr σ 2 c = e c2 /2σ 2. fx,y (x, y) P(C) c = re r2 /2σ 2 dr dθ 2πσ 2 e 2σ 2 (x2 + y 2 c 2 ). Problem 5.* Suppose that X and Y are jointly normal random variables. Show that E[X Y ]=E[X]+ρ σx ( Y E[Y ] ). Hint: Consider the random variables X E[X] and Y E[Y ] and use the result established in the text for the zero-mean case. Solution. Let X = X E[X] andỹ = Y E[Y ]. The random variables X and Ỹ are jointly normal. This is because if X and Y are linear functions of two independent normal random variables U and V,then X and Ỹ are also linear functions of U and V. Therefore, as established in the text, E[ X Ỹ ]=ρ( X,Ỹ ) σ X Ỹ. Note that conditioning on Ỹ is the same as conditioning on Y. Therefore, σỹ E[ X Ỹ ]=E[ X Y ]=E[X Y ] E[X].

11 The Bivariate Normal Distribution Since X and X only differ by a constant, we have σ X = σ X and, similarly, σỹ =. Finally, cov( X,Ỹ )=E[ XỸ ]=E[( X E[X] )( Y E[Y ] )] =cov(x, Y ), from which it follows that ρ( X,Ỹ )=ρ(x, Y ). The desired formula follows by substituting the above relations in the formula for E[ X Ỹ ]. Problem 6.* (a) Let X,X 2,...,X n be independent identically distributed random variables and let Y = X + X X n. Show that E[X Y ]= Y n. (b) Let X and W be independent zero-mean normal random variables, with positive integer variances k and m, respectively. Use the result of part (a) to find E[X X + W ], and verify that this agrees with the conditional expectation formula for jointly normal random variables given in the text. Hint: Think of X and W as sums of independent random variables. Solution. (a) By symmetry, we see that E[X i Y ]isthesameforalli. Furthermore, E[X + + X n Y ]=E[Y Y ]=Y. Therefore, E[X Y ]=Y/n. (b) We can think of X and W as sums of independent standard normal random variables: X = X + + X k, W = W + + W m. We identify Y with X + W and use the result from part (a), to obtain E[X i X + W ]= X + W k + m. Thus, E[X X + W ]=E[X + + X k X + W ]= k (X + W ). k + m This formula agrees with the formula derived in the text because We have used here the property σ X ρ(x, X + W ) = cov(x, X + W ) σ X+W σ 2 X+W = k k + m. cov(x, X + W )=E [ X(X + W ) ] = E[X 2 ]=k.

### SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

### a 1 x + a 0 =0. (3) ax 2 + bx + c =0. (4)

ROOTS OF POLYNOMIAL EQUATIONS In this unit we discuss polynomial equations. A polynomial in x of degree n, where n 0 is an integer, is an expression of the form P n (x) =a n x n + a n 1 x n 1 + + a 1 x

### Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

### CS229 Lecture notes. Andrew Ng

CS229 Lecture notes Andrew Ng Part X Factor analysis Whenwehavedatax (i) R n thatcomesfromamixtureofseveral Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually

Factoring the trinomial ax 2 + bx + c when a = 1 A trinomial in the form x 2 + bx + c can be factored to equal (x + m)(x + n) when the product of m x n equals c and the sum of m + n equals b. (Note: the

### α = u v. In other words, Orthogonal Projection

Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v

### SOLVING POLYNOMIAL EQUATIONS

C SOLVING POLYNOMIAL EQUATIONS We will assume in this appendix that you know how to divide polynomials using long division and synthetic division. If you need to review those techniques, refer to an algebra

### 1.5. Factorisation. Introduction. Prerequisites. Learning Outcomes. Learning Style

Factorisation 1.5 Introduction In Block 4 we showed the way in which brackets were removed from algebraic expressions. Factorisation, which can be considered as the reverse of this process, is dealt with

### Statistics 100A Homework 8 Solutions

Part : Chapter 7 Statistics A Homework 8 Solutions Ryan Rosario. A player throws a fair die and simultaneously flips a fair coin. If the coin lands heads, then she wins twice, and if tails, the one-half

### Algebra I Vocabulary Cards

Algebra I Vocabulary Cards Table of Contents Expressions and Operations Natural Numbers Whole Numbers Integers Rational Numbers Irrational Numbers Real Numbers Absolute Value Order of Operations Expression

### Adding vectors We can do arithmetic with vectors. We ll start with vector addition and related operations. Suppose you have two vectors

1 Chapter 13. VECTORS IN THREE DIMENSIONAL SPACE Let s begin with some names and notation for things: R is the set (collection) of real numbers. We write x R to mean that x is a real number. A real number

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### 1 Sufficient statistics

1 Sufficient statistics A statistic is a function T = rx 1, X 2,, X n of the random sample X 1, X 2,, X n. Examples are X n = 1 n s 2 = = X i, 1 n 1 the sample mean X i X n 2, the sample variance T 1 =

### 1 Portfolio mean and variance

Copyright c 2005 by Karl Sigman Portfolio mean and variance Here we study the performance of a one-period investment X 0 > 0 (dollars) shared among several different assets. Our criterion for measuring

### Section 13.5 Equations of Lines and Planes

Section 13.5 Equations of Lines and Planes Generalizing Linear Equations One of the main aspects of single variable calculus was approximating graphs of functions by lines - specifically, tangent lines.

### Chapter 20. Vector Spaces and Bases

Chapter 20. Vector Spaces and Bases In this course, we have proceeded step-by-step through low-dimensional Linear Algebra. We have looked at lines, planes, hyperplanes, and have seen that there is no limit

### 6 EXTENDING ALGEBRA. 6.0 Introduction. 6.1 The cubic equation. Objectives

6 EXTENDING ALGEBRA Chapter 6 Extending Algebra Objectives After studying this chapter you should understand techniques whereby equations of cubic degree and higher can be solved; be able to factorise

### CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

We Can Early Learning Curriculum PreK Grades 8 12 INSIDE ALGEBRA, GRADES 8 12 CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA April 2016 www.voyagersopris.com Mathematical

### problem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved

4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a non-random

### Sections 2.11 and 5.8

Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and

### Copyrighted Material. Chapter 1 DEGREE OF A CURVE

Chapter 1 DEGREE OF A CURVE Road Map The idea of degree is a fundamental concept, which will take us several chapters to explore in depth. We begin by explaining what an algebraic curve is, and offer two

### FACTORING POLYNOMIALS IN THE RING OF FORMAL POWER SERIES OVER Z

FACTORING POLYNOMIALS IN THE RING OF FORMAL POWER SERIES OVER Z DANIEL BIRMAJER, JUAN B GIL, AND MICHAEL WEINER Abstract We consider polynomials with integer coefficients and discuss their factorization

### Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

### Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test

Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important

### FACTORING ax 2 bx c. Factoring Trinomials with Leading Coefficient 1

5.7 Factoring ax 2 bx c (5-49) 305 5.7 FACTORING ax 2 bx c In this section In Section 5.5 you learned to factor certain special polynomials. In this section you will learn to factor general quadratic polynomials.

### Algebraic expressions are a combination of numbers and variables. Here are examples of some basic algebraic expressions.

Page 1 of 13 Review of Linear Expressions and Equations Skills involving linear equations can be divided into the following groups: Simplifying algebraic expressions. Linear expressions. Solving linear

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### 1.3. DOT PRODUCT 19. 6. If θ is the angle (between 0 and π) between two non-zero vectors u and v,

1.3. DOT PRODUCT 19 1.3 Dot Product 1.3.1 Definitions and Properties The dot product is the first way to multiply two vectors. The definition we will give below may appear arbitrary. But it is not. It

### CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation

Chapter 2 CONTROLLABILITY 2 Reachable Set and Controllability Suppose we have a linear system described by the state equation ẋ Ax + Bu (2) x() x Consider the following problem For a given vector x in

### Solutions for Review Problems

olutions for Review Problems 1. Let be the triangle with vertices A (,, ), B (4,, 1) and C (,, 1). (a) Find the cosine of the angle BAC at vertex A. (b) Find the area of the triangle ABC. (c) Find a vector

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Does Black-Scholes framework for Option Pricing use Constant Volatilities and Interest Rates? New Solution for a New Problem

Does Black-Scholes framework for Option Pricing use Constant Volatilities and Interest Rates? New Solution for a New Problem Gagan Deep Singh Assistant Vice President Genpact Smart Decision Services Financial

### ( ) is proportional to ( 10 + x)!2. Calculate the

PRACTICE EXAMINATION NUMBER 6. An insurance company eamines its pool of auto insurance customers and gathers the following information: i) All customers insure at least one car. ii) 64 of the customers

### Introduction to Algebraic Geometry. Bézout s Theorem and Inflection Points

Introduction to Algebraic Geometry Bézout s Theorem and Inflection Points 1. The resultant. Let K be a field. Then the polynomial ring K[x] is a unique factorisation domain (UFD). Another example of a

### LINEAR ALGEBRA W W L CHEN

LINEAR ALGEBRA W W L CHEN c W W L Chen, 1997, 2008 This chapter is available free to all individuals, on understanding that it is not to be used for financial gain, and may be downloaded and/or photocopied,

### THREE DIMENSIONAL GEOMETRY

Chapter 8 THREE DIMENSIONAL GEOMETRY 8.1 Introduction In this chapter we present a vector algebra approach to three dimensional geometry. The aim is to present standard properties of lines and planes,

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

### 6.1 Add & Subtract Polynomial Expression & Functions

6.1 Add & Subtract Polynomial Expression & Functions Objectives 1. Know the meaning of the words term, monomial, binomial, trinomial, polynomial, degree, coefficient, like terms, polynomial funciton, quardrtic

### Solutions to old Exam 1 problems

Solutions to old Exam 1 problems Hi students! I am putting this old version of my review for the first midterm review, place and time to be announced. Check for updates on the web site as to which sections

### 9 Multiplication of Vectors: The Scalar or Dot Product

Arkansas Tech University MATH 934: Calculus III Dr. Marcel B Finan 9 Multiplication of Vectors: The Scalar or Dot Product Up to this point we have defined what vectors are and discussed basic notation

### 3 Orthogonal Vectors and Matrices

3 Orthogonal Vectors and Matrices The linear algebra portion of this course focuses on three matrix factorizations: QR factorization, singular valued decomposition (SVD), and LU factorization The first

### List the elements of the given set that are natural numbers, integers, rational numbers, and irrational numbers. (Enter your answers as commaseparated

MATH 142 Review #1 (4717995) Question 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 Description This is the review for Exam #1. Please work as many problems as possible

### Lectures on Stochastic Processes. William G. Faris

Lectures on Stochastic Processes William G. Faris November 8, 2001 2 Contents 1 Random walk 7 1.1 Symmetric simple random walk................... 7 1.2 Simple random walk......................... 9 1.3

### Solving Quadratic Equations by Factoring

4.7 Solving Quadratic Equations by Factoring 4.7 OBJECTIVE 1. Solve quadratic equations by factoring The factoring techniques you have learned provide us with tools for solving equations that can be written

### December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation

### Factoring Polynomials

Factoring Polynomials Any Any Any natural number that that that greater greater than than than 1 1can can 1 be can be be factored into into into a a a product of of of prime prime numbers. For For For

### Section 9.5: Equations of Lines and Planes

Lines in 3D Space Section 9.5: Equations of Lines and Planes Practice HW from Stewart Textbook (not to hand in) p. 673 # 3-5 odd, 2-37 odd, 4, 47 Consider the line L through the point P = ( x, y, ) that

### VERTICES OF GIVEN DEGREE IN SERIES-PARALLEL GRAPHS

VERTICES OF GIVEN DEGREE IN SERIES-PARALLEL GRAPHS MICHAEL DRMOTA, OMER GIMENEZ, AND MARC NOY Abstract. We show that the number of vertices of a given degree k in several kinds of series-parallel labelled

### Chapter 9 Descriptive Statistics for Bivariate Data

9.1 Introduction 215 Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to eplore the distribution of the values of a single variable)

### Introduction: Overview of Kernel Methods

Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University

### Lecture 13: Martingales

Lecture 13: Martingales 1. Definition of a Martingale 1.1 Filtrations 1.2 Definition of a martingale and its basic properties 1.3 Sums of independent random variables and related models 1.4 Products of

### Understanding Basic Calculus

Understanding Basic Calculus S.K. Chung Dedicated to all the people who have helped me in my life. i Preface This book is a revised and expanded version of the lecture notes for Basic Calculus and other

### Algebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 2012-13 school year.

This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Algebra

### 9 MATRICES AND TRANSFORMATIONS

9 MATRICES AND TRANSFORMATIONS Chapter 9 Matrices and Transformations Objectives After studying this chapter you should be able to handle matrix (and vector) algebra with confidence, and understand the

### Average Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation

Average Redistributional Effects IFAI/IZA Conference on Labor Market Policy Evaluation Geert Ridder, Department of Economics, University of Southern California. October 10, 2006 1 Motivation Most papers

### e.g. arrival of a customer to a service station or breakdown of a component in some system.

Poisson process Events occur at random instants of time at an average rate of λ events per second. e.g. arrival of a customer to a service station or breakdown of a component in some system. Let N(t) be

### Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

### E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

### Tiers, Preference Similarity, and the Limits on Stable Partners

Tiers, Preference Similarity, and the Limits on Stable Partners KANDORI, Michihiro, KOJIMA, Fuhito, and YASUDA, Yosuke February 7, 2010 Preliminary and incomplete. Do not circulate. Abstract We consider

### LINEAR ALGEBRA W W L CHEN

LINEAR ALGEBRA W W L CHEN c W W L Chen, 1982, 2008. This chapter originates from material used by author at Imperial College, University of London, between 1981 and 1990. It is available free to all individuals,

### SOLUTIONS. f x = 6x 2 6xy 24x, f y = 3x 2 6y. To find the critical points, we solve

SOLUTIONS Problem. Find the critical points of the function f(x, y = 2x 3 3x 2 y 2x 2 3y 2 and determine their type i.e. local min/local max/saddle point. Are there any global min/max? Partial derivatives

### Factoring Trinomials: The ac Method

6.7 Factoring Trinomials: The ac Method 6.7 OBJECTIVES 1. Use the ac test to determine whether a trinomial is factorable over the integers 2. Use the results of the ac test to factor a trinomial 3. For

### Linear algebra and the geometry of quadratic equations. Similarity transformations and orthogonal matrices

MATH 30 Differential Equations Spring 006 Linear algebra and the geometry of quadratic equations Similarity transformations and orthogonal matrices First, some things to recall from linear algebra Two

### PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5.

PUTNAM TRAINING POLYNOMIALS (Last updated: November 17, 2015) Remark. This is a list of exercises on polynomials. Miguel A. Lerma Exercises 1. Find a polynomial with integral coefficients whose zeros include

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### 3.3. Solving Polynomial Equations. Introduction. Prerequisites. Learning Outcomes

Solving Polynomial Equations 3.3 Introduction Linear and quadratic equations, dealt within Sections 3.1 and 3.2, are members of a class of equations, called polynomial equations. These have the general

### 13.4 THE CROSS PRODUCT

710 Chapter Thirteen A FUNDAMENTAL TOOL: VECTORS 62. Use the following steps and the results of Problems 59 60 to show (without trigonometry) that the geometric and algebraic definitions of the dot product

### Computing divisors and common multiples of quasi-linear ordinary differential equations

Computing divisors and common multiples of quasi-linear ordinary differential equations Dima Grigoriev CNRS, Mathématiques, Université de Lille Villeneuve d Ascq, 59655, France Dmitry.Grigoryev@math.univ-lille1.fr

### FACTORING POLYNOMIALS

296 (5-40) Chapter 5 Exponents and Polynomials where a 2 is the area of the square base, b 2 is the area of the square top, and H is the distance from the base to the top. Find the volume of a truncated

### Math 241, Exam 1 Information.

Math 241, Exam 1 Information. 9/24/12, LC 310, 11:15-12:05. Exam 1 will be based on: Sections 12.1-12.5, 14.1-14.3. The corresponding assigned homework problems (see http://www.math.sc.edu/ boylan/sccourses/241fa12/241.html)

### Clustering in the Linear Model

Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple

### Definitions 1. A factor of integer is an integer that will divide the given integer evenly (with no remainder).

Math 50, Chapter 8 (Page 1 of 20) 8.1 Common Factors Definitions 1. A factor of integer is an integer that will divide the given integer evenly (with no remainder). Find all the factors of a. 44 b. 32

### Lecture 05: Mean-Variance Analysis & Capital Asset Pricing Model (CAPM)

Lecture 05: Mean-Variance Analysis & Capital Asset Pricing Model (CAPM) Prof. Markus K. Brunnermeier Slide 05-1 Overview Simple CAPM with quadratic utility functions (derived from state-price beta model)

### Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

### CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION

No: CITY UNIVERSITY LONDON BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION ENGINEERING MATHEMATICS 2 (resit) EX2005 Date: August

### Schooling, Political Participation, and the Economy. (Online Supplementary Appendix: Not for Publication)

Schooling, Political Participation, and the Economy Online Supplementary Appendix: Not for Publication) Filipe R. Campante Davin Chor July 200 Abstract In this online appendix, we present the proofs for

### minimal polyonomial Example

Minimal Polynomials Definition Let α be an element in GF(p e ). We call the monic polynomial of smallest degree which has coefficients in GF(p) and α as a root, the minimal polyonomial of α. Example: We

### is in plane V. However, it may be more convenient to introduce a plane coordinate system in V.

.4 COORDINATES EXAMPLE Let V be the plane in R with equation x +2x 2 +x 0, a two-dimensional subspace of R. We can describe a vector in this plane by its spatial (D)coordinates; for example, vector x 5

### Chapter 6. Linear Transformation. 6.1 Intro. to Linear Transformation

Chapter 6 Linear Transformation 6 Intro to Linear Transformation Homework: Textbook, 6 Ex, 5, 9,, 5,, 7, 9,5, 55, 57, 6(a,b), 6; page 7- In this section, we discuss linear transformations 89 9 CHAPTER

### it is easy to see that α = a

21. Polynomial rings Let us now turn out attention to determining the prime elements of a polynomial ring, where the coefficient ring is a field. We already know that such a polynomial ring is a UF. Therefore

FACTORING QUADRATIC EQUATIONS Summary 1. Difference of squares... 1 2. Mise en évidence simple... 2 3. compounded factorization... 3 4. Exercises... 7 The goal of this section is to summarize the methods

### SECTION 0.6: POLYNOMIAL, RATIONAL, AND ALGEBRAIC EXPRESSIONS

(Section 0.6: Polynomial, Rational, and Algebraic Expressions) 0.6.1 SECTION 0.6: POLYNOMIAL, RATIONAL, AND ALGEBRAIC EXPRESSIONS LEARNING OBJECTIVES Be able to identify polynomial, rational, and algebraic

### Notes on Factoring. MA 206 Kurt Bryan

The General Approach Notes on Factoring MA 26 Kurt Bryan Suppose I hand you n, a 2 digit integer and tell you that n is composite, with smallest prime factor around 5 digits. Finding a nontrivial factor

### Discrete Convolution and the Discrete Fourier Transform

Discrete Convolution and the Discrete Fourier Transform Discrete Convolution First of all we need to introduce what we might call the wraparound convention Because the complex numbers w j e i πj N have

### Section 1.1. Introduction to R n

The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to

### Exclusive OR (XOR) and hardware random number generators

Exclusive OR (XOR) and hardware random number generators Robert B Davies February 28, 2002 1 Introduction The exclusive or (XOR) operation is commonly used to reduce the bias from the bits generated by

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### College of the Holy Cross, Spring 2009 Math 373, Partial Differential Equations Midterm 1 Practice Questions

College of the Holy Cross, Spring 29 Math 373, Partial Differential Equations Midterm 1 Practice Questions 1. (a) Find a solution of u x + u y + u = xy. Hint: Try a polynomial of degree 2. Solution. Use

### 3.2 Sources, Sinks, Saddles, and Spirals

3.2. Sources, Sinks, Saddles, and Spirals 6 3.2 Sources, Sinks, Saddles, and Spirals The pictures in this section show solutions to Ay 00 C By 0 C Cy D 0. These are linear equations with constant coefficients

### Master s Theory Exam Spring 2006

Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

### 4 The M/M/1 queue. 4.1 Time-dependent behaviour

4 The M/M/1 queue In this chapter we will analyze the model with exponential interarrival times with mean 1/λ, exponential service times with mean 1/µ and a single server. Customers are served in order

### Multivariate Normal Distribution

Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

### 7: The CRR Market Model

Ben Goldys and Marek Rutkowski School of Mathematics and Statistics University of Sydney MATH3075/3975 Financial Mathematics Semester 2, 2015 Outline We will examine the following issues: 1 The Cox-Ross-Rubinstein

### DEFINITION 5.1.1 A complex number is a matrix of the form. x y. , y x

Chapter 5 COMPLEX NUMBERS 5.1 Constructing the complex numbers One way of introducing the field C of complex numbers is via the arithmetic of matrices. DEFINITION 5.1.1 A complex number is a matrix of

### INTRODUCTION TO ALGEBRAIC GEOMETRY, CLASS 24

INTRODUCTION TO ALGEBRAIC GEOMETRY, CLASS 24 RAVI VAKIL Contents 1. Degree of a line bundle / invertible sheaf 1 1.1. Last time 1 1.2. New material 2 2. The sheaf of differentials of a nonsingular curve