CITY AND REGIONAL PLANNING: Setting Up Models
Philip A. Viton
January 5, 2010
Contents

1 Introduction
2 What is conditional expectation?
3 The Modelling Setup
4 Understanding the Assumptions
5 Estimation
6 Proofs

Warning: This is likely to contain errors and inconsistencies. Please bring any you find to my attention.

1 Introduction

Here are some questions planners might want to investigate:

- Theory tells us that a cost function depends on output and factor prices. How do total costs change as output increases? Or, more specifically, how do total costs change per unit increase in output, holding factor prices constant?
- If the housing capitalization hypothesis is correct, and assuming that kids attend local schools, then housing prices should increase with the quality of local schools. What is that impact? That is, if school quality goes up by one unit, what will happen to housing prices, holding other determinants of house prices constant? Or, in the jargon that is often used, what is the marginal effect (on housing prices) of a change in school quality?

- Demand theory suggests that increases in the price of gasoline should make it more likely that a commuter will choose to travel by public transit. How much more likely? That is, what is the impact on the choice probability of a unit increase in the price of gasoline, holding other characteristics of autos and transit constant? What is the cross-price elasticity of the demand for public transit?

What is common to all these questions is a focus on the impact of one variable on another, holding everything else constant. How are we to formalize the notion of holding everything else constant? For many researchers, that is captured by the conditional expectation function.

2 What is conditional expectation?

Consider two discrete random variables: x, whose possible values are (1, 2, 5, 7), and y, whose possible values are (-1, 0, 2), and assume that their joint probability density function is

    p(y, x): [table of values over x = 1, 2, 5, 7 (columns) and y = -1, 0, 2 (rows)]

For example, p(0, 5) = p(y = 0, x = 5) is the probability of observing the value y = 0 together with x = 5. Note that Σ_{x,y} p(y, x) = 1.

The marginal distributions are obtained by summation: p(x) = Σ_y p(y, x) and p(y) = Σ_x p(y, x). Thus, for example, we have p(x = 1) = 0.25: this is the column-sum of the p(y, x) array corresponding to the value x = 1. Similarly, p(x = 2) = 0.330, and p(x = 5) and p(x = 7) are obtained the same way.
For each possible value of x we have a distribution (density) of the y values: this is the conditional density of y given x. By definition (assuming that p(x) > 0) this is

    p(y | x) = p(y, x) / p(x)

and for our example it can be tabulated as

    p(y | x): [table of values over x = 1, 2, 5, 7 and y = -1, 0, 2]

from which we can read off, for example, p(y = 0 | x = 5). Note that Σ_y p(y | x) = 1: the column sums of the p(y | x) array are all 1.

For each possible value of x we can compute the expected value of y given that x takes the value in question. This is the conditional expectation of y, given that x takes the value in question, and it is obtained in the usual way: we sum the products of the possible values of y, each weighted by its conditional density value. For example, for x = 1, the conditional expectation of y is

    E(y | x = 1) = (-1 · p(y = -1 | x = 1)) + (0 · p(y = 0 | x = 1)) + (2 · p(y = 2 | x = 1))

or, more generally,

    E(y | x = x_0) = Σ_y y · p(y | x = x_0)

For our example, the four values E(y | x = 1), E(y | x = 2), E(y | x = 5) and E(y | x = 7) would appear as the last row of the conditional-density table.
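The construction above can be sketched in code. Since the table of p(y, x) values is not reproduced here, the probabilities below are illustrative stand-ins over the same supports, not the values from the notes (they sum to 1 and are chosen so that p(x = 1) = 0.25, as in the text):

```python
# Conditional expectation for a discrete joint pmf, mirroring the text's
# construction. The probability values are ILLUSTRATIVE stand-ins, not
# the values from the notes' own table.
from fractions import Fraction as F

# joint pmf p(y, x) over x in {1, 2, 5, 7} and y in {-1, 0, 2}
p = {
    (-1, 1): F(1, 10), (0, 1): F(1, 10), (2, 1): F(1, 20),
    (-1, 2): F(1, 20), (0, 2): F(3, 20), (2, 2): F(1, 10),
    (-1, 5): F(1, 10), (0, 5): F(1, 20), (2, 5): F(1, 10),
    (-1, 7): F(1, 20), (0, 7): F(1, 10), (2, 7): F(1, 20),
}
assert sum(p.values()) == 1

def p_x(x0):
    """Marginal p(x): the column-sum of the p(y, x) array at x = x0."""
    return sum(pr for (y, x), pr in p.items() if x == x0)

def p_y_given_x(y0, x0):
    """Conditional density p(y | x) = p(y, x) / p(x)."""
    return p[(y0, x0)] / p_x(x0)

def cef(x0):
    """E(y | x = x0) = sum over y of y * p(y | x = x0)."""
    return sum(y * p_y_given_x(y, x0) for y in (-1, 0, 2))

for x0 in (1, 2, 5, 7):
    # each conditional density sums to 1 (column sums of p(y | x))
    assert sum(p_y_given_x(y, x0) for y in (-1, 0, 2)) == 1
print({x0: cef(x0) for x0 in (1, 2, 5, 7)})
```

Exact rational arithmetic via `fractions` keeps the density and column-sum checks exact rather than approximate.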
We now generalize this to consider the values of the conditional expectations as a group: the result is the conditional expectation function, E(y | x), telling us the expected (= average) value of y given any value of x. Note that this is a function only of x, since its value depends on which x value we are talking about, and we average over y values: E(y | x) = h(x). (Of course, a parallel development leads to another conditional expectation function, E(x | y): there is nothing sacrosanct about the E(y | x) we have been studying.) In this respect, the notation E(y | x) can be a bit confusing: some people prefer to write E_y(y | x) to remind us of what is being summed (integrated) over.

It should now be apparent that the conditional expectation function formalizes our intuitive notion of the way the average value of y varies, given a particular value of x. The conditional expectation function (CEF) has some useful properties, summarized in the following (proofs in the final section).

Theorem 1 (Properties of the CEF). If z = h(x, y):

a. E(z) = E(E(z | x))
b. E(y) = E(E(y | x))
c. E(xy) = E(x · E(y | x))
d. Cov(x, E(y | x)) = Cov(x, y)

The first two results are often referred to as the Law of Iterated Expectations. (Clearly, (b) follows from (a) by taking z = y.)

So far, so good. But what is the connection to statistical modelling?

3 The Modelling Setup

We are given a dependent random variable y and we assume that theory (or intuition) tells us that it depends on (is partly explained by) a k × 1 vector of independent random variables x. As we've argued, we want to be able to investigate the part of y that is explained by x, and this leads us to focus on the conditional expectation E(y | x). We can always decompose y into the part that is explained by x and the rest, and we therefore write

    y = E(y | x) + u
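Properties (b) and (d) of Theorem 1 can be checked exactly on a small discrete distribution; the pmf below is made up for the purpose:

```python
# Numerical check of CEF properties (b) and (d) on a small, made-up
# discrete joint pmf p(y, x).
from fractions import Fraction as F

p = {(-1, 1): F(1, 4), (2, 1): F(1, 4), (-1, 2): F(1, 8), (2, 2): F(3, 8)}

xs = sorted({x for (_, x) in p})
ys = sorted({y for (y, _) in p})
p_x = {x0: sum(pr for (y, x), pr in p.items() if x == x0) for x0 in xs}

def E(g):
    """Expectation of g(y, x) under the joint pmf."""
    return sum(g(y, x) * pr for (y, x), pr in p.items())

# the CEF, tabulated: E(y | x = x0) for each possible x0
cef = {x0: sum(y * p[(y, x0)] / p_x[x0] for y in ys if (y, x0) in p)
       for x0 in xs}

# (b) Law of Iterated Expectations: E(y) = E(E(y | x))
assert E(lambda y, x: y) == sum(cef[x0] * p_x[x0] for x0 in xs)

# (d) Cov(x, E(y | x)) = Cov(x, y)
Ex, Ey = E(lambda y, x: x), E(lambda y, x: y)
cov_xy = E(lambda y, x: x * y) - Ex * Ey
cov_x_cef = (E(lambda y, x: x * cef[x])
             - Ex * sum(cef[x0] * p_x[x0] for x0 in xs))
assert cov_x_cef == cov_xy
print("Theorem 1 checks pass")
```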
where, by definition, u = y - E(y | x). When will this be an acceptable statistical model? It turns out that we need to make an assumption about u for the decomposition to go through; this is stated in the following result.

Theorem 2 (Decomposition Theorem). If we write y = E(y | x) + u, then E(u | x) = 0.

Corollary 3. For any function h(x),

    E(u · h(x)) = Cov(u, h(x)) = 0

and in particular

    E(ux) = Cov(u, x) = 0

The Theorem says that, conditional on x, u has expectation zero. The Corollary says that u is uncorrelated with any function of x. In other words, to apply this setup, the error u must not involve either x or any function of x. If we take h(x) ≡ 1 then we see that this also implies E(u) = 0.

But now a question arises: why focus on this particular decomposition y = E(y | x) + u? One reason is that, as we've seen, the conditional expectation function is a way of getting at what we want to understand, namely the way that the average value of y varies with a given x. But there is an additional reason. Suppose we would like to predict or explain y by some function of x, say m(x). And suppose we agree to evaluate different m's according to a mean squared error criterion: the preferred m is the one that minimizes the mean (expected) squared error. Then we have the following result:

Theorem 4. The function m(x) that minimizes the mean squared error E[(y - m(x))^2] is m(x) = E(y | x).
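A numerical illustration of Theorem 4: on a toy distribution (made-up probabilities), the CEF's mean squared error is no larger than that of several rival predictors, including the best constant E(y):

```python
# Theorem 4, checked numerically: among candidate predictors m(x), the
# CEF attains the smallest mean squared error. Distribution values are
# illustrative, not from the text.
from fractions import Fraction as F

p = {(-1, 1): F(1, 4), (2, 1): F(1, 4), (-1, 2): F(1, 8), (2, 2): F(3, 8)}
p_x = {1: F(1, 2), 2: F(1, 2)}
cef = {x0: sum(y * pr / p_x[x0] for (y, x), pr in p.items() if x == x0)
       for x0 in (1, 2)}

def mse(m):
    """E[(y - m(x))^2] under the joint pmf."""
    return sum((y - m(x)) ** 2 * pr for (y, x), pr in p.items())

best = mse(lambda x: cef[x])
# rival predictors, including the best constant E(y) = 7/8
rivals = [lambda x: 0, lambda x: F(7, 8), lambda x: x, lambda x: x - 1]
assert all(best <= mse(m) for m in rivals)
print("CEF MSE:", best)
```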
In other words, the conditional expectation function E(y | x) is the best (in the sense of minimum mean squared error, MMSE) predictor/explainer of y in terms of x.

Of course, we don't know what the CEF is. So let's lower our sights a bit, and consider predicting y linearly, that is, by a function of the form x'β. If we retain the MMSE criterion, we have the following result:

Theorem 5. Suppose that [E(xx')]^{-1} exists, and consider a linear-in-parameters approximation x'β to y. The optimal β (in the MMSE sense) is

    β̂ = [E(xx')]^{-1} E(xy)

(Strictly speaking, since x is a random variable, we should require that [E(xx')]^{-1} exists with probability 1, or perhaps with probability approaching 1 in large samples.)

Corollary 6. If we write v = y - x'β̂, then

    E(v) = 0 and Cov(x, v) = 0

The quantity [E(xx')]^{-1} E(xy) is called the population regression coefficient vector of y on x, and x'β̂ is the population regression of y on x. So the theorem says that the population regression of y on x is the best linear approximation to y, also called the Best Linear Predictor (BLP) of y. The Corollary gives two properties of the BLP residual: it has mean zero and is uncorrelated with (all) the x's.

Note that β̂ is the coefficient vector in the population regression. Despite what you might think at first glance, it is not the least-squares coefficient vector that you may have seen before: note particularly the expectation operators involved.

Finally, we can tie all this together in terms of the object we're interested in, namely the CEF.

Theorem 7. Suppose that [E(xx')]^{-1} exists, and consider a linear-in-parameters approximation x'β to E(y | x). The optimal β (in the MMSE sense) is again

    β̂ = [E(xx')]^{-1} E(xy)
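Theorem 5 and Corollary 6 can be sketched for the smallest interesting case, x = (1, x1): compute the population moments E(xx') and E(xy) exactly on a made-up discrete distribution, invert the 2 × 2 moment matrix by hand, and verify the residual properties. (Here x1 takes only two values, so the BLP in fact reproduces the CEF exactly.)

```python
# Population regression coefficients beta = [E(xx')]^{-1} E(xy) for a
# regressor vector x = (1, x1), computed exactly on a small, made-up
# discrete distribution, then checked against Corollary 6.
from fractions import Fraction as F

p = {(-1, 1): F(1, 4), (2, 1): F(1, 4), (-1, 2): F(1, 8), (2, 2): F(3, 8)}

def E(g):
    """Expectation of g(y, x1) under the joint pmf."""
    return sum(g(y, x) * pr for (y, x), pr in p.items())

# moment matrix E(xx') (2 x 2) and moment vector E(xy), with x = (1, x1)
a, b = E(lambda y, x: 1), E(lambda y, x: x)
c, d = b, E(lambda y, x: x * x)
m0, m1 = E(lambda y, x: y), E(lambda y, x: x * y)

det = a * d - b * c                # 2 x 2 inverse applied to E(xy):
beta0 = (d * m0 - b * m1) / det
beta1 = (a * m1 - c * m0) / det

def v(y, x):
    """BLP residual v = y - x'beta."""
    return y - (beta0 + beta1 * x)

assert E(v) == 0                           # E(v) = 0
assert E(lambda y, x: x * v(y, x)) == 0    # hence Cov(x1, v) = 0
print("beta =", (beta0, beta1))
```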
In other words, with β̂ standing for the population regression coefficient vector, the quantity x'β̂ is the optimal linear approximation to the CEF (as well as the optimal linear approximation to y). Note that it may be possible to get a better approximation if we allow non-linear functions of x. Still, it is conventional to restrict ourselves (at least initially) to linear approximations. One reason for this is that the linearity involved here is linearity-in-parameters: the x-vector could well contain squares, exponentials (etc.) of the individual terms, though it cannot contain an exact linear combination of the x's, or else [E(xx')]^{-1} will not exist.

4 Understanding the Assumptions

At this point we can try to clarify what is going on when we talk about linearity. Remember that we are interested in setting up a model of the conditional expectation function.

First, consider

    y = x'β + u

As it stands, this is vacuous: we can always write u = y - x'β. In other words, just writing down y = x'β + u is a completely pointless exercise.

Next, consider

    y = x'β + u,  E(u) = 0

This asserts only that E(y) = E(x')β, so it is not an assertion about the conditional expectation function at all.

Third, consider

    y = x'β + u,  E(u) = 0,  Cov(x, u) = 0

The assertions E(u) = 0 and Cov(x, u) = 0 are characteristics of the best linear predictor (BLP) of y, as we saw in Corollary 6, so this statement is just the assertion that x'β is the best linear predictor of y. Again, it is not an assertion about the object we really want to study, namely the conditional expectation function, though it may be useful if you are concerned with predicting y.

Finally, consider

    y = x'β + u,  E(u | x) = 0
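To see why the third and fourth models really differ, here is a hypothetical counterexample: a residual u with E(u) = 0 and Cov(x, u) = 0 (so x'β is the BLP) even though E(u | x) ≠ 0 (so the CEF is not linear). Take x uniform on {-1, 0, 1} and u = x^2 - E(x^2):

```python
# A made-up residual satisfying the BLP conditions but not the CEF
# condition: u is a deterministic, even function of x, so it is
# mean-zero and uncorrelated with x, yet E(u | x) is not zero.
from fractions import Fraction as F

xs = (-1, 0, 1)                      # x uniform on {-1, 0, 1}
Ex2 = F(sum(x * x for x in xs), 3)   # E(x^2) = 2/3
u = {x: x * x - Ex2 for x in xs}     # u = x^2 - E(x^2)

E_u = sum(u.values()) / 3
E_x = F(sum(xs), 3)
Cov_xu = sum(x * u[x] for x in xs) / 3 - E_x * E_u

assert E_u == 0 and Cov_xu == 0      # the BLP conditions hold...
assert u[1] == F(1, 3)               # ...but E(u | x = 1) = 1/3 != 0
print("BLP conditions hold, CEF condition fails")
```

The design choice is deliberate: because u is an even function of x and x is symmetric about zero, all the odd moments that enter Cov(x, u) vanish, while the conditional mean of u clearly varies with x.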
The assertion that E(u | x) = 0 is a characteristic of the conditional expectation function, so here we are making a relevant assumption, namely that the conditional expectation function is linear. From Corollary 3 we see that for this model to work we must have the residual u = y - E(y | x) uncorrelated with all the variables making up x. It is a vital part of any empirical research that wants to estimate E(y | x) to argue that this non-correlation condition really does hold for the problem of interest. Often it does not, and in that case we will need to consider carefully what to do.

5 Estimation

If we confine our attention to linear functions, then there is a strong case for being interested in the population regression coefficient vector β̂. But to make progress we need to estimate it, given that we ordinarily have a sample of data. How can we do this?

The analogy principle suggests that we estimate a population parameter by the corresponding sample parameter. In particular, given that we are interested in (population) expectations (averages), we could try estimating using the (sample) averages (means). This is certainly plausible: if, for example, we had a random sample from a normal population with known variance and unknown mean µ, then we are all used to estimating the unknown mean by the sample average. Using the analogy principle, it is easy to show that an estimate b of the population regression coefficient vector β̂ is

    b = (X'X)^{-1} X'y

where X is the matrix of sample data, with rows x'. But of course, the analogy principle is just that: an idea leading to an estimator. What can we say about b that would lead us to consider it an interesting or good estimator?

6 Proofs

Here we sketch proofs of the main results in the text.
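A sketch of the analogy-principle estimator under the simplest design, an intercept plus a single regressor, with made-up data. In this case b = (X'X)^{-1} X'y reduces to the familiar closed forms b1 = sample Cov(x, y) / sample Var(x) and b0 = ybar - b1 · xbar:

```python
# The analogy-principle estimator b = (X'X)^{-1} X'y for an intercept
# plus one regressor, using the closed-form solution of the 2 x 2
# system. The data are made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)                     # ~ n Var(x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))  # ~ n Cov(x, y)
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# sample analogues of Corollary 6: residuals sum to zero and are
# orthogonal to the regressor -- exactly, by construction
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
assert abs(sum(resid)) < 1e-9
assert abs(sum(xi * ri for xi, ri in zip(x, resid))) < 1e-9
print("b0, b1 =", b0, b1)
```

That the sample analogues of E(v) = 0 and Cov(x, v) = 0 hold exactly is the analogy principle at work: the estimator is defined by imposing the population moment conditions on the sample.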
Theorem 1 (Properties of the CEF). If z = h(x, y):

a. E(z) = E_x(E(z | x))
b. E(y) = E(E(y | x))
c. E(xy) = E(x · E(y | x))
d. Cov(x, E(y | x)) = Cov(x, y)

Proof: We prove these results for the continuous case only, under the assumption that all random variables have finite second moments. (For the discrete case, replace integrals by sums.)

Notation: the random variables x and y have joint density f_xy(u, v), conditional density f_{y|x}(v | u), and marginal densities f_x(u) = ∫ f_xy(u, v) dv and f_y(v) = ∫ f_xy(u, v) du.

For (a), let z = h(x, y). We have:

    E(z) = ∫∫ h(u, v) f_xy(u, v) du dv
         = ∫∫ h(u, v) f_{y|x}(v | u) f_x(u) du dv
         = ∫ ( ∫ h(u, v) f_{y|x}(v | u) dv ) f_x(u) du
         = ∫ E(z | x = u) f_x(u) du
         = E_x(E(z | x))

For (b), use (a) with z = h(x, y) = y.

For (c):

    E_x(x · E(y | x)) = ∫ u ( ∫ v f_{y|x}(v | u) dv ) f_x(u) du
                      = ∫∫ uv f_xy(u, v) du dv
                      = E(xy)

For (d):

    Cov(x, E(y | x)) = E(x · E(y | x)) - E(x) E(E(y | x))
By part (c), E(x · E(y | x)) = E(xy), and by part (b), E(E(y | x)) = E(y), so

    Cov(x, E(y | x)) = E(xy) - E(x)E(y) = Cov(x, y)

Theorem 2 (Decomposition Theorem). If we write y = E(y | x) + u, then E(u | x) = 0.

Proof:

    E(u | x) = E(y - E(y | x) | x) = E(y | x) - E(y | x) = 0

since E(y | x) is itself a function of x, so conditioning on x leaves it unchanged.

Corollary 3. For any function h(x),

    E(u · h(x)) = Cov(u, h(x)) = 0, and in particular E(ux) = Cov(u, x) = 0

Proof: For the first statement,

    E(h(x) · u) = E(h(x) · E(u | x))

using part (c) of the Properties of the CEF (with h(x) in place of x and u in place of y). But E(u | x) = 0 by the Decomposition Theorem, so we have

    E(h(x) · u) = E(h(x) · E(u | x)) = E(h(x) · 0) = 0

Since taking h(x) ≡ 1 gives E(u) = 0, it follows that Cov(u, h(x)) = E(u · h(x)) - E(u)E(h(x)) = 0 as well. For the second statement, just take h(x) = x.
Theorem 4. The function m(x) that minimizes the mean squared error E[(y - m(x))^2] is m(x) = E(y | x).

Proof: Add and subtract E(y | x) and write

    (y - m(x))^2 = ((y - E(y | x)) + (E(y | x) - m(x)))^2
                 = (y - E(y | x))^2 + (E(y | x) - m(x))^2 + 2(y - E(y | x))(E(y | x) - m(x))

Take expectations. The first term doesn't involve m(x) and can be ignored. In the third term we have y - E(y | x) = u, so the term is u · h(x) with h(x) = E(y | x) - m(x), which has expectation zero by Corollary 3 to the Decomposition Theorem. That leaves the second term, (E(y | x) - m(x))^2, which is clearly minimized when m(x) = E(y | x).

Theorem 5. Suppose that [E(xx')]^{-1} exists, and consider a linear-in-parameters approximation x'β to y. The optimal β (in the MMSE sense) is

    β̂ = [E(xx')]^{-1} E(xy)

Proof: We want to choose β to solve

    min_β E[(y - x'β)^2]

The first-order condition (FOC) is

    0 = E(x(y - x'β)) = E(xy) - E(xx')β

so, solving under the stated assumptions,

    β̂ = [E(xx')]^{-1} E(xy)

Corollary 6. If we write v = y - x'β̂, then E(v) = 0 and Cov(x, v) = 0.
Proof of the Corollary: The first-order condition can be written

    E(xv) = 0

This holds for all elements of the vector x. If x has a 1 in it (for an intercept) then we see immediately that E(v) = 0. To show that Cov(x, v) = 0, write

    Cov(x, v) = E(xv) - E(x)E(v) = 0

since E(xv) = 0 by the FOC and E(v) = 0.

Theorem 7. Suppose that [E(xx')]^{-1} exists, and consider a linear-in-parameters approximation x'β to E(y | x). The optimal β (in the MMSE sense) is

    β̂ = [E(xx')]^{-1} E(xy)

Proof: We want to choose β to solve

    min_β E[(E(y | x) - x'β)^2]

Consider again the problem min_β E[(y - x'β)^2], whose solution (β̂) we have just found. Look at (y - x'β)^2 and write it as

    (y - x'β)^2 = ((y - E(y | x)) + (E(y | x) - x'β))^2
                = (y - E(y | x))^2 + (E(y | x) - x'β)^2 + 2(y - E(y | x))(E(y | x) - x'β)

Take expectations. Remembering that we are interested in β, the first term can be ignored, since it doesn't involve β. The third term is u times a function of x, where u = y - E(y | x), so it has expectation zero by Corollary 3. Thus E[(y - x'β)^2] and E[(E(y | x) - x'β)^2] differ only by a term that does not involve β, so the two problems have the same solution, namely β̂.
More informationEconomics 326: Duality and the Slutsky Decomposition. Ethan Kaplan
Economics 326: Duality and the Slutsky Decomposition Ethan Kaplan September 19, 2011 Outline 1. Convexity and Declining MRS 2. Duality and Hicksian Demand 3. Slutsky Decomposition 4. Net and Gross Substitutes
More informationLEARNING OBJECTIVES FOR THIS CHAPTER
CHAPTER 2 American mathematician Paul Halmos (1916 2006), who in 1942 published the first modern linear algebra book. The title of Halmos s book was the same as the title of this chapter. Finite-Dimensional
More information10 Evolutionarily Stable Strategies
10 Evolutionarily Stable Strategies There is but a step between the sublime and the ridiculous. Leo Tolstoy In 1973 the biologist John Maynard Smith and the mathematician G. R. Price wrote an article in
More informationChapter 3. Cartesian Products and Relations. 3.1 Cartesian Products
Chapter 3 Cartesian Products and Relations The material in this chapter is the first real encounter with abstraction. Relations are very general thing they are a special type of subset. After introducing
More informationUnified Lecture # 4 Vectors
Fall 2005 Unified Lecture # 4 Vectors These notes were written by J. Peraire as a review of vectors for Dynamics 16.07. They have been adapted for Unified Engineering by R. Radovitzky. References [1] Feynmann,
More informationPigeonhole Principle Solutions
Pigeonhole Principle Solutions 1. Show that if we take n + 1 numbers from the set {1, 2,..., 2n}, then some pair of numbers will have no factors in common. Solution: Note that consecutive numbers (such
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationSome probability and statistics
Appendix A Some probability and statistics A Probabilities, random variables and their distribution We summarize a few of the basic concepts of random variables, usually denoted by capital letters, X,Y,
More informationSolutions to Math 51 First Exam January 29, 2015
Solutions to Math 5 First Exam January 29, 25. ( points) (a) Complete the following sentence: A set of vectors {v,..., v k } is defined to be linearly dependent if (2 points) there exist c,... c k R, not
More information15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
More informationPractice with Proofs
Practice with Proofs October 6, 2014 Recall the following Definition 0.1. A function f is increasing if for every x, y in the domain of f, x < y = f(x) < f(y) 1. Prove that h(x) = x 3 is increasing, using
More information3.2 The Factor Theorem and The Remainder Theorem
3. The Factor Theorem and The Remainder Theorem 57 3. The Factor Theorem and The Remainder Theorem Suppose we wish to find the zeros of f(x) = x 3 + 4x 5x 4. Setting f(x) = 0 results in the polynomial
More informationThis chapter will demonstrate how to perform multiple linear regression with IBM SPSS
CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the
More informationv w is orthogonal to both v and w. the three vectors v, w and v w form a right-handed set of vectors.
3. Cross product Definition 3.1. Let v and w be two vectors in R 3. The cross product of v and w, denoted v w, is the vector defined as follows: the length of v w is the area of the parallelogram with
More informationChapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem
Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become
More informationIntroduction to Hypothesis Testing
I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More informationSo let us begin our quest to find the holy grail of real analysis.
1 Section 5.2 The Complete Ordered Field: Purpose of Section We present an axiomatic description of the real numbers as a complete ordered field. The axioms which describe the arithmetic of the real numbers
More informationBinary Adders: Half Adders and Full Adders
Binary Adders: Half Adders and Full Adders In this set of slides, we present the two basic types of adders: 1. Half adders, and 2. Full adders. Each type of adder functions to add two binary bits. In order
More informationMath 431 An Introduction to Probability. Final Exam Solutions
Math 43 An Introduction to Probability Final Eam Solutions. A continuous random variable X has cdf a for 0, F () = for 0 <
More informationTopic 8. Chi Square Tests
BE540W Chi Square Tests Page 1 of 5 Topic 8 Chi Square Tests Topics 1. Introduction to Contingency Tables. Introduction to the Contingency Table Hypothesis Test of No Association.. 3. The Chi Square Test
More informationa 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.
Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given
More informationPenalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
More informationCorrelation key concepts:
CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)
More informationSection 1.5 Exponents, Square Roots, and the Order of Operations
Section 1.5 Exponents, Square Roots, and the Order of Operations Objectives In this section, you will learn to: To successfully complete this section, you need to understand: Identify perfect squares.
More information1 Lecture: Integration of rational functions by decomposition
Lecture: Integration of rational functions by decomposition into partial fractions Recognize and integrate basic rational functions, except when the denominator is a power of an irreducible quadratic.
More informationCommon sense, and the model that we have used, suggest that an increase in p means a decrease in demand, but this is not the only possibility.
Lecture 6: Income and Substitution E ects c 2009 Je rey A. Miron Outline 1. Introduction 2. The Substitution E ect 3. The Income E ect 4. The Sign of the Substitution E ect 5. The Total Change in Demand
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationContinued Fractions and the Euclidean Algorithm
Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction
More informationMicroeconomic Theory: Basic Math Concepts
Microeconomic Theory: Basic Math Concepts Matt Van Essen University of Alabama Van Essen (U of A) Basic Math Concepts 1 / 66 Basic Math Concepts In this lecture we will review some basic mathematical concepts
More information6.3 Conditional Probability and Independence
222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted
More informationLecture L3 - Vectors, Matrices and Coordinate Transformations
S. Widnall 16.07 Dynamics Fall 2009 Lecture notes based on J. Peraire Version 2.0 Lecture L3 - Vectors, Matrices and Coordinate Transformations By using vectors and defining appropriate operations between
More information