# The Delta Method and Applications

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and identically distributed (univariate) random variables with finite variance σ. In this case, the central limit theorem states that n(xn µ) d σz, (5.1) where µ = E X 1 and Z is a standard normal random variable. In this chapter, we wish to consider the asymptotic distribution of, say, some function of X n. In the simplest case, the answer depends on results already known: Consider a linear function g(t) = at+b for some known constants a and b. Since E X n = µ, clearly E g(x n ) = aµ + b = g(µ) by the linearity of the expectation operator. Therefore, it is reasonable to ask whether n[g(x n ) g(µ)] tends to some distribution as n. But the linearity of g(t) allows one to write n [ g(xn ) g(µ) ] = a n ( X n µ ). We conclude by Theorem.4 that n [ g(xn ) g(µ) ] d aσz. Of course, the distribution on the right hand side above is N(, a σ ). None of the preceding development is especially deep; one might even say that it is obvious that a linear transformation of the random variable X n alters its asymptotic distribution 85

2 by a constant multiple. Yet what if the function g(t) is nonlinear? It is in this nonlinear case that a strong understanding of the argument above, as simple as it may be, pays real dividends. For if X n is consistent for µ (say), then we know that, roughly speaking, X n will be very close to µ for large n. Therefore, the only meaningful aspect of the behavior of g(t) is its behavior in a small neighborhood of µ. And in a small neighborhood of µ, g(µ) may be considered to be roughly a linear function if we use a first-order Taylor expansion. In particular, we may approximate g(t) g(µ) + g (µ)(t µ) for t in a small neighborhood of µ. We see that g (µ) is the multiple of t, and so the logic of the linear case above suggests n { g(xn ) g(µ) } d g (µ)σ Z. (5.) Indeed, expression (5.) is a special case of the powerful theorem known as the delta method. Theorem 5.1 Delta method: If g (a) exists and n b (X n a) d X for b >, then n b {g(x n ) g(a)} d g (a)x. The proof of the delta method uses Taylor s theorem, Theorem 1.18: Since X n a P, n b {g(x n ) g(a)} = n b (X n a) {g (a) + o P (1)}, and thus Slutsky s theorem together with the fact that n b (X n a) d X proves the result. Expression (5.) may be reexpressed as a corollary of Theorem 5.1: Corollary 5. The often-used special case of Theorem 5.1 in which X is normally distributed states that if g (µ) exists and n(x n µ) d N(, σ ), then { n g(xn ) g(µ) } d N {, σ g (µ) }. Ultimately, we will extend Theorem 5.1 in two directions: Theorem 5.5 deals with the special case in which g (a) =, and Theorem 5.6 is the multivariate version of the delta method. But we first apply the delta method to a couple of simple examples that illustrate a frequently understood but seldom stated principle: When we speak of the asymptotic distribution of a sequence of random variables, we generally refer to a nontrivial (i.e., nonconstant) distribution. Thus, in the case of an independent and identically distributed sequence X 1, X,... of random variables with finite variance, the phrase asymptotic distribution of X n generally refers to the fact that n ( Xn E X 1 ) d N(, Var X 1 ), not the fact that X n P E X 1. 86

3 Example 5.3 Asymptotic distribution of X n Suppose X 1, X,... are iid with mean µ and finite variance σ. Then by the central limit theorem, n(xn µ) d N(, σ ). Therefore, the delta method gives n(x n µ ) d N(, 4µ σ ). (5.3) However, this is not necessarily the end of the story. If µ =, then the normal limit in (5.3) is degenerate that is, expression (5.3) merely states that n(x n) converges in probability to the constant. This is not what we mean by the asymptotic distribution! Thus, we must treat the case µ = separately, noting in that case that nx d n N(, σ ) by the central limit theorem, which implies that nx n d σ χ 1. Example 5.4 Estimating binomial variance: Suppose X n binomial(n, p). Because X n /n is the maximum likelihood estimator for p, the maximum likelihood estimator for p(1 p) is δ n = X n (n X n )/n. The central limit theorem tells us that n(xn /n p) d N{, p(1 p)}, so the delta method gives n {δn p(1 p)} d N {, p(1 p)(1 p) }. Note that in the case p = 1/, this does not give the asymptotic distribution of δ n. Exercise 5.1 gives a hint about how to find the asymptotic distribution of δ n in this case. We have seen in the preceding examples that if g (a) =, then the delta method gives something other than the asymptotic distribution we seek. However, by using more terms in the Taylor expansion, we obtain the following generalization of Theorem 5.1: Theorem 5.5 If g(t) has r derivatives at the point a and g (a) = g (a) = = g (r 1) (a) =, then n b (X n a) d X for b > implies that n rb {g(x n ) g(a)} d 1 r! g(r) (a)x r. It is straightforward using the multivariate notion of differentiability discussed in Definition 1.34 to prove the following theorem: 87

4 Theorem 5.6 Multivariate delta method: If g : R k R l has a derivative g(a) at a R k and n b (X n a) d Y for some k-vector Y and some sequence X 1, X,... of k-vectors, where b >, then n b {g (X n ) g (a)} d [ g(a)] T Y. The proof of Theorem 5.6 involves a simple application of the multivariate Taylor expansion of Equation (1.18). Exercises for Section 5.1 Exercise 5.1 Let δ n be defined as in Example 5.4. Find the asymptotic distribution of δ n in the case p = 1/. That is, find constant sequences a n and b n and a nontrivial random variable X such that a n (δ n b n ) d X. Hint: Let Y n = X n (n/). Apply the central limit theorem to Y n, then transform both sides of the resulting limit statement so that a statement involving δ n results. Exercise 5. Prove Theorem Variance stabilizing transformations Often, if E (X i ) = µ is the parameter of interest, the central limit theorem gives n(xn µ) d N{, σ (µ)}. In other words, the variance of the limiting distribution is a function of µ. This is a problem if we wish to do inference for µ, because ideally the limiting distribution should not depend on the unknown µ. The delta method gives a possible solution: Since n { g(xn ) g(µ) } d N {, σ (µ)g (µ) }, we may search for a transformation g(x) such that g (µ)σ(µ) is a constant. Such a transformation is called a variance stabilizing transformation. 88

5 Example 5.7 Suppose that X 1, X,... are independent normal random variables with mean and variance σ. Let us define τ = Var Xi, which for the normal distribution may be seen to be σ 4. (To verify this, try showing that E Xi 4 = 3σ 4 by differentiating the normal characteristic function four times and evaluating at zero.) Thus, Example 4.11 shows that ( ) 1 n Xi σ d N(, σ 4 ). n To do inference for σ when we believe that our data are truly independent and identically normally distributed, it would be helpful if the limiting distribution did not depend on the unknown σ. Therefore, it is sensible in light of Corollary 5. to search for a function g(t) such that [g (σ )] σ 4 is not a function of σ. In other words, we want g (t) to be proportional to t = t 1. Clearly g(t) = log t is such a function. Therefore, we call the logarithm function a variance-stabilizing function in this example, and Corollary 5. shows that { ( ) 1 n log Xi log ( σ )} d N(, ). n Exercises for Section 5. Exercise 5.3 Suppose X n binomial(n, p), where < p < 1. (a) Find the asymptotic distribution of g(x n /n) g(p), where g(x) = min{x, 1 x}. (b) Show that h(x) = sin 1 ( x) is a variance-stabilizing transformation. Hint: (d/du) sin 1 (u) = 1/ 1 u. Exercise 5.4 Let X 1, X,... be independent from N(µ, σ ) where µ. Let S n = 1 n (X i X n ). Find the asymptotic distribution of the coefficient of variation S n /X n. Exercise 5.5 Let X n binomial(n, p), where p (, 1) is unknown. Obtain confidence intervals for p in two different ways: 89

6 (a) Since n(x n /n p) d N[, p(1 p)], the variance of the limiting distribution depends only on p. Use the fact that X n /n P p to find a consistent estimator of the variance and use it to derive a 95% confidence interval for p. (b) Use the result of problem 5.3(b) to derive a 95% confidence interval for p. (c) Evaluate the two confidence intervals in parts (a) and (b) numerically for all combinations of n {1, 1, 1} and p {.1,.3,.5} as follows: For 1 realizations of X bin(n, p), construct both 95% confidence intervals and keep track of how many times (out of 1) that the confidence intervals contain p. Report the observed proportion of successes for each (n, p) combination. Does your study reveal any differences in the performance of these two competing methods? 5.3 Sample Moments The weak law of large numbers tells us that If X 1, X,... are independent and identically distributed with E X 1 k <, then 1 n X k i P E X k 1. That is, sample moments are (weakly) consistent. For example, the sample variance, which we define as s n = 1 n (X i X n ) = 1 n is consistent for Var X i = E X i (E X i ). Xi (X n ), (5.4) However, consistency is not the end of the story. The central limit theorem and the delta method will prove very useful in deriving asymptotic distribution results about sample moments. We consider two very important examples involving the sample variance of Equation (5.4). Example 5.8 Distribution of sample T statistic: Suppose X 1, X,... are iid with E (X i ) = µ and Var (X i ) = σ <. Define s n as in Equation (5.4), and let T n = n(xn µ) s n. 9

7 Letting A n = n(xn µ) σ and B n = σ/s n, we obtain T n = A n B n. Therefore, since A d n N(, 1) by the central limit theorem and B P n 1 by the weak law of large numbers, Slutsky s theorem implies that T d n N(, 1). In other words, T statistics are asymptotically normal under the null hypothesis. Example 5.9 Let X 1, X,... be independent and identically distributed with mean µ, variance σ, third central moment E (X i µ) 3 = γ, and Var (X i µ) = τ <. Define Sn as in Equation (4.6). We have shown earlier that n(sn σ ) d N(, τ ). The same fact may be proven using Theorem 4.9 as follows. First, let Y i = X i µ and Z i = Yi. We may use the multivariate central limit theorem to find the joint asymptotic distribution of Y n and Z n, namely {( ) Y n n Z n ( σ )} ( d σ γ N {, γ τ )}. Note that the above result uses the fact that Cov (Y 1, Z 1 ) = γ. We may write Sn = Z n (Y n ). Therefore, define the function g(a, b) = b a and note that this gives ġ(a, b) = ( a, 1). To use the delta method, we should evaluate ġ(, σ ) ( σ γ γ τ We conclude that { ( ) Y n n g ) ( σ ġ(, σ ) T γ = (, 1) γ τ Z n ) ( ) 1 ( )} g = n(s σ n σ ) d N(, τ ) as we found earlier (using a different argument) in Example = ġ(, σ ) T = τ Exercises for Section 5.3 Exercise 5.6 Suppose that X 1, X,... are iid Normal (, σ ) random variables. (a) Based on the result of Example 5.7, Give an approximate test at α =.5 for H : σ = σ vs. H a : σ σ. 91

8 (b) For n = 5, estimate the true level of the test in part (a) for σ = 1 by simulating 5 samples of size n = 5 from the null distribution. Report the proportion of cases in which you reject the null hypothesis according to your test (ideally, this proportion will be about.5). 5.4 Sample Correlation Suppose that (X 1, Y 1 ), (X, Y ),... are iid vectors with E X 4 i < and E Y 4 i <. For the sake of simplicity, we will assume without loss of generality that E X i = E Y i = (alternatively, we could base all of the following derivations on the centered versions of the random variables). We wish to find the asymptotic distribution of the sample correlation coefficient, r. If we let m n x m y m xx m yy = 1 X i n Y i n n X i n Y (5.5) i m n xy X iy i and s x = m xx m x, s y = m yy m y, and s xy = m xy m x m y, (5.6) then r = s xy /(s x s y ). According to the central limit theorem, m x Cov (X m y n m xx m yy σx d 1, X 1 ) Cov (X 1, X 1 Y 1 ) N 5 σy, Cov (Y 1, X 1 ) Cov (Y 1, X 1 Y 1 )......(5.7) m xy σ xy Cov (X 1 Y 1, X 1 ) Cov (X 1 Y 1, X 1 Y 1 ) Let Σ denote the covariance matrix in expression (5.7). Define a function g : R 5 R 3 such that g applied to the vector of moments in Equation (5.5) yields the vector (s x, s y, s xy ) as defined in expression (5.6). Then g a b c d e = a b b a

9 Therefore, if we let Σ = = then by the delta method, T g σx Σ σy g σx σy σ xy σ xy Cov (X 1, X1) Cov (X1, Y1 ) Cov (X1, X 1 Y 1 ) Cov (Y1, X1) Cov (Y1, Y1 ) Cov (Y1, X 1 Y 1 ), Cov (X 1 Y 1, X1) Cov (X 1 Y 1, Y1 ) Cov (X 1 Y 1, X 1 Y 1 ) n s x s y s xy σ x σ y σ xy d N 3 (, Σ ). (5.8) As an aside, note that expression (5.8) gives the same marginal asymptotic distribution for n(s x σ x) as was derived using a different approach in Example 4.11, since Cov (X 1, X 1) is the same as τ in that example. Next, define the function h(a, b, c) = c/ ab, so that we have h(s x, s y, s xy ) = r. Then [ h(a, b, c)] T = 1 ( ) c a3 b, c,, ab 3 ab so that [ h(σ x, σ y, σ xy )] T = ( σxy, σ xy, σxσ 3 y σ x σy 3 1 σ x σ y ) = ( ρ σ x, ρ, σy 1 σ x σ y ). (5.9) Therefore, if A denotes the 1 3 matrix in Equation (5.9), using the delta method once again yields n(r ρ) d N(, AΣ A T ). To recap, we have used the basic tools of the multivariate central limit theorem and the multivariate delta method to obtain a univariate result. This derivation of univariate facts via multivariate techniques is common practice in statistical large-sample theory. Example 5.1 Consider the special case of bivariate normal (X i, Y i ). In this case, we may derive σ Σ x 4 ρ σxσ y ρσxσ 3 y = ρ σxσ y σy 4 ρσ x σy 3. (5.1) ρσxσ 3 y ρσ x σy 3 (1 + ρ )σxσ y 93

10 In this case, AΣ A T = (1 ρ ), which implies that n(r ρ) d N{, (1 ρ ) }. (5.11) In the normal case, we may derive a variance-stabilizing transformation. According to Equation (5.11), we should find a function f(x) satisfying f (x) = (1 x ) 1. Since we integrate to obtain 1 1 x = 1 (1 x) + 1 (1 + x), f(x) = 1 log 1 + x 1 x. This is called Fisher s transformation; we conclude that ( 1 n log 1 + r 1 r 1 log 1 + ρ ) d N(, 1). 1 ρ Exercises for Section 5.4 Exercise 5.7 Verify expressions (5.1) and (5.11). Exercise 5.8 Assume (X 1, Y 1 ),..., (X n, Y n ) are iid from some bivariate normal distribution. Let ρ denote the population correlation coefficient and r the sample correlation coefficient. (a) Describe a test of H : ρ = against H 1 : ρ based on the fact that n[f(r) f(ρ)] d N(, 1), where f(x) is Fisher s transformation f(x) = (1/) log[(1 + x)/(1 x)]. α =.5. Use (b) Based on 5 repetitions each, estimate the actual level for this test in the case when E (X i ) = E (Y i ) =, Var (X i ) = Var (Y i ) = 1, and n {3, 5, 1, }. Exercise 5.9 Suppose that X and Y are jointly distributed such that X and Y are Bernoulli (1/) random variables with P (XY = 1) = θ for θ (, 1/). Let (X 1, Y 1 ), (X, Y ),... be iid with (X i, Y i ) distributed as (X, Y ). 94

11 (a) Find the asymptotic distribution of n [ (X n, Y n ) (1/, 1/) ]. (b) If r n is the sample correlation coefficient for a sample of size n, find the asymptotic distribution of n(r n ρ). (c) Find a variance stabilizing transformation for r n. (d) Based on your answer to part (c), construct a 95% confidence interval for θ. (e) For each combination of n {5, } and θ {.5,.5,.45}, estimate the true coverage probability of the confidence interval in part (d) by simulating 5 samples and the corresponding confidence intervals. One problem you will face is that in some samples, the sample correlation coefficient is undefined because with positive probability each of the X i or Y i will be the same. In such cases, consider the confidence interval to be undefined and the true parameter therefore not contained therein. Hint: To generate a sample of (X, Y ), first simulate the X s from their marginal distribution, then simulate the Y s according to the conditional distribution of Y given X. To obtain this conditional distribution, find P (Y = 1 X = 1) and P (Y = 1 X = ). 95

### The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.

Chapter 7 Pearson s chi-square test 7. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That

### Fisher Information and Cramér-Rao Bound. 1 Fisher Information. Math 541: Statistical Theory II. Instructor: Songfeng Zheng

Math 54: Statistical Theory II Fisher Information Cramér-Rao Bound Instructor: Songfeng Zheng In the parameter estimation problems, we obtain information about the parameter from a sample of data coming

### POL 571: Expectation and Functions of Random Variables

POL 571: Expectation and Functions of Random Variables Kosuke Imai Department of Politics, Princeton University March 10, 2006 1 Expectation and Independence To gain further insights about the behavior

### , where f(x,y) is the joint pdf of X and Y. Therefore

INDEPENDENCE, COVARIANCE AND CORRELATION M384G/374G Independence: For random variables and, the intuitive idea behind " is independent of " is that the distribution of shouldn't depend on what is. This

### Test Code MS (Short answer type) 2004 Syllabus for Mathematics. Syllabus for Statistics

Test Code MS (Short answer type) 24 Syllabus for Mathematics Permutations and Combinations. Binomials and multinomial theorem. Theory of equations. Inequalities. Determinants, matrices, solution of linear

### Definition The covariance of X and Y, denoted by cov(x, Y ) is defined by. cov(x, Y ) = E(X µ 1 )(Y µ 2 ).

Correlation Regression Bivariate Normal Suppose that X and Y are r.v. s with joint density f(x y) and suppose that the means of X and Y are respectively µ 1 µ 2 and the variances are 1 2. Definition The

### Homework Zero. Max H. Farrell Chicago Booth BUS41100 Applied Regression Analysis. Complete before the first class, do not turn in

Homework Zero Max H. Farrell Chicago Booth BUS41100 Applied Regression Analysis Complete before the first class, do not turn in This homework is intended as a self test of your knowledge of the basic statistical

### Multivariate Normal Distribution

Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

### STAT 830 Convergence in Distribution

STAT 830 Convergence in Distribution Richard Lockhart Simon Fraser University STAT 830 Fall 2011 Richard Lockhart (Simon Fraser University) STAT 830 Convergence in Distribution STAT 830 Fall 2011 1 / 31

### Continuous Random Variables

Continuous Random Variables COMP 245 STATISTICS Dr N A Heard Contents 1 Continuous Random Variables 2 11 Introduction 2 12 Probability Density Functions 3 13 Transformations 5 2 Mean, Variance and Quantiles

### Lecture 3: Solving Equations Using Fixed Point Iterations

cs41: introduction to numerical analysis 09/14/10 Lecture 3: Solving Equations Using Fixed Point Iterations Instructor: Professor Amos Ron Scribes: Yunpeng Li, Mark Cowlishaw, Nathanael Fillmore Our problem,

### L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

### Asymptotic Joint Distribution of Sample Mean and a Sample Quantile. Thomas S. Ferguson UCLA

Asymptotic Joint Distribution of Sample Mean a Sample Quantile Thomas S. Ferguson UCA. Introduction. The joint asymptotic distribution of the sample mean the sample median was found by aplace almost 2

### DIFFERENTIABILITY OF COMPLEX FUNCTIONS. Contents

DIFFERENTIABILITY OF COMPLEX FUNCTIONS Contents 1. Limit definition of a derivative 1 2. Holomorphic functions, the Cauchy-Riemann equations 3 3. Differentiability of real functions 5 4. A sufficient condition

### Master s Theory Exam Spring 2006

Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

### Estimation with Minimum Mean Square Error

C H A P T E R 8 Estimation with Minimum Mean Square Error INTRODUCTION A recurring theme in this text and in much of communication, control and signal processing is that of making systematic estimates,

### Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

### Inference in Regression Analysis

Yang Feng Inference in the Normal Error Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of

### Interpolating Polynomials Handout March 7, 2012

Interpolating Polynomials Handout March 7, 212 Again we work over our favorite field F (such as R, Q, C or F p ) We wish to find a polynomial y = f(x) passing through n specified data points (x 1,y 1 ),

### 3.7 Non-autonomous linear systems of ODE. General theory

3.7 Non-autonomous linear systems of ODE. General theory Now I will study the ODE in the form ẋ = A(t)x + g(t), x(t) R k, A, g C(I), (3.1) where now the matrix A is time dependent and continuous on some

### Review of Likelihood Theory

Appendix A Review of Likelihood Theory This is a brief summary of some of the key results we need from likelihood theory. A.1 Maximum Likelihood Estimation Let Y 1,..., Y n be n independent random variables

### Generalized Linear Model Theory

Appendix B Generalized Linear Model Theory We describe the generalized linear model as formulated by Nelder and Wedderburn (1972), and discuss estimation of the parameters and tests of hypotheses. B.1

### Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 1 Joint Probability Distributions 1 1.1 Two Discrete

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### 4.5 Linear Dependence and Linear Independence

4.5 Linear Dependence and Linear Independence 267 32. {v 1, v 2 }, where v 1, v 2 are collinear vectors in R 3. 33. Prove that if S and S are subsets of a vector space V such that S is a subset of S, then

### Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

Regression in SPSS Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology John P. Bentley Department of Pharmacy Administration University of

### 3.6: General Hypothesis Tests

3.6: General Hypothesis Tests The χ 2 goodness of fit tests which we introduced in the previous section were an example of a hypothesis test. In this section we now consider hypothesis tests more generally.

### Advanced Spatial Statistics Fall 2012 NCSU. Fuentes Lecture notes

Advanced Spatial Statistics Fall 2012 NCSU Fuentes Lecture notes Areal unit data 2 Areal Modelling Areal unit data Key Issues Is there spatial pattern? Spatial pattern implies that observations from units

### Applications of the Central Limit Theorem

Applications of the Central Limit Theorem October 23, 2008 Take home message. I expect you to know all the material in this note. We will get to the Maximum Liklihood Estimate material very soon! 1 Introduction

### Probability and Statistics

CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b - 0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be

### Lesson 5 Chapter 4: Jointly Distributed Random Variables

Lesson 5 Chapter 4: Jointly Distributed Random Variables Department of Statistics The Pennsylvania State University 1 Marginal and Conditional Probability Mass Functions The Regression Function Independence

### Linear and Piecewise Linear Regressions

Tarigan Statistical Consulting & Coaching statistical-coaching.ch Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL Hands-on Data Analysis

### 5. Conditional Expected Value

Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 5. Conditional Expected Value As usual, our starting point is a random experiment with probability measure P on a sample space Ω. Suppose that X is

### 1 Another method of estimation: least squares

1 Another method of estimation: least squares erm: -estim.tex, Dec8, 009: 6 p.m. (draft - typos/writos likely exist) Corrections, comments, suggestions welcome. 1.1 Least squares in general Assume Y i

### Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

### Lecture 8: Random Walk vs. Brownian Motion, Binomial Model vs. Log-Normal Distribution

Lecture 8: Random Walk vs. Brownian Motion, Binomial Model vs. Log-ormal Distribution October 4, 200 Limiting Distribution of the Scaled Random Walk Recall that we defined a scaled simple random walk last

### 3. Continuous Random Variables

3. Continuous Random Variables A continuous random variable is one which can take any value in an interval (or union of intervals) The values that can be taken by such a variable cannot be listed. Such

### Sufficient Statistics and Exponential Family. 1 Statistics and Sufficient Statistics. Math 541: Statistical Theory II. Lecturer: Songfeng Zheng

Math 541: Statistical Theory II Lecturer: Songfeng Zheng Sufficient Statistics and Exponential Family 1 Statistics and Sufficient Statistics Suppose we have a random sample X 1,, X n taken from a distribution

### Multivariate Analysis of Variance (MANOVA)

Multivariate Analysis of Variance (MANOVA) Example: (Spector, 987) describes a study of two drugs on human heart rate There are 24 subjects enrolled in the study which are assigned at random to one of

### MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

### 17.Convergence of Random Variables

17.Convergence of Random Variables In elementary mathematics courses (such as Calculus) one speaks of the convergence of functions: f n : R R, then n f n = f if n f n (x) = f(x) for all x in R. This is

### CF4103 Financial Time Series. 1 Financial Time Series and Their Characteristics

CF4103 Financial Time Series 1 Financial Time Series and Their Characteristics Financial time series analysis is concerned with theory and practice of asset valuation over time. For example, there are

### Lecture Notes 1. Brief Review of Basic Probability

Probability Review Lecture Notes Brief Review of Basic Probability I assume you know basic probability. Chapters -3 are a review. I will assume you have read and understood Chapters -3. Here is a very

### Joint Probability Distributions and Random Samples. Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

5 Joint Probability Distributions and Random Samples Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Two Discrete Random Variables The probability mass function (pmf) of a single

### Statistiek (WISB361)

Statistiek (WISB361) Final exam June 29, 2015 Schrijf uw naam op elk in te leveren vel. Schrijf ook uw studentnummer op blad 1. The maximum number of points is 100. Points distribution: 23 20 20 20 17

### Examination 110 Probability and Statistics Examination

Examination 0 Probability and Statistics Examination Sample Examination Questions The Probability and Statistics Examination consists of 5 multiple-choice test questions. The test is a three-hour examination

### The t Test. April 3-8, exp (2πσ 2 ) 2σ 2. µ ln L(µ, σ2 x) = 1 σ 2. i=1. 2σ (σ 2 ) 2. ˆσ 2 0 = 1 n. (x i µ 0 ) 2 = i=1

The t Test April 3-8, 008 Again, we begin with independent normal observations X, X n with unknown mean µ and unknown variance σ The likelihood function Lµ, σ x) exp πσ ) n/ σ x i µ) Thus, ˆµ x ln Lµ,

### where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

### Outline. Correlation & Regression, III. Review. Relationship between r and regression

Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation

### Hypothesis tests, confidence intervals, and bootstrapping

Hypothesis tests, confidence intervals, and bootstrapping Business Statistics 41000 Fall 2015 1 Topics 1. Hypothesis tests Testing a mean: H0 : µ = µ 0 Testing a proportion: H0 : p = p 0 Testing a difference

### Fourier Series. Chapter Some Properties of Functions Goal Preliminary Remarks

Chapter 3 Fourier Series 3.1 Some Properties of Functions 3.1.1 Goal We review some results about functions which play an important role in the development of the theory of Fourier series. These results

### Lecture 18 Chapter 6: Empirical Statistics

Lecture 18 Chapter 6: Empirical Statistics M. George Akritas Definition (Sample Covariance and Pearson s Correlation) Let (X 1, Y 1 ),..., (X n, Y n ) be a sample from a bivariate population. The sample

### Lecture Quantitative Finance

Lecture Quantitative Finance Spring 2011 Prof. Dr. Erich Walter Farkas Lecture 12: May 19, 2011 Chapter 8: Estimating volatility and correlations Prof. Dr. Erich Walter Farkas Quantitative Finance 11:

### Exercises with solutions (1)

Exercises with solutions (). Investigate the relationship between independence and correlation. (a) Two random variables X and Y are said to be correlated if and only if their covariance C XY is not equal

### Numerical Analysis and Computing

Numerical Analysis and Computing Lecture Notes #3 Solutions of Equations in One Variable: ; Root Finding; for Iterative Methods Joe Mahaffy, mahaffy@math.sdsu.edu Department of Mathematics Dynamical Systems

### Limits and Continuity

Math 20C Multivariable Calculus Lecture Limits and Continuity Slide Review of Limit. Side limits and squeeze theorem. Continuous functions of 2,3 variables. Review: Limits Slide 2 Definition Given a function

### Probability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0

Probability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0 This primer provides an overview of basic concepts and definitions in probability and statistics. We shall

### Inference in Regression Analysis. Dr. Frank Wood

Inference in Regression Analysis Dr. Frank Wood Inference in the Normal Error Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters

### In this chapter, we aim to answer the following questions: 1. What is the nature of heteroskedasticity?

Lecture 9 Heteroskedasticity In this chapter, we aim to answer the following questions: 1. What is the nature of heteroskedasticity? 2. What are its consequences? 3. how does one detect it? 4. What are

### 5. Convergence of sequences of random variables

5. Convergence of sequences of random variables Throughout this chapter we assume that {X, X 2,...} is a sequence of r.v. and X is a r.v., and all of them are defined on the same probability space (Ω,

### Problem Set. Problem Set #2. Math 5322, Fall December 3, 2001 ANSWERS

Problem Set Problem Set #2 Math 5322, Fall 2001 December 3, 2001 ANSWERS i Problem 1. [Problem 18, page 32] Let A P(X) be an algebra, A σ the collection of countable unions of sets in A, and A σδ the collection

### t-statistics for Weighted Means with Application to Risk Factor Models

t-statistics for Weighted Means with Application to Risk Factor Models Lisa R. Goldberg Barra, Inc. 00 Milvia St. Berkeley, CA 94704 Alec N. Kercheval Dept. of Mathematics Florida State University Tallahassee,

### Bivariate Distributions

Chapter 4 Bivariate Distributions 4.1 Distributions of Two Random Variables In many practical cases it is desirable to take more than one measurement of a random observation: (brief examples) 1. What is

### Statistics - Written Examination MEC Students - BOVISA

Statistics - Written Examination MEC Students - BOVISA Prof.ssa A. Guglielmi 26.0.2 All rights reserved. Legal action will be taken against infringement. Reproduction is prohibited without prior consent.

### Simultaneous Equations

Simultaneous Equations 1 Motivation and Examples Now we relax the assumption that EX u 0 his wil require new techniques: 1 Instrumental variables 2 2- and 3-stage least squares 3 Limited LIML and full

### Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. b 0, b 1 Q = n (Y i (b 0 + b 1 X i )) 2 i=1 Minimize this by maximizing

### Probability Generating Functions

page 39 Chapter 3 Probability Generating Functions 3 Preamble: Generating Functions Generating functions are widely used in mathematics, and play an important role in probability theory Consider a sequence

### GROUPS ACTING ON A SET

GROUPS ACTING ON A SET MATH 435 SPRING 2012 NOTES FROM FEBRUARY 27TH, 2012 1. Left group actions Definition 1.1. Suppose that G is a group and S is a set. A left (group) action of G on S is a rule for

### Taylor and Maclaurin Series

Taylor and Maclaurin Series In the preceding section we were able to find power series representations for a certain restricted class of functions. Here we investigate more general problems: Which functions

### CHAPTER III - MARKOV CHAINS

CHAPTER III - MARKOV CHAINS JOSEPH G. CONLON 1. General Theory of Markov Chains We have already discussed the standard random walk on the integers Z. A Markov Chain can be viewed as a generalization of

### TESTING THE ONE-PART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWO-PART MODEL

TESTING THE ONE-PART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWO-PART MODEL HARALD OBERHOFER AND MICHAEL PFAFFERMAYR WORKING PAPER NO. 2011-01 Testing the One-Part Fractional Response Model against

### Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part II)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part II) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

### Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

### Lecture 1: Univariate Time Series B41910: Autumn Quarter, 2008, by Mr. Ruey S. Tsay

Lecture 1: Univariate Time Series B41910: Autumn Quarter, 2008, by Mr. Ruey S. Tsay 1 Some Basic Concepts 1. Time Series: A sequence of random variables measuring certain quantity of interest over time.

### Math 576: Quantitative Risk Management

Math 576: Quantitative Risk Management Haijun Li lih@math.wsu.edu Department of Mathematics Washington State University Week 4 Haijun Li Math 576: Quantitative Risk Management Week 4 1 / 22 Outline 1 Basics

### 0 ( x) 2 = ( x)( x) = (( 1)x)(( 1)x) = ((( 1)x))( 1))x = ((( 1)(x( 1)))x = ((( 1)( 1))x)x = (1x)x = xx = x 2.

SOLUTION SET FOR THE HOMEWORK PROBLEMS Page 5. Problem 8. Prove that if x and y are real numbers, then xy x + y. Proof. First we prove that if x is a real number, then x 0. The product of two positive

### EC327: Advanced Econometrics, Spring 2007

EC327: Advanced Econometrics, Spring 2007 Wooldridge, Introductory Econometrics (3rd ed, 2006) Appendix D: Summary of matrix algebra Basic definitions A matrix is a rectangular array of numbers, with m

### Descriptive Data Summarization

Descriptive Data Summarization (Understanding Data) First: Some data preprocessing problems... 1 Missing Values The approach of the problem of missing values adopted in SQL is based on nulls and three-valued

### E205 Final: Version B

Name: Class: Date: E205 Final: Version B Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The owner of a local nightclub has recently surveyed a random

### OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance

Lecture 5: Hypothesis Testing What we know now: OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance (if the Gauss-Markov

### Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.

### Chapter 7. Induction and Recursion. Part 1. Mathematical Induction

Chapter 7. Induction and Recursion Part 1. Mathematical Induction The principle of mathematical induction is this: to establish an infinite sequence of propositions P 1, P 2, P 3,..., P n,... (or, simply

### MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

### Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

### 2 Complex Functions and the Cauchy-Riemann Equations

2 Complex Functions and the Cauchy-Riemann Equations 2.1 Complex functions In one-variable calculus, we study functions f(x) of a real variable x. Likewise, in complex analysis, we study functions f(z)

### Principle of Data Reduction

Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

### since by using a computer we are limited to the use of elementary arithmetic operations

> 4. Interpolation and Approximation Most functions cannot be evaluated exactly: x, e x, ln x, trigonometric functions since by using a computer we are limited to the use of elementary arithmetic operations

### On Small Sample Properties of Permutation Tests: A Significance Test for Regression Models

On Small Sample Properties of Permutation Tests: A Significance Test for Regression Models Hisashi Tanizaki Graduate School of Economics Kobe University (tanizaki@kobe-u.ac.p) ABSTRACT In this paper we

### Non Parametric Inference

Maura Department of Economics and Finance Università Tor Vergata Outline 1 2 3 Inverse distribution function Theorem: Let U be a uniform random variable on (0, 1). Let X be a continuous random variable

### Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

### Lecture 12 Basic Lyapunov theory

EE363 Winter 2008-09 Lecture 12 Basic Lyapunov theory stability positive definite functions global Lyapunov stability theorems Lasalle s theorem converse Lyapunov theorems finding Lyapunov functions 12

### 11. CONFIDENCE INTERVALS FOR THE MEAN; KNOWN VARIANCE

11. CONFIDENCE INTERVALS FOR THE MEAN; KNOWN VARIANCE We assume here that the population variance σ 2 is known. This is an unrealistic assumption, but it allows us to give a simplified presentation which

### CHAPTER 2 FOURIER SERIES

CHAPTER 2 FOURIER SERIES PERIODIC FUNCTIONS A function is said to have a period T if for all x,, where T is a positive constant. The least value of T>0 is called the period of. EXAMPLES We know that =

### Cointegration. Basic Ideas and Key results. Egon Zakrajšek Division of Monetary Affairs Federal Reserve Board

Cointegration Basic Ideas and Key results Egon Zakrajšek Division of Monetary Affairs Federal Reserve Board Summer School in Financial Mathematics Faculty of Mathematics & Physics University of Ljubljana

### 2.8 Expected values and variance

y 1 b a 0 y 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0 a b (a) Probability density function x 0 a b (b) Cumulative distribution function x Figure 2.3: The probability density function and cumulative distribution function

### The Method of Lagrange Multipliers

The Method of Lagrange Multipliers S. Sawyer October 25, 2002 1. Lagrange s Theorem. Suppose that we want to maximize (or imize a function of n variables f(x = f(x 1, x 2,..., x n for x = (x 1, x 2,...,

### Review Solutions, Exam 3, Math 338

Review Solutions, Eam 3, Math 338. Define a random sample: A random sample is a collection of random variables, typically indeed as X,..., X n.. Why is the t distribution used in association with sampling?

### Market Risk: FROM VALUE AT RISK TO STRESS TESTING. Agenda. Agenda (Cont.) Traditional Measures of Market Risk

Market Risk: FROM VALUE AT RISK TO STRESS TESTING Agenda The Notional Amount Approach Price Sensitivity Measure for Derivatives Weakness of the Greek Measure Define Value at Risk 1 Day to VaR to 10 Day