Differentiating under an integral sign



Similar documents
Microeconomic Theory: Basic Math Concepts

MA4001 Engineering Mathematics 1 Lecture 10 Limits and Continuity

Lectures 5-6: Taylor Series

Rolle s Theorem. q( x) = 1

1 if 1 x 0 1 if 0 x 1

f(x) = g(x), if x A h(x), if x B.

Mathematics 31 Pre-calculus and Limits

A power series about x = a is the series of the form

Estimating the Average Value of a Function

FIRST YEAR CALCULUS. Chapter 7 CONTINUITY. It is a parabola, and we can draw this parabola without lifting our pencil from the paper.

MATHEMATICS BONUS FILES for faculty and students

RAJALAKSHMI ENGINEERING COLLEGE MA 2161 UNIT I - ORDINARY DIFFERENTIAL EQUATIONS PART A

Adaptive Online Gradient Descent

WARM UP EXERCSE. 2-1 Polynomials and Rational Functions

Reference: Introduction to Partial Differential Equations by G. Folland, 1995, Chap. 3.

ISU Department of Mathematics. Graduate Examination Policies and Procedures

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Multivariate Normal Distribution

General Theory of Differential Equations Sections 2.8, , 4.1

Random graphs with a given degree sequence

MATH 425, PRACTICE FINAL EXAM SOLUTIONS.

A RIGOROUS AND COMPLETED STATEMENT ON HELMHOLTZ THEOREM

Exercises in Mathematical Analysis I

MATHEMATICAL METHODS OF STATISTICS

9 More on differentiation

Student Performance Q&A:

Wald s Identity. by Jeffery Hein. Dartmouth College, Math 100

Gambling Systems and Multiplication-Invariant Measures

Fuzzy Probability Distributions in Bayesian Analysis

Week 1: Introduction to Online Learning

Interpreting Kullback-Leibler Divergence with the Neyman-Pearson Lemma

On using numerical algebraic geometry to find Lyapunov functions of polynomial dynamical systems

Application. Outline. 3-1 Polynomial Functions 3-2 Finding Rational Zeros of. Polynomial. 3-3 Approximating Real Zeros of.

THE FUNDAMENTAL THEOREM OF ALGEBRA VIA PROPER MAPS

Limits and Continuity

INTERPOLATION. Interpolation is a process of finding a formula (often a polynomial) whose graph will pass through a given set of points (x, y).

correct-choice plot f(x) and draw an approximate tangent line at x = a and use geometry to estimate its slope comment The choices were:

Advanced Microeconomics

The Mean Value Theorem

0 0 such that f x L whenever x a

THE CENTRAL LIMIT THEOREM TORONTO

Inverse Functions and Logarithms

Fixed Point Theorems

Sequences and Series

15 Limit sets. Lyapunov functions

Course Notes for Math 162: Mathematical Statistics Approximation Methods in Statistics

Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

Taylor and Maclaurin Series

MATH 132: CALCULUS II SYLLABUS

The Relation between Two Present Value Formulae

Game Theory: Supermodular Games 1

The sample space for a pair of die rolls is the set. The sample space for a random number between 0 and 1 is the interval [0, 1].

IRREDUCIBLE OPERATOR SEMIGROUPS SUCH THAT AB AND BA ARE PROPORTIONAL. 1. Introduction

How To Understand The Theory Of Algebraic Functions

A PRIORI ESTIMATES FOR SEMISTABLE SOLUTIONS OF SEMILINEAR ELLIPTIC EQUATIONS. In memory of Rou-Huai Wang

A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails

AP CALCULUS AB 2008 SCORING GUIDELINES

Overview. Essential Questions. Precalculus, Quarter 4, Unit 4.5 Build Arithmetic and Geometric Sequences and Series

Metric Spaces. Chapter 1

Nonparametric adaptive age replacement with a one-cycle criterion

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

CHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e.

Understanding Basic Calculus

Math 4310 Handout - Quotient Vector Spaces

Convex analysis and profit/cost/support functions

Math 3000 Section 003 Intro to Abstract Math Homework 2

Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1

Graphic Designing with Transformed Functions

Section 3.7. Rolle s Theorem and the Mean Value Theorem. Difference Equations to Differential Equations

CONSTANT-SIGN SOLUTIONS FOR A NONLINEAR NEUMANN PROBLEM INVOLVING THE DISCRETE p-laplacian. Pasquale Candito and Giuseppina D Aguí

Short Programs for functions on Curves

To discuss this topic fully, let us define some terms used in this and the following sets of supplemental notes.

HOMEWORK 5 SOLUTIONS. n!f n (1) lim. ln x n! + xn x. 1 = G n 1 (x). (2) k + 1 n. (n 1)!

6. Define log(z) so that π < I log(z) π. Discuss the identities e log(z) = z and log(e w ) = w.

Continuity of the Perron Root

Error Bound for Classes of Polynomial Systems and its Applications: A Variational Analysis Approach

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include

Some Polynomial Theorems. John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA

Machine Learning and Pattern Recognition Logistic Regression

Adaptive Linear Programming Decoding

BANACH AND HILBERT SPACE REVIEW

Stochastic Inventory Control

Representation of functions as power series

The Fourth International DERIVE-TI92/89 Conference Liverpool, U.K., July Derive 5: The Easiest... Just Got Better!

Increasing for all. Convex for all. ( ) Increasing for all (remember that the log function is only defined for ). ( ) Concave for all.

Math 241, Exam 1 Information.

The Heat Equation. Lectures INF2320 p. 1/88

Calculus. Contents. Paul Sutcliffe. Office: CM212a.

2.3 Convex Constrained Optimization Problems

Scalar Valued Functions of Several Variables; the Gradient Vector

3 Introduction to Assessing Risk

Indiana State Core Curriculum Standards updated 2009 Algebra I

On the Interaction and Competition among Internet Service Providers

Continued Fractions and the Euclidean Algorithm

RANDOM INTERVAL HOMEOMORPHISMS. MICHA L MISIUREWICZ Indiana University Purdue University Indianapolis

ARBITRAGE-FREE OPTION PRICING MODELS. Denis Bell. University of North Florida

Transcription:

CALIFORNIA INSTITUTE OF TECHNOLOGY Ma 2b KC Border Introduction to Probability and Statistics February 213 Differentiating under an integral sign In the derivation of Maximum Likelihood Estimators, or the Cramér Rao Lower Bound, we differentiated under an integral sign, and I never told you explicitly when that is allowed. Here is a theorem that gives conditions under which it is permissible. We start with a function g of a vector variable x and a single parameter θ. Let the domain of g be Θ, where is a rectangle in R n and Θ is an interval in R. So Define the function G: Θ R by g : Θ R. G(θ) = In order for this to make sense we assume: g(θ; x) dx. 1 Assumption Let be a rectangle in R n, let Θ be an open interval in R. Assume that for every θ Θ the function x g(θ; x) is continuous and has a finite integral. That is, We would like to show that g(θ; x) dx < G (θ) = ( θ Θ ). D 1 g(θ; x) dx, where D 1 g indicates the partial derivative of g with respect to its first argument θ. In order for this to be true, the partial derivative has to exist and be integrable with respect to x. 2 Assumption Assume that every x, the function θ g(θ; x) is continuous, and for every θ Θ, the partial derivative D 1 g(θ; x) exists, and is integrable with respect to x. That is, D 1 g(θ; x) dx <. 1

Ma 2b February 213 KC Border Differentiating an integral 2 But this is not enough. As you may remember, a partial derivative is a limit of the form ( ) lim h g(θ + h, x) g(θ, x) /h. You may have heard that passing limits through an integral sign requires a boundedness condition (the Lebesgue Dominated Convergence Theorem). It is beyond the scope of this course and this note to go through the details, but let me say that what is required is called a uniform local integrability condition. To get a feeling for what it means, realize that Assumption 2 requires that at each θ the function D 1 g(θ; x) is bounded in absolute value by an integrable function of x, namely D 1 g(θ; x) itself. Uniform local integrability requires that for each θ, there is little strip (θ ε, θ + ε) on which D 1 g(θ; x) is bounded uniformly in θ for each x by an integrable function of x. 3 Assumption Assume that for every θ Θ there is an ε > and a nonnegative function h: R (depending on θ ) such that for all x, and These assumptions are sufficient. 1 sup D 1 g(θ; x) h(x). θ (θ ε,θ +ε) h(x) dx <. 4 Theorem Let g : Θ R satisfy Assumptions 1, 2, and 3. Then the function G: Θ R defined by G(θ) = g(θ; x) dx is differentiable on Θ and G (θ) = D 1 g(θ; x) dx. Proof : For a proof, see Theorem 24.5 of Aliprantis and Burkinshaw [1, pp. 193 194] and the remarks following it. Assumption 3 is awkward to use, so a stronger condition that implies it is useful. The next theorem may be easier to apply. It requires to be closed and bounded, and g to have a jointly continuous partial derivative. It is similar to Theorem 8.11.2 in Dieudonné [2, p. 177]. 5 Theorem Let be a closed and bounded rectangle in R n, let Θ be an open interval in R, and let g : Θ R be jointly continuous. Assume that the partial derivative D 1 g(θ; x) is jointly continuous on Θ. Then the function G: Θ R defined by G(θ) = g(θ; x) dx is differentiable on Θ and G (θ) = D 1 g(θ; x) dx. Proof : Since g and D 1 g are jointly continuous, they are bounded on every set of the form [θ ε, θ + ε], and so integrable. Therefore the Hypotheses of Theorem 4 are satisfied. 1 We can make weaker assumptions, see, for instance, Aliprantis and Burkinshaw [1, pp. 193 194]. The problem is that the weaker assumptions involve measure-theoretic concepts that are beyond the scope of this course.

Ma 2b February 213 KC Border Differentiating an integral 3 1 An illustrative (counter)example The following example shows what can go wrong when the uniform local integrability Assumption 3 violated. You can find it in Gelbaum and Olmsted [3, Example 9.15, p. 123], but similar examples are well-known. 6 Example Let Θ = R and let = [, 1]. Define g : Θ R via θ 3 g(θ, x) = x 2 e θ2 /x x >, x =. The function g is plotted in figure 1. (Notice that for fixed x, the function θ g(θ, x) is continuous at each θ; and for each fixed θ, the function x g(θ, x) is continuous at each x, including x =. (This is because the exponential term goes to zero much faster than than polynomial term goes to zero as x.) The function g is not jointly continuous though: Along the curve x = θ 2 we have g(θ, x) = e 1 /θ, which diverges to as θ and diverges to as θ.) Define G(θ) = g(θ; x) dx = θ 3 1 x 2 e θ2 /x dx. Consulting a table of integrals if necessary, we find the indefinite integral x 2 e a/x dx = e a/x /a. Thus, letting a = θ 2 we have G(θ) = θe θ2 (θ R). The function G is plotted in Figure 2. Differentiation yields G (θ) = (1 2θ 2 )e θ2 (θ R). Note that G () = (1 2 2 )e 2 = 1. Now let s compute D 1 g(θ; x): For x =, g(θ; x) = for all θ, so D 1 g(θ; ) =. For x >, we have D 1 g(θ; x) = 3θ2 /x x 2 e θ2 + θ3 /x x 2 e θ2 ( 2θ/x) ( 3θ 2 = x 2 2θ4 ) x 3 e θ2 /x.

Ma 2b February 213 KC Border Differentiating an integral 4 Surface of graph. Contours. Figure 1. Plots of g(θ; x) = θ3 x 2 e θ2 /x, (x > ).

Ma 2b February 213 KC Border Differentiating an integral 5 Figure 2. G(θ). So ( 3θ 2 )e 2θ4 x D 1 g(θ; x) = 2 x 3 θ2 /x x > x =. (Note that for fixed θ, the limit of D 1 g(θ; x) as x is zero, so for each fixed θ, D 1 g(θ; x) is continuous in x. ALso, for each fixed x, D 1 g(θ; x) is continuous in θ. But again, along the curve x = θ 2, we have D 1 g(θ; x) = ( 3θ 2 2θ 2) e 1 = e 1 /θ 2 which diverges to as θ. Thus D 1 g(θ; x) is not jointly continuous at (, ). See Figures 3 and 4.) Define the integral I(θ) = D 1 g(θ; x) dx = ( 3θ 2 x 2 2θ4 x 3 ) e θ2 /x dx. It satisfies I() =. At θ =, we have G () = (1 2 2 )e 2 = 1 = I() = So the conclusion of Theorem 4 fails. D 1 g(; x) dx,

Ma 2b February 213 KC Border Differentiating an integral 6 Surface of the log graph. Contours. Figure 3. Plots of the log of D 1 g(θ; x) = ( 3θ 2 x 2 2θ4 x 3 )e θ2 /x.

Ma 2b February 213 KC Border Differentiating an integral 7 The failure of the theorem is confined to te case θ =. For θ >, I(θ) can be computed as I(θ) = ( 3θ 2 D 1 g(θ; x) dx = x 2 2θ4 ) x 3 e θ2 /x dx = 3θ 2 1 /x x 2 e θ2 dx 2θ 4 [ e = 3θ 2 θ 2 /x x=1 θ 2 x= = (1 2θ 2 )e θ2. ] [ 2θ 4 e θ2 /x 1 /x x 3 e θ2 dx ( 1 θ 4 + 1 θ 2 x ) ] x=1 x= For θ >, we have So the only problem is right at θ =. G (θ) = (1 2θ 2 )e θ2 = I(θ) = D 1 g(θ; x) dx. Let s check Assumption 3 for θ =. We need to find an ε > so that θ < ε implies D 1 g(θ; x) h(x) where h(x) dx <. Now for x >, ( 3θ 2 D 1 g(θ; x) = x 2 2θ4 ) x 3 e θ2 /x. Looking at points of the form x = θ 2, we see that h(x) must satisfy h(x) D 1 g( ( 3 x; x) = x x) 2 e 1 = e 1 /x. See Figure 4. So for any ε > for all θ < ε, we have x θ implies h(x) e 1 /x. Thus h(x) dx ε h(x) dx e 1 ε 1 dx =. x Thus Assumption 3 is violated by this example, and the conclusion of the theorem fails. References [1] C. D. Aliprantis and O. Burkinshaw. 1998. Principles of real analysis, 3d. ed. San Diego: Academic Press. [2] J. Dieudonné. 1969. Foundations of modern analysis. Number 1-I in Pure and Applied Mathematics. New York: Academic Press. Volume 1 of Treatise on Analysis. [3] B. R. Gelbaum and J. M. Olmsted. 23. Counterexamples in analysis. Minneola, NY: Dover. Slightly corrected reprint of the 1965 printing of the work originally published in 1964 by Holden Day in San Francisco.

Ma 2b February 213 KC Border Differentiating an integral 8 Figure 4. Cross-sections of of D 1 g(θ; x) = ( 3θ 2 x 2 2θ4 x 3 )e θ2 /x.