Scalar Valued Functions of Several Variables; the Gradient Vector




Scalar Valued Functions of Several Variables; the Gradient Vector

Scalar Valued Functions

Let us consider a scalar (i.e., numerical, rather than vector) valued function of $n$ variables:
$$y = \varphi(X) = \varphi(x_1, x_2, \ldots, x_n), \quad X \in D,$$
where the domain $D$ is a region in $\mathbb{R}^n$. By this we mean that $D$ is an open, connected set. The term open means that each point $X_0 \in D$ is the center of a ball $N(X_0, r) = \{\, X \mid \|X - X_0\| < r \,\}$, for some $r > 0$, which still lies in $D$. The term connected, as we will use it here, means that any two points in $D$ can be connected by a curve lying in $D$.

The partial derivatives of the function $\varphi(X)$ at a point $X_0 \in D$ are the numbers, which we denote by $\frac{\partial \varphi}{\partial x_k}(X_0)$, $k = 1, 2, \ldots, n$, obtained by differentiating the function $\varphi(X)$ with respect to the variable $x_k$ at the point $X_0$, treating the other variables, $x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n$, as constant quantities. Taking the components of $X_0$ to be $x_{0,k}$, $k = 1, 2, \ldots, n$, these partial derivatives exist just in case the corresponding limits of difference quotients
$$\lim_{x_k \to x_{0,k}} \frac{\varphi(x_{0,1}, \ldots, x_{0,k-1}, x_k, x_{0,k+1}, \ldots, x_{0,n}) - \varphi(x_{0,1}, \ldots, x_{0,k-1}, x_{0,k}, x_{0,k+1}, \ldots, x_{0,n})}{x_k - x_{0,k}}$$
exist; the limit values are the partial derivatives we have indicated.

As we allow the point $X_0$ to range throughout the region $D$, the relationship thus engendered between $X_0$ and the partial derivative $\frac{\partial \varphi}{\partial x_k}(X_0)$ defines a function; replacing $X_0$ by the symbol $X$, we denote this function by $\frac{\partial \varphi}{\partial x_k}(X)$. It is also common to see this function written as $\varphi_{x_k}(X) = \varphi_{x_k}(x_1, x_2, \ldots, x_n)$. If each of these functions is continuous as a function of $X \in D$ then we say that $\varphi(X)$ is continuously differentiable in the region $D$. In the work to follow we will assume continuous differentiability unless we specifically indicate otherwise.

If we combine the partial derivatives $\frac{\partial \varphi}{\partial x_k}(X_0)$, $k = 1, 2, \ldots, n$, into an $n$-dimensional vector, in the obvious order, we obtain the gradient (vector) of the function $\varphi(X)$ at

the point $X_0$. A commonly used notation for the gradient at $X_0$ is $\nabla\varphi(X_0)$. Again letting $X_0$ range throughout $D$, and replacing $X_0$ by $X$, we obtain the vector function
$$\nabla\varphi(X) = \nabla\varphi(x_1, x_2, \ldots, x_n) = \left( \frac{\partial \varphi}{\partial x_1}(X),\; \frac{\partial \varphi}{\partial x_2}(X),\; \ldots,\; \frac{\partial \varphi}{\partial x_n}(X) \right).$$
In $\mathbb{R}^3$ we can write
$$\nabla\varphi(X) = \left( \frac{\partial \varphi}{\partial x}(x,y,z),\; \frac{\partial \varphi}{\partial y}(x,y,z),\; \frac{\partial \varphi}{\partial z}(x,y,z) \right) = \frac{\partial \varphi}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial \varphi}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial \varphi}{\partial z}(x,y,z)\,\mathbf{k}.$$
In this way we obtain a vector field in the region $D$, since the dimension of $\nabla\varphi(X)$ is $n$, the same as the dimension of the independent vector variable $X$.

Example 1  If, in $\mathbb{R}^2$, we define $\varphi(x,y) = x^2 y + \frac{y^3}{3}$, then the corresponding gradient field is
$$\nabla\varphi(x,y) = \left( \frac{\partial}{\partial x}\Big(x^2 y + \frac{y^3}{3}\Big),\; \frac{\partial}{\partial y}\Big(x^2 y + \frac{y^3}{3}\Big) \right) = \left( 2xy,\; x^2 + y^2 \right).$$
The gradient vector at the point $(1, 2)$, for example, is $(4, 5)$.

A Notational Convention

Vectors can be written in row form: $(x_1, x_2, \ldots, x_n)$, or in column form:
$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$
These vectors are said to be transposes of each other; if the row vector is designated as $X$, then the corresponding column vector is designated as $X'$, and vice versa. If $X$ and $Y$ are $n$-dimensional column and row vectors, respectively, it is common to write
$$Y X = Y(X) = (y_1\;\; y_2\;\; \cdots\;\; y_n) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \sum_{k=1}^{n} y_k x_k.$$
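As a quick numerical check of the difference-quotient definition of the partial derivatives, the following Python sketch approximates the gradient of the Example 1 function by centered difference quotients and recovers $(2xy, x^2 + y^2) = (4, 5)$ at the point $(1, 2)$; the helper `partial` and the step size `h` are our own illustrative choices.

```python
def phi(x, y):
    # Example 1: phi(x, y) = x^2 y + y^3 / 3
    return x**2 * y + y**3 / 3

def partial(f, point, k, h=1e-6):
    """Centered difference quotient (f(X0 + h e_k) - f(X0 - h e_k)) / (2h),
    approximating the k-th partial derivative of f at X0."""
    p_plus, p_minus = list(point), list(point)
    p_plus[k] += h
    p_minus[k] -= h
    return (f(*p_plus) - f(*p_minus)) / (2 * h)

# Approximate both partials of phi at the point (1, 2).
grad = [partial(phi, (1.0, 2.0), k) for k in (0, 1)]
print(grad)  # close to the analytic gradient (2xy, x^2 + y^2) = (4, 5)
```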

If $W$ is an $n$-dimensional row vector then
$$W X \equiv \sum_{k=1}^{n} w_k x_k.$$
Gradients are usually written as row vectors; thus the preceding implies that
$$\nabla\varphi(X_0)\, X \equiv \sum_{k=1}^{n} \frac{\partial \varphi}{\partial x_k}(X_0)\, x_k.$$
We will use this notation repeatedly in the sequel.

First Order Linear Approximation to a Scalar Function

If $\varphi(X)$ is a continuously differentiable function of $X$ in the region $D$ which contains the point $X_0$, the gradient $\nabla\varphi(X_0)$ can be used to approximate values of the function $\varphi(X)$ at points $X \in D$ lying close to $X_0$. The formula for this approximation relationship is
$$\varphi(X) \approx \varphi(X_0) + \nabla\varphi(X_0)(X - X_0) = \varphi(X_0) + \sum_{k=1}^{n} \frac{\partial \varphi}{\partial x_k}(X_0)(x_k - x_{0,k}).$$
The right hand side defines a linear function of the vector variable $X$, which we call the linear approximation to $\varphi(X)$ at the point $X_0$. We will denote this linear function by $L_{\varphi, X_0}(X)$.

Figure 1
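The approximation formula can be tried out numerically. This sketch builds $L_{\varphi, X_0}$ from the analytic gradient of the Example 1 function; the base point $(1, 2)$ and the nearby evaluation point are our own illustrative choices.

```python
import numpy as np

def phi(x, y):
    # Example 1: phi(x, y) = x^2 y + y^3 / 3
    return x**2 * y + y**3 / 3

def grad_phi(x, y):
    # Analytic gradient of phi: (2xy, x^2 + y^2)
    return np.array([2 * x * y, x**2 + y**2])

def linear_approx(x0, y0, x, y):
    """L_{phi, X0}(X) = phi(X0) + grad phi(X0) . (X - X0)."""
    return phi(x0, y0) + grad_phi(x0, y0) @ np.array([x - x0, y - y0])

# Near the base point (1, 2) the linear approximation tracks phi closely.
print(phi(1.05, 2.1))                  # actual value
print(linear_approx(1, 2, 1.05, 2.1))  # first order estimate
```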

This relationship can be illustrated for $n = 2$, $\varphi = \varphi(x,y)$, by noting that in this case the graph of $z = L_{\varphi, (x_0, y_0)}(x, y)$ is a plane in $\mathbb{R}^3$ which is tangent to the graph of $z = \varphi(x, y)$ at the point $(x_0, y_0, \varphi(x_0, y_0))$.

Example 2  Let $w = \varphi(x,y,z) = x^2 + y^3 + z^4$. Taking the base point $X_0 = (x_0, y_0, z_0) = (1, 1, 1)$ we have $w_0 \equiv \varphi(1,1,1) = 3$. Now
$$\nabla\varphi(x,y,z) = (2x\;\; 3y^2\;\; 4z^3); \quad \nabla\varphi(1,1,1) = (2\;\; 3\;\; 4).$$
Accordingly, the linear approximation to $\varphi(x,y,z)$ as given by the above formula is
$$w = 3 + (2\;\; 3\;\; 4)\begin{pmatrix} x - 1 \\ y - 1 \\ z - 1 \end{pmatrix} = 3 + 2(x-1) + 3(y-1) + 4(z-1) = -6 + 2x + 3y + 4z.$$
The linear approximation can be used to estimate the value of $\varphi$ at points $X$ near the base point $X_0$ (with some degree of error corresponding to the term $o(\|X - X_0\|)$). Thus, in the example just studied, if we take $(x,y,z) = (1.1, .9, 1.05)$, the linear approximation gives the value
$$L_{\varphi,(1,1,1)}(1.1, .9, 1.05) = -6 + 2(1.1) + 3(.9) + 4(1.05) = 3.1,$$
whereas the actual value is $\varphi(1.1, .9, 1.05) = (1.1)^2 + (.9)^3 + (1.05)^4 \approx 3.1545$.

Basis for the Approximation; Error

We want to see why the linear approximation, involving the gradient vector, works as it does, and to obtain a bound on the error incurred in use of this approximation. For clarity and brevity we carry out the calculations for the case $n = 2$; the argument in a larger number of variables is essentially identical.

Let $\varphi(x,y)$ be continuously differentiable in a region $D$, let $X_0 = (x_0, y_0)$ be a point in $D$, and let $N(X_0, r)$ be a disc of positive radius $r$, centered at $X_0$, which still lies in $D$. Let $X = (x, y)$ be a point in $N(X_0, r)$. We consider the difference
$$\varphi(X) - \varphi(X_0) = \varphi(x,y) - \varphi(x_0,y_0) = \left[ \varphi(x,y) - \varphi(x,y_0) \right] + \left[ \varphi(x,y_0) - \varphi(x_0,y_0) \right].$$

Applying the Mean Value Theorem twice, we have
$$\varphi(x,y) - \varphi(x,y_0) + \varphi(x,y_0) - \varphi(x_0,y_0) = \frac{\partial \varphi}{\partial y}(x, \eta)(y - y_0) + \frac{\partial \varphi}{\partial x}(\xi, y_0)(x - x_0),$$
where $\xi$ lies between $x_0$ and $x$ and $\eta$ lies between $y_0$ and $y$. Then, clearly,
$$\varphi(x,y) - \varphi(x_0,y_0) = \frac{\partial \varphi}{\partial y}(x_0, y_0)(y - y_0) + \frac{\partial \varphi}{\partial x}(x_0, y_0)(x - x_0)$$
$$\qquad + \left( \frac{\partial \varphi}{\partial y}(x, \eta) - \frac{\partial \varphi}{\partial y}(x_0, y_0) \right)(y - y_0) + \left( \frac{\partial \varphi}{\partial x}(\xi, y_0) - \frac{\partial \varphi}{\partial x}(x_0, y_0) \right)(x - x_0)$$
$$= \nabla\varphi(x_0, y_0)(X - X_0) + \gamma(X_0, X)(X - X_0),$$
where, in the error term $\gamma(X_0, X)(X - X_0)$,
$$\gamma(X_0, X) = \left( \frac{\partial \varphi}{\partial x}(\xi, y_0) - \frac{\partial \varphi}{\partial x}(x_0, y_0),\;\; \frac{\partial \varphi}{\partial y}(x, \eta) - \frac{\partial \varphi}{\partial y}(x_0, y_0) \right).$$
Applying the Schwarz inequality to the error term we have
$$\left| \gamma(X_0, X)(X - X_0) \right| \le \|\gamma(X_0, X)\| \, \|X - X_0\|.$$
Because $\varphi(X)$ is continuously differentiable,
$$\lim_{X \to X_0} \|\gamma(X_0, X)\| = \lim_{X \to X_0} \left\| \left( \frac{\partial \varphi}{\partial x}(\xi, y_0) - \frac{\partial \varphi}{\partial x}(x_0, y_0),\;\; \frac{\partial \varphi}{\partial y}(x, \eta) - \frac{\partial \varphi}{\partial y}(x_0, y_0) \right) \right\| = 0.$$
This means that the error $\gamma(X_0, X)(X - X_0)$ has the property
$$\lim_{X \to X_0} \frac{\left| \gamma(X_0, X)(X - X_0) \right|}{\|X - X_0\|} = 0,$$
a relationship which we express by saying that
$$\varphi(X) = \varphi(X_0) + \nabla\varphi(X_0)(X - X_0) + o(\|X - X_0\|);$$
i.e., the error in the approximation tends to zero, as $\|X - X_0\| \to 0$, more rapidly than any multiple of $\|X - X_0\|$.

It should be noted that the error estimate just obtained depends critically on the continuity of the partial derivatives of the function $\varphi(X)$. Without that property one can show, with appropriate examples, that such an estimate need not hold. The approximation relationship holds in exactly the same way in $\mathbb{R}^n$, for a general positive integer $n$, as we have shown it to hold for $n = 2$.
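A short numerical sketch (illustrative, using the Example 2 function) confirms the $3.1$ versus $3.1545$ estimate and exhibits the $o(\|X - X_0\|)$ behaviour of the error: the ratio error$/\|X - X_0\|$ shrinks as $X$ approaches $X_0$. The approach direction and step sizes are arbitrary choices.

```python
import numpy as np

def phi(X):
    # Example 2: phi(x, y, z) = x^2 + y^3 + z^4
    x, y, z = X
    return x**2 + y**3 + z**4

X0 = np.array([1.0, 1.0, 1.0])
grad_X0 = np.array([2.0, 3.0, 4.0])  # analytic gradient (2x, 3y^2, 4z^3) at (1, 1, 1)

def L(X):
    """Linear approximation L_{phi, X0}(X) = phi(X0) + grad phi(X0) . (X - X0)."""
    return phi(X0) + grad_X0 @ (X - X0)

X = np.array([1.1, 0.9, 1.05])
print(L(X), phi(X))  # the 3.1 estimate versus the actual value near 3.1545

# o(||X - X0||) error: error / ||X - X0|| shrinks as the distance shrinks.
direction = (X - X0) / np.linalg.norm(X - X0)
for h in (0.1, 0.05, 0.025):
    Xh = X0 + h * direction
    print(h, abs(phi(Xh) - L(Xh)) / h)  # ratios decrease toward 0
```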

Proposition (Chain Rule)  Suppose $X(t)$ is a continuously differentiable curve in $\mathbb{R}^n$ and $\varphi(X)$ is a continuously differentiable function of $X \in D \subseteq \mathbb{R}^n$. Then
$$\frac{d}{dt}\varphi(X(t)) = \nabla\varphi(X(t))\, X'(t).$$
Fix a value of $t$, say $t_0$. Then
$$X(t) = X(t_0) + X'(t_0)(t - t_0) + o(|t - t_0|)$$
as $t \to t_0$. Using the formula for the first order approximation to $\varphi(X)$ at $X = X(t_0)$ we find that
$$\varphi(X(t)) = \varphi(X(t_0)) + \nabla\varphi(X(t_0))\left( X(t) - X(t_0) \right) + o(\|X(t) - X(t_0)\|)$$
$$= \varphi(X(t_0)) + \nabla\varphi(X(t_0))\left( X'(t_0)(t - t_0) + o(|t - t_0|) \right) + o(\|X(t) - X(t_0)\|)$$
$$= \varphi(X(t_0)) + \nabla\varphi(X(t_0))\, X'(t_0)(t - t_0) + o(|t - t_0|).$$
Dividing by $t - t_0$ we then have
$$\frac{\varphi(X(t)) - \varphi(X(t_0))}{t - t_0} = \nabla\varphi(X(t_0))\, X'(t_0) + \frac{o(|t - t_0|)}{t - t_0}.$$
Letting $t \to t_0$, the difference quotient on the left approaches $\frac{d}{dt}\varphi(X(t))\big|_{t = t_0}$ while the expression at the right approaches $\nabla\varphi(X(t_0))\, X'(t_0)$. Thus
$$\frac{d\varphi}{dt}(X(t))\Big|_{t = t_0} = \nabla\varphi(X(t_0))\, X'(t_0).$$
Since $t_0$ could be any value of $t$, the result follows.

An important special case occurs when we take $X(t)$ to be the straight line
$$X(t) = X_0 + t\,U, \quad \|U\| = 1.$$
We then obtain, since $\frac{dX}{dt} \equiv U$,
$$\frac{d\varphi}{dt}(X_0) = \frac{d\varphi}{dt}(X(0)) = \nabla\varphi(X_0)\, U \equiv \frac{d\varphi}{dU}(X_0).$$

This is called the directional derivative of $\varphi(X)$ in the direction $U$ at the point $X_0$. It represents the slope of the graph of $\varphi(X)$ in the direction corresponding to the unit vector $U$. Using the Schwarz inequality again we obtain
$$\left| \frac{d\varphi}{dU}(X_0) \right| = \left| \nabla\varphi(X_0)\, U \right| \le \|\nabla\varphi(X_0)\| \, \|U\| = \|\nabla\varphi(X_0)\|,$$
with the inequality holding strictly unless $U$ is collinear with $\nabla\varphi(X_0)$. The two unit vectors collinear with $\nabla\varphi(X_0)$ are
$$U_{\pm} = \pm \frac{\nabla\varphi(X_0)'}{\|\nabla\varphi(X_0)\|},$$
provided $\nabla\varphi(X_0)'$ is not the zero (column) vector. For those the corresponding directional derivatives are
$$\nabla\varphi(X_0)\, U_{\pm} = \pm \frac{\nabla\varphi(X_0)\, \nabla\varphi(X_0)'}{\|\nabla\varphi(X_0)\|} = \pm \|\nabla\varphi(X_0)\|.$$
The unit vectors $U_+$, $U_-$ thus correspond to the extreme values of the directional derivative $\frac{d\varphi}{dU}(X_0)$; the uphill direction $U_+$ yields its maximum and the downhill direction $U_-$ yields its minimum, those maximum and minimum values being $\|\nabla\varphi(X_0)\|$ and $-\|\nabla\varphi(X_0)\|$, respectively.

Suggested Exercises

1. Find the gradients, as vector functions of $X$, for
$$\varphi(X) = \varphi(x,y) = \sin(x^2 + y^2) + 2x - 3y; \qquad \psi(X) = \psi(x,y,z) = x^2 + 2xy^2 + y^4 - z^2.$$

2. Find the linear approximation to $\psi(x,y,z)$, as defined in 1., at the point $X = (1,\, 1,\, 2)'$. Use the linear approximation to estimate the values of $\psi$ at the points $(1.2,\, 1.2,\, 2.3)'$ and $(1.4,\, 1.4,\, 2.5)'$. Where is the approximation most effective? Why?

3. Let $\varphi(x,y)$ be as defined in 1. Find all unit vectors $U$ such that $\frac{d\varphi}{dU}(-1, 2) = 0$.

4. Let $X(t)$ be the curve described by $X(t) = (t + 1,\; t^3 - t^2,\; t)'$ and let $\psi(X)$ be as defined in 1. Compute $\frac{d}{dt}\psi(X(t))$ as a function of $t$ and give its value when $t = 1$.
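The extremal property of the gradient direction derived above can also be checked numerically. In this sketch (the function, point, and sampling grid are illustrative choices, reusing Example 1) the directional derivative $\nabla\varphi(X_0)\,U$ is evaluated over a fine grid of unit vectors $U$; the largest and smallest values land very near $\pm\|\nabla\varphi(X_0)\|$.

```python
import numpy as np

def grad_phi(x, y):
    # Gradient of phi from Example 1: phi(x, y) = x^2 y + y^3 / 3
    return np.array([2 * x * y, x**2 + y**2])

g = grad_phi(1.0, 2.0)  # equals (4, 5) at the point (1, 2)

# Directional derivatives d(phi)/dU = grad . U over many unit vectors U.
angles = np.linspace(0, 2 * np.pi, 3600, endpoint=False)
units = np.column_stack([np.cos(angles), np.sin(angles)])
dir_derivs = units @ g

# The maximum over all sampled directions approximates ||grad||, attained
# when U points along the gradient; the minimum approximates -||grad||.
print(dir_derivs.max(), np.linalg.norm(g))
print(dir_derivs.min(), -np.linalg.norm(g))
```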