Newton's Method for Unconstrained Optimization
Robert M. Freund
February, Massachusetts Institute of Technology
1 Newton's Method

Suppose we want to solve:

$$(\mathrm{P}:)\quad \min_{x \in \mathbb{R}^n} f(x).$$

At $x = \bar{x}$, $f(x)$ can be approximated by:

$$f(x) \approx h(x) := f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) + \tfrac{1}{2}(x - \bar{x})^T H(\bar{x})(x - \bar{x}),$$

which is the quadratic Taylor expansion of $f(x)$ at $x = \bar{x}$. Here $\nabla f(x)$ is the gradient of $f(x)$ and $H(x)$ is the Hessian of $f(x)$.

Notice that $h(x)$ is a quadratic function, which is minimized by solving $\nabla h(x) = 0$. Since the gradient of $h(x)$ is:

$$\nabla h(x) = \nabla f(\bar{x}) + H(\bar{x})(x - \bar{x}),$$

we therefore are motivated to solve:

$$\nabla f(\bar{x}) + H(\bar{x})(x - \bar{x}) = 0,$$

which yields

$$x - \bar{x} = -H(\bar{x})^{-1} \nabla f(\bar{x}).$$

The direction $-H(\bar{x})^{-1} \nabla f(\bar{x})$ is called the Newton direction, or the Newton step, at $x = \bar{x}$. This leads to the following algorithm for solving (P):

Newton's Method:

Step 0: Given $x^0$, set $k \leftarrow 0$.
Step 1: Compute $d^k = -H(x^k)^{-1} \nabla f(x^k)$. If $d^k = 0$, then stop.
Step 2: Choose the step-size $\alpha_k = 1$.
Step 3: Set $x^{k+1} \leftarrow x^k + \alpha_k d^k$, $k \leftarrow k + 1$. Go to Step 1.
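As a concrete illustration, here is a minimal NumPy sketch of the algorithm above. It is an addition to these notes, not part of the original: the names `newtons_method`, `grad`, `hess` and the stopping tolerance are illustrative choices, and a practical implementation would add the line-search and Hessian safeguards discussed below.

```python
import numpy as np

def newtons_method(grad, hess, x0, max_iters=50, tol=1e-12):
    """Pure Newton's method with step-size alpha_k = 1 (no line-search)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        # Step 1: the Newton direction d^k solves H(x^k) d = -grad f(x^k);
        # solving the linear system avoids forming H^{-1} explicitly.
        d = np.linalg.solve(hess(x), -grad(x))
        if np.linalg.norm(d) <= tol:  # d^k = 0 (to numerical tolerance): stop
            break
        x = x + d  # Steps 2-3: alpha_k = 1, take the full Newton step
    return x

# Example 1 from later in these notes: f(x) = 7x - ln(x), minimized at x* = 1/7.
grad = lambda x: np.array([7.0 - 1.0 / x[0]])
hess = lambda x: np.array([[1.0 / x[0] ** 2]])
print(newtons_method(grad, hess, [0.1]))  # approx. [0.14285714]
```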
Note the following:

- The method assumes $H(x^k)$ is nonsingular at each iteration.
- There is no guarantee that $f(x^{k+1}) \le f(x^k)$.
- Step 2 could be augmented by a line-search of $f(x^k + \alpha d^k)$ to find an optimal value of the step-size parameter $\alpha$.

Recall that we call a matrix SPD if it is symmetric and positive definite.

Proposition 1.1 If $H(x)$ is SPD and $d := -H(x)^{-1} \nabla f(x) \neq 0$, then $d$ is a descent direction, i.e., $f(x + \alpha d) < f(x)$ for all sufficiently small values of $\alpha$.

Proof: It is sufficient to show that $\nabla f(x)^T d = -\nabla f(x)^T H(x)^{-1} \nabla f(x) < 0$. This will clearly be the case if $H(x)^{-1}$ is SPD. Since $H(x)$ is SPD, if $v \neq 0$,

$$0 < (H(x)^{-1} v)^T H(x) (H(x)^{-1} v) = v^T H(x)^{-1} v,$$

thereby showing that $H(x)^{-1}$ is SPD.

1.1 Rates of convergence

A sequence of numbers $\{s_i\}$ exhibits linear convergence if $\lim_{i \to \infty} s_i = \bar{s}$ and

$$\lim_{i \to \infty} \frac{|s_{i+1} - \bar{s}|}{|s_i - \bar{s}|} = \delta < 1.$$

If $\delta = 0$ in the above expression, the sequence exhibits superlinear convergence. A sequence of numbers $\{s_i\}$ exhibits quadratic convergence if $\lim_{i \to \infty} s_i = \bar{s}$ and

$$\lim_{i \to \infty} \frac{|s_{i+1} - \bar{s}|}{|s_i - \bar{s}|^2} = \delta < \infty.$$

1.1.1 Examples of Rates of Convergence

Linear convergence: $s_i = (1/10)^i$: $0.1, 0.01, 0.001$, etc. Here $\bar{s} = 0$ and

$$\frac{|s_{i+1} - \bar{s}|}{|s_i - \bar{s}|} = 0.1.$$

Superlinear convergence: $s_i = \frac{1}{i!}$: $1, \frac{1}{2}, \frac{1}{6}, \frac{1}{24}, \frac{1}{120}$, etc. Here $\bar{s} = 0$ and

$$\frac{|s_{i+1} - \bar{s}|}{|s_i - \bar{s}|} = \frac{i!}{(i+1)!} = \frac{1}{i+1} \to 0 \text{ as } i \to \infty.$$

Quadratic convergence: $s_i = (1/10)^{(2^i)}$: $0.1, 0.01, 0.0001, 0.00000001$, etc. Here $\bar{s} = 0$ and

$$\frac{|s_{i+1} - \bar{s}|}{|s_i - \bar{s}|^2} = \frac{(10^{2^i})^2}{10^{2^{i+1}}} = 1.$$
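The three rates are easy to probe numerically. The short sketch below is an illustrative addition, not part of the original notes; it prints the successive ratios from the definitions for the three example sequences above.

```python
import math

# sbar = 0 for all three sequences; print the ratios from the definitions:
# |s_{i+1}| / |s_i|   for linear and superlinear convergence,
# |s_{i+1}| / |s_i|^2 for quadratic convergence.
linear = lambda i: 0.1 ** i                      # ratio stays at 0.1
superlinear = lambda i: 1.0 / math.factorial(i)  # ratio 1/(i+1) -> 0
quadratic = lambda i: 0.1 ** (2 ** i)            # squared-error ratio stays ~1

for i in range(1, 5):
    print(linear(i + 1) / linear(i),
          superlinear(i + 1) / superlinear(i),
          quadratic(i + 1) / quadratic(i) ** 2)
```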
1.2 Quadratic Convergence of Newton's Method

We have the following quadratic convergence theorem. In the theorem, we use the operator norm of a matrix $M$:

$$\|M\| := \max_x \{\|Mx\| : \|x\| = 1\}.$$

Theorem 1.1 (Quadratic Convergence Theorem) Suppose $f(x)$ is twice continuously differentiable and $x^*$ is a point for which $\nabla f(x^*) = 0$. Suppose that $H(x)$ satisfies the following conditions:

- there exists a scalar $\bar{h} > 0$ for which $\|[H(x^*)]^{-1}\| \le \frac{1}{\bar{h}}$
- there exist scalars $\beta > 0$ and $L > 0$ for which $\|H(x) - H(x^*)\| \le L\|x - x^*\|$ for all $x$ satisfying $\|x - x^*\| \le \beta$

Let $x$ satisfy $\|x - x^*\| < \gamma := \min\left\{\beta, \frac{2\bar{h}}{3L}\right\}$, and let $x_N := x - H(x)^{-1} \nabla f(x)$. Then:

(i) $\|x_N - x^*\| \le \|x - x^*\|^2 \left(\dfrac{L}{2(\bar{h} - L\|x - x^*\|)}\right)$

(ii) $\|x_N - x^*\| < \|x - x^*\| < \gamma$
(iii) $\|x_N - x^*\| \le \|x - x^*\|^2 \left(\dfrac{3L}{2\bar{h}}\right)$

Example 1: Let $f(x) = 7x - \ln(x)$. Then $\nabla f(x) = f'(x) = 7 - \frac{1}{x}$ and $H(x) = f''(x) = \frac{1}{x^2}$. It is not hard to check that $x^* = \frac{1}{7} = 0.142857143$ is the unique global minimum. The Newton direction at $x$ is

$$d = -H(x)^{-1} \nabla f(x) = -\frac{f'(x)}{f''(x)} = -x^2 \left(7 - \frac{1}{x}\right) = x - 7x^2.$$

Newton's method will generate the sequence of iterates $\{x^k\}$ satisfying:

$$x^{k+1} = x^k + (x^k - 7(x^k)^2) = 2x^k - 7(x^k)^2.$$

Below are some examples of the sequences generated by this method for different starting points.

[Table of iterates $x^k$ for four different starting points; the numerical entries did not survive transcription.]

By the way, the range of quadratic convergence for Newton's method for this function happens to be $x \in (0, 2/7)$.
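Since the lost table is fully determined by the recursion above, it is easy to regenerate. The sketch below is an illustrative addition: the starting points are my own choices, not the columns of the original table.

```python
# Regenerate the flavor of Example 1's table: x^{k+1} = 2 x^k - 7 (x^k)^2.
# Starting points are illustrative; 0.30 lies outside the range (0, 2/7).
for x0 in [0.1, 0.14, 0.25, 0.30]:
    x, row = x0, [x0]
    for k in range(6):
        x = 2 * x - 7 * x ** 2
        row.append(x)
    print([f"{v: .9f}" for v in row])
```

Running it shows the first three columns collapsing onto $1/7$ with roughly doubling digits of accuracy per step, while the last column diverges.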
Example 2: $f(x) = -\ln(1 - x_1 - x_2) - \ln x_1 - \ln x_2$,

$$\nabla f(x) = \begin{pmatrix} \frac{1}{1 - x_1 - x_2} - \frac{1}{x_1} \\[1mm] \frac{1}{1 - x_1 - x_2} - \frac{1}{x_2} \end{pmatrix}, \qquad H(x) = \begin{pmatrix} \left(\frac{1}{1 - x_1 - x_2}\right)^2 + \left(\frac{1}{x_1}\right)^2 & \left(\frac{1}{1 - x_1 - x_2}\right)^2 \\[1mm] \left(\frac{1}{1 - x_1 - x_2}\right)^2 & \left(\frac{1}{1 - x_1 - x_2}\right)^2 + \left(\frac{1}{x_2}\right)^2 \end{pmatrix},$$

$$x^* = \left(\tfrac{1}{3}, \tfrac{1}{3}\right), \qquad f(x^*) = 3\ln 3 = 3.295836866.$$

[Table of iterates $(x_1^k, x_2^k)$ and errors $\|x^k - x^*\|$; the numerical entries did not survive transcription.]
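A short sketch that reproduces the flavor of the lost table for Example 2, using the gradient and Hessian formulas above; the starting point is an illustrative choice, not the one used in the original table.

```python
import numpy as np

# Example 2: f(x) = -ln(1 - x1 - x2) - ln(x1) - ln(x2).
def grad(x):
    s = 1.0 / (1.0 - x[0] - x[1])
    return np.array([s - 1.0 / x[0], s - 1.0 / x[1]])

def hess(x):
    s2 = 1.0 / (1.0 - x[0] - x[1]) ** 2
    return np.array([[s2 + 1.0 / x[0] ** 2, s2],
                     [s2, s2 + 1.0 / x[1] ** 2]])

x = np.array([0.1, 0.4])            # illustrative interior starting point
xstar = np.array([1.0, 1.0]) / 3.0
for k in range(6):
    x = x + np.linalg.solve(hess(x), -grad(x))  # full Newton step
    print(k + 1, x, np.linalg.norm(x - xstar))  # error shrinks quadratically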
Comments:

- The convergence rate is quadratic: $\|x_N - x^*\| \le \|x - x^*\|^2 \left(\frac{3L}{2\bar{h}}\right)$.
- We typically never know $\beta$, $\bar{h}$, or $L$. However, there are some amazing exceptions, for example $f(x) = -\sum_{j=1}^{n} \ln(x_j)$, as we will soon see.
- The constants $\beta$, $\bar{h}$, and $L$ depend on the choice of norm used, but the method does not. This is a drawback in the concept. But we do not know $\beta$, $\bar{h}$, or $L$ anyway.
- In the limit we obtain $\|x_N - x^*\| \le \|x - x^*\|^2 \left(\frac{L}{2\bar{h}}\right)$.
- We did not assume convexity, only that $H(x^*)$ is nonsingular and not badly behaved near $x^*$.
- One can view Newton's method as trying successively to solve $\nabla f(x) = 0$ by successive linear approximations.
- Note from the statement of the convergence theorem that the iterates of Newton's method are equally attracted to local minima and local maxima. Indeed, the method is just trying to solve $\nabla f(x) = 0$.
- What if $H(x^k)$ becomes increasingly singular (or not positive definite)? In this case, one way to fix this is to use $H(x^k) + \epsilon I$. (1)
- Comparison with steepest descent: one can think of steepest descent as $\epsilon \to +\infty$ in (1) above.
- The work per iteration of Newton's method is $O(n^3)$.
- So-called quasi-Newton methods use approximations of $H(x^k)$ at each iteration in an attempt to do less work per iteration.

2 Proof of Theorem 1.1

The proof relies on the following two elementary facts. For the first fact, let $\|v\|$ denote the usual Euclidean norm of a vector, namely $\|v\| := \sqrt{v^T v}$. The operator norm of a matrix $M$ is defined as follows:

$$\|M\| := \max_x \{\|Mx\| : \|x\| = 1\}.$$

Proposition 2.1 Suppose that $M$ is a symmetric matrix. Then the following are equivalent:

1. $h > 0$ satisfies $\|M^{-1}\| \le \frac{1}{h}$
2. $h > 0$ satisfies $\|Mv\| \ge h\|v\|$ for any vector $v$
You are asked to prove this as part of your homework for the class.

Proposition 2.2 Suppose that $f(x)$ is twice differentiable. Then

$$\nabla f(z) - \nabla f(x) = \int_0^1 [H(x + t(z - x))](z - x)\,dt.$$

Proof: Let $\phi(t) := \nabla f(x + t(z - x))$. Then $\phi(0) = \nabla f(x)$ and $\phi(1) = \nabla f(z)$, and $\phi'(t) = [H(x + t(z - x))](z - x)$. From the fundamental theorem of calculus, we have:

$$\nabla f(z) - \nabla f(x) = \phi(1) - \phi(0) = \int_0^1 \phi'(t)\,dt = \int_0^1 [H(x + t(z - x))](z - x)\,dt.$$

Proof of Theorem 1.1: We have:

$$\begin{aligned}
x_N - x^* &= x - H(x)^{-1}\nabla f(x) - x^* \\
&= x - x^* + H(x)^{-1}(\nabla f(x^*) - \nabla f(x)) \\
&= x - x^* + H(x)^{-1}\int_0^1 [H(x + t(x^* - x))](x^* - x)\,dt \quad \text{(from Proposition 2.2)} \\
&= H(x)^{-1}\int_0^1 [H(x + t(x^* - x)) - H(x)](x^* - x)\,dt.
\end{aligned}$$
Therefore

$$\begin{aligned}
\|x_N - x^*\| &\le \int_0^1 \|H(x)^{-1}\| \, \|[H(x + t(x^* - x)) - H(x)](x^* - x)\| \, dt \\
&\le \|x^* - x\| \, \|H(x)^{-1}\| \int_0^1 L\,t\,\|x^* - x\| \, dt \\
&= \|x^* - x\|^2 \, \|H(x)^{-1}\| \, L \int_0^1 t \, dt \\
&= \|x^* - x\|^2 \, \|H(x)^{-1}\| \, \frac{L}{2}.
\end{aligned}$$

We now bound $\|H(x)^{-1}\|$. Let $v$ be any vector. Then

$$\begin{aligned}
\|H(x)v\| &= \|H(x^*)v + (H(x) - H(x^*))v\| \\
&\ge \|H(x^*)v\| - \|(H(x) - H(x^*))v\| \\
&\ge \bar{h}\|v\| - \|H(x) - H(x^*)\|\,\|v\| \quad \text{(from Proposition 2.1)} \\
&\ge \bar{h}\|v\| - L\|x - x^*\|\,\|v\| \\
&= (\bar{h} - L\|x - x^*\|)\,\|v\|.
\end{aligned}$$

Invoking Proposition 2.1 again, we see that this implies that

$$\|H(x)^{-1}\| \le \frac{1}{\bar{h} - L\|x - x^*\|}.$$

Combining this with the above yields

$$\|x_N - x^*\| \le \|x - x^*\|^2 \, \frac{L}{2(\bar{h} - L\|x - x^*\|)},$$

which is (i) of the theorem. Because $\|x - x^*\| < \frac{2\bar{h}}{3L}$ we have:

$$\|x_N - x^*\| \le \|x - x^*\| \, \frac{L\|x - x^*\|}{2(\bar{h} - L\|x - x^*\|)} < \|x - x^*\| \, \frac{\frac{2\bar{h}}{3}}{2\left(\bar{h} - \frac{2\bar{h}}{3}\right)} = \|x - x^*\|,$$
which establishes (ii) of the theorem. Finally, we have

$$\|x_N - x^*\| \le \|x - x^*\|^2 \, \frac{L}{2(\bar{h} - L\|x - x^*\|)} \le \|x - x^*\|^2 \, \frac{L}{2\left(\bar{h} - \frac{2\bar{h}}{3}\right)} = \frac{3L}{2\bar{h}} \|x - x^*\|^2,$$

which establishes (iii) of the theorem.

3 Newton's Method Exercises

1. (Newton's Method) Suppose we want to minimize the following function:

$$f(x) = 9x - 4\ln(x - 7)$$

over the domain $X = \{x \mid x > 7\}$ using Newton's method.

(a) Give an exact formula for the Newton iterate for a given value of $x$.

(b) Using a calculator (or a computer, if you wish), compute five iterations of Newton's method starting at each of the following points, and record your answers: $x^0 = 7.4$, $x^0 = 7.2$, $x^0 = 7.01$, $x^0 = 7.8$, $x^0 = 7.88$.

(c) Verify empirically that Newton's method will converge to the optimal solution for all starting values of $x$ in the range $(7, 7.8888)$. What behavior does Newton's method exhibit outside of this range?

2. (Newton's Method) Suppose we want to minimize the following function:

$$f(x) = 6x - 4\ln(x - 2) - 3\ln(25 - x)$$

over the domain $X = \{x \mid 2 < x < 25\}$ using Newton's method.

(a) Using a calculator (or a computer, if you wish), compute five iterations of Newton's method starting at each of the following points, and record your answers:
$x^0 = 2.6$, $x^0 = 2.7$, $x^0 = 2.4$, $x^0 = 2.8$, $x^0 = 3.0$.

(b) Verify empirically that Newton's method will converge to the optimal solution for all starting values of $x$ in the range $(2, 3.05)$. What behavior does Newton's method exhibit outside of this range?

3. (Newton's Method) Suppose that we seek to minimize the following function:

$$f(x_1, x_2) = -9x_1 - 10x_2 + \theta(-\ln(100 - x_1 - x_2) - \ln(x_1) - \ln(x_2) - \ln(50 - x_1 + x_2)),$$

where $\theta$ is a given parameter, on the domain $X = \{(x_1, x_2) \mid x_1 > 0,\ x_2 > 0,\ x_1 + x_2 < 100,\ x_1 - x_2 < 50\}$. This exercise asks you to implement Newton's method on this problem, first without a line-search, and then with a line-search. Run your algorithm for $\theta = 10$ and for $\theta = 100$, using the following starting points:

- $x^0 = (8, 90)^T$
- $x^0 = (1, 40)^T$
- $x^0 = (15, 68.69)^T$
- $x^0 = (10, 20)^T$

(a) When you run Newton's method without a line-search for this problem and with these starting points, what behavior do you observe?

(b) When you run Newton's method with a line-search for this problem, what behavior do you observe?

4. (Projected Newton's Method) Prove Proposition 6.1 of the notes on Projected Steepest Descent.

5. (Newton's Method) In class we described Newton's method as a method for finding a point $x^*$ for which $\nabla f(x^*) = 0$. Now consider the following setting, where we have $n$ nonlinear equations in $n$ unknowns $x = (x_1, \ldots, x_n)$:
$$g_1(x) = 0, \quad \ldots, \quad g_n(x) = 0,$$

which we conveniently write as $g(x) = 0$.

Let $J(x)$ denote the Jacobian matrix (of partial derivatives) of $g(x)$. Then at $x = \bar{x}$ we have

$$g(\bar{x} + d) \approx g(\bar{x}) + J(\bar{x})d,$$

this being the linear approximation of $g(x)$ at $x = \bar{x}$. Construct a version of Newton's method for solving the equation system $g(x) = 0$. (A minimal sketch of one such construction appears after this exercise list.)

6. (Newton's Method) Suppose that $f(x)$ is a strictly convex twice-continuously differentiable function, and consider Newton's method with a line-search. Given $\bar{x}$, we compute the Newton direction $d = -[H(\bar{x})]^{-1} \nabla f(\bar{x})$, and the next iterate $\tilde{x}$ is chosen to satisfy:

$$\tilde{x} := \arg\min_{\alpha} f(\bar{x} + \alpha d).$$

Prove that the iterates of this method converge to the unique global minimum of $f(x)$.

7. Prove Proposition 2.1 of the notes on Newton's method.

8. Bertsekas, Exercise 1.4.1, page 99.
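For orientation on Exercise 5: setting the linear approximation to zero suggests solving $J(x^k)d = -g(x^k)$ and setting $x^{k+1} = x^k + d$. Below is a minimal NumPy sketch of this construction, added here for illustration; the 2x2 test system, function names, and starting point are assumptions of mine, not part of the notes.

```python
import numpy as np

def newton_system(g, jac, x0, iters=20, tol=1e-12):
    """Newton's method for g(x) = 0: at each step solve J(x^k) d = -g(x^k)
    and set x^{k+1} = x^k + d. The names g, jac are illustrative."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        d = np.linalg.solve(jac(x), -g(x))
        x = x + d
        if np.linalg.norm(d) <= tol:
            break
    return x

# Illustrative 2x2 system: x1^2 + x2^2 = 1 and x1 = x2.
g = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 1.0, x[0] - x[1]])
jac = lambda x: np.array([[2 * x[0], 2 * x[1]], [1.0, -1.0]])
print(newton_system(g, jac, [2.0, 0.5]))  # -> approx. (0.7071, 0.7071)
```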