3. Convex functions. basic properties and examples. operations that preserve convexity. the conjugate function. quasiconvex functions


 Rodney Hardy
 1 years ago
 Views:
Transcription
1 3. Convex functions Convex Optimization Boyd & Vandenberghe basic properties and examples operations that preserve convexity the conjugate function quasiconvex functions logconcave and logconvex functions convexity with respect to generalized inequalities 3 1
2 Definition f : R n R is convex if domf is a convex set and f(θx+(1 θ)y) θf(x)+(1 θ)f(y) for all x,y domf, 0 θ 1 (x,f(x)) (y,f(y)) f is concave if f is convex f is strictly convex if domf is convex and f(θx+(1 θ)y) < θf(x)+(1 θ)f(y) for x,y domf, x y, 0 < θ < 1 Convex functions 3 2
3 Examples on R convex: affine: ax+b on R, for any a,b R exponential: e ax, for any a R powers: x α on R ++, for α 1 or α 0 powers of absolute value: x p on R, for p 1 negative entropy: xlogx on R ++ concave: affine: ax+b on R, for any a,b R powers: x α on R ++, for 0 α 1 logarithm: logx on R ++ Convex functions 3 3
4 Examples on R n and R m n affine functions are convex and concave; all norms are convex examples on R n affine function f(x) = a T x+b norms: x p = ( n i=1 x i p ) 1/p for p 1; x = max k x k examples on R m n (m n matrices) affine function f(x) = tr(a T X)+b = m i=1 n A ij X ij +b j=1 spectral (maximum singular value) norm f(x) = X 2 = σ max (X) = (λ max (X T X)) 1/2 Convex functions 3 4
5 Restriction of a convex function to a line f : R n R is convex if and only if the function g : R R, g(t) = f(x+tv), domg = {t x+tv domf} is convex (in t) for any x domf, v R n can check convexity of f by checking convexity of functions of one variable example. f : S n R with f(x) = logdetx, domf = S n ++ g(t) = logdet(x +tv) = logdetx +logdet(i +tx 1/2 VX 1/2 ) n = logdetx + log(1+tλ i ) where λ i are the eigenvalues of X 1/2 VX 1/2 g is concave in t (for any choice of X 0, V); hence f is concave i=1 Convex functions 3 5
6 Extendedvalue extension extendedvalue extension f of f is f(x) = f(x), x domf, f(x) =, x domf often simplifies notation; for example, the condition 0 θ 1 = f(θx+(1 θ)y) θ f(x)+(1 θ) f(y) (as an inequality in R { }), means the same as the two conditions domf is convex for x,y domf, 0 θ 1 = f(θx+(1 θ)y) θf(x)+(1 θ)f(y) Convex functions 3 6
7 Firstorder condition f is differentiable if domf is open and the gradient f(x) = exists at each x domf ( f(x), f(x),..., f(x) ) x 1 x 2 x n 1storder condition: differentiable f with convex domain is convex iff f(y) f(x)+ f(x) T (y x) for all x,y domf f(y) f(x) + f(x) T (y x) (x,f(x)) firstorder approximation of f is global underestimator Convex functions 3 7
8 Secondorder conditions f is twice differentiable if domf is open and the Hessian 2 f(x) S n, exists at each x domf 2 f(x) ij = 2 f(x) x i x j, i,j = 1,...,n, 2ndorder conditions: for twice differentiable f with convex domain f is convex if and only if 2 f(x) 0 for all x domf if 2 f(x) 0 for all x domf, then f is strictly convex Convex functions 3 8
9 Examples quadratic function: f(x) = (1/2)x T Px+q T x+r (with P S n ) f(x) = Px+q, 2 f(x) = P convex if P 0 leastsquares objective: f(x) = Ax b 2 2 f(x) = 2A T (Ax b), 2 f(x) = 2A T A convex (for any A) quadraticoverlinear: f(x,y) = x 2 /y 2 2 f(x,y) = 2 y 3 [ y x ][ y x ] T 0 f(x,y) convex for y > 0 1 y 0 2 x 0 Convex functions 3 9
10 logsumexp: f(x) = log n k=1 expx k is convex 2 f(x) = 1 1 T z diag(z) 1 (1 T z) 2zzT (z k = expx k ) to show 2 f(x) 0, we must verify that v T 2 f(x)v 0 for all v: v T 2 f(x)v = ( k z kv 2 k )( k z k) ( k v kz k ) 2 ( k z k) 2 0 since ( k v kz k ) 2 ( k z kv 2 k )( k z k) (from CauchySchwarz inequality) geometric mean: f(x) = ( n k=1 x k) 1/n on R n ++ is concave (similar proof as for logsumexp) Convex functions 3 10
11 Epigraph and sublevel set αsublevel set of f : R n R: C α = {x domf f(x) α} sublevel sets of convex functions are convex (converse is false) epigraph of f : R n R: epif = {(x,t) R n+1 x domf, f(x) t} epif f f is convex if and only if epif is a convex set Convex functions 3 11
12 Jensen s inequality basic inequality: if f is convex, then for 0 θ 1, f(θx+(1 θ)y) θf(x)+(1 θ)f(y) extension: if f is convex, then f(ez) Ef(z) for any random variable z basic inequality is special case with discrete distribution prob(z = x) = θ, prob(z = y) = 1 θ Convex functions 3 12
13 Operations that preserve convexity practical methods for establishing convexity of a function 1. verify definition (often simplified by restricting to a line) 2. for twice differentiable functions, show 2 f(x) 0 3. show that f is obtained from simple convex functions by operations that preserve convexity nonnegative weighted sum composition with affine function pointwise maximum and supremum composition minimization perspective Convex functions 3 13
14 Positive weighted sum & composition with affine function nonnegative multiple: αf is convex if f is convex, α 0 sum: f 1 +f 2 convex if f 1,f 2 convex (extends to infinite sums, integrals) composition with affine function: f(ax+b) is convex if f is convex examples log barrier for linear inequalities f(x) = m log(b i a T i x), i=1 domf = {x a T i x < b i,i = 1,...,m} (any) norm of affine function: f(x) = Ax+b Convex functions 3 14
15 Pointwise maximum if f 1,..., f m are convex, then f(x) = max{f 1 (x),...,f m (x)} is convex examples piecewiselinear function: f(x) = max i=1,...,m (a T i x+b i) is convex sum of r largest components of x R n : f(x) = x [1] +x [2] + +x [r] is convex (x [i] is ith largest component of x) proof: f(x) = max{x i1 +x i2 + +x ir 1 i 1 < i 2 < < i r n} Convex functions 3 15
16 Pointwise supremum if f(x,y) is convex in x for each y A, then is convex examples g(x) = supf(x,y) y A support function of a set C: S C (x) = sup y C y T x is convex distance to farthest point in a set C: f(x) = sup x y y C maximum eigenvalue of symmetric matrix: for X S n, λ max (X) = sup y T Xy y 2 =1 Convex functions 3 16
17 Composition with scalar functions composition of g : R n R and h : R R: f(x) = h(g(x)) f is convex if g convex, h convex, h nondecreasing g concave, h convex, h nonincreasing proof (for n = 1, differentiable g,h) f (x) = h (g(x))g (x) 2 +h (g(x))g (x) note: monotonicity must hold for extendedvalue extension h examples expg(x) is convex if g is convex 1/g(x) is convex if g is concave and positive Convex functions 3 17
18 Vector composition composition of g : R n R k and h : R k R: f(x) = h(g(x)) = h(g 1 (x),g 2 (x),...,g k (x)) f is convex if g i convex, h convex, h nondecreasing in each argument g i concave, h convex, h nonincreasing in each argument proof (for n = 1, differentiable g,h) f (x) = g (x) T 2 h(g(x))g (x)+ h(g(x)) T g (x) examples m i=1 logg i(x) is concave if g i are concave and positive log m i=1 expg i(x) is convex if g i are convex Convex functions 3 18
19 Minimization if f(x,y) is convex in (x,y) and C is a convex set, then is convex examples g(x) = inf y C f(x,y) f(x,y) = x T Ax+2x T By +y T Cy with [ A B B T C ] 0, C 0 minimizing over y gives g(x) = inf y f(x,y) = x T (A BC 1 B T )x g is convex, hence Schur complement A BC 1 B T 0 distance to a set: dist(x,s) = inf y S x y is convex if S is convex Convex functions 3 19
20 Perspective the perspective of a function f : R n R is the function g : R n R R, g(x,t) = tf(x/t), domg = {(x,t) x/t domf, t > 0} g is convex if f is convex examples f(x) = x T x is convex; hence g(x,t) = x T x/t is convex for t > 0 negative logarithm f(x) = logx is convex; hence relative entropy g(x,t) = tlogt tlogx is convex on R 2 ++ if f is convex, then g(x) = (c T x+d)f ( (Ax+b)/(c T x+d) ) is convex on {x c T x+d > 0, (Ax+b)/(c T x+d) domf} Convex functions 3 20
21 the conjugate of a function f is The conjugate function f (y) = sup (y T x f(x)) x dom f f(x) xy x (0, f (y)) f is convex (even if f is not) will be useful in chapter 5 Convex functions 3 21
22 examples negative logarithm f(x) = logx f (y) = sup(xy +logx) = x>0 { 1 log( y) y < 0 otherwise strictly convex quadratic f(x) = (1/2)x T Qx with Q S n ++ f (y) = sup(y T x (1/2)x T Qx) x = 1 2 yt Q 1 y Convex functions 3 22
23 Quasiconvex functions f : R n R is quasiconvex if domf is convex and the sublevel sets are convex for all α S α = {x domf f(x) α} β α a b c f is quasiconcave if f is quasiconvex f is quasilinear if it is quasiconvex and quasiconcave Convex functions 3 23
24 x is quasiconvex on R Examples ceil(x) = inf{z Z z x} is quasilinear logx is quasilinear on R ++ f(x 1,x 2 ) = x 1 x 2 is quasiconcave on R 2 ++ linearfractional function is quasilinear distance ratio f(x) = at x+b c T x+d, domf = {x ct x+d > 0} f(x) = x a 2 x b 2, domf = {x x a 2 x b 2 } is quasiconvex Convex functions 3 24
25 internal rate of return cash flow x = (x 0,...,x n ); x i is payment in period i (to us if x i > 0) we assume x 0 < 0 and x 0 +x 1 + +x n > 0 present value of cash flow x, for interest rate r: PV(x,r) = n (1+r) i x i i=0 internal rate of return is smallest interest rate for which PV(x,r) = 0: IRR(x) = inf{r 0 PV(x,r) = 0} IRR is quasiconcave: superlevel set is intersection of open halfspaces IRR(x) R n (1+r) i x i > 0 for 0 r < R i=0 Convex functions 3 25
26 Properties modified Jensen inequality: for quasiconvex f 0 θ 1 = f(θx+(1 θ)y) max{f(x),f(y)} firstorder condition: differentiable f with cvx domain is quasiconvex iff f(y) f(x) = f(x) T (y x) 0 x f(x) sums of quasiconvex functions are not necessarily quasiconvex Convex functions 3 26
27 Logconcave and logconvex functions a positive function f is logconcave if logf is concave: f(θx+(1 θ)y) f(x) θ f(y) 1 θ for 0 θ 1 f is logconvex if logf is convex powers: x a on R ++ is logconvex for a 0, logconcave for a 0 many common probability densities are logconcave, e.g., normal: f(x) = 1 (2π)n detσ e 1 2 (x x)t Σ 1 (x x) cumulative Gaussian distribution function Φ is logconcave Φ(x) = 1 2π x e u2 /2 du Convex functions 3 27
28 Properties of logconcave functions twice differentiable f with convex domain is logconcave if and only if f(x) 2 f(x) f(x) f(x) T for all x domf product of logconcave functions is logconcave sum of logconcave functions is not always logconcave integration: if f : R n R m R is logconcave, then g(x) = f(x,y) dy is logconcave (not easy to show) Convex functions 3 28
29 consequences of integration property convolution f g of logconcave functions f, g is logconcave (f g)(x) = f(x y)g(y)dy if C R n convex and y is a random variable with logconcave pdf then is logconcave f(x) = prob(x+y C) proof: write f(x) as integral of product of logconcave functions p is pdf of y f(x) = g(x+y)p(y)dy, g(u) = { 1 u C 0 u C, Convex functions 3 29
30 example: yield function Y(x) = prob(x+w S) x R n : nominal parameter values for product w R n : random variations of parameters in manufactured product S: set of acceptable values if S is convex and w has a logconcave pdf, then Y is logconcave yield regions {x Y(x) α} are convex Convex functions 3 30
31 Convexity with respect to generalized inequalities f : R n R m is Kconvex if domf is convex and for x, y domf, 0 θ 1 f(θx+(1 θ)y) K θf(x)+(1 θ)f(y) example f : S m S m, f(x) = X 2 is S m +convex proof: for fixed z R m, z T X 2 z = Xz 2 2 is convex in X, i.e., z T (θx +(1 θ)y) 2 z θz T X 2 z +(1 θ)z T Y 2 z for X,Y S m, 0 θ 1 therefore (θx +(1 θ)y) 2 θx 2 +(1 θ)y 2 Convex functions 3 31
Chapter 5. Banach Spaces
9 Chapter 5 Banach Spaces Many linear equations may be formulated in terms of a suitable linear operator acting on a Banach space. In this chapter, we study Banach spaces and linear operators acting on
More informationClustering with Bregman Divergences
Journal of Machine Learning Research 6 (2005) 1705 1749 Submitted 10/03; Revised 2/05; Published 10/05 Clustering with Bregman Divergences Arindam Banerjee Srujana Merugu Department of Electrical and Computer
More informationINTRODUCTORY SET THEORY
M.Sc. program in mathematics INTRODUCTORY SET THEORY Katalin Károlyi Department of Applied Analysis, Eötvös Loránd University H1088 Budapest, Múzeum krt. 68. CONTENTS 1. SETS Set, equal sets, subset,
More informationProbability: Theory and Examples. Rick Durrett. Edition 4.1, April 21, 2013
i Probability: Theory and Examples Rick Durrett Edition 4.1, April 21, 213 Typos corrected, three new sections in Chapter 8. Copyright 213, All rights reserved. 4th edition published by Cambridge University
More information10. Proximal point method
L. Vandenberghe EE236C Spring 201314) 10. Proximal point method proximal point method augmented Lagrangian method MoreauYosida smoothing 101 Proximal point method a conceptual algorithm for minimizing
More informationRobust MeanCovariance Solutions for Stochastic Optimization
OPERATIONS RESEARCH Vol. 00, No. 0, Xxxxx 0000, pp. 000 000 issn 0030364X eissn 15265463 00 0000 0001 INFORMS doi 10.1287/xxxx.0000.0000 c 0000 INFORMS Robust MeanCovariance Solutions for Stochastic
More informationFixed Point Theorems For SetValued Maps
Fixed Point Theorems For SetValued Maps Bachelor s thesis in functional analysis Institute for Analysis and Scientific Computing Vienna University of Technology Andreas Widder, July 2009. Preface In this
More informationCritical points of once continuously differentiable functions are important because they are the only points that can be local maxima or minima.
Lecture 0: Convexity and Optimization We say that if f is a once continuously differentiable function on an interval I, and x is a point in the interior of I that x is a critical point of f if f (x) =
More informationFixed Point Theorems and Applications
Fixed Point Theorems and Applications VITTORINO PATA Dipartimento di Matematica F. Brioschi Politecnico di Milano vittorino.pata@polimi.it Contents Preface 2 Notation 3 1. FIXED POINT THEOREMS 5 The Banach
More informationInner Product Spaces and Orthogonality
Inner Product Spaces and Orthogonality week 34 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,
More informationINTERIOR POINT POLYNOMIAL TIME METHODS IN CONVEX PROGRAMMING
GEORGIA INSTITUTE OF TECHNOLOGY SCHOOL OF INDUSTRIAL AND SYSTEMS ENGINEERING LECTURE NOTES INTERIOR POINT POLYNOMIAL TIME METHODS IN CONVEX PROGRAMMING ISYE 8813 Arkadi Nemirovski On sabbatical leave from
More information3. INNER PRODUCT SPACES
. INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.
More informationMUSTHAVE MATH TOOLS FOR GRADUATE STUDY IN ECONOMICS
MUSTHAVE MATH TOOLS FOR GRADUATE STUDY IN ECONOMICS William Neilson Department of Economics University of Tennessee Knoxville September 29 289 by William Neilson web.utk.edu/~wneilson/mathbook.pdf Acknowledgments
More informationMichael Ulbrich. Nonsmooth Newtonlike Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces
Michael Ulbrich Nonsmooth Newtonlike Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces Technische Universität München Fakultät für Mathematik June 21, revised
More informationNOTES ON GROUP THEORY
NOTES ON GROUP THEORY Abstract. These are the notes prepared for the course MTH 751 to be offered to the PhD students at IIT Kanpur. Contents 1. Binary Structure 2 2. Group Structure 5 3. Group Actions
More informationCalderón problem. Mikko Salo
Calderón problem Lecture notes, Spring 2008 Mikko Salo Department of Mathematics and Statistics University of Helsinki Contents Chapter 1. Introduction 1 Chapter 2. Multiple Fourier series 5 2.1. Fourier
More informationAn existence result for a nonconvex variational problem via regularity
An existence result for a nonconvex variational problem via regularity Irene Fonseca, Nicola Fusco, Paolo Marcellini Abstract Local Lipschitz continuity of minimizers of certain integrals of the Calculus
More informationSOLUTIONS. f x = 6x 2 6xy 24x, f y = 3x 2 6y. To find the critical points, we solve
SOLUTIONS Problem. Find the critical points of the function f(x, y = 2x 3 3x 2 y 2x 2 3y 2 and determine their type i.e. local min/local max/saddle point. Are there any global min/max? Partial derivatives
More informationLectures on Geometric Group Theory. Cornelia Drutu and Michael Kapovich
Lectures on Geometric Group Theory Cornelia Drutu and Michael Kapovich Preface The main goal of this book is to describe several tools of the quasiisometric rigidity and to illustrate them by presenting
More informationAlgebraic Number Theory
Algebraic Number Theory 1. Algebraic prerequisites 1.1. General 1.1.1. Definition. For a field F define the ring homomorphism Z F by n n 1 F. Its kernel I is an ideal of Z such that Z/I is isomorphic to
More informationIEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 2013. ACCEPTED FOR PUBLICATION 1
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 2013. ACCEPTED FOR PUBLICATION 1 ActiveSet Newton Algorithm for Overcomplete NonNegative Representations of Audio Tuomas Virtanen, Member,
More informationA Modern Course on Curves and Surfaces. Richard S. Palais
A Modern Course on Curves and Surfaces Richard S. Palais Contents Lecture 1. Introduction 1 Lecture 2. What is Geometry 4 Lecture 3. Geometry of InnerProduct Spaces 7 Lecture 4. Linear Maps and the Euclidean
More informationRegression. Chapter 2. 2.1 Weightspace View
Chapter Regression Supervised learning can be divided into regression and classification problems. Whereas the outputs for classification are discrete class labels, regression is concerned with the prediction
More informationSome Research Problems in Uncertainty Theory
Journal of Uncertain Systems Vol.3, No.1, pp.310, 2009 Online at: www.jus.org.uk Some Research Problems in Uncertainty Theory aoding Liu Uncertainty Theory Laboratory, Department of Mathematical Sciences
More information8430 HANDOUT 3: ELEMENTARY THEORY OF QUADRATIC FORMS
8430 HANDOUT 3: ELEMENTARY THEORY OF QUADRATIC FORMS PETE L. CLARK 1. Basic definitions An integral binary quadratic form is just a polynomial f = ax 2 + bxy + cy 2 with a, b, c Z. We define the discriminant
More informationTWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS
TWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS XIAOLONG LONG, KNUT SOLNA, AND JACK XIN Abstract. We study the problem of constructing sparse and fast mean reverting portfolios.
More informationU.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra
U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009 Notes on Algebra These notes contain as little theory as possible, and most results are stated without proof. Any introductory
More informationLinear Control Systems
Chapter 3 Linear Control Systems Topics : 1. Controllability 2. Observability 3. Linear Feedback 4. Realization Theory Copyright c Claudiu C. Remsing, 26. All rights reserved. 7 C.C. Remsing 71 Intuitively,
More informationThe Matrix Cookbook. [ http://matrixcookbook.com ] Kaare Brandt Petersen Michael Syskind Pedersen. Version: November 15, 2012
The Matrix Cookbook http://matrixcookbook.com ] Kaare Brandt Petersen Michael Syskind Pedersen Version: November 15, 01 1 Introduction What is this? These pages are a collection of facts (identities, approximations,
More informationPower Control in Wireless Cellular Networks
Power Control in Wireless Cellular Networks Power Control in Wireless Cellular Networks Mung Chiang Electrical Engineering Department, Princeton University Prashanth Hande Electrical Engineering Department,
More information