3. Convex functions. basic properties and examples. operations that preserve convexity. the conjugate function. quasiconvex functions

Transcription

1 3. Convex functions Convex Optimization Boyd & Vandenberghe basic properties and examples operations that preserve convexity the conjugate function quasiconvex functions log-concave and log-convex functions convexity with respect to generalized inequalities 3 1

2 Definition f : R n R is convex if domf is a convex set and f(θx+(1 θ)y) θf(x)+(1 θ)f(y) for all x,y domf, 0 θ 1 (x,f(x)) (y,f(y)) f is concave if f is convex f is strictly convex if domf is convex and f(θx+(1 θ)y) < θf(x)+(1 θ)f(y) for x,y domf, x y, 0 < θ < 1 Convex functions 3 2

3 Examples on R convex: affine: ax+b on R, for any a,b R exponential: e ax, for any a R powers: x α on R ++, for α 1 or α 0 powers of absolute value: x p on R, for p 1 negative entropy: xlogx on R ++ concave: affine: ax+b on R, for any a,b R powers: x α on R ++, for 0 α 1 logarithm: logx on R ++ Convex functions 3 3

4 Examples on R n and R m n affine functions are convex and concave; all norms are convex examples on R n affine function f(x) = a T x+b norms: x p = ( n i=1 x i p ) 1/p for p 1; x = max k x k examples on R m n (m n matrices) affine function f(x) = tr(a T X)+b = m i=1 n A ij X ij +b j=1 spectral (maximum singular value) norm f(x) = X 2 = σ max (X) = (λ max (X T X)) 1/2 Convex functions 3 4

5 Restriction of a convex function to a line f : R n R is convex if and only if the function g : R R, g(t) = f(x+tv), domg = {t x+tv domf} is convex (in t) for any x domf, v R n can check convexity of f by checking convexity of functions of one variable example. f : S n R with f(x) = logdetx, domf = S n ++ g(t) = logdet(x +tv) = logdetx +logdet(i +tx 1/2 VX 1/2 ) n = logdetx + log(1+tλ i ) where λ i are the eigenvalues of X 1/2 VX 1/2 g is concave in t (for any choice of X 0, V); hence f is concave i=1 Convex functions 3 5

6 Extended-value extension extended-value extension f of f is f(x) = f(x), x domf, f(x) =, x domf often simplifies notation; for example, the condition 0 θ 1 = f(θx+(1 θ)y) θ f(x)+(1 θ) f(y) (as an inequality in R { }), means the same as the two conditions domf is convex for x,y domf, 0 θ 1 = f(θx+(1 θ)y) θf(x)+(1 θ)f(y) Convex functions 3 6

7 First-order condition f is differentiable if domf is open and the gradient f(x) = exists at each x domf ( f(x), f(x),..., f(x) ) x 1 x 2 x n 1st-order condition: differentiable f with convex domain is convex iff f(y) f(x)+ f(x) T (y x) for all x,y domf f(y) f(x) + f(x) T (y x) (x,f(x)) first-order approximation of f is global underestimator Convex functions 3 7

8 Second-order conditions f is twice differentiable if domf is open and the Hessian 2 f(x) S n, exists at each x domf 2 f(x) ij = 2 f(x) x i x j, i,j = 1,...,n, 2nd-order conditions: for twice differentiable f with convex domain f is convex if and only if 2 f(x) 0 for all x domf if 2 f(x) 0 for all x domf, then f is strictly convex Convex functions 3 8

9 Examples quadratic function: f(x) = (1/2)x T Px+q T x+r (with P S n ) f(x) = Px+q, 2 f(x) = P convex if P 0 least-squares objective: f(x) = Ax b 2 2 f(x) = 2A T (Ax b), 2 f(x) = 2A T A convex (for any A) quadratic-over-linear: f(x,y) = x 2 /y 2 2 f(x,y) = 2 y 3 [ y x ][ y x ] T 0 f(x,y) convex for y > 0 1 y 0 2 x 0 Convex functions 3 9

10 log-sum-exp: f(x) = log n k=1 expx k is convex 2 f(x) = 1 1 T z diag(z) 1 (1 T z) 2zzT (z k = expx k ) to show 2 f(x) 0, we must verify that v T 2 f(x)v 0 for all v: v T 2 f(x)v = ( k z kv 2 k )( k z k) ( k v kz k ) 2 ( k z k) 2 0 since ( k v kz k ) 2 ( k z kv 2 k )( k z k) (from Cauchy-Schwarz inequality) geometric mean: f(x) = ( n k=1 x k) 1/n on R n ++ is concave (similar proof as for log-sum-exp) Convex functions 3 10

11 Epigraph and sublevel set α-sublevel set of f : R n R: C α = {x domf f(x) α} sublevel sets of convex functions are convex (converse is false) epigraph of f : R n R: epif = {(x,t) R n+1 x domf, f(x) t} epif f f is convex if and only if epif is a convex set Convex functions 3 11

12 Jensen s inequality basic inequality: if f is convex, then for 0 θ 1, f(θx+(1 θ)y) θf(x)+(1 θ)f(y) extension: if f is convex, then f(ez) Ef(z) for any random variable z basic inequality is special case with discrete distribution prob(z = x) = θ, prob(z = y) = 1 θ Convex functions 3 12

13 Operations that preserve convexity practical methods for establishing convexity of a function 1. verify definition (often simplified by restricting to a line) 2. for twice differentiable functions, show 2 f(x) 0 3. show that f is obtained from simple convex functions by operations that preserve convexity nonnegative weighted sum composition with affine function pointwise maximum and supremum composition minimization perspective Convex functions 3 13

14 Positive weighted sum & composition with affine function nonnegative multiple: αf is convex if f is convex, α 0 sum: f 1 +f 2 convex if f 1,f 2 convex (extends to infinite sums, integrals) composition with affine function: f(ax+b) is convex if f is convex examples log barrier for linear inequalities f(x) = m log(b i a T i x), i=1 domf = {x a T i x < b i,i = 1,...,m} (any) norm of affine function: f(x) = Ax+b Convex functions 3 14

15 Pointwise maximum if f 1,..., f m are convex, then f(x) = max{f 1 (x),...,f m (x)} is convex examples piecewise-linear function: f(x) = max i=1,...,m (a T i x+b i) is convex sum of r largest components of x R n : f(x) = x [1] +x [2] + +x [r] is convex (x [i] is ith largest component of x) proof: f(x) = max{x i1 +x i2 + +x ir 1 i 1 < i 2 < < i r n} Convex functions 3 15

16 Pointwise supremum if f(x,y) is convex in x for each y A, then is convex examples g(x) = supf(x,y) y A support function of a set C: S C (x) = sup y C y T x is convex distance to farthest point in a set C: f(x) = sup x y y C maximum eigenvalue of symmetric matrix: for X S n, λ max (X) = sup y T Xy y 2 =1 Convex functions 3 16

17 Composition with scalar functions composition of g : R n R and h : R R: f(x) = h(g(x)) f is convex if g convex, h convex, h nondecreasing g concave, h convex, h nonincreasing proof (for n = 1, differentiable g,h) f (x) = h (g(x))g (x) 2 +h (g(x))g (x) note: monotonicity must hold for extended-value extension h examples expg(x) is convex if g is convex 1/g(x) is convex if g is concave and positive Convex functions 3 17

18 Vector composition composition of g : R n R k and h : R k R: f(x) = h(g(x)) = h(g 1 (x),g 2 (x),...,g k (x)) f is convex if g i convex, h convex, h nondecreasing in each argument g i concave, h convex, h nonincreasing in each argument proof (for n = 1, differentiable g,h) f (x) = g (x) T 2 h(g(x))g (x)+ h(g(x)) T g (x) examples m i=1 logg i(x) is concave if g i are concave and positive log m i=1 expg i(x) is convex if g i are convex Convex functions 3 18

19 Minimization if f(x,y) is convex in (x,y) and C is a convex set, then is convex examples g(x) = inf y C f(x,y) f(x,y) = x T Ax+2x T By +y T Cy with [ A B B T C ] 0, C 0 minimizing over y gives g(x) = inf y f(x,y) = x T (A BC 1 B T )x g is convex, hence Schur complement A BC 1 B T 0 distance to a set: dist(x,s) = inf y S x y is convex if S is convex Convex functions 3 19

20 Perspective the perspective of a function f : R n R is the function g : R n R R, g(x,t) = tf(x/t), domg = {(x,t) x/t domf, t > 0} g is convex if f is convex examples f(x) = x T x is convex; hence g(x,t) = x T x/t is convex for t > 0 negative logarithm f(x) = logx is convex; hence relative entropy g(x,t) = tlogt tlogx is convex on R 2 ++ if f is convex, then g(x) = (c T x+d)f ( (Ax+b)/(c T x+d) ) is convex on {x c T x+d > 0, (Ax+b)/(c T x+d) domf} Convex functions 3 20

21 the conjugate of a function f is The conjugate function f (y) = sup (y T x f(x)) x dom f f(x) xy x (0, f (y)) f is convex (even if f is not) will be useful in chapter 5 Convex functions 3 21

22 examples negative logarithm f(x) = logx f (y) = sup(xy +logx) = x>0 { 1 log( y) y < 0 otherwise strictly convex quadratic f(x) = (1/2)x T Qx with Q S n ++ f (y) = sup(y T x (1/2)x T Qx) x = 1 2 yt Q 1 y Convex functions 3 22

23 Quasiconvex functions f : R n R is quasiconvex if domf is convex and the sublevel sets are convex for all α S α = {x domf f(x) α} β α a b c f is quasiconcave if f is quasiconvex f is quasilinear if it is quasiconvex and quasiconcave Convex functions 3 23

24 x is quasiconvex on R Examples ceil(x) = inf{z Z z x} is quasilinear logx is quasilinear on R ++ f(x 1,x 2 ) = x 1 x 2 is quasiconcave on R 2 ++ linear-fractional function is quasilinear distance ratio f(x) = at x+b c T x+d, domf = {x ct x+d > 0} f(x) = x a 2 x b 2, domf = {x x a 2 x b 2 } is quasiconvex Convex functions 3 24

25 internal rate of return cash flow x = (x 0,...,x n ); x i is payment in period i (to us if x i > 0) we assume x 0 < 0 and x 0 +x 1 + +x n > 0 present value of cash flow x, for interest rate r: PV(x,r) = n (1+r) i x i i=0 internal rate of return is smallest interest rate for which PV(x,r) = 0: IRR(x) = inf{r 0 PV(x,r) = 0} IRR is quasiconcave: superlevel set is intersection of open halfspaces IRR(x) R n (1+r) i x i > 0 for 0 r < R i=0 Convex functions 3 25

26 Properties modified Jensen inequality: for quasiconvex f 0 θ 1 = f(θx+(1 θ)y) max{f(x),f(y)} first-order condition: differentiable f with cvx domain is quasiconvex iff f(y) f(x) = f(x) T (y x) 0 x f(x) sums of quasiconvex functions are not necessarily quasiconvex Convex functions 3 26

27 Log-concave and log-convex functions a positive function f is log-concave if logf is concave: f(θx+(1 θ)y) f(x) θ f(y) 1 θ for 0 θ 1 f is log-convex if logf is convex powers: x a on R ++ is log-convex for a 0, log-concave for a 0 many common probability densities are log-concave, e.g., normal: f(x) = 1 (2π)n detσ e 1 2 (x x)t Σ 1 (x x) cumulative Gaussian distribution function Φ is log-concave Φ(x) = 1 2π x e u2 /2 du Convex functions 3 27

28 Properties of log-concave functions twice differentiable f with convex domain is log-concave if and only if f(x) 2 f(x) f(x) f(x) T for all x domf product of log-concave functions is log-concave sum of log-concave functions is not always log-concave integration: if f : R n R m R is log-concave, then g(x) = f(x,y) dy is log-concave (not easy to show) Convex functions 3 28

29 consequences of integration property convolution f g of log-concave functions f, g is log-concave (f g)(x) = f(x y)g(y)dy if C R n convex and y is a random variable with log-concave pdf then is log-concave f(x) = prob(x+y C) proof: write f(x) as integral of product of log-concave functions p is pdf of y f(x) = g(x+y)p(y)dy, g(u) = { 1 u C 0 u C, Convex functions 3 29

30 example: yield function Y(x) = prob(x+w S) x R n : nominal parameter values for product w R n : random variations of parameters in manufactured product S: set of acceptable values if S is convex and w has a log-concave pdf, then Y is log-concave yield regions {x Y(x) α} are convex Convex functions 3 30

31 Convexity with respect to generalized inequalities f : R n R m is K-convex if domf is convex and for x, y domf, 0 θ 1 f(θx+(1 θ)y) K θf(x)+(1 θ)f(y) example f : S m S m, f(x) = X 2 is S m +-convex proof: for fixed z R m, z T X 2 z = Xz 2 2 is convex in X, i.e., z T (θx +(1 θ)y) 2 z θz T X 2 z +(1 θ)z T Y 2 z for X,Y S m, 0 θ 1 therefore (θx +(1 θ)y) 2 θx 2 +(1 θ)y 2 Convex functions 3 31