A note on the convex infimum convolution inequality




Naomi Feldheim, Arnaud Marsiglietti, Piotr Nayar, Jing Wang

arXiv:1505.00240v1 [math.PR] 1 May 2015

Abstract. We characterize the symmetric measures which satisfy the one-dimensional convex infimum convolution inequality of Maurey. For these measures the tensorization argument yields the two-level Talagrand concentration inequalities for their products and convex sets in $\mathbb{R}^n$.

2010 Mathematics Subject Classification. Primary 26D10; Secondary 28A35, 52A20, 60E15.

Key words and phrases. Infimum convolution, concentration of measure, convex sets, product measures, Poincaré inequality.

1 Introduction

In the past few decades a lot of attention has been devoted to the study of the concentration of measure phenomenon, especially the concentration properties of product measures on the Euclidean space $\mathbb{R}^n$. Throughout this note, by $\|x\|_p$ we denote the $\ell_p$ norm on $\mathbb{R}^n$, namely $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$, and we take $B_p^n = \{x \in \mathbb{R}^n : \|x\|_p \le 1\}$. We say that a Borel probability measure $\mu$ on $\mathbb{R}^n$ satisfies concentration with a profile $\alpha_\mu(t)$ if for any set $A \subset \mathbb{R}^n$ with $\mu(A) \ge 1/2$ we have
$$1 - \mu(A + tB_2^n) \le \alpha_\mu(t), \qquad t \ge 0,$$
where $B_2^n$ is the Euclidean ball of radius 1, centred at the origin. Equivalently, for any 1-Lipschitz function $f$ on $\mathbb{R}^n$ we have
$$\mu(\{x : |f - \mathrm{Med}_\mu f| > t\}) \le \alpha_\mu(t), \qquad t > 0,$$
where $\mathrm{Med}_\mu f$ is a median of $f$. Moreover, the median can be replaced with the mean, see [L, Proposition 1.8]. The usual way to reach concentration is via certain functional inequalities. For example, we say that $\mu$ satisfies the Poincaré inequality (sometimes also called the spectral gap inequality) with constant $C$ if for any $f : \mathbb{R}^n \to \mathbb{R}$ which is, say, $C^1$ smooth, we have
$$\mathrm{Var}_\mu(f) \le C \int |\nabla f|^2 \, \mathrm{d}\mu. \tag{1}$$
We assume that $C$ is the best constant possible. This inequality implies exponential concentration, namely, concentration with a profile $\alpha_\mu(t) = 2\exp(-t/(2\sqrt{C}))$.
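For illustration (our addition, not part of the paper), inequality (1) is easy to test numerically. The sketch below checks it by quadrature for the standard Gaussian measure on $\mathbb{R}$, whose classical Poincaré constant is $C = 1$, with the test function $f(x) = \sin x$:

```python
import numpy as np

# Check Var_mu(f) <= C * \int |f'|^2 dmu for the standard Gaussian (C = 1)
# with the test function f(x) = sin(x), by quadrature on a wide grid.
xs = np.linspace(-10, 10, 200001)
dx = xs[1] - xs[0]
density = np.exp(-xs**2 / 2) / np.sqrt(2 * np.pi)

f, df = np.sin(xs), np.cos(xs)
mean = np.sum(f * density) * dx
var = np.sum((f - mean)**2 * density) * dx      # exact value: (1 - e^{-2})/2
energy = np.sum(df**2 * density) * dx           # exact value: (1 + e^{-2})/2
assert var <= energy
```

Here the exact values follow from $\mathbb{E}\cos(tX) = e^{-t^2/2}$ for $X \sim N(0,1)$, so the inequality holds with room to spare.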
The product exponential probability measure $\nu^n$, i.e., the measure with density $2^{-n}\exp(-\|x\|_1)$, satisfies (1) with the constant 4 (the one-dimensional case goes back to [K]; for a one-line proof see Lemma 2.1 in [BL]). From this fact (case $n=1$) one can deduce, using the transportation argument, that any one-dimensional log-concave measure, i.e. a measure with density of the form $e^{-V}$, where $V$ is convex, satisfies the Poincaré inequality with a universal constant $C$. In fact, there is a so-called Muckenhoupt condition that fully characterizes measures satisfying the one-dimensional Poincaré inequality, see [Mu] and [K]. Namely, let us assume, for simplicity, that our measure is symmetric. If the density of the absolutely continuous part of $\mu$ is equal to $p$, then the optimal constant in (1) satisfies $B \le C \le 4B$, where
$$B = \sup_{x>0}\left\{\mu[x,\infty)\int_0^x \frac{1}{p(y)}\,\mathrm{d}y\right\}.$$
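As a quick illustration (ours, not the authors'), the Muckenhoupt quantity $B$ can be evaluated for the symmetric exponential measure, with density $p(y) = e^{-|y|}/2$: here $\mu[x,\infty)\int_0^x p(y)^{-1}\,\mathrm{d}y = \frac{1}{2}e^{-x}\cdot 2(e^x-1) = 1-e^{-x}$, so $B = 1$, consistent (via $C \le 4B$) with the constant 4 quoted above. A numerical sketch:

```python
import numpy as np

# Symmetric exponential measure: density p(y) = exp(-|y|)/2,
# tail mu[x, inf) = exp(-x)/2 for x >= 0.
xs = np.linspace(1e-3, 20, 4000)
tail = 0.5 * np.exp(-xs)

# Inner integral: \int_0^x 1/p(y) dy = \int_0^x 2 e^y dy = 2(e^x - 1).
inner = 2.0 * (np.exp(xs) - 1.0)

# Sup over the grid of mu[x, inf) * \int_0^x dy/p(y); in closed form 1 - e^{-x}.
B = np.max(tail * inner)
```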

In particular, the constant $C$ is finite if $B < \infty$. Suppose that $\mu_1,\ldots,\mu_n$ satisfy the Poincaré inequality with constant $C$. Then the same can be said about the product measure $\mu = \mu_1 \otimes \cdots \otimes \mu_n$, due to the tensorization property of (1) (see Corollary 5.7 in [L]), which follows immediately from the subadditivity of the variance. Thus, one can say that the Poincaré inequality is fully understood in the case of product measures. A much stronger functional inequality that gives concentration is the so-called log-Sobolev inequality,
$$\mathrm{Ent}_\mu(f^2) \le 2C \int |\nabla f|^2 \, \mathrm{d}\mu, \tag{2}$$
where for a function $g \ge 0$ we set $\mathrm{Ent}_\mu(g) = \int g\ln g \,\mathrm{d}\mu - \left(\int g\,\mathrm{d}\mu\right)\ln\left(\int g\,\mathrm{d}\mu\right)$. It implies the Gaussian concentration phenomenon, i.e., concentration with a profile $\alpha_\mu(t) = \exp(-t^2/16C)$. As an example, the standard Gaussian measure $\gamma_n$, i.e., the measure with density $(2\pi)^{-n/2}e^{-\|x\|_2^2/2}$, satisfies (2) with the constant 1 (see [G]). As in (1), the log-Sobolev inequality possesses similar tensorization properties and there is a full description of measures $\mu$ satisfying (2) on the real line, see [BG2]. Namely, the optimal constant $C$ in (2) satisfies $\frac{1}{75}B \le C \le 936B$, where
$$B = \sup_{x>0}\left\{\mu[x,\infty)\,\ln\left(\frac{1}{\mu[x,\infty)}\right)\int_0^x \frac{1}{p(y)}\,\mathrm{d}y\right\}.$$
Another way to concentration leads through the so-called property $(\tau)$ (see [M] and [LW]). A measure $\mu$ on $\mathbb{R}^n$ is said to satisfy property $(\tau)$ with a nonnegative cost function $\varphi$ if the inequality
$$\left(\int e^{f\square\varphi}\,\mathrm{d}\mu\right)\left(\int e^{-f}\,\mathrm{d}\mu\right) \le 1 \tag{3}$$
holds for every bounded measurable function $f$ on $\mathbb{R}^n$. Here $(f\square\varphi)(x) = \inf_y\{f(y)+\varphi(x-y)\}$ is the so-called infimum convolution. Property $(\tau)$ implies concentration, see Lemma 4 in [M]. Namely, for every measurable set $A$ we have
$$\mu\left(\{x \notin A + \{\varphi < t\}\}\right) \le \frac{e^{-t}}{\mu(A)}.$$
In the seminal paper [M] Maurey showed that the exponential measure $\nu$ satisfies the infimum convolution inequality with the cost function $\varphi(t) = \min\left\{\frac{1}{36}t^2, \frac{2}{9}(|t|-2)\right\}$. As a consequence of tensorization (see Lemma 1 in [M]) we get that $\nu^n$ satisfies (3) with the cost function $\varphi(x) = \sum_{i=1}^n \varphi(x_i)$.
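For intuition (an illustrative sketch of ours, not from the paper), the infimum convolution appearing in (3) is easy to approximate on a grid. With $f(x) = |x|$ and the simple quadratic cost $\varphi(t) = t^2/2$ (both arbitrary choices) the result is the classical Huber function, equal to $x^2/2$ for $|x| \le 1$ and $|x| - 1/2$ for $|x| > 1$:

```python
import numpy as np

# Grid approximation of (f box phi)(x) = inf_y { f(y) + phi(x - y) }
# for f(x) = |x| and phi(t) = t^2/2 (illustrative choices).
ys = np.linspace(-10.0, 10.0, 2001)          # step 0.01
f = np.abs(ys)

inf_conv = np.array([np.min(f + 0.5 * (x - ys) ** 2) for x in ys])

# phi(0) = 0, so taking y = x shows (f box phi) <= f everywhere.
assert np.all(inf_conv <= f + 1e-12)

# The exact value is the Huber function: x^2/2 if |x| <= 1, |x| - 1/2 otherwise.
i = int(np.argmin(np.abs(ys - 5.0)))
assert abs(inf_conv[i] - 4.5) < 1e-3
```

The check at $x = 5$ uses the exact minimizer $y = 4$, which lies on the grid.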
This leads to the so-called Talagrand two-level concentration,
$$\nu^n\left(A + 6\sqrt{t}\,B_2^n + 9t\,B_1^n\right) \ge 1 - \frac{e^{-t}}{\nu^n(A)},$$
see Corollary 1 in [M]. As we can see, this gives a stronger conclusion than the Poincaré inequality. It is a well known fact that the property $(\tau)$ with a cost function which is quadratic around 0, say $\varphi(x) = C|x|^2$ for $|x| \le 1$, implies the Poincaré inequality with the constant $1/(4C)$, see Corollary 3 in [M]. We shall sketch the argument in Section 2. The goal of this article is to investigate concentration properties of a wider class of measures, i.e., measures that may not even satisfy the Poincaré inequality. For example, in the case of purely atomic measures one can easily construct a non-constant function $f$ with $\int |f'|^2\,\mathrm{d}\mu = 0$. However, one can still hope to get concentration if one restricts oneself to the class of convex sets. It turns out that to reach exponential concentration for convex sets, it suffices to prove that the measure $\mu$ satisfies the convex Poincaré inequality. Below we state the formal definition.

Definition 1. We say that a Borel probability measure $\mu$ on $\mathbb{R}$ satisfies the convex Poincaré inequality with a constant $C_p$ if for every convex function $f : \mathbb{R} \to \mathbb{R}$ with $f'$ bounded we have
$$\mathrm{Var}_\mu f \le C_p \int (f')^2\,\mathrm{d}\mu. \tag{4}$$
Here we adopt the standard convention that for a locally Lipschitz function $f : \mathbb{R} \to \mathbb{R}$ the gradient $|\nabla f|$ is defined by
$$|\nabla f|(x) = \limsup_{h \to 0} \frac{|f(x+h)-f(x)|}{|h|}. \tag{5}$$
This definition applies in particular to convex $f$. If $f$ is differentiable, (5) agrees with the usual derivative. We also need the following definition.

Definition 2. Let $h > 0$ and $\lambda \in [0,1)$. Let $M(h,\lambda)$ be the class of symmetric measures on $\mathbb{R}$ satisfying the condition
$$\mu[x+h,\infty) \le \lambda\,\mu[x,\infty) \qquad \text{for } x \ge 0.$$
Moreover, let $M^+(h,\lambda)$ be the class of measures supported on $\mathbb{R}_+$ satisfying the same condition.

The inequality (4) has been investigated by Bobkov and Götze, see [BG, Theorem 4.2]. In particular, the authors proved (4) in the class $M(h,\lambda)$ with a constant $C_p$ depending only on $h$ and $\lambda$. This leads to the exponential concentration for 1-Lipschitz convex functions $f$ (as well as the exponential concentration for convex sets) via the standard Herbst-like argument, which actually goes back to [AS] (see also, e.g., Theorem 3.3 in [L]), by deriving a bound on the Laplace transform of $f$. It is worth mentioning that a much stronger condition $\mu[x + C/x, \infty) \le \lambda\mu[x,\infty)$, $x \ge m$, where $m$ is a fixed positive number, has been considered. It implies log-Sobolev inequalities for log-convex functions and the Gaussian concentration of measure for convex sets. We refer to the nice study [A] for the details.

In [M] the author showed that every measure supported on a subset of $\mathbb{R}$ with diameter 1 satisfies the convex property $(\tau)$, i.e. the inequality (3) for convex functions $f$, with the cost function $\frac{1}{4}|x|^2$ (see Theorem 3 therein). The aim of this note is to extend both this result and the convex Poincaré inequality of Bobkov and Götze. We introduce the following definition.

Definition 3. Define
$$\varphi_0(x) = \begin{cases} \frac{1}{2}x^2 & |x| \le 1 \\ |x| - \frac{1}{2} & |x| > 1. \end{cases}$$
We say that a Borel probability measure $\mu$ on $\mathbb{R}$ satisfies the convex exponential property $(\tau)$ with constant $C_\tau$ if for any convex function $f : \mathbb{R} \to \mathbb{R}$ with $\inf f > -\infty$ we have
$$\left(\int e^{f\square\varphi}\,\mathrm{d}\mu\right)\left(\int e^{-f}\,\mathrm{d}\mu\right) \le 1 \tag{6}$$
with $\varphi(x) = \varphi_0(x/C_\tau)$.

We are ready to state our main result.

Theorem 1. The following conditions are equivalent:

(a) $\mu \in M(h,\lambda)$,

(b) There exists $C_p > 0$ such that $\mu$ satisfies the convex Poincaré inequality with constant $C_p$,

(c) There exists $C_\tau > 0$ such that $\mu$ satisfies the convex exponential property $(\tau)$ with constant $C_\tau$.

Moreover, (a) implies (c) with the constant $C_\tau = 17h/(1-\lambda)^2$, (c) implies (b) with the constant $C_p = \frac{1}{2}C_\tau^2$, and (b) implies (a) with $h = \sqrt{8C_p}$ and $\lambda = 1/2$.

This generalizes Maurey's theorem due to the fact that any symmetric measure supported on the interval $[-1,1]$ belongs to $M(1,0)$. It is well known that the convex property $(\tau)$ tensorizes, namely, if $\mu_1,\ldots,\mu_n$ have the convex property $(\tau)$ with cost functions $\varphi_1,\ldots,\varphi_n$, then the product measure $\mu = \mu_1 \otimes \cdots \otimes \mu_n$ has the convex property $(\tau)$ with $\varphi(x) = \sum_{i=1}^n \varphi_i(x_i)$, see [M, Lemma 5]. Therefore Theorem 1 implies the following corollary.

Corollary 1. Let $\mu_1,\ldots,\mu_n \in M(h,\lambda)$ and let us take $\mu = \mu_1 \otimes \cdots \otimes \mu_n$. Define the cost function $\varphi(x) = \sum_{i=1}^n \varphi_0(x_i/C_\tau)$, where $C_\tau = 17h/(1-\lambda)^2$. Then for any coordinate-wise convex function $f$ (in particular, for any convex function) we have
$$\left(\int e^{f\square\varphi}\,\mathrm{d}\mu\right)\left(\int e^{-f}\,\mathrm{d}\mu\right) \le 1. \tag{7}$$

As a consequence, one can deduce the two-level concentration for convex sets and convex functions in $\mathbb{R}^n$.

Corollary 2. Let $\mu_1,\ldots,\mu_n \in M(h,\lambda)$. Let $C_\tau = 17h/(1-\lambda)^2$. Take $\mu = \mu_1 \otimes \cdots \otimes \mu_n$. Then for any convex set $A$ with $\mu(A) > 0$ we have
$$\mu\left(A + \sqrt{2t}\,C_\tau B_2^n + 2tC_\tau B_1^n\right) \ge 1 - \frac{e^{-t}}{\mu(A)}.$$

Corollary 3. Let $\mu_1,\ldots,\mu_n \in M(h,\lambda)$. Let $C_\tau = 17h/(1-\lambda)^2$. Take $\mu = \mu_1 \otimes \cdots \otimes \mu_n$. Then for any convex function $f : \mathbb{R}^n \to \mathbb{R}$ with
$$|f(x)-f(y)| \le a\|x-y\|_2, \qquad |f(x)-f(y)| \le b\|x-y\|_1, \qquad x,y \in \mathbb{R}^n,$$
we have
$$\mu\left(\{f > \mathrm{Med}_\mu f + C_\tau t\}\right) \le 2\exp\left(-\tfrac{1}{8}\min\left\{\tfrac{t}{b}, \tfrac{t^2}{a^2}\right\}\right), \qquad t \ge 0, \tag{8}$$
and
$$\mu\left(\{f < \mathrm{Med}_\mu f - C_\tau t\}\right) \le 2\exp\left(-\tfrac{1}{8}\min\left\{\tfrac{t}{b}, \tfrac{t^2}{a^2}\right\}\right), \qquad t \ge 0. \tag{9}$$

Let us mention that very recently Adamczak and Strzelecki established related results in the context of modified log-Sobolev inequalities, see [ASt]. For simplicity we state their result in the case of symmetric measures. For $\lambda \in [0,1)$, $\beta \in [0,1]$ and $h,m > 0$ the authors defined the class of measures $M^\beta_{AS}(h,\lambda,m)$ satisfying the condition $\mu([x+h/x^\beta, \infty)) \le \lambda\mu([x,\infty))$ for $x \ge m$. Note that $M^0_{AS}(h,\lambda,0) = M(h,\lambda)$. They proved that products $\mu = \mu_1 \otimes \cdots \otimes \mu_n$ of such measures satisfy the inequality
$$\mathrm{Ent}_\mu(e^f) \le C_{AS}\int e^f\left(|\nabla f|_2^2 + |\nabla f|_{\frac{\beta+1}{\beta}}^{\frac{\beta+1}{\beta}}\right)\mathrm{d}\mu \tag{10}$$
for any smooth convex function $f : \mathbb{R}^n \to \mathbb{R}$. As a consequence, for any convex set $A$ in $\mathbb{R}^n$ with $\mu(A) \ge 1/2$ we have
$$1 - \mu\left(A + 2\sqrt{t}\,B_2^n + t^{\frac{1}{1+\beta}} C'_{AS} B_{1+\beta}^n\right) \le e^{-C'_{AS}t}, \qquad t \ge 0.$$
Here the constants $C_{AS}, C'_{AS}$ depend only on the parameters $\beta, m, h$ and $\lambda$. They also established an inequality similar to (8), namely for a convex function $f$ with
$$|f(x)-f(y)| \le a\|x-y\|_2, \qquad |f(x)-f(y)| \le b\|x-y\|_{1+\beta}, \qquad x,y \in \mathbb{R}^n,$$
one gets
$$\mu\left(\{f > \mathrm{Med}_\mu f + 2t\}\right) \le 2\exp\left(-\tfrac{1}{36}\min\left\{\tfrac{t^{1+\beta}}{b^{1+\beta}C_{AS}^\beta}, \tfrac{t^2}{a^2 C_{AS}}\right\}\right), \qquad t \ge 0.$$
However, the authors were not able to get (9). In fact one can show that for $\beta = 0$ our Theorem 1 is stronger than (10). In particular, the inequality (10) is equivalent to $\int e^{f\square\varphi}\,\mathrm{d}\mu \le e^{\int f\,\mathrm{d}\mu}$, see [AS], which easily follows from (7).

The rest of this article is organized as follows. In the next section we prove Theorem 1. In Section 3 we deduce Corollaries 2 and 3.

2 Proof of Theorem 1

We need the following lemma, which is essentially included in Theorem 4.2 of [BG]. For the reader's convenience we provide a straightforward proof of this fact.

Lemma 1. Let $\mu \in M^+(h,\lambda)$ and let $g : \mathbb{R} \to [0,\infty)$ be non-decreasing with $g(0) = 0$. Then
$$\int g^2\,\mathrm{d}\mu \le \left(\frac{2}{1-\lambda}\right)^2 \int (g(x)-g(x-h))^2\,\mathrm{d}\mu(x).$$

Proof. We first prove that
$$\int g(x-h)\,\mathrm{d}\mu(x) \le \lambda \int g(x)\,\mathrm{d}\mu(x)$$
for any non-decreasing $g : \mathbb{R} \to [0,\infty)$ such that $g(0) = 0$. Both sides of this inequality are linear in $g$. Therefore, it is enough to consider only functions of the form $g(x) = \mathbf{1}_{[a,\infty)}(x)$ for $a \ge 0$, since $g$ can be expressed as a mixture of these functions. For $g(x) = \mathbf{1}_{[a,\infty)}(x)$ the above inequality reduces to $\mu[a+h,\infty) \le \lambda\mu[a,\infty)$, which is clearly true due to our assumption on $\mu$. The above is equivalent to
$$(1-\lambda)\int g(x)\,\mathrm{d}\mu(x) \le \int (g(x)-g(x-h))\,\mathrm{d}\mu(x). \tag{11}$$

Now, let us use (11) with $g^2$ instead of $g$. Then,
$$(1-\lambda)\int g^2\,\mathrm{d}\mu \le \int \left(g(x)^2 - g(x-h)^2\right)\mathrm{d}\mu(x) = \int (g(x)-g(x-h))(g(x)+g(x-h))\,\mathrm{d}\mu(x)$$
$$\le \left(\int (g(x)-g(x-h))^2\,\mathrm{d}\mu(x)\right)^{1/2}\left(\int (g(x)+g(x-h))^2\,\mathrm{d}\mu(x)\right)^{1/2} \le 2\left(\int (g(x)-g(x-h))^2\,\mathrm{d}\mu(x)\right)^{1/2}\left(\int g(x)^2\,\mathrm{d}\mu(x)\right)^{1/2}.$$
We arrive at
$$(1-\lambda)\left(\int g^2\,\mathrm{d}\mu\right)^{1/2} \le 2\left(\int (g(x)-g(x-h))^2\,\mathrm{d}\mu(x)\right)^{1/2}.$$
Our assertion follows.

In the rest of this note, we take $f : \mathbb{R} \to \mathbb{R}$ to be convex. Let $x_0$ be a point where $f$ attains its minimal value. Note that this point may not be unique. However, one can check that what follows does not depend on the choice of $x_0$. Moreover, if $f$ is increasing (decreasing) we adopt the notation $x_0 = -\infty$ ($x_0 = +\infty$). Let us define a discrete version of the gradient of $f$,
$$(Df)(x) = \begin{cases} f(x)-f(x-h) & x > x_0+h \\ f(x)-f(x_0) & x \in [x_0-h, x_0+h] \\ f(x)-f(x+h) & x < x_0-h. \end{cases}$$

Lemma 2. Let $f : \mathbb{R} \to \mathbb{R}$ be a convex function with $f(0) = 0$ and let $\mu \in M(h,\lambda)$. Then
$$\int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \frac{8}{(1-\lambda)^2}\int e^{f(x)}((Df)(x))^2\,\mathrm{d}\mu(x).$$

Proof. Step 1. We first assume that $f$ is non-negative and non-decreasing. It follows that $f(x) = 0$ for $x \le 0$. Correspondingly, the function $g = e^{f/2}-e^{-f/2}$ is non-negative, non-decreasing and $g(0) = 0$. Moreover, let $\mu_+$ be the normalized restriction of $\mu$ to $\mathbb{R}_+$. Note that $\mu_+ \in M^+(h,\lambda)$. From Lemma 1 we get
$$\int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu = \frac{1}{2}\int g^2\,\mathrm{d}\mu_+ \le \frac{1}{2}\left(\frac{2}{1-\lambda}\right)^2\int (g(x)-g(x-h))^2\,\mathrm{d}\mu_+(x) \le \frac{4}{(1-\lambda)^2}\int (g(x)-g(x-h))^2\,\mathrm{d}\mu(x).$$
Observe that
$$g(x)-g(x-h) = e^{f(x)/2}-e^{-f(x)/2}-e^{f(x-h)/2}+e^{-f(x-h)/2} = \left(e^{f(x)/2}-e^{f(x-h)/2}\right)\left(1+e^{-f(x)/2-f(x-h)/2}\right)$$
$$\le 2\left(e^{f(x)/2}-e^{f(x-h)/2}\right) \le e^{f(x)/2}\left(f(x)-f(x-h)\right),$$

where the last inequality follows from the mean value theorem. We arrive at
$$\int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \left(\frac{2}{1-\lambda}\right)^2\int e^{f(x)}((Df)(x))^2\,\mathrm{d}\mu(x).$$
Step 2. Now let $f$ be non-decreasing but not necessarily non-negative. From the convexity of $f$ and the fact that $f(0) = 0$ we get $f(-x) \ge -f(x)$ for $x \ge 0$. This implies the inequality
$$e^{f(-x)}+e^{-f(-x)} \le e^{f(x)}+e^{-f(x)}, \qquad x \ge 0.$$
From the symmetry of $\mu$ one gets
$$\int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu_+.$$
Let $\tilde f = f\,\mathbf{1}_{[0,\infty)}$. From Step 1 one gets
$$\int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu_+ = 2\int \left(e^{\tilde f/2}-e^{-\tilde f/2}\right)^2\mathrm{d}\mu \le \frac{8}{(1-\lambda)^2}\int e^{\tilde f(x)}((D\tilde f)(x))^2\,\mathrm{d}\mu(x) \le \frac{8}{(1-\lambda)^2}\int e^{f(x)}((Df)(x))^2\,\mathrm{d}\mu(x).$$
Step 3. The conclusion of Step 2 is also true in the case of non-increasing functions with $f(0) = 0$. This is due to the invariance of our assertion under the symmetry $x \mapsto -x$, which is an easy consequence of the symmetry of $\mu$ and the fact that for $F(x) = f(-x)$ we have $(DF)(x) = (Df)(-x)$.

Step 4. Let us now eliminate the assumption of monotonicity of $f$. Suppose that $f$ is not monotone. Then $f$ has a (not necessarily unique) minimum attained at some point $x_0 \in \mathbb{R}$. Due to the remark of Step 3 we can assume that $x_0 \ge 0$. Since $f(0) = 0$, we have $f(x_0) < 0$. Take $y_0 = \inf\{y \in \mathbb{R} : f(y) = 0\}$. Clearly $y_0 \le x_0$. We define
$$f_1(x) = \begin{cases} f(x) & x \ge x_0 \\ f(x_0) & x < x_0, \end{cases} \qquad f_2(x) = \begin{cases} 0 & x \ge y_0 \\ f(x) & x < y_0. \end{cases}$$
Note that $f_1$ is non-decreasing and $f_2$ is non-increasing. Moreover, $f_2(0) = 0$. Therefore, from the previous steps applied for $f_1$ and $f_2$ we get
$$\left(e^{f(x_0)/2}-e^{-f(x_0)/2}\right)^2\mu((-\infty,x_0]) + \int_{x_0}^{+\infty}\left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \frac{8}{(1-\lambda)^2}\int_{x_0}^{+\infty} e^{f(x)}((Df)(x))^2\,\mathrm{d}\mu(x) \tag{12}$$
and
$$\int_{-\infty}^{y_0}\left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \frac{8}{(1-\lambda)^2}\int_{-\infty}^{y_0} e^{f(x)}((Df)(x))^2\,\mathrm{d}\mu(x). \tag{13}$$
Moreover, since $|f(x)| \le |f(x_0)|$ on $[y_0,x_0]$, we have
$$\int_{y_0}^{x_0}\left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \left(e^{f(x_0)/2}-e^{-f(x_0)/2}\right)^2\mu([y_0,x_0]) \le \left(e^{f(x_0)/2}-e^{-f(x_0)/2}\right)^2\mu((-\infty,x_0]). \tag{14}$$
Combining (12), (13) and (14), we arrive at
$$\int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \frac{8}{(1-\lambda)^2}\int e^{f(x)}((Df)(x))^2\,\mathrm{d}\mu(x).$$
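To make Lemma 2 concrete, it can be tested on a discrete member of $M(h,\lambda)$: the symmetric measure on $h\mathbb{Z}$ with weights proportional to $\lambda^{|k|}$ satisfies $\mu[x+h,\infty) = \lambda\mu[x,\infty)$ for $x \ge 0$. The sketch below (our illustration; the measure, the convex test function and the constant $8/(1-\lambda)^2$ as stated in the lemma are the only inputs) evaluates both sides by direct summation:

```python
import numpy as np

# A discrete member of M(h, lam): the symmetric measure on h*Z with weights
# proportional to lam^{|k|} has mu[x+h, inf) = lam * mu[x, inf) for x >= 0.
h, lam = 1.0, 0.5
ks = np.arange(-200, 201)
w = lam ** np.abs(ks).astype(float)
w = w / w.sum()                       # probability weights
xs = h * ks.astype(float)

# An arbitrary convex test function with f(0) = 0 and small slope.
slope = 0.3
f = slope * np.abs(xs)

# Discrete gradient Df from the paper, with the minimum of f at x_0 = 0:
# f(x) - f(0) on [-h, h], and f(x) - f(x -+ h) = slope * h outside.
Df = np.where(np.abs(xs) <= h, f, slope * h)

lhs = np.sum((np.exp(f / 2) - np.exp(-f / 2)) ** 2 * w)
rhs = 8.0 / (1.0 - lam) ** 2 * np.sum(np.exp(f) * Df ** 2 * w)
assert lhs <= rhs   # Lemma 2 holds with a wide margin in this example
```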

The following lemma provides an estimate on the infimum convolution.

Lemma 3. Let $C_1, h > 0$. Define $\varphi_1(x) = C_1\varphi_0(x/h)$. Assume that a convex function $f$ satisfies $|f'| \le 1/(C_1 h)$. Then
$$(f\square\varphi_1)(x) \le f(x) - \frac{C_1}{2}((Df)(x))^2.$$

Proof. Let us consider the case when $x \ge x_0 + h$. We take $\theta \in [0,1]$ and write $y = \theta(x-h) + (1-\theta)x$. Note that $x - y = h\theta$. By the convexity of $f$ we have
$$(f\square\varphi_1)(x) \le f(y) + \varphi_1(x-y) \le \theta f(x-h) + (1-\theta)f(x) + \varphi_1(h\theta) = \theta f(x-h) + (1-\theta)f(x) + \frac{C_1}{2}\theta^2.$$
Let us now take $\theta = C_1(f(x)-f(x-h))$. Note that $0 \le f(x)-f(x-h) \le 1/C_1$ yields $\theta \in [0,1]$. We get
$$(f\square\varphi_1)(x) \le f(x) - \theta(f(x)-f(x-h)) + \frac{C_1}{2}\theta^2 = f(x) - \frac{C_1}{2}(f(x)-f(x-h))^2.$$
The case $x \le x_0 - h$ follows by a similar computation (one has to take $y = \theta(x+h)+(1-\theta)x$). Also, in the case $x \in [x_0-h, x_0+h]$ it is enough to take $y = \theta x_0 + (1-\theta)x$ and use the fact that $|x-y| = \theta|x-x_0| \le h\theta$.

Proof of Theorem 1. We begin by showing that (a) implies (c). We do this in three steps.

Step 1. We first show that it is enough to consider only the case when $f$ satisfies $|f'| \le 1/C_\tau$. Since $\inf f > -\infty$, either there exists a point $y_0$ such that $|f'(y_0)| = 1/C_\tau$, or $f$ attains its minimum at some point $y_0$ (for example, if $f(x) = \frac{1}{2}|x|/C_\tau$, only the second case is valid). Let us fix this point $y_0$. Moreover, let us define $y_- = \sup\{y \le y_0 : f'(y) \le -1/C_\tau\}$ and $y_+ = \inf\{y \ge y_0 : f'(y) \ge 1/C_\tau\}$ (if $f' < 1/C_\tau$ then we adopt the notation $y_+ = +\infty$ and, similarly, $y_- = -\infty$ if $f' > -1/C_\tau$). Let us consider a function $g$ with $g = f$ on $[y_-, y_+]$ and with the derivative satisfying $g' = \min\{f', 1/C_\tau\}$ when $f' \ge 0$ and $g' = \max\{f', -1/C_\tau\}$ when $f' < 0$. Clearly $g \le f$ and $g(x) = f(x)$ for $y_- \le x \le y_+$. We now observe that $f\square\varphi = g\square\varphi$. To see this recall that $(f\square\varphi)(x) = \inf_y (f(y) + \varphi(x-y))$. The function $\psi_x(y) = f(y) + \varphi(x-y)$ satisfies $\psi_x'(y) = f'(y) - \varphi'(x-y) \ge f'(y) - 1/C_\tau$. Thus, for $y \ge y_+$ we have $\psi_x'(y) \ge 0$ and therefore $\psi_x$ is non-decreasing for $y \ge y_+$. Similarly, $\psi_x$ is non-increasing for $y \le y_-$. This means that $\inf_y (f(y)+\varphi(x-y))$ is attained at some point $y = y(x) \in [y_-, y_+]$. Thus,
$$(f\square\varphi)(x) = f(y(x)) + \varphi(x-y(x)) = g(y(x)) + \varphi(x-y(x)) = (g\square\varphi)(x).$$
The last equality follows from the fact that the minimum of $y \mapsto g(y)+\varphi(x-y)$ is achieved at the same point $y = y(x)$.
Therefore, since $f\square\varphi = g\square\varphi$ and $g \le f$, we have
$$\left(\int e^{f\square\varphi}\,\mathrm{d}\mu\right)\left(\int e^{-f}\,\mathrm{d}\mu\right) \le \left(\int e^{g\square\varphi}\,\mathrm{d}\mu\right)\left(\int e^{-g}\,\mathrm{d}\mu\right),$$
and $|g'| \le 1/C_\tau$.

Step 2. The inequality (6) stays invariant when we add a constant to the function $f$. Thus, we may and will assume that $f(0) = 0$. Note that from the elementary inequality $4ab \le (a+b)^2$ we have
$$4\left(\int e^{f\square\varphi}\,\mathrm{d}\mu\right)\left(\int e^{-f}\,\mathrm{d}\mu\right) \le \left(\int \left(e^{f\square\varphi}+e^{-f}\right)\mathrm{d}\mu\right)^2.$$

Thus, it is enough to show that $\int (e^{f\square\varphi}+e^{-f})\,\mathrm{d}\mu \le 2$.

Step 3. Assume $|f'| \le 1/C_\tau$. Take $C_1 = 17/(1-\lambda)^2$, $C_\tau = C_1 h$ and $\varphi(x) = \varphi_0(x/C_\tau)$. By the convexity of $\varphi_0$ we get $\varphi(x) \le C_1\varphi_0(x/h)$, since $C_1 > 1$. Thus, by Lemma 3 we get
$$(f\square\varphi)(x) \le f(x) - \frac{C_1}{2}((Df)(x))^2.$$
By the mean value theorem $|(Df)(x)| \le h/C_\tau = 1/C_1$. Therefore, $\frac{C_1}{2}((Df)(x))^2 \le \frac{C_1}{2}\left(\frac{h}{C_\tau}\right)^2 = \frac{1}{2C_1}$. Let $\alpha(C_1) = 2C_1\left(1-\exp\left(-\frac{1}{2C_1}\right)\right)$. The convexity of the exponential function yields $e^{-s} \le 1 - \alpha(C_1)s$ for $s \in [0, \frac{1}{2C_1}]$. Therefore,
$$\int \left(e^{f\square\varphi}+e^{-f}\right)\mathrm{d}\mu \le \int \left(e^{f(x)-\frac{C_1}{2}((Df)(x))^2}+e^{-f(x)}\right)\mathrm{d}\mu(x) \le \int \left(e^{f(x)}\left(1-\tfrac{C_1}{2}\alpha(C_1)((Df)(x))^2\right)+e^{-f(x)}\right)\mathrm{d}\mu(x).$$
Therefore, since $e^f+e^{-f}-2 = (e^{f/2}-e^{-f/2})^2$, we are to prove that
$$\int \left(e^{f/2}-e^{-f/2}\right)^2\mathrm{d}\mu \le \frac{C_1}{2}\alpha(C_1)\int e^{f(x)}((Df)(x))^2\,\mathrm{d}\mu(x).$$
From Lemma 2 this inequality is true whenever $\frac{C_1}{2}\alpha(C_1) \ge \frac{8}{(1-\lambda)^2}$. It suffices to observe that
$$\frac{C_1}{2}\alpha(C_1) = C_1^2\left(1-e^{-\frac{1}{2C_1}}\right) \ge \frac{C_1^2}{2C_1+1} \ge \frac{8}{(1-\lambda)^2}.$$
We now sketch the proof of the fact that (c) implies (b). Due to the standard approximation argument one can assume that $f$ is a convex $C^2$ smooth function with bounded first and second derivative (note that in the definition of the convex Poincaré inequality we assumed that $f'$ is bounded). Consider the function $f_\varepsilon = \varepsilon f$. The infimum of $\psi_x(y) = \varphi(y) + \varepsilon f(x-y)$ is attained at the point $y$ satisfying the equation $\psi_x'(y) = \varphi'(y) - \varepsilon f'(x-y) = 0$. Note that $\psi_x'$ is strictly increasing on the interval $[-C_\tau, C_\tau]$. If $\varepsilon$ is sufficiently small, it follows that the above equation has a unique solution $y_x$ and that $y_x \in [-C_\tau, C_\tau]$. Thus, $y_x = C_\tau^2\varepsilon f'(x-y_x)$. This implies $y_x = \varepsilon C_\tau^2 f'(x) + o(\varepsilon)$. We get
$$(f_\varepsilon\square\varphi)(x) = \varphi(y_x) + \varepsilon f(x-y_x) = \frac{1}{2C_\tau^2}y_x^2 + \varepsilon f(x) - \varepsilon f'(x)y_x + o(\varepsilon^2) = \frac{1}{2}\varepsilon^2 C_\tau^2 f'(x)^2 + \varepsilon f(x) - \varepsilon^2 C_\tau^2 f'(x)^2 + o(\varepsilon^2) = \varepsilon f(x) - \frac{1}{2}\varepsilon^2 C_\tau^2 f'(x)^2 + o(\varepsilon^2).$$
Therefore, from the infimum convolution inequality we get
$$\left(\int e^{\varepsilon f(x) - \frac{1}{2}\varepsilon^2 C_\tau^2 f'(x)^2 + o(\varepsilon^2)}\,\mathrm{d}\mu(x)\right)\left(\int e^{-\varepsilon f}\,\mathrm{d}\mu\right) \le 1.$$
Testing (6) with $f(x) = |x|/C_\tau$ one gets that $\int e^{f\square\varphi}\,\mathrm{d}\mu < \infty$ and therefore $\int e^{|x|/C_\tau}\,\mathrm{d}\mu < \infty$. Also, there exists a constant $c > 0$ such that $|f(x)| \le c(1+|x|)$, $x \in \mathbb{R}$. As a consequence, after some

additional technical steps, one can consider the Taylor expansion of the above quantities at $\varepsilon = 0$. This gives
$$\left(\int \left(1 + \varepsilon f - \tfrac{1}{2}\varepsilon^2 C_\tau^2 |f'|^2 + \tfrac{1}{2}\varepsilon^2 f^2 + o(\varepsilon^2)\right)\mathrm{d}\mu\right)\left(\int \left(1 - \varepsilon f + \tfrac{1}{2}\varepsilon^2 f^2 + o(\varepsilon^2)\right)\mathrm{d}\mu\right) \le 1.$$
Comparing the terms in front of $\varepsilon^2$ leads to
$$\int f^2\,\mathrm{d}\mu - \left(\int f\,\mathrm{d}\mu\right)^2 \le \frac{1}{2}C_\tau^2\int |f'|^2\,\mathrm{d}\mu.$$
This is exactly the Poincaré inequality with constant $\frac{1}{2}C_\tau^2$.

We show that (b) implies (a). Suppose that a symmetric Borel probability measure $\mu$ on $\mathbb{R}$ satisfies the convex Poincaré inequality with a constant $C_p$. Consider the function $f_u(x) = \max\{x-u, 0\}$, $u \ge 0$. We have
$$\int_{\mathbb{R}} f_u'(x)^2\,\mathrm{d}\mu(x) = \mu([u,\infty)).$$
Since $f_u(y) = 0$ for $y \le 0$ and $\mu((-\infty,0]) \ge 1/2$, one gets
$$\mathrm{Var}_\mu f_u = \frac{1}{2}\int_{\mathbb{R}}\int_{\mathbb{R}} (f_u(x)-f_u(y))^2\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) \ge \frac{1}{2}\int_{\mathbb{R}}\int_{(-\infty,0]} (f_u(x)-f_u(y))^2\,\mathrm{d}\mu(y)\,\mathrm{d}\mu(x) \ge \frac{1}{4}\int_{\mathbb{R}} f_u(x)^2\,\mathrm{d}\mu(x) \ge \frac{1}{4}\cdot 8C_p\,\mu\left(\left[u+\sqrt{8C_p},\infty\right)\right) = 2C_p\,\mu\left(\left[u+\sqrt{8C_p},\infty\right)\right).$$
These two observations, together with the Poincaré inequality, yield that $\mu \in M(\sqrt{8C_p}, 1/2)$.

3 Concentration properties

We show that the convex property $(\tau)$ implies concentration for convex sets.

Proposition 1. Suppose that a Borel probability measure $\mu$ on $\mathbb{R}^n$ satisfies the convex property $(\tau)$, restricted to the family of convex functions, with a non-negative cost function $\varphi$. Let $B_\varphi(t) = \{x \in \mathbb{R}^n : \varphi(x) \le t\}$. Then for any convex set $A$ we have
$$\mu(A + B_\varphi(t)) \ge 1 - \frac{e^{-t}}{\mu(A)}.$$

The proof of this proposition is similar to the proof of Proposition 2.4 in [LW]. We recall the argument.

Proof. Let $f = 0$ on $A$ and $f = +\infty$ outside of $A$. Note that $f$ is convex (to avoid working with functions having values $+\infty$ one can consider a family of convex functions $f_n = n\,\mathrm{dist}(A,x)$ and take $n \to \infty$). Suppose that $(f\square\varphi)(x) \le t$. Then there exists $y \in \mathbb{R}^n$ such that $f(y) + \varphi(x-y) \le t$. Thus, $y \in A$ and $x-y \in B_\varphi(t)$. Therefore $x \in A + B_\varphi(t)$. It follows that $x \notin A + B_\varphi(t)$ implies $(f\square\varphi)(x) > t$. Applying the infimum convolution inequality we get
$$e^t\left(1-\mu(A+B_\varphi(t))\right)\mu(A) \le \left(\int e^{f\square\varphi}\,\mathrm{d}\mu\right)\left(\int e^{-f}\,\mathrm{d}\mu\right) \le 1.$$
Our assertion follows.
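The quadratic expansion used above in the sketch of (c) implies (b) can be verified symbolically in the region where $\varphi$ is purely quadratic, $\varphi(y) = y^2/(2C_\tau^2)$. For $f(x) = x^2$ (our choice of test function) the infimum convolution of two quadratics has a closed form, and its Taylor expansion in $\varepsilon$ reproduces $\varepsilon f - \frac{1}{2}\varepsilon^2 C_\tau^2 |f'|^2$:

```python
import sympy as sp

eps, C, x, y = sp.symbols('epsilon C x y', positive=True)

# Quadratic part of the cost: phi(y) = y^2 / (2 C^2), and test function f(x) = x^2.
f = x**2
psi = y**2 / (2 * C**2) + eps * (x - y)**2   # psi_x(y) = phi(y) + eps f(x - y)

# Minimize psi_x over y exactly (psi_x is a convex quadratic in y).
ycrit = sp.solve(sp.diff(psi, y), y)[0]
infconv = sp.simplify(psi.subs(y, ycrit))    # (eps f) box phi, evaluated at x

# Expansion in eps: eps*f(x) - (1/2) eps^2 C^2 f'(x)^2 + O(eps^3).
expansion = sp.series(infconv, eps, 0, 3).removeO()
expected = eps * f - sp.Rational(1, 2) * eps**2 * C**2 * sp.diff(f, x)**2
assert sp.simplify(expansion - expected) == 0
```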

We are ready to derive the two-level concentration for convex sets.

Proof of Corollary 2. The argument is similar to [M, Corollary 1]. Due to Corollary 1, $\mu = \mu_1 \otimes \cdots \otimes \mu_n$ satisfies property $(\tau)$ with the cost function $\varphi(x) = \sum_{i=1}^n \varphi_0(x_i/C_\tau)$. Suppose that $\varphi(x) \le t$. Define $y, z \in \mathbb{R}^n$ in the following way. Take $y_i = x_i$ if $|x_i| \le C_\tau$ and $y_i = 0$ otherwise. Take $z_i = x_i$ if $|x_i| > C_\tau$ and $z_i = 0$ otherwise. Then $x = y + z$. Moreover,
$$\sum_{i=1}^n \varphi_0(y_i/C_\tau) + \sum_{i=1}^n \varphi_0(z_i/C_\tau) = \sum_{i=1}^n \varphi_0(x_i/C_\tau) \le t.$$
In particular $\|y\|_2^2 \le 2C_\tau^2 t$ and $t \ge \sum_{i=1}^n \varphi_0(z_i/C_\tau) \ge \|z\|_1/(2C_\tau)$, since $|u| - \frac{1}{2} \ge \frac{|u|}{2}$ for $|u| \ge 1$. This gives $x \in \sqrt{2t}\,C_\tau B_2^n + 2tC_\tau B_1^n$. Our assertion follows from Proposition 1.

Finally, we prove concentration for convex Lipschitz functions.

Proof of Corollary 3. The proof of (8) is similar to the proof of Proposition 4.8 in [L]. Let us define the convex set $A = \{f \le \mathrm{Med}_\mu f\}$ and observe that $\mu(A) \ge 1/2$. Moreover,
$$A + C_\tau\left(\sqrt{2t}B_2^n + 2tB_1^n\right) \subset \left\{f \le \mathrm{Med}_\mu f + C_\tau\left(a\sqrt{2t}+2bt\right)\right\}.$$
Applying Corollary 2 we get
$$\mu\left(\left\{f > \mathrm{Med}_\mu f + C_\tau\left(a\sqrt{2t}+2bt\right)\right\}\right) \le 2e^{-t}, \qquad t \ge 0.$$
Take $r \ge 0$. Suppose first that $\frac{r}{b} \le \frac{r^2}{a^2}$, i.e. $a^2 \le rb$, and take $t = \frac{r}{8b}$. Then
$$a\sqrt{2t}+2bt = \frac{a}{2}\sqrt{\frac{r}{b}} + \frac{r}{4} \le \frac{r}{2} + \frac{r}{4} \le r.$$
If $\frac{r}{b} \ge \frac{r^2}{a^2}$, take $t = \frac{r^2}{8a^2}$. Then
$$a\sqrt{2t}+2bt = \frac{r}{2} + \frac{br^2}{4a^2} \le \frac{r}{2} + \frac{r}{4} \le r.$$
In both cases $t = \frac{1}{8}\min\left\{\frac{r}{b}, \frac{r^2}{a^2}\right\}$, and by the monotonicity of $x \mapsto a\sqrt{2x}+2xb$, $x \ge 0$, it follows that
$$\mu\left(\{f > \mathrm{Med}_\mu f + rC_\tau\}\right) \le 2\exp\left(-\tfrac{1}{8}\min\left\{\tfrac{r}{b}, \tfrac{r^2}{a^2}\right\}\right), \qquad r \ge 0.$$
For the proof of (9) we follow [T]. Define the convex set $B = \{f < \mathrm{Med}_\mu f - C_\tau(a\sqrt{2t}+2bt)\}$ with $t \ge 0$. It follows that
$$B + C_\tau\left(\sqrt{2t}B_2^n + 2tB_1^n\right) \subset \{f < \mathrm{Med}_\mu f\},$$
and thus Corollary 2 yields
$$\frac{1}{2} \ge \mu\left(B + C_\tau\left(\sqrt{2t}B_2^n + 2tB_1^n\right)\right) \ge 1 - \frac{e^{-t}}{\mu(B)}.$$
Therefore $\mu(B) \le 2e^{-t}$. To finish the proof we proceed as above.
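The coordinate splitting used in the proof of Corollary 2 can be checked numerically: for any $x$ with $\varphi(x) \le t$, the small-coordinate part $y$ lands in $\sqrt{2t}\,C_\tau B_2^n$ and the large-coordinate part $z$ in $2tC_\tau B_1^n$. A sketch with random vectors (our illustration; the values $C_\tau = 2$, $n = 50$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def phi0(u):
    # phi_0(u) = u^2/2 for |u| <= 1, |u| - 1/2 for |u| > 1
    u = np.abs(u)
    return np.where(u <= 1.0, 0.5 * u**2, u - 0.5)

C_tau, n = 2.0, 50
for _ in range(100):
    x = rng.normal(scale=3.0, size=n)
    t = float(np.sum(phi0(x / C_tau)))   # phi(x) <= t holds by construction

    small = np.abs(x) <= C_tau
    y = np.where(small, x, 0.0)          # coordinates handled by the l2 ball
    z = np.where(small, 0.0, x)          # large coordinates go to the l1 ball

    assert np.allclose(x, y + z)
    assert np.sum(y**2) <= 2 * C_tau**2 * t + 1e-9    # ||y||_2^2 <= 2 C_tau^2 t
    assert np.sum(np.abs(z)) <= 2 * C_tau * t + 1e-9  # ||z||_1 <= 2 C_tau t
```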

Acknowledgements

We would like to thank Radosław Adamczak for his useful comments regarding concentration estimates for Lipschitz functions. This research was supported in part by the Institute for Mathematics and its Applications with funds provided by the National Science Foundation.

References

[A] R. Adamczak, Logarithmic Sobolev inequalities and concentration of measure for convex functions and polynomial chaoses. Bull. Pol. Acad. Sci. Math. 53 (2005), no. 2, 221-238.

[AS] S. Aida, D. Stroock, Moment estimates derived from Poincaré and logarithmic Sobolev inequalities. Math. Res. Lett. 1 (1994), no. 1, 75-86.

[ASt] R. Adamczak, M. Strzelecki, Modified logarithmic Sobolev inequalities for convex functions, preprint (to be submitted on arXiv).

[BG] S. Bobkov, F. Götze, Discrete isoperimetric and Poincaré-type inequalities. Probab. Theory Related Fields 114 (1999), no. 2, 245-277.

[BG2] S. G. Bobkov, F. Götze, Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. J. Funct. Anal. 163 (1999), no. 1, 1-28.

[BL] S. Bobkov, M. Ledoux, Poincaré's inequalities and Talagrand's concentration phenomenon for the exponential distribution. Probab. Theory Related Fields 107 (1997), no. 3, 383-400.

[GRST] N. Gozlan, C. Roberto, P.-M. Samson, P. Tetali, Kantorovich duality for general transport costs and applications, preprint (2014), arXiv:1412.7480.

[G] L. Gross, Logarithmic Sobolev inequalities. Amer. J. Math. 97 (1975), no. 4, 1061-1083.

[K] Ch. Klaassen, On an inequality of Chernoff. Ann. Probab. 13 (1985), no. 3, 966-974.

[LW] R. Latała, J. O. Wojtaszczyk, On the infimum convolution inequality, Studia Math. 189 (2008), 147-187.

[L] M. Ledoux, The concentration of measure phenomenon. Mathematical Surveys and Monographs, 89. American Mathematical Society, Providence, RI, 2001.

[M] B. Maurey, Some deviation inequalities, Geom. Funct. Anal. 1 (1991), 188-197.

[Mu] B. Muckenhoupt, Hardy's inequality with weights. Studia Math. 44 (1972), 31-38.

[T] M. Talagrand, An isoperimetric theorem on the cube and the Khintchine-Kahane inequalities. Proc. Amer. Math. Soc. 104 (1988), no. 3, 905-909.