Concentration inequalities for order statistics Using the entropy method and Rényi s representation Maud Thomas 1 in collaboration with Stéphane Boucheron 1 1 LPMA Université Paris-Diderot High Dimensional Probability VII Cargèse, May 26-30, 2014 1 / 19
Background: order statistics Sample: X 1,..., X n i.i.d. F. Order statistics X p1q ě... ě X pnq non-increasing rearrangement of X 1,..., X n. Goal X p1q : sample maximum. X pn{2q : sample median. @k, PtX pkq ď tu ř n `n i k i F i ptqp1 F ptqq n i. Classical statistic theory and Extreme Value Theory provide: Asymptotic distributions. Convergence of moments. derive simple, non-asymptotic variance/tail bounds for order statistics. 2 / 19
Background: concentration Concentration of measure phenomenon Any function of many independent random variables that does not depend too much on any of them is concentrated around its mean value. Example: Gaussian concentration X a standard Gaussian vector and Z f px q. Poincaré s inequality: VarrZs ď E} f } 2. Gross logarithmic Sobolev inequality: EntrZ 2 s ď 2E} f } 2. Cirelson s inequality: PtZ ě EZ ` tu ď expp t 2 {p2l 2 qq if } f } ď L. 3 / 19
Gaussian case and the Poincaré s inequality f px 1,..., X n q X pkq the rank k order statistic of a sample is a simple function of n independent random variables. f 1. X i are standard Gaussian. Poincaré s inequality ñ VarrX pkq s ď 1. Extreme Value Theory ñ VarrX p1q s Op1{ log nq. Classical statistic theory ñ VarrX pn{2q s Op1{nq. We do not understand (clearly) in which way order statistics are a smooth function of the sample. 4 / 19
Order statistics and spacings Proposition (Boucheron, T. (2012)) For all 0 ă k ď n{2 VarrX pkq s ď ke `Xpkq X pk`1q 2ı ker 2 k s. For all λ P R, Ent e λx pkq : λerx pkq e λx pkq s Ere λx pkqs log Ere λx pkqs ď ke e λx pk`1q ψpλpxpkq X pk`1q qq ke e λx pk`1q ψpλ k q with ψpxq 1 ` px 1qe x. 5 / 19
Remarks V k : k 2 k is called the Efron-Stein estimate of the variance of X pkq. Without any assumption such as: F belongs to the max-domain of attraction of an extreme value distribution G, i.e lim F n pa nñ`8 nx ` b nq Gpxq for every continuity point x of G. px pkq q is a sequence of extreme order statistics, if k fixed, n Ñ 8; central order statistics, if k{n Ñ p P p0, 1q while, n Ñ 8; intermediate order statistics, if k{n Ñ 0, k Ñ 8. 6 / 19
Proof Efron-Stein inequality (Efron, Stein (1981)) Let f : R n Ñ R be measurable, and let Z f px 1,..., X n q. Let Z i f i px 1,..., X i 1, X i`1,..., X n q where f i : R n 1 Ñ R is an arbitrary measurable function. Suppose Z is square-integrable, then: «ff nÿ VarrZs ď E pz Z i q 2. i 1 Modified logarithmic Sobolev inequality (Wu(2000); Massart (2000)) Let τpxq e x x 1. With the same notations, for any λ P R, Ent e «ff ÿ n λz ď E e λz τ p λpz Z i qq. i 1 7 / 19
Graphical assessment 1 2 3 4 5 + + + + + + + + + + Ratio between the Efron-Stein estimate and the variance of the maximum of n independent Gaussian random variables. n 2 p for p 1,..., 10. The asymptote is the line y 12{π 2 «1.22. 2 5 10 20 50 100 200 500 1000 8 / 19
Rényi s representation The order statistics of an exponential sample are distributed as partial sums of independent exponentially distributed random variables. Rényi s representation (Rényi (1953)) Let Y p1q ě Y p2q ě... ě Y pnq be the order statistics of an independent sample of the standard exponential distribution, then `Ypnq,..., Y piq,..., Y p1q ` En n,..., nÿ k i E k k,..., nÿ k 1 E k k where E 1,..., E n are i.i.d standard exponential random variables. 9 / 19
Quantile transformation Definition (Quantile function) F Ð ppq inf tx : F pxq ě pu, p P p0, 1q. Notation Uptq F Ð p1 1{tq, t P p1, 8q. Representation for order statistics If Y p1q ě... ě Y pnq are the order statistics of an exponential sample, then pu expqpy p1q q ě... ě pu expqpy pnq q are distributed as the order statistics of a sample drawn according to F. 10 / 19
Hazard rate, spacings and order statistics Definition (Hazard rate) The hazard rate h of a differentiable distribution function F is defined as: h F 1 {F F 1 {p1 F q. Lemma The distribution function F has non-decreasing hazard rate h, iff U exp is concave. Indeed, pu expq 1 1 h pu expq. If the distribution is log-concave, then the associated hazard rate is non-decreasing. 11 / 19
Variance bound for order statistics when the hazard rate is non-decreasing Recall: V k k 2 k. Proposition (Boucheron, T. (2012)) If F has non-decreasing hazard rate h, then for 1 ď k ď n{2, Var X pkq ď EVk ď 2 k E 1 hpx pk`1q q 2j. 12 / 19
Towards an exponential Efron-Stein inequality Definition (Exponential Efron-Stein inequality) Let Z f px 1,..., X n q where X 1,..., X n are independent random variables and V its Efron-Stein estimate of the variance of Z. Z satisfies an exponential Efron-Stein inequality if for all θ, λ ą 0 such that λθ ă 1 and E e λv {θ ă 8: Problem ı log E e λpz EZq ď λθ ı 1 λθ log E e λv {θ For an exponential sample, E e λv {θ 8. ë Find another decoupling inequality. ë Negative Association.. 13 / 19
Decoupling inequality: negative association Negative association X and Y are negatively associated if for any non-decreasing functions f, g E rf px qgpy qs ď E rf px qs E rgpy qs. Lemma If the distribution function F has non-decreasing hazard rate, then X pk`1q and k X pkq X pk`1q are negatively associated. 14 / 19
Exponential Efron-Stein inequality for order statistics Proposition (Boucheron, T. (2012)) If F has non-decreasing hazard rate h, then for λ ě 0, and 1 ď k ď n{2, log Ee λpx pkq EX pkq q ď λ k 2 E k `eλ k 1 λ k 2 E «c Vk k ff e λ? V k {k 1. 15 / 19
Gaussian hazard rate Uptq Φ Ð p1 1{tq for t ą 1. 4 4 3 3 U(exp(x)) 2 1 U(2 exp(x)) 2 0 Gaussian distribution 1 Absolute value of Gaussian random variable 1 0 0 2 4 6 8 10 0 2 4 6 8 10 16 / 19
Variance of absolute values of Gaussian random variables Proposition (Boucheron, T. (2012)) Let n ě 3, let X pkq be the rank k order statistic of absolute values of n standard independent Gaussian random variables, VarrX pkq s ď 1 k log 2 log ` 2n k 8 logp1 ` 4 k log log ` 2n k q. For the maximum (k 1), the bound becomes: 1 8 log 2 log 2n logp1 ` 4 log log 2nq. M n : maximum of n standard Gaussian r.v Chatterjee (Talagrand L1-L2 inequality): VarrM ns ď 1 1`log n. Nourdin (Ornstein-Uhlenbeck process): VarrM ns ď 2 log n. 17 / 19
Bernstein inequality for the maximum of absolute values of Gaussian random variables Theorem (Boucheron, T. (2012)) For n such that the solution v n of equation is smaller than 1, for all 0 ď λ ă 1? vn, 16{x ` logp1 ` 2{x ` 4 logp4{xqq logp2nq log Ee λpx p1q EX p1q q ď v n λ 2 2p1?v n λq. 18 / 19
Thank you for your attention! 19 / 19