# SUPPORT UNION RECOVERY IN HIGH-DIMENSIONAL MULTIVARIATE REGRESSION


## Transcription

G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN

separately, it might be beneficial to first estimate the set of variables which are relevant to any of the multivariate responses and to estimate only subsequently the individual supports within that set. Thus we focus on the problem of recovering the union of the supports, namely the set $S := \bigcup_{k=1}^{K} S_k$, corresponding to the subset of indices $i \in \{1,\ldots,p\}$ that are involved in at least one regression. We consider a range of problems in which variables can be relevant to all, some, only one or none of the regressions, and we investigate if and how the overlap of the individual supports and the relatedness of individual regressions benefit or hinder estimation of the support union.

The support union problem can be understood as the generalization of the problem of variable selection to the group setting. Rather than selecting specific components of a coefficient vector, we aim to select specific rows of a coefficient matrix. We thus also refer to the support union problem as the row selection problem. We note that recovering $S$, although not equivalent to recovering each of the distinct individual supports $S_k$, addresses the essential difficulty in recovering those supports. Indeed, as we show in Section 2.2, given a method that returns the row support $S$ with $|S| \ll p$ (with high probability), it is straightforward to recover the individual supports $S_k$ by ordinary least-squares and thresholding.

If computational complexity were not a concern, the natural way to perform row selection for $B^*$ would be by solving the optimization problem

$$
\arg\min_{B \in \mathbb{R}^{p \times K}} \Big\{ \frac{1}{2n} \|Y - XB\|_F^2 + \lambda_n \|B\|_{\ell_0/\ell_q} \Big\}, \tag{6}
$$

where $B = (\beta_{ik})_{1 \le i \le p,\, 1 \le k \le K}$ is a $p \times K$ matrix, the quantity $\|\cdot\|_F$ denotes the Frobenius norm,¹ and the norm $\|B\|_{\ell_0/\ell_q}$ counts the number of rows in $B$ that have nonzero $\ell_q$ norm.

As before, the $\ell_0$ component of this regularizer yields a nonconvex and computationally intractable problem, so that it is natural to consider the relaxation

$$
\arg\min_{B \in \mathbb{R}^{p \times K}} \Big\{ \frac{1}{2n} \|Y - XB\|_F^2 + \lambda_n \|B\|_{\ell_1/\ell_q} \Big\}, \tag{7}
$$

where $\|B\|_{\ell_1/\ell_q}$ is the block $\ell_1/\ell_q$ norm

$$
\|B\|_{\ell_1/\ell_q} := \sum_{i=1}^{p} \Big( \sum_{j=1}^{K} |\beta_{ij}|^q \Big)^{1/q} = \sum_{i=1}^{p} \|\beta_i\|_q. \tag{8}
$$

The relaxation (7) is a natural generalization of the Lasso; indeed, it specializes to the Lasso in the case $K = 1$. For later reference, we also note that setting $q = 1$ leads to the use of the $\ell_1/\ell_1$ block norm in the relaxation (7). Since this norm decouples across both the rows and columns, this particular choice is equivalent to solving $K$ separate Lasso problems, one for each column of the $p \times K$ ...

¹ The Frobenius norm of a matrix $A$ is given by $\|A\|_F := \sqrt{\sum_{i,j} A_{ij}^2}$.
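As a minimal illustration of the block norm in (8), the following sketch computes $\|B\|_{\ell_1/\ell_q}$ with NumPy and checks the decoupling that occurs for $q = 1$. The function names and the small matrix `B_example` are hypothetical and chosen only for this illustration; they do not come from the paper.

```python
import numpy as np

def block_l1_lq_norm(B, q=2.0):
    """Block l1/lq norm of equation (8): sum over rows of the row-wise lq norms."""
    B = np.asarray(B, dtype=float)
    row_norms = np.linalg.norm(B, ord=q, axis=1)  # ||beta_i||_q for each row i
    return float(row_norms.sum())

def row_support(B, tol=0.0):
    """Indices of rows whose l2 norm exceeds tol (the support union S)."""
    B = np.asarray(B, dtype=float)
    return np.flatnonzero(np.linalg.norm(B, axis=1) > tol)

# Hypothetical 4 x 2 coefficient matrix: only rows 0 and 2 are active.
B_example = np.array([[3.0, 4.0],
                      [0.0, 0.0],
                      [0.0, -2.0],
                      [0.0, 0.0]])

l1_l2 = block_l1_lq_norm(B_example, q=2)  # row norms 5, 0, 2, 0
l1_l1 = block_l1_lq_norm(B_example, q=1)  # sum of all absolute entries

# For q = 1 the norm splits into one l1 norm per column, which is why the
# relaxation then reduces to K separate Lasso problems.
per_column_l1 = sum(np.abs(B_example[:, k]).sum() for k in range(B_example.shape[1]))
```

For $q = 2$ the rows enter only through their Euclidean norms, so the penalty couples the $K$ entries of a row; for $q = 1$ the value `l1_l1` coincides with `per_column_l1`.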

necessary for variable selection consistency of the Lasso. Indeed, in the absence of such conditions, it is always possible to make the Lasso fail, even with an arbitrarily large sample size. [See, however, Meinshausen and Yu (2009) for methods that weaken the irrepresentable condition.] Note that these assumptions are trivially satisfied by the standard Gaussian ensemble $\Sigma = I_{p \times p}$, with $C_{\min} = C_{\max} = 1$, $D_{\max} = 1$ and $\gamma = 1$. More generally, it can be shown that various matrix classes (e.g., Toeplitz matrices, tree-structured covariance matrices, bounded off-diagonal matrices) satisfy these conditions [Meinshausen and Bühlmann (2006); Zhao and Yu (2006); Wainwright (2009b)].

We require a few pieces of notation before stating the main results. For an arbitrary matrix $B_S \in \mathbb{R}^{s \times K}$ with $i$th row $\beta_i \in \mathbb{R}^{1 \times K}$, we define the matrix $\zeta(B_S) \in \mathbb{R}^{s \times K}$ with $i$th row

$$
\zeta(\beta_i) := \frac{\beta_i}{\|\beta_i\|_2}, \tag{15}
$$

when $\beta_i \neq 0$, and we set $\zeta(\beta_i) = 0$ otherwise. With this notation, the sparsity-overlap function is given by

$$
\psi(B) := \big\| \zeta(B_S)^T (\Sigma_{SS})^{-1} \zeta(B_S) \big\|_2, \tag{16}
$$

where $\|\cdot\|_2$ denotes the spectral norm. We use this sparsity-overlap function to define the sample complexity parameter, which captures the effective sample size

$$
\theta_{\ell_1/\ell_2}(n, p; B^*) := \frac{n}{2\psi(B^*) \log(p - s)}. \tag{17}
$$

In the following two theorems, we consider a random design matrix $X$ drawn with i.i.d. $N(0, \Sigma)$ row vectors, where $\Sigma$ satisfies assumptions (A1)-(A3), and an observation matrix $Y$ specified by model (4). In order to capture dependence induced by the design covariance matrix, for any positive semidefinite matrix $Q \succeq 0$, we define the quantities

$$
\rho_l(Q) := \frac{1}{2} \min_{i \neq j} \big[ Q_{ii} + Q_{jj} - 2Q_{ij} \big] \tag{18a}
$$

and

$$
\rho_u(Q) := \max_i Q_{ii}. \tag{18b}
$$

We note that by definition, we have $\rho_l(Q) \le \rho_u(Q)$ whenever $Q \succeq 0$. Our bounds are stated in terms of these quantities as applied to the conditional covariance matrix

$$
\Sigma_{S^c S^c \mid S} := \Sigma_{S^c S^c} - \Sigma_{S^c S} (\Sigma_{SS})^{-1} \Sigma_{S S^c}.
$$

Our first result is an achievability result, showing that the multivariate group Lasso succeeds in recovering the row support and yields consistency in $\ell_\infty/\ell_2$ norm.
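The quantities in (15) to (17) are straightforward to evaluate numerically. The sketch below is one possible NumPy rendering, assuming the block $B_S$ and the covariance block $\Sigma_{SS}$ are available as arrays; the function names and the example inputs are hypothetical.

```python
import numpy as np

def zeta(B_S):
    """Row-normalize B_S as in (15); rows that are exactly zero stay zero."""
    B_S = np.asarray(B_S, dtype=float)
    norms = np.linalg.norm(B_S, axis=1, keepdims=True)
    safe = np.where(norms > 0, norms, 1.0)  # avoid 0/0; zero rows give 0/1 = 0
    return B_S / safe

def sparsity_overlap(B_S, Sigma_SS):
    """psi(B) of (16): spectral norm of zeta(B_S)^T Sigma_SS^{-1} zeta(B_S)."""
    Z = zeta(B_S)
    M = Z.T @ np.linalg.solve(np.asarray(Sigma_SS, dtype=float), Z)
    return float(np.linalg.norm(M, ord=2))  # largest singular value

def theta(n, p, s, psi):
    """Sample complexity parameter of (17): n / (2 psi log(p - s))."""
    return n / (2.0 * psi * np.log(p - s))

# Hypothetical example: s = 3 active rows, K = 2 identical columns, identity design.
B_S = np.array([[1.0, 1.0],
                [-2.0, -2.0],
                [3.0, 3.0]])
psi_value = sparsity_overlap(B_S, np.eye(3))
theta_value = theta(n=100, p=20, s=3, psi=psi_value)
```

Because $\zeta$ keeps only the direction of each row, rescaling a row of $B_S$ by a positive constant leaves `psi_value` unchanged.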
We state this result for sequences of regularization parameters $\lambda_n = \sqrt{\frac{f(p) \log p}{n}}$, where $f(p) \to +\infty$ as $p \to +\infty$ is any function such that $\lambda_n \to 0$. We also assume that $n$ is sufficiently large such that $s/n < 1/2$.

THEOREM 1. Suppose that we solve the multivariate group Lasso with specified regularization parameter sequence $\lambda_n$ for a sequence of problems indexed by $(n, p, B^*, \Sigma)$ that satisfy assumptions (A1)-(A3), and such that, for some $\nu > 0$,

$$
\theta_{\ell_1/\ell_2}(n, p; B^*) = \frac{n}{2\psi(B^*) \log(p - s)} > (1 + \nu)\, \frac{\rho_u(\Sigma_{S^c S^c \mid S})}{\gamma^2}. \tag{19}
$$

Then for universal constants $c_i > 0$ (i.e., independent of $n, p, s, B^*, \Sigma$), with probability greater than $1 - c_2 \exp(-c_3 K \log s) - c_0 \exp(-c_1 \log(p - s))$, the following statements hold:

(a) The multivariate group Lasso has a unique solution $\hat{B}$ with row support $S(\hat{B})$ that is contained within the true row support $S(B^*)$, and moreover satisfies the bound

$$
\|\hat{B} - B^*\|_{\ell_\infty/\ell_2} \le \underbrace{\sqrt{\frac{8K \log s}{C_{\min}\, n}} + \lambda_n D_{\max} + 6\lambda_n \sqrt{\frac{s^2}{C_{\min}\, n}}}_{\rho(n, s, \lambda_n)}. \tag{20}
$$

(b) If $\rho(n, s, \lambda_n) / b^*_{\min} = o(1)$, then the estimate of the row support, $S(\hat{B}) := \{ i \in \{1,\ldots,p\} \mid \hat{\beta}_i \neq 0 \}$, specified by this unique solution is equal to the row support set $S(B^*)$ of the true model.

Note that the theorem is naturally separated into two distinct but related claims. Part (a) guarantees that the method produces no false inclusions and, moreover, bounds the maximum $\ell_2$-error across the rows. Part (b) requires some additional assumptions (namely, the restriction $\rho / b^*_{\min} = o(1)$, ensuring that the error $\rho$ is of lower order than the minimum $\ell_2$-norm $b^*_{\min}$ across rows) but also guarantees the stronger result of no false exclusions as well, so that the method recovers the row support exactly. Note that the probability of these events converges to one only if both $(p - s)$ and $s$ tend to infinity, which might seem counter-intuitive initially (since problems with larger support sets $s$ might seem harder). However, as we discuss at the end of Section 3.3, this dependence can be removed at the expense of a slightly slower convergence rate for $\|\hat{B} - B^*\|_{\ell_\infty/\ell_2}$.
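To make the sample-size condition (19) concrete, the following sketch evaluates the quantities (18a) and (18b), the conditional covariance $\Sigma_{S^c S^c \mid S}$, and the inequality $\theta > (1+\nu)\rho_u/\gamma^2$. It assumes the population covariance and the constants $\gamma$, $\nu$ are supplied by the user; all function names are hypothetical.

```python
import numpy as np

def rho_l(Q):
    """(18a): half the minimum over i != j of Q_ii + Q_jj - 2 Q_ij."""
    Q = np.asarray(Q, dtype=float)
    d = np.diag(Q)
    gaps = d[:, None] + d[None, :] - 2.0 * Q
    off_diag = ~np.eye(Q.shape[0], dtype=bool)  # exclude the i == j entries
    return 0.5 * float(gaps[off_diag].min())

def rho_u(Q):
    """(18b): the largest diagonal entry of Q."""
    return float(np.max(np.diag(np.asarray(Q, dtype=float))))

def conditional_covariance(Sigma, S):
    """Sigma_{S^c S^c | S} = Sigma_{S^c S^c} - Sigma_{S^c S} Sigma_SS^{-1} Sigma_{S S^c}."""
    Sigma = np.asarray(Sigma, dtype=float)
    S = np.asarray(S)
    Sc = np.setdiff1d(np.arange(Sigma.shape[0]), S)
    A = Sigma[np.ix_(Sc, S)]
    return Sigma[np.ix_(Sc, Sc)] - A @ np.linalg.solve(Sigma[np.ix_(S, S)], A.T)

def satisfies_condition_19(n, p, s, psi, rho_u_value, gamma, nu):
    """Check theta > (1 + nu) * rho_u / gamma^2, with theta defined as in (17)."""
    theta = n / (2.0 * psi * np.log(p - s))
    return bool(theta > (1.0 + nu) * rho_u_value / gamma**2)

# Standard Gaussian ensemble (Sigma = I, gamma = 1), hypothetical sizes p = 8, s = 2.
Q = conditional_covariance(np.eye(8), S=[0, 1])
enough = satisfies_condition_19(n=100, p=8, s=2, psi=2.0,
                                rho_u_value=rho_u(Q), gamma=1.0, nu=0.1)
too_few = satisfies_condition_19(n=5, p=8, s=2, psi=2.0,
                                 rho_u_value=rho_u(Q), gamma=1.0, nu=0.1)
```

The check only covers the sample-size requirement; assumptions (A1)-(A3) and the choice of $\lambda_n$ are not verified by this sketch.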
Our second main theorem is a negative result, showing that the multivariate group Lasso fails with high probability if the rescaled sample size $\theta_{\ell_1/\ell_2}$ is below a critical threshold. In order to clarify the phrasing of this result, note that Theorem 1 can be summarized succinctly as guaranteeing that there is a unique solution $\hat{B}$ with the correct row support that satisfies $\|\hat{B} - B^*\|_{\ell_\infty/\ell_2} = o(b^*_{\min})$. The following result shows that such a guarantee cannot hold if the sample size $n$ scales too slowly relative to $p$, $s$ and the other problem parameters.

row support has been recovered, it is straightforward to recover the individual supports of each column of the regression matrix via the additional steps of performing ordinary least squares and thresholding.

Efficient multi-stage estimation of individual supports:

(1) estimate the support union with $\hat{S}$, the support union of the solution $\hat{B}$ of the multivariate group Lasso;

(2) compute the restricted ordinary least squares (ROLS) estimate,

$$
\tilde{B}_{\hat{S}} := \arg\min_{B_{\hat{S}}} \|Y - X_{\hat{S}} B_{\hat{S}}\|_F \tag{21}
$$

for the restricted multivariate problem;

(3) compute the matrix $T(\tilde{B}_{\hat{S}})$ obtained by thresholding $\tilde{B}_{\hat{S}}$ at the level $2\sqrt{\frac{2\log(K|\hat{S}|)}{C_{\min}\, n}}$, and estimate the individual supports by the nonzero entries of $T(\tilde{B}_{\hat{S}})$.

The following result, which is proved in Appendix A, shows that under the assumptions of Theorem 1, the additional post-processing applied to the support union estimate will recover the individual supports with high probability:

COROLLARY 2. Under assumptions (A1)-(A3) and the additional assumptions of Theorem 1, if for all individual nonzero coefficients $\beta^*_{ik}$, $i \in S$, $1 \le k \le K$, we have $|\beta^*_{ik}| \ge 2\sqrt{\frac{4\log(Ks)}{C_{\min}\, n}}$, then with probability greater than $1 - \Theta(\exp(-c_0 K \log s))$ the above two-step estimation procedure recovers the individual supports of $B^*$.

Some consequences of Theorems 1 and 2. We begin by making some simple observations about the sparsity-overlap function.

LEMMA 1. (a) For any design satisfying assumption (A1), the sparsity-overlap $\psi(B^*)$ obeys the bounds

$$
\frac{s}{C_{\max} K} \le \psi(B^*) \le \frac{s}{C_{\min}}; \tag{22}
$$

(b) if $\Sigma_{SS} = I_{s \times s}$, and if the columns $(Z^{(k)*})$ of the matrix $Z^* = \zeta(B^*)$ are orthogonal, then the sparsity-overlap function is $\psi(B^*) = \max_{k=1,\ldots,K} \|Z^{(k)*}\|_2^2$.

The proof of this claim is provided in Appendix C. Based on this lemma, we now study some special cases of Theorems 1 and 2. The simplest special case is the univariate regression problem ($K = 1$), in which case the quantity $\zeta(\beta^*)$ [as defined in equation (15)] simply yields an $s$-dimensional sign vector with elements $z^*_i = \operatorname{sign}(\beta^*_i)$. [Recall that the sign function is defined as $\operatorname{sign}(0) = 0$, $\operatorname{sign}(x) = 1$ ...
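Steps (2) and (3) of the multi-stage procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the support union estimate `S_hat` is taken as given (the group Lasso fit of step (1) is not implemented here), the threshold `tau` is passed in by the caller rather than computed from $C_{\min}$, and the synthetic data, seed and function names are hypothetical.

```python
import numpy as np

def restricted_ols(X, Y, S_hat):
    """Step (2): least squares of Y on the columns of X indexed by S_hat."""
    X_S = X[:, S_hat]
    B_S, *_ = np.linalg.lstsq(X_S, Y, rcond=None)
    return B_S  # shape (|S_hat|, K)

def individual_supports(B_S, S_hat, tau):
    """Step (3): hard-threshold entries at level tau and read off each column's support."""
    S_hat = np.asarray(S_hat)
    keep = np.abs(B_S) > tau
    return [S_hat[keep[:, k]] for k in range(B_S.shape[1])]

# Synthetic problem (all values hypothetical): n = 200, p = 10, K = 2.
rng = np.random.default_rng(0)
n, p, K = 200, 10, 2
B_true = np.zeros((p, K))
B_true[1] = [2.0, 0.0]    # row 1 is relevant to regression 0 only
B_true[4] = [0.0, -2.0]   # row 4 is relevant to regression 1 only
B_true[7] = [1.5, 1.5]    # row 7 is relevant to both
X = rng.standard_normal((n, p))
Y = X @ B_true + 0.1 * rng.standard_normal((n, K))

S_hat = [1, 4, 7]         # assumed output of the support union step
B_rols = restricted_ols(X, Y, S_hat)
supports = individual_supports(B_rols, S_hat, tau=0.5)
```

Here `supports[k]` holds the estimated support of the $k$th regression. The value `tau=0.5` is an arbitrary choice for the demonstration and is not the level prescribed in step (3).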

the ordinary Lasso strategy requires, with high probability, at least as many samples as the $\ell_1/\ell_2$ multivariate group Lasso. In particular, the relative efficiency of multivariate group Lasso versus ordinary Lasso is given by the ratio

$$
\frac{\max_{k=1,\ldots,K} \psi(\beta^{*(k)}_S) \log(p - s_k)}{\psi(B^*_S) \log(p - s)} \ge 1. \tag{24}
$$

We consider the special case of identical regressions, for which a result can be stated for any covariance design, and the case of orthonormal regressions, which illustrates Corollary 3.

EXAMPLE 1 (Identical regressions). Suppose that $B^* := \beta^* \mathbf{1}_K^T$, that is, $B^*$ consists of $K$ copies of the same coefficient vector $\beta^* \in \mathbb{R}^p$, with support of cardinality $|S| = s$. We then have $[\zeta(B^*)]_{ij} = \operatorname{sign}(\beta^*_i)/\sqrt{K}$, from which we see that $\psi(B^*) = z^{*T} (\Sigma_{SS})^{-1} z^*$, with $z^*$ being an $s$-dimensional sign vector with elements $z^*_i = \operatorname{sign}(\beta^*_i)$. Consequently, we have the equality $\psi(B^*) = \psi(\beta^*)$, so that under our analysis there is no benefit in using the multivariate group Lasso relative to the strategy of solving separate Lasso problems and constructing the union of individually estimated supports. This behavior may seem counter-intuitive, since under the model (4) we essentially have $K$ observations of the coefficient vector $\beta^*$ with the same design matrix but $K$ independent noise realizations, which could help to reduce the effective noise variance from $\sigma^2$ to $\sigma^2/K$ if the fact that the regressions are identical is known. It must be borne in mind, however, that in our high-dimensional analysis the noise variance does not grow as the dimensionality grows, and thus asymptotically the noise is dominated by the interference between the covariates, which grows as $(p - s)$. It is thus this high-dimensional interference that dominates the rates given in Theorems 1 and 2.

In contrast to this pessimistic example, we now turn to the most optimistic extreme:

EXAMPLE 2 ("Orthonormal" regressions).
Suppose that $\Sigma_{SS} = I_{s \times s}$ and (for $s > K$) suppose that $B^*$ is constructed such that the columns of the $s \times K$ matrix $\zeta(B^*)$ are orthogonal and with equal norm (which implies their norm equals $\sqrt{s/K}$). Under these conditions, we claim that the sample complexity of multivariate group Lasso is smaller than that of the ordinary Lasso by a factor of $1/K$. Indeed, using Lemma 1(b), we observe that

$$
K\psi(B^*) = K\|Z^{(1)*}\|^2 = \sum_{k=1}^{K} \|Z^{(k)*}\|^2 = \operatorname{tr}(Z^{*T} Z^*) = \operatorname{tr}(Z^* Z^{*T}) = s,
$$

because $Z^* Z^{*T} \in \mathbb{R}^{s \times s}$ is the Gram matrix of $s$ unit vectors in $\mathbb{R}^K$ and its diagonal elements are therefore all equal to 1. Consequently, the multivariate group Lasso recovers the row support with high probability for sequences such that

$$
\frac{n}{2(s/K) \log(p - s)} > 1,
$$

which allows for sample sizes $K$ times smaller than the ordinary Lasso approach.

Corollary 3 and the subsequent examples address the case of uncorrelated design ($\Sigma_{SS} = I_{s \times s}$) on the row support $S$, for which the multivariate group Lasso is never worse than the ordinary Lasso in performing row selection. The following example shows that if the supports are disjoint, the ordinary Lasso has the same sample complexity as the multivariate group Lasso for uncorrelated design $\Sigma_{SS} = I_{s \times s}$, but can be better than the multivariate group Lasso for designs $\Sigma_{SS}$ with suitable correlations:

COROLLARY 4 (Disjoint supports). Suppose that the support sets $S_k$ of individual regression problems are disjoint. Then for any design covariance $\Sigma_{SS}$, we have

$$
\max_{1 \le k \le K} \psi(\beta^{*(k)}_S) \overset{(a)}{\le} \psi(B^*_S) \overset{(b)}{\le} \sum_{k=1}^{K} \psi(\beta^{*(k)}_S). \tag{25}
$$

PROOF. First note that, since all supports are disjoint, $Z^{(k)*}_i = \operatorname{sign}(\beta^*_{ik})$, so that $Z^{(k)*}_S = \zeta(\beta^{*(k)}_S)$. Inequality (b) is then immediate because we have $\|Z^{*T}_S \Sigma_{SS}^{-1} Z^*_S\|_2 \le \operatorname{tr}(Z^{*T}_S \Sigma_{SS}^{-1} Z^*_S)$. To establish inequality (a), we note that

$$
\psi(B^*) = \max_{x \in \mathbb{R}^K :\, \|x\| \le 1} x^T Z^{*T}_S \Sigma_{SS}^{-1} Z^*_S\, x \ge \max_{1 \le k \le K} e_k^T Z^{*T}_S \Sigma_{SS}^{-1} Z^*_S\, e_k = \max_{1 \le k \le K} Z^{(k)*T}_S \Sigma_{SS}^{-1} Z^{(k)*}_S.
$$

A caveat in interpreting Corollary 4, and more generally in comparing the performance of the ordinary Lasso and the multivariate group Lasso, is that for a general covariance matrix $\Sigma_{SS}$, assumptions (A2) and (A3) required by Theorem 1 do not induce the same constraints on the covariance matrix when applied to the multivariate problem as when applied to the individual regressions. Indeed, in the latter case, (A2) would require $\max_k \|\Sigma_{S_k^c S_k} \Sigma_{S_k S_k}^{-1}\|_\infty \le 1 - \gamma$ and (A3) would require $\max_k \|\Sigma_{S_k S_k}^{-1}\|_\infty \le D_{\max}$. Thus (A3) is a stronger assumption in the multivariate case but (A2) is not.

We illustrate Corollary 4 with an example.

EXAMPLE 3 (Disjoint support with uncorrelated design). Suppose that $\Sigma_{SS} = I_{s \times s}$, and the supports are disjoint.
In this case, we claim that the sample complexity of the $\ell_1/\ell_2$ multivariate group Lasso is the same as the ordinary Lasso.
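The three examples and the bounds of Corollary 4 can be checked numerically on small instances. The sketch below is a sanity check rather than a proof; the helper `ar1_covariance` (entries $\mu^{|i-j|}$) and every matrix used here are hypothetical choices made for illustration.

```python
import numpy as np

def sparsity_overlap(B_S, Sigma_SS):
    """psi of equation (16) for an s x K block B_S and an s x s covariance block."""
    B_S = np.asarray(B_S, dtype=float)
    norms = np.linalg.norm(B_S, axis=1, keepdims=True)
    Z = B_S / np.where(norms > 0, norms, 1.0)  # zero rows stay zero
    return float(np.linalg.norm(Z.T @ np.linalg.solve(Sigma_SS, Z), ord=2))

def ar1_covariance(s, mu):
    """Hypothetical correlated design: entry (i, j) equals mu^|i - j|."""
    idx = np.arange(s)
    return mu ** np.abs(idx[:, None] - idx[None, :])

# Example 1: K identical columns give psi(B) = psi(beta) for any design.
beta = np.array([1.0, -2.0, 3.0, -1.0])
Sigma4 = ar1_covariance(4, 0.5)
psi_identical = sparsity_overlap(np.outer(beta, np.ones(3)), Sigma4)
psi_single = sparsity_overlap(beta[:, None], Sigma4)

# Example 2: orthogonal, equal-norm columns of zeta(B) with identity design give s / K.
angles = np.array([0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4])
row_scales = np.array([1.0, 2.0, 3.0, 4.0])[:, None]  # row lengths do not matter
B_orth = row_scales * np.column_stack([np.cos(angles), np.sin(angles)])
psi_orth = sparsity_overlap(B_orth, np.eye(4))

# Example 3 and Corollary 4: disjoint supports.
B_disjoint = np.array([[1.0, 0.0], [-2.0, 0.0], [0.0, 3.0]])
psi_disjoint_identity = sparsity_overlap(B_disjoint, np.eye(3))
Sigma3 = ar1_covariance(3, 0.5)
psi_joint = sparsity_overlap(B_disjoint, Sigma3)
psi_columns = [sparsity_overlap(B_disjoint[:, [k]], Sigma3) for k in range(2)]
```

With the identity design, `psi_disjoint_identity` equals the size of the largest individual support, which is also the largest per-column value, in line with Example 3.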

FIG. 1. Comparison of sparsity-overlap functions for $\ell_1/\ell_2$ and the Lasso. For the pair $\frac{1}{2\pi}(\vartheta_1, \vartheta_2)$, we represent in each row of plots, corresponding respectively to $\mu = 0$ (top), $0.9$ (middle) and $-0.9$ (bottom), from left to right, the quantities: $\psi(B^*)$ (left), $\max(\psi(\beta^{*(1)}), \psi(\beta^{*(2)}))$ (center) and $\max(0, \psi(B^*) - \max(\psi(\beta^{*(1)}), \psi(\beta^{*(2)})))$ (right). The latter indicates when the inequality $\psi(B^*) \le \max(\psi(\beta^{*(1)}), \psi(\beta^{*(2)}))$ does not hold and by how much it is violated.

create needles, slits or flaps in the graph of the function $\psi$. Denote by $R_+$ and $R_-$ the sets

$$
R_+ = \{(\vartheta_1, \vartheta_2) \mid \min[\cos(\vartheta_1)\cos(\vartheta_2), \sin(\vartheta_1)\sin(\vartheta_2)] > 0\},
$$

$$
R_- = \{(\vartheta_1, \vartheta_2) \mid \max[\cos(\vartheta_1)\cos(\vartheta_2), \sin(\vartheta_1)\sin(\vartheta_2)] < 0\}
$$

on which $\psi(B^*)$ reaches its minimum value when $\mu \ge 0.5$ and $\mu \le -0.5$, respectively (see middle and bottom center plots in Figure 4). For $\mu = 0$, the top center graph illustrates that $\psi(B^*)$ is equal to 2 except for the cases of matrices $B^*_S$ with disjoint support, corresponding to the discrete set $D = \{(k\frac{\pi}{2}, (k \pm 1)\frac{\pi}{2}),\ k \in \mathbb{Z}\}$ for which it equals 1. The top rightmost graph illustrates that, as shown in Corollary 3, the inequality always holds for an uncorrelated design. For $\mu > 0$, the inequality $\psi(B^*) \le \max(\psi(\beta^{*(1)}), \psi(\beta^{*(2)}))$ is violated only on a subset of $S \cup R_-$; and for $\mu < 0$, the inequality is symmetrically violated on a subset of $S \cup R_+$ (see Figure 4).

FIG. 2. Plots of support union recovery probability, $P[\hat{S} = S]$, versus the control parameter $\theta = n/[2s \log(p - s)]$ for two different types of sparsity: linear sparsity in the left column ($s = p/8$) and logarithmic sparsity in the right column ($s = 2\log_2(p)$). The first three rows are based on using the multivariate group Lasso to estimate the support for the three cases of identical regression, intermediate angles and orthonormal regressions. The fourth row presents results for the Lasso in the case of identical regressions.
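The control parameter in the caption can be inverted to find the sample size that corresponds to a given $\theta$. The sketch below does this for the two sparsity regimes named in the caption; rounding $n$ up to an integer and truncating $s$ to an integer are assumptions of this illustration, and the function names are hypothetical.

```python
import math

def sample_size(theta, p, s):
    """Invert theta = n / [2 s log(p - s)] for n, rounding up to an integer."""
    return math.ceil(theta * 2.0 * s * math.log(p - s))

def linear_sparsity(p):
    """s = p / 8, truncated to an integer."""
    return p // 8

def logarithmic_sparsity(p):
    """s = 2 log2(p), truncated to an integer."""
    return int(2 * math.log2(p))

p = 128
n_linear = sample_size(1.0, p, linear_sparsity(p))        # s = 16
n_log = sample_size(1.0, p, logarithmic_sparsity(p))      # s = 14
```

Sweeping `theta` over a grid and simulating at the resulting sample sizes is one way to produce curves on the rescaled axis used in the figure.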

FIG. 3. Plots of support recovery probability, $P[\hat{S} = S]$, versus the control parameter $\theta = n/[2s \log(p - s)]$ for two different types of sparsity: logarithmic sparsity on top ($s = O(\log(p))$) and linear sparsity on bottom ($s = \alpha p$), and for increasing values of $p$ from left to right. The noise level is set at $\sigma = 0.1$. Each graph shows four curves (black, red, green, blue) corresponding to the case of independent $\ell_1$ regularization, and, for $\ell_1/\ell_2$ regularization, the cases of identical regression, intermediate angles and orthonormal regressions. Note how curves corresponding to the same case across different problem sizes $p$ all coincide, as predicted by Theorems 1 and 2. Moreover, consistent with the theory, the curves for the identical regression group reach $P[\hat{S} = S] \approx 0.50$ at $\theta \approx 1$, whereas the orthonormal regression group reaches 50% success substantially earlier.

(top three rows), as well as the reference Lasso case for the case of identical regressions (bottom row). Each panel plots the success probability, $P[\hat{S} = S]$, versus the rescaled sample size $\theta = n/[2s \log(p - s)]$. Under this rescaling, Theorems 1 and 2 predict that the curves should align, and that the success probability should transition to 1 once $\theta$ exceeds a critical threshold (dependent on the type of ensemble). Note that for suitably large problem sizes ($p \ge 128$), the curves do align in the predicted way, showing step-function behavior.

Figure 3 plots data from the same simulations in a different format. Here the top row corresponds to logarithmic sparsity, and the bottom row to linear sparsity; each panel shows the four different choices for $B^*$, with the problem size $p$ increasing from left to right. Note how in each panel the location of the transition of $P[\hat{S} = S]$ to one shifts from right to ...

### High-dimensional support union recovery in multivariate regression

High-dimesioal support uio recovery i multivariate regressio Guillaume Oboziski Departmet of Statistics UC Berkeley gobo@stat.berkeley.edu Marti J. Waiwright Departmet of Statistics Dept. of Electrical

### Department of Computer Science, University of Otago

Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

### I. Chi-squared Distributions

1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

### Asymptotic Growth of Functions

CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

### Chapter 7 Methods of Finding Estimators

Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

### Convexity, Inequalities, and Norms

Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

### Properties of MLE: consistency, asymptotic normality. Fisher information.

Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

### 3. Covariance and Correlation

Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics

### NPTEL STRUCTURAL RELIABILITY

NPTEL Course O STRUCTURAL RELIABILITY Module # 0 Lecture 1 Course Format: Web Istructor: Dr. Aruasis Chakraborty Departmet of Civil Egieerig Idia Istitute of Techology Guwahati 1. Lecture 01: Basic Statistics

### In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

### Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria

### Module 4: Mathematical Induction

Module 4: Mathematical Iductio Theme 1: Priciple of Mathematical Iductio Mathematical iductio is used to prove statemets about atural umbers. As studets may remember, we ca write such a statemet as a predicate

### ORDERS OF GROWTH KEITH CONRAD

ORDERS OF GROWTH KEITH CONRAD Itroductio Gaiig a ituitive feel for the relative growth of fuctios is importat if you really wat to uderstad their behavior It also helps you better grasp topics i calculus

### AMS 2000 subject classification. Primary 62G08, 62G20; secondary 62G99

VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS Jia Huag 1, Joel L. Horowitz 2 ad Fegrog Wei 3 1 Uiversity of Iowa, 2 Northwester Uiversity ad 3 Uiversity of West Georgia Abstract We cosider a oparametric

### 7. Sample Covariance and Correlation

1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y

### THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

### Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

### 1 The Binomial Theorem: Another Approach

The Biomial Theorem: Aother Approach Pascal s Triagle I class (ad i our text we saw that, for iteger, the biomial theorem ca be stated (a + b = c a + c a b + c a b + + c ab + c b, where the coefficiets

### SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx

SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval

### Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may

### Output Analysis (2, Chapters 10 &11 Law)

B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

### Modified Line Search Method for Global Optimization

Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

### HIGH-DIMENSIONAL REGRESSION WITH NOISY AND MISSING DATA: PROVABLE GUARANTEES WITH NONCONVEXITY

The Aals of Statistics 2012, Vol. 40, No. 3, 1637 1664 DOI: 10.1214/12-AOS1018 Istitute of Mathematical Statistics, 2012 HIGH-DIMENSIONAL REGRESSION WITH NOISY AND MISSING DATA: PROVABLE GUARANTEES WITH

### Confidence Intervals for One Mean with Tolerance Probability

Chapter 421 Cofidece Itervals for Oe Mea with Tolerace Probability Itroductio This procedure calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) with

### Sequences and Series

CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

### CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8

CME 30: NUMERICAL LINEAR ALGEBRA FALL 005/06 LECTURE 8 GENE H GOLUB 1 Positive Defiite Matrices A matrix A is positive defiite if x Ax > 0 for all ozero x A positive defiite matrix has real ad positive

### 4.1 Sigma Notation and Riemann Sums

0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas

### Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

### The Field of Complex Numbers

The Field of Complex Numbers S. F. Ellermeyer The costructio of the system of complex umbers begis by appedig to the system of real umbers a umber which we call i with the property that i = 1. (Note that

### Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee

### Soving Recurrence Relations

Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

### 5 Boolean Decision Trees (February 11)

5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

### Section IV.5: Recurrence Relations from Algorithms

Sectio IV.5: Recurrece Relatios from Algorithms Give a recursive algorithm with iput size, we wish to fid a Θ (best big O) estimate for its ru time T() either by obtaiig a explicit formula for T() or by

### Definition. Definition. 7-2 Estimating a Population Proportion. Definition. Definition

7- stimatig a Populatio Proportio I this sectio we preset methods for usig a sample proportio to estimate the value of a populatio proportio. The sample proportio is the best poit estimate of the populatio

### B1. Fourier Analysis of Discrete Time Signals

B. Fourier Aalysis of Discrete Time Sigals Objectives Itroduce discrete time periodic sigals Defie the Discrete Fourier Series (DFS) expasio of periodic sigals Defie the Discrete Fourier Trasform (DFT)

### Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

### Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

### Math Discrete Math Combinatorics MULTIPLICATION PRINCIPLE:

Math 355 - Discrete Math 4.1-4.4 Combiatorics Notes MULTIPLICATION PRINCIPLE: If there m ways to do somethig ad ways to do aother thig the there are m ways to do both. I the laguage of set theory: Let

### 1 Computing the Standard Deviation of Sample Means

Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

### Alternatives To Pearson s and Spearman s Correlation Coefficients

Alteratives To Pearso s ad Spearma s Correlatio Coefficiets Floreti Smaradache Chair of Math & Scieces Departmet Uiversity of New Mexico Gallup, NM 8730, USA Abstract. This article presets several alteratives

### Incremental calculation of weighted mean and variance

Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

### Chapter Gaussian Elimination

Chapter 04.06 Gaussia Elimiatio After readig this chapter, you should be able to:. solve a set of simultaeous liear equatios usig Naïve Gauss elimiatio,. lear the pitfalls of the Naïve Gauss elimiatio

### MARTINGALES AND A BASIC APPLICATION

MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measure-theoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this

### {{1}, {2, 4}, {3}} {{1, 3, 4}, {2}} {{1}, {2}, {3, 4}} 5.4 Stirling Numbers

. Stirlig Numbers Whe coutig various types of fuctios from., we quicly discovered that eumeratig the umber of oto fuctios was a difficult problem. For a domai of five elemets ad a rage of four elemets,

### Class Meeting # 16: The Fourier Transform on R n

MATH 18.152 COUSE NOTES - CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,

### LECTURE 13: Cross-validation

LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

### Sequences II. Chapter 3. 3.1 Convergent Sequences

Chapter 3 Sequeces II 3. Coverget Sequeces Plot a graph of the sequece a ) = 2, 3 2, 4 3, 5 + 4,...,,... To what limit do you thik this sequece teds? What ca you say about the sequece a )? For ǫ = 0.,

### Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

### Lecture Notes CMSC 251

We have this messy summatio to solve though First observe that the value remais costat throughout the sum, ad so we ca pull it out frot Also ote that we ca write 3 i / i ad (3/) i T () = log 3 (log ) 1

### A probabilistic proof of a binomial identity

A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

### Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca

### Review for College Algebra Final Exam

Review for College Algebra Fial Exam (Please remember that half of the fial exam will cover chapters 1-4. This review sheet covers oly the ew material, from chapters 5 ad 7.) 5.1 Systems of equatios i

### Confidence Intervals for One Mean

Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

### Chapter 5: Inner Product Spaces

Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples

### Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

### Normal Distribution.

Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

### 8.5 Alternating infinite series

8.5 Alternating infinite series. In the previous two sections we considered only series with positive terms. In this section we consider series with both positive and negative terms which alternate: positive, negative,

### Linear Algebra II. 4 Determinants. Notes 4 1st November Definition of determinant

MTH6140 Linear Algebra II, Notes 4, 1st November 2010. 4 Determinants. The determinant is a function defined on square matrices; its value is a scalar. It has some very important properties: perhaps most important is

### if A ∈ S, then X \ A ∈ S, and if (A_n)_n is a sequence of sets in S, then ⋃_n A_n ∈ S,

Lecture 5: Borel Sets. Topologically, the Borel sets in a topological space are the σ-algebra generated by the open sets. One can build up the Borel sets from the open sets by iterating the operations of complementation

### 1 Introduction to reducing variance in Monte Carlo simulations

Copyright © 007 by Karl Sigman. 1 Introduction to reducing variance in Monte Carlo simulations. 1.1 Review of confidence intervals for estimating a mean. In statistics, we estimate an unknown mean µ = E(X) of a distribution by

### A Gentle Introduction to Algorithms: Part II

A Gentle Introduction to Algorithms: Part II. Contents of Part I: 1. Merge (to merge two sorted lists into a single sorted list). 2. Bubble Sort. 3. Merge Sort. 4. The Big-O, Big-Θ, Big-Ω notations: asymptotic bounds

### Week 3 Conditional probabilities, Bayes formula, Expected value of a random variable

Week 3: Conditional probabilities, Bayes formula, expected value of a random variable. We recall our discussion of 5 card poker hands. Example 13: a) What is the probability of event A that a 5

### Hypothesis Tests Applied to Means

The Sampling Distribution of the Mean. Hypothesis Tests Applied to Means. Recall that the sampling distribution of the mean is the distribution of sample means that would be obtained from a particular population (with

### represented by 4! different arrangements of boxes, divide by 4! to get ways

Problem Set #6 solutions. A juggler colors identical juggling balls red, white, and blue. (a) In how many ways can this be done if each color is used at least once? Let us preemptively color one ball in each color,

### Lecture 7: Borel Sets and Lebesgue Measure

EE50: Probability Foundations for Electrical Engineers, July-November 205. Lecture 7: Borel Sets and Lebesgue Measure. Lecturer: Dr. Krishna Jagannathan. Scribes: Ravi Kolla, Aseem Sharma, Vishakh Hegde. In this lecture,

### Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

This document was written and copyrighted by Paul Dawkins. Use of this document and its online version is governed by the Terms and Conditions of Use located at http://tutorial.math.lamar.edu/terms.asp. The online version

### Subject CT5 Contingencies Core Technical Syllabus

Subject CT5 Contingencies, Core Technical. Syllabus for the 2015 exams, 1 June 2014. Aim: The aim of the Contingencies subject is to provide a grounding in the mathematical techniques which can be used to model and value

### π d i (b i z) (n 1)π )... sin(θ + )

SOME TRIGONOMETRIC IDENTITIES RELATED TO EXACT COVERS. John Beebee, University of Alaska, Anchorage, January 18, 1990. Sherman K. Stein proves that if sin π = k sin π b where i the b i are integers, the are positive

### where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS. By Ken D. Duft, Extension Economist. In the March 98 issue of this publication we reviewed the procedure by which a capital investment project was assessed. The

### Maximum Likelihood Estimators.

Lecture 2: Maximum Likelihood Estimators. Matlab example. As a motivation, let us look at one Matlab example. Let us generate a random sample of size 00 from beta distribution Beta(5, 2). We will learn the definition

### CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3: DIGITAL CODING OF SIGNALS. Computers are often used to automate the recording of measurements. The transducers and signal conditioning circuits produce a voltage signal that is proportional to a quantity

### Gregory Carey, 1998 Linear Transformations & Composites - 1. Linear Transformations and Linear Composites

Gregory Carey, 1998. Linear Transformations & Composites - 1. Linear Transformations and Linear Composites. I. Linear Transformations of Variables. Means and Standard Deviations of Linear Transformations. A linear transformation

### x(x 1)(x 2)... (x k + 1) = [x] k n+m 1

1 Counting mappings. For every real x and positive integer k, let [x] k denote the falling factorial and x(x 1)(x 2)... (x k + 1) ( ) x = [x] k k k!, ( ) k = 1. 0 In the sequel, X = {x 1,..., x m }, Y = {y 1,...,

### 8.3 POLAR FORM AND DEMOIVRE'S THEOREM

SECTION 8.3 POLAR FORM AND DEMOIVRE'S THEOREM. Figure 8.6: Complex Number: a + bi; Rectangular Form: (a, b); Polar Form: (r, θ). At this point you can add,

### The second difference is the sequence of differences of the first difference sequence, 2

Difference Equations. In differential equations, you look for a function that satisfies an equation involving derivatives. In difference equations, instead of a function of a continuous variable (such as time), we look for

### MATH 361 Homework 9. Royden Royden Royden

MATH 361 Homework 9. Royden ..9 First, we show that for any subset E of the real numbers, E^c + y = (E + y)^c (translating the complement is equivalent to the complement of the translated set). Without loss of generality,

### when n = 1, 2, 3, 4, 5, 6, This list represents the amount of dollars you have after n days. Note: The use of is read as and so on.

Geometric Series. Before we define what is meant by a series, we need to introduce a related topic, that of sequences. Formally, a sequence is a function that computes an ordered list. Suppose that on day 1, you have

### FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix

FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix. We begin with a proposition which illustrates the usefulness of diagonalization. Recall that a square matrix A is diagonalizable if

### 13 Fast Fourier Transform (FFT)

13 Fast Fourier Transform (FFT). The fast Fourier transform (FFT) is an algorithm for the efficient implementation of the discrete Fourier transform. We begin our discussion once more with the continuous Fourier transform.

### AQA STATISTICS 1 REVISION NOTES

AQA STATISTICS 1 REVISION NOTES. AVERAGES AND MEASURES OF SPREAD. www.mathsbox.org.uk. Mode: the most common or most popular data value; the only average that can be used for qualitative data; not suitable if

### CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3: THE TIME VALUE OF MONEY. OVERVIEW: A dollar in the hand today is worth more than a dollar to be received in the future because, if you had it now, you could invest that dollar and earn interest. Of all

Physics 6A Winter 20. Theorems About Power Series. Consider a power series, f(x) = Σ a_n x^n, where the a_n are real coefficients and x is a real variable. There exists a real non-negative number R, called the radius

### ME 101 Measurement Demonstration (MD 1) DEFINITIONS Precision - A measure of agreement between repeated measurements (repeatability).

INTRODUCTION: This laboratory investigation involves making both length and mass measurements of a population, and then assessing statistical parameters to describe that population. For example, one may want to determine

### Numerical Solution of Equations

School of Mechanical, Aerospace and Civil Engineering. Numerical Solution of Equations. T J Craft, George Begg Building, C4. TPFE MSc CFD-. Reading: J Ferziger, M Peric, Computational Methods for Fluid Dynamics; HK Versteeg,

### Grade 7. Strand: Number Specific Learning Outcomes It is expected that students will:

Strand: Number. Specific Learning Outcomes: It is expected that students will: 7.N.1. Determine and explain why a number is divisible by 2, 3, 4, 5, 6, 8, 9, or 10, and why a number cannot be divided by 0. [C, R] [C]

### SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5: SUMMATION NOTATION + WORK WITH SEQUENCES. Read Section 1.5 (pages 5 9). Overview: In Section 1.5 we learn to work with summation notation and formulas. We will also introduce a brief overview of sequences,

### Key Ideas Section 8-1: Overview hypothesis testing Hypothesis Hypothesis Test Section 8-2: Basics of Hypothesis Testing Null Hypothesis

Chapter 8 Key Ideas: Hypothesis (Null and Alternative), Hypothesis Test, Test Statistic, P-value, Type I Error, Type II Error, Significance Level, Power. Section 8-1: Overview. Confidence Intervals (Chapter 7) are

### 9.8: THE POWER OF A TEST

9.8: THE POWER OF A TEST. In the initial discussion of statistical hypothesis testing, the two types of risks that are taken when decisions are made about population parameters based

Chapter 5: On A Conjecture Of Erdős. Proceedings NCUR VIII (1994), Vol. II, pp. 794-798. Jeffrey F. Gold, Department of Mathematics, Department of Physics, University of Utah. Don H. Tucker, Department of Mathematics, University

### 1.3 Binomial Coefficients

1.3 Binomial Coefficients. In this section, we will explore various properties of binomial coefficients. Pascal's Triangle: Table 1 contains the values of the binomial coefficients ( ) for 0 to

### 428 CHAPTER 12 MULTIPLE LINEAR REGRESSION

428 CHAPTER 12 MULTIPLE LINEAR REGRESSION. Table 1-8. Columns: Team, Wins, Pts, GF, GA, PPG, PPcT, SHG, PPGA, PKPcT, SHGA. Chicago: 47 104 338 68 86 7. 4 71 76.6 6. Minnesota: 40 96 31 90 91 6.4 17 67 80.7 0. Toronto: 8 68 3 330 79.3

### 0,1 is an accumulation

Section 5.4: Accumulation Points. Section 5.4: Bolzano-Weierstrass and Heine-Borel Theorems. Purpose of Section: To introduce the concept of an accumulation point of a set, and state and prove two major theorems of real

### Stat 104 Lecture 16. Statistics 104 Lecture 16 (IPS 6.1) Confidence intervals - the general concept

Statistics 104 Lecture 16 (IPS 6.1). Outline for today: Confidence intervals; Confidence intervals for a mean, µ (known σ); Confidence intervals for a proportion, p; Margin of error and sample size; Review of main topics for

### Engineering 323 Beautiful Homework Set 3 1 of 7 Kuszmar Problem 2.51

Engineering 323 Beautiful Homework Set 3, 1 of 7, Kuszmar. Problem 2.51: A large department store sells sport shirts in three sizes small, medium, and large, three patterns plaid, print, and stripe, and two sleeve lengths long

### Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consists of three major areas: Data Collection: sampling plans and experimental designs; Descriptive Statistics: numerical and graphical summaries

### Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Another important use of sampling distributions is to test hypotheses about population parameters, e.g. mean, proportion, regression coefficients, etc. For example, it is possible to stipulate

### 5: Introduction to Estimation

5: Introduction to Estimation. Contents: Acronyms and symbols; Statistical inference; Estimating µ with confidence; Sampling distribution of the mean; Confidence Interval for μ when σ is known beforehand; Sample

### Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is

Chapter 6: Additional Topics in Trigonometry. 6.5 Trigonometric Form of a Complex Number. What you should learn: Plot complex numbers in the complex plane and find absolute values