SUPPORT UNION RECOVERY IN HIGH-DIMENSIONAL MULTIVARIATE REGRESSION 1

Size: px
Start display at page:

Download "SUPPORT UNION RECOVERY IN HIGH-DIMENSIONAL MULTIVARIATE REGRESSION 1"

Transcription

1 The Aals of Statistics 2011, Vol. 39, No. 1, 1 47 DOI: /09-AOS776 Istitute of Mathematical Statistics, 2011 SUPPORT UNION RECOVERY IN HIGH-DIMENSIONAL MULTIVARIATE REGRESSION 1 BY GUILLAUME OBOZINSKI, MARTIN J. WAINWRIGHT AND MICHAEL I. JORDAN INRIA Willow Project-Team Laboratoire d Iformatique de l Ecole Normale Supérieure, Uiversity of Califoria, Berkeley ad Uiversity of Califoria, Berkeley I multivariate regressio, a K-dimesioal respose vector is regressed upo a commo set of p covariates, with a matrix B R p K of regressio coefficiets. We study the behavior of the multivariate group Lasso,iwhich block regularizatio based o the l 1 /l 2 orm is used for support uio recovery, or recovery of the set of s rows for which B is ozero. Uder highdimesioal scalig, we show that the multivariate group Lasso exhibits a threshold for the recovery of the exact row patter with high probability over the radom desig ad oise that is specified by the sample complexity parameter θ(,p,s):= /[2ψ(B ) log(p s)]. Here is the sample size, ad ψ(b ) is a sparsity-overlap fuctio measurig a combiatio of the sparsities ad overlaps of the K-regressio coefficiet vectors that costitute the model. We prove that the multivariate group Lasso succeeds for problem sequeces (,p,s)such that θ(,p,s) exceeds a critical level θ u, ad fails for sequeces such that θ(,p,s) lies below a critical level θ l. For the special case of the stadard Gaussia esemble, we show that θ l = θ u so that the characterizatio is sharp. The sparsity-overlap fuctio ψ(b ) reveals that, if the desig is ucorrelated o the active rows, l 1 /l 2 regularizatio for multivariate regressio ever harms performace relative to a ordiary Lasso approach ad ca yield substatial improvemets i sample complexity (up to a factor of K) whe the coefficiet vectors are suitably orthogoal. For more geeral desigs, it is possible for the ordiary Lasso to outperform the multivariate group Lasso. We complemet our aalysis with simulatios that demostrate the sharpess of our theoretical results, eve for relatively small problems. 1. Itroductio. The developmet of efficiet algorithms for estimatio of large-scale models has bee a major goal of statistical learig research i the last decade. There is ow a substatial body of work based o l 1 -regularizatio datig back to the semial work of Tibshirai (1996) ad Dooho ad collaborators [Che, Dooho ad Sauders (1998); Dooho ad Huo (2001)]. The bulk of Received Jue 2009; revised November Supported i part by NSF Grats DMS ad CCF to MJW, ad by NSF Grat ad DARPA IPTO Cotract FA to MIJ. AMS 2000 subject classificatios. Primary 62J07; secodary 62F07. Key words ad phrases. LASSO, block-orm, secod-order coe program, sparsity, variable selectio, multivariate regressio, high-dimesioal scalig, simultaeous Lasso, group Lasso. 1

2 2 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN this work has focused o the stadard problem of liear regressio, i which oe makes observatios of the form (1) y = Xβ + w, where y R is a real-valued vector of observatios, w R is a additive zeromea oise vector ad X R p is the desig matrix. A subset of the compoets of the ukow parameter vector β R p are assumed ozero; the goal is to idetify these coefficiets ad (possibly) estimate their values. This goal ca be formulated i terms of the solutio of the pealized optimizatio problem (2) arg mi β R p { 1 y Xβ 22 + λ β 0 }, where β 0 couts the umber of ozero compoets i β ad where λ > 0is a regularizatio parameter. Ufortuately, this optimizatio problem is computatioally itractable, a fact which has led various authors to cosider the covex relaxatio [Tibshirai (1996); Che, Dooho ad Sauders (1998)] (3) arg mi β R p { 1 y Xβ 22 + λ β 1 }, i which β 0 is replaced with the l 1 orm β 1. This relaxatio, ofte referred to as the Lasso [Tibshirai (1996)], is a quadratic program, ad ca be solved efficietly by various methods [e.g., Boyd ad Vadeberghe (2004); Osbore, Presell ad Turlach (2000); Efro et al. (2004)]. A variety of theoretical results are ow i place for the Lasso, both i the traditioal settig where the sample size teds to ifiity with the problem size p fixed [Kight ad Fu (2000)], as well as uder high-dimesioal scalig, i which p ad ted to ifiity simultaeously, thereby allowig p to be comparable to or eve larger tha [e.g., Meishause ad Bühlma (2006); Waiwright (2009b); Meishause ad Yu (2009); Bickel, Ritov ad Tsybakov (2009)]. I may applicatios, it is atural to impose sparsity costraits o the regressio vector β, ad a variety of such costraits have bee cosidered. For example, oe ca cosider a hard sparsity model i which β is assumed to cotai at most s ozero etries or a soft sparsity model i which β is assumed to belog to a l q ball with q<1. Aalyses also differ i terms of the loss fuctios that are cosidered. For the model or variable selectio problem, it is atural to cosider the zero oe loss associated with the problem of recoverig the ukow support set of β. Alteratively, oe ca view the Lasso as a shrikage estimator to be compared to traditioal least squares or ridge regressio; i this case, it is atural to study the l 2 -loss β β 2 betwee the estimate β ad the groud truth. I other settigs, the predictio error E[(Y X T β) 2 ] may be of primary iterest, ad oe tries to show risk cosistecy (amely, that the estimated model predicts as well as the best sparse model, whether or ot the true model is sparse).

3 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 3 A umber of alteratives to the Lasso have bee explored i recet years, ad i some cases stroger theoretical results have bee obtaied [Fa ad Li (2001); Frak ad Friedma (1993); Huag, Horowitz ad Ma (2008)]. However, the resultig optimizatio problems are geerally ocovex ad thus difficult to solve i practice. The Lasso remais a focus of attetio due to its combiatio of favorable statistical ad computatioal properties Block-structured regularizatio. While the assumptio of sparsity at the level of idividual coefficiets is oe way to give meaig to high-dimesioal (p ) regressio, there are other structural assumptios that are atural i regressio, ad which may provide additioal leverage. For istace, i a hierarchical regressio model, groups of regressio coefficiets may be required to be zero or ozero i a blockwise maer; for example, oe might wish to iclude a particular covariate ad all powers of that covariate as a group [Yua ad Li (2006); Zhao, Rocha ad Yu (2009)]. Aother example arises whe we cosider variable selectio i the settig of multivariate regressio: multiple regressios ca be related by a (partially) shared sparsity patter, such as whe there are a uderlyig set of covariates that are relevat across regressios [Oboziski, Taskar ad Jorda (2010); Argyriou, Evgeiou ad Potil (2006); Turlach, Veables ad Wright (2005); Zhag et al. (2008)]. Based o such motivatios, a recet lie of research [Bach, Lackriet ad Jorda (2004); Tropp (2006); Yua ad Li (2006); Zhao, Rocha ad Yu (2009); Oboziski, Taskar ad Jorda (2010); Ravikumar et al. (2009)] has studied the use of block-regularizatio schemes, iwhichthel 1 orm is composed with some other l q orm (q >1), thereby obtaiig the l 1 /l q orm defied as a sum of l q orms over groups of regressio coefficiets. The best kow examples of such block orms are the l 1 /l orm [Turlach, Veables ad Wright (2005); Zhag et al. (2008)] ad the l 1 /l 2 orm [Oboziski, Taskar ad Jorda (2010)]. I this paper, we ivestigate the use of l 1 /l 2 block-regularizatio i the cotext of high-dimesioal multivariate liear regressio, i which a collectio of K scalar outputs are regressed o the same desig matrix X R p.represetig the regressio coefficiets as a p K matrix B, the multivariate regressio model takes the form (4) Y = XB + W, where Y R K ad W R K are matrices of observatios ad zero-mea oise, respectively. I additio, we assume a hard-sparsity model for the regressio coefficiets i which colum k of the coefficiet matrix B has ozero etries o a subset (5) S k := { i {1,...,p} βik 0} of size s k := S k. I may applicatios it is atural to expect that the supports S k should overlap. I that case, istead of estimatig the support of each regressio

4 4 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN separately, it might be beeficial to first estimate the set of variables which are relevat to ay of the multivariate resposes ad to estimate oly subsequetly the idividual supports withi that set. Thus we focus o the problem of recoverig the uio of the supports, amely the set S := K k=1 S k, correspodig to the subset of idices i {1,...,p} that are ivolved i at least oe regressio. We cosider a rage of problems i which variables ca be relevat to all, some, oly oe or oe of the regressios, ad we ivestigate if ad how the overlap of the idividual supports ad the relatedess of idividual regressios beefit or hider estimatio of the support uio. The support uio problem ca be uderstood as the geeralizatio of the problem of variable selectio to the group settig. Rather tha selectig specific compoets of a coefficiet vector, we aim to select specific rows of a coefficiet matrix. We thus also refer to the support uio problem as the row selectio problem. We ote that recoverig S, although ot equivalet to recoverig each of the distict idividual supports S k, addresses the essetial difficulty i recoverig those supports. Ideed, as we show i Sectio 2.2, give a method that returs the row support S with S p (with high probability), it is straightforward to recover the idividual supports S k by ordiary least-squares ad thresholdig. If computatioal complexity were ot a cocer, the atural way to perform row selectio for B would be by solvig the optimizatio problem (6) arg mi B R p K { 1 2 Y XB 2 F + λ B l0 /l q where B = (β ik ) 1 i p,1 k K is a p K matrix, the quatity F deotes the Frobeius orm, 1 ad the orm B l0 /l q couts the umber of rows i B that have ozero l q orm. As before, the l 0 compoet of this regularizer yields a ocovex ad computatioally itractable problem, so that it is atural to cosider the relaxatio { } 1 (7) arg mi B R p K 2 Y XB 2 F + λ B l1 /l q, where B l1 /l q is the block l 1 /l q orm ( p K 1/q p (8) B l1 /l q := β q) ij = β i q. i=1 j=1 The relaxatio (7) is a atural geeralizatio of the Lasso; ideed, it specializes to the Lasso i the case K = 1. For later referece, we also ote that settig q = 1 leads to the use of the l 1 /l 1 block orm i the relaxatio (7). Sice this orm decouples across both the rows ad colums, this particular choice is equivalet to solvig K separate Lasso problems, oe for each colum of the p K i=1 }, 1 The Frobeius orm of a matrix A is give by A F := i,j A 2 ij.

5 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 5 regressio matrix B. A more iterestig choice is q = 2, which yields a block l 1 /l 2 orm that couples together the colums of B. Regularizatio with the l 1 /l 2 orm is commoly referred to as the group Lasso i the settig of uivariate regressio [Yua ad Li (2006)]. We thus refer to l 1 /l 2 regularizatio i the multivariate settig as the multivariate group Lasso. Note that the multivariate group Lasso ca be viewed as a special case of the group Lasso, i that it ivolves a specific groupig of regressio coefficiets, but the multivariate settig brigs ew statistical issues to the fore. As we discuss i Appedix B, the multivariate group Lasso ca be cast as a secod-order coe program (SOCP). This is a family of covex optimizatio problems that ca be solved efficietly with iterior poit methods [Boyd ad Vadeberghe (2004)] ad icludes quadratic programs as a particular case. Some recet work has addressed certai statistical aspects of block-regularizatio schemes. Meier, va de Geer ad Bühlma (2008) perform a aalysis of risk cosistecy with block-orm regularizatio. Bach (2008) provides a aalysis of block-wise support recovery for the kerelized group Lasso i the classical, fixed p settig. I the high-dimesioal settig, Ravikumar et al. (2009) studies the cosistecy of block-wise support recovery for the group Lasso for fixed desig matrices, ad their result is geeralized by Liu ad Zhag (2008) to block-wise support recovery i the settig of geeral l 1 /l q regularizatio, agai for fixed desig matrices. However, these aalyses do ot discrimiate betwee various values of q, yieldig the same qualitative results ad the same covergece rates for q = 1 as for q > 1. Our focus, which is motivated by the empirical observatio that the group Lasso ad the multivariate group Lasso ca outperform the ordiary Lasso [Bach (2008); Yua ad Li (2006); Zhao, Rocha ad Yu (2009); Oboziski, Taskar ad Jorda (2010)], is precisely the distictio betwee q = 1 ad q > 1 (specifically q = 2). We ote that i cocurret work Negahba ad Waiwright (2008) have studied a related problem of support recovery for the l 1 /l relaxatio. The distictio betwee q = 1adq = 2 is also sigificat from a optimizatio-theoretic poit of view. I particular, the SOCP relaxatios uderlyig the multivariate group Lasso (q = 2) are geerally tighter tha the quadratic programmig relaxatio uderlyig the Lasso (q = 1); however, the improved accuracy is geerally obtaied at a higher computatioal cost [Boyd ad Vadeberghe (2004)]. Thus we ca view our problem as a istace of the geeral questio of the relatioship of statistical efficiecy to computatioal efficiecy: does the qualitatively greater amout of computatioal effort ivolved i solvig the multivariate group Lasso always yield greater statistical efficiecy? More specifically, ca we give theoretical coditios uder which solvig the geeralized Lasso problem (7) has greater statistical efficiecy tha aive strategies based o the ordiary Lasso? Coversely, ca the multivariate group Lasso ever be worse tha the ordiary Lasso?

6 6 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN With this motivatio, this paper provides a detailed aalysis of model selectio cosistecy of the multivariate group Lasso (7) with l 1 /l 2 -regularizatio. Statistical efficiecy is defied i terms of the scalig of the sample size, as a fuctio of the problem size p ad of the sparsity structure of the regressio matrix B, required for cosistet row selectio. Our aalysis is high-dimesioal i ature, allowig both ad p to diverge, ad yieldig explicit error bouds as a fuctio of p. As detailed below, our aalysis provides affirmative aswers to both of the questios above. First, we demostrate that uder certai structural assumptios o the desig ad regressio matrix B, the multivariate group Lasso is always guarateed to outperform the ordiary Lasso, i that it correctly performs row selectio for sample sizes for which the Lasso fails with high probability. Secod, we also exhibit some problems (though arguably ot geeric) for which the multivariate group Lasso will be outperformed by the aive strategy of applyig the Lasso separately to each of the K colums, ad takig the uio of supports Our results. The mai cotributio of this paper is to show that uder certai techical coditios o the desig ad oise matrices, the model selectio performace of block-regularized l 1 /l 2 regressio (7) is govered by the sample complexity fuctio (9) θ l1 /l 2 (, p; B ) := 2ψ(B ) log(p s), where is the sample size, p is the ambiet dimesio, s = S is the umber of rows that are ozero ad ψ( ) is a sparsity-overlap fuctio. Our use of the term sample complexity for θ l1 /l 2 reflects the role it plays i our aalysis as the rate at which the sample size must grow i order to obtai cosistet row selectio as a fuctio of the problem parameters. More precisely, for scaligs (,p,s,b ) such that θ l1 /l 2 (, p; B ) exceeds a fixed critical threshold θ u (0, + ),weshow that the probability of correct row selectio by the l 1 /l 2 multivariate group Lasso coverges to oe, ad coversely, for scaligs such that θ l1 /l 2 (, p; B ) is below aother threshold θ l, we show that the multivariate group Lasso fails with high probability. Whereas the ratio (log p)/ is stadard for the high-dimesioal theory of l 1 - regularizatio, the fuctio ψ(b ) is a ovel ad iterestig quatity, oe which measures both the sparsity of the matrix B as well as the overlap betwee the differet regressios, represeted by the colums of B [see equatio (16) forthe precise defiitio of ψ(b )]. As a particular illustratio, cosider the special case of a uivariate regressio with K = 1, i which the covex program (7) reduces to the ordiary Lasso (3). I this case, if the desig matrix is draw from the stadard Gaussia esemble [i.e., X ij N(0, 1), i.i.d.], we show that the sparsityoverlap fuctio reduces to ψ(b ) = s, correspodig to the support size of the sigle coefficiet vector. We thus recover as a corollary a previously kow result [Waiwright (2009b)]: amely, the Lasso succeeds i performig exact support recovery oce the ratio /[s log(p s)] exceeds a certai critical threshold. At

7 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 7 the other extreme, for a geuiely multivariate problem with K>1ads ozero rows, agai for a stadard Gaussia desig, whe the regressio matrix is suitably orthoormal relative to the desig (see Sectio 2 for a precise defiitio), the sparsity-overlap fuctio is give by ψ(b ) = s/k. I this case, l 1 /l 2 blockregularizatio has sample complexity lower by a factor of K relative to the aive approach of solvig K separate Lasso problems. Of course, there is also a rage of behavior betwee these two extremes, i which the gai i sample complexity varies smoothly as a fuctio of the sparsity-overlap ψ(b ) i the iterval [ s K,s]. O the other had, we also show that for suitably correlated desigs, it is possible that the sample complexity ψ(b ) associated with l 1 /l 2 block-regularizatio is larger tha that of the ordiary Lasso (l 1 /l 1 ) approach. The remaider of the paper is orgaized as follows. I Sectio 2, we provide a precise statemet of our mai results (Theorems 1 ad 2), discuss some of their cosequeces ad illustrate the close agreemet betwee our theoretical results ad simulatios. Sectios 3 ad 4 are devoted to the proofs of Theorems 1 ad 2, respectively, with the argumets broke dow ito a series of steps. More techical results are deferred to the appedices. We coclude with a brief discussio i Sectio Notatio. We collect here some otatio used throughout the paper. For a (possibly radom) matrix M R p K, we defie the Frobeius orm M F := ( i,j m 2 ij )1/2, ad for parameters 1 a b,thel a /l b block orm is defied as follows: (10) M la /l b := { p ( K i=1 ) a/b } 1/a m ik b. k=1 These vector orms o matrices should be distiguished from the (a, b)-operator orms (11) M a,b := sup Mx a x b =1 (although some orms belog to both families; see Lemma 7 i Appedix E). Importat special cases of the latter iclude the spectral orm M 2,2 (also deoted M 2 ), ad the l -operator orm M, = max Kj=1 i=1,...,p M ij, deoted M for short. I additio to the usual Ladau otatio O ad o, we write a = (b ) for sequeces such that b a = o(1). We also use the otatio a = (b ) if both a = O(b ) ad b = O(a ) hold. 2. Mai results ad some cosequeces. The aalysis of this paper cosiders the multivariate group Lasso estimator, obtaied as a solutio to the SOCP { } 1 (12) arg mi 2 Y XB 2 F + λ B l1 /l 2 B R p K

8 8 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN for radom esembles of multivariate liear regressio problems, each of the form (4), where the oise matrix W R K is assumed to cosist of i.i.d. elemets W ij N(0,σ 2 ). We cosider radom desig matrices X with each row draw i a i.i.d. maer from a zero-mea Gaussia N(0, ),where 0isap p covariace matrix. Although the block-regularized problem (12) eed ot have a uique solutio i geeral, a cosequece of our aalysis is that i the regime of iterest, the solutio is uique, so that we may talk uambiguously about the estimated support Ŝ. The mai object of study i this paper is the probability P[Ŝ = S], where the probability is take both over the radom choice of oise matrix W ad radom desig matrix X. We study the behavior of this probability as elemets of the triplet (,p,s)ted to ifiity Notatio ad assumptios. More precisely, our mai result applies to sequeces of models idexed by (, p(), s()), a associated sequece of p p covariace matrices ad a sequece {B } of coefficiet matrices with row support (13) S := {i βi 0} of size S =s = s(). We use S c to deote its complemet (i.e., S c := {1,...,p}\S). We let (14) bmi := mi i S β i 2 correspod to the miimal l 2 row-orm of the coefficiet matrix B over its ozero rows. Give a observed pair (Y, X) from the model (4), the goal is to estimate the row support S of the matrix B. We impose the followig coditios o the covariace of the desig matrix: (A1) Bouded eigespectrum: There exist fixed costats C mi > 0 ad C max < + such that all eigevalues of the s s matrix SS are cotaied i the iterval [C mi,c max ]. (A2) Irrepresetable coditio: There exists a fixed parameter γ (0, 1] such that S c S( SS ) 1 1 γ. (A3) Self-icoherece: ( SS ) 1 D max for some D max < +. The lower boud ivolvig C mi i assumptio (A1) prevets excess depedece amog elemets of the desig matrix associated with the support S; coditios of this form are required for model selectio cosistecy or l 2 cosistecy of the Lasso. The upper boud ivolvig C max i assumptio (A1) is ot eeded for provig success but oly failure of the multivariate group Lasso. The irrepresetable coditio ad self-icoherece assumptios are also well kow from previous work o variable selectio cosistecy of the Lasso [Meishause ad Bühlma (2006); Tropp (2006); Zhao ad Yu (2006)]. Although such assumptios are ot eeded i aalyzig l 2 or risk cosistecy, they are kow to be

9 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 9 ecessary for variable selectio cosistecy of the Lasso. Ideed, i the absece of such coditios, it is always possible to make the Lasso fail, eve with a arbitrarily large sample size. [See, however, Meishause ad Yu (2009) for methods that weake the irrepresetable coditio.] Note that these assumptios are trivially satisfied by the stadard Gaussia esemble = I p p, with C mi = C max = 1, D max = 1adγ = 1. More geerally, it ca be show that various matrix classes (e.g., Toeplitz matrices, tree-structured covariace matrices, bouded off-diagoal matrices) satisfy these coditios [Meishause ad Bühlma (2006); Zhao ad Yu (2006); Waiwright (2009b)]. We require a few pieces of otatio before statig the mai results. For a arbitrary matrix B S R s K with ith row β i R 1 K, we defie the matrix ζ(b S ) R s K with ith row ζ(β i ) := β i (15), β i 2 whe β i 0, ad we set ζ(β i ) = 0 otherwise. With this otatio, the sparsityoverlap fuctio is give by (16) ψ(b):= ζ(b S ) T ( SS ) 1 ζ(b S ) 2, where 2 deotes the spectral orm. We use this sparsity-overlap fuctio to defie the sample complexity parameter, which captures the effective sample size (17) θ l1 /l 2 (, p; B ) := 2ψ(B ) log(p s). I the followig two theorems, we cosider a radom desig matrix X draw with i.i.d. N(0, )row vectors, where satisfies assumptios (A1) (A3), ad a observatio matrix Y specified by model (4). I order to capture depedece iduced by the desig covariace matrix, for ay positive semidefiite matrix Q 0, we defie the quatities (18a) ρ l (Q) := 1 2 mi[q ii + Q jj 2Q ij ] i j ad (18b) ρ u (Q) := max Q ii. i We ote that by defiitio, we have ρ l (Q) ρ u (Q) wheever Q 0. Our bouds are stated i terms of these quatities as applied to the coditioal covariace matrix S c S c S := S c S c S c S( SS ) 1 SS c. Our first result is a achievability result, showig that the multivariate group Lasso succeeds i recoverig the row support ad yields cosistecy i l /l 2 orm. We state this result for sequeces of regularizatio parameters λ =

10 10 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN f(p)log p,wheref(p) + is ay fuctio such that λ 0. We also p + assume that is sufficietly large such that s/ < 1/2. THEOREM 1. Suppose that we solve the multivariate group Lasso with specified regularizatio parameter sequece λ for a sequece of problems idexed by (,p,b, )that satisfy assumptios (A1) (A3), ad such that, for some ν>0, θ l1 /l 2 (, p; B ) = 2ψ(B ) log(p s) >(1 + ν)ρ u( S c S c S) (19) γ 2. The for uiversal costats c i > 0(i.e., idepedet of, p, s, B, ), with probability greater tha 1 c 2 exp( c 3 K log s) c 0 exp( c 1 log(p s)), the followig statemets hold: (a) The multivariate group Lasso has a uique solutio B with row support S(B) that is cotaied withi the true row support S(B ), ad moreover satisfies the (20) boud 8K log s B B l /l 2 C mi + λ D max + 6λ s 2. C mi } {{ } ρ(,s,λ ) ρ (b) If bmi = o(1), the estimate of the row support, S( B) := {i {1,...,p} β i 0}, specified by this uique solutio is equal to the row support set S(B ) of thetruemodel. Note that the theorem is aturally separated ito two distict but related claims. Part (a) guaratees that the method produces o false iclusios ad, moreover, bouds the maximum l 2 -error across the rows. Part (b) requires some additioal ρ b mi assumptios amely, the restrictio = o(1) esurig that the error ρ is of lower order tha the miimum l 2 -orm bmi across rows but also guaratees the stroger result of o false exclusios as well, so that the method recovers the row support exactly. Note that the probability of these evets coverges to oe oly if both (p s) ad s ted to ifiity, which might seem couter-ituitive iitially (sice problems with larger support sets s might seem harder). However, as we discuss at the ed of Sectio 3.3, this depedece ca be removed at the expese of a slightly slower covergece rate for B B l /l 2. Our secod mai theorem is a egative result, showig that the multivariate group Lasso fails with high probability if the rescaled sample size θ l1 /l 2 is below a critical threshold. I order to clarify the phrasig of this result, ote that Theorem 1 ca be summarized succictly as guarateeig that there is a uique solutio B with the correct row support that satisfies B B l /l 2 = o(bmi ). The followig result shows that such a guaratee caot hold if the sample size scales too slowly relative to p, s ad the other problem parameters.

11 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 11 THEOREM 2. Cosider problem sequeces idexed by (,p,b, )that satisfy assumptios (A1) (A2), ad with miimum value bmi such that b mi 2 = ( log p ), ad suppose that we solve the multivariate group Lasso with ay positive regularizatio sequece {λ }. The there exist ν>0 ad uiversal costats c i > 0 such that if the sample size is lower bouded as θ l1 /l 2 (, p; B ) = 2ψ(B ) log(p s) <(1 ν)ρ l( S c S c S) (2 γ) 2, the with probability greater tha 1 c 0 exp{ c 1 mi( K s, θ l 2 log(p s))}, there is o solutio B of the multivariate group Lasso that has the correct row support ad satisfies the boud B B l /l 2 = o(bmi ). The proof of this claim is provided i Sectio 4. We ote that iformatiotheoretic methods [Waiwright (2009a)] imply that o method (icludig the multivariate group Lasso) ca perform exact support recovery uless /s +, so that the probability give i Theorem 2 coverges to oe uder the give coditios. Note that Theorems 1 ad 2 i cojuctio imply that the rescaled sample size θ l1 /l 2 (, p; B ) = 2ψ(B ) log(p s) captures the behavior of the multivariate group Lasso for support recovery ad estimatio i block l /l 2 orm. For the special case of radom desig matrices draw from the stadard Gaussia esemble (i.e., = I p p ), the give scaligs are sharp: COROLLARY 1. For the stadard Gaussia esemble, the multivariate group Lasso udergoes a sharp threshold at the level θ l1 /l 2 (,p,b ) = 1. More specifically, for ay δ>0: (a) For problem sequeces (,p,b ) such that θ l1 /l 2 (,p,b )>1 + δ, the multivariate group Lasso succeeds with high probability. (b) Coversely, for sequeces such that θ l1 /l 2 (,p,b )<1 δ, the multivariate group Lasso fails with high probability. PROOF. I the special case = I p p, it is straightforward to verify that all the assumptios are satisfied: i particular, we have C mi = C max = 1, D max = 1 ad γ = 1. Moreover, a short calculatio shows that ρ u (I) = ρ l (I) = 1. Cosequetly, the thresholds give i the sufficiet coditio (19) ad the ecessary coditio (21) are both equal to oe Efficiet estimatio of idividual supports. The precedig results address exact recovery of the support uio of the regressio matrix B. As demostrated by the followig procedure ad the associated corollary of Theorem 1, oce the

12 12 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN row support has bee recovered, it is straightforward to recover the idividual supports of each colum of the regressio matrix via the additioal steps of performig ordiary least squares ad thresholdig. Efficiet multi-stage estimatio of idividual supports: (1) estimate the support uio with Ŝ, the support uio of the solutio B of the multivariate group Lasso; (2) compute the restricted ordiary least squares (ROLS) estimate, (21) BŜ := arg mi Y XŜBŜ F BŜ for the restricted multivariate problem; (3) compute the matrix T( BŜ) obtaied by thresholdig BŜ at the level 2log(K Ŝ ) 2 C mi, ad estimate the idividual supports by the ozero etries of T( BŜ). The followig result, which is proved i Appedix A, shows that uder the assumptios of Theorem 1, the additioal post-processig applied to the support uio estimate will recover the idividual supports with high probability: COROLLARY 2. Uder assumptios (A1) (A3) ad the additioal assumptios of Theorem 1, if for all idividual ozero coefficiets βik,i S,1 k K, we have βik 2 4log(Ks) C mi, the with probability greater tha 1 (exp( c 0 K log s)) the above two-step estimatio procedure recovers the idividual supports of B Some cosequeces of Theorems 1 ad 2. We begi by makig some simple observatios about the sparsity-overlap fuctio. LEMMA 1. (a) For ay desig satisfyig assumptio (A1), the sparsityoverlap ψ(b ) obeys the bouds s (22) C max K ψ(b ) s ; C mi (b) if SS = I s s, ad if the colums (Z (k) ) of the matrix Z = ζ(b ) are orthogoal, the the sparsity-overlap fuctio is ψ(b ) = max k=1,...,k Z (k) 2 2. The proof of this claim is provided i Appedix C. Based o this lemma, we ow study some special cases of Theorems 1 ad 2. The simplest special case is the uivariate regressio problem (K = 1), i which case the quatity ζ(β ) [as defied i equatio (15)] simply yields a s-dimesioal sig vector with elemets zi = sig(β i ). [Recall that the sig fuctio is defied as sig(0) = 0, sig(x) = 1

13 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 13 if x>0 ad sig(x) = 1 if x<0.] I this case, the sparsity-overlap fuctio is give by ψ(β ) = z T ( SS ) 1 z, ad as a cosequece of Lemma 1(a), we have ψ(β ) = (s). Cosequetly, a simple corollary of Theorems 1 ad 2 is that the Lasso succeeds oce the ratio /(2s log(p s)) exceeds a certai critical threshold, determied by the eigespectrum ad icoherece properties of, ad it fails below a certai threshold. This result matches the ecessary ad sufficiet coditios established i previous work o the Lasso [Waiwright (2009b)]. We ca also use Lemma 1 ad Theorems 1 ad 2 to compare the performace of the multivariate group Lasso to the followig (arguably aive) strategy for row selectio usig the ordiary Lasso. Row selectio usig ordiary Lasso: (1) Apply the ordiary Lasso separately to each of the K uivariate regressio problems specified by the colums of B, thereby obtaiig estimates β (k) for k = 1,...,K. (2) For k = 1,...,K, estimate the support of idividual colums via Ŝk := {i β (k) i 0}. (3) Estimate the row support by takig the uio: Ŝ = K Ŝk. k=1 To uderstad the coditios goverig the success/failure of this procedure, ote that it succeeds if ad oly if for each ozero row i S = K k=1 S k,thevariable β (k) i is ozero for at least oe k, adforallj S c ={1,...,p}\S, thevariable β (k) j = 0forallk = 1,...,K. From our uderstadig of the uivariate case, we kow that for θ u = ρ u( S c S c S ), the coditio γ 2 2θ u max ψ( β (k) ) S log(p sk ) 2θ u max ψ( β (k) ) (23) S log(p s) k=1,...,k k=1,...,k is sufficiet to esure that the ordiary Lasso succeeds i row selectio. Coversely, for θ l = ρ l( S c S c S ), if the sample size is upper bouded as (2 γ) 2 <2θ l max ψ( β (k) ) S log(p s), k=1,...,k the there will exist some j S c such for at least oe k {1,...,K}, there holds β (k) j 0 with high probability, implyig failure of the ordiary Lasso. A atural questio is whether the multivariate group Lasso, by takig ito accout the coupligs across colums, always outperforms (or at least matches) the aive strategy. The followig result, prove i Appedix D, shows that if the desig is ucorrelated o the support, the ideed this is the case. COROLLARY 3 (Multivariate group Lasso versus ordiary Lasso). Assume SS = I s s. The for ay multivariate regressio problem, row selectio usig

14 14 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN the ordiary Lasso strategy requires, with high probability, at least as may samples as the l 1 /l 2 multivariate group Lasso. I particular, the relative efficiecy of multivariate group Lasso versus ordiary Lasso is give by the ratio (24) max k=1,...,k ψ(β (k) S ) log(p s k ) ψ(b S ) log(p s) 1. We cosider the special case of idetical regressios, for which a result ca be stated for ay covariace desig ad the case of orthoormal regressios which illustrates Corollary 3. EXAMPLE 1 (Idetical regressios). Suppose that B := β 1 T K that is, B cosists of K copies of the same coefficiet vector β R p, with support of cardiality S =s.wethehave[ζ(b )] ij = sig(βi )/ K, from which we see that ψ(b ) = z T ( SS ) 1 z, with z beig a s-dimesioal sig vector with elemets zi = sig(β i ). Cosequetly, we have the equality ψ(b ) = ψ(β ),sothat uder our aalysis there is o beefit i usig the multivariate group Lasso relative to the strategy of solvig separate Lasso problems ad costructig the uio of idividually estimated supports. This behavior may seem couter-ituitive, sice uder the model (4) we essetially have K observatios of the coefficiet vector β with the same desig matrix but K idepedet oise realizatios, which could help to reduce the effective oise variace from σ 2 to σ 2 /K if the fact that the regressios are idetical is kow. It must be bore i mid, however, that i our high-dimesioal aalysis the oise variace does ot grow as the dimesioality grows, ad thus asymptotically the oise is domiated by the iterferece betwee the covariates, which grows as (p s). It is thus this high-dimesioal iterferece that domiates the rates give i Theorems 1 ad 2. I cotrast to this pessimistic example, we ow tur to the most optimistic extreme: EXAMPLE 2 ( Orthoormal regressios). Suppose that SS = I s s ad (for s>k) suppose that B is costructed such that the colums of the s K matrix ζ(b ) are orthogoal ad with equal orm (which implies their orm equals sk ). Uder these coditios, we claim that the sample complexity of multivariate group Lasso is smaller tha that of the ordiary Lasso by a factor of 1/K. Ideed, usig Lemma 1(b), we observe that Kψ(B ) = K Z (1) 2 = K Z (k) 2 = tr(z T Z ) = tr(z Z T ) = s, k=1 because Z Z T R s s is the Gram matrix of s uit vectors i R k ad its diagoal elemets are therefore all equal to 1. Cosequetly, the multivariate group Lasso

15 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 15 recovers the row support with high probability for sequeces such that 2(s/K) log(p s) > 1, which allows for sample sizes K times smaller tha the ordiary Lasso approach. Corollary 3 ad the subsequet examples address the case of ucorrelated desig ( SS = I s s ) o the row support S, for which the multivariate group Lasso is ever worse tha the ordiary Lasso i performig row selectio. The followig example shows that if the supports are disjoit, the ordiary Lasso has the same sample complexity as the multivariate group Lasso for ucorrelated desig SS = I s s, but ca be better tha the multivariate group Lasso for desigs SS with suitable correlatios: COROLLARY 4 (Disjoit supports). Suppose that the support sets S k of idividual regressio problems are disjoit. The for ay desig covariace SS, we have max ψ( β (k) ) (a) S ψ(bs ) (b) K ψ ( β (k) ) (25) S. 1 k K PROOF. so that Z (k) S First ote that, sice all supports are disjoit, Z (k) i = sig(βik ), = ζ(β (k) S ). Iequality (b) is the immediate because we have ). To establish iequality (a), we ote that Z S T 1 SS Z S 2 tr(z S T 1 SS Z S ψ(b ) = max x R K : x 1 k=1 x T Z S T 1 SS Z S x max 1 k K et k Z S T 1 SS Z S e k max 1 k K Z(k) T S 1 SS Z(k) S. A caveat i iterpretig Corollary 4, ad more geerally i comparig the performace of the ordiary Lasso ad the multivariate group Lasso, is that for a geeral covariace matrix SS, assumptios (A2) ad (A3) required by Theorem 1 do ot iduce the same costraits o the covariace matrix whe applied to the multivariate problem as whe applied to the idividual regressios. Ideed, i the latter case, (A2) would require max k S c k S k 1 S k S k 1 γ ad (A3) would require max k 1 S k S k D max. Thus (A3) is a stroger assumptio i the multivariate case but (A2) is ot. We illustrate Corollary 4 with a example. EXAMPLE 3 (Disjoit support with ucorrelated desig). Suppose that SS = I s s, ad the supports are disjoit. I this case, we claim that the sample complexity of the l 1 /l 2 multivariate group Lasso is the same as the ordiary Lasso.

16 16 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN If the idividual regressios have disjoit support, the ZS = ζ(b S ) has oly a sigle ozero etry per row ad therefore the colums of Z are orthogoal. Moreover, Zik = sig(β(k) i ). By Lemma 1(b), the sparsity-overlap fuctio ψ(b ) is equal to the largest squared colum orm. But Z (k) 2 = s i=1 sig(β (k) i ) 2 = s k. Thus, the sample complexity of the multivariate group Lasso is the same as the ordiary Lasso i this case. 2 Fially, we cosider a example that illustrates the effect of correlated desigs: EXAMPLE 4 (Effects of correlated desigs). To illustrate the behavior of the sparsity-overlap fuctio i the presece of correlatios i the desig, we cosider the simple case of two regressios with support of size 2. For parameters ϑ 1 ad ϑ 2 [0,π] ad μ ( 1, +1), cosider regressio matrices B such that B = ζ(bs ) ad [ ] [ ] ζ(bs ) = cos(ϑ1 ) si(ϑ 1 ) ad 1 1 μ (26) SS cos(ϑ 2 ) si(ϑ 2 ) =. μ 1 Settig M = ζ(bs )T SS 1 ζ(b S ), a simple calculatio shows that tr(m ) = 2 ( 1 + μ cos(ϑ 1 ϑ 2 ) ) ad det(m ) = (1 μ 2 ) si(ϑ 1 ϑ 2 ) 2, so that the eigevalues of M are μ + = (1 + μ) ( 1 + cos(ϑ 1 ϑ 2 ) ) ad μ = (1 μ) ( 1 cos(ϑ 1 ϑ 2 ) ), ad ψ(b ) = max(μ +,μ ). O the other had, with z 1 = ζ ( β (1) ) ( ) sig(cos(ϑ1 )) = sig(cos(ϑ 2 )) ad z 2 = ζ ( β (2) ) = we have ( ) sig(si(ϑ1 )) sig(si(ϑ 2 )) ψ ( β (1) ) = z 1 T 1 SS z 1 = 1 {cos(ϑ1 ) 0} + 1 {cos(ϑ2 ) 0} + 2μ sig(cos(ϑ 1 ) cos(ϑ 2 )), ψ ( β (2) ) = z 2 T 1 SS z 2 = 1 {si(ϑ1 ) 0} + 1 {si(ϑ2 ) 0} + 2μ sig(si(ϑ 1 ) si(ϑ 2 )). Figure 1 provides a graphical compariso of these sample complexity fuctios. The fuctio ψ(b ) = max(ψ(β (1) ), ψ(β (2) )) is discotiuous o S = π 2 Z R R π 2 Z, ad, as a cosequece, so is its differece with ψ(b ). Note that, for fixed ϑ 1 or fixed ϑ 2, some of these discotiuities are removable discotiuities of the iduced fuctio o the other variable, ad these discotiuities therefore 2 I makig this assertio, we are igorig ay differece betwee log(p s k ) ad log(p s), which is valid, for istace, i the regime of subliear sparsity, whe s k /p 0ads/p 0.

17 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 17 FIG. 1. Compariso of sparsity-overlap fuctios for l 1 /l 2 ad the Lasso. For the pair 1 2π (ϑ 1,ϑ 2 ), we represet i each row of plots, correspodig respectively to μ = 0(top), 0.9 (middle) ad 0.9 (bottom), from left to right, the quatities: ψ(b ) (left), max(ψ(β (1) ), ψ(β (2) )) (ceter) ad max(0,ψ(b ) max(ψ(β (1) ), ψ(β (2) ))) (right). The latter idicates whe the iequality ψ(b ) max(ψ(β (1) ), ψ(β (2) )) does ot hold ad by how much it is violated. create eedles, slits or flaps i the graph of the fuctio ψ. Deote by R + ad R the sets R + ={(ϑ 1,ϑ 2 ) mi[cos(ϑ 1 ) cos(ϑ 2 ), si(ϑ 1 ) si(ϑ 2 )] > 0}, R ={(ϑ 1,ϑ 2 ) max[cos(ϑ 1 ) cos(ϑ 2 ), si(ϑ 1 ) si(ϑ 2 )] < 0} o which ψ(b ) reaches its miimum value whe μ 0.5 adμ 0.5, respectively (see middle ad bottom ceter plots i Figure 4). For μ = 0, the top ceter graph illustrates that ψ(b ) is equal to 2 except for the cases of matrices BS with disjoit support, correspodig to the discrete set D ={(k π 2,(k± 1) π 2 ), k Z} for which it equals 1. The top rightmost graph illustrates that, as show i Corollary 3, the iequality always holds for a ucorrelated desig. For μ>0, the iequality ψ(b ) max(ψ(β (1) ), ψ(β (2) )) is violated oly o a subset of S R ;ad for μ<0, the iequality is symmetrically violated o a subset of S R + (see Figure 4).

18 18 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN 2.4. Illustrative simulatios. I this sectio, we preset the results of simulatios that illustrate the sharpess of Theorems 1 ad 2, ad furthermore demostrate how quickly the predicted behavior is observed as elemets of the triple (, p, s) grow i differet regimes. We explore the case of two regressios (i.e., K = 2) which share a idetical support set S with cardiality S =s i Sectio ad cosider a slightly more geeral case i Sectio Threshold effect i the stadard Gaussia case. The first set of experimets was desiged to reveal the threshold effect predicted by Theorems 1 ad 2. The desig matrix X is sampled from the stadard Gaussia esemble, with i.i.d. etries X ij N(0, 1). We cosider two types of sparsity: logarithmic sparsity, where s = α log(p), for α = 2/ log(2), ad liear sparsity, where s = αp,forα = 1/8 for various ambiet model dimesios p {16, 32, 64, 256, 512, 1024}. Fora give triplet (,p,s), we solve the block-regularized problem (12) with the regularizatio parameter λ = log(p s)(log s)/. For each fixed (p, s) pair, we measure the sample complexity i terms of a parameter θ, i particular lettig = θslog(p s) for θ [0.25, 1.5].WeletthematrixB R p 2 of regressio coefficiets have etries βij i { 1/ 2, 1/ 2}, choosig the parameters to vary the agle betwee the two colums, thereby obtaiig various desired values of ψ(b ). Sice = I p p for the stadard Gaussia esemble, the sparsity-overlap fuctio ψ(b ) is simply the maximal eigevalue of the Gram matrix ζ(bs )T ζ(bs ). Sice βij =1/ 2 by costructio, we are guarateed that BS = ζ(b S ), that the miimum value bmi = 1, ad, moreover, that the colums of ζ(b S ) have the same Euclidea orm. To costruct parameter matrices B that satisfy βij =1/ 2, we choose both p ad the sparsity scaligs so that the obtaied values for s are multiples of four. We the costruct the colums Z (1) ad Z (2) of the matrix B = ζ(b ) from copies of vectors of legth four. Deotig by the usual matrix tesor product, we cosider the followig 4-vectors: Idetical regressios: We set Z (1) = Z (2) = s,sothatthesparsity-overlap fuctio is ψ(b ) = s. Orthoormal regressios: Here B is costructed with Z (1) Z (2),sothat ψ(b ) = 2 s, the most favorable situatio. I order to achieve this orthoormality, we set Z (1) = s ad Z (2) = s/2 (1, 1) T. Itermediate agles: I this itermediate case, the colums Z (1) ad Z (2) are at a 60 agle, which leads to ψ(b ) = 3 4 s. Specifically, we set Z(1) = s ad Z (2) = s/4 (1, 1, 1, 1) T. Figure 2 shows plots for liear sparsity (left colum) ad logarithmic sparsity (right colum) i all three cases ad where the multivariate group Lasso was used

19 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 19 FIG. 2. Plots of support uio recovery probability, P[Ŝ = S], versus the cotrol parameter θ = /[2s log(p s)] for two differet types of sparsity: liear sparsity i the left colum (s = p/8) ad logarithmic sparsity i the right colum (s = 2log 2 (p)). The first three rows are based o usig the multivariate group Lasso to estimate the support for the three cases of idetical regressio, itermediate agles ad orthoormal regressios. The fourth row presets results for the Lasso i the case of idetical regressios.

20 20 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN FIG. 3. Plots of support recovery probability, P[Ŝ = S], versus the cotrol parameter θ = /[2s log(p s)] for two differet type of sparsity: logarithmic sparsity o top (s = O(log(p))) ad liear sparsity o bottom (s = αp), ad for icreasig values of p from left to right. The oise level is set at σ = 0.1. Each graph shows four curves (black, red, gree, blue) correspodig to the case of idepedet l 1 regularizatio, ad, for l 1 /l 2 regularizatio, the cases of idetical regressio, itermediate agles ad orthoormal regressios. Note how curves correspodig to the same case across differet problem sizes p all coicide, as predicted by Theorems 1 ad 2. Moreover, cosistet with the theory, the curves for the idetical regressio group reach P[Ŝ = S] 0.50 at θ 1, whereas the orthoormal regressio group reaches 50% success substatially earlier. (top three rows), as well as the referece Lasso case for the case of idetical regressios (bottom row). Each pael plots the success probability, P[Ŝ = S],versus the rescaled sample size θ = /[2s log(p s)]. Uder this rescalig, Theorems 1 ad 2 predict that the curves should alig, ad that the success probability should trasitio to 1 oce θ exceeds a critical threshold (depedet o the type of esemble). Note that for suitably large problem sizes (p 128), the curves do alig i the predicted way, showig step-fuctio behavior. Figure 3 plots data from the same simulatios i a differet format. Here the top row correspods to logarithmic sparsity, ad the bottom row to liear sparsity; each pael shows the four differet choices for B, with the problem size p icreasig from left to right. Note how i each pael the locatio of the trasitio of P[Ŝ = S] to oe shifts from right to

High-dimensional support union recovery in multivariate regression

High-dimensional support union recovery in multivariate regression High-dimesioal support uio recovery i multivariate regressio Guillaume Oboziski Departmet of Statistics UC Berkeley gobo@stat.berkeley.edu Marti J. Waiwright Departmet of Statistics Dept. of Electrical

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

3. Covariance and Correlation

3. Covariance and Correlation Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

AMS 2000 subject classification. Primary 62G08, 62G20; secondary 62G99

AMS 2000 subject classification. Primary 62G08, 62G20; secondary 62G99 VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS Jia Huag 1, Joel L. Horowitz 2 ad Fegrog Wei 3 1 Uiversity of Iowa, 2 Northwester Uiversity ad 3 Uiversity of West Georgia Abstract We cosider a oparametric

More information

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria

More information

7. Sample Covariance and Correlation

7. Sample Covariance and Correlation 1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13 EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may

More information

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009) 18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

More information

Modified Line Search Method for Global Optimization

Modified Line Search Method for Global Optimization Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

More information

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval

More information

HIGH-DIMENSIONAL REGRESSION WITH NOISY AND MISSING DATA: PROVABLE GUARANTEES WITH NONCONVEXITY

HIGH-DIMENSIONAL REGRESSION WITH NOISY AND MISSING DATA: PROVABLE GUARANTEES WITH NONCONVEXITY The Aals of Statistics 2012, Vol. 40, No. 3, 1637 1664 DOI: 10.1214/12-AOS1018 Istitute of Mathematical Statistics, 2012 HIGH-DIMENSIONAL REGRESSION WITH NOISY AND MISSING DATA: PROVABLE GUARANTEES WITH

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8 CME 30: NUMERICAL LINEAR ALGEBRA FALL 005/06 LECTURE 8 GENE H GOLUB 1 Positive Defiite Matrices A matrix A is positive defiite if x Ax > 0 for all ozero x A positive defiite matrix has real ad positive

More information

4.1 Sigma Notation and Riemann Sums

4.1 Sigma Notation and Riemann Sums 0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas

More information

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

5 Boolean Decision Trees (February 11)

5 Boolean Decision Trees (February 11) 5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

Class Meeting # 16: The Fourier Transform on R n

Class Meeting # 16: The Fourier Transform on R n MATH 18.152 COUSE NOTES - CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,

More information

MARTINGALES AND A BASIC APPLICATION

MARTINGALES AND A BASIC APPLICATION MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measure-theoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

Sequences II. Chapter 3. 3.1 Convergent Sequences

Sequences II. Chapter 3. 3.1 Convergent Sequences Chapter 3 Sequeces II 3. Coverget Sequeces Plot a graph of the sequece a ) = 2, 3 2, 4 3, 5 + 4,...,,... To what limit do you thik this sequece teds? What ca you say about the sequece a )? For ǫ = 0.,

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study. Normal and t Distributions. Density Plot. Normal Distributions Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca

More information

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

Chapter 5: Inner Product Spaces

Chapter 5: Inner Product Spaces Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples

More information

Linear Algebra II. 4 Determinants. Notes 4 1st November Definition of determinant

Linear Algebra II. 4 Determinants. Notes 4 1st November Definition of determinant MTH6140 Liear Algebra II Notes 4 1st November 2010 4 Determiats The determiat is a fuctio defied o square matrices; its value is a scalar. It has some very importat properties: perhaps most importat is

More information

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5

More information

Subject CT5 Contingencies Core Technical Syllabus

Subject CT5 Contingencies Core Technical Syllabus Subject CT5 Cotigecies Core Techical Syllabus for the 2015 exams 1 Jue 2014 Aim The aim of the Cotigecies subject is to provide a groudig i the mathematical techiques which ca be used to model ad value

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

Gregory Carey, 1998 Linear Transformations & Composites - 1. Linear Transformations and Linear Composites

Gregory Carey, 1998 Linear Transformations & Composites - 1. Linear Transformations and Linear Composites Gregory Carey, 1998 Liear Trasformatios & Composites - 1 Liear Trasformatios ad Liear Composites I Liear Trasformatios of Variables Meas ad Stadard Deviatios of Liear Trasformatios A liear trasformatio

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

More information

FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix

FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. Powers of a matrix We begi with a propositio which illustrates the usefuless of the diagoalizatio. Recall that a square matrix A is diogaalizable if

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

The second difference is the sequence of differences of the first difference sequence, 2

The second difference is the sequence of differences of the first difference sequence, 2 Differece Equatios I differetial equatios, you look for a fuctio that satisfies ad equatio ivolvig derivatives. I differece equatios, istead of a fuctio of a cotiuous variable (such as time), we look for

More information

Theorems About Power Series

Theorems About Power Series Physics 6A Witer 20 Theorems About Power Series Cosider a power series, f(x) = a x, () where the a are real coefficiets ad x is a real variable. There exists a real o-egative umber R, called the radius

More information

Chapter 5 O A Cojecture Of Erdíos Proceedigs NCUR VIII è1994è, Vol II, pp 794í798 Jeærey F Gold Departmet of Mathematics, Departmet of Physics Uiversity of Utah Do H Tucker Departmet of Mathematics Uiversity

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

9.8: THE POWER OF A TEST

9.8: THE POWER OF A TEST 9.8: The Power of a Test CD9-1 9.8: THE POWER OF A TEST I the iitial discussio of statistical hypothesis testig, the two types of risks that are take whe decisios are made about populatio parameters based

More information

1.3 Binomial Coefficients

1.3 Binomial Coefficients 18 CHAPTER 1. COUNTING 1. Biomial Coefficiets I this sectio, we will explore various properties of biomial coefficiets. Pascal s Triagle Table 1 cotais the values of the biomial coefficiets ( ) for 0to

More information

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

5: Introduction to Estimation

5: Introduction to Estimation 5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

Engineering 323 Beautiful Homework Set 3 1 of 7 Kuszmar Problem 2.51

Engineering 323 Beautiful Homework Set 3 1 of 7 Kuszmar Problem 2.51 Egieerig 33 eautiful Homewor et 3 of 7 Kuszmar roblem.5.5 large departmet store sells sport shirts i three sizes small, medium, ad large, three patters plaid, prit, ad stripe, ad two sleeve legths log

More information

THE HEIGHT OF q-binary SEARCH TREES

THE HEIGHT OF q-binary SEARCH TREES THE HEIGHT OF q-binary SEARCH TREES MICHAEL DRMOTA AND HELMUT PRODINGER Abstract. q biary search trees are obtaied from words, equipped with the geometric distributio istead of permutatios. The average

More information

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is 0_0605.qxd /5/05 0:45 AM Page 470 470 Chapter 6 Additioal Topics i Trigoometry 6.5 Trigoometric Form of a Complex Number What you should lear Plot complex umbers i the complex plae ad fid absolute values

More information

Institute of Actuaries of India Subject CT1 Financial Mathematics

Institute of Actuaries of India Subject CT1 Financial Mathematics Istitute of Actuaries of Idia Subject CT1 Fiacial Mathematics For 2014 Examiatios Subject CT1 Fiacial Mathematics Core Techical Aim The aim of the Fiacial Mathematics subject is to provide a groudig i

More information

Divide and Conquer. Maximum/minimum. Integer Multiplication. CS125 Lecture 4 Fall 2015

Divide and Conquer. Maximum/minimum. Integer Multiplication. CS125 Lecture 4 Fall 2015 CS125 Lecture 4 Fall 2015 Divide ad Coquer We have see oe geeral paradigm for fidig algorithms: the greedy approach. We ow cosider aother geeral paradigm, kow as divide ad coquer. We have already see a

More information

Key Ideas Section 8-1: Overview hypothesis testing Hypothesis Hypothesis Test Section 8-2: Basics of Hypothesis Testing Null Hypothesis

Key Ideas Section 8-1: Overview hypothesis testing Hypothesis Hypothesis Test Section 8-2: Basics of Hypothesis Testing Null Hypothesis Chapter 8 Key Ideas Hypothesis (Null ad Alterative), Hypothesis Test, Test Statistic, P-value Type I Error, Type II Error, Sigificace Level, Power Sectio 8-1: Overview Cofidece Itervals (Chapter 7) are

More information

Unit 20 Hypotheses Testing

Unit 20 Hypotheses Testing Uit 2 Hypotheses Testig Objectives: To uderstad how to formulate a ull hypothesis ad a alterative hypothesis about a populatio proportio, ad how to choose a sigificace level To uderstad how to collect

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

Lecture 4: Cheeger s Inequality

Lecture 4: Cheeger s Inequality Spectral Graph Theory ad Applicatios WS 0/0 Lecture 4: Cheeger s Iequality Lecturer: Thomas Sauerwald & He Su Statemet of Cheeger s Iequality I this lecture we assume for simplicity that G is a d-regular

More information

Lesson 17 Pearson s Correlation Coefficient

Lesson 17 Pearson s Correlation Coefficient Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) -types of data -scatter plots -measure of directio -measure of stregth Computatio -covariatio of X ad Y -uique variatio i X ad Y -measurig

More information

SUPPLEMENTARY MATERIAL TO GENERAL NON-EXACT ORACLE INEQUALITIES FOR CLASSES WITH A SUBEXPONENTIAL ENVELOPE

SUPPLEMENTARY MATERIAL TO GENERAL NON-EXACT ORACLE INEQUALITIES FOR CLASSES WITH A SUBEXPONENTIAL ENVELOPE SUPPLEMENTARY MATERIAL TO GENERAL NON-EXACT ORACLE INEQUALITIES FOR CLASSES WITH A SUBEXPONENTIAL ENVELOPE By Guillaume Lecué CNRS, LAMA, Mare-la-vallée, 77454 Frace ad By Shahar Medelso Departmet of Mathematics,

More information

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean 1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

Irreducible polynomials with consecutive zero coefficients

Irreducible polynomials with consecutive zero coefficients Irreducible polyomials with cosecutive zero coefficiets Theodoulos Garefalakis Departmet of Mathematics, Uiversity of Crete, 71409 Heraklio, Greece Abstract Let q be a prime power. We cosider the problem

More information

A Combined Continuous/Binary Genetic Algorithm for Microstrip Antenna Design

A Combined Continuous/Binary Genetic Algorithm for Microstrip Antenna Design A Combied Cotiuous/Biary Geetic Algorithm for Microstrip Atea Desig Rady L. Haupt The Pesylvaia State Uiversity Applied Research Laboratory P. O. Box 30 State College, PA 16804-0030 haupt@ieee.org Abstract:

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

1. MATHEMATICAL INDUCTION

1. MATHEMATICAL INDUCTION 1. MATHEMATICAL INDUCTION EXAMPLE 1: Prove that for ay iteger 1. Proof: 1 + 2 + 3 +... + ( + 1 2 (1.1 STEP 1: For 1 (1.1 is true, sice 1 1(1 + 1. 2 STEP 2: Suppose (1.1 is true for some k 1, that is 1

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Annuities Under Random Rates of Interest II By Abraham Zaks. Technion I.I.T. Haifa ISRAEL and Haifa University Haifa ISRAEL.

Annuities Under Random Rates of Interest II By Abraham Zaks. Technion I.I.T. Haifa ISRAEL and Haifa University Haifa ISRAEL. Auities Uder Radom Rates of Iterest II By Abraham Zas Techio I.I.T. Haifa ISRAEL ad Haifa Uiversity Haifa ISRAEL Departmet of Mathematics, Techio - Israel Istitute of Techology, 3000, Haifa, Israel I memory

More information

Universal coding for classes of sources

Universal coding for classes of sources Coexios module: m46228 Uiversal codig for classes of sources Dever Greee This work is produced by The Coexios Project ad licesed uder the Creative Commos Attributio Licese We have discussed several parametric

More information

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10 FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10 [C] Commuicatio Measuremet A1. Solve problems that ivolve liear measuremet, usig: SI ad imperial uits of measure estimatio strategies measuremet strategies.

More information

Notes on exponential generating functions and structures.

Notes on exponential generating functions and structures. Notes o expoetial geeratig fuctios ad structures. 1. The cocept of a structure. Cosider the followig coutig problems: (1) to fid for each the umber of partitios of a -elemet set, (2) to fid for each the

More information

An example of non-quenched convergence in the conditional central limit theorem for partial sums of a linear process

An example of non-quenched convergence in the conditional central limit theorem for partial sums of a linear process A example of o-queched covergece i the coditioal cetral limit theorem for partial sums of a liear process Dalibor Volý ad Michael Woodroofe Abstract A causal liear processes X,X 0,X is costructed for which

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

INVESTMENT PERFORMANCE COUNCIL (IPC)

INVESTMENT PERFORMANCE COUNCIL (IPC) INVESTMENT PEFOMANCE COUNCIL (IPC) INVITATION TO COMMENT: Global Ivestmet Performace Stadards (GIPS ) Guidace Statemet o Calculatio Methodology The Associatio for Ivestmet Maagemet ad esearch (AIM) seeks

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics We leared to describe data sets graphically. We ca also describe a data set umerically. Measures of Locatio Defiitio The sample mea is the arithmetic average of values. We deote

More information

Systems Design Project: Indoor Location of Wireless Devices

Systems Design Project: Indoor Location of Wireless Devices Systems Desig Project: Idoor Locatio of Wireless Devices Prepared By: Bria Murphy Seior Systems Sciece ad Egieerig Washigto Uiversity i St. Louis Phoe: (805) 698-5295 Email: bcm1@cec.wustl.edu Supervised

More information

THE ABRACADABRA PROBLEM

THE ABRACADABRA PROBLEM THE ABRACADABRA PROBLEM FRANCESCO CARAVENNA Abstract. We preset a detailed solutio of Exercise E0.6 i [Wil9]: i a radom sequece of letters, draw idepedetly ad uiformly from the Eglish alphabet, the expected

More information

1 The Gaussian channel

1 The Gaussian channel ECE 77 Lecture 0 The Gaussia chael Objective: I this lecture we will lear about commuicatio over a chael of practical iterest, i which the trasmitted sigal is subjected to additive white Gaussia oise.

More information

Recursion and Recurrences

Recursion and Recurrences Chapter 5 Recursio ad Recurreces 5.1 Growth Rates of Solutios to Recurreces Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer. Cosider, for example,

More information

NATIONAL SENIOR CERTIFICATE GRADE 12

NATIONAL SENIOR CERTIFICATE GRADE 12 NATIONAL SENIOR CERTIFICATE GRADE MATHEMATICS P EXEMPLAR 04 MARKS: 50 TIME: 3 hours This questio paper cosists of 8 pages ad iformatio sheet. Please tur over Mathematics/P DBE/04 NSC Grade Eemplar INSTRUCTIONS

More information

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions Chapter 5 Uit Aual Amout ad Gradiet Fuctios IET 350 Egieerig Ecoomics Learig Objectives Chapter 5 Upo completio of this chapter you should uderstad: Calculatig future values from aual amouts. Calculatig

More information

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

More information

Plug-in martingales for testing exchangeability on-line

Plug-in martingales for testing exchangeability on-line Plug-i martigales for testig exchageability o-lie Valetia Fedorova, Alex Gammerma, Ilia Nouretdiov, ad Vladimir Vovk Computer Learig Research Cetre Royal Holloway, Uiversity of Lodo, UK {valetia,ilia,alex,vovk}@cs.rhul.ac.uk

More information

Section 11.3: The Integral Test

Section 11.3: The Integral Test Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult

More information

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork Solutios to Selected Problems I: Patter Classificatio by Duda, Hart, Stork Joh L. Weatherwax February 4, 008 Problem Solutios Chapter Bayesia Decisio Theory Problem radomized rules Part a: Let Rx be the

More information

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

DAME - Microsoft Excel add-in for solving multicriteria decision problems with scenarios Radomir Perzina 1, Jaroslav Ramik 2

DAME - Microsoft Excel add-in for solving multicriteria decision problems with scenarios Radomir Perzina 1, Jaroslav Ramik 2 Itroductio DAME - Microsoft Excel add-i for solvig multicriteria decisio problems with scearios Radomir Perzia, Jaroslav Ramik 2 Abstract. The mai goal of every ecoomic aget is to make a good decisio,

More information