SUPPORT UNION RECOVERY IN HIGHDIMENSIONAL MULTIVARIATE REGRESSION 1


 Ethelbert Osborne
 2 years ago
 Views:
Transcription
1 The Aals of Statistics 2011, Vol. 39, No. 1, 1 47 DOI: /09AOS776 Istitute of Mathematical Statistics, 2011 SUPPORT UNION RECOVERY IN HIGHDIMENSIONAL MULTIVARIATE REGRESSION 1 BY GUILLAUME OBOZINSKI, MARTIN J. WAINWRIGHT AND MICHAEL I. JORDAN INRIA Willow ProjectTeam Laboratoire d Iformatique de l Ecole Normale Supérieure, Uiversity of Califoria, Berkeley ad Uiversity of Califoria, Berkeley I multivariate regressio, a Kdimesioal respose vector is regressed upo a commo set of p covariates, with a matrix B R p K of regressio coefficiets. We study the behavior of the multivariate group Lasso,iwhich block regularizatio based o the l 1 /l 2 orm is used for support uio recovery, or recovery of the set of s rows for which B is ozero. Uder highdimesioal scalig, we show that the multivariate group Lasso exhibits a threshold for the recovery of the exact row patter with high probability over the radom desig ad oise that is specified by the sample complexity parameter θ(,p,s):= /[2ψ(B ) log(p s)]. Here is the sample size, ad ψ(b ) is a sparsityoverlap fuctio measurig a combiatio of the sparsities ad overlaps of the Kregressio coefficiet vectors that costitute the model. We prove that the multivariate group Lasso succeeds for problem sequeces (,p,s)such that θ(,p,s) exceeds a critical level θ u, ad fails for sequeces such that θ(,p,s) lies below a critical level θ l. For the special case of the stadard Gaussia esemble, we show that θ l = θ u so that the characterizatio is sharp. The sparsityoverlap fuctio ψ(b ) reveals that, if the desig is ucorrelated o the active rows, l 1 /l 2 regularizatio for multivariate regressio ever harms performace relative to a ordiary Lasso approach ad ca yield substatial improvemets i sample complexity (up to a factor of K) whe the coefficiet vectors are suitably orthogoal. For more geeral desigs, it is possible for the ordiary Lasso to outperform the multivariate group Lasso. We complemet our aalysis with simulatios that demostrate the sharpess of our theoretical results, eve for relatively small problems. 1. Itroductio. The developmet of efficiet algorithms for estimatio of largescale models has bee a major goal of statistical learig research i the last decade. There is ow a substatial body of work based o l 1 regularizatio datig back to the semial work of Tibshirai (1996) ad Dooho ad collaborators [Che, Dooho ad Sauders (1998); Dooho ad Huo (2001)]. The bulk of Received Jue 2009; revised November Supported i part by NSF Grats DMS ad CCF to MJW, ad by NSF Grat ad DARPA IPTO Cotract FA to MIJ. AMS 2000 subject classificatios. Primary 62J07; secodary 62F07. Key words ad phrases. LASSO, blockorm, secodorder coe program, sparsity, variable selectio, multivariate regressio, highdimesioal scalig, simultaeous Lasso, group Lasso. 1
2 2 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN this work has focused o the stadard problem of liear regressio, i which oe makes observatios of the form (1) y = Xβ + w, where y R is a realvalued vector of observatios, w R is a additive zeromea oise vector ad X R p is the desig matrix. A subset of the compoets of the ukow parameter vector β R p are assumed ozero; the goal is to idetify these coefficiets ad (possibly) estimate their values. This goal ca be formulated i terms of the solutio of the pealized optimizatio problem (2) arg mi β R p { 1 y Xβ 22 + λ β 0 }, where β 0 couts the umber of ozero compoets i β ad where λ > 0is a regularizatio parameter. Ufortuately, this optimizatio problem is computatioally itractable, a fact which has led various authors to cosider the covex relaxatio [Tibshirai (1996); Che, Dooho ad Sauders (1998)] (3) arg mi β R p { 1 y Xβ 22 + λ β 1 }, i which β 0 is replaced with the l 1 orm β 1. This relaxatio, ofte referred to as the Lasso [Tibshirai (1996)], is a quadratic program, ad ca be solved efficietly by various methods [e.g., Boyd ad Vadeberghe (2004); Osbore, Presell ad Turlach (2000); Efro et al. (2004)]. A variety of theoretical results are ow i place for the Lasso, both i the traditioal settig where the sample size teds to ifiity with the problem size p fixed [Kight ad Fu (2000)], as well as uder highdimesioal scalig, i which p ad ted to ifiity simultaeously, thereby allowig p to be comparable to or eve larger tha [e.g., Meishause ad Bühlma (2006); Waiwright (2009b); Meishause ad Yu (2009); Bickel, Ritov ad Tsybakov (2009)]. I may applicatios, it is atural to impose sparsity costraits o the regressio vector β, ad a variety of such costraits have bee cosidered. For example, oe ca cosider a hard sparsity model i which β is assumed to cotai at most s ozero etries or a soft sparsity model i which β is assumed to belog to a l q ball with q<1. Aalyses also differ i terms of the loss fuctios that are cosidered. For the model or variable selectio problem, it is atural to cosider the zero oe loss associated with the problem of recoverig the ukow support set of β. Alteratively, oe ca view the Lasso as a shrikage estimator to be compared to traditioal least squares or ridge regressio; i this case, it is atural to study the l 2 loss β β 2 betwee the estimate β ad the groud truth. I other settigs, the predictio error E[(Y X T β) 2 ] may be of primary iterest, ad oe tries to show risk cosistecy (amely, that the estimated model predicts as well as the best sparse model, whether or ot the true model is sparse).
3 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 3 A umber of alteratives to the Lasso have bee explored i recet years, ad i some cases stroger theoretical results have bee obtaied [Fa ad Li (2001); Frak ad Friedma (1993); Huag, Horowitz ad Ma (2008)]. However, the resultig optimizatio problems are geerally ocovex ad thus difficult to solve i practice. The Lasso remais a focus of attetio due to its combiatio of favorable statistical ad computatioal properties Blockstructured regularizatio. While the assumptio of sparsity at the level of idividual coefficiets is oe way to give meaig to highdimesioal (p ) regressio, there are other structural assumptios that are atural i regressio, ad which may provide additioal leverage. For istace, i a hierarchical regressio model, groups of regressio coefficiets may be required to be zero or ozero i a blockwise maer; for example, oe might wish to iclude a particular covariate ad all powers of that covariate as a group [Yua ad Li (2006); Zhao, Rocha ad Yu (2009)]. Aother example arises whe we cosider variable selectio i the settig of multivariate regressio: multiple regressios ca be related by a (partially) shared sparsity patter, such as whe there are a uderlyig set of covariates that are relevat across regressios [Oboziski, Taskar ad Jorda (2010); Argyriou, Evgeiou ad Potil (2006); Turlach, Veables ad Wright (2005); Zhag et al. (2008)]. Based o such motivatios, a recet lie of research [Bach, Lackriet ad Jorda (2004); Tropp (2006); Yua ad Li (2006); Zhao, Rocha ad Yu (2009); Oboziski, Taskar ad Jorda (2010); Ravikumar et al. (2009)] has studied the use of blockregularizatio schemes, iwhichthel 1 orm is composed with some other l q orm (q >1), thereby obtaiig the l 1 /l q orm defied as a sum of l q orms over groups of regressio coefficiets. The best kow examples of such block orms are the l 1 /l orm [Turlach, Veables ad Wright (2005); Zhag et al. (2008)] ad the l 1 /l 2 orm [Oboziski, Taskar ad Jorda (2010)]. I this paper, we ivestigate the use of l 1 /l 2 blockregularizatio i the cotext of highdimesioal multivariate liear regressio, i which a collectio of K scalar outputs are regressed o the same desig matrix X R p.represetig the regressio coefficiets as a p K matrix B, the multivariate regressio model takes the form (4) Y = XB + W, where Y R K ad W R K are matrices of observatios ad zeromea oise, respectively. I additio, we assume a hardsparsity model for the regressio coefficiets i which colum k of the coefficiet matrix B has ozero etries o a subset (5) S k := { i {1,...,p} βik 0} of size s k := S k. I may applicatios it is atural to expect that the supports S k should overlap. I that case, istead of estimatig the support of each regressio
4 4 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN separately, it might be beeficial to first estimate the set of variables which are relevat to ay of the multivariate resposes ad to estimate oly subsequetly the idividual supports withi that set. Thus we focus o the problem of recoverig the uio of the supports, amely the set S := K k=1 S k, correspodig to the subset of idices i {1,...,p} that are ivolved i at least oe regressio. We cosider a rage of problems i which variables ca be relevat to all, some, oly oe or oe of the regressios, ad we ivestigate if ad how the overlap of the idividual supports ad the relatedess of idividual regressios beefit or hider estimatio of the support uio. The support uio problem ca be uderstood as the geeralizatio of the problem of variable selectio to the group settig. Rather tha selectig specific compoets of a coefficiet vector, we aim to select specific rows of a coefficiet matrix. We thus also refer to the support uio problem as the row selectio problem. We ote that recoverig S, although ot equivalet to recoverig each of the distict idividual supports S k, addresses the essetial difficulty i recoverig those supports. Ideed, as we show i Sectio 2.2, give a method that returs the row support S with S p (with high probability), it is straightforward to recover the idividual supports S k by ordiary leastsquares ad thresholdig. If computatioal complexity were ot a cocer, the atural way to perform row selectio for B would be by solvig the optimizatio problem (6) arg mi B R p K { 1 2 Y XB 2 F + λ B l0 /l q where B = (β ik ) 1 i p,1 k K is a p K matrix, the quatity F deotes the Frobeius orm, 1 ad the orm B l0 /l q couts the umber of rows i B that have ozero l q orm. As before, the l 0 compoet of this regularizer yields a ocovex ad computatioally itractable problem, so that it is atural to cosider the relaxatio { } 1 (7) arg mi B R p K 2 Y XB 2 F + λ B l1 /l q, where B l1 /l q is the block l 1 /l q orm ( p K 1/q p (8) B l1 /l q := β q) ij = β i q. i=1 j=1 The relaxatio (7) is a atural geeralizatio of the Lasso; ideed, it specializes to the Lasso i the case K = 1. For later referece, we also ote that settig q = 1 leads to the use of the l 1 /l 1 block orm i the relaxatio (7). Sice this orm decouples across both the rows ad colums, this particular choice is equivalet to solvig K separate Lasso problems, oe for each colum of the p K i=1 }, 1 The Frobeius orm of a matrix A is give by A F := i,j A 2 ij.
5 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 5 regressio matrix B. A more iterestig choice is q = 2, which yields a block l 1 /l 2 orm that couples together the colums of B. Regularizatio with the l 1 /l 2 orm is commoly referred to as the group Lasso i the settig of uivariate regressio [Yua ad Li (2006)]. We thus refer to l 1 /l 2 regularizatio i the multivariate settig as the multivariate group Lasso. Note that the multivariate group Lasso ca be viewed as a special case of the group Lasso, i that it ivolves a specific groupig of regressio coefficiets, but the multivariate settig brigs ew statistical issues to the fore. As we discuss i Appedix B, the multivariate group Lasso ca be cast as a secodorder coe program (SOCP). This is a family of covex optimizatio problems that ca be solved efficietly with iterior poit methods [Boyd ad Vadeberghe (2004)] ad icludes quadratic programs as a particular case. Some recet work has addressed certai statistical aspects of blockregularizatio schemes. Meier, va de Geer ad Bühlma (2008) perform a aalysis of risk cosistecy with blockorm regularizatio. Bach (2008) provides a aalysis of blockwise support recovery for the kerelized group Lasso i the classical, fixed p settig. I the highdimesioal settig, Ravikumar et al. (2009) studies the cosistecy of blockwise support recovery for the group Lasso for fixed desig matrices, ad their result is geeralized by Liu ad Zhag (2008) to blockwise support recovery i the settig of geeral l 1 /l q regularizatio, agai for fixed desig matrices. However, these aalyses do ot discrimiate betwee various values of q, yieldig the same qualitative results ad the same covergece rates for q = 1 as for q > 1. Our focus, which is motivated by the empirical observatio that the group Lasso ad the multivariate group Lasso ca outperform the ordiary Lasso [Bach (2008); Yua ad Li (2006); Zhao, Rocha ad Yu (2009); Oboziski, Taskar ad Jorda (2010)], is precisely the distictio betwee q = 1 ad q > 1 (specifically q = 2). We ote that i cocurret work Negahba ad Waiwright (2008) have studied a related problem of support recovery for the l 1 /l relaxatio. The distictio betwee q = 1adq = 2 is also sigificat from a optimizatiotheoretic poit of view. I particular, the SOCP relaxatios uderlyig the multivariate group Lasso (q = 2) are geerally tighter tha the quadratic programmig relaxatio uderlyig the Lasso (q = 1); however, the improved accuracy is geerally obtaied at a higher computatioal cost [Boyd ad Vadeberghe (2004)]. Thus we ca view our problem as a istace of the geeral questio of the relatioship of statistical efficiecy to computatioal efficiecy: does the qualitatively greater amout of computatioal effort ivolved i solvig the multivariate group Lasso always yield greater statistical efficiecy? More specifically, ca we give theoretical coditios uder which solvig the geeralized Lasso problem (7) has greater statistical efficiecy tha aive strategies based o the ordiary Lasso? Coversely, ca the multivariate group Lasso ever be worse tha the ordiary Lasso?
6 6 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN With this motivatio, this paper provides a detailed aalysis of model selectio cosistecy of the multivariate group Lasso (7) with l 1 /l 2 regularizatio. Statistical efficiecy is defied i terms of the scalig of the sample size, as a fuctio of the problem size p ad of the sparsity structure of the regressio matrix B, required for cosistet row selectio. Our aalysis is highdimesioal i ature, allowig both ad p to diverge, ad yieldig explicit error bouds as a fuctio of p. As detailed below, our aalysis provides affirmative aswers to both of the questios above. First, we demostrate that uder certai structural assumptios o the desig ad regressio matrix B, the multivariate group Lasso is always guarateed to outperform the ordiary Lasso, i that it correctly performs row selectio for sample sizes for which the Lasso fails with high probability. Secod, we also exhibit some problems (though arguably ot geeric) for which the multivariate group Lasso will be outperformed by the aive strategy of applyig the Lasso separately to each of the K colums, ad takig the uio of supports Our results. The mai cotributio of this paper is to show that uder certai techical coditios o the desig ad oise matrices, the model selectio performace of blockregularized l 1 /l 2 regressio (7) is govered by the sample complexity fuctio (9) θ l1 /l 2 (, p; B ) := 2ψ(B ) log(p s), where is the sample size, p is the ambiet dimesio, s = S is the umber of rows that are ozero ad ψ( ) is a sparsityoverlap fuctio. Our use of the term sample complexity for θ l1 /l 2 reflects the role it plays i our aalysis as the rate at which the sample size must grow i order to obtai cosistet row selectio as a fuctio of the problem parameters. More precisely, for scaligs (,p,s,b ) such that θ l1 /l 2 (, p; B ) exceeds a fixed critical threshold θ u (0, + ),weshow that the probability of correct row selectio by the l 1 /l 2 multivariate group Lasso coverges to oe, ad coversely, for scaligs such that θ l1 /l 2 (, p; B ) is below aother threshold θ l, we show that the multivariate group Lasso fails with high probability. Whereas the ratio (log p)/ is stadard for the highdimesioal theory of l 1  regularizatio, the fuctio ψ(b ) is a ovel ad iterestig quatity, oe which measures both the sparsity of the matrix B as well as the overlap betwee the differet regressios, represeted by the colums of B [see equatio (16) forthe precise defiitio of ψ(b )]. As a particular illustratio, cosider the special case of a uivariate regressio with K = 1, i which the covex program (7) reduces to the ordiary Lasso (3). I this case, if the desig matrix is draw from the stadard Gaussia esemble [i.e., X ij N(0, 1), i.i.d.], we show that the sparsityoverlap fuctio reduces to ψ(b ) = s, correspodig to the support size of the sigle coefficiet vector. We thus recover as a corollary a previously kow result [Waiwright (2009b)]: amely, the Lasso succeeds i performig exact support recovery oce the ratio /[s log(p s)] exceeds a certai critical threshold. At
7 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 7 the other extreme, for a geuiely multivariate problem with K>1ads ozero rows, agai for a stadard Gaussia desig, whe the regressio matrix is suitably orthoormal relative to the desig (see Sectio 2 for a precise defiitio), the sparsityoverlap fuctio is give by ψ(b ) = s/k. I this case, l 1 /l 2 blockregularizatio has sample complexity lower by a factor of K relative to the aive approach of solvig K separate Lasso problems. Of course, there is also a rage of behavior betwee these two extremes, i which the gai i sample complexity varies smoothly as a fuctio of the sparsityoverlap ψ(b ) i the iterval [ s K,s]. O the other had, we also show that for suitably correlated desigs, it is possible that the sample complexity ψ(b ) associated with l 1 /l 2 blockregularizatio is larger tha that of the ordiary Lasso (l 1 /l 1 ) approach. The remaider of the paper is orgaized as follows. I Sectio 2, we provide a precise statemet of our mai results (Theorems 1 ad 2), discuss some of their cosequeces ad illustrate the close agreemet betwee our theoretical results ad simulatios. Sectios 3 ad 4 are devoted to the proofs of Theorems 1 ad 2, respectively, with the argumets broke dow ito a series of steps. More techical results are deferred to the appedices. We coclude with a brief discussio i Sectio Notatio. We collect here some otatio used throughout the paper. For a (possibly radom) matrix M R p K, we defie the Frobeius orm M F := ( i,j m 2 ij )1/2, ad for parameters 1 a b,thel a /l b block orm is defied as follows: (10) M la /l b := { p ( K i=1 ) a/b } 1/a m ik b. k=1 These vector orms o matrices should be distiguished from the (a, b)operator orms (11) M a,b := sup Mx a x b =1 (although some orms belog to both families; see Lemma 7 i Appedix E). Importat special cases of the latter iclude the spectral orm M 2,2 (also deoted M 2 ), ad the l operator orm M, = max Kj=1 i=1,...,p M ij, deoted M for short. I additio to the usual Ladau otatio O ad o, we write a = (b ) for sequeces such that b a = o(1). We also use the otatio a = (b ) if both a = O(b ) ad b = O(a ) hold. 2. Mai results ad some cosequeces. The aalysis of this paper cosiders the multivariate group Lasso estimator, obtaied as a solutio to the SOCP { } 1 (12) arg mi 2 Y XB 2 F + λ B l1 /l 2 B R p K
8 8 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN for radom esembles of multivariate liear regressio problems, each of the form (4), where the oise matrix W R K is assumed to cosist of i.i.d. elemets W ij N(0,σ 2 ). We cosider radom desig matrices X with each row draw i a i.i.d. maer from a zeromea Gaussia N(0, ),where 0isap p covariace matrix. Although the blockregularized problem (12) eed ot have a uique solutio i geeral, a cosequece of our aalysis is that i the regime of iterest, the solutio is uique, so that we may talk uambiguously about the estimated support Ŝ. The mai object of study i this paper is the probability P[Ŝ = S], where the probability is take both over the radom choice of oise matrix W ad radom desig matrix X. We study the behavior of this probability as elemets of the triplet (,p,s)ted to ifiity Notatio ad assumptios. More precisely, our mai result applies to sequeces of models idexed by (, p(), s()), a associated sequece of p p covariace matrices ad a sequece {B } of coefficiet matrices with row support (13) S := {i βi 0} of size S =s = s(). We use S c to deote its complemet (i.e., S c := {1,...,p}\S). We let (14) bmi := mi i S β i 2 correspod to the miimal l 2 roworm of the coefficiet matrix B over its ozero rows. Give a observed pair (Y, X) from the model (4), the goal is to estimate the row support S of the matrix B. We impose the followig coditios o the covariace of the desig matrix: (A1) Bouded eigespectrum: There exist fixed costats C mi > 0 ad C max < + such that all eigevalues of the s s matrix SS are cotaied i the iterval [C mi,c max ]. (A2) Irrepresetable coditio: There exists a fixed parameter γ (0, 1] such that S c S( SS ) 1 1 γ. (A3) Selficoherece: ( SS ) 1 D max for some D max < +. The lower boud ivolvig C mi i assumptio (A1) prevets excess depedece amog elemets of the desig matrix associated with the support S; coditios of this form are required for model selectio cosistecy or l 2 cosistecy of the Lasso. The upper boud ivolvig C max i assumptio (A1) is ot eeded for provig success but oly failure of the multivariate group Lasso. The irrepresetable coditio ad selficoherece assumptios are also well kow from previous work o variable selectio cosistecy of the Lasso [Meishause ad Bühlma (2006); Tropp (2006); Zhao ad Yu (2006)]. Although such assumptios are ot eeded i aalyzig l 2 or risk cosistecy, they are kow to be
9 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 9 ecessary for variable selectio cosistecy of the Lasso. Ideed, i the absece of such coditios, it is always possible to make the Lasso fail, eve with a arbitrarily large sample size. [See, however, Meishause ad Yu (2009) for methods that weake the irrepresetable coditio.] Note that these assumptios are trivially satisfied by the stadard Gaussia esemble = I p p, with C mi = C max = 1, D max = 1adγ = 1. More geerally, it ca be show that various matrix classes (e.g., Toeplitz matrices, treestructured covariace matrices, bouded offdiagoal matrices) satisfy these coditios [Meishause ad Bühlma (2006); Zhao ad Yu (2006); Waiwright (2009b)]. We require a few pieces of otatio before statig the mai results. For a arbitrary matrix B S R s K with ith row β i R 1 K, we defie the matrix ζ(b S ) R s K with ith row ζ(β i ) := β i (15), β i 2 whe β i 0, ad we set ζ(β i ) = 0 otherwise. With this otatio, the sparsityoverlap fuctio is give by (16) ψ(b):= ζ(b S ) T ( SS ) 1 ζ(b S ) 2, where 2 deotes the spectral orm. We use this sparsityoverlap fuctio to defie the sample complexity parameter, which captures the effective sample size (17) θ l1 /l 2 (, p; B ) := 2ψ(B ) log(p s). I the followig two theorems, we cosider a radom desig matrix X draw with i.i.d. N(0, )row vectors, where satisfies assumptios (A1) (A3), ad a observatio matrix Y specified by model (4). I order to capture depedece iduced by the desig covariace matrix, for ay positive semidefiite matrix Q 0, we defie the quatities (18a) ρ l (Q) := 1 2 mi[q ii + Q jj 2Q ij ] i j ad (18b) ρ u (Q) := max Q ii. i We ote that by defiitio, we have ρ l (Q) ρ u (Q) wheever Q 0. Our bouds are stated i terms of these quatities as applied to the coditioal covariace matrix S c S c S := S c S c S c S( SS ) 1 SS c. Our first result is a achievability result, showig that the multivariate group Lasso succeeds i recoverig the row support ad yields cosistecy i l /l 2 orm. We state this result for sequeces of regularizatio parameters λ =
10 10 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN f(p)log p,wheref(p) + is ay fuctio such that λ 0. We also p + assume that is sufficietly large such that s/ < 1/2. THEOREM 1. Suppose that we solve the multivariate group Lasso with specified regularizatio parameter sequece λ for a sequece of problems idexed by (,p,b, )that satisfy assumptios (A1) (A3), ad such that, for some ν>0, θ l1 /l 2 (, p; B ) = 2ψ(B ) log(p s) >(1 + ν)ρ u( S c S c S) (19) γ 2. The for uiversal costats c i > 0(i.e., idepedet of, p, s, B, ), with probability greater tha 1 c 2 exp( c 3 K log s) c 0 exp( c 1 log(p s)), the followig statemets hold: (a) The multivariate group Lasso has a uique solutio B with row support S(B) that is cotaied withi the true row support S(B ), ad moreover satisfies the (20) boud 8K log s B B l /l 2 C mi + λ D max + 6λ s 2. C mi } {{ } ρ(,s,λ ) ρ (b) If bmi = o(1), the estimate of the row support, S( B) := {i {1,...,p} β i 0}, specified by this uique solutio is equal to the row support set S(B ) of thetruemodel. Note that the theorem is aturally separated ito two distict but related claims. Part (a) guaratees that the method produces o false iclusios ad, moreover, bouds the maximum l 2 error across the rows. Part (b) requires some additioal ρ b mi assumptios amely, the restrictio = o(1) esurig that the error ρ is of lower order tha the miimum l 2 orm bmi across rows but also guaratees the stroger result of o false exclusios as well, so that the method recovers the row support exactly. Note that the probability of these evets coverges to oe oly if both (p s) ad s ted to ifiity, which might seem couterituitive iitially (sice problems with larger support sets s might seem harder). However, as we discuss at the ed of Sectio 3.3, this depedece ca be removed at the expese of a slightly slower covergece rate for B B l /l 2. Our secod mai theorem is a egative result, showig that the multivariate group Lasso fails with high probability if the rescaled sample size θ l1 /l 2 is below a critical threshold. I order to clarify the phrasig of this result, ote that Theorem 1 ca be summarized succictly as guarateeig that there is a uique solutio B with the correct row support that satisfies B B l /l 2 = o(bmi ). The followig result shows that such a guaratee caot hold if the sample size scales too slowly relative to p, s ad the other problem parameters.
11 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 11 THEOREM 2. Cosider problem sequeces idexed by (,p,b, )that satisfy assumptios (A1) (A2), ad with miimum value bmi such that b mi 2 = ( log p ), ad suppose that we solve the multivariate group Lasso with ay positive regularizatio sequece {λ }. The there exist ν>0 ad uiversal costats c i > 0 such that if the sample size is lower bouded as θ l1 /l 2 (, p; B ) = 2ψ(B ) log(p s) <(1 ν)ρ l( S c S c S) (2 γ) 2, the with probability greater tha 1 c 0 exp{ c 1 mi( K s, θ l 2 log(p s))}, there is o solutio B of the multivariate group Lasso that has the correct row support ad satisfies the boud B B l /l 2 = o(bmi ). The proof of this claim is provided i Sectio 4. We ote that iformatiotheoretic methods [Waiwright (2009a)] imply that o method (icludig the multivariate group Lasso) ca perform exact support recovery uless /s +, so that the probability give i Theorem 2 coverges to oe uder the give coditios. Note that Theorems 1 ad 2 i cojuctio imply that the rescaled sample size θ l1 /l 2 (, p; B ) = 2ψ(B ) log(p s) captures the behavior of the multivariate group Lasso for support recovery ad estimatio i block l /l 2 orm. For the special case of radom desig matrices draw from the stadard Gaussia esemble (i.e., = I p p ), the give scaligs are sharp: COROLLARY 1. For the stadard Gaussia esemble, the multivariate group Lasso udergoes a sharp threshold at the level θ l1 /l 2 (,p,b ) = 1. More specifically, for ay δ>0: (a) For problem sequeces (,p,b ) such that θ l1 /l 2 (,p,b )>1 + δ, the multivariate group Lasso succeeds with high probability. (b) Coversely, for sequeces such that θ l1 /l 2 (,p,b )<1 δ, the multivariate group Lasso fails with high probability. PROOF. I the special case = I p p, it is straightforward to verify that all the assumptios are satisfied: i particular, we have C mi = C max = 1, D max = 1 ad γ = 1. Moreover, a short calculatio shows that ρ u (I) = ρ l (I) = 1. Cosequetly, the thresholds give i the sufficiet coditio (19) ad the ecessary coditio (21) are both equal to oe Efficiet estimatio of idividual supports. The precedig results address exact recovery of the support uio of the regressio matrix B. As demostrated by the followig procedure ad the associated corollary of Theorem 1, oce the
12 12 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN row support has bee recovered, it is straightforward to recover the idividual supports of each colum of the regressio matrix via the additioal steps of performig ordiary least squares ad thresholdig. Efficiet multistage estimatio of idividual supports: (1) estimate the support uio with Ŝ, the support uio of the solutio B of the multivariate group Lasso; (2) compute the restricted ordiary least squares (ROLS) estimate, (21) BŜ := arg mi Y XŜBŜ F BŜ for the restricted multivariate problem; (3) compute the matrix T( BŜ) obtaied by thresholdig BŜ at the level 2log(K Ŝ ) 2 C mi, ad estimate the idividual supports by the ozero etries of T( BŜ). The followig result, which is proved i Appedix A, shows that uder the assumptios of Theorem 1, the additioal postprocessig applied to the support uio estimate will recover the idividual supports with high probability: COROLLARY 2. Uder assumptios (A1) (A3) ad the additioal assumptios of Theorem 1, if for all idividual ozero coefficiets βik,i S,1 k K, we have βik 2 4log(Ks) C mi, the with probability greater tha 1 (exp( c 0 K log s)) the above twostep estimatio procedure recovers the idividual supports of B Some cosequeces of Theorems 1 ad 2. We begi by makig some simple observatios about the sparsityoverlap fuctio. LEMMA 1. (a) For ay desig satisfyig assumptio (A1), the sparsityoverlap ψ(b ) obeys the bouds s (22) C max K ψ(b ) s ; C mi (b) if SS = I s s, ad if the colums (Z (k) ) of the matrix Z = ζ(b ) are orthogoal, the the sparsityoverlap fuctio is ψ(b ) = max k=1,...,k Z (k) 2 2. The proof of this claim is provided i Appedix C. Based o this lemma, we ow study some special cases of Theorems 1 ad 2. The simplest special case is the uivariate regressio problem (K = 1), i which case the quatity ζ(β ) [as defied i equatio (15)] simply yields a sdimesioal sig vector with elemets zi = sig(β i ). [Recall that the sig fuctio is defied as sig(0) = 0, sig(x) = 1
13 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 13 if x>0 ad sig(x) = 1 if x<0.] I this case, the sparsityoverlap fuctio is give by ψ(β ) = z T ( SS ) 1 z, ad as a cosequece of Lemma 1(a), we have ψ(β ) = (s). Cosequetly, a simple corollary of Theorems 1 ad 2 is that the Lasso succeeds oce the ratio /(2s log(p s)) exceeds a certai critical threshold, determied by the eigespectrum ad icoherece properties of, ad it fails below a certai threshold. This result matches the ecessary ad sufficiet coditios established i previous work o the Lasso [Waiwright (2009b)]. We ca also use Lemma 1 ad Theorems 1 ad 2 to compare the performace of the multivariate group Lasso to the followig (arguably aive) strategy for row selectio usig the ordiary Lasso. Row selectio usig ordiary Lasso: (1) Apply the ordiary Lasso separately to each of the K uivariate regressio problems specified by the colums of B, thereby obtaiig estimates β (k) for k = 1,...,K. (2) For k = 1,...,K, estimate the support of idividual colums via Ŝk := {i β (k) i 0}. (3) Estimate the row support by takig the uio: Ŝ = K Ŝk. k=1 To uderstad the coditios goverig the success/failure of this procedure, ote that it succeeds if ad oly if for each ozero row i S = K k=1 S k,thevariable β (k) i is ozero for at least oe k, adforallj S c ={1,...,p}\S, thevariable β (k) j = 0forallk = 1,...,K. From our uderstadig of the uivariate case, we kow that for θ u = ρ u( S c S c S ), the coditio γ 2 2θ u max ψ( β (k) ) S log(p sk ) 2θ u max ψ( β (k) ) (23) S log(p s) k=1,...,k k=1,...,k is sufficiet to esure that the ordiary Lasso succeeds i row selectio. Coversely, for θ l = ρ l( S c S c S ), if the sample size is upper bouded as (2 γ) 2 <2θ l max ψ( β (k) ) S log(p s), k=1,...,k the there will exist some j S c such for at least oe k {1,...,K}, there holds β (k) j 0 with high probability, implyig failure of the ordiary Lasso. A atural questio is whether the multivariate group Lasso, by takig ito accout the coupligs across colums, always outperforms (or at least matches) the aive strategy. The followig result, prove i Appedix D, shows that if the desig is ucorrelated o the support, the ideed this is the case. COROLLARY 3 (Multivariate group Lasso versus ordiary Lasso). Assume SS = I s s. The for ay multivariate regressio problem, row selectio usig
14 14 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN the ordiary Lasso strategy requires, with high probability, at least as may samples as the l 1 /l 2 multivariate group Lasso. I particular, the relative efficiecy of multivariate group Lasso versus ordiary Lasso is give by the ratio (24) max k=1,...,k ψ(β (k) S ) log(p s k ) ψ(b S ) log(p s) 1. We cosider the special case of idetical regressios, for which a result ca be stated for ay covariace desig ad the case of orthoormal regressios which illustrates Corollary 3. EXAMPLE 1 (Idetical regressios). Suppose that B := β 1 T K that is, B cosists of K copies of the same coefficiet vector β R p, with support of cardiality S =s.wethehave[ζ(b )] ij = sig(βi )/ K, from which we see that ψ(b ) = z T ( SS ) 1 z, with z beig a sdimesioal sig vector with elemets zi = sig(β i ). Cosequetly, we have the equality ψ(b ) = ψ(β ),sothat uder our aalysis there is o beefit i usig the multivariate group Lasso relative to the strategy of solvig separate Lasso problems ad costructig the uio of idividually estimated supports. This behavior may seem couterituitive, sice uder the model (4) we essetially have K observatios of the coefficiet vector β with the same desig matrix but K idepedet oise realizatios, which could help to reduce the effective oise variace from σ 2 to σ 2 /K if the fact that the regressios are idetical is kow. It must be bore i mid, however, that i our highdimesioal aalysis the oise variace does ot grow as the dimesioality grows, ad thus asymptotically the oise is domiated by the iterferece betwee the covariates, which grows as (p s). It is thus this highdimesioal iterferece that domiates the rates give i Theorems 1 ad 2. I cotrast to this pessimistic example, we ow tur to the most optimistic extreme: EXAMPLE 2 ( Orthoormal regressios). Suppose that SS = I s s ad (for s>k) suppose that B is costructed such that the colums of the s K matrix ζ(b ) are orthogoal ad with equal orm (which implies their orm equals sk ). Uder these coditios, we claim that the sample complexity of multivariate group Lasso is smaller tha that of the ordiary Lasso by a factor of 1/K. Ideed, usig Lemma 1(b), we observe that Kψ(B ) = K Z (1) 2 = K Z (k) 2 = tr(z T Z ) = tr(z Z T ) = s, k=1 because Z Z T R s s is the Gram matrix of s uit vectors i R k ad its diagoal elemets are therefore all equal to 1. Cosequetly, the multivariate group Lasso
15 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 15 recovers the row support with high probability for sequeces such that 2(s/K) log(p s) > 1, which allows for sample sizes K times smaller tha the ordiary Lasso approach. Corollary 3 ad the subsequet examples address the case of ucorrelated desig ( SS = I s s ) o the row support S, for which the multivariate group Lasso is ever worse tha the ordiary Lasso i performig row selectio. The followig example shows that if the supports are disjoit, the ordiary Lasso has the same sample complexity as the multivariate group Lasso for ucorrelated desig SS = I s s, but ca be better tha the multivariate group Lasso for desigs SS with suitable correlatios: COROLLARY 4 (Disjoit supports). Suppose that the support sets S k of idividual regressio problems are disjoit. The for ay desig covariace SS, we have max ψ( β (k) ) (a) S ψ(bs ) (b) K ψ ( β (k) ) (25) S. 1 k K PROOF. so that Z (k) S First ote that, sice all supports are disjoit, Z (k) i = sig(βik ), = ζ(β (k) S ). Iequality (b) is the immediate because we have ). To establish iequality (a), we ote that Z S T 1 SS Z S 2 tr(z S T 1 SS Z S ψ(b ) = max x R K : x 1 k=1 x T Z S T 1 SS Z S x max 1 k K et k Z S T 1 SS Z S e k max 1 k K Z(k) T S 1 SS Z(k) S. A caveat i iterpretig Corollary 4, ad more geerally i comparig the performace of the ordiary Lasso ad the multivariate group Lasso, is that for a geeral covariace matrix SS, assumptios (A2) ad (A3) required by Theorem 1 do ot iduce the same costraits o the covariace matrix whe applied to the multivariate problem as whe applied to the idividual regressios. Ideed, i the latter case, (A2) would require max k S c k S k 1 S k S k 1 γ ad (A3) would require max k 1 S k S k D max. Thus (A3) is a stroger assumptio i the multivariate case but (A2) is ot. We illustrate Corollary 4 with a example. EXAMPLE 3 (Disjoit support with ucorrelated desig). Suppose that SS = I s s, ad the supports are disjoit. I this case, we claim that the sample complexity of the l 1 /l 2 multivariate group Lasso is the same as the ordiary Lasso.
16 16 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN If the idividual regressios have disjoit support, the ZS = ζ(b S ) has oly a sigle ozero etry per row ad therefore the colums of Z are orthogoal. Moreover, Zik = sig(β(k) i ). By Lemma 1(b), the sparsityoverlap fuctio ψ(b ) is equal to the largest squared colum orm. But Z (k) 2 = s i=1 sig(β (k) i ) 2 = s k. Thus, the sample complexity of the multivariate group Lasso is the same as the ordiary Lasso i this case. 2 Fially, we cosider a example that illustrates the effect of correlated desigs: EXAMPLE 4 (Effects of correlated desigs). To illustrate the behavior of the sparsityoverlap fuctio i the presece of correlatios i the desig, we cosider the simple case of two regressios with support of size 2. For parameters ϑ 1 ad ϑ 2 [0,π] ad μ ( 1, +1), cosider regressio matrices B such that B = ζ(bs ) ad [ ] [ ] ζ(bs ) = cos(ϑ1 ) si(ϑ 1 ) ad 1 1 μ (26) SS cos(ϑ 2 ) si(ϑ 2 ) =. μ 1 Settig M = ζ(bs )T SS 1 ζ(b S ), a simple calculatio shows that tr(m ) = 2 ( 1 + μ cos(ϑ 1 ϑ 2 ) ) ad det(m ) = (1 μ 2 ) si(ϑ 1 ϑ 2 ) 2, so that the eigevalues of M are μ + = (1 + μ) ( 1 + cos(ϑ 1 ϑ 2 ) ) ad μ = (1 μ) ( 1 cos(ϑ 1 ϑ 2 ) ), ad ψ(b ) = max(μ +,μ ). O the other had, with z 1 = ζ ( β (1) ) ( ) sig(cos(ϑ1 )) = sig(cos(ϑ 2 )) ad z 2 = ζ ( β (2) ) = we have ( ) sig(si(ϑ1 )) sig(si(ϑ 2 )) ψ ( β (1) ) = z 1 T 1 SS z 1 = 1 {cos(ϑ1 ) 0} + 1 {cos(ϑ2 ) 0} + 2μ sig(cos(ϑ 1 ) cos(ϑ 2 )), ψ ( β (2) ) = z 2 T 1 SS z 2 = 1 {si(ϑ1 ) 0} + 1 {si(ϑ2 ) 0} + 2μ sig(si(ϑ 1 ) si(ϑ 2 )). Figure 1 provides a graphical compariso of these sample complexity fuctios. The fuctio ψ(b ) = max(ψ(β (1) ), ψ(β (2) )) is discotiuous o S = π 2 Z R R π 2 Z, ad, as a cosequece, so is its differece with ψ(b ). Note that, for fixed ϑ 1 or fixed ϑ 2, some of these discotiuities are removable discotiuities of the iduced fuctio o the other variable, ad these discotiuities therefore 2 I makig this assertio, we are igorig ay differece betwee log(p s k ) ad log(p s), which is valid, for istace, i the regime of subliear sparsity, whe s k /p 0ads/p 0.
17 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 17 FIG. 1. Compariso of sparsityoverlap fuctios for l 1 /l 2 ad the Lasso. For the pair 1 2π (ϑ 1,ϑ 2 ), we represet i each row of plots, correspodig respectively to μ = 0(top), 0.9 (middle) ad 0.9 (bottom), from left to right, the quatities: ψ(b ) (left), max(ψ(β (1) ), ψ(β (2) )) (ceter) ad max(0,ψ(b ) max(ψ(β (1) ), ψ(β (2) ))) (right). The latter idicates whe the iequality ψ(b ) max(ψ(β (1) ), ψ(β (2) )) does ot hold ad by how much it is violated. create eedles, slits or flaps i the graph of the fuctio ψ. Deote by R + ad R the sets R + ={(ϑ 1,ϑ 2 ) mi[cos(ϑ 1 ) cos(ϑ 2 ), si(ϑ 1 ) si(ϑ 2 )] > 0}, R ={(ϑ 1,ϑ 2 ) max[cos(ϑ 1 ) cos(ϑ 2 ), si(ϑ 1 ) si(ϑ 2 )] < 0} o which ψ(b ) reaches its miimum value whe μ 0.5 adμ 0.5, respectively (see middle ad bottom ceter plots i Figure 4). For μ = 0, the top ceter graph illustrates that ψ(b ) is equal to 2 except for the cases of matrices BS with disjoit support, correspodig to the discrete set D ={(k π 2,(k± 1) π 2 ), k Z} for which it equals 1. The top rightmost graph illustrates that, as show i Corollary 3, the iequality always holds for a ucorrelated desig. For μ>0, the iequality ψ(b ) max(ψ(β (1) ), ψ(β (2) )) is violated oly o a subset of S R ;ad for μ<0, the iequality is symmetrically violated o a subset of S R + (see Figure 4).
18 18 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN 2.4. Illustrative simulatios. I this sectio, we preset the results of simulatios that illustrate the sharpess of Theorems 1 ad 2, ad furthermore demostrate how quickly the predicted behavior is observed as elemets of the triple (, p, s) grow i differet regimes. We explore the case of two regressios (i.e., K = 2) which share a idetical support set S with cardiality S =s i Sectio ad cosider a slightly more geeral case i Sectio Threshold effect i the stadard Gaussia case. The first set of experimets was desiged to reveal the threshold effect predicted by Theorems 1 ad 2. The desig matrix X is sampled from the stadard Gaussia esemble, with i.i.d. etries X ij N(0, 1). We cosider two types of sparsity: logarithmic sparsity, where s = α log(p), for α = 2/ log(2), ad liear sparsity, where s = αp,forα = 1/8 for various ambiet model dimesios p {16, 32, 64, 256, 512, 1024}. Fora give triplet (,p,s), we solve the blockregularized problem (12) with the regularizatio parameter λ = log(p s)(log s)/. For each fixed (p, s) pair, we measure the sample complexity i terms of a parameter θ, i particular lettig = θslog(p s) for θ [0.25, 1.5].WeletthematrixB R p 2 of regressio coefficiets have etries βij i { 1/ 2, 1/ 2}, choosig the parameters to vary the agle betwee the two colums, thereby obtaiig various desired values of ψ(b ). Sice = I p p for the stadard Gaussia esemble, the sparsityoverlap fuctio ψ(b ) is simply the maximal eigevalue of the Gram matrix ζ(bs )T ζ(bs ). Sice βij =1/ 2 by costructio, we are guarateed that BS = ζ(b S ), that the miimum value bmi = 1, ad, moreover, that the colums of ζ(b S ) have the same Euclidea orm. To costruct parameter matrices B that satisfy βij =1/ 2, we choose both p ad the sparsity scaligs so that the obtaied values for s are multiples of four. We the costruct the colums Z (1) ad Z (2) of the matrix B = ζ(b ) from copies of vectors of legth four. Deotig by the usual matrix tesor product, we cosider the followig 4vectors: Idetical regressios: We set Z (1) = Z (2) = s,sothatthesparsityoverlap fuctio is ψ(b ) = s. Orthoormal regressios: Here B is costructed with Z (1) Z (2),sothat ψ(b ) = 2 s, the most favorable situatio. I order to achieve this orthoormality, we set Z (1) = s ad Z (2) = s/2 (1, 1) T. Itermediate agles: I this itermediate case, the colums Z (1) ad Z (2) are at a 60 agle, which leads to ψ(b ) = 3 4 s. Specifically, we set Z(1) = s ad Z (2) = s/4 (1, 1, 1, 1) T. Figure 2 shows plots for liear sparsity (left colum) ad logarithmic sparsity (right colum) i all three cases ad where the multivariate group Lasso was used
19 SUPPORT UNION RECOVERY IN MULTIVARIATE REGRESSION 19 FIG. 2. Plots of support uio recovery probability, P[Ŝ = S], versus the cotrol parameter θ = /[2s log(p s)] for two differet types of sparsity: liear sparsity i the left colum (s = p/8) ad logarithmic sparsity i the right colum (s = 2log 2 (p)). The first three rows are based o usig the multivariate group Lasso to estimate the support for the three cases of idetical regressio, itermediate agles ad orthoormal regressios. The fourth row presets results for the Lasso i the case of idetical regressios.
20 20 G. OBOZINSKI, M. J. WAINWRIGHT AND M. I. JORDAN FIG. 3. Plots of support recovery probability, P[Ŝ = S], versus the cotrol parameter θ = /[2s log(p s)] for two differet type of sparsity: logarithmic sparsity o top (s = O(log(p))) ad liear sparsity o bottom (s = αp), ad for icreasig values of p from left to right. The oise level is set at σ = 0.1. Each graph shows four curves (black, red, gree, blue) correspodig to the case of idepedet l 1 regularizatio, ad, for l 1 /l 2 regularizatio, the cases of idetical regressio, itermediate agles ad orthoormal regressios. Note how curves correspodig to the same case across differet problem sizes p all coicide, as predicted by Theorems 1 ad 2. Moreover, cosistet with the theory, the curves for the idetical regressio group reach P[Ŝ = S] 0.50 at θ 1, whereas the orthoormal regressio group reaches 50% success substatially earlier. (top three rows), as well as the referece Lasso case for the case of idetical regressios (bottom row). Each pael plots the success probability, P[Ŝ = S],versus the rescaled sample size θ = /[2s log(p s)]. Uder this rescalig, Theorems 1 ad 2 predict that the curves should alig, ad that the success probability should trasitio to 1 oce θ exceeds a critical threshold (depedet o the type of esemble). Note that for suitably large problem sizes (p 128), the curves do alig i the predicted way, showig stepfuctio behavior. Figure 3 plots data from the same simulatios i a differet format. Here the top row correspods to logarithmic sparsity, ad the bottom row to liear sparsity; each pael shows the four differet choices for B, with the problem size p icreasig from left to right. Note how i each pael the locatio of the trasitio of P[Ŝ = S] to oe shifts from right to
Highdimensional support union recovery in multivariate regression
Highdimesioal support uio recovery i multivariate regressio Guillaume Oboziski Departmet of Statistics UC Berkeley gobo@stat.berkeley.edu Marti J. Waiwright Departmet of Statistics Dept. of Electrical
More informationDepartment of Computer Science, University of Otago
Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS200609 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly
More informationI. Chisquared Distributions
1 M 358K Supplemet to Chapter 23: CHISQUARED DISTRIBUTIONS, TDISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad tdistributios, we first eed to look at aother family of distributios, the chisquared distributios.
More informationAsymptotic Growth of Functions
CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll
More informationChapter 7 Methods of Finding Estimators
Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of
More informationConvexity, Inequalities, and Norms
Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for
More informationProperties of MLE: consistency, asymptotic normality. Fisher information.
Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout
More information3. Covariance and Correlation
Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics
More informationNPTEL STRUCTURAL RELIABILITY
NPTEL Course O STRUCTURAL RELIABILITY Module # 0 Lecture 1 Course Format: Web Istructor: Dr. Aruasis Chakraborty Departmet of Civil Egieerig Idia Istitute of Techology Guwahati 1. Lecture 01: Basic Statistics
More informationIn nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008
I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces
More informationTaking DCOP to the Real World: Efficient Complete Solutions for Distributed MultiEvent Scheduling
Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed MultiEvet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria
More informationModule 4: Mathematical Induction
Module 4: Mathematical Iductio Theme 1: Priciple of Mathematical Iductio Mathematical iductio is used to prove statemets about atural umbers. As studets may remember, we ca write such a statemet as a predicate
More informationORDERS OF GROWTH KEITH CONRAD
ORDERS OF GROWTH KEITH CONRAD Itroductio Gaiig a ituitive feel for the relative growth of fuctios is importat if you really wat to uderstad their behavior It also helps you better grasp topics i calculus
More informationAMS 2000 subject classification. Primary 62G08, 62G20; secondary 62G99
VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS Jia Huag 1, Joel L. Horowitz 2 ad Fegrog Wei 3 1 Uiversity of Iowa, 2 Northwester Uiversity ad 3 Uiversity of West Georgia Abstract We cosider a oparametric
More information7. Sample Covariance and Correlation
1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y
More informationTHE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n
We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample
More informationLecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)
18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the BruMikowski iequality for boxes. Today we ll go over the
More information1 The Binomial Theorem: Another Approach
The Biomial Theorem: Aother Approach Pascal s Triagle I class (ad i our text we saw that, for iteger, the biomial theorem ca be stated (a + b = c a + c a b + c a b + + c ab + c b, where the coefficiets
More informationSAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx
SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval
More informationDiscrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may
More informationOutput Analysis (2, Chapters 10 &11 Law)
B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should
More informationModified Line Search Method for Global Optimization
Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o
More informationHIGHDIMENSIONAL REGRESSION WITH NOISY AND MISSING DATA: PROVABLE GUARANTEES WITH NONCONVEXITY
The Aals of Statistics 2012, Vol. 40, No. 3, 1637 1664 DOI: 10.1214/12AOS1018 Istitute of Mathematical Statistics, 2012 HIGHDIMENSIONAL REGRESSION WITH NOISY AND MISSING DATA: PROVABLE GUARANTEES WITH
More informationConfidence Intervals for One Mean with Tolerance Probability
Chapter 421 Cofidece Itervals for Oe Mea with Tolerace Probability Itroductio This procedure calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) with
More informationSequences and Series
CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their
More informationCME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8
CME 30: NUMERICAL LINEAR ALGEBRA FALL 005/06 LECTURE 8 GENE H GOLUB 1 Positive Defiite Matrices A matrix A is positive defiite if x Ax > 0 for all ozero x A positive defiite matrix has real ad positive
More information4.1 Sigma Notation and Riemann Sums
0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas
More informationLecture 4: Cauchy sequences, BolzanoWeierstrass, and the Squeeze theorem
Lecture 4: Cauchy sequeces, BolzaoWeierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits
More informationThe Field of Complex Numbers
The Field of Complex Numbers S. F. Ellermeyer The costructio of the system of complex umbers begis by appedig to the system of real umbers a umber which we call i with the property that i = 1. (Note that
More informationVladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT
Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee
More informationSoving Recurrence Relations
Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree
More information5 Boolean Decision Trees (February 11)
5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected
More informationSection IV.5: Recurrence Relations from Algorithms
Sectio IV.5: Recurrece Relatios from Algorithms Give a recursive algorithm with iput size, we wish to fid a Θ (best big O) estimate for its ru time T() either by obtaiig a explicit formula for T() or by
More informationDefinition. Definition. 72 Estimating a Population Proportion. Definition. Definition
7 stimatig a Populatio Proportio I this sectio we preset methods for usig a sample proportio to estimate the value of a populatio proportio. The sample proportio is the best poit estimate of the populatio
More informationB1. Fourier Analysis of Discrete Time Signals
B. Fourier Aalysis of Discrete Time Sigals Objectives Itroduce discrete time periodic sigals Defie the Discrete Fourier Series (DFS) expasio of periodic sigals Defie the Discrete Fourier Trasform (DFT)
More informationNonlife insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring
Nolife isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy
More informationAnalyzing Longitudinal Data from Complex Surveys Using SUDAAN
Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical
More informationMath Discrete Math Combinatorics MULTIPLICATION PRINCIPLE:
Math 355  Discrete Math 4.14.4 Combiatorics Notes MULTIPLICATION PRINCIPLE: If there m ways to do somethig ad ways to do aother thig the there are m ways to do both. I the laguage of set theory: Let
More information1 Computing the Standard Deviation of Sample Means
Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.
More informationAlternatives To Pearson s and Spearman s Correlation Coefficients
Alteratives To Pearso s ad Spearma s Correlatio Coefficiets Floreti Smaradache Chair of Math & Scieces Departmet Uiversity of New Mexico Gallup, NM 8730, USA Abstract. This article presets several alteratives
More informationIncremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
More informationChapter Gaussian Elimination
Chapter 04.06 Gaussia Elimiatio After readig this chapter, you should be able to:. solve a set of simultaeous liear equatios usig Naïve Gauss elimiatio,. lear the pitfalls of the Naïve Gauss elimiatio
More informationMARTINGALES AND A BASIC APPLICATION
MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measuretheoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this
More information{{1}, {2, 4}, {3}} {{1, 3, 4}, {2}} {{1}, {2}, {3, 4}} 5.4 Stirling Numbers
. Stirlig Numbers Whe coutig various types of fuctios from., we quicly discovered that eumeratig the umber of oto fuctios was a difficult problem. For a domai of five elemets ad a rage of four elemets,
More informationClass Meeting # 16: The Fourier Transform on R n
MATH 18.152 COUSE NOTES  CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,
More informationLECTURE 13: Crossvalidation
LECTURE 3: Crossvalidatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Threeway data partitioi Itroductio to Patter Aalysis Ricardo GutierrezOsua Texas A&M
More informationSequences II. Chapter 3. 3.1 Convergent Sequences
Chapter 3 Sequeces II 3. Coverget Sequeces Plot a graph of the sequece a ) = 2, 3 2, 4 3, 5 + 4,...,,... To what limit do you thik this sequece teds? What ca you say about the sequece a )? For ǫ = 0.,
More informationChapter 6: Variance, the law of large numbers and the MonteCarlo method
Chapter 6: Variace, the law of large umbers ad the MoteCarlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value
More informationLecture Notes CMSC 251
We have this messy summatio to solve though First observe that the value remais costat throughout the sum, ad so we ca pull it out frot Also ote that we ca write 3 i / i ad (3/) i T () = log 3 (log ) 1
More informationA probabilistic proof of a binomial identity
A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two
More informationCase Study. Normal and t Distributions. Density Plot. Normal Distributions
Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca
More informationReview for College Algebra Final Exam
Review for College Algebra Fial Exam (Please remember that half of the fial exam will cover chapters 14. This review sheet covers oly the ew material, from chapters 5 ad 7.) 5.1 Systems of equatios i
More informationConfidence Intervals for One Mean
Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a
More informationChapter 5: Inner Product Spaces
Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples
More informationMeasures of Spread and Boxplots Discrete Math, Section 9.4
Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,
More informationNormal Distribution.
Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued
More information8.5 Alternating infinite series
65 8.5 Alteratig ifiite series I the previous two sectios we cosidered oly series with positive terms. I this sectio we cosider series with both positive ad egative terms which alterate: positive, egative,
More informationLinear Algebra II. 4 Determinants. Notes 4 1st November Definition of determinant
MTH6140 Liear Algebra II Notes 4 1st November 2010 4 Determiats The determiat is a fuctio defied o square matrices; its value is a scalar. It has some very importat properties: perhaps most importat is
More informationif A S, then X \ A S, and if (A n ) n is a sequence of sets in S, then n A n S,
Lecture 5: Borel Sets Topologically, the Borel sets i a topological space are the σalgebra geerated by the ope sets. Oe ca build up the Borel sets from the ope sets by iteratig the operatios of complemetatio
More information1 Introduction to reducing variance in Monte Carlo simulations
Copyright c 007 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a uow mea µ = E(X) of a distributio by
More informationA Gentle Introduction to Algorithms: Part II
A Getle Itroductio to Algorithms: Part II Cotets of Part I:. Merge: (to merge two sorted lists ito a sigle sorted list.) 2. Bubble Sort 3. Merge Sort: 4. The BigO, BigΘ, BigΩ otatios: asymptotic bouds
More informationWeek 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable
Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5
More informationHypothesis Tests Applied to Means
The Samplig Distributio of the Mea Hypothesis Tests Applied to Meas Recall that the samplig distributio of the mea is the distributio of sample meas that would be obtaied from a particular populatio (with
More informationrepresented by 4! different arrangements of boxes, divide by 4! to get ways
Problem Set #6 solutios A juggler colors idetical jugglig balls red, white, ad blue (a I how may ways ca this be doe if each color is used at least oce? Let us preemptively color oe ball i each color,
More informationLecture 7: Borel Sets and Lebesgue Measure
EE50: Probability Foudatios for Electrical Egieers JulyNovember 205 Lecture 7: Borel Sets ad Lebesgue Measure Lecturer: Dr. Krisha Jagaatha Scribes: Ravi Kolla, Aseem Sharma, Vishakh Hegde I this lecture,
More informationHere are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.
This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio
More informationSubject CT5 Contingencies Core Technical Syllabus
Subject CT5 Cotigecies Core Techical Syllabus for the 2015 exams 1 Jue 2014 Aim The aim of the Cotigecies subject is to provide a groudig i the mathematical techiques which ca be used to model ad value
More informationπ d i (b i z) (n 1)π )... sin(θ + )
SOME TRIGONOMETRIC IDENTITIES RELATED TO EXACT COVERS Joh Beebee Uiversity of Alaska, Achorage Jauary 18, 1990 Sherma K Stei proves that if si π = k si π b where i the b i are itegers, the are positive
More informationwhere: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return
EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The
More informationMaximum Likelihood Estimators.
Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio
More informationCHAPTER 3 DIGITAL CODING OF SIGNALS
CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity
More informationGregory Carey, 1998 Linear Transformations & Composites  1. Linear Transformations and Linear Composites
Gregory Carey, 1998 Liear Trasformatios & Composites  1 Liear Trasformatios ad Liear Composites I Liear Trasformatios of Variables Meas ad Stadard Deviatios of Liear Trasformatios A liear trasformatio
More informationx(x 1)(x 2)... (x k + 1) = [x] k n+m 1
1 Coutig mappigs For every real x ad positive iteger k, let [x] k deote the fallig factorial ad x(x 1)(x 2)... (x k + 1) ( ) x = [x] k k k!, ( ) k = 1. 0 I the sequel, X = {x 1,..., x m }, Y = {y 1,...,
More information8.3 POLAR FORM AND DEMOIVRE S THEOREM
SECTION 8. POLAR FORM AND DEMOIVRE S THEOREM 48 8. POLAR FORM AND DEMOIVRE S THEOREM Figure 8.6 (a, b) b r a 0 θ Complex Number: a + bi Rectagular Form: (a, b) Polar Form: (r, θ) At this poit you ca add,
More informationThe second difference is the sequence of differences of the first difference sequence, 2
Differece Equatios I differetial equatios, you look for a fuctio that satisfies ad equatio ivolvig derivatives. I differece equatios, istead of a fuctio of a cotiuous variable (such as time), we look for
More informationMATH 361 Homework 9. Royden Royden Royden
MATH 61 Homework 9 Royde..9 First, we show that for ay subset E of the real umbers, E c + y = E + y) c traslatig the complemet is equivalet to the complemet of the traslated set). Without loss of geerality,
More informationwhen n = 1, 2, 3, 4, 5, 6, This list represents the amount of dollars you have after n days. Note: The use of is read as and so on.
Geometric eries Before we defie what is meat by a series, we eed to itroduce a related topic, that of sequeces. Formally, a sequece is a fuctio that computes a ordered list. uppose that o day 1, you have
More informationFIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix
FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. Powers of a matrix We begi with a propositio which illustrates the usefuless of the diagoalizatio. Recall that a square matrix A is diogaalizable if
More information13 Fast Fourier Transform (FFT)
13 Fast Fourier Trasform FFT) The fast Fourier trasform FFT) is a algorithm for the efficiet implemetatio of the discrete Fourier trasform. We begi our discussio oce more with the cotiuous Fourier trasform.
More informationAQA STATISTICS 1 REVISION NOTES
AQA STATISTICS 1 REVISION NOTES AVERAGES AND MEASURES OF SPREAD www.mathsbox.org.uk Mode : the most commo or most popular data value the oly average that ca be used for qualitative data ot suitable if
More informationCHAPTER 3 THE TIME VALUE OF MONEY
CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all
More informationTheorems About Power Series
Physics 6A Witer 20 Theorems About Power Series Cosider a power series, f(x) = a x, () where the a are real coefficiets ad x is a real variable. There exists a real oegative umber R, called the radius
More informationME 101 Measurement Demonstration (MD 1) DEFINITIONS Precision  A measure of agreement between repeated measurements (repeatability).
INTRODUCTION This laboratory ivestigatio ivolves makig both legth ad mass measuremets of a populatio, ad the assessig statistical parameters to describe that populatio. For example, oe may wat to determie
More informationNumerical Solution of Equations
School of Mechaical Aerospace ad Civil Egieerig Numerical Solutio of Equatios T J Craft George Begg Buildig, C4 TPFE MSc CFD Readig: J Ferziger, M Peric, Computatioal Methods for Fluid Dyamics HK Versteeg,
More informationGrade 7. Strand: Number Specific Learning Outcomes It is expected that students will:
Strad: Number Specific Learig Outcomes It is expected that studets will: 7.N.1. Determie ad explai why a umber is divisible by 2, 3, 4, 5, 6, 8, 9, or 10, ad why a umber caot be divided by 0. [C, R] [C]
More informationSECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
More informationKey Ideas Section 81: Overview hypothesis testing Hypothesis Hypothesis Test Section 82: Basics of Hypothesis Testing Null Hypothesis
Chapter 8 Key Ideas Hypothesis (Null ad Alterative), Hypothesis Test, Test Statistic, Pvalue Type I Error, Type II Error, Sigificace Level, Power Sectio 81: Overview Cofidece Itervals (Chapter 7) are
More information9.8: THE POWER OF A TEST
9.8: The Power of a Test CD91 9.8: THE POWER OF A TEST I the iitial discussio of statistical hypothesis testig, the two types of risks that are take whe decisios are made about populatio parameters based
More informationChapter 5 O A Cojecture Of Erdíos Proceedigs NCUR VIII è1994è, Vol II, pp 794í798 Jeærey F Gold Departmet of Mathematics, Departmet of Physics Uiversity of Utah Do H Tucker Departmet of Mathematics Uiversity
More information1.3 Binomial Coefficients
18 CHAPTER 1. COUNTING 1. Biomial Coefficiets I this sectio, we will explore various properties of biomial coefficiets. Pascal s Triagle Table 1 cotais the values of the biomial coefficiets ( ) for 0to
More information428 CHAPTER 12 MULTIPLE LINEAR REGRESSION
48 CHAPTER 1 MULTIPLE LINEAR REGRESSION Table 18 Team Wis Pts GF GA PPG PPcT SHG PPGA PKPcT SHGA Chicago 47 104 338 68 86 7. 4 71 76.6 6 Miesota 40 96 31 90 91 6.4 17 67 80.7 0 Toroto 8 68 3 330 79.3
More information0,1 is an accumulation
Sectio 5.4 1 Accumulatio Poits Sectio 5.4 BolzaoWeierstrass ad HeieBorel Theorems Purpose of Sectio: To itroduce the cocept of a accumulatio poit of a set, ad state ad prove two major theorems of real
More informationStat 104 Lecture 16. Statistics 104 Lecture 16 (IPS 6.1) Confidence intervals  the general concept
Statistics 104 Lecture 16 (IPS 6.1) Outlie for today Cofidece itervals Cofidece itervals for a mea, µ (kow σ) Cofidece itervals for a proportio, p Margi of error ad sample size Review of mai topics for
More informationEngineering 323 Beautiful Homework Set 3 1 of 7 Kuszmar Problem 2.51
Egieerig 33 eautiful Homewor et 3 of 7 Kuszmar roblem.5.5 large departmet store sells sport shirts i three sizes small, medium, ad large, three patters plaid, prit, ad stripe, ad two sleeve legths log
More informationChapter 7  Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:
Chapter 7  Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries
More informationHypothesis testing. Null and alternative hypotheses
Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate
More information5: Introduction to Estimation
5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample
More informationTrigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is
0_0605.qxd /5/05 0:45 AM Page 470 470 Chapter 6 Additioal Topics i Trigoometry 6.5 Trigoometric Form of a Complex Number What you should lear Plot complex umbers i the complex plae ad fid absolute values
More informationDivide and Conquer. Maximum/minimum. Integer Multiplication. CS125 Lecture 4 Fall 2015
CS125 Lecture 4 Fall 2015 Divide ad Coquer We have see oe geeral paradigm for fidig algorithms: the greedy approach. We ow cosider aother geeral paradigm, kow as divide ad coquer. We have already see a
More information6 Algorithm analysis
6 Algorithm aalysis Geerally, a algorithm has three cases Best case Average case Worse case. To demostrate, let us cosider the a really simple search algorithm which searches for k i the set A{a 1 a...
More information