The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs


Journal of Machine Learning Research. Submitted 3/09; Revised 5/09; Published 10/09

Han Liu, John Lafferty, Larry Wasserman
School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA

Editor: Martin J. Wainwright

Abstract

Recent methods for estimating sparse undirected graphs for real-valued data in high dimensional problems rely heavily on the assumption of normality. We show how to use a semiparametric Gaussian copula, or "nonparanormal", for high dimensional inference. Just as additive models extend linear models by replacing linear functions with a set of one-dimensional smooth functions, the nonparanormal extends the normal by transforming the variables by smooth functions. We derive a method for estimating the nonparanormal, study the method's theoretical properties, and show that it works well in many examples.

Keywords: graphical models, Gaussian copula, high dimensional inference, sparsity, $\ell_1$ regularization, graphical lasso, paranormal, occult

1. Introduction

The linear model is a mainstay of statistical inference that has been extended in several important ways. An extension to high dimensions was achieved by adding a sparsity constraint, leading to the lasso (Tibshirani, 1996). An extension to nonparametric models was achieved by replacing linear functions with smooth functions, leading to additive models (Hastie and Tibshirani, 1999). These two ideas were recently combined, leading to an extension called sparse additive models (SpAM) (Ravikumar et al., 2008, 2009a). In this paper we consider a similar nonparametric extension of undirected graphical models based on multivariate Gaussian distributions in the high dimensional setting. Specifically, we use a high dimensional Gaussian copula with nonparametric marginals, which we refer to as a nonparanormal distribution.
If $X$ is a $p$-dimensional random vector distributed according to a multivariate Gaussian distribution with covariance matrix $\Sigma$, the conditional independence relations between the random variables $X_1, X_2, \ldots, X_p$ are encoded in a graph formed from the precision matrix $\Omega = \Sigma^{-1}$. Specifically, missing edges in the graph correspond to zeroes of $\Omega$. To estimate the graph from a sample of size $n$, it is only necessary to estimate $\Sigma$, which is easy if $n$ is much larger than $p$. However, when $p$ is larger than $n$, the problem is more challenging. Recent work has focused on the problem of estimating the graph in this high dimensional setting, which becomes feasible if $G$ is sparse. Yuan and Lin (2007) and Banerjee et al. (2008) propose an estimator based on regularized maximum likelihood using an $\ell_1$ constraint on the entries of $\Omega$, and Friedman et al. (2007) develop an efficient algorithm for computing the estimator using a graphical version of the lasso. The resulting estimation procedure has excellent theoretical properties, as shown recently by Rothman et al. (2008) and Ravikumar et al. (2009b).

While Gaussian graphical models can be useful, a reliance on exact normality is limiting. Our goal in this paper is to weaken this assumption. Our approach parallels the ideas behind sparse additive models for regression (Ravikumar et al., 2008, 2009a). Specifically, we replace the Gaussian with a semiparametric Gaussian copula. This means that we replace the random variable $X = (X_1, \ldots, X_p)$ by the transformed random variable $f(X) = (f_1(X_1), \ldots, f_p(X_p))$, and assume that $f(X)$ is multivariate Gaussian. This semiparametric copula results in a nonparametric extension of the normal that we call the nonparanormal distribution. The nonparanormal depends on the functions $\{f_j\}$, and a mean $\mu$ and covariance matrix $\Sigma$, all of which are to be estimated from data. While the resulting family of distributions is much richer than the standard parametric normal (the paranormal), the independence relations among the variables are still encoded in the precision matrix $\Omega = \Sigma^{-1}$. We propose a nonparametric estimator for the functions $\{f_j\}$, and show how the graphical lasso can be used to estimate the graph in the high dimensional setting.

Assumptions      Dimension   Regression               Graphical Models
parametric       low         linear model             multivariate normal
parametric       high        lasso                    graphical lasso
nonparametric    low         additive model           nonparanormal
nonparametric    high        sparse additive model    $\ell_1$-regularized nonparanormal

Figure 1: Comparison of regression and graphical models. The nonparanormal extends additive models to the graphical model setting. Regularizing the inverse covariance leads to an extension to high dimensions, which parallels sparse additive models for regression.

© 2009 Han Liu, John Lafferty and Larry Wasserman.
The relationship between linear regression models, Gaussian graphical models, and their extensions to nonparametric and high dimensional models is summarized in Figure 1.

Most theoretical results on semiparametric copulas focus on low or at least finite dimensional models (Klaassen and Wellner, 1997; Tsukahara, 2005). Models with increasing dimension require a more delicate analysis; in particular, simply plugging in the usual empirical distribution of the marginals does not lead to accurate inference. Instead we use a truncated empirical distribution. We give a theoretical analysis of this estimator, proving consistency results with respect to risk, model selection, and estimation of $\Omega$ in the Frobenius norm.

In the following section we review the basic notion of the graph corresponding to a multivariate Gaussian, and formulate different criteria for evaluating estimators of the covariance or inverse covariance. In Section 3 we present the nonparanormal, and in Section 4 we discuss estimation of the model. We present a theoretical analysis of the estimation method in Section 5, with the detailed proofs collected in an appendix. In Section 6 we present experiments with both simulated data and gene microarray data, where the problem is to construct the isoprenoid biosynthetic pathway.
2. Estimating Undirected Graphs

Let $X = (X_1, \ldots, X_p)$ denote a random vector with distribution $P = N(\mu, \Sigma)$. The undirected graph $G = (V, E)$ corresponding to $P$ consists of a vertex set $V$ and an edge set $E$. The set $V$ has $p$ elements, one for each component of $X$. The edge set $E$ consists of ordered pairs $(i, j)$ where $(i, j) \in E$ if there is an edge between $X_i$ and $X_j$. The edge between $(i, j)$ is excluded from $E$ if and only if $X_i$ is independent of $X_j$ given the other variables $X_{\setminus\{i,j\}} \equiv (X_s : 1 \le s \le p,\ s \ne i, j)$, written

$$X_i \perp\!\!\!\perp X_j \mid X_{\setminus\{i,j\}}. \tag{1}$$

It is well known that, for multivariate Gaussian distributions, (1) holds if and only if $\Omega_{ij} = 0$ where $\Omega = \Sigma^{-1}$.

Let $X^{(1)}, X^{(2)}, \ldots, X^{(n)}$ be a random sample from $P$, where $X^{(i)} \in \mathbb{R}^p$. If $n$ is much larger than $p$, then we can estimate $\Sigma$ using maximum likelihood, leading to the estimate $\hat\Omega = S^{-1}$, where

$$S = \frac{1}{n} \sum_{i=1}^{n} \left(X^{(i)} - \bar X\right)\left(X^{(i)} - \bar X\right)^T$$

is the sample covariance, with $\bar X$ the sample mean. The zeroes of $\Omega$ can then be estimated by applying hypothesis testing to $\hat\Omega$ (Drton and Perlman, 2007, 2008). When $p > n$, maximum likelihood is no longer useful; in particular, the estimate $\hat\Sigma$ is not positive definite, having rank no greater than $n$. Inspired by the success of the lasso for linear models, several authors have suggested estimating $\Sigma$ by minimizing

$$-\ell(\Omega) + \lambda \sum_{j \ne k} |\Omega_{jk}|$$

where

$$\ell(\Omega) = \frac{1}{2}\left(\log|\Omega| - \mathrm{tr}(\Omega S) - p \log(2\pi)\right)$$

is the log-likelihood with $S$ the sample covariance matrix. The estimator $\hat\Omega_n$ can be computed efficiently using the glasso algorithm (Friedman et al., 2007), which is a block coordinate descent algorithm that uses the standard lasso to estimate a single row and column of $\Omega$ in each iteration. Under appropriate sparsity conditions, the resulting estimator $\hat\Omega_n$ has been shown to have good theoretical properties (Rothman et al., 2008; Ravikumar et al., 2009b).

There are several different ways to judge the quality of an estimator $\hat\Sigma$ of the covariance or $\hat\Omega$ of the inverse covariance. We discuss three in this paper: persistency, norm consistency, and sparsistency. Persistency means consistency in risk, when the model is not necessarily assumed to be correct.
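Suppose the true distribution $P$ has mean $\mu_0$, and that we use a multivariate normal $p(x; \mu_0, \Sigma)$ for prediction; we do not assume that $P$ is normal. We observe a new vector $X \sim P$ and define the prediction risk to be

$$R(\Sigma) = -E \log p(X; \mu_0, \Sigma) = -\int \log p(x; \mu_0, \Sigma)\, dP(x).$$

It follows that

$$R(\Sigma) = \frac{1}{2}\left\{\mathrm{tr}\left(\Sigma^{-1}\Sigma_0\right) + \log|\Sigma| + p\log(2\pi)\right\}$$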
where $\Sigma_0$ is the covariance of $X$ under $P$. If $\mathcal{S}$ is a set of covariance matrices, the oracle is defined to be the covariance matrix $\Sigma^*$ that minimizes $R(\Sigma)$ over $\mathcal{S}$:

$$\Sigma^* = \arg\min_{\Sigma \in \mathcal{S}} R(\Sigma).$$

Thus $p(x; \mu_0, \Sigma^*)$ is the best predictor of a new observation among all distributions in $\{p(x; \mu_0, \Sigma) : \Sigma \in \mathcal{S}\}$. In particular, if $\mathcal{S}$ consists of covariance matrices with sparse graphs, then $p(x; \mu_0, \Sigma^*)$ is, in some sense, the best sparse predictor. An estimator $\hat\Sigma_n$ is persistent if

$$R(\hat\Sigma_n) - R(\Sigma^*) \xrightarrow{P} 0$$

as the sample size $n$ increases to infinity. Thus, a persistent estimator approximates the best estimator over the class $\mathcal{S}$, but we do not assume that the true distribution has a covariance matrix in $\mathcal{S}$, or even that it is Gaussian. Moreover, we allow the dimension $p = p_n$ to increase with $n$. On the other hand, norm consistency and sparsistency require that the true distribution is Gaussian. In this case, let $\Sigma_0$ denote the true covariance matrix. An estimator is norm consistent if

$$\|\hat\Sigma_n - \Sigma_0\| \xrightarrow{P} 0$$

where $\|\cdot\|$ is a norm. If $E(\Omega)$ denotes the edge set corresponding to $\Omega$, an estimator is sparsistent if

$$P\left(E(\Omega_0) \ne E(\hat\Omega_n)\right) \to 0.$$

Thus, a sparsistent estimator identifies the correct graph consistently. We present our theoretical analysis of these properties of the nonparanormal in Section 5.

3. The Nonparanormal

We say that a random vector $X = (X_1, \ldots, X_p)^T$ has a nonparanormal distribution if there exist functions $\{f_j\}_{j=1}^{p}$ such that $Z \equiv f(X) \sim N(\mu, \Sigma)$, where $f(X) = (f_1(X_1), \ldots, f_p(X_p))$. We then write

$$X \sim NPN(\mu, \Sigma, f).$$

When the $f_j$'s are monotone and differentiable, the joint probability density function of $X$ is given by

$$p_X(x) = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}\left(f(x) - \mu\right)^T \Sigma^{-1} \left(f(x) - \mu\right)\right\} \prod_{j=1}^{p} \left|f_j'(x_j)\right|. \tag{2}$$

Lemma 1 The nonparanormal distribution $NPN(\mu, \Sigma, f)$ is a Gaussian copula when the $f_j$'s are monotone and differentiable.

Proof By Sklar's theorem (Sklar, 1959), any joint distribution can be written as

$$F(x_1, \ldots, x_p) = C\{F_1(x_1), \ldots, F_p(x_p)\}$$

where the function $C$ is called a copula. For the nonparanormal we have

$$F(x_1, \ldots, x_p) = \Phi_{\mu,\Sigma}\left(\Phi^{-1}(F_1(x_1)), \ldots, \Phi^{-1}(F_p(x_p))\right)$$
where $\Phi_{\mu,\Sigma}$ is the multivariate Gaussian cdf and $\Phi$ is the univariate standard Gaussian cdf. Thus, the corresponding copula is

$$C(u_1, \ldots, u_p) = \Phi_{\mu,\Sigma}\left(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_p)\right).$$

This is exactly a Gaussian copula with parameters $\mu$ and $\Sigma$. If each $f_j$ is differentiable then the density of $X$ has the same form as (2).

Note that the density in (2) is not identifiable; to make the family identifiable we demand that $f_j$ preserve means and variances:

$$\mu_j = E(Z_j) = E(X_j) \quad \text{and} \quad \sigma_j^2 \equiv \Sigma_{jj} = \mathrm{Var}(Z_j) = \mathrm{Var}(X_j). \tag{3}$$

Note that these conditions only depend on $\mathrm{diag}(\Sigma)$ but not the full covariance matrix. Let $F_j(x)$ denote the marginal distribution function of $X_j$. Then

$$F_j(x) = P(X_j \le x) = P(Z_j \le f_j(x)) = \Phi\left(\frac{f_j(x) - \mu_j}{\sigma_j}\right)$$

which implies that

$$f_j(x) = \mu_j + \sigma_j \Phi^{-1}(F_j(x)). \tag{4}$$

The following basic fact says that the independence graph of the nonparanormal is encoded in $\Omega = \Sigma^{-1}$, as for the parametric normal.

Lemma 2 If $X \sim NPN(\mu, \Sigma, f)$ is nonparanormal and each $f_j$ is differentiable, then $X_i \perp\!\!\!\perp X_j \mid X_{\setminus\{i,j\}}$ if and only if $\Omega_{ij} = 0$, where $\Omega = \Sigma^{-1}$.

Proof From the form of the density (2), it follows that the density factors with respect to the graph of $\Omega$, and therefore obeys the global Markov property of the graph.

Next we show that the above is true for any choice of identification restrictions.

Lemma 3 Define

$$h_j(x) = \Phi^{-1}(F_j(x)) \tag{5}$$

and let $\Lambda$ be the covariance matrix of $h(X)$. Then $X_j \perp\!\!\!\perp X_k \mid X_{\setminus\{j,k\}}$ if and only if $\Lambda^{-1}_{jk} = 0$.

Proof We can rewrite the covariance matrix as

$$\Sigma_{jk} = \mathrm{Cov}(Z_j, Z_k) = \sigma_j \sigma_k \mathrm{Cov}\left(h_j(X_j), h_k(X_k)\right).$$

Hence $\Sigma = D \Lambda D$ and

$$\Sigma^{-1} = D^{-1} \Lambda^{-1} D^{-1},$$

where $D$ is the diagonal matrix with $\mathrm{diag}(D) = \sigma$. The zero pattern of $\Lambda^{-1}$ is therefore identical to the zero pattern of $\Sigma^{-1}$.
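As a small illustration of equation (4), the sketch below evaluates $f_j(x) = \mu_j + \sigma_j \Phi^{-1}(F_j(x))$ for a marginal whose distribution function is known in closed form. The Exponential marginal and the names `exp_cdf` and `f_j` are assumptions made for this example only; they do not appear in the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian; provides cdf and inv_cdf


def exp_cdf(x: float, rate: float = 1.0) -> float:
    """Marginal cdf F_j of an Exponential(rate) variable (illustrative choice)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def f_j(x: float, rate: float = 1.0) -> float:
    """Equation (4) for x > 0: f_j(x) = mu_j + sigma_j * Phi^{-1}(F_j(x))."""
    mu_j = 1.0 / rate      # mean of Exponential(rate)
    sigma_j = 1.0 / rate   # standard deviation of Exponential(rate)
    return mu_j + sigma_j * STD_NORMAL.inv_cdf(exp_cdf(x, rate))


if __name__ == "__main__":
    # At the median F_j(x) = 1/2, so Phi^{-1} vanishes and f_j returns mu_j.
    print(f_j(math.log(2.0)))
    # f_j is monotone increasing, as the density formula (2) requires.
    print(f_j(0.5) < f_j(1.0) < f_j(2.0))
```

By construction $Z_j = f_j(X_j)$ is $N(\mu_j, \sigma_j^2)$, so the Gaussian cdf evaluated at $f_j(x)$ reproduces $F_j(x)$.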
Thus, it is not necessary to estimate $\mu$ or $\sigma$ to estimate the graph.

Figure 2 shows three examples of 2-dimensional nonparanormal densities. In each case, the component functions $f_j(x)$ take the form

$$f_j(x) = a_j\, \mathrm{sign}(x)\, |x|^{\alpha_j} + b_j$$

where the constants $a_j$ and $b_j$ are set to enforce the identifiability constraints (3). The covariance in each case is $\Sigma = \begin{pmatrix} 1 & .5 \\ .5 & 1 \end{pmatrix}$ and the mean is $\mu = (0, 0)$. The exponent $\alpha_j$ determines the nonlinearity. It can be seen how the concavity of the density changes with the exponent $\alpha$, and that $\alpha > 1$ can result in multiple modes.

Figure 2: Densities of three 2-dimensional nonparanormals. The component functions have the form $f_j(x) = \mathrm{sign}(x)|x|^{\alpha_j}$. Left: $\alpha_1 = 0.9$, $\alpha_2 = 0.8$; center: $\alpha_1 = 1.2$, $\alpha_2 = 0.8$; right: $\alpha_1 = 2$, $\alpha_2 = 3$. In each case $\mu = (0, 0)$ and $\Sigma = \begin{pmatrix} 1 & .5 \\ .5 & 1 \end{pmatrix}$.

The assumption that $f(X) = (f_1(X_1), \ldots, f_p(X_p))$ is normal leads to a semiparametric model where only one-dimensional functions need to be estimated. But the monotonicity of the functions $f_j$, which map onto $\mathbb{R}$, enables computational tractability of the nonparanormal. For more general functions $f$, the normalizing constant for the density

$$p_X(x) \propto \exp\left\{-\frac{1}{2}\left(f(x) - \mu\right)^T \Sigma^{-1} \left(f(x) - \mu\right)\right\}$$

cannot be computed in closed form.
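4. Estimation Method

Let $X^{(1)}, \ldots, X^{(n)}$ be a sample of size $n$ where $X^{(i)} = (X_1^{(i)}, \ldots, X_p^{(i)})^T \in \mathbb{R}^p$. In light of (5) we define

$$\hat h_j(x) = \Phi^{-1}(\tilde F_j(x))$$

where $\tilde F_j$ is an estimator of $F_j$. A natural candidate for $\tilde F_j$ is the marginal empirical distribution function

$$\hat F_j(t) \equiv \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left\{X_j^{(i)} \le t\right\}.$$

Now, let $\theta$ denote the parameters of the copula. Tsukahara (2005) suggests taking $\hat\theta$ to be the solution of

$$\sum_{i=1}^{n} \phi\left(\tilde F_1(X_1^{(i)}), \ldots, \tilde F_p(X_p^{(i)}), \theta\right) = 0$$

where $\phi$ is an estimating equation and $\tilde F_j(t) = n \hat F_j(t)/(n+1)$. In our case, $\theta$ corresponds to the covariance matrix. The resulting estimator $\hat\theta$, called a rank approximate Z-estimator, has excellent theoretical properties. However, we are interested in the high dimensional scenario where the dimension $p$ is allowed to increase with $n$; the variance of $\hat F_j(t)$ is too large in this case. Instead, we use the following truncated or Winsorized [1] estimator:

$$\tilde F_j(x) = \begin{cases} \delta_n & \text{if } \hat F_j(x) < \delta_n \\ \hat F_j(x) & \text{if } \delta_n \le \hat F_j(x) \le 1 - \delta_n \\ 1 - \delta_n & \text{if } \hat F_j(x) > 1 - \delta_n, \end{cases} \tag{6}$$

where $\delta_n$ is a truncation parameter. Clearly, there is a bias-variance tradeoff in choosing $\delta_n$. Essentially the same estimator with $\delta_n = 1/n$ is studied by Klaassen and Wellner (1997) in the case of the bivariate Gaussian copula. In what follows we use

$$\delta_n \equiv \frac{1}{4 n^{1/4} \sqrt{\pi \log n}}.$$

This provides the right balance so that we can achieve the desired rate of convergence in our estimate of $\Omega$ and the associated undirected graph $G$ in the high dimensional setting.

Given this estimate of the distribution of variable $X_j$, we then estimate the transformation function $f_j$ by

$$\tilde f_j(x) \equiv \hat\mu_j + \hat\sigma_j \tilde h_j(x) \tag{7}$$

where $\tilde h_j(x) = \Phi^{-1}(\tilde F_j(x))$ and $\hat\mu_j$ and $\hat\sigma_j$ are the sample mean and the standard deviation:

$$\hat\mu_j \equiv \frac{1}{n} \sum_{i=1}^{n} X_j^{(i)} \quad \text{and} \quad \hat\sigma_j = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(X_j^{(i)} - \hat\mu_j\right)^2}.$$

[1] After Charles P. Winsor, whom John Tukey credited with converting him from topology to statistics (Mallows, 1990).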
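The Winsorized estimator (6) and the transformation estimate (7) can be sketched for a single variable as follows. This is a minimal standard-library illustration, not the authors' implementation; the function names `winsorized_normal_scores` and `estimate_transformation` are hypothetical, and the sample is assumed to contain at least two observations so that $\log n > 0$.

```python
import bisect
import math
from statistics import NormalDist


def winsorized_normal_scores(column):
    """Return Phi^{-1}(F~_j(x)) at each sample point of one variable."""
    n = len(column)
    # Truncation level delta_n = 1 / (4 n^{1/4} sqrt(pi log n)); needs n >= 2.
    delta = 1.0 / (4.0 * n ** 0.25 * math.sqrt(math.pi * math.log(n)))
    ordered = sorted(column)
    inv_cdf = NormalDist().inv_cdf
    scores = []
    for x in column:
        f_hat = bisect.bisect_right(ordered, x) / n      # empirical cdf at x
        f_tilde = min(max(f_hat, delta), 1.0 - delta)    # Winsorize, eq. (6)
        scores.append(inv_cdf(f_tilde))
    return scores


def estimate_transformation(column):
    """Evaluate f~_j = mu^_j + sigma^_j * h~_j at the sample points, eq. (7)."""
    n = len(column)
    mu = sum(column) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in column) / n)
    return [mu + sigma * h for h in winsorized_normal_scores(column)]


if __name__ == "__main__":
    sample = [0.1, 5.0, 2.0, 100.0, 3.0]
    print(winsorized_normal_scores(sample))
    print(estimate_transformation(sample))
```

Without the truncation, the largest observation would have empirical cdf value 1 and an infinite normal score; clipping at $1 - \delta_n$ keeps every score finite.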
Now, let $S_n(\tilde f)$ be the sample covariance matrix of $\tilde f(X^{(1)}), \ldots, \tilde f(X^{(n)})$; that is,

$$S_n(\tilde f) \equiv \frac{1}{n} \sum_{i=1}^{n} \left(\tilde f(X^{(i)}) - \mu_n(\tilde f)\right)\left(\tilde f(X^{(i)}) - \mu_n(\tilde f)\right)^T, \qquad \mu_n(\tilde f) \equiv \frac{1}{n} \sum_{i=1}^{n} \tilde f(X^{(i)}). \tag{8}$$

We then estimate $\Omega$ using $S_n(\tilde f)$. For instance, the maximum likelihood estimator is $\hat\Omega_n^{MLE} = S_n(\tilde f)^{-1}$. The $\ell_1$-regularized estimator is

$$\hat\Omega_n = \arg\min_{\Omega} \left\{\mathrm{tr}\left(\Omega S_n(\tilde f)\right) - \log|\Omega| + \lambda \|\Omega\|_1\right\} \tag{9}$$

where $\lambda$ is a regularization parameter, and $\|\Omega\|_1 = \sum_{j \ne k} |\Omega_{jk}|$. The estimated graph is then $\hat E_n = \{(j, k) : \hat\Omega_n(j, k) \ne 0\}$.

The nonparanormal is analogous to a sparse additive regression model (Ravikumar et al., 2009a), in the sense that both methods transform the variables by univariate functions. However, while sparse additive models use a regularized risk criterion to fit univariate transformations, our nonparanormal estimator uses a two-step procedure:

1. Replace the observations, for each variable, by their respective normal scores, subject to a Winsorized truncation.
2. Apply the graphical lasso to the transformed data to estimate the undirected graph.

The first step is non-iterative and computationally efficient, with no tuning parameters; it also makes the nonparanormal amenable to theoretical analysis. Starting with the model in (2), another possibility would be to parametrize each $f_j$ according to some parametric class of monotone functions such as the Box-Cox family, and then find the maximum likelihood estimates of $(\Omega, f_1, \ldots, f_p)$ in that class. This might lead to estimates of $f_j$ that depend on $\Omega$, and vice versa, and the estimation problem would not in general be convex. Alternatively, due to (4), the marginal information could be used to estimate the parameters. Our nonparametric approach to estimating the transformations has the advantages of making few assumptions and being easy to compute. In the following section we analyze the theoretical properties of this estimator.

5. Theoretical Results

In this section we present our theoretical results on risk consistency, model selection consistency, and norm consistency of the covariance $\Sigma$ and inverse covariance $\Omega$. From Lemma 3, the estimate of the graph does not depend on $\sigma_j$, $j \in \{1, \ldots, p\}$, and $\mu$, so we assume that $\sigma_j = 1$ and $\mu = 0$.
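The first step of the two-step procedure, followed by the sample covariance of the transformed data, might look like the sketch below. It uses the normalization $\sigma_j = 1$, $\mu = 0$, so each column is replaced directly by its Winsorized normal scores. The names `normal_scores` and `transformed_covariance` are hypothetical, only the standard library is used, and the second step (the graphical lasso) is not implemented here: the returned matrix would be handed to whichever glasso solver is available.

```python
import bisect
import math
from statistics import NormalDist


def normal_scores(column):
    """Winsorized normal scores of one variable (step 1 of the procedure)."""
    n = len(column)
    delta = 1.0 / (4.0 * n ** 0.25 * math.sqrt(math.pi * math.log(n)))
    ordered = sorted(column)
    inv_cdf = NormalDist().inv_cdf
    return [
        inv_cdf(min(max(bisect.bisect_right(ordered, x) / n, delta), 1.0 - delta))
        for x in column
    ]


def transformed_covariance(rows):
    """Sample covariance of the transformed data: n rows, each of length p."""
    n, p = len(rows), len(rows[0])
    cols = [normal_scores([row[j] for row in rows]) for j in range(p)]
    means = [sum(col) / n for col in cols]
    return [
        [
            sum((cols[j][i] - means[j]) * (cols[k][i] - means[k]) for i in range(n)) / n
            for k in range(p)
        ]
        for j in range(p)
    ]


if __name__ == "__main__":
    data = [[0.3, 1.2], [-1.0, 0.4], [2.1, 2.2], [0.0, -0.5], [1.5, 1.9], [-0.7, -1.1]]
    for row in transformed_covariance(data):
        print(row)
```

Because the scores depend on the data only through ranks, applying any strictly increasing function to a column leaves the resulting matrix unchanged, which is the practical meaning of not needing to know the $f_j$.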
Our key technical result is an analysis of the covariance of the Winsorized estimator defined in (6), (7), and (8). In particular, we show that under appropriate conditions,

$$\max_{j,k} \left|S_n(\tilde f)_{jk} - S_n(f)_{jk}\right| = o_P(1)$$

where $S_n(\tilde f)_{jk}$ denotes the $(j, k)$ entry of the matrix. This result allows us to leverage the recent analysis of Rothman et al. (2008) and Ravikumar et al. (2009b) in the Gaussian case to obtain consistency results for the nonparanormal. More precisely, our main theorem is the following.
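Theorem 4 Suppose that $p = n^{\xi}$ and let $\tilde f$ be the Winsorized estimator defined in (7) with $\delta_n = \frac{1}{4 n^{1/4} \sqrt{\pi \log n}}$. Define

$$C_M \equiv \frac{48}{\sqrt{\pi}}\left(\sqrt{2M} - 1\right)(M + 2)$$

for some $M \ge 2(\xi + 1)$. Then for any $\varepsilon \ge C_M \sqrt{\frac{\log p \log^2 n}{n^{1/2}}}$ and sufficiently large $n$, we have

$$P\left(\max_{jk} \left|S_n(\tilde f)_{jk} - S_n(f)_{jk}\right| > 2\varepsilon\right) \le \frac{1}{2\sqrt{\pi \log(np)}} + 2\exp\left(2\log p - \frac{n^{1/2}\varepsilon^2}{1232\,\pi^2 \log^2 n}\right) + 2\exp\left(2\log p - \frac{n^{1/2}}{8\pi \log n}\right) + o(1).$$

The proof of the above theorem is given in Section 7. The following corollary is immediate, and specifies the scaling of the dimension in terms of sample size.

Corollary 5 Let $M \ge \max\{5\pi, 2(\xi + 1)\}$. Then

$$P\left(\max_{jk} \left|S_n(\tilde f)_{jk} - S_n(f)_{jk}\right| > 2 C_M \sqrt{\frac{\log p \log^2 n}{n^{1/2}}}\right) = o(1).$$

Hence,

$$\max_{j,k} \left|S_n(\tilde f)_{jk} - S_n(f)_{jk}\right| = O_P\left(\sqrt{\frac{\log p \log^2 n}{n^{1/2}}}\right).$$

The following corollary yields estimation consistency in both the Frobenius norm and the $\ell_2$-operator norm. The proof follows the same arguments as the proofs of Theorem 1 and Theorem 2 from Rothman et al. (2008), replacing their Lemma 1 with our Theorem 4. For a matrix $A = (a_{ij})$, the Frobenius norm $\|\cdot\|_F$ is defined as $\|A\|_F^2 \equiv \sum_{i,j} a_{ij}^2$. The $\ell_2$-operator norm $\|\cdot\|_2$ is defined as the magnitude of the largest eigenvalue of the matrix, $\|A\|_2 \equiv \max_{\|x\|_2 = 1} \|Ax\|_2$. In the following, we write $a_n \asymp b_n$ if there are positive constants $c$ and $C$ independent of $n$ such that $c \le a_n / b_n \le C$.

Corollary 6 Suppose that the data are generated as $X^{(i)} \sim NPN(\mu_0, \Sigma_0, f_0)$, and let $\Omega_0 = \Sigma_0^{-1}$. If the regularization parameter $\lambda_n$ is chosen as

$$\lambda_n \asymp 2 C_M \sqrt{\frac{\log p \log^2 n}{n^{1/2}}}$$

where $C_M$ is defined in Theorem 4, then the nonparanormal estimator $\hat\Omega_n$ of (9) satisfies

$$\|\hat\Omega_n - \Omega_0\|_F = O_P\left(\sqrt{\frac{(s + p) \log p \log^2 n}{n^{1/2}}}\right)$$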
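and

$$\|\hat\Omega_n - \Omega_0\|_2 = O_P\left(\sqrt{\frac{s \log p \log^2 n}{n^{1/2}}}\right),$$

where

$$s \equiv \mathrm{Card}\left\{(i, j) \in \{1, \ldots, p\} \times \{1, \ldots, p\} \mid \Omega_0(i, j) \ne 0,\ i \ne j\right\}$$

is the number of nonzero off-diagonal elements of the true precision matrix.

To prove the model selection consistency result, we need further assumptions. We follow Ravikumar et al. (2009b) and let the $p^2 \times p^2$ Fisher information matrix of $\Sigma_0$ be $\Gamma \equiv \Sigma_0 \otimes \Sigma_0$, where $\otimes$ is the Kronecker matrix product, and define the support set $S$ of $\Omega_0 = \Sigma_0^{-1}$ as

$$S \equiv \left\{(i, j) \in \{1, \ldots, p\} \times \{1, \ldots, p\} \mid \Omega_0(i, j) \ne 0\right\}.$$

We use $S^c$ to denote the complement of $S$ in the set $\{1, \ldots, p\} \times \{1, \ldots, p\}$, and for any two subsets $T$ and $T'$ of $\{1, \ldots, p\} \times \{1, \ldots, p\}$, we use $\Gamma_{TT'}$ to denote the submatrix with rows and columns of $\Gamma$ indexed by $T$ and $T'$ respectively.

Assumption 1 There exists some $\alpha \in (0, 1]$ such that $\left\|\Gamma_{S^c S}\left(\Gamma_{SS}\right)^{-1}\right\|_\infty \le 1 - \alpha$.

As in Ravikumar et al. (2009b), we define two quantities $K_{\Sigma_0} \equiv \|\Sigma_0\|_\infty$ and $K_\Gamma \equiv \|(\Gamma_{SS})^{-1}\|_\infty$. Further, we define the maximum row degree as

$$d \equiv \max_{i = 1, \ldots, p} \mathrm{Card}\left\{j \in \{1, \ldots, p\} \mid \Omega_0(i, j) \ne 0\right\}.$$

Assumption 2 The quantities $K_{\Sigma_0}$ and $K_\Gamma$ are bounded, and there is a positive constant $C$ such that

$$\min_{(j,k) \in S} |\Omega_0(j, k)| \ge C \sqrt{\frac{\log^3 n}{n^{1/2}}}$$

for large enough $n$.

The proof of the following corollary uses our Theorem 4 in place of Equation 2 in the analysis of Ravikumar et al. (2009b).

Corollary 7 Suppose the regularization parameter is chosen as

$$\lambda_n \asymp 2 C_M \sqrt{\frac{\log p \log^2 n}{n^{1/2}}}$$

where $C_M$ is defined in Theorem 4. Then the nonparanormal estimator $\hat\Omega_n$ satisfies

$$P\left(\mathcal{G}(\hat\Omega_n, \Omega_0)\right) \ge 1 - o(1)$$

where $\mathcal{G}(\hat\Omega_n, \Omega_0)$ is the event

$$\left\{\mathrm{sign}\left(\hat\Omega_n(j, k)\right) = \mathrm{sign}\left(\Omega_0(j, k)\right),\ \forall\, j, k \in S\right\}.$$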
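Our persistency (risk consistency) result parallels the persistency result for additive models given in Ravikumar et al. (2009a), and allows model dimension that grows exponentially with sample size. The definition in this theorem uses the fact (from Lemma 11) that $\sup_x \Phi^{-1}(\tilde F_j(x)) \le \sqrt{2 \log n}$ when $\delta_n = \frac{1}{4 n^{1/4}\sqrt{\pi \log n}}$.

In the next theorem, we do not assume the true model is nonparanormal and define the population and sample risks as

$$R(f, \Omega) = \frac{1}{2}\left\{\mathrm{tr}\left[\Omega\, E\left(f(X) f(X)^T\right)\right] - \log|\Omega| + p\log(2\pi)\right\}$$

$$\hat R(f, \Omega) = \frac{1}{2}\left\{\mathrm{tr}\left[\Omega\, S_n(f)\right] - \log|\Omega| + p\log(2\pi)\right\}.$$

Theorem 8 Suppose that $p \le e^{n^{\xi}}$ for some $\xi < 1$, and define the classes

$$\mathcal{M}_n = \left\{f : \mathbb{R} \to \mathbb{R} : f \text{ is monotone with } \|f\|_\infty \le C\sqrt{\log n}\right\}$$

$$\mathcal{C}_n = \left\{\Omega : \|\Omega^{-1}\|_1 \le L_n\right\}.$$

Let $\hat\Omega_n$ be given by

$$\hat\Omega_n = \arg\min_{\Omega \in \mathcal{C}_n} \left\{\mathrm{tr}\left(\Omega S_n(\tilde f)\right) - \log|\Omega|\right\}.$$

Then

$$R(\tilde f_n, \hat\Omega_n) - \inf_{(f, \Omega) \in \mathcal{M}_n^p \oplus \mathcal{C}_n} R(f, \Omega) = O_P\left(L_n \sqrt{\frac{\log n}{n^{1-\xi}}}\right).$$

Hence the Winsorized estimator of $(f, \Omega)$ with $\delta_n = \frac{1}{4 n^{1/4}\sqrt{\pi \log n}}$ is persistent over $\mathcal{C}_n$ when $L_n = o\left(n^{(1-\xi)/2} / \sqrt{\log n}\right)$.

The proofs of Theorems 4 and 8 are given in Section 7.

6. Experimental Results

In this section, we report experimental results on synthetic and real data sets. We mainly compare the $\ell_1$-regularized nonparanormal and Gaussian (paranormal) models, computed using the graphical lasso algorithm (glasso) of Friedman et al. (2007). The primary conclusions are: (i) when the data are multivariate Gaussian, the performance of the two methods is comparable; (ii) when the model is correct, the nonparanormal performs much better than the graphical lasso in many cases; (iii) for a particular gene microarray data set, our method behaves differently from the graphical lasso, and may support different biological conclusions.

Note that we can reuse the glasso implementation to fit a sparse nonparanormal. In particular, after computing the Winsorized sample covariance $S_n(\tilde f)$, we pass this matrix to the glasso routine to carry out the optimization

$$\hat\Omega_n = \arg\min_{\Omega} \left\{\mathrm{tr}\left(\Omega S_n(\tilde f)\right) - \log|\Omega| + \lambda_n \|\Omega\|_1\right\}.$$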
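To make the optimization criterion concrete, the sketch below evaluates $\mathrm{tr}(\Omega S) - \log|\Omega| + \lambda\|\Omega\|_1$ for a given candidate $\Omega$. It only evaluates the objective and does not minimize it, so it is not a substitute for the glasso routine. The helper names `cholesky_log_det` and `glasso_objective` are hypothetical, and matrices are plain nested lists.

```python
import math


def cholesky_log_det(a):
    """log|A| for a symmetric positive definite matrix via Cholesky factorization."""
    p = len(a)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(p))


def glasso_objective(omega, s, lam):
    """tr(Omega S) - log|Omega| + lam * sum of |off-diagonal entries of Omega|."""
    p = len(omega)
    trace = sum(omega[i][k] * s[k][i] for i in range(p) for k in range(p))
    l1 = sum(abs(omega[j][k]) for j in range(p) for k in range(p) if j != k)
    return trace - cholesky_log_det(omega) + lam * l1


if __name__ == "__main__":
    s = [[1.0, 0.5], [0.5, 1.0]]
    s_inv = [[4.0 / 3.0, -2.0 / 3.0], [-2.0 / 3.0, 4.0 / 3.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    for lam in (0.0, 1.0):
        print(lam, glasso_objective(s_inv, s, lam), glasso_objective(identity, s, lam))
```

With $\lambda = 0$ the inverse of $S$ attains the smaller value, while a large enough $\lambda$ makes the diagonal (edgeless) candidate preferable, which is how the penalty induces sparsity in the estimated graph.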
6.1 Neighborhood Graphs

We begin by describing a procedure to generate graphs as in Meinshausen and Bühlmann (2006), with respect to which several distributions can then be defined. We generate a $p$-dimensional sparse graph $G \equiv (V, E)$ as follows: Let $V = \{1, \ldots, p\}$ correspond to variables $X = (X_1, \ldots, X_p)$. We associate each index $j$ with a point $(Y_j^{(1)}, Y_j^{(2)}) \in [0, 1]^2$ where

$$Y_1^{(k)}, \ldots, Y_p^{(k)} \sim \mathrm{Uniform}[0, 1]$$

for $k = 1, 2$. Each pair of nodes $(i, j)$ is included in the edge set $E$ with probability

$$P\left((i, j) \in E\right) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\|y_i - y_j\|^2}{2s}\right)$$

where $y_i \equiv (y_i^{(1)}, y_i^{(2)})$ is the observation of $(Y_i^{(1)}, Y_i^{(2)})$ and $\|\cdot\|$ represents the Euclidean distance. Here, $s = 0.125$ is a parameter that controls the sparsity level of the generated graph. We restrict the maximum degree of the graph to be four and build the inverse covariance matrix $\Omega_0$ according to

$$\Omega_0(i, j) = \begin{cases} 1 & \text{if } i = j \\ 0.245 & \text{if } (i, j) \in E \\ 0 & \text{otherwise,} \end{cases}$$

where the value 0.245 guarantees positive definiteness of the inverse covariance matrix.

Given $\Omega_0$, $n$ data points are sampled from

$$X^{(1)}, \ldots, X^{(n)} \sim NPN(\mu_0, \Sigma_0, f_0)$$

where $\mu_0 = (1.5, \ldots, 1.5)$ and $\Sigma_0 = \Omega_0^{-1}$. For simplicity, the transformation functions for all dimensions are the same, $f_1 = \ldots = f_p = f$. To sample data from the nonparanormal distribution, we also require $g \equiv f^{-1}$; two different transformations $g$ are employed.

Definition 9 (Gaussian CDF Transformation) Let $g_0$ be a one-dimensional Gaussian cumulative distribution function with mean $\mu_{g_0}$ and standard deviation $\sigma_{g_0}$, that is,

$$g_0(t) \equiv \Phi\left(\frac{t - \mu_{g_0}}{\sigma_{g_0}}\right).$$

We define the transformation function $g_j = f_j^{-1}$ for the $j$-th dimension as

$$g_j(z_j) \equiv \sigma_j \left(\frac{g_0(z_j) - \int g_0(t)\, \phi\left(\frac{t - \mu_j}{\sigma_j}\right) dt}{\sqrt{\int \left(g_0(y) - \int g_0(t)\, \phi\left(\frac{t - \mu_j}{\sigma_j}\right) dt\right)^2 \phi\left(\frac{y - \mu_j}{\sigma_j}\right) dy}}\right) + \mu_j$$

where $\sigma_j = \Sigma_0(j, j)$.
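The graph-generation procedure at the start of this subsection can be sketched as follows. The text does not say how the maximum-degree restriction is enforced, so this sketch makes an assumption: candidate pairs are visited in random order and a pair is skipped once either endpoint already has the maximum degree. The function name `generate_neighborhood_graph` and all of its parameter names are hypothetical; the sparsity parameter and the off-diagonal edge value are passed in by the caller.

```python
import math
import random


def generate_neighborhood_graph(p, s, edge_weight, max_degree=4, seed=0):
    """Return (edges, omega): a sparse edge set and the matching precision matrix."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(p)]  # points in [0, 1]^2
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    rng.shuffle(pairs)  # assumption: random visiting order for the degree cap
    degree = [0] * p
    edges = set()
    for i, j in pairs:
        if degree[i] >= max_degree or degree[j] >= max_degree:
            continue
        dist2 = (points[i][0] - points[j][0]) ** 2 + (points[i][1] - points[j][1]) ** 2
        prob = math.exp(-dist2 / (2.0 * s)) / math.sqrt(2.0 * math.pi)
        if rng.random() < prob:
            edges.add((i, j))
            degree[i] += 1
            degree[j] += 1
    # Unit diagonal, edge_weight on edges, zero elsewhere.
    omega = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for i, j in edges:
        omega[i][j] = omega[j][i] = edge_weight
    return edges, omega
```

Under these assumptions, positive definiteness follows from strict diagonal dominance whenever `max_degree * edge_weight < 1`, since each row then has off-diagonal absolute sum below its unit diagonal entry.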
Figure 3: The power and cdf transformations. The densities are estimated using a kernel density estimator with bandwidths selected by cross-validation. (Density panels: before transform, power transform, CDF transform, each with N = 5000. Transformation panels: identity function; power function, alpha = 3; CDF of N(0.05, 0.4).)

Definition 10 (Symmetric Power Transformation) Let $g_0$ be the symmetric and odd transformation given by

$$g_0(t) = \mathrm{sign}(t)\, |t|^{\alpha}$$

where $\alpha > 0$ is a parameter. We define the power transformation for the $j$-th dimension as

$$g_j(z_j) \equiv \sigma_j \left(\frac{g_0(z_j - \mu_j)}{\sqrt{\int g_0^2(t - \mu_j)\, \phi\left(\frac{t - \mu_j}{\sigma_j}\right) dt}}\right) + \mu_j.$$

These transformations are constructed to preserve the marginal mean and standard deviation. In the following experiments, we refer to them as the cdf transformation and the power transformation, respectively. For the cdf transformation, we set $\mu_{g_0} = 0.05$ and $\sigma_{g_0} = 0.4$. For the power transformation, we set $\alpha = 3$.

To visualize these two transformations, we sample 5000 data points from a one-dimensional normal distribution $N(0.5, 1.0)$ and then apply the above two transformations; the results are shown in Figure 3. It can be seen how the cdf and power transformations map a univariate normal distribution into a highly skewed and a bimodal distribution, respectively.
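A minimal sketch of the symmetric power transformation is given below. It makes one explicit assumption: the normalizing integral is read as the expectation of $g_0^2(Z - \mu_j)$ under $Z \sim N(\mu_j, \sigma_j^2)$, that is, with the $1/\sigma_j$ factor of the Gaussian density included, so that the transformed variable has standard deviation exactly $\sigma_j$. Under that reading the integral equals the absolute moment $E|Z - \mu_j|^{2\alpha} = \sigma_j^{2\alpha}\, 2^{\alpha}\, \Gamma(\alpha + \tfrac{1}{2}) / \sqrt{\pi}$. The function name `power_transform` is hypothetical.

```python
import math


def power_transform(z, mu, sigma, alpha=3.0):
    """Symmetric power transformation, rescaled to keep mean mu and std sigma."""
    centered = z - mu
    # g_0(t) = sign(t) |t|^alpha applied to the centered value.
    g0 = math.copysign(abs(centered) ** alpha, centered)
    # E|Z - mu|^(2 alpha) for Z ~ N(mu, sigma^2), in closed form.
    second_moment = (
        sigma ** (2.0 * alpha) * 2.0 ** alpha * math.gamma(alpha + 0.5) / math.sqrt(math.pi)
    )
    return sigma * g0 / math.sqrt(second_moment) + mu


if __name__ == "__main__":
    for z in (-1.5, 0.0, 0.5, 1.5, 2.5):
        print(z, power_transform(z, mu=0.5, sigma=1.0, alpha=3.0))
```

For $\alpha = 3$ and $\sigma_j = 1$ the normalizer is $\sqrt{15}$, so values close to the mean are compressed toward it while values in the tails are pushed outward.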
Figure 4: Regularization paths for the glasso and nonparanormal with $n = 500$ (top) and $n = 200$ (bottom), for the cdf, power, and linear cases. The paths for the relevant variables (nonzero inverse covariance entries) are plotted as solid black lines; the paths for the irrelevant variables are plotted as dashed red lines. For non-Gaussian distributions, the nonparanormal better separates the relevant and irrelevant dimensions.

To generate synthetic data, we set $p = 40$, resulting in $\binom{40}{2} + 40 = 820$ parameters to be estimated, and vary the sample sizes from $n = 200$ to $n = 1000$. Three conditions are considered, corresponding to using the cdf transform, the power transform, or no transformation. In each case, both the glasso and the nonparanormal are applied to estimate the graph.
6.1.1 COMPARISON OF REGULARIZATION PATHS

We choose a set of regularization parameters $\Lambda$; for each $\lambda \in \Lambda$, we obtain an estimate $\hat\Omega_n^{\lambda}$ which is a $40 \times 40$ matrix. The upper triangular matrix has 780 parameters; we vectorize it to get a 780-dimensional parameter vector. A regularization path is the trace of these parameters over all the regularization parameters within $\Lambda$. The regularization paths for both methods are plotted in Figure 4. For the cdf transformation and the power transformation, the nonparanormal separates the relevant and the irrelevant dimensions very well. For the glasso, relevant variables are mixed with irrelevant variables. If no transformation is applied, the paths for both methods are almost the same.

6.1.2 ESTIMATED TRANSFORMATIONS

For sample size $n = 1000$, we plot the estimated transformations for three of the variables in Figure 5. It is clear that Winsorization plays a significant role for the power transformation. This is intuitive due to the high skewness of the nonparanormal distribution in this case.

Figure 5: Estimated transformations for the first three variables ($x_1$, $x_2$, $x_3$) in the cdf, power, and linear cases, each panel showing the estimated and the true function. Winsorization plays a significant role for the power transformation due to its high skewness.
Figure 6: Boxplots of the oracle scores for $n = 1000, 500, 200$ (top, center, bottom), comparing the nonparanormal and the glasso in the cdf, power, and linear cases.

6.1.3 QUANTITATIVE COMPARISON

To evaluate the performance for structure estimation quantitatively, we use false positive and false negative rates. Let $G = (V, E)$ be a $p$-dimensional graph, which has at most $\binom{p}{2}$ edges, in which there are $|E| = r$ edges, and let $\hat G^{\lambda} = (V, \hat E^{\lambda})$ be an estimated graph using the regularization parameter $\lambda$. The number of false positives at $\lambda$ is

$$FP(\lambda) \equiv \text{number of edges in } \hat E^{\lambda} \text{ not in } E.$$

The number of false negatives at $\lambda$ is defined as

$$FN(\lambda) \equiv \text{number of edges in } E \text{ not in } \hat E^{\lambda}.$$

The oracle regularization level $\lambda^*$ is then

$$\lambda^* = \arg\min_{\lambda \in \Lambda} \{FP(\lambda) + FN(\lambda)\}.$$

The oracle score is $FP(\lambda^*) + FN(\lambda^*)$. Figure 6 shows boxplots of the oracle scores for the two methods, calculated using 100 simulations.
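The error counts and the oracle score defined above reduce to set operations on edge sets. The sketch below assumes that each undirected edge is stored once as a tuple `(i, j)` with `i < j`, and that the regularization path is a dictionary mapping each value of the regularization parameter to its estimated edge set; the function names are hypothetical.

```python
def false_positives(estimated, true_edges):
    """Number of estimated edges that are not in the true edge set."""
    return len(set(estimated) - set(true_edges))


def false_negatives(estimated, true_edges):
    """Number of true edges that are missing from the estimated edge set."""
    return len(set(true_edges) - set(estimated))


def oracle_score(path, true_edges):
    """Return (lambda_star, score) minimizing FP + FN over the path."""
    def total_errors(lam):
        return false_positives(path[lam], true_edges) + false_negatives(path[lam], true_edges)

    lam_star = min(path, key=total_errors)
    return lam_star, total_errors(lam_star)


if __name__ == "__main__":
    truth = {(0, 1), (1, 2), (2, 3)}
    path = {
        0.1: {(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)},  # too dense
        0.3: {(0, 1), (1, 2), (2, 3), (0, 3)},          # one spurious edge
        0.6: {(0, 1)},                                   # too sparse
    }
    print(oracle_score(path, truth))
```

The oracle level uses the true graph, so it is available only in simulations; it measures the best structure recovery a method could reach along its path.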
To illustrate the overall performance of these two methods over the full paths, ROC curves are shown in Figure 7, using

$$\left(1 - \frac{FN(\lambda)}{r},\ 1 - \frac{FP(\lambda)}{\binom{p}{2} - r}\right).$$

The curves clearly show how the performance of both methods improves with sample size, and that the nonparanormal is superior to the Gaussian model in most cases.

Figure 7: ROC curves for sample sizes $n = 1000, 500, 200$ (top, middle, bottom), comparing the nonparanormal and the glasso under the CDF transform, the power transform, and no transform.

Let $FPE \equiv FP(\lambda^*)$ and $FNE \equiv FN(\lambda^*)$. Tables 1, 2, and 3 provide numerical comparisons of both methods on data sets with different transformations, where we repeat the experiments 100 times and report the average FPE and FNE values with the corresponding standard deviations. It is clear from the tables that the nonparanormal achieves significantly smaller errors than the glasso if the true distribution of the data is not multivariate Gaussian, and achieves performance comparable to the glasso when the true distribution is exactly multivariate Gaussian.

Figure 8 shows typical runs for the cdf and power transformations. It is clear that when the glasso estimates the graph incorrectly, the mistakes include both false positives and negatives.
Table 1: Quantitative comparison on the data set using the cdf transformation. For both FPE and FNE, the nonparanormal performs much better in general. (Columns: FPE, sd(FPE), FNE, sd(FNE) for the nonparanormal and for the glasso.)

Table 2: Quantitative comparison on the data set using the power transformation. For both FPE and FNE, the nonparanormal performs much better in general. (Columns: FPE, sd(FPE), FNE, sd(FNE) for the nonparanormal and for the glasso.)

Table 3: Quantitative comparison on the data set without any transformation. The two methods behave similarly; the glasso is slightly better. (Columns: FPE, sd(FPE), FNE, sd(FNE) for the nonparanormal and for the glasso.)

6.1.4 COMPARISON IN THE GAUSSIAN CASE

The previous experiments indicate that the nonparanormal works almost as well as the glasso in the Gaussian case. This initially appears surprising, since a parametric method is expected to be more efficient than a nonparametric method if the parametric assumption is correct. To manifest this efficiency loss, we conducted some experiments with very small $n$ and relatively large $p$. For multivariate Gaussian models, Figure 9 shows results with $(n, p, s) = (50, 40, 1/8)$, $(50, 100, 1/15)$ and $(30, 100, 1/15)$. From the mean ROC curves, we see that the nonparanormal does indeed behave worse than the glasso, suggesting some efficiency loss. However, from the corresponding boxplots, the efficiency reduction is relatively insignificant.

6.1.5 THE CASE WHEN $p \gg n$

Figure 10 shows results from a simulation of the nonparanormal using cdf transformations with $n = 200$, $p = 500$ and sparsity level $s = 1/40$. The boxplot shows that the nonparanormal outperforms the glasso. A typical run of the regularization paths confirms this conclusion, showing that the nonparanormal path separates the relevant and irrelevant dimensions very well. In contrast, with the glasso the relevant variables are buried among the irrelevant variables.

Figure 8: Typical runs for the two methods for $n = 1000$ using the cdf and power transformations, with $p = 40$. Panels show the true graph, the nonparanormal estimate, the graphical lasso estimate, and their symmetric difference. The dashed black lines in the symmetric difference plots indicate edges found by the glasso but not the nonparanormal, and vice versa for the solid red lines.

6.2 Gene Microarray Data

In this study, we consider a data set based on Affymetrix GeneChip microarrays for the plant Arabidopsis thaliana (Wille et al., 2004). The sample size is $n = 118$. The expression levels for each chip are preprocessed by log-transformation and standardization. A subset of 40 genes from the isoprenoid pathway are chosen, and we study the associations among them using both the paranormal and nonparanormal models. Even though these data are generally treated as multivariate Gaussian in the previous analysis (Wille et al., 2004), our study shows that the results of the nonparanormal and the glasso are very different over a wide range of regularization parameters. This suggests the nonparanormal could support different scientific conclusions.

6.2.1 COMPARISON OF THE REGULARIZATION PATHS

We first compare the regularization paths of the two methods, in Figure 11. To generate the paths, we select 50 regularization parameters on an evenly spaced grid in the interval [0.16, 1.2]. Although the paths for the two methods look similar, there are some subtle differences. In particular, variables become nonzero in a different order, especially when the regularization parameter is in the range $\lambda \in [0.2, 0.3]$. As shown below, these subtle differences in the paths lead to different model selection behaviors.

6.2.2 COMPARISON OF THE ESTIMATED GRAPHS

Figure 12 compares the estimated graphs for the two methods at several values of the regularization parameter $\lambda$ in the range [0.16, 0.37]. For each $\lambda$, we show the estimated graph from the nonparanormal in the first column. In the second column we show the graph obtained by scanning the full