The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs


Journal of Machine Learning Research. Submitted 3/09; Revised 5/09; Published 10/09

Han Liu (HANLIU@CS.CMU.EDU)
John Lafferty (LAFFERTY@CS.CMU.EDU)
Larry Wasserman (LARRY@STAT.CMU.EDU)
School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA

Editor: Martin J. Wainwright

Abstract

Recent methods for estimating sparse undirected graphs for real-valued data in high dimensional problems rely heavily on the assumption of normality. We show how to use a semiparametric Gaussian copula, or "nonparanormal", for high dimensional inference. Just as additive models extend linear models by replacing linear functions with a set of one-dimensional smooth functions, the nonparanormal extends the normal by transforming the variables by smooth functions. We derive a method for estimating the nonparanormal, study the method's theoretical properties, and show that it works well in many examples.

Keywords: graphical models, Gaussian copula, high dimensional inference, sparsity, $\ell_1$ regularization, graphical lasso, paranormal, occult

1. Introduction

The linear model is a mainstay of statistical inference that has been extended in several important ways. An extension to high dimensions was achieved by adding a sparsity constraint, leading to the lasso (Tibshirani, 1996). An extension to nonparametric models was achieved by replacing linear functions with smooth functions, leading to additive models (Hastie and Tibshirani, 1999). These two ideas were recently combined, leading to an extension called sparse additive models (SpAM) (Ravikumar et al., 2008, 2009a). In this paper we consider a similar nonparametric extension of undirected graphical models based on multivariate Gaussian distributions in the high dimensional setting. Specifically, we use a high dimensional Gaussian copula with nonparametric marginals, which we refer to as a nonparanormal distribution.
If $X$ is a $p$-dimensional random vector distributed according to a multivariate Gaussian distribution with covariance matrix $\Sigma$, the conditional independence relations between the random variables $X_1, X_2, \ldots, X_p$ are encoded in a graph formed from the precision matrix $\Omega = \Sigma^{-1}$. Specifically, missing edges in the graph correspond to zeroes of $\Omega$. To estimate the graph from a sample of size $n$, it is only necessary to estimate $\Sigma$, which is easy if $n$ is much larger than $p$. However, when $p$ is larger than $n$, the problem is more challenging. Recent work has focused on the problem of estimating the graph in this high dimensional setting, which becomes feasible if $G$ is sparse. Yuan and Lin (2007) and Banerjee et al. (2008) propose an estimator based on regularized maximum likelihood using an $\ell_1$ constraint on the entries of $\Omega$, and Friedman et al. (2007) develop an efficient algorithm for computing the estimator using a graphical version of the lasso. The resulting estimation procedure has excellent theoretical properties, as shown recently by Rothman et al. (2008) and Ravikumar et al. (2009b).

Assumptions     Dimension   Regression              Graphical Models
parametric      low         linear model            multivariate normal
parametric      high        lasso                   graphical lasso
nonparametric   low         additive model          nonparanormal
nonparametric   high        sparse additive model   $\ell_1$-regularized nonparanormal

Figure 1: Comparison of regression and graphical models. The nonparanormal extends additive models to the graphical model setting. Regularizing the inverse covariance leads to an extension to high dimensions, which parallels sparse additive models for regression.

While Gaussian graphical models can be useful, a reliance on exact normality is limiting. Our goal in this paper is to weaken this assumption. Our approach parallels the ideas behind sparse additive models for regression (Ravikumar et al., 2008, 2009a). Specifically, we replace the Gaussian with a semiparametric Gaussian copula. This means that we replace the random variable $X = (X_1, \ldots, X_p)$ by the transformed random variable $f(X) = (f_1(X_1), \ldots, f_p(X_p))$, and assume that $f(X)$ is multivariate Gaussian. This semiparametric copula results in a nonparametric extension of the normal that we call the nonparanormal distribution. The nonparanormal depends on the functions $\{f_j\}$, and a mean $\mu$ and covariance matrix $\Sigma$, all of which are to be estimated from data. While the resulting family of distributions is much richer than the standard parametric normal (the paranormal), the independence relations among the variables are still encoded in the precision matrix $\Omega = \Sigma^{-1}$. We propose a nonparametric estimator for the functions $\{f_j\}$, and show how the graphical lasso can be used to estimate the graph in the high dimensional setting.

© 2009 Han Liu, John Lafferty and Larry Wasserman.
The relationship between linear regression models, Gaussian graphical models, and their extensions to nonparametric and high dimensional models is summarized in Figure 1.

Most theoretical results on semiparametric copulas focus on low or at least finite dimensional models (Klaassen and Wellner, 1997; Tsukahara, 2005). Models with increasing dimension require a more delicate analysis; in particular, simply plugging in the usual empirical distribution of the marginals does not lead to accurate inference. Instead we use a truncated empirical distribution. We give a theoretical analysis of this estimator, proving consistency results with respect to risk, model selection, and estimation of $\Omega$ in the Frobenius norm.

In the following section we review the basic notion of the graph corresponding to a multivariate Gaussian, and formulate different criteria for evaluating estimators of the covariance or inverse covariance. In Section 3 we present the nonparanormal, and in Section 4 we discuss estimation of the model. We present a theoretical analysis of the estimation method in Section 5, with the detailed proofs collected in an appendix. In Section 6 we present experiments with both simulated data and gene microarray data, where the problem is to construct the isoprenoid biosynthetic pathway.

2. Estimating Undirected Graphs

Let $X = (X_1, \ldots, X_p)$ denote a random vector with distribution $P = N(\mu, \Sigma)$. The undirected graph $G = (V, E)$ corresponding to $P$ consists of a vertex set $V$ and an edge set $E$. The set $V$ has $p$ elements, one for each component of $X$. The edge set $E$ consists of ordered pairs $(i, j)$ where $(i, j) \in E$ if there is an edge between $X_i$ and $X_j$. The edge between $(i, j)$ is excluded from $E$ if and only if $X_i$ is independent of $X_j$ given the other variables $X_{\setminus\{i,j\}} \equiv (X_s : 1 \le s \le p,\ s \ne i, j)$, written

$$X_i \perp\!\!\!\perp X_j \mid X_{\setminus\{i,j\}}. \qquad (1)$$

It is well known that, for multivariate Gaussian distributions, (1) holds if and only if $\Omega_{ij} = 0$ where $\Omega = \Sigma^{-1}$.

Let $X^{(1)}, X^{(2)}, \ldots, X^{(n)}$ be a random sample from $P$, where $X^{(i)} \in \mathbb{R}^p$. If $n$ is much larger than $p$, then we can estimate $\Sigma$ using maximum likelihood, leading to the estimate $\hat{\Omega} = S^{-1}$, where

$$S = \frac{1}{n} \sum_{i=1}^{n} \left(X^{(i)} - \bar{X}\right)\left(X^{(i)} - \bar{X}\right)^T$$

is the sample covariance, with $\bar{X}$ the sample mean. The zeroes of $\Omega$ can then be estimated by applying hypothesis testing to $\hat{\Omega}$ (Drton and Perlman, 2007, 2008). When $p > n$, maximum likelihood is no longer useful; in particular, the estimate $\hat{\Sigma}$ is not positive definite, having rank no greater than $n$. Inspired by the success of the lasso for linear models, several authors have suggested estimating $\Sigma$ by minimizing

$$-\ell(\Omega) + \lambda \sum_{j \ne k} |\Omega_{jk}|$$

where

$$\ell(\Omega) = \frac{1}{2}\left(\log|\Omega| - \mathrm{tr}(\Omega S) - p \log(2\pi)\right)$$

is the log-likelihood with $S$ the sample covariance matrix. The estimator $\hat{\Omega}$ can be computed efficiently using the glasso algorithm (Friedman et al., 2007), which is a block coordinate descent algorithm that uses the standard lasso to estimate a single row and column of $\Omega$ in each iteration. Under appropriate sparsity conditions, the resulting estimator $\hat{\Omega}$ has been shown to have good theoretical properties (Rothman et al., 2008; Ravikumar et al., 2009b).

There are several different ways to judge the quality of an estimator $\hat{\Sigma}$ of the covariance or $\hat{\Omega}$ of the inverse covariance. We discuss three in this paper: persistency, norm consistency, and sparsistency. Persistency means consistency in risk, when the model is not necessarily assumed to be correct.
Suppose the true distribution $P$ has mean $\mu_0$, and that we use a multivariate normal $p(x; \mu_0, \Sigma)$ for prediction; we do not assume that $P$ is normal. We observe a new vector $X \sim P$ and define the prediction risk to be

$$R(\Sigma) = -E \log p(X; \mu_0, \Sigma) = -\int \log p(x; \mu_0, \Sigma)\, dP(x).$$

It follows that

$$R(\Sigma) = \frac{1}{2}\left\{\mathrm{tr}(\Sigma^{-1}\Sigma_0) + \log|\Sigma| + p \log(2\pi)\right\}$$

where $\Sigma_0$ is the covariance of $X$ under $P$. If $\mathcal{S}$ is a set of covariance matrices, the oracle is defined to be the covariance matrix $\Sigma_*$ that minimizes $R(\Sigma)$ over $\mathcal{S}$:

$$\Sigma_* = \arg\min_{\Sigma \in \mathcal{S}} R(\Sigma).$$

Thus $p(x; \mu_0, \Sigma_*)$ is the best predictor of a new observation among all distributions in $\{p(x; \mu_0, \Sigma) : \Sigma \in \mathcal{S}\}$. In particular, if $\mathcal{S}$ consists of covariance matrices with sparse graphs, then $p(x; \mu_0, \Sigma_*)$ is, in some sense, the best sparse predictor. An estimator $\hat{\Sigma}_n$ is persistent if

$$R(\hat{\Sigma}_n) - R(\Sigma_*) \xrightarrow{P} 0$$

as the sample size $n$ increases to infinity. Thus, a persistent estimator approximates the best estimator over the class $\mathcal{S}$, but we do not assume that the true distribution has a covariance matrix in $\mathcal{S}$, or even that it is Gaussian. Moreover, we allow the dimension $p = p_n$ to increase with $n$. On the other hand, norm consistency and sparsistency require that the true distribution is Gaussian. In this case, let $\Sigma_0$ denote the true covariance matrix. An estimator is norm consistent if

$$\|\hat{\Sigma}_n - \Sigma_0\| \xrightarrow{P} 0$$

where $\|\cdot\|$ is a norm. If $E(\Omega)$ denotes the edge set corresponding to $\Omega$, an estimator is sparsistent if

$$P\left(E(\hat{\Omega}) \ne E(\Omega_0)\right) \to 0.$$

Thus, a sparsistent estimator identifies the correct graph consistently. We present our theoretical analysis on these properties of the nonparanormal in Section 5.

3. The Nonparanormal

We say that a random vector $X = (X_1, \ldots, X_p)^T$ has a nonparanormal distribution if there exist functions $\{f_j\}_{j=1}^{p}$ such that $Z \equiv f(X) \sim N(\mu, \Sigma)$, where $f(X) = (f_1(X_1), \ldots, f_p(X_p))$. We then write

$$X \sim NPN(\mu, \Sigma, f).$$

When the $f_j$'s are monotone and differentiable, the joint probability density function of $X$ is given by

$$p_X(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}\left(f(x) - \mu\right)^T \Sigma^{-1} \left(f(x) - \mu\right)\right\} \prod_{j=1}^{p} |f_j'(x_j)|. \qquad (2)$$

Lemma 1. The nonparanormal distribution $NPN(\mu, \Sigma, f)$ is a Gaussian copula when the $f_j$'s are monotone and differentiable.

Proof. By Sklar's theorem (Sklar, 1959), any joint distribution can be written as

$$F(x_1, \ldots, x_p) = C\{F_1(x_1), \ldots, F_p(x_p)\}$$

where the function $C$ is called a copula. For the nonparanormal we have

$$F(x_1, \ldots, x_p) = \Phi_{\mu,\Sigma}\left(\Phi^{-1}(F_1(x_1)), \ldots, \Phi^{-1}(F_p(x_p))\right)$$

where $\Phi_{\mu,\Sigma}$ is the multivariate Gaussian cdf and $\Phi$ is the univariate standard Gaussian cdf. Thus, the corresponding copula is

$$C(u_1, \ldots, u_p) = \Phi_{\mu,\Sigma}\left(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_p)\right).$$

This is exactly a Gaussian copula with parameters $\mu$ and $\Sigma$. If each $f_j$ is differentiable then the density of $X$ has the same form as (2).

Note that the density in (2) is not identifiable; to make the family identifiable we demand that $f_j$ preserve means and variances:

$$\mu_j = E(Z_j) = E(X_j) \quad \text{and} \quad \sigma_j^2 \equiv \Sigma_{jj} = \mathrm{Var}(Z_j) = \mathrm{Var}(X_j). \qquad (3)$$

Note that these conditions only depend on $\mathrm{diag}(\Sigma)$ but not the full covariance matrix. Let $F_j(x)$ denote the marginal distribution function of $X_j$. Then

$$F_j(x) = P(X_j \le x) = P(Z_j \le f_j(x)) = \Phi\left(\frac{f_j(x) - \mu_j}{\sigma_j}\right)$$

which implies that

$$f_j(x) = \mu_j + \sigma_j \Phi^{-1}(F_j(x)). \qquad (4)$$

The following basic fact says that the independence graph of the nonparanormal is encoded in $\Omega = \Sigma^{-1}$, as for the parametric normal.

Lemma 2. If $X \sim NPN(\mu, \Sigma, f)$ is nonparanormal and each $f_j$ is differentiable, then $X_i \perp\!\!\!\perp X_j \mid X_{\setminus\{i,j\}}$ if and only if $\Omega_{ij} = 0$, where $\Omega = \Sigma^{-1}$.

Proof. From the form of the density (2), it follows that the density factors with respect to the graph of $\Omega$, and therefore obeys the global Markov property of the graph.

Next we show that the above is true for any choice of identification restrictions.

Lemma 3. Define

$$h_j(x) = \Phi^{-1}(F_j(x)) \qquad (5)$$

and let $\Lambda$ be the covariance matrix of $h(X)$. Then $X_j \perp\!\!\!\perp X_k \mid X_{\setminus\{j,k\}}$ if and only if $\Lambda^{-1}_{jk} = 0$.

Proof. We can rewrite the covariance matrix as

$$\Sigma_{jk} = \mathrm{Cov}(Z_j, Z_k) = \sigma_j \sigma_k\, \mathrm{Cov}(h_j(X_j), h_k(X_k)).$$

Hence $\Sigma = D \Lambda D$ and

$$\Sigma^{-1} = D^{-1} \Lambda^{-1} D^{-1},$$

where $D$ is the diagonal matrix with $\mathrm{diag}(D) = \sigma$. The zero pattern of $\Lambda^{-1}$ is therefore identical to the zero pattern of $\Sigma^{-1}$.
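The argument in the proof of Lemma 3 can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the 3-variable chain-graph precision matrix and the marginal scales are hypothetical values chosen for the demonstration, and exact rational arithmetic from the standard library is used so that zeros are exact rather than approximate.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a small square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment with the identity: [A | I] is reduced to [I | A^{-1}].
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def zero_pattern(matrix):
    """Boolean mask marking the exactly-zero entries of a matrix."""
    return [[v == 0 for v in row] for row in matrix]


# Hypothetical Lambda^{-1} for the chain graph 1 - 2 - 3 (no edge between 1 and 3).
lam_inv = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
lam = inverse(lam_inv)                    # Lambda: dense, no zero entries

# Arbitrary (hypothetical) marginal standard deviations on the diagonal of D.
sigma_diag = [Fraction(1, 2), Fraction(3), Fraction(5)]
sigma = [[sigma_diag[i] * lam[i][j] * sigma_diag[j] for j in range(3)]
         for i in range(3)]               # Sigma = D Lambda D
sigma_inv = inverse(sigma)                # equals D^{-1} Lambda^{-1} D^{-1}

print(zero_pattern(sigma_inv) == zero_pattern(lam_inv))
```

Rescaling by $D$ changes the nonzero entries of the precision matrix but not which entries are zero, which is why the graph can be read from either $\Lambda^{-1}$ or $\Sigma^{-1}$.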

Thus, it is not necessary to estimate $\mu$ or $\sigma$ to estimate the graph.

Figure 2: Densities of three 2-dimensional nonparanormals. The component functions have the form $f_j(x) = \mathrm{sign}(x)|x|^{\alpha_j}$. Left: $\alpha_1 = 0.9$, $\alpha_2 = 0.8$; center: $\alpha_1 = 1.2$, $\alpha_2 = 0.8$; right: $\alpha_1 = 2$, $\alpha_2 = 3$. In each case $\mu = (0, 0)$ and $\Sigma = \begin{pmatrix} 1 & .5 \\ .5 & 1 \end{pmatrix}$.

Figure 2 shows three examples of 2-dimensional nonparanormal densities. In each case, the component functions $f_j(x)$ take the form

$$f_j(x) = a_j\, \mathrm{sign}(x)|x|^{\alpha_j} + b_j$$

where the constants $a_j$ and $b_j$ are set to enforce the identifiability constraints (3). The covariance in each case is $\Sigma = \begin{pmatrix} 1 & .5 \\ .5 & 1 \end{pmatrix}$ and the mean is $\mu = (0, 0)$. The exponent $\alpha_j$ determines the nonlinearity. It can be seen how the concavity of the density changes with the exponent $\alpha$, and that $\alpha > 1$ can result in multiple modes.

The assumption that $f(X) = (f_1(X_1), \ldots, f_p(X_p))$ is normal leads to a semiparametric model where only one dimensional functions need to be estimated. But the monotonicity of the functions $f_j$, which map onto $\mathbb{R}$, enables computational tractability of the nonparanormal. For more general functions $f$, the normalizing constant for the density

$$p_X(x) \propto \exp\left\{-\frac{1}{2}\left(f(x) - \mu\right)^T \Sigma^{-1} \left(f(x) - \mu\right)\right\}$$

cannot be computed in closed form.
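To make the definition concrete, one way to draw from a 2-dimensional nonparanormal is to sample $Z \sim N(0, \Sigma)$ and apply $g_j = f_j^{-1}$ componentwise. The sketch below assumes the unscaled component functions $f_j(x) = \mathrm{sign}(x)|x|^{\alpha_j}$ (that is, it omits the constants $a_j$ and $b_j$), unit variances with correlation 0.5, and hypothetical function names; it is an illustration, not the authors' simulation code.

```python
import math
import random


def f(x, alpha):
    """Component function f_j(x) = sign(x) |x|^alpha (monotone for alpha > 0)."""
    return math.copysign(abs(x) ** alpha, x)


def f_inverse(z, alpha):
    """Inverse transformation g_j(z) = sign(z) |z|^(1/alpha)."""
    return math.copysign(abs(z) ** (1.0 / alpha), z)


def sample_nonparanormal(n, alphas=(2.0, 3.0), rho=0.5, seed=0):
    """Draw n points X with f(X) ~ N(0, [[1, rho], [rho, 1]])."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Cholesky factor of the 2 x 2 correlation matrix applied to (e1, e2).
        z = (e1, rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
        sample.append(tuple(f_inverse(zj, a) for zj, a in zip(z, alphas)))
    return sample


points = sample_nonparanormal(1000)
```

Applying `f` to each coordinate of the sampled points recovers the latent Gaussian vector, so the dependence structure of `points` is fully described by the Gaussian covariance even though the marginals are far from normal.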

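4. Estimation Method

Let $X^{(1)}, \ldots, X^{(n)}$ be a sample of size $n$ where $X^{(i)} = (X_1^{(i)}, \ldots, X_p^{(i)})^T \in \mathbb{R}^p$. In light of (5) we define

$$\hat{h}_j(x) = \Phi^{-1}(\tilde{F}_j(x))$$

where $\tilde{F}_j$ is an estimator of $F_j$. A natural candidate for $\tilde{F}_j$ is the marginal empirical distribution function

$$\hat{F}_j(t) \equiv \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left\{X_j^{(i)} \le t\right\}.$$

Now, let $\theta$ denote the parameters of the copula. Tsukahara (2005) suggests taking $\hat{\theta}$ to be the solution of

$$\sum_{i=1}^{n} \phi\left(\tilde{F}_1(X_1^{(i)}), \ldots, \tilde{F}_p(X_p^{(i)}), \theta\right) = 0$$

where $\phi$ is an estimating equation and $\tilde{F}_j(t) = n \hat{F}_j(t)/(n + 1)$. In our case, $\theta$ corresponds to the covariance matrix. The resulting estimator $\hat{\theta}$, called a rank approximate Z-estimator, has excellent theoretical properties. However, we are interested in the high dimensional scenario where the dimension $p$ is allowed to increase with $n$; the variance of $\tilde{F}_j(t)$ is too large in this case. Instead, we use the following truncated or Winsorized estimator (see footnote 1):

$$\tilde{F}_j(x) = \begin{cases} \delta_n & \text{if } \hat{F}_j(x) < \delta_n \\ \hat{F}_j(x) & \text{if } \delta_n \le \hat{F}_j(x) \le 1 - \delta_n \\ 1 - \delta_n & \text{if } \hat{F}_j(x) > 1 - \delta_n, \end{cases} \qquad (6)$$

where $\delta_n$ is a truncation parameter. Clearly, there is a bias-variance tradeoff in choosing $\delta_n$. Essentially the same estimator with $\delta_n = 1/n$ is studied by Klaassen and Wellner (1997) in the case of bivariate Gaussian copula. In what follows we use

$$\delta_n \equiv \frac{1}{4 n^{1/4} \sqrt{\pi \log n}}.$$

This provides the right balance so that we can achieve the desired rate of convergence in our estimate of $\Omega$ and the associated undirected graph $G$ in the high dimensional setting.

Footnote 1: After Charles P. Winsor, whom John Tukey credited with converting him from topology to statistics (Mallows, 1990).

Given this estimate of the distribution of variable $X_j$, we then estimate the transformation function $f_j$ by

$$\tilde{f}_j(x) \equiv \hat{\mu}_j + \hat{\sigma}_j \tilde{h}_j(x) \qquad (7)$$

where $\tilde{h}_j(x) = \Phi^{-1}(\tilde{F}_j(x))$ and $\hat{\mu}_j$ and $\hat{\sigma}_j$ are the sample mean and the standard deviation:

$$\hat{\mu}_j \equiv \frac{1}{n} \sum_{i=1}^{n} X_j^{(i)} \quad \text{and} \quad \hat{\sigma}_j = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(X_j^{(i)} - \hat{\mu}_j\right)^2}.$$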
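The Winsorized transformation of equations (6) and (7) can be sketched for a single variable as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical, the truncation level is the one stated in the text, and the standard deviation uses $1/n$ scaling as an assumption.

```python
import math
from bisect import bisect_right
from statistics import NormalDist


def truncation_level(n):
    """delta_n = 1 / (4 n^(1/4) sqrt(pi log n)); requires n >= 2."""
    return 1.0 / (4.0 * n ** 0.25 * math.sqrt(math.pi * math.log(n)))


def winsorized_normal_scores(column):
    """Map one variable's observations to Winsorized normal scores, eq. (6)-(7)."""
    n = len(column)
    delta = truncation_level(n)
    ordered = sorted(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    standard_normal = NormalDist()
    scores = []
    for x in column:
        ecdf = bisect_right(ordered, x) / n             # empirical cdf at x
        clipped = min(max(ecdf, delta), 1.0 - delta)    # Winsorization, eq. (6)
        scores.append(mean + sd * standard_normal.inv_cdf(clipped))  # eq. (7)
    return scores


observations = [3.0, 1.0, 2.0, 10.0, 4.0]
scores = winsorized_normal_scores(observations)
```

Without the clipping step the largest observation would have empirical cdf value 1, whose normal quantile is infinite; the truncation keeps every score finite while preserving the ordering of the data.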

Now, let $S_n(\tilde{f})$ be the sample covariance matrix of $\tilde{f}(X^{(1)}), \ldots, \tilde{f}(X^{(n)})$; that is,

$$S_n(\tilde{f}) \equiv \frac{1}{n} \sum_{i=1}^{n} \left(\tilde{f}(X^{(i)}) - \mu_n(\tilde{f})\right)\left(\tilde{f}(X^{(i)}) - \mu_n(\tilde{f})\right)^T, \qquad (8)$$

$$\mu_n(\tilde{f}) \equiv \frac{1}{n} \sum_{i=1}^{n} \tilde{f}(X^{(i)}).$$

We then estimate $\Omega$ using $S_n(\tilde{f})$. For instance, the maximum likelihood estimator is $\hat{\Omega}_n^{MLE} = S_n(\tilde{f})^{-1}$. The $\ell_1$-regularized estimator is

$$\hat{\Omega}_n = \arg\min_{\Omega} \left\{\mathrm{tr}\left(\Omega S_n(\tilde{f})\right) - \log|\Omega| + \lambda \|\Omega\|_1\right\} \qquad (9)$$

where $\lambda$ is a regularization parameter, and $\|\Omega\|_1 = \sum_{j \ne k} |\Omega_{jk}|$. The estimated graph is then $\hat{E}_n = \{(j, k) : \hat{\Omega}_{jk} \ne 0\}$.

The nonparanormal is analogous to a sparse additive regression model (Ravikumar et al., 2009a), in the sense that both methods transform the variables by univariate functions. However, while sparse additive models use a regularized risk criterion to fit univariate transformations, our nonparanormal estimator uses a two-step procedure:

1. Replace the observations, for each variable, by their respective normal scores, subject to a Winsorized truncation.
2. Apply the graphical lasso to the transformed data to estimate the undirected graph.

The first step is non-iterative and computationally efficient, with no tuning parameters; it also makes the nonparanormal amenable to theoretical analysis. Starting with the model in (2), another possibility would be to parametrize each $f_j$ according to some parametric class of monotone functions such as the Box-Cox family, and then find the maximum likelihood estimates of $(\Omega, f_1, \ldots, f_p)$ in that class. This might lead to estimates of $f_j$ that depend on $\Omega$, and vice versa, and the estimation problem would not in general be convex. Alternatively, due to (4), the marginal information could be used to estimate the parameters. Our nonparametric approach to estimating the transformations has the advantages of making few assumptions and being easy to compute. In the following section we analyze the theoretical properties of this estimator.

5. Theoretical Results

In this section we present our theoretical results on risk consistency, model selection consistency, and norm consistency of the covariance $\Sigma$ and inverse covariance $\Omega$. From Lemma 3, the estimate of the graph does not depend on $\sigma_j$, $j \in \{1, \ldots, p\}$, and $\mu$, so we assume that $\sigma_j = 1$ and $\mu = 0$.
Our key technical result is an analysis of the covariance of the Winsorized estimator defined in (6), (7), and (8). In particular, we show that under appropriate conditions,

$$\max_{j,k} \left|S_n(\tilde{f})_{jk} - S_n(f)_{jk}\right| = o_P(1)$$

where $S_n(\tilde{f})_{jk}$ denotes the $(j, k)$ entry of the matrix. This result allows us to leverage the recent analysis of Rothman et al. (2008) and Ravikumar et al. (2009b) in the Gaussian case to obtain consistency results for the nonparanormal. More precisely, our main theorem is the following.

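Theorem 4. Suppose that $p = n^{\xi}$ and let $\tilde{f}$ be the Winsorized estimator defined in (7) with $\delta_n = \frac{1}{4 n^{1/4} \sqrt{\pi \log n}}$. Define the constant $C_M$ [expression illegible in the source; the surviving fragments are "48", "π", "2M", "M"] for some $M \ge 2(\xi + 1)$. Then for any

$$\epsilon \ge C_M \sqrt{\frac{\log p\, \log^2 n}{n^{1/2}}}$$

and sufficiently large $n$, we have

$$P\left(\max_{jk} \left|S_n(\tilde{f})_{jk} - S_n(f)_{jk}\right| > 2\epsilon\right) \le \frac{1}{2\sqrt{\pi \log(np)}} + 2\exp\left(2 \log p - \frac{n^{1/2} \epsilon^2}{1232\, \pi^2 \log^2 n}\right) + 2\exp\left(2 \log p - \frac{n^{1/2}}{8 \pi \log n}\right) + o(1).$$

The proof of the above theorem is given in Section 7. The following corollary is immediate, and specifies the scaling of the dimension in terms of sample size.

Corollary 5. Let $M \ge \max\{15\pi, 2(\xi + 1)\}$. Then

$$P\left(\max_{jk} \left|S_n(\tilde{f})_{jk} - S_n(f)_{jk}\right| > 2 C_M \sqrt{\frac{\log p\, \log^2 n}{n^{1/2}}}\right) = o(1).$$

Hence,

$$\max_{j,k} \left|S_n(\tilde{f})_{jk} - S_n(f)_{jk}\right| = O_P\left(\sqrt{\frac{\log p\, \log^2 n}{n^{1/2}}}\right).$$

The following corollary yields estimation consistency in both the Frobenius norm and the $\ell_2$-operator norm. The proof follows the same arguments as the proof of Theorem 1 and Theorem 2 from Rothman et al. (2008), replacing their Lemma 1 with our Theorem 4. For a matrix $A = (a_{ij})$, the Frobenius norm $\|\cdot\|_F$ is defined as $\|A\|_F \equiv \sqrt{\sum_{i,j} a_{ij}^2}$. The $\ell_2$-operator norm $\|\cdot\|_2$ is defined as the magnitude of the largest eigenvalue of the matrix, $\|A\|_2 \equiv \max_{\|x\|_2 = 1} \|Ax\|_2$. In the following, we write $a_n \asymp b_n$ if there are positive constants $c$ and $C$ independent of $n$ such that $c \le a_n / b_n \le C$.

Corollary 6. Suppose that the data are generated as $X^{(i)} \sim NPN(\mu_0, \Sigma_0, f_0)$, and let $\Omega_0 = \Sigma_0^{-1}$. If the regularization parameter $\lambda_n$ is chosen as

$$\lambda_n \asymp 2 C_M \sqrt{\frac{\log p\, \log^2 n}{n^{1/2}}}$$

where $C_M$ is defined in Theorem 4, then the nonparanormal estimator $\hat{\Omega}_n$ of (9) satisfies

$$\|\hat{\Omega}_n - \Omega_0\|_F = O_P\left(\sqrt{\frac{(s + p) \log p\, \log^2 n}{n^{1/2}}}\right)$$

and

$$\|\hat{\Omega}_n - \Omega_0\|_2 = O_P\left(\sqrt{\frac{s \log p\, \log^2 n}{n^{1/2}}}\right),$$

where

$$s \equiv \mathrm{Card}\left\{(i, j) \in \{1, \ldots, p\} \times \{1, \ldots, p\} : \Omega_0(i, j) \ne 0,\ i \ne j\right\}$$

is the number of nonzero off-diagonal elements of the true precision matrix.

To prove the model selection consistency result, we need further assumptions. We follow Ravikumar et al. (2009b) and let the $p^2 \times p^2$ Fisher information matrix of $\Sigma_0$ be $\Gamma \equiv \Sigma_0 \otimes \Sigma_0$ where $\otimes$ is the Kronecker matrix product, and define the support set $S$ of $\Omega_0 = \Sigma_0^{-1}$ as

$$S \equiv \left\{(i, j) \in \{1, \ldots, p\} \times \{1, \ldots, p\} : \Omega_0(i, j) \ne 0\right\}.$$

We use $S^c$ to denote the complement of $S$ in the set $\{1, \ldots, p\} \times \{1, \ldots, p\}$, and for any two subsets $T$ and $T'$ of $\{1, \ldots, p\} \times \{1, \ldots, p\}$, we use $\Gamma_{TT'}$ to denote the sub-matrix with rows and columns of $\Gamma$ indexed by $T$ and $T'$ respectively.

Assumption 1. There exists some $\alpha \in (0, 1]$ such that $\left\|\Gamma_{S^c S} (\Gamma_{SS})^{-1}\right\|_{\infty} \le 1 - \alpha$.

As in Ravikumar et al. (2009b), we define two quantities $K_{\Sigma_0} \equiv \|\Sigma_0\|_{\infty}$ and $K_{\Gamma} \equiv \|(\Gamma_{SS})^{-1}\|_{\infty}$. Further, we define the maximum row degree as

$$d \equiv \max_{i = 1, \ldots, p} \mathrm{Card}\left\{j \in \{1, \ldots, p\} : \Omega_0(i, j) \ne 0\right\}.$$

Assumption 2. The quantities $K_{\Sigma_0}$ and $K_{\Gamma}$ are bounded, and there are positive constants $C$ such that

$$\min_{(j,k) \in S} |\Omega_0(j, k)| \ge C \sqrt{\frac{\log^3 n}{n^{1/2}}}$$

for large enough $n$.

The proof of the following corollary uses our Theorem 4 in place of Equation (2) in the analysis of Ravikumar et al. (2009b).

Corollary 7. Suppose the regularization parameter is chosen as

$$\lambda_n \asymp 2 C_M \sqrt{\frac{\log p\, \log^2 n}{n^{1/2}}}$$

where $C_M$ is defined in Theorem 4. Then the nonparanormal estimator $\hat{\Omega}_n$ satisfies

$$P\left(\mathcal{G}(\hat{\Omega}_n, \Omega_0)\right) \ge 1 - o(1)$$

where $\mathcal{G}(\hat{\Omega}_n, \Omega_0)$ is the event

$$\left\{\mathrm{sign}\left(\hat{\Omega}_n(j, k)\right) = \mathrm{sign}\left(\Omega_0(j, k)\right),\ \forall\, (j, k) \in S\right\}.$$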

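Our persistency (risk consistency) result parallels the persistency result for additive models given in Ravikumar et al. (2009a), and allows model dimension that grows exponentially with sample size. The definition in this theorem uses the fact (from Lemma 11) that $\sup_x \Phi^{-1}(\tilde{F}_j(x)) \le \sqrt{2 \log n}$ when $\delta_n = 1/(4 n^{1/4} \sqrt{\pi \log n})$.

In the next theorem, we do not assume the true model is nonparanormal and define the population and sample risks as

$$R(f, \Omega) = \frac{1}{2}\left\{\mathrm{tr}\left[\Omega\, E\left(f(X) f(X)^T\right)\right] - \log|\Omega| + p \log(2\pi)\right\}$$

$$\hat{R}(f, \Omega) = \frac{1}{2}\left\{\mathrm{tr}\left[\Omega\, S_n(f)\right] - \log|\Omega| + p \log(2\pi)\right\}.$$

Theorem 8. Suppose that $p_n \le e^{n^{\xi}}$ for some $\xi < 1$, and define the classes

$$\mathcal{M}_n = \left\{f : \mathbb{R} \to \mathbb{R} : f \text{ is monotone with } \|f\|_{\infty} \le C \sqrt{\log n}\right\}$$

$$\mathcal{C}_n = \left\{\Omega : \|\Omega\|_1 \le L_n\right\}.$$

Let $\hat{\Omega}_n$ be given by

$$\hat{\Omega}_n = \arg\min_{\Omega \in \mathcal{C}_n} \left\{\mathrm{tr}\left(\Omega S_n(\tilde{f})\right) - \log|\Omega|\right\}.$$

Then

$$R(\tilde{f}_n, \hat{\Omega}_n) - \inf_{(f, \Omega) \in \mathcal{M}_n^p \oplus \mathcal{C}_n} R(f, \Omega) = O_P\left(L_n \sqrt{\frac{\log n}{n^{1 - \xi}}}\right).$$

Hence the Winsorized estimator of $(f, \Omega)$ with $\delta_n = 1/(4 n^{1/4} \sqrt{\pi \log n})$ is persistent over $\mathcal{C}_n$ when $L_n = o\left(n^{(1 - \xi)/2} / \sqrt{\log n}\right)$.

The proofs of Theorems 4 and 8 are given in Section 7.

6. Experimental Results

In this section, we report experimental results on synthetic and real data sets. We mainly compare the $\ell_1$-regularized nonparanormal and Gaussian (paranormal) models, computed using the graphical lasso algorithm (glasso) of Friedman et al. (2007). The primary conclusions are: (i) when the data are multivariate Gaussian, the performance of the two methods is comparable; (ii) when the model is correct, the nonparanormal performs much better than the graphical lasso in many cases; (iii) for a particular gene microarray data set, our method behaves differently from the graphical lasso, and may support different biological conclusions.

Note that we can reuse the glasso implementation to fit a sparse nonparanormal. In particular, after computing the Winsorized sample covariance $S_n(\tilde{f})$, we pass this matrix to the glasso routine to carry out the optimization

$$\hat{\Omega}_n = \arg\min_{\Omega} \left\{\mathrm{tr}\left(\Omega S_n(\tilde{f})\right) - \log|\Omega| + \lambda \|\Omega\|_1\right\}.$$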
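The quantity handed to the glasso routine, and the objective that routine minimizes, can be written down directly. The sketch below is not a glasso solver; it only computes the sample covariance of already-transformed data and evaluates the penalized objective for a candidate $\Omega$. The function names are hypothetical, the penalty is assumed to cover off-diagonal entries only, and the standard library is used to keep the example dependency-free.

```python
import math


def sample_covariance(rows):
    """Covariance of the rows with 1/n scaling; rows hold transformed observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / n
             for k in range(p)] for j in range(p)]


def log_det_spd(matrix):
    """log|A| for a symmetric positive definite A via a Cholesky factorization."""
    p = len(matrix)
    chol = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            acc = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(acc) if i == j else acc / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(p))


def penalized_objective(omega, s, lam):
    """tr(Omega S) - log|Omega| + lam * sum over j != k of |Omega_jk|."""
    p = len(omega)
    trace = sum(omega[j][k] * s[k][j] for j in range(p) for k in range(p))
    penalty = sum(abs(omega[j][k]) for j in range(p) for k in range(p) if j != k)
    return trace - log_det_spd(omega) + lam * penalty


# Toy usage: compare two candidate precision matrices for S = identity.
s_toy = [[1.0, 0.0], [0.0, 1.0]]
dense = penalized_objective([[2.0, 1.0], [1.0, 2.0]], s_toy, lam=0.5)
sparse = penalized_objective([[1.0, 0.0], [0.0, 1.0]], s_toy, lam=0.5)
```

In this toy case the identity matrix attains the smaller objective value, as expected when the sample covariance itself is the identity; an actual fit would delegate the minimization over $\Omega$ to an existing graphical lasso implementation.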

6.1 Neighborhood Graphs

We begin by describing a procedure to generate graphs as in (Meinshausen and Bühlmann, 2006), with respect to which several distributions can then be defined. We generate a $p$-dimensional sparse graph $G \equiv (V, E)$ as follows: Let $V = \{1, \ldots, p\}$ correspond to variables $X = (X_1, \ldots, X_p)$. We associate each index $j$ with a point $(Y_j^{(1)}, Y_j^{(2)}) \in [0, 1]^2$ where

$$Y_1^{(k)}, \ldots, Y_p^{(k)} \sim \mathrm{Uniform}[0, 1]$$

for $k = 1, 2$. Each pair of nodes $(i, j)$ is included in the edge set $E$ with probability

$$P\left((i, j) \in E\right) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\|y_i - y_j\|^2}{2s}\right)$$

where $y_i \equiv (y_i^{(1)}, y_i^{(2)})$ is the observation of $(Y_i^{(1)}, Y_i^{(2)})$ and $\|\cdot\|$ represents the Euclidean distance. Here, $s = 0.125$ is a parameter that controls the sparsity level of the generated graph. We restrict the maximum degree of the graph to be four and build the inverse covariance matrix $\Omega_0$ according to

$$\Omega_0(i, j) = \begin{cases} 1 & \text{if } i = j \\ 0.245 & \text{if } (i, j) \in E \\ 0 & \text{otherwise,} \end{cases}$$

where the value 0.245 guarantees positive definiteness of the inverse covariance matrix.

Given $\Omega_0$, $n$ data points are sampled from

$$X^{(1)}, \ldots, X^{(n)} \sim NPN(\mu_0, \Sigma_0, f_0)$$

where $\mu_0 = (1.5, \ldots, 1.5)$ and $\Sigma_0 = \Omega_0^{-1}$. For simplicity, the transformation functions for all dimensions are the same, $f_1 = \ldots = f_p = f$. To sample data from the nonparanormal distribution, we also require $g \equiv f^{-1}$; two different transformations $g$ are employed.

Definition 9 (Gaussian CDF Transformation). Let $g_0$ be a one-dimensional Gaussian cumulative distribution function with mean $\mu_{g_0}$ and standard deviation $\sigma_{g_0}$, that is,

$$g_0(t) \equiv \Phi\left(\frac{t - \mu_{g_0}}{\sigma_{g_0}}\right).$$

We define the transformation function $g_j = f_j^{-1}$ for the $j$-th dimension as

$$g_j(z_j) \equiv \sigma_j \left(\frac{g_0(z_j) - \int g_0(t)\, \phi\left(\frac{t - \mu_j}{\sigma_j}\right) dt}{\sqrt{\int \left(g_0(y) - \int g_0(t)\, \phi\left(\frac{t - \mu_j}{\sigma_j}\right) dt\right)^2 \phi\left(\frac{y - \mu_j}{\sigma_j}\right) dy}}\right) + \mu_j$$

where $\sigma_j = \sqrt{\Sigma_0(j, j)}$.

Figure 3: The power and cdf transformations. The densities are estimated using a kernel density estimator with bandwidths selected by cross-validation. (Panels: before transform, power transform, CDF transform; each density panel uses N = 5000 points; the lower panels show the identity function, the power function with alpha = 3, and the Gaussian CDF.)

Definition 10 (Symmetric Power Transformation). Let $g_0$ be the symmetric and odd transformation given by

$$g_0(t) = \mathrm{sign}(t)|t|^{\alpha}$$

where $\alpha > 0$ is a parameter. We define the power transformation for the $j$-th dimension as

$$g_j(z_j) \equiv \sigma_j \left(\frac{g_0(z_j - \mu_j)}{\sqrt{\int g_0^2(t - \mu_j)\, \phi\left(\frac{t - \mu_j}{\sigma_j}\right) dt}}\right) + \mu_j.$$

These transformations are constructed to preserve the marginal mean and standard deviation. In the following experiments, we refer to them as the cdf transformation and the power transformation, respectively. For the cdf transformation, we set $\mu_{g_0} = 0.05$ and $\sigma_{g_0} = 0.4$. For the power transformation, we set $\alpha = 3$.

To visualize these two transformations, we sample 5000 data points from a one-dimensional normal distribution $N(0.5, 1.0)$ and then apply the above two transformations; the results are shown in Figure 3. It can be seen how the cdf and power transformations map a univariate normal distribution into a highly skewed and a bi-modal distribution, respectively.
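The power transformation of Definition 10 admits a closed form once the normalizing integral is evaluated. The sketch below assumes that the integral is the second moment of $g_0(Z - \mu_j)$ under $Z \sim N(\mu_j, \sigma_j^2)$, which is the reading under which the standard deviation is preserved, and it uses the standard absolute-moment identity $E|Z - \mu|^{k} = \sigma^{k} 2^{k/2} \Gamma\left(\frac{k+1}{2}\right) / \sqrt{\pi}$. The function name is hypothetical.

```python
import math


def power_transform(z, mu, sigma, alpha=3.0):
    """g_j(z) = sigma * g0(z - mu) / sqrt(E[g0(Z - mu)^2]) + mu, g0(t) = sign(t)|t|^alpha."""
    centered = z - mu
    g0 = math.copysign(abs(centered) ** alpha, centered)
    # E|Z - mu|^(2 alpha) for Z ~ N(mu, sigma^2); equals 15 sigma^6 when alpha = 3.
    second_moment = (sigma ** (2 * alpha) * 2 ** alpha
                     * math.gamma(alpha + 0.5) / math.sqrt(math.pi))
    return sigma * g0 / math.sqrt(second_moment) + mu


# With mu = 0.5, sigma = 1 and alpha = 3 this reduces to (z - 0.5)^3 / sqrt(15) + 0.5.
value_at_mean = power_transform(0.5, mu=0.5, sigma=1.0)
value_above = power_transform(1.5, mu=0.5, sigma=1.0)
```

Because $g_0$ is odd about zero, the transformed variable keeps mean $\mu_j$, and dividing by the root second moment keeps its standard deviation at $\sigma_j$.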

Figure 4: Regularization paths for the glasso and nonparanormal with n = 500 (top) and n = 200 (bottom). (Panels: columns cdf, power, linear; rows glasso path and nonparanormal path.) The paths for the relevant variables (nonzero inverse covariance entries) are plotted as solid black lines; the paths for the irrelevant variables are plotted as dashed red lines. For non-Gaussian distributions, the nonparanormal better separates the relevant and irrelevant dimensions.

To generate synthetic data, we set $p = 40$, resulting in $\binom{40}{2} + 40 = 820$ parameters to be estimated, and vary the sample sizes from $n = 200$ to $n = 1000$. Three conditions are considered, corresponding to using the cdf transform, the power transform, or no transformation. In each case, both the glasso and the nonparanormal are applied to estimate the graph.
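The graph-generation procedure of Section 6.1 could be sketched along the following lines. Several details are assumptions made for this illustration rather than facts from the text: the maximum degree is enforced by skipping any sampled edge that would push a node above four neighbours, the off-diagonal value 0.245 is chosen so that each row is strictly diagonally dominant (4 × 0.245 < 1), and the sparsity value passed in the usage line is illustrative. The function name is hypothetical.

```python
import math
import random


def generate_graph(p, s, max_degree=4, edge_value=0.245, seed=0):
    """Sample a sparse neighborhood graph and the matching precision matrix."""
    rng = random.Random(seed)
    # One uniform point in the unit square per variable.
    points = [(rng.random(), rng.random()) for _ in range(p)]
    degree = [0] * p
    edges = set()
    for i in range(p):
        for j in range(i + 1, p):
            dist2 = ((points[i][0] - points[j][0]) ** 2
                     + (points[i][1] - points[j][1]) ** 2)
            prob = math.exp(-dist2 / (2.0 * s)) / math.sqrt(2.0 * math.pi)
            # Assumed rule: drop edges that would exceed the degree cap.
            if rng.random() < prob and degree[i] < max_degree and degree[j] < max_degree:
                edges.add((i, j))
                degree[i] += 1
                degree[j] += 1
    omega = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for i, j in edges:
        omega[i][j] = omega[j][i] = edge_value
    return edges, omega


edges, omega0 = generate_graph(p=40, s=0.125, seed=1)
```

Strict diagonal dominance of a symmetric matrix with positive diagonal implies positive definiteness, so the returned matrix is a valid precision matrix for any graph produced this way.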

6.1.1 COMPARISON OF REGULARIZATION PATHS

We choose a set of regularization parameters $\Lambda$; for each $\lambda \in \Lambda$, we obtain an estimate $\hat{\Omega}_n^{\lambda}$ which is a $40 \times 40$ matrix. The upper triangular matrix has 780 parameters; we vectorize it to get a 780-dimensional parameter vector. A regularization path is the trace of these parameters over all the regularization parameters within $\Lambda$. The regularization paths for both methods are plotted in Figure 4. For the cdf transformation and the power transformation, the nonparanormal separates the relevant and the irrelevant dimensions very well. For the glasso, relevant variables are mixed with irrelevant variables. If no transformation is applied, the paths for both methods are almost the same.

6.1.2 ESTIMATED TRANSFORMATIONS

For sample size $n = 1000$, we plot the estimated transformations for three of the variables in Figure 5. It is clear that Winsorization plays a significant role for the power transformation. This is intuitive due to the high skewness of the nonparanormal distribution in this case.

Figure 5: Estimated transformations for the first three variables. (Panels: columns cdf, power, linear; rows x1, x2, x3; each panel overlays the estimated and true transformation.) Winsorization plays a significant role for the power transformation due to its high skewness.

Figure 6: Boxplots of the oracle scores for n = 1000, 500, 200 (top, center, bottom). (Panels: columns cdf, power, linear; each panel compares the nonparanormal and the glasso.)

6.1.3 QUANTITATIVE COMPARISON

To evaluate the performance for structure estimation quantitatively, we use false positive and false negative rates. Let $G = (V, E)$ be a $p$-dimensional graph (which has at most $\binom{p}{2}$ edges) in which there are $|E| = r$ edges, and let $\hat{G}^{\lambda} = (V, \hat{E}^{\lambda})$ be an estimated graph using the regularization parameter $\lambda$. The number of false positives at $\lambda$ is

$$FP(\lambda) \equiv \text{number of edges in } \hat{E}^{\lambda} \text{ not in } E.$$

The number of false negatives at $\lambda$ is defined as

$$FN(\lambda) \equiv \text{number of edges in } E \text{ not in } \hat{E}^{\lambda}.$$

The oracle regularization level $\lambda^*$ is then

$$\lambda^* = \arg\min_{\lambda \in \Lambda} \left\{FP(\lambda) + FN(\lambda)\right\}.$$

The oracle score is $FP(\lambda^*) + FN(\lambda^*)$. Figure 6 shows boxplots of the oracle scores for the two methods, calculated using 100 simulations.
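The error counts and the oracle score defined above translate directly into set operations. The sketch below assumes that each graph is represented as a set of index pairs $(i, j)$ with $i < j$ and that a regularization path is given as a dictionary from $\lambda$ to the estimated edge set; these representations and the function names are choices made for the example.

```python
def false_positives(estimated, true_edges):
    """FP: number of estimated edges that are not in the true graph."""
    return len(estimated - true_edges)


def false_negatives(estimated, true_edges):
    """FN: number of true edges missing from the estimated graph."""
    return len(true_edges - estimated)


def oracle(path, true_edges):
    """Return (lambda*, oracle score) minimizing FP + FN over the path."""
    scores = {lam: false_positives(est, true_edges) + false_negatives(est, true_edges)
              for lam, est in path.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Toy path over three regularization levels for a 3-node chain graph.
true_edges = {(0, 1), (1, 2)}
path = {
    0.1: {(0, 1), (1, 2), (0, 2)},   # too dense: one false positive
    0.3: {(0, 1), (1, 2)},           # exact recovery
    0.9: set(),                      # too sparse: two false negatives
}
best_lambda, best_score = oracle(path, true_edges)
```

The oracle level depends on the true graph, so it is available only in simulations; it serves as a best-case benchmark for comparing methods rather than as a practical rule for choosing $\lambda$.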

To illustrate the overall performance of these two methods over the full paths, ROC curves are shown in Figure 7, using

$$\left(1 - \frac{FN(\lambda)}{r},\ 1 - \frac{FP(\lambda)}{\binom{p}{2} - r}\right).$$

The curves clearly show how the performance of both methods improves with sample size, and that the nonparanormal is superior to the Gaussian model in most cases.

Figure 7: ROC curves for sample sizes n = 1000, 500, 200 (top, middle, bottom). (Panels: columns CDF Transform, Power Transform, No Transform; each panel compares the nonparanormal and the glasso.)

Let $FPE \equiv FP(\lambda^*)$ and $FNE \equiv FN(\lambda^*)$. Tables 1, 2, and 3 provide numerical comparisons of both methods on data sets with different transformations, where we repeat the experiments 100 times and report the average FPE and FNE values with the corresponding standard deviations. It is clear from the tables that the nonparanormal achieves significantly smaller errors than the glasso if the true distribution of the data is not multivariate Gaussian, and achieves performance comparable to the glasso when the true distribution is exactly multivariate Gaussian.

Figure 8 shows typical runs for the cdf and power transformations. It is clear that when the glasso estimates the graph incorrectly, the mistakes include both false positives and negatives.

Table 1: Quantitative comparison on the data set using the cdf transformation. For both FPE and FNE, the nonparanormal performs much better in general. (Columns, for each of the nonparanormal and the glasso: FPE, sd(FPE), FNE, sd(FNE). The table entries are missing from this transcription.)

Table 2: Quantitative comparison on the data set using the power transformation. For both FPE and FNE, the nonparanormal performs much better in general. (Same columns as Table 1; the table entries are missing from this transcription.)

Table 3: Quantitative comparison on the data set without any transformation. The two methods behave similarly; the glasso is slightly better. (Same columns as Table 1; the table entries are missing from this transcription.)

Figure 8: Typical runs for the two methods for n = 1000 using the cdf and power transformations. (Panels, for each transformation and with p = 40: true graph, nonparanormal, graphical lasso, symmetric difference.) The dashed black lines in the symmetric difference plots indicate edges found by the glasso but not the nonparanormal, and vice-versa for the solid red lines.

6.1.4 COMPARISON IN THE GAUSSIAN CASE

The previous experiments indicate that the nonparanormal works almost as well as the glasso in the Gaussian case. This initially appears surprising, since a parametric method is expected to be more efficient than a nonparametric method if the parametric assumption is correct. To manifest this efficiency loss, we conducted some experiments with very small $n$ and relatively large $p$. For multivariate Gaussian models, Figure 9 shows results with $(n, p, s) = (50, 40, 1/8)$, $(50, 100, 1/15)$ and $(30, 100, 1/15)$. From the mean ROC curves, we see that the nonparanormal does indeed behave worse than the glasso, suggesting some efficiency loss. However, from the corresponding boxplots, the efficiency reduction is relatively insignificant.

6.1.5 THE CASE WHEN $p \gg n$

Figure 10 shows results from a simulation of the nonparanormal using cdf transformations with $n = 200$, $p = 500$ and sparsity level $s = 1/40$. The boxplot shows that the nonparanormal outperforms the glasso. A typical run of the regularization paths confirms this conclusion, showing that the nonparanormal path separates the relevant and irrelevant dimensions very well. In contrast, with the glasso the relevant variables are "buried" among the irrelevant variables.

6.2 Gene Microarray Data

In this study, we consider a data set based on Affymetrix GeneChip microarrays for the plant Arabidopsis thaliana (Wille et al., 2004). The sample size is $n = 118$. The expression levels for each chip are pre-processed by log-transformation and standardization. A subset of 40 genes from the isoprenoid pathway are chosen, and we study the associations among them using both the paranormal and nonparanormal models. Even though these data are generally treated as multivariate Gaussian in the previous analysis (Wille et al., 2004), our study shows that the results of the nonparanormal and the glasso are very different over a wide range of regularization parameters. This suggests the nonparanormal could support different scientific conclusions.

6.2.1 COMPARISON OF THE REGULARIZATION PATHS

We first compare the regularization paths of the two methods, in Figure 11. To generate the paths, we select 50 regularization parameters on an evenly spaced grid in the interval $[0.16, 1.2]$. Although the paths for the two methods look similar, there are some subtle differences. In particular, variables become nonzero in a different order, especially when the regularization parameter is in the range $\lambda \in [0.2, 0.3]$. As shown below, these subtle differences in the paths lead to different model selection behaviors.

6.2.2 COMPARISON OF THE ESTIMATED GRAPHS

Figure 12 compares the estimated graphs for the two methods at several values of the regularization parameter $\lambda$ in the range $[0.16, 0.37]$. For each $\lambda$, we show the estimated graph from the nonparanormal in the first column. In the second column we show the graph obtained by scanning the full


Multi-server Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu Multi-server Optimal Badwidth Moitorig for QoS based Multimedia Delivery Aup Basu, Iree Cheg ad Yizhe Yu Departmet of Computig Sciece U. of Alberta Architecture Applicatio Layer Request receptio -coectio

More information

Unbiased Estimation. Topic 14. 14.1 Introduction

Unbiased Estimation. Topic 14. 14.1 Introduction Topic 4 Ubiased Estimatio 4. Itroductio I creatig a parameter estimator, a fudametal questio is whether or ot the estimator differs from the parameter i a systematic maer. Let s examie this by lookig a

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information