A Kernel Two-Sample Test


Journal of Machine Learning Research 13 (2012). Submitted 4/08; Revised 11/11; Published 3/12

Arthur Gretton*
MPI for Intelligent Systems
Spemannstrasse, Tübingen, Germany

Karsten M. Borgwardt†
Machine Learning and Computational Biology Research Group
Max Planck Institutes Tübingen
Spemannstrasse, Tübingen, Germany

Malte J. Rasch‡
19 XinJieKouWai St.
State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University
Beijing, 100875, P.R. China

Bernhard Schölkopf
MPI for Intelligent Systems
Spemannstrasse, Tübingen, Germany

Alexander Smola§
Yahoo! Research
2821 Mission College Blvd
Santa Clara, CA 95054, USA

Editor: Nicolas Vayatis

Abstract

We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.

*. Also at Gatsby Computational Neuroscience Unit, CSML, 17 Queen Square, London WC1N 3AR, UK.
†. This work was carried out while K.M.B. was with the Ludwig-Maximilians-Universität München.
‡. This work was carried out while M.J.R. was with the Graz University of Technology.
§. Also at The Australian National University, Canberra, ACT 0200, Australia.

© 2012 Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf and Alexander Smola.
Keywords: kernel methods, two-sample test, uniform convergence bounds, schema matching, integral probability metric, hypothesis testing

1. Introduction

We address the problem of comparing samples from two probability distributions, by proposing statistical tests of the null hypothesis that these distributions are equal against the alternative hypothesis that these distributions are different (this is called the two-sample problem). Such tests have application in a variety of areas. In bioinformatics, it is of interest to compare microarray data from identical tissue types as measured by different laboratories, to detect whether the data may be analysed jointly, or whether differences in experimental procedure have caused systematic differences in the data distributions. Equally of interest are comparisons between microarray data from different tissue types, either to determine whether two subtypes of cancer may be treated as statistically indistinguishable from a diagnosis perspective, or to detect differences in healthy and cancerous tissue. In database attribute matching, it is desirable to merge databases containing multiple fields, where it is not known in advance which fields correspond: the fields are matched by maximising the similarity in the distributions of their entries.

We test whether distributions $p$ and $q$ are different on the basis of samples drawn from each of them, by finding a well behaved (e.g., smooth) function which is large on the points drawn from $p$, and small (as negative as possible) on the points from $q$. We use as our test statistic the difference between the mean function values on the two samples; when this is large, the samples are likely from different distributions. We call this test statistic the Maximum Mean Discrepancy (MMD).

Clearly the quality of the MMD as a statistic depends on the class $\mathcal{F}$ of smooth functions that define it. On one hand, $\mathcal{F}$ must be rich enough so that the population MMD vanishes if and only if $p = q$.
On the other hand, for the test to be consistent in power, $\mathcal{F}$ needs to be restrictive enough for the empirical estimate of the MMD to converge quickly to its expectation as the sample size increases. We will use the unit balls in characteristic reproducing kernel Hilbert spaces (Fukumizu et al., 2008; Sriperumbudur et al., 2010b) as our function classes, since these will be shown to satisfy both of the foregoing properties. We also review classical metrics on distributions, namely the Kolmogorov-Smirnov and Earth-Mover's distances, which are based on different function classes; collectively these are known as integral probability metrics (Müller, 1997). On a more practical note, the MMD has a reasonable computational cost, when compared with other two-sample tests: given $m$ points sampled from $p$ and $n$ from $q$, the cost is $O((m+n)^2)$ time. We also propose a test statistic with a computational cost of $O(m+n)$: the associated test can achieve a given Type II error at a lower overall computational cost than the quadratic-cost test, by looking at a larger volume of data.

We define three nonparametric statistical tests based on the MMD. The first two tests are distribution-free, meaning they make no assumptions regarding $p$ and $q$, albeit at the expense of being conservative in detecting differences between the distributions. The third test is based on the asymptotic distribution of the MMD, and is in practice more sensitive to differences in distribution at small sample sizes. The present work synthesizes and expands on results of Gretton et al. (2007a,b) and Smola et al. (2007),[1] who in turn build on the earlier work of Borgwardt et al. (2006). Note that the latter addresses only the third kind of test, and that the approach of Gretton et al. (2007a,b) is rigorous in its treatment of the asymptotic distribution of the test statistic under the null hypothesis.

[1] In particular, most of the proofs here were not provided by Gretton et al. (2007a), but in an accompanying technical report (Gretton et al., 2008a), which this document replaces.

We begin our presentation in Section 2 with a formal definition of the MMD. We review the notion of a characteristic RKHS, and establish that when $\mathcal{F}$ is a unit ball in a characteristic RKHS, then the population MMD is zero if and only if $p = q$. We further show that universal RKHSs in the sense of Steinwart (2001) are characteristic. In Section 3, we give an overview of hypothesis testing as it applies to the two-sample problem, and review alternative test statistics, including the $L_2$ distance between kernel density estimates (Anderson et al., 1994), which is the prior approach closest to our work. We present our first two hypothesis tests in Section 4, based on two different bounds on the deviation between the population and empirical MMD. We take a different approach in Section 5, where we use the asymptotic distribution of the empirical MMD estimate as the basis for a third test. When large volumes of data are available, the cost of computing the MMD (quadratic in the sample size) may be excessive: we therefore propose in Section 6 a modified version of the MMD statistic that has a linear cost in the number of samples, and an associated asymptotic test. In Section 7, we provide an overview of methods related to the MMD in the statistics and machine learning literature. We also review alternative function classes for which the MMD defines a metric on probability distributions. Finally, in Section 8, we demonstrate the performance of MMD-based two-sample tests on problems from neuroscience, bioinformatics, and attribute matching using the Hungarian marriage method. Our approach performs well on high dimensional data with low sample size; in addition, we are able to successfully distinguish distributions on graph data, for which ours is the first proposed test. A Matlab implementation of the tests is at gretton/mmd/mmd.htm.
2. The Maximum Mean Discrepancy

In this section, we present the maximum mean discrepancy (MMD), and describe conditions under which it is a metric on the space of probability distributions. The MMD is defined in terms of particular function spaces that witness the difference in distributions: we therefore begin in Section 2.1 by introducing the MMD for an arbitrary function space. In Section 2.2, we compute both the population MMD and two empirical estimates when the associated function space is a reproducing kernel Hilbert space, and in Section 2.3 we derive the RKHS function that witnesses the MMD for a given pair of distributions.

2.1 Definition of the Maximum Mean Discrepancy

Our goal is to formulate a statistical test that answers the following question:

Problem 1 Let $x$ and $y$ be random variables defined on a topological space $\mathcal{X}$, with respective Borel probability measures $p$ and $q$. Given observations $X := \{x_1, \ldots, x_m\}$ and $Y := \{y_1, \ldots, y_n\}$, independently and identically distributed (i.i.d.) from $p$ and $q$, respectively, can we decide whether $p \neq q$?

Where there is no ambiguity, we use the shorthand notation $E_x[f(x)] := E_{x \sim p}[f(x)]$ and $E_y[f(y)] := E_{y \sim q}[f(y)]$ to denote expectations with respect to $p$ and $q$, respectively, where $x \sim p$ indicates $x$ has distribution $p$. To start with, we wish to determine a criterion that, in the population setting, takes on a unique and distinctive value only when $p = q$. It will be defined based on Lemma 9.3.2 of Dudley (2002).
Lemma 1 Let $(\mathcal{X}, d)$ be a metric space, and let $p, q$ be two Borel probability measures defined on $\mathcal{X}$. Then $p = q$ if and only if $E_x(f(x)) = E_y(f(y))$ for all $f \in C(\mathcal{X})$, where $C(\mathcal{X})$ is the space of bounded continuous functions on $\mathcal{X}$.

Although $C(\mathcal{X})$ in principle allows us to identify $p = q$ uniquely, it is not practical to work with such a rich function class in the finite sample setting. We thus define a more general class of statistic, for as yet unspecified function classes $\mathcal{F}$, to measure the disparity between $p$ and $q$ (Fortet and Mourier, 1953; Müller, 1997).

Definition 2 Let $\mathcal{F}$ be a class of functions $f : \mathcal{X} \to \mathbb{R}$ and let $p, q, x, y, X, Y$ be defined as above. We define the maximum mean discrepancy (MMD) as

$$\mathrm{MMD}[\mathcal{F}, p, q] := \sup_{f \in \mathcal{F}} \left( E_x[f(x)] - E_y[f(y)] \right). \tag{1}$$

In the statistics literature, this is known as an integral probability metric (Müller, 1997). A biased empirical estimate[2] of the MMD is obtained by replacing the population expectations with empirical expectations computed on the samples $X$ and $Y$,

$$\mathrm{MMD}_b[\mathcal{F}, X, Y] := \sup_{f \in \mathcal{F}} \left( \frac{1}{m} \sum_{i=1}^{m} f(x_i) - \frac{1}{n} \sum_{i=1}^{n} f(y_i) \right). \tag{2}$$

We must therefore identify a function class that is rich enough to uniquely identify whether $p = q$, yet restrictive enough to provide useful finite sample estimates (the latter property will be established in subsequent sections).

2.2 The MMD in Reproducing Kernel Hilbert Spaces

In the present section, we propose as our MMD function class $\mathcal{F}$ the unit ball in a reproducing kernel Hilbert space $\mathcal{H}$. We will provide finite sample estimates of this quantity (both biased and unbiased), and establish conditions under which the MMD can be used to distinguish between probability measures. Other possible function classes $\mathcal{F}$ are discussed in Sections 7.1 and 7.2.

We first review some properties of $\mathcal{H}$ (Schölkopf and Smola, 2002). Since $\mathcal{H}$ is an RKHS, the operator of evaluation $\delta_x$ mapping $f \in \mathcal{H}$ to $f(x) \in \mathbb{R}$ is continuous. Thus, by the Riesz representation theorem (Reed and Simon, 1980, Theorem II.4), there is a feature mapping $\phi(x)$ from $\mathcal{X}$ to $\mathbb{R}$ such that $f(x) = \langle f, \phi(x) \rangle_{\mathcal{H}}$.
This feature mapping takes the canonical form $\phi(x) = k(x, \cdot)$ (Steinwart and Christmann, 2008, Lemma 4.19), where $k(x_1, x_2) : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is positive definite, and the notation $k(x, \cdot)$ indicates the kernel has one argument fixed at $x$, and the second free. Note in particular that $\langle \phi(x), \phi(y) \rangle_{\mathcal{H}} = k(x, y)$. We will generally use the more concise notation $\phi(x)$ for the feature mapping, although in some cases it will be clearer to write $k(x, \cdot)$.

We next extend the notion of feature map to the embedding of a probability distribution: we will define an element $\mu_p \in \mathcal{H}$ such that $E_x f = \langle f, \mu_p \rangle_{\mathcal{H}}$ for all $f \in \mathcal{H}$, which we call the mean embedding of $p$. Embeddings of probability measures into reproducing kernel Hilbert spaces are well established in the statistics literature: see Berlinet and Thomas-Agnan (2004, Chapter 4) for further detail and references. We begin by establishing conditions under which the mean embedding $\mu_p$ exists (Fukumizu et al., 2004, p. 93), (Sriperumbudur et al., 2010b, Theorem 1).

[2] The empirical MMD defined below has an upward bias; we will define an unbiased statistic in the following section.
Lemma 3 If $k(\cdot, \cdot)$ is measurable and $E_x \sqrt{k(x, x)} < \infty$ then $\mu_p \in \mathcal{H}$.

Proof The linear operator $T_p f := E_x f$ for all $f \in \mathcal{F}$ is bounded under the assumption, since

$$|T_p f| = |E_x f| \le E_x |f| = E_x \left| \langle f, \phi(x) \rangle_{\mathcal{H}} \right| \le E_x \left( \sqrt{k(x, x)} \, \|f\|_{\mathcal{H}} \right).$$

Hence by the Riesz representation theorem, there exists a $\mu_p \in \mathcal{H}$ such that $T_p f = \langle f, \mu_p \rangle_{\mathcal{H}}$. If we set $f = \phi(t) = k(t, \cdot)$, we obtain $\mu_p(t) = \langle \mu_p, k(t, \cdot) \rangle_{\mathcal{H}} = E_x k(t, x)$: in other words, the mean embedding of the distribution $p$ is the expectation under $p$ of the canonical feature map.

We next show that the MMD may be expressed as the distance in $\mathcal{H}$ between mean embeddings (Borgwardt et al., 2006).

Lemma 4 Assume the condition in Lemma 3 for the existence of the mean embeddings $\mu_p$, $\mu_q$ is satisfied. Then

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \| \mu_p - \mu_q \|_{\mathcal{H}}^2.$$

Proof

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \left[ \sup_{\|f\|_{\mathcal{H}} \le 1} \left( E_x[f(x)] - E_y[f(y)] \right) \right]^2 = \left[ \sup_{\|f\|_{\mathcal{H}} \le 1} \langle \mu_p - \mu_q, f \rangle_{\mathcal{H}} \right]^2 = \| \mu_p - \mu_q \|_{\mathcal{H}}^2.$$

We now establish a condition on the RKHS $\mathcal{H}$ under which the mean embedding $\mu_p$ is injective, which indicates that $\mathrm{MMD}[\mathcal{F}, p, q]$ is a metric[3] on the Borel probability measures on $\mathcal{X}$. Evidently, this property will not hold for all $\mathcal{H}$: for instance, a polynomial RKHS of degree two cannot distinguish between distributions with the same mean and variance, but different kurtosis (Sriperumbudur et al., 2010b, Example 3). The MMD is a metric, however, when $\mathcal{H}$ is a universal RKHS, defined on a compact metric space $\mathcal{X}$. Universality requires that $k(\cdot, \cdot)$ be continuous, and $\mathcal{H}$ be dense in $C(\mathcal{X})$ with respect to the $L_\infty$ norm. Steinwart (2001) proves that the Gaussian and Laplace RKHSs are universal.

Theorem 5 Let $\mathcal{F}$ be a unit ball in a universal RKHS $\mathcal{H}$, defined on the compact metric space $\mathcal{X}$, with associated continuous kernel $k(\cdot, \cdot)$. Then $\mathrm{MMD}[\mathcal{F}, p, q] = 0$ if and only if $p = q$.

Proof The proof follows Cortes et al. (2008, Supplementary Appendix), whose approach is clearer than the original proof of Gretton et al. (2008a, p. 4).[4] First, it is clear that $p = q$ implies $\mathrm{MMD}[\mathcal{F}, p, q]$ is zero. We now prove the converse. By the universality of $\mathcal{H}$, for any given $\varepsilon > 0$ and $f \in C(\mathcal{X})$ there exists a $g \in \mathcal{H}$ such that

$$\| f - g \|_\infty \le \varepsilon.$$

We next make the expansion

$$|E_x f(x) - E_y f(y)| \le |E_x f(x) - E_x g(x)| + |E_x g(x) - E_y g(y)| + |E_y g(y) - E_y f(y)|.$$

The first and third terms satisfy

$$|E_x f(x) - E_x g(x)| \le E_x |f(x) - g(x)| \le \varepsilon.$$

Next, write

$$E_x g(x) - E_y g(y) = \langle g, \mu_p - \mu_q \rangle_{\mathcal{H}} = 0,$$

since $\mathrm{MMD}[\mathcal{F}, p, q] = 0$ implies $\mu_p = \mu_q$. Hence

$$|E_x f(x) - E_y f(y)| \le 2\varepsilon$$

for all $f \in C(\mathcal{X})$ and $\varepsilon > 0$, which implies $p = q$ by Lemma 1.

[3] According to Dudley (2002, p. 26) a metric $d(x, y)$ satisfies the following four properties: symmetry, triangle inequality, $d(x, x) = 0$, and $d(x, y) = 0 \implies x = y$. A pseudo-metric only satisfies the first three properties.

[4] Note that the proof of Cortes et al. (2008) requires an application of the dominated convergence theorem, rather than using the Riesz representation theorem to show the existence of the mean embeddings $\mu_p$ and $\mu_q$ as we did in Lemma 3.

While our result establishes the mapping $\mu_p$ is injective for universal kernels on compact domains, this result can also be shown in more general cases. Fukumizu et al. (2008) introduce the notion of characteristic kernels, these being kernels for which the mean map is injective. Fukumizu et al. establish that Gaussian and Laplace kernels are characteristic on $\mathbb{R}^d$, and thus that the associated MMD is a metric on distributions for this domain. Sriperumbudur et al. (2008, 2010b) and Sriperumbudur et al. (2011a) further explore the properties of characteristic kernels, providing a simple condition to determine whether translation invariant kernels are characteristic, and investigating the relation between universal and characteristic kernels on non-compact domains.

Given we are in an RKHS, we may easily obtain an expression for the squared MMD, $\| \mu_p - \mu_q \|_{\mathcal{H}}^2$, in terms of kernel functions, and a corresponding unbiased finite sample estimate.

Lemma 6 Given $x$ and $x'$ independent random variables with distribution $p$, and $y$ and $y'$ independent random variables with distribution $q$, the squared population MMD is

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = E_{x, x'}\left[ k(x, x') \right] - 2 E_{x, y}\left[ k(x, y) \right] + E_{y, y'}\left[ k(y, y') \right],$$

where $x'$ is an independent copy of $x$ with the same distribution, and $y'$ is an independent copy of $y$. An unbiased empirical estimate is a sum of two U-statistics and a sample average,

$$\mathrm{MMD}_u^2[\mathcal{F}, X, Y] = \frac{1}{m(m-1)} \sum_{i=1}^{m} \sum_{j \neq i}^{m} k(x_i, x_j) + \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i}^{n} k(y_i, y_j) - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j). \tag{3}$$

When $m = n$, a slightly simpler empirical estimate may be used. Let $Z := (z_1, \ldots, z_m)$ be $m$ i.i.d. random variables, where $z := (x, y) \sim p \times q$ (i.e., $x$ and $y$ are independent). An unbiased estimate of $\mathrm{MMD}^2$ is

$$\mathrm{MMD}_u^2[\mathcal{F}, X, Y] = \frac{1}{m(m-1)} \sum_{i \neq j}^{m} h(z_i, z_j), \tag{4}$$
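which is a one-sample U-statistic with

$$h(z_i, z_j) := k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i).$$

Proof Starting from the expression for $\mathrm{MMD}^2[\mathcal{F}, p, q]$ in Lemma 4,

$$\begin{aligned}
\mathrm{MMD}^2[\mathcal{F}, p, q] &= \| \mu_p - \mu_q \|_{\mathcal{H}}^2 \\
&= \langle \mu_p, \mu_p \rangle_{\mathcal{H}} + \langle \mu_q, \mu_q \rangle_{\mathcal{H}} - 2 \langle \mu_p, \mu_q \rangle_{\mathcal{H}} \\
&= E_{x, x'} \langle \phi(x), \phi(x') \rangle_{\mathcal{H}} + E_{y, y'} \langle \phi(y), \phi(y') \rangle_{\mathcal{H}} - 2 E_{x, y} \langle \phi(x), \phi(y) \rangle_{\mathcal{H}}.
\end{aligned}$$

The proof is completed by applying $\langle \phi(x), \phi(x') \rangle_{\mathcal{H}} = k(x, x')$; the empirical estimates follow straightforwardly, by replacing the population expectations with their corresponding U-statistics and sample averages. This statistic is unbiased following Serfling (1980, Chapter 5).

Note that $\mathrm{MMD}_u^2$ may be negative, since it is an unbiased estimator of $(\mathrm{MMD}[\mathcal{F}, p, q])^2$. The only terms missing to ensure nonnegativity, however, are $h(z_i, z_i)$, which were omitted to remove spurious correlations between observations. Consequently we have the bound

$$\mathrm{MMD}_u^2 + \frac{1}{m(m-1)} \sum_{i=1}^{m} \left[ k(x_i, x_i) + k(y_i, y_i) - 2 k(x_i, y_i) \right] \ge 0.$$

Moreover, while the empirical statistic for $m = n$ is an unbiased estimate of $\mathrm{MMD}^2$, it does not have minimum variance, since we ignore the cross-terms $k(x_i, y_i)$, of which there are $O(n)$. From (3), however, we see the minimum variance estimate is almost identical (Serfling, 1980, Section 5.1.4).

The biased statistic in (2) may also be easily computed following the above reasoning. Substituting the empirical estimates $\mu_X := \frac{1}{m} \sum_{i=1}^{m} \phi(x_i)$ and $\mu_Y := \frac{1}{n} \sum_{i=1}^{n} \phi(y_i)$ of the feature space means based on respective samples $X$ and $Y$, we obtain

$$\mathrm{MMD}_b[\mathcal{F}, X, Y] = \left[ \frac{1}{m^2} \sum_{i, j = 1}^{m} k(x_i, x_j) - \frac{2}{mn} \sum_{i, j = 1}^{m, n} k(x_i, y_j) + \frac{1}{n^2} \sum_{i, j = 1}^{n} k(y_i, y_j) \right]^{1/2}. \tag{5}$$

Note that the U-statistics of (3) have been replaced by V-statistics. Intuitively we expect the empirical test statistic $\mathrm{MMD}[\mathcal{F}, X, Y]$, whether biased or unbiased, to be small if $p = q$, and large if the distributions are far apart. It costs $O((m+n)^2)$ time to compute both statistics.

2.3 Witness Function of the MMD for RKHSs

We define the witness function $f^*$ to be the RKHS function attaining the supremum in (1), and its empirical estimate $\hat{f}^*$ to be the function attaining the supremum in (2). From the reasoning in Lemma 4, it is clear that

$$\begin{aligned}
f^*(t) &\propto \langle \phi(t), \mu_p - \mu_q \rangle_{\mathcal{H}} = E_x[k(x, t)] - E_y[k(y, t)], \\
\hat{f}^*(t) &\propto \langle \phi(t), \mu_X - \mu_Y \rangle_{\mathcal{H}} = \frac{1}{m} \sum_{i=1}^{m} k(x_i, t) - \frac{1}{n} \sum_{i=1}^{n} k(y_i, t),
\end{aligned}$$

where we have defined $\mu_X = m^{-1} \sum_{i=1}^{m} \phi(x_i)$, and $\mu_Y$ by analogy.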
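As a rough illustration of Equations (3) and (5) and of the empirical witness function, the following Python sketch evaluates them on one-dimensional samples. It is a minimal sketch under stated assumptions, not the authors' Matlab implementation: the function names are hypothetical, the data are scalars, and the kernel is assumed to be the Gaussian kernel $k(x, y) = \exp(-(x - y)^2 / (2\sigma^2))$ with an arbitrarily chosen bandwidth.

```python
import math
import random


def gaussian_kernel(x, y, sigma=0.5):
    """Assumed kernel: k(x, y) = exp(-(x - y)^2 / (2 sigma^2)) on the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def mmd2_unbiased(X, Y, k=gaussian_kernel):
    """Unbiased estimate of MMD^2, Equation (3): two U-statistics and a sample average."""
    m, n = len(X), len(Y)
    kxx = sum(k(X[i], X[j]) for i in range(m) for j in range(m) if i != j)
    kyy = sum(k(Y[i], Y[j]) for i in range(n) for j in range(n) if i != j)
    kxy = sum(k(x, y) for x in X for y in Y)
    return kxx / (m * (m - 1)) + kyy / (n * (n - 1)) - 2.0 * kxy / (m * n)


def mmd_biased(X, Y, k=gaussian_kernel):
    """Biased estimate of MMD, Equation (5): V-statistics followed by a square root."""
    m, n = len(X), len(Y)
    kxx = sum(k(a, b) for a in X for b in X)
    kyy = sum(k(a, b) for a in Y for b in Y)
    kxy = sum(k(x, y) for x in X for y in Y)
    # max(..., 0.0) guards against tiny negative values caused by rounding.
    return math.sqrt(max(kxx / m ** 2 - 2.0 * kxy / (m * n) + kyy / n ** 2, 0.0))


def witness(t, X, Y, k=gaussian_kernel):
    """Unnormalised empirical witness function evaluated at the point t."""
    return sum(k(x, t) for x in X) / len(X) - sum(k(y, t) for y in Y) / len(Y)


rng = random.Random(0)
X = [rng.gauss(0.0, 1.0) for _ in range(200)]
Y = [rng.gauss(1.0, 1.0) for _ in range(200)]   # shifted mean, so p != q
X2 = [rng.gauss(0.0, 1.0) for _ in range(200)]  # second sample from p

print("MMD_u^2 (p vs q):", mmd2_unbiased(X, Y))
print("MMD_u^2 (p vs p):", mmd2_unbiased(X, X2))
print("witness at -1 and 2:", witness(-1.0, X, Y), witness(2.0, X, Y))
```

Both estimators loop over all pairs of points, which matches the quadratic cost noted above. The witness values are positive where the first sample is denser and negative where the second is denser.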
The result follows since the unit vector $v$ maximizing $\langle v, x \rangle_{\mathcal{H}}$ in a Hilbert space is $v = x / \|x\|_{\mathcal{H}}$.

We illustrate the behavior of MMD in Figure 1 using a one-dimensional example. The data $X$ and $Y$ were generated from distributions $p$ and $q$ with equal means and variances, with $p$ Gaussian and $q$ Laplacian. We chose $\mathcal{F}$ to be the unit ball in a Gaussian RKHS. The empirical estimate $\hat{f}^*$ of the function $f^*$ that witnesses the MMD (in other words, the function maximizing the mean discrepancy in (1)) is smooth, negative where the Laplace density exceeds the Gaussian density (at the center and tails), and positive where the Gaussian density is larger. The magnitude of $\hat{f}^*$ is a direct reflection of the amount by which one density exceeds the other, insofar as the smoothness constraint permits it.

Figure 1: Illustration of the function maximizing the mean discrepancy in the case where a Gaussian is being compared with a Laplace distribution. Both distributions have zero mean and unit variance. The function $\hat{f}^*$ that witnesses the MMD has been scaled for plotting purposes, and was computed empirically on the basis of $2 \times 10^4$ samples, using a Gaussian kernel with $\sigma = 0.5$. (Plot of probability densities and $\hat{f}^*(t)$ against $t$; curves: $\hat{f}^*$, $p$ (Gauss), $q$ (Laplace).)

3. Background Material

We now present three background results. First, we introduce the terminology used in statistical hypothesis testing. Second, we demonstrate via an example that even for tests which have asymptotically no error, we cannot guarantee performance at any fixed sample size without making assumptions about the distributions. Third, we review some alternative statistics used in comparing distributions, and the associated two-sample tests (see also Section 7 for an overview of additional integral probability metrics).

3.1 Statistical Hypothesis Testing

Having described a metric on probability distributions (the MMD) based on distances between their Hilbert space embeddings, and empirical estimates (biased and unbiased) of this metric, we address the problem of determining whether the empirical MMD shows a statistically significant difference between distributions. To this end, we briefly describe the framework of statistical hypothesis testing as it applies in the present context, following Casella and Berger (2002, Chapter 8). Given i.i.d.
samples $X \sim p$ of size $m$ and $Y \sim q$ of size $n$, the statistical test, $T(X, Y) : \mathcal{X}^m \times \mathcal{X}^n \to \{0, 1\}$ is used to distinguish between the null hypothesis $H_0 : p = q$ and the alternative hypothesis $H_A : p \neq q$. This is achieved by comparing the test statistic[5] $\mathrm{MMD}[\mathcal{F}, X, Y]$ with a particular threshold: if the threshold is exceeded, then the test rejects the null hypothesis (bearing in mind that a zero population MMD indicates $p = q$). The acceptance region of the test is thus defined as the set of real numbers below the threshold. Since the test is based on finite samples, it is possible that an incorrect answer will be returned. A Type I error is made when $p = q$ is rejected based on the observed samples, despite the null hypothesis having generated the data. Conversely, a Type II error occurs when $p = q$ is accepted despite the underlying distributions being different. The level $\alpha$ of a test is an upper bound on the probability of a Type I error: this is a design parameter of the test which must be set in advance, and is used to determine the threshold to which we compare the test statistic (finding the test threshold for a given $\alpha$ is the topic of Sections 4 and 5). The power of a test against a particular member of the alternative class $H_A$ (i.e., a specific $(p, q)$ such that $p \neq q$) is the probability of correctly rejecting $p = q$ in this instance, that is, one minus the probability of a Type II error. A consistent test achieves a level $\alpha$, and a Type II error of zero, in the large sample limit. We will see that the tests proposed in this paper are consistent.

3.2 A Negative Result

Even if a test is consistent, it is not possible to distinguish distributions with high probability at a given, fixed sample size (i.e., to provide guarantees on the Type II error), without prior assumptions as to the nature of the difference between $p$ and $q$. This is true regardless of the two-sample test used. There are several ways to illustrate this, which each give insight into the kinds of differences that might be undetectable for a given number of samples. The following example[6] is one such illustration.

Example 1 Assume we have a distribution $p$ from which we have drawn $m$ i.i.d. observations. We construct a distribution $q$ by drawing $m^2$ i.i.d. observations from $p$, and defining a discrete distribution over these $m^2$ instances with probability $m^{-2}$ each. It is easy to check that if we now draw $m$ observations from $q$, there is at least a $\binom{m^2}{m} \frac{m!}{m^{2m}} > e^{-1/2} > 0.6$ probability that we thereby obtain an $m$ sample from $p$. Hence no test will be able to distinguish samples from $p$ and $q$ in this case. We could make the probability of detection arbitrarily small by increasing the size of the sample from which we construct $q$.

3.3 Previous Work

We next give a brief overview of some earlier approaches to the two-sample problem for multivariate data. Since our later experimental comparison is with respect to certain of these methods, we give abbreviated algorithm names in italics where appropriate: these should be used as a key to the tables in Section 8.

[5] This may be biased or unbiased.

[6] This is a variation of a construction for independence tests, which was suggested in a private communication by John Langford.
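The probability in Example 1 is the chance that $m$ draws with replacement from $m^2$ equally likely atoms are all distinct, which equals $\binom{m^2}{m} \frac{m!}{m^{2m}} = \prod_{i=0}^{m-1} (1 - i/m^2)$. The short Python sketch below (the function name is a hypothetical choice for illustration) evaluates this product for a few sample sizes; the values decrease towards $e^{-1/2} \approx 0.607$ as $m$ grows.

```python
import math


def prob_all_distinct(m):
    """Probability that m draws with replacement from m^2 equally likely atoms are distinct.

    Equals C(m^2, m) * m! / m^(2m), written as prod_{i=0}^{m-1} (1 - i / m^2).
    """
    prob = 1.0
    for i in range(m):
        prob *= 1.0 - i / (m * m)
    return prob


for m in (2, 10, 100, 1000):
    print(m, prob_all_distinct(m))

print("limit e^(-1/2):", math.exp(-0.5))
```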
3.3.1 $L_2$ DISTANCE BETWEEN PARZEN WINDOW ESTIMATES

The prior work closest to the current approach is the Parzen window-based statistic of Anderson et al. (1994). We begin with a short overview of the Parzen window estimate and its properties (Silverman, 1986), before proceeding to a comparison with the RKHS approach. We assume a distribution $p$ on $\mathbb{R}^d$, which has an associated density function $f_p$. The Parzen window estimate of this density from an i.i.d. sample $X$ of size $m$ is

$$\hat{f}_p(x) = \frac{1}{m} \sum_{i=1}^{m} \kappa(x_i - x), \quad \text{where } \kappa \text{ satisfies } \int_{\mathcal{X}} \kappa(x) \, dx = 1 \text{ and } \kappa(x) \ge 0.$$

We may rescale $\kappa$ according to $\frac{1}{h^d} \kappa\left( \frac{x}{h} \right)$ for a bandwidth parameter $h$. To simplify the discussion, we use a single bandwidth $h_{m+n}$ for both $\hat{f}_p$ and $\hat{f}_q$. Assuming $m/n$ is bounded away from zero and infinity, consistency of the Parzen window estimates for $f_p$ and $f_q$ requires

$$\lim_{m, n \to \infty} h_{m+n}^d = 0 \quad \text{and} \quad \lim_{m, n \to \infty} (m+n) h_{m+n}^d = \infty. \tag{6}$$

We now show the $L_2$ distance between Parzen windows density estimates is a special case of the biased MMD in Equation (5). Denote by $D_r(p, q) := \| f_p - f_q \|_r$ the $L_r$ distance between the densities $f_p$ and $f_q$ corresponding to the distributions $p$ and $q$, respectively. For $r = 1$ the distance $D_r(p, q)$ is known as the Lévy distance (Feller, 1971), and for $r = 2$ we encounter a distance measure derived from the Renyi entropy (Gokcay and Principe, 2002).

Assume that $\hat{f}_p$ and $\hat{f}_q$ are given as kernel density estimates with kernel $\kappa(x - x')$, that is, $\hat{f}_p(x) = \frac{1}{m} \sum_{i=1}^{m} \kappa(x_i - x)$ and $\hat{f}_q(y)$ is defined by analogy. In this case

$$\begin{aligned}
D_2(\hat{f}_p, \hat{f}_q)^2 &= \int \left[ \frac{1}{m} \sum_{i=1}^{m} \kappa(x_i - z) - \frac{1}{n} \sum_{i=1}^{n} \kappa(y_i - z) \right]^2 dz \\
&= \frac{1}{m^2} \sum_{i, j = 1}^{m} k(x_i - x_j) + \frac{1}{n^2} \sum_{i, j = 1}^{n} k(y_i - y_j) - \frac{2}{mn} \sum_{i, j = 1}^{m, n} k(x_i - y_j),
\end{aligned}$$

where $k(x - y) = \int \kappa(x - z) \kappa(y - z) \, dz$. By its definition $k(x - y)$ is an RKHS kernel, as it is an inner product between $\kappa(x - z)$ and $\kappa(y - z)$ on the domain $\mathcal{X}$.

We now describe the asymptotic performance of a two-sample test using the statistic $D_2(\hat{f}_p, \hat{f}_q)^2$. We consider the power of the test under local departures from the null hypothesis. Anderson et al. (1994) define these to take the form

$$f_q = f_p + \delta g, \tag{7}$$

where $\delta \in \mathbb{R}$, and $g$ is a fixed, bounded, integrable function chosen to ensure that $f_q$ is a valid density for sufficiently small $|\delta|$. Anderson et al. consider two cases: the kernel bandwidth converging to zero with increasing sample size, ensuring consistency of the Parzen window estimates of $f_p$ and $f_q$; and the case of a fixed bandwidth. In the former case, the minimum distance with which the test can discriminate $f_p$ from $f_q$ is[7] $\delta = (m+n)^{-1/2} h_{m+n}^{-d/2}$. In the latter case, this minimum distance is $\delta = (m+n)^{-1/2}$, under the assumption that the Fourier transform of the kernel $\kappa$ does not vanish on an interval (Anderson et al., 1994, Section 2.4), which implies the kernel $k$ is characteristic (Sriperumbudur et al., 2010b). The power of the $L_2$ test against local alternatives is greater when the kernel is held fixed, since for any rate of decrease of $h_{m+n}$ with increasing sample size, $\delta$ will decrease more slowly than for a fixed kernel.

An RKHS-based approach generalizes the $L_2$ statistic in a number of important respects. First, we may employ a much larger class of characteristic kernels that cannot be written as inner products between Parzen windows: several examples are given by Steinwart (2001, Section 3) and Micchelli et al. (2006, Section 3) (these kernels are universal, hence characteristic). We may further generalize to kernels on structured objects such as strings and graphs (Schölkopf et al., 2004), as done in our experiments (Section 8). Second, even when the kernel may be written as an inner product of Parzen windows on $\mathbb{R}^d$, the $D_2^2$ statistic with fixed bandwidth no longer converges to an $L_2$ distance between probability density functions, hence it is more natural to define the statistic as an integral probability metric for a particular RKHS, as in Definition 2. Indeed, in our experiments, we obtain good performance in experimental settings where the dimensionality greatly exceeds the sample size, and density estimates would perform very poorly[8] (for instance the Gaussian toy example in Figure 5B, for which performance actually improves when the dimensionality increases; and the microarray data sets in Table 1). This suggests it is not necessary to solve the more difficult problem of density estimation in high dimensions to do two-sample testing.

Finally, the kernel approach leads us to establish consistency against a larger class of local alternatives to the null hypothesis than that considered by Anderson et al. In Theorem 13, we prove consistency against a class of alternatives encoded in terms of the mean embeddings of $p$ and $q$, which applies to any domain on which RKHS kernels may be defined, and not only densities on $\mathbb{R}^d$. This more general approach also has interesting consequences for distributions on $\mathbb{R}^d$: for instance, a local departure from $H_0$ occurs when $p$ and $q$ differ at increasing frequencies in their respective characteristic functions. This class of local alternatives cannot be expressed in the form $\delta g$ for fixed $g$, as in (7). We discuss this issue further in Section 5.

[7] Formally, define $s_\alpha$ as a threshold for the statistic $D_2(\hat{f}_p, \hat{f}_q)^2$, chosen to ensure the test has level $\alpha$, and let $\delta = (m+n)^{-1/2} h_{m+n}^{-d/2} c$ for some fixed $c \neq 0$. When $m, n \to \infty$ such that $m/n$ is bounded away from $0$ and $\infty$, and assuming conditions (6), the limit $\pi(c) := \lim_{(m+n) \to \infty} \Pr_{H_A}\left( D_2(\hat{f}_p, \hat{f}_q)^2 > s_\alpha \right)$ is well-defined, and satisfies $\alpha < \pi(c) < 1$ for $0 < |c| < \infty$, and $\pi(c) \to 1$ as $c \to \infty$.

[8] The $L_2$ error of a kernel density estimate converges as $O(n^{-4/(4+d)})$ when the optimal bandwidth is used (Wasserman, 2006, Section 6.5).

3.3.2 MMD FOR MULTINOMIALS

Assume a finite domain $\mathcal{X} := \{1, \ldots, d\}$, and define the random variables $x$ and $y$ on $\mathcal{X}$ such that $p_i := P(x = i)$ and $q_j := P(y = j)$. We embed $x$ into an RKHS $\mathcal{H}$ via the feature mapping $\phi(x) := e_x$, where $e_s$ is the unit vector in $\mathbb{R}^d$ taking value 1 in dimension $s$, and zero in the remaining entries. The kernel is the usual inner product on $\mathbb{R}^d$. In this case,

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \| p - q \|_{\mathbb{R}^d}^2 = \sum_{i=1}^{d} (p_i - q_i)^2. \tag{8}$$

Harchaoui et al. (2008, Section 1, long version) note that this $L_2$ statistic may not be the best choice for finite domains, citing a result of Lehmann and Romano (2005, Theorem 14.3.2) that Pearson's
Chi-squared statistic is optimal for the problem of goodness of fit testing for multinomials.[9] It would be of interest to establish whether an analogous result holds for two-sample testing in a wider class of RKHS feature spaces.

3.3.3 FURTHER MULTIVARIATE TWO-SAMPLE TESTS

Biau and Gyorfi (2005) (Biau) use as their test statistic the $L_1$ distance between discretized estimates of the probabilities, where the partitioning is refined as the sample size increases. This space partitioning approach becomes difficult or impossible for high dimensional problems, since there are too few points per bin. For this reason, we use this test only for low-dimensional problems in our experiments.

A generalisation of the Wald-Wolfowitz runs test to the multivariate domain was proposed and analysed by Friedman and Rafsky (1979) and Henze and Penrose (1999) (FR Wolf), and involves counting the number of edges in the minimum spanning tree over the aggregated data that connect points in $X$ to points in $Y$. The resulting test relies on the asymptotic normality of the test statistic, and is not distribution-free under the null hypothesis for finite samples (the test threshold depends on $p$, as with our asymptotic test in Section 5; by contrast, our tests in Section 4 are distribution-free). The computational cost of this method using Kruskal's algorithm is $O((m+n)^2 \log(m+n))$, although more modern methods improve on the $\log(m+n)$ term: see Chazelle (2000) for details. Friedman and Rafsky (1979) claim that calculating the matrix of distances, which costs $O((m+n)^2)$, dominates their computing time; we return to this point in our experiments (Section 8). Two possible generalisations of the Kolmogorov-Smirnov test to the multivariate case were studied by Bickel (1969) and Friedman and Rafsky (1979). The approach of Friedman and Rafsky (FR Smirnov) in this case again requires a minimal spanning tree, and has a similar cost to their multivariate runs test.

A more recent multivariate test was introduced by Rosenbaum (2005).
This entails computing the minimum distance non-bipartite matching over the aggregate data, and using the number of pairs containing a sample from both $X$ and $Y$ as a test statistic. The resulting statistic is distribution-free under the null hypothesis at finite sample sizes, in which respect it is superior to the Friedman-Rafsky test; on the other hand, it costs $O((m+n)^3)$ to compute. Another distribution-free test (Hall) was proposed by Hall and Tajvidi (2002): for each point from $p$, it requires computing the closest points in the aggregated data, and counting how many of these are from $q$ (the procedure is repeated for each point from $q$ with respect to points from $p$). As we shall see in our experimental comparisons, the test statistic is costly to compute; Hall and Tajvidi consider only tens of points in their experiments.

[9] A goodness of fit test determines whether a sample from $p$ is drawn from a known target multinomial $q$. Pearson's Chi-squared statistic weights each term in the sum (8) by its corresponding $q_i^{-1}$.

4. Tests Based on Uniform Convergence Bounds

In this section, we introduce two tests for the two-sample problem that have exact performance guarantees at finite sample sizes, based on uniform convergence bounds. The first, in Section 4.1, uses the McDiarmid (1989) bound on the biased MMD statistic, and the second, in Section 4.2, uses a Hoeffding (1963) bound for the unbiased statistic.
4.1 Bound on the Biased Statistic and Test

We establish two properties of the MMD, from which we derive a hypothesis test. First, we show that regardless of whether or not $p = q$, the empirical MMD converges in probability at rate $O((m+n)^{-1/2})$ to its population value. This shows the consistency of statistical tests based on the MMD. Second, we give probabilistic bounds for large deviations of the empirical MMD in the case $p = q$. These bounds lead directly to a threshold for our first hypothesis test. We begin by establishing the convergence of $\mathrm{MMD}_b[\mathcal{F}, X, Y]$ to $\mathrm{MMD}[\mathcal{F}, p, q]$. The following theorem is proved in Appendix A.2.

Theorem 7 Let $p, q, X, Y$ be defined as in Problem 1, and assume $0 \le k(x, y) \le K$. Then

$$\Pr_{X, Y} \left\{ \left| \mathrm{MMD}_b[\mathcal{F}, X, Y] - \mathrm{MMD}[\mathcal{F}, p, q] \right| > 2 \left( (K/m)^{1/2} + (K/n)^{1/2} \right) + \varepsilon \right\} \le 2 \exp\left( \frac{-\varepsilon^2 m n}{2K(m+n)} \right),$$

where $\Pr_{X, Y}$ denotes the probability over the $m$ sample $X$ and $n$ sample $Y$.

Our next goal is to refine this result in a way that allows us to define a test threshold under the null hypothesis $p = q$. Under this circumstance, the constants in the exponent are slightly improved. The following theorem is proved in Appendix A.3.

Theorem 8 Under the conditions of Theorem 7 where additionally $p = q$ and $m = n$,

$$\mathrm{MMD}_b[\mathcal{F}, X, Y] \le \underbrace{m^{-1/2} \sqrt{2 E_{x, x'}\left[ k(x, x) - k(x, x') \right]}}_{B_1(\mathcal{F}, p)} + \varepsilon \le \underbrace{(2K/m)^{1/2}}_{B_2(\mathcal{F}, p)} + \varepsilon,$$

both with probability at least $1 - \exp\left( -\frac{\varepsilon^2 m}{4K} \right)$.

In this theorem, we illustrate two possible bounds $B_1(\mathcal{F}, p)$ and $B_2(\mathcal{F}, p)$ on the bias in the empirical estimate (5). The first inequality is interesting inasmuch as it provides a link between the bias bound $B_1(\mathcal{F}, p)$ and kernel size (for instance, if we were to use a Gaussian kernel with large $\sigma$, then $k(x, x)$ and $k(x, x')$ would likely be close, and the bias small). In the context of testing, however, we would need to provide an additional bound to show convergence of an empirical estimate of $B_1(\mathcal{F}, p)$ to its population equivalent. Thus, in the following test for $p = q$ based on Theorem 8, we use $B_2(\mathcal{F}, p)$ to bound the bias.[10]

Corollary 9 A hypothesis test of level $\alpha$ for the null hypothesis $p = q$, that is, for $\mathrm{MMD}[\mathcal{F}, p, q] = 0$, has the acceptance region

$$\mathrm{MMD}_b[\mathcal{F}, X, Y] < \sqrt{2K/m} \left( 1 + \sqrt{2 \log \alpha^{-1}} \right).$$
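The threshold in Corollary 9 follows from Theorem 8 by setting $\exp(-\varepsilon^2 m / (4K)) = \alpha$, which gives $\varepsilon = \sqrt{2K/m} \sqrt{2 \log \alpha^{-1}}$, and adding the bias bound $B_2 = \sqrt{2K/m}$. A minimal Python sketch of this threshold is given below; the function name is hypothetical, and the default $K = 1$ is an assumption that fits kernels bounded by one, such as the Gaussian kernel.

```python
import math


def mmd_b_threshold(m, alpha, K=1.0):
    """Corollary 9 acceptance threshold for the biased statistic MMD_b, assuming m = n.

    K is an upper bound on the kernel, 0 <= k(x, y) <= K (K = 1 is assumed here).
    """
    return math.sqrt(2.0 * K / m) * (1.0 + math.sqrt(2.0 * math.log(1.0 / alpha)))


for m in (100, 1000, 10000):
    print(m, mmd_b_threshold(m, alpha=0.05))
```

The null hypothesis is rejected when the computed $\mathrm{MMD}_b$ is at least this value. The threshold depends only on $m$, $\alpha$ and $K$, and halves when $m$ is multiplied by four.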
We emphasize that this test is distribution-free: the test threshold does not depend on the particular distribution that generated the sample. Theorem 7 guarantees the consistency of the test against fixed alternatives, and that the Type II error probability decreases to zero at rate $O(m^{-1/2})$, assuming $m = n$. To put this convergence rate in perspective, consider a test of whether two normal distributions have equal means, given they have unknown but equal variance (Casella and Berger, 2002, Exercise 8.41). In this case, the test statistic has a Student-t distribution with $n+m-2$ degrees of freedom, and its Type II error probability converges at the same rate as our test.

It is worth noting that bounds may be obtained for the deviation between population mean embeddings $\mu_p$ and the empirical embeddings $\mu_X$ in a completely analogous fashion. The proof requires symmetrization by means of a ghost sample, that is, a second set of observations drawn from the same distribution. While not the focus of the present paper, such bounds can be used to perform inference based on moment matching (Altun and Smola, 2006; Dudík and Schapire, 2006; Dudík et al., 2004).

10. Note that we use a tighter bias bound than Gretton et al. (2007a).

4.2 Bound on the Unbiased Statistic and Test

The previous bounds are of interest since the proof strategy can be used for general function classes with well behaved Rademacher averages (see Sriperumbudur et al., 2010a). When $\mathcal{F}$ is the unit ball in an RKHS, however, we may very easily define a test via a convergence bound on the unbiased statistic $\mathrm{MMD}_u^2$ in Lemma 4. We base our test on the following theorem, which is a straightforward application of the large deviation bound on U-statistics of Hoeffding (1963, p. 25).

Theorem 10 Assume $0 \le k(x_i,x_j) \le K$, from which it follows $-2K \le h(z_i,z_j) \le 2K$. Then

\[ \Pr_{X,Y}\left\{ \mathrm{MMD}_u^2(\mathcal{F},X,Y) - \mathrm{MMD}^2(\mathcal{F},p,q) > t \right\} \le \exp\left( \frac{-t^2 m_2}{8K^2} \right), \]

where $m_2 := \lfloor m/2 \rfloor$ (the same bound applies for deviations of $-t$ and below).

A consistent statistical test for $p = q$ using $\mathrm{MMD}_u^2$ is then obtained.

Corollary 11 A hypothesis test of level $\alpha$ for the null hypothesis $p = q$ has the acceptance region

\[ \mathrm{MMD}_u^2 < (4K/\sqrt{m})\sqrt{\log(\alpha^{-1})}. \]

This test is distribution-free. We now compare the thresholds of the above test with that in Corollary 9. We note first that the threshold for the biased statistic applies to an estimate of MMD, whereas that for the unbiased statistic is for an estimate of $\mathrm{MMD}^2$. Squaring the former threshold to make the two quantities comparable, the squared threshold in Corollary 9 decreases as $m^{-1}$, whereas the threshold in Corollary 11 decreases as $m^{-1/2}$. Thus for sufficiently large $m$,¹¹ the McDiarmid-based threshold will be lower (and the associated test statistic is in any case biased upwards), and its Type II error will be better for a given Type I bound. This is confirmed in our Section 8 experiments.
Note, however, that the rate of convergence of the squared, biased MMD estimate to its population value remains at $1/\sqrt{m}$ (bearing in mind we take the square of a biased estimate, where the bias term decays as $1/\sqrt{m}$).

Finally, we note that the bounds we obtained in this section and the last are rather conservative for a number of reasons: first, they do not take the actual distributions into account. In fact, they are finite sample size, distribution-free bounds that hold even in the worst case scenario. The bounds could be tightened using localization, moments of the distribution, etc.: see, for example, Bousquet et al. (2005) and de la Peña and Giné (1999). Any such improvements could be plugged straight into Theorem 19. Second, in computing bounds rather than trying to characterize the distribution of $\mathrm{MMD}(\mathcal{F},X,Y)$ explicitly, we force our test to be conservative by design. In the following we aim for an exact characterization of the asymptotic distribution of $\mathrm{MMD}(\mathcal{F},X,Y)$ instead of a bound. While this will not satisfy the uniform convergence requirements, it leads to superior tests in practice.

11. In the case of $\alpha = 0.05$, this is $m \ge 12$.
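The comparison between the two thresholds of this section can be checked numerically. The sketch below is an illustration under stated assumptions: it evaluates the squared threshold of Corollary 9 and the threshold of Corollary 11 for $m = n$, and searches for the smallest $m$ at which the former drops below the latter. The function names are hypothetical. Both thresholds scale linearly in $K$, so the crossover point does not depend on the kernel bound.

```python
import math


def squared_biased_threshold(m, alpha, K=1.0):
    """Square of the Corollary 9 threshold, comparable to an MMD^2 estimate."""
    return (2.0 * K / m) * (1.0 + math.sqrt(2.0 * math.log(1.0 / alpha))) ** 2


def unbiased_threshold(m, alpha, K=1.0):
    """Corollary 11 threshold (4K / sqrt(m)) * sqrt(log(1/alpha))."""
    return (4.0 * K / math.sqrt(m)) * math.sqrt(math.log(1.0 / alpha))


def crossover_sample_size(alpha, K=1.0, m_max=10 ** 6):
    """Smallest m for which the squared biased threshold is the lower one."""
    for m in range(1, m_max + 1):
        if squared_biased_threshold(m, alpha, K) < unbiased_threshold(m, alpha, K):
            return m
    return None  # no crossover found below m_max
```

The first threshold decays as $1/m$ and the second as $1/\sqrt{m}$, so once the crossover is reached the McDiarmid-based threshold stays lower for all larger $m$.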
5. Test Based on the Asymptotic Distribution of the Unbiased Statistic

We propose a third test, which is based on the asymptotic distribution of the unbiased estimate of $\mathrm{MMD}^2$ in Lemma 6. This test uses the asymptotic distribution of $\mathrm{MMD}_u^2$ under $\mathcal{H}_0$, which follows from results of Anderson et al. (1994, Appendix) and Serfling (1980, Section 5.5.2): see Appendix B.1 for the proof.

Theorem 12 Let $\tilde{k}(x_i,x_j)$ be the kernel between feature space mappings from which the mean embedding of $p$ has been subtracted,

\[ \tilde{k}(x_i,x_j) := \left\langle \phi(x_i) - \mu_p,\ \phi(x_j) - \mu_p \right\rangle_{\mathcal{H}} = k(x_i,x_j) - \mathbf{E}_x k(x_i,x) - \mathbf{E}_x k(x,x_j) + \mathbf{E}_{x,x'} k(x,x'), \qquad (9) \]

where $x'$ is an independent copy of $x$ drawn from $p$. Assume $\tilde{k} \in L_2(\mathcal{X} \times \mathcal{X}, p \times p)$ (i.e., the centred kernel is square integrable, which is true for all $p$ when the kernel is bounded), and that for $t = m + n$, $\lim_{m,n\to\infty} m/t \to \rho_x$ and $\lim_{m,n\to\infty} n/t \to \rho_y := (1 - \rho_x)$ for fixed $0 < \rho_x < 1$. Then under $\mathcal{H}_0$, $\mathrm{MMD}_u^2$ converges in distribution according to

\[ t\,\mathrm{MMD}_u^2[\mathcal{F},X,Y] \xrightarrow{D} \sum_{l=1}^{\infty} \lambda_l \left[ \left( \rho_x^{-1/2} a_l - \rho_y^{-1/2} b_l \right)^2 - (\rho_x \rho_y)^{-1} \right], \qquad (10) \]

where $a_l \sim \mathcal{N}(0,1)$ and $b_l \sim \mathcal{N}(0,1)$ are infinite sequences of independent Gaussian random variables, and the $\lambda_i$ are eigenvalues of

\[ \int_{\mathcal{X}} \tilde{k}(x,x')\,\psi_i(x)\,dp(x) = \lambda_i \psi_i(x'). \]

We illustrate the MMD density under both the null and alternative hypotheses by approximating it empirically for $p = q$ and $p \ne q$. Results are plotted in Figure 2. Our goal is to determine whether the empirical test statistic $\mathrm{MMD}_u^2$ is so large as to be outside the $1-\alpha$ quantile of the null distribution in (10), which gives a level $\alpha$ test. Consistency of this test against local departures from the null hypothesis is provided by the following theorem, proved in Appendix B.2.

Theorem 13 Define $\rho_x$, $\rho_y$, and $t$ as in Theorem 12, and write $\mu_q = \mu_p + g_t$, where $g_t \in \mathcal{H}$ is chosen such that $\mu_p + g_t$ remains a valid mean embedding, and $\|g_t\|_{\mathcal{H}}$ is made to approach zero as $t \to \infty$ to describe local departures from the null hypothesis. Then $\|g_t\|_{\mathcal{H}} = c\,t^{-1/2}$ is the minimum distance between $\mu_p$ and $\mu_q$ distinguishable by the test.

An example of a local departure from the null hypothesis is described earlier in the discussion of the $L_2$ distance between Parzen window estimates (Section 3.3.1).
The class of local alternatives considered in Theorem 13 is more general, however: for instance, Sriperumbudur et al. (2010b, Section 4) and Harchaoui et al. (2008, Section 5, long version) give examples of classes of perturbations $g_t$ with decreasing RKHS norm. These perturbations have the property that $p$ differs from $q$ at increasing frequencies, rather than simply with decreasing amplitude.

[Figure 2 placeholder. Left panel: "Empirical $\mathrm{MMD}_u^2$ density under $\mathcal{H}_0$". Right panel: "Empirical $\mathrm{MMD}_u^2$ density under $\mathcal{H}_A$". Horizontal axes: $\mathrm{MMD}_u^2$; vertical axes: Prob. density.]

Figure 2: Left: Empirical distribution of the MMD under $\mathcal{H}_0$, with $p$ and $q$ both Gaussians with unit standard deviation, using 50 samples from each. Right: Empirical distribution of the MMD under $\mathcal{H}_A$, with $p$ a Laplace distribution with unit standard deviation, and $q$ a Laplace distribution with standard deviation $3\sqrt{2}$, using 100 samples from each. In both cases, the histograms were obtained by computing 2000 independent instances of the MMD.

One way to estimate the $1-\alpha$ quantile of the null distribution is using the bootstrap on the aggregated data, following Arcones and Giné (1992). Alternatively, we may approximate the null distribution by fitting Pearson curves to its first four moments (Johnson et al., 1994, Section 18.8). Taking advantage of the degeneracy of the U-statistic, we obtain for $m = n$

\[ \mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^2 \right) = \frac{2}{m(m-1)}\,\mathbf{E}_{z,z'}\left[ h^2(z,z') \right] \]

and

\[ \mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^3 \right) = \frac{8(m-2)}{m^2(m-1)^2}\,\mathbf{E}_{z,z'}\left[ h(z,z')\,\mathbf{E}_{z''}\left( h(z,z'')\,h(z',z'') \right) \right] + O(m^{-4}) \]

(see Appendix B.3), where $h(z,z')$ is defined in Lemma 6, $z = (x,y) \sim p \times q$ where $x$ and $y$ are independent, and $z'$, $z''$ are independent copies of $z$. The fourth moment $\mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^4 \right)$ is not computed, since it is both very small, $O(m^{-4})$, and expensive to calculate, $O(m^4)$. Instead, we replace the kurtosis¹² with a lower bound due to Wilkins (1944),

\[ \mathrm{kurt}\left( \mathrm{MMD}_u^2 \right) \ge \left( \mathrm{skew}\left( \mathrm{MMD}_u^2 \right) \right)^2 + 1. \]

In Figure 3, we illustrate the Pearson curve fit to the null distribution: the fit is good in the upper quantiles of the distribution, where the test threshold is computed. Finally, we note that two alternative empirical estimates of the null distribution have more recently been proposed by Gretton et al. (2009): a consistent estimate, based on an empirical computation of the eigenvalues $\lambda_l$ in (10); and an alternative Gamma approximation to the null distribution, which has a smaller computational cost but is generally less accurate. Further detail and experimental comparisons are given by Gretton et al.

12. The kurtosis is defined in terms of the fourth and second moments as

\[ \mathrm{kurt}\left( \mathrm{MMD}_u^2 \right) = \frac{\mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^4 \right)}{\left[ \mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^2 \right) \right]^2} \]
[Figure 3 placeholder. Panel title: "CDF of the MMD and Pearson fit". Horizontal axis: $t$; vertical axis: $P(\mathrm{MMD}_u^2 < t)$. Legend: Emp. CDF, Pearson.]

Figure 3: Illustration of the empirical CDF of the MMD and a Pearson curve fit. Both $p$ and $q$ were Gaussian with zero mean and unit variance, and 50 samples were drawn from each. The empirical CDF was computed on the basis of 1000 randomly generated MMD values. To ensure the quality of fit was determined only by the accuracy of the Pearson approximation, the moments used for the Pearson curves were also computed on the basis of these 1000 samples. The MMD used a Gaussian kernel with $\sigma = 0.5$.

6. A Linear Time Statistic and Test

The MMD-based tests are already more efficient than the $O(m^2 \log m)$ and $O(m^3)$ tests described earlier (assuming $m = n$ for conciseness). It is still desirable, however, to obtain $O(m)$ tests which do not sacrifice too much statistical power. Moreover, we would like to obtain tests which have $O(1)$ storage requirements for computing the test statistic, in order to apply the test to data streams. We now describe how to achieve this by computing the test statistic using a subsampling of the terms in the sum. The empirical estimate in this case is obtained by drawing pairs from $X$ and $Y$ respectively without replacement.

Lemma 14 Define $m_2 := \lfloor m/2 \rfloor$, assume $m = n$, and define $h(z_1,z_2)$ as in Lemma 6. The estimator

\[ \mathrm{MMD}_l^2[\mathcal{F},X,Y] := \frac{1}{m_2}\sum_{i=1}^{m_2} h\left( (x_{2i-1},y_{2i-1}),\ (x_{2i},y_{2i}) \right) \]

can be computed in linear time, and is an unbiased estimate of $\mathrm{MMD}^2[\mathcal{F},p,q]$.

While it is expected that $\mathrm{MMD}_l^2$ has higher variance than $\mathrm{MMD}_u^2$ (as we will see explicitly later), it is computationally much more appealing. In particular, the statistic can be used in stream computations with need for only $O(1)$ memory, whereas $\mathrm{MMD}_u^2$ requires $O(m)$ storage and $O(m^2)$ time to compute the kernel $h$ on all interacting pairs.

Since $\mathrm{MMD}_l^2$ is just the average over a set of random variables, Hoeffding's bound and the central limit theorem readily allow us to provide both uniform convergence and asymptotic statements with little effort. The first follows directly from Hoeffding (1963, Theorem 2).
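The estimator of Lemma 14 lends itself to a streaming implementation. The sketch below is illustrative only: it assumes scalar observations arriving as $(x, y)$ pairs, a Gaussian kernel, and the kernel $h(z_i,z_j) = k(x_i,x_j) + k(y_i,y_j) - k(x_i,y_j) - k(x_j,y_i)$ of Lemma 6, which is recalled here rather than restated in this section. The function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel on real numbers (an assumed choice of kernel)."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def h(z1, z2, kernel=gaussian_kernel):
    """U-statistic kernel on pairs z = (x, y), as recalled from Lemma 6."""
    (x1, y1), (x2, y2) = z1, z2
    return kernel(x1, x2) + kernel(y1, y2) - kernel(x1, y2) - kernel(x2, y1)


def mmd2_linear(pairs, kernel=gaussian_kernel):
    """Linear-time estimate of MMD^2 from an iterable of (x, y) pairs.

    Consecutive pairs are consumed two at a time, so only one pending pair
    and a running sum are kept in memory. A trailing unpaired observation
    is ignored, which matches m_2 = floor(m / 2).
    """
    total = 0.0
    count = 0
    pending = None
    for z in pairs:
        if pending is None:
            pending = z
        else:
            total += h(pending, z, kernel)
            count += 1
            pending = None
    if count == 0:
        raise ValueError("need at least two (x, y) pairs")
    return total / count
```

Since `pairs` may be any iterable, including a generator reading from a stream, the memory footprint does not grow with $m$. Unlike the biased statistic, this estimate can be negative.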
Theorem 15 Assume $0 \le k(x_i,x_j) \le K$. Then

\[ \Pr_{X,Y}\left\{ \mathrm{MMD}_l^2(\mathcal{F},X,Y) - \mathrm{MMD}^2(\mathcal{F},p,q) > t \right\} \le \exp\left( \frac{-t^2 m_2}{8K^2} \right), \]

where $m_2 := \lfloor m/2 \rfloor$ (the same bound applies for deviations of $-t$ and below).

Note that the bound of Theorem 10 is identical to that of Theorem 15, which shows the former is rather loose. Next we invoke the central limit theorem (e.g., Serfling, 1980, Section 1.9).

Corollary 16 Assume $0 < \mathbf{E}(h^2) < \infty$. Then $\mathrm{MMD}_l^2$ converges in distribution to a Gaussian according to

\[ m^{1/2}\left( \mathrm{MMD}_l^2 - \mathrm{MMD}^2[\mathcal{F},p,q] \right) \xrightarrow{D} \mathcal{N}\left( 0, \sigma_l^2 \right), \]

where $\sigma_l^2 = 2\left[ \mathbf{E}_{z,z'} h^2(z,z') - \left[ \mathbf{E}_{z,z'} h(z,z') \right]^2 \right]$, and we use the shorthand $\mathbf{E}_{z,z'} := \mathbf{E}_{z,z' \sim p \times q}$.

The factor of 2 arises since we are averaging over only $\lfloor m/2 \rfloor$ observations. It is instructive to compare this asymptotic distribution with that of the quadratic time statistic $\mathrm{MMD}_u^2$ under $\mathcal{H}_A$, when $m = n$. In this case, $\mathrm{MMD}_u^2$ converges in distribution to a Gaussian according to

\[ m^{1/2}\left( \mathrm{MMD}_u^2 - \mathrm{MMD}^2[\mathcal{F},p,q] \right) \xrightarrow{D} \mathcal{N}\left( 0, \sigma_u^2 \right), \]

where $\sigma_u^2 = 4\left( \mathbf{E}_z\left[ \left( \mathbf{E}_{z'} h(z,z') \right)^2 \right] - \left[ \mathbf{E}_{z,z'}\left( h(z,z') \right) \right]^2 \right)$ (Serfling, 1980, Section 5.5). Thus for $\mathrm{MMD}_u^2$, the asymptotic variance is (up to scaling) the variance of $\mathbf{E}_{z'}[h(z,z')]$, whereas for $\mathrm{MMD}_l^2$ it is $\mathrm{Var}_{z,z'}[h(z,z')]$.

We end by noting another potential approach to reducing the cost of computing an empirical MMD estimate, by using a low rank approximation to the Gram matrix (Fine and Scheinberg, 2001; Williams and Seeger, 2001; Smola and Schölkopf, 2000). An incremental computation of the MMD based on such a low rank approximation would require $O(md)$ storage and $O(md)$ computation (where $d$ is the rank of the approximate Gram matrix which is used to factorize both matrices) rather than $O(m)$ storage and $O(m^2)$ operations. That said, it remains to be determined what effect this approximation would have on the distribution of the test statistic under $\mathcal{H}_0$, and hence on the test threshold.

7. Related Metrics and Learning Problems

The present section discusses a number of topics related to the maximum mean discrepancy, including metrics on probability distributions using non-RKHS function classes (Sections 7.1 and 7.2), the relation with set kernels and kernels on probability measures (Section 7.3), an extension to kernel measures of independence (Section 7.4), a two-sample statistic using a distribution over witness functions (Section 7.5), and a connection to outlier detection (Section 7.6).

7.1 The MMD in Other Function Classes

The definition of the maximum mean discrepancy is by no means limited to RKHS. In fact, any function class $\mathcal{F}$ that comes with uniform convergence guarantees and is sufficiently rich will enjoy the above properties. Below, we consider the case where the scaled functions in $\mathcal{F}$ are dense in $C(\mathcal{X})$ (which is useful for instance when the functions in $\mathcal{F}$ are norm constrained).
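Definition 17 Let $\mathcal{F}$ be a subset of some vector space. The star $S[\mathcal{F}]$ of a set $\mathcal{F}$ is

\[ S[\mathcal{F}] := \left\{ \alpha f \mid f \in \mathcal{F} \text{ and } \alpha \in [0,\infty) \right\}. \]

Theorem 18 Denote by $\mathcal{F}$ the subset of some vector space of functions from $\mathcal{X}$ to $\mathbb{R}$ for which $S[\mathcal{F}] \cap C(\mathcal{X})$ is dense in $C(\mathcal{X})$ with respect to the $L_\infty(\mathcal{X})$ norm. Then $\mathrm{MMD}[\mathcal{F},p,q] = 0$ if and only if $p = q$, and $\mathrm{MMD}[\mathcal{F},p,q]$ is a metric on the space of probability distributions. Whenever the star of $\mathcal{F}$ is not dense, the MMD defines a pseudo-metric space.

Proof It is clear that $p = q$ implies $\mathrm{MMD}[\mathcal{F},p,q] = 0$. The proof of the converse is very similar to that of Theorem 5. Define $\mathcal{H} := S(\mathcal{F}) \cap C(\mathcal{X})$. Since by assumption $\mathcal{H}$ is dense in $C(\mathcal{X})$, for every $f \in C(\mathcal{X})$ and every $\epsilon > 0$ there exists an $h^* \in \mathcal{H}$ satisfying $\|h^* - f\|_\infty < \epsilon$. Write $h^* := \alpha^* g^*$, where $g^* \in \mathcal{F}$. By assumption, $\mathbf{E}_x g^* - \mathbf{E}_y g^* = 0$. Thus we have the bound

\[ \left| \mathbf{E}_x f(x) - \mathbf{E}_y f(y) \right| \le \left| \mathbf{E}_x f(x) - \mathbf{E}_x h^*(x) \right| + \alpha^* \left| \mathbf{E}_x g^*(x) - \mathbf{E}_y g^*(y) \right| + \left| \mathbf{E}_y h^*(y) - \mathbf{E}_y f(y) \right| \le 2\epsilon \]

for all $f \in C(\mathcal{X})$ and $\epsilon > 0$, which implies $p = q$ by Lemma 1.

To show $\mathrm{MMD}[\mathcal{F},p,q]$ is a metric, it remains to prove the triangle inequality. We have

\[ \sup_{f \in \mathcal{F}} \left| \mathbf{E}_p f - \mathbf{E}_q f \right| + \sup_{g \in \mathcal{F}} \left| \mathbf{E}_q g - \mathbf{E}_r g \right| \ge \sup_{f \in \mathcal{F}} \left[ \left| \mathbf{E}_p f - \mathbf{E}_q f \right| + \left| \mathbf{E}_q f - \mathbf{E}_r f \right| \right] \ge \sup_{f \in \mathcal{F}} \left| \mathbf{E}_p f - \mathbf{E}_r f \right|. \]

Note that any uniform convergence statements in terms of $\mathcal{F}$ allow us immediately to characterize an estimator of $\mathrm{MMD}(\mathcal{F},p,q)$ explicitly. The following result shows how (this reasoning is also the basis for the proofs in Section 4, although here we do not restrict ourselves to an RKHS).

Theorem 19 Let $\delta \in (0,1)$ be a confidence level and assume that for some $\epsilon(\delta,m,\mathcal{F})$ the following holds for samples $\{x_1,\ldots,x_m\}$ drawn from $p$:

\[ \Pr_X\left\{ \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \frac{1}{m}\sum_{i=1}^{m} f(x_i) \right| > \epsilon(\delta,m,\mathcal{F}) \right\} \le \delta. \]

In this case we have that

\[ \Pr_{X,Y}\left\{ \left| \mathrm{MMD}[\mathcal{F},p,q] - \mathrm{MMD}_b[\mathcal{F},X,Y] \right| > 2\epsilon(\delta/2,m,\mathcal{F}) \right\} \le \delta, \]

where $\mathrm{MMD}_b[\mathcal{F},X,Y]$ is taken from Definition 2.

Proof The proof works simply by using convexity and suprema as follows:

\[ \left| \mathrm{MMD}[\mathcal{F},p,q] - \mathrm{MMD}_b[\mathcal{F},X,Y] \right| = \left| \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \mathbf{E}_y[f] \right| - \sup_{f \in \mathcal{F}} \left| \frac{1}{m}\sum_{i=1}^{m} f(x_i) - \frac{1}{n}\sum_{i=1}^{n} f(y_i) \right| \right| \]
\[ \le \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \mathbf{E}_y[f] - \frac{1}{m}\sum_{i=1}^{m} f(x_i) + \frac{1}{n}\sum_{i=1}^{n} f(y_i) \right| \]
\[ \le \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \frac{1}{m}\sum_{i=1}^{m} f(x_i) \right| + \sup_{f \in \mathcal{F}} \left| \mathbf{E}_y[f] - \frac{1}{n}\sum_{i=1}^{n} f(y_i) \right|. \]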
Bounding each of the two terms via a uniform convergence bound proves the claim.

This shows that $\mathrm{MMD}_b[\mathcal{F},X,Y]$ can be used to estimate $\mathrm{MMD}[\mathcal{F},p,q]$, and that the quantity is asymptotically unbiased.

Remark 20 (Reduction to Binary Classification) As noted by Friedman (2003), any classifier which maps a set of observations $\{z_i, l_i\}$ with $z_i \in \mathcal{X}$ on some domain $\mathcal{X}$ and labels $l_i \in \{\pm 1\}$, for which uniform convergence bounds exist on the convergence of the empirical loss to the expected loss, can be used to obtain a similarity measure on distributions: simply assign $l_i = 1$ if $z_i \in X$ and $l_i = -1$ for $z_i \in Y$ and find a classifier which is able to separate the two sets. In this case maximization of $\mathbf{E}_x[f] - \mathbf{E}_y[f]$ is achieved by ensuring that as many $z \sim p(z)$ as possible correspond to $f(z) = 1$, whereas for as many $z \sim q(z)$ as possible we have $f(z) = -1$. Consequently neural networks, decision trees, boosted classifiers and other objects for which uniform convergence bounds can be obtained can be used for the purpose of distribution comparison. Metrics and divergences on distributions can also be defined explicitly starting from classifiers. For instance, Sriperumbudur et al. (2009, Section 2) show the MMD minimizes the expected risk of a classifier with linear loss on the samples $X$ and $Y$, and Ben-David et al. (2007, Section 4) use the error of a hyperplane classifier to approximate the $\mathcal{A}$-distance between distributions (Kifer et al., 2004). Reid and Williamson (2011) provide further discussion and examples.

7.2 Examples of Non-RKHS Function Classes

Other function spaces $\mathcal{F}$ inspired by the statistics literature can also be considered in defining the MMD. Indeed, Lemma 1 defines an MMD with $\mathcal{F}$ the space of bounded continuous real-valued functions, which is a Banach space with the supremum norm (Dudley, 2002, p. 158). We now describe two further metrics on the space of probability distributions, namely the Kolmogorov-Smirnov and Earth Mover's distances, and their associated function classes.

7.2.1 KOLMOGOROV-SMIRNOV STATISTIC

The Kolmogorov-Smirnov (K-S) test is probably one of the most famous two-sample tests in statistics.
It works for random variables $x \in \mathbb{R}$ (or any other set for which we can establish a total order). Denote by $F_p(x)$ the cumulative distribution function of $p$ and let $F_X(x)$ be its empirical counterpart,

\[ F_p(z) := \Pr\{ x \le z \text{ for } x \sim p \} \quad \text{and} \quad F_X(z) := \frac{1}{|X|}\sum_{i=1}^{m} \mathbf{1}_{x_i \le z}. \]

It is clear that $F_p$ captures the properties of $p$. The Kolmogorov metric is simply the $L_\infty$ distance $\|F_X - F_Y\|_\infty$ for two sets of observations $X$ and $Y$. Smirnov (1939) showed that for $p = q$ the limiting distribution of the empirical cumulative distribution functions satisfies

\[ \lim_{m,n \to \infty} \Pr_{X,Y}\left\{ \left[ \frac{mn}{m+n} \right]^{1/2} \|F_X - F_Y\|_\infty > x \right\} = 2\sum_{j=1}^{\infty} (-1)^{j-1} e^{-2j^2 x^2} \quad \text{for } x \ge 0, \]

which is distribution independent. This allows for an efficient characterization of the distribution under the null hypothesis $\mathcal{H}_0$. Efficient numerical approximations to this series can be found in numerical analysis handbooks (Press et al., 1994). The distribution under the alternative $p \ne q$, however, is unknown.
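As a concrete illustration of the Kolmogorov-Smirnov statistic and Smirnov's limiting distribution, the sketch below computes $\|F_X - F_Y\|_\infty$ by a merge over the two sorted samples and approximates the tail probability by truncating the alternating series. It is a minimal example under stated assumptions, not a replacement for a numerical library: the function names and the truncation at 100 terms are hypothetical choices, the p-value is asymptotic only, and the truncated series is inaccurate for very small arguments.

```python
import math


def ks_statistic(xs, ys):
    """Largest absolute gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    m, n = len(xs), len(ys)
    i = j = 0
    gap = 0.0
    while i < m and j < n:
        z = min(xs[i], ys[j])
        # Advance both pointers past every observation <= z (handles ties).
        while i < m and xs[i] <= z:
            i += 1
        while j < n and ys[j] <= z:
            j += 1
        gap = max(gap, abs(i / m - j / n))
    return gap


def smirnov_tail(x, terms=100):
    """Truncated series 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 x^2)."""
    if x <= 0.0:
        return 1.0
    s = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * x * x)
            for j in range(1, terms + 1))
    return min(max(2.0 * s, 0.0), 1.0)


def ks_asymptotic_pvalue(xs, ys):
    """Asymptotic p-value under the null hypothesis p = q."""
    m, n = len(xs), len(ys)
    return smirnov_tail(math.sqrt(m * n / (m + n)) * ks_statistic(xs, ys))
```

For example, `smirnov_tail(1.36)` is close to 0.05, which is the origin of the familiar critical value 1.36 for a level 0.05 two-sample K-S test on large samples.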
= 1. n n 2 )= n n 2 σ2 = σ2
SAMLE STATISTICS A rado saple of size fro a distributio f(x is a set of rado variables x 1,x,,x which are idepedetly ad idetically distributed with x i f(x for all i Thus, the joit pdf of the rado saple
More informationA Comparison of Hypothesis Testing Methods for the Mean of a LogNormal Distribution
World Applied Scieces Joural (6): 845849 ISS 88495 IDOSI Publicatios A Copariso of Hypothesis Testig ethods for the ea of a ogoral Distributio 3 F. egahdari K. Abdollahezhad ad A.A. Jafari Islaic Azad
More information3. Covariance and Correlation
Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics
More informationThe Binomial Multi Section Transformer
4/15/21 The Bioial Multisectio Matchig Trasforer.doc 1/17 The Bioial Multi Sectio Trasforer Recall that a ultisectio atchig etwork ca be described usig the theory of sall reflectios as: where: Γ ( ω
More informationIn nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008
I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces
More informationChapter 7 Methods of Finding Estimators
Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of
More informationarxiv:0903.5136v2 [math.pr] 13 Oct 2009
First passage percolatio o rado graphs with fiite ea degrees Shakar Bhaidi Reco va der Hofstad Gerard Hooghiestra October 3, 2009 arxiv:0903.536v2 [ath.pr 3 Oct 2009 Abstract We study first passage percolatio
More informationCase Study. Normal and t Distributions. Density Plot. Normal Distributions
Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca
More informationI. Chisquared Distributions
1 M 358K Supplemet to Chapter 23: CHISQUARED DISTRIBUTIONS, TDISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad tdistributios, we first eed to look at aother family of distributios, the chisquared distributios.
More informationNPTEL STRUCTURAL RELIABILITY
NPTEL Course O STRUCTURAL RELIABILITY Module # 0 Lecture 1 Course Format: Web Istructor: Dr. Aruasis Chakraborty Departmet of Civil Egieerig Idia Istitute of Techology Guwahati 1. Lecture 01: Basic Statistics
More informationProperties of MLE: consistency, asymptotic normality. Fisher information.
Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout
More informationRADICALS AND SOLVING QUADRATIC EQUATIONS
RADICALS AND SOLVING QUADRATIC EQUATIONS Evaluate Roots Overview of Objectives, studets should be able to:. Evaluate roots a. Siplify expressios of the for a b. Siplify expressios of the for a. Evaluate
More informationORDERS OF GROWTH KEITH CONRAD
ORDERS OF GROWTH KEITH CONRAD Itroductio Gaiig a ituitive feel for the relative growth of fuctios is importat if you really wat to uderstad their behavior It also helps you better grasp topics i calculus
More informationCHAPTER 4: NET PRESENT VALUE
EMBA 807 Corporate Fiace Dr. Rodey Boehe CHAPTER 4: NET PRESENT VALUE (Assiged probles are, 2, 7, 8,, 6, 23, 25, 28, 29, 3, 33, 36, 4, 42, 46, 50, ad 52) The title of this chapter ay be Net Preset Value,
More informationClass Meeting # 16: The Fourier Transform on R n
MATH 18.152 COUSE NOTES  CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,
More informationHypothesis testing. Null and alternative hypotheses
Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate
More informationConvexity, Inequalities, and Norms
Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for
More informationSequences II. Chapter 3. 3.1 Convergent Sequences
Chapter 3 Sequeces II 3. Coverget Sequeces Plot a graph of the sequece a ) = 2, 3 2, 4 3, 5 + 4,...,,... To what limit do you thik this sequece teds? What ca you say about the sequece a )? For ǫ = 0.,
More information0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5
Sectio 13 KolmogorovSmirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.
More information7. Sample Covariance and Correlation
1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y
More informationLecture 4: Cauchy sequences, BolzanoWeierstrass, and the Squeeze theorem
Lecture 4: Cauchy sequeces, BolzaoWeierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits
More informationMaximum Likelihood Estimators.
Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio
More informationAsymptotic Growth of Functions
CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll
More informationConfidence Intervals for One Mean with Tolerance Probability
Chapter 421 Cofidece Itervals for Oe Mea with Tolerace Probability Itroductio This procedure calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) with
More informationSequences and Series
CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their
More informationChapter 6: Variance, the law of large numbers and the MonteCarlo method
Chapter 6: Variace, the law of large umbers ad the MoteCarlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value
More informationTHE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n
We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample
More informationInfinite Sequences and Series
CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...
More informationChapter 3. Compound Interest. Section 2 Compound and Continuous Compound Interest. Solution. Example
Chapter 3 Matheatics of Fiace Sectio 2 Copoud ad Cotiuous Copoud Iterest Copoud Iterest Ulike siple iterest, copoud iterest o a aout accuulates at a faster rate tha siple iterest. The basic idea is that
More informationCDAS: A Crowdsourcing Data Analytics System
CDAS: A Crowdsourcig Data Aalytics Syste Xua Liu,MeiyuLu, Beg Chi Ooi, Yaya She,SaiWu, Meihui Zhag School of Coputig, Natioal Uiversity of Sigapore, Sigapore College of Coputer Sciece, Zhejiag Uiversity,
More informationDepartment of Computer Science, University of Otago
Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS200609 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly
More informationif A S, then X \ A S, and if (A n ) n is a sequence of sets in S, then n A n S,
Lecture 5: Borel Sets Topologically, the Borel sets i a topological space are the σalgebra geerated by the ope sets. Oe ca build up the Borel sets from the ope sets by iteratig the operatios of complemetatio
More informationAlternatives To Pearson s and Spearman s Correlation Coefficients
Alteratives To Pearso s ad Spearma s Correlatio Coefficiets Floreti Smaradache Chair of Math & Scieces Departmet Uiversity of New Mexico Gallup, NM 8730, USA Abstract. This article presets several alteratives
More informationARITHMETIC AND GEOMETRIC PROGRESSIONS
Arithmetic Ad Geometric Progressios Sequeces Ad ARITHMETIC AND GEOMETRIC PROGRESSIONS Successio of umbers of which oe umber is desigated as the first, other as the secod, aother as the third ad so o gives
More informationThe Sum of the Harmonic Series Is Not Enough. = a m π 2 + b m log 2 m.
The Su of the Haroic Series Is Not Eough I Proble 67 Ovidiu Furdui cojectured that for each iteger there are correspodig ratioal ubers a ad b such that S : log H H a π + b log Here log x is the atural
More informationHypothesis Tests Applied to Means
The Samplig Distributio of the Mea Hypothesis Tests Applied to Meas Recall that the samplig distributio of the mea is the distributio of sample meas that would be obtaied from a particular populatio (with
More informationConfidence Intervals for One Mean
Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a
More information4.1 Sigma Notation and Riemann Sums
0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas
More informationTHE HEIGHT OF qbinary SEARCH TREES
THE HEIGHT OF qbinary SEARCH TREES MICHAEL DRMOTA AND HELMUT PRODINGER Abstract. q biary search trees are obtaied from words, equipped with the geometric distributio istead of permutatios. The average
More informationStandard Errors and Confidence Intervals
Stadard Errors ad Cofidece Itervals Itroductio I the documet Data Descriptio, Populatios ad the Normal Distributio a sample had bee obtaied from the populatio of heights of 5yearold boys. If we assume
More informationLecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)
18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the BruMikowski iequality for boxes. Today we ll go over the
More informationCenter, Spread, and Shape in Inference: Claims, Caveats, and Insights
Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the
More informationg x is a generator polynomial and generates a cyclic code.
Epress, a Iteratioal Joural of Multi Discipliary Research ISSN: 2348 2052, Vol, Issue 6, Jue 204 Available at: wwwepressjouralco Abstract ENUMERATION OF CYCLIC CODES OVER GF (5) By Flora Mati Ruji Departet
More informationChapter 5: Inner Product Spaces
Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples
More informationLecture 7: Borel Sets and Lebesgue Measure
EE50: Probability Foudatios for Electrical Egieers JulyNovember 205 Lecture 7: Borel Sets ad Lebesgue Measure Lecturer: Dr. Krisha Jagaatha Scribes: Ravi Kolla, Aseem Sharma, Vishakh Hegde I this lecture,
More informationDefinition. Definition. 72 Estimating a Population Proportion. Definition. Definition
7 stimatig a Populatio Proportio I this sectio we preset methods for usig a sample proportio to estimate the value of a populatio proportio. The sample proportio is the best poit estimate of the populatio
More informationIncremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
More informationMARTINGALES AND A BASIC APPLICATION
MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measuretheoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this
More informationStatistical inference: example 1. Inferential Statistics
Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either
More informationUnit 20 Hypotheses Testing
Uit 2 Hypotheses Testig Objectives: To uderstad how to formulate a ull hypothesis ad a alterative hypothesis about a populatio proportio, ad how to choose a sigificace level To uderstad how to collect
More informationMethods of Evaluating Estimators
Math 541: Statistical Theory II Istructor: Sogfeg Zheg Methods of Evaluatig Estimators Let X 1, X 2,, X be i.i.d. radom variables, i.e., a radom sample from f(x θ), where θ is ukow. A estimator of θ is
More informationModule 4: Mathematical Induction
Module 4: Mathematical Iductio Theme 1: Priciple of Mathematical Iductio Mathematical iductio is used to prove statemets about atural umbers. As studets may remember, we ca write such a statemet as a predicate
More informationThe Harmonic Series Diverges Again and Again
The Harmoic Series Diverges Agai ad Agai Steve J. Kifowit Prairie State College Terra A. Stamps Prairie State College The harmoic series, = = 3 4 5, is oe of the most celebrated ifiite series of mathematics.
More informationDistributed Storage Allocations for Optimal Delay
Distributed Storage Allocatios for Optial Delay Derek Leog Departet of Electrical Egieerig Califoria Istitute of echology Pasadea, Califoria 925, USA derekleog@caltechedu Alexadros G Diakis Departet of
More informationLesson 17 Pearson s Correlation Coefficient
Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) types of data scatter plots measure of directio measure of stregth Computatio covariatio of X ad Y uique variatio i X ad Y measurig
More informationA probabilistic proof of a binomial identity
A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two
PSYCHOLOGICAL STATISTICS
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Counselling Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Inferential statistics is the branch of statistics
N04/5/MATHL/HP2/ENG/TZ0/XX MATHEMATICS HIGHER LEVEL PAPER 2. Thursday 4 November 2004 (morning) 3 hours INSTRUCTIONS TO CANDIDATES
c IB MATHEMATICS HIGHER LEVEL PAPER DIPLOMA PROGRAMME PROGRAMME DU DIPLÔME DU BI PROGRAMA DEL DIPLOMA DEL BI N/5/MATHL/HP/ENG/TZ/XX 887 Thursday November (morning) hours INSTRUCTIONS TO CANDIDATES! Do not
Notes on Hypothesis Testing
Probability & Statistics Grinshpan Notes on Hypothesis Testing A random sample $X = (X_1, \ldots, X_n)$ is observed, with joint pmf/pdf $f_\theta(x_1, \ldots, x_n)$. The values $x = (x_1, \ldots, x_n)$ of $X$ lie in some sample space $\mathcal{X}$. The parameter
1 Correlation and Regression Analysis
1 Correlation and Regression Analysis In this section we will be investigating the relationship between two continuous variables, such as height and weight, the concentration of an injected drug and heart rate, or the consumption
5 Boolean Decision Trees (February 11)
5 Boolean Decision Trees (February 11) 5.1 Graph Connectivity Suppose we are given an undirected graph $G$, represented as a boolean adjacency matrix $A = (a_{ij})$, where $a_{ij} = 1$ if and only if vertices $i$ and $j$ are connected
Geometric Sequences and Series. Geometric Sequences. Definition of Geometric Sequence. such that. a2 4
Section 9.3 Geometric Sequences and Series What you should learn: Recognize, write, and find the nth terms of geometric sequences. Find nth partial
Measurable Functions
Measurable Functions Dung Le 1 Definition It is necessary to determine the class of functions that will be considered for the Lebesgue integration. We want to guarantee that the sets which arise when working with these
Section 11.3: The Integral Test
Section 11.3: The Integral Test Most of the series we have looked at have either diverged or have converged and we have been able to find what they converge to. In general, however, the problem is much more difficult
AQA STATISTICS 1 REVISION NOTES
AQA STATISTICS 1 REVISION NOTES AVERAGES AND MEASURES OF SPREAD www.mathsbox.org.uk Mode: the most common or most popular data value; the only average that can be used for qualitative data; not suitable if
Integer programming solution methods. Exactly where on this line this optimal solution lies we do not know, but it must be somewhere!
Integer programming solution methods J E Beasley Introduction Suppose that we have some problem instance of a combinatorial optimisation problem and further suppose that it is a minimisation problem. If, as in Figure 1, we draw
Modified Line Search Method for Global Optimization
Modified Line Search Method for Global Optimization Crina Grosan and Ajith Abraham Center of Excellence for Quantifiable Quality of Service Norwegian University of Science and Technology Trondheim, Norway {crina, ajith}@q2s.ntnu.no
Nonlife insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring
Nonlife insurance mathematics Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring Main issues so far: Why does insurance work? How is risk premium defined and why is it important? How can claim frequency
Stretch Factor of Curveball Routing in Wireless Network: Cost of Load Balancing
Stretch Factor of Curveball Routing in Wireless Network: Cost of Load Balancing Fan Li Yu Wang The University of North Carolina at Charlotte, USA Email: {fli, yu.wang}@uncc.edu Abstract Routing in wireless networks has been
ECONOMICS. Calculating loan interest no. 3.758
FARM & RANCH SERIES ECONOMICS Calculating loan interest no. 3.758 by Norman L. Dalsted and Paul H. Gutierrez Quick Facts... The annual percentage rate provides a common basis to compare interest charges associated with
5: Introduction to Estimation
5: Introduction to Estimation Contents: Acronyms and symbols; Statistical inference; Estimating µ with confidence; Sampling distribution of the mean; Confidence interval for µ when σ is known beforehand; Sample
GSR: A Global Stripebased Redistribution Approach to Accelerate RAID5 Scaling
GSR: A Global Stripebased Redistribution Approach to Accelerate RAID5 Scaling Chentao Wu and Xubin He Department of Electrical & Computer Engineering Virginia Commonwealth University {wuc4,xhe2}@vcu.edu Abstract Under the severe
Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13 Introduction At this point, we have seen enough examples that it is worth just taking stock of our model of probability and many
1. C. The formula for the confidence interval for a population mean is: $\bar{x} \pm t \frac{s}{\sqrt{n}}$, which was
1. C. The formula for the confidence interval for a population mean is: $\bar{x} \pm t \frac{s}{\sqrt{n}}$, which was based on the sample mean. So, $\bar{x}$ is guaranteed to be in the interval you form. D. Use the rule: p-value
Taking DCOP to the Real World: Efficient Complete Solutions for Distributed MultiEvent Scheduling
Taking DCOP to the Real World: Efficient Complete Solutions for Distributed MultiEvent Scheduling Rajiv T. Maheswaran, Milind Tambe, Emma Bowring, Jonathan P. Pearce, and Pradeep Varakantham University of Southern California
Chapter 7 Sampling Distributions. 1 Introduction. What is statistics? It consists of three major areas:
Chapter 7 Sampling Distributions 1 Introduction What is statistics? It consists of three major areas: Data Collection: sampling plans and experimental designs. Descriptive Statistics: numerical and graphical summaries
Entropy of bicapacities
Entropy of bicapacities Ivan Kojadinovic LINA CNRS FRE 2729 Site école polytechnique de l'univ. de Nantes Rue Christian Pauc 44306 Nantes, France ivan.kojadinovic@univ-nantes.fr Jean-Luc Marichal Applied Mathematics
B1. Fourier Analysis of Discrete Time Signals
B. Fourier Analysis of Discrete Time Signals Objectives: Introduce discrete time periodic signals. Define the Discrete Fourier Series (DFS) expansion of periodic signals. Define the Discrete Fourier Transform (DFT)
LECTURE 13: Crossvalidation
LECTURE 3: Crossvalidation Resampling methods Cross Validation Bootstrap Bias and variance estimation with the Bootstrap Threeway data partitioning Introduction to Pattern Analysis Ricardo Gutierrez-Osuna Texas A&M
Economics 140A Confidence Intervals and Hypothesis Testing
Economics 140A Confidence Intervals and Hypothesis Testing Obtaining an estimate of a parameter is not the final purpose of statistical inference because it is highly unlikely that the population value of a parameter is
Exploratory Data Analysis
1 Exploratory Data Analysis Exploratory data analysis is often the first step in a statistical analysis, for it helps understanding the main features of the particular sample that an analyst is using. Intelligent descriptions
9.8: THE POWER OF A TEST
9.8: THE POWER OF A TEST In the initial discussion of statistical hypothesis testing, the two types of risks that are taken when decisions are made about population parameters based
University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution
University of California, Los Angeles Department of Statistics Statistics 100B Instructor: Nicolas Christou Three important distributions: Distributions related to the normal distribution Chi-square ($\chi^2$) distribution.
BASIC STATISTICS. $f(x_1, x_2, \ldots, x_n) = f(x_1) f(x_2) \cdots f(x_n) = \prod_{i=1}^{n} f(x_i)$ (1)
BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS. Random Sample. The random variables $X_1, X_2, \ldots, X_n$ are called a random sample of size $n$ from the population $f(x)$ if $X_1, X_2, \ldots, X_n$ are mutually independent
An example of nonquenched convergence in the conditional central limit theorem for partial sums of a linear process
An example of nonquenched convergence in the conditional central limit theorem for partial sums of a linear process Dalibor Volný and Michael Woodroofe Abstract A causal linear process $\ldots, X_{-1}, X_0, X_1, \ldots$ is constructed for which
Throughput and Delay Analysis of Hybrid Wireless Networks with MultiHop Uplinks
This paper was presented as part of the main technical program at IEEE INFOCOM 0 Throughput and Delay Analysis of Hybrid Wireless Networks with MultiHop Uplinks Devu Manikantan Shila, Yu Cheng and Tricha Anjali Dept.
The Computational Rise and Fall of Fairness
Proceedings of the TwentyEighth AAAI Conference on Artificial Intelligence The Computational Rise and Fall of Fairness John P Dickerson Carnegie Mellon University dickerso@cs.cmu.edu Jonathan Goldman Carnegie Mellon University
13 Fast Fourier Transform (FFT)
13 Fast Fourier Transform (FFT) The fast Fourier transform (FFT) is an algorithm for the efficient implementation of the discrete Fourier transform. We begin our discussion once more with the continuous Fourier transform.
SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx
SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Find the following using the definition of the Riemann integral: a 0 x + dx 3 Consider the partition P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the interval
Subject CT5 Contingencies Core Technical Syllabus
Subject CT5 Contingencies Core Technical Syllabus for the 2015 exams 1 June 2014 Aim The aim of the Contingencies subject is to provide a grounding in the mathematical techniques which can be used to model and value
Key Ideas Section 8-1: Overview hypothesis testing Hypothesis Hypothesis Test Section 8-2: Basics of Hypothesis Testing Null Hypothesis
Chapter 8 Key Ideas Hypothesis (Null and Alternative), Hypothesis Test, Test Statistic, P-value, Type I Error, Type II Error, Significance Level, Power Section 8-1: Overview Confidence Intervals (Chapter 7) are
Lecture Notes CMSC 251
We have this messy summation to solve though. First observe that the value remains constant throughout the sum, and so we can pull it out front. Also note that we can write 3 i / i and (3/) i T () = log 3 (log ) 1
Week 3 Conditional probabilities, Bayes formula, Expected value of a random variable
Week 3 Conditional probabilities, Bayes formula, Expected value of a random variable We recall our discussion of 5 card poker hands. Example 13: a) What is the probability of event A that a 5
1 The Binomial Theorem: Another Approach
The Binomial Theorem: Another Approach Pascal's Triangle In class (and in our text) we saw that, for integer $n$, the binomial theorem can be stated $(a + b)^n = c_0 a^n + c_1 a^{n-1} b + c_2 a^{n-2} b^2 + \cdots + c_{n-1} a b^{n-1} + c_n b^n$, where the coefficients
Plugin martingales for testing exchangeability online
Plugin martingales for testing exchangeability online Valentina Fedorova, Alex Gammerman, Ilia Nouretdinov, and Vladimir Vovk Computer Learning Research Centre Royal Holloway, University of London, UK {valentina,ilia,alex,vovk}@cs.rhul.ac.uk
Divide and Conquer, Solving Recurrences, Integer Multiplication Scribe: Juliana Cook (2015), V. Williams Date: April 6, 2016
CS 6, Lecture 3 Divide and Conquer, Solving Recurrences, Integer Multiplication Scribe: Juliana Cook (2015), V. Williams Date: April 6, 2016 Introduction Today we will continue to talk about divide and conquer, and go into
the product of the hooklengths is over all boxes of the diagram. We denote by d (n) the number of semistandard tableaux:
On Representation Theory in Computer Vision Problems Amnon Shashua School of Computer Science and Engineering Hebrew University of Jerusalem Jerusalem 91904, Israel email: shashua@cs.huji.ac.il Roy Meshulam Department of Mathematics
Chapter 10. Hypothesis Tests Regarding a Parameter. 10.1 The Language of Hypothesis Testing
Chapter 10 Hypothesis Tests Regarding a Parameter A second type of statistical inference is hypothesis testing. Here, rather than use either a point (or interval) estimate from a simple random sample to approximate
Section 9.2 Series and Convergence
Section 9.2 Series and Convergence Goals of Chapter 9: Approximate Pi. Prove infinite series are another important application of limits, derivatives, approximation, slope, and concavity of functions. Find challenging antiderivatives
ARTICLE IN PRESS. Statistics & Probability Letters ( ) A Kolmogorovtype test for monotonicity of regression. Cecile Durot
Statistics & Probability Letters Abstract A Kolmogorovtype test for monotonicity of regression Cecile Durot Laboratoire