A Kernel Two-Sample Test


Journal of Machine Learning Research 13 (2012). Submitted 4/08; Revised 11/11; Published 3/12

Arthur Gretton∗
MPI for Intelligent Systems, Spemannstrasse, Tübingen, Germany

Karsten M. Borgwardt†
Machine Learning and Computational Biology Research Group, Max Planck Institutes Tübingen, Spemannstrasse, Tübingen, Germany

Malte J. Rasch‡
19 XinJieKouWai St., State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, P.R. China

Bernhard Schölkopf
MPI for Intelligent Systems, Spemannstrasse, Tübingen, Germany

Alexander Smola§
Yahoo! Research, 2821 Mission College Blvd, Santa Clara, CA 95054, USA

Editor: Nicolas Vayatis

Abstract

We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.

∗. Also at Gatsby Computational Neuroscience Unit, CSML, 17 Queen Square, London WC1N 3AR, UK.
†. This work was carried out while K.M.B. was with the Ludwig-Maximilians-Universität München.
‡. This work was carried out while M.J.R. was with the Graz University of Technology.
§. Also at The Australian National University, Canberra, ACT 0200, Australia.

© 2012 Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf and Alexander Smola.

Keywords: kernel methods, two-sample test, uniform convergence bounds, schema matching, integral probability metric, hypothesis testing

1. Introduction

We address the problem of comparing samples from two probability distributions, by proposing statistical tests of the null hypothesis that these distributions are equal against the alternative hypothesis that these distributions are different (this is called the two-sample problem). Such tests have application in a variety of areas. In bioinformatics, it is of interest to compare microarray data from identical tissue types as measured by different laboratories, to detect whether the data may be analysed jointly, or whether differences in experimental procedure have caused systematic differences in the data distributions. Equally of interest are comparisons between microarray data from different tissue types, either to determine whether two subtypes of cancer may be treated as statistically indistinguishable from a diagnosis perspective, or to detect differences in healthy and cancerous tissue. In database attribute matching, it is desirable to merge databases containing multiple fields, where it is not known in advance which fields correspond: the fields are matched by maximising the similarity in the distributions of their entries.

We test whether distributions $p$ and $q$ are different on the basis of samples drawn from each of them, by finding a well behaved (e.g., smooth) function which is large on the points drawn from $p$, and small (as negative as possible) on the points from $q$. We use as our test statistic the difference between the mean function values on the two samples; when this is large, the samples are likely from different distributions. We call this test statistic the Maximum Mean Discrepancy (MMD).

Clearly the quality of the MMD as a statistic depends on the class $\mathcal{F}$ of smooth functions that define it. On one hand, $\mathcal{F}$ must be rich enough so that the population MMD vanishes if and only if $p = q$. On the other hand, for the test to be consistent in power, $\mathcal{F}$ needs to be restrictive enough for the empirical estimate of the MMD to converge quickly to its expectation as the sample size increases. We will use the unit balls in characteristic reproducing kernel Hilbert spaces (Fukumizu et al., 2008; Sriperumbudur et al., 2010b) as our function classes, since these will be shown to satisfy both of the foregoing properties. We also review classical metrics on distributions, namely the Kolmogorov-Smirnov and Earth-Mover's distances, which are based on different function classes; collectively these are known as integral probability metrics (Müller, 1997). On a more practical note, the MMD has a reasonable computational cost, when compared with other two-sample tests: given $m$ points sampled from $p$ and $n$ from $q$, the cost is $O((m+n)^2)$ time. We also propose a test statistic with a computational cost of $O(m+n)$: the associated test can achieve a given Type II error at a lower overall computational cost than the quadratic-cost test, by looking at a larger volume of data.

We define three nonparametric statistical tests based on the MMD. The first two tests are distribution-free, meaning they make no assumptions regarding $p$ and $q$, albeit at the expense of being conservative in detecting differences between the distributions. The third test is based on the asymptotic distribution of the MMD, and is in practice more sensitive to differences in distribution at small sample sizes. The present work synthesizes and expands on results of Gretton et al. (2007a,b) and Smola et al. (2007),¹ who in turn build on the earlier work of Borgwardt et al. (2006). Note that the latter addresses only the third kind of test, and that the approach of Gretton et al. (2007a,b) is rigorous in its treatment of the asymptotic distribution of the test statistic under the null hypothesis.

We begin our presentation in Section 2 with a formal definition of the MMD. We review the notion of a characteristic RKHS, and establish that when $\mathcal{F}$ is a unit ball in a characteristic RKHS, then the population MMD is zero if and only if $p = q$. We further show that universal RKHSs in the sense of Steinwart (2001) are characteristic. In Section 3, we give an overview of hypothesis testing as it applies to the two-sample problem, and review alternative test statistics, including the $L_2$ distance between kernel density estimates (Anderson et al., 1994), which is the prior approach closest to our work. We present our first two hypothesis tests in Section 4, based on two different bounds on the deviation between the population and empirical MMD. We take a different approach in Section 5, where we use the asymptotic distribution of the empirical MMD estimate as the basis for a third test. When large volumes of data are available, the cost of computing the MMD (quadratic in the sample size) may be excessive: we therefore propose in Section 6 a modified version of the MMD statistic that has a linear cost in the number of samples, and an associated asymptotic test. In Section 7, we provide an overview of methods related to the MMD in the statistics and machine learning literature. We also review alternative function classes for which the MMD defines a metric on probability distributions. Finally, in Section 8, we demonstrate the performance of MMD-based two-sample tests on problems from neuroscience, bioinformatics, and attribute matching using the Hungarian marriage method. Our approach performs well on high dimensional data with low sample size; in addition, we are able to successfully distinguish distributions on graph data, for which ours is the first proposed test. A Matlab implementation of the tests is at gretton/mmd/mmd.htm.

1. In particular, most of the proofs here were not provided by Gretton et al. (2007a), but in an accompanying technical report (Gretton et al., 2008a), which this document replaces.

2. The Maximum Mean Discrepancy

In this section, we present the maximum mean discrepancy (MMD), and describe conditions under which it is a metric on the space of probability distributions. The MMD is defined in terms of particular function spaces that witness the difference in distributions: we therefore begin in Section 2.1 by introducing the MMD for an arbitrary function space. In Section 2.2, we compute both the population MMD and two empirical estimates when the associated function space is a reproducing kernel Hilbert space, and in Section 2.3 we derive the RKHS function that witnesses the MMD for a given pair of distributions.

2.1 Definition of the Maximum Mean Discrepancy

Our goal is to formulate a statistical test that answers the following question:

Problem 1 Let $x$ and $y$ be random variables defined on a topological space $\mathcal{X}$, with respective Borel probability measures $p$ and $q$. Given observations $X := \{x_1, \ldots, x_m\}$ and $Y := \{y_1, \ldots, y_n\}$, independently and identically distributed (i.i.d.) from $p$ and $q$, respectively, can we decide whether $p \neq q$?

Where there is no ambiguity, we use the shorthand notation $\mathbf{E}_x[f(x)] := \mathbf{E}_{x \sim p}[f(x)]$ and $\mathbf{E}_y[f(y)] := \mathbf{E}_{y \sim q}[f(y)]$ to denote expectations with respect to $p$ and $q$, respectively, where $x \sim p$ indicates $x$ has distribution $p$. To start with, we wish to determine a criterion that, in the population setting, takes on a unique and distinctive value only when $p = q$. It will be defined based on Lemma 9.3.2 of Dudley (2002).

Lemma 1 Let $(\mathcal{X}, d)$ be a metric space, and let $p, q$ be two Borel probability measures defined on $\mathcal{X}$. Then $p = q$ if and only if $\mathbf{E}_x(f(x)) = \mathbf{E}_y(f(y))$ for all $f \in C(\mathcal{X})$, where $C(\mathcal{X})$ is the space of bounded continuous functions on $\mathcal{X}$.

Although $C(\mathcal{X})$ in principle allows us to identify $p = q$ uniquely, it is not practical to work with such a rich function class in the finite sample setting. We thus define a more general class of statistic, for as yet unspecified function classes $\mathcal{F}$, to measure the disparity between $p$ and $q$ (Fortet and Mourier, 1953; Müller, 1997).

Definition 2 Let $\mathcal{F}$ be a class of functions $f : \mathcal{X} \to \mathbb{R}$ and let $p, q, x, y, X, Y$ be defined as above. We define the maximum mean discrepancy (MMD) as

$$\mathrm{MMD}[\mathcal{F}, p, q] := \sup_{f \in \mathcal{F}} \left( \mathbf{E}_x[f(x)] - \mathbf{E}_y[f(y)] \right). \tag{1}$$

In the statistics literature, this is known as an integral probability metric (Müller, 1997). A biased empirical estimate² of the MMD is obtained by replacing the population expectations with empirical expectations computed on the samples $X$ and $Y$,

$$\mathrm{MMD}_b[\mathcal{F}, X, Y] := \sup_{f \in \mathcal{F}} \left( \frac{1}{m} \sum_{i=1}^{m} f(x_i) - \frac{1}{n} \sum_{i=1}^{n} f(y_i) \right). \tag{2}$$

We must therefore identify a function class that is rich enough to uniquely identify whether $p = q$, yet restrictive enough to provide useful finite sample estimates (the latter property will be established in subsequent sections).

2.2 The MMD in Reproducing Kernel Hilbert Spaces

In the present section, we propose as our MMD function class $\mathcal{F}$ the unit ball in a reproducing kernel Hilbert space $\mathcal{H}$. We will provide finite sample estimates of this quantity (both biased and unbiased), and establish conditions under which the MMD can be used to distinguish between probability measures. Other possible function classes $\mathcal{F}$ are discussed in Sections 7.1 and 7.2.

We first review some properties of $\mathcal{H}$ (Schölkopf and Smola, 2002). Since $\mathcal{H}$ is an RKHS, the operator of evaluation $\delta_x$ mapping $f \in \mathcal{H}$ to $f(x) \in \mathbb{R}$ is continuous. Thus, by the Riesz representation theorem (Reed and Simon, 1980, Theorem II.4), there is a feature mapping $\phi(x)$ from $\mathcal{X}$ to $\mathbb{R}$ such that $f(x) = \langle f, \phi(x) \rangle_{\mathcal{H}}$. This feature mapping takes the canonical form $\phi(x) = k(x, \cdot)$ (Steinwart and Christmann, 2008, Lemma 4.19), where $k(x_1, x_2) : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is positive definite, and the notation $k(x, \cdot)$ indicates the kernel has one argument fixed at $x$, and the second free. Note in particular that $\langle \phi(x), \phi(y) \rangle_{\mathcal{H}} = k(x, y)$. We will generally use the more concise notation $\phi(x)$ for the feature mapping, although in some cases it will be clearer to write $k(x, \cdot)$.

We next extend the notion of feature map to the embedding of a probability distribution: we will define an element $\mu_p \in \mathcal{H}$ such that $\mathbf{E}_x f = \langle f, \mu_p \rangle_{\mathcal{H}}$ for all $f \in \mathcal{H}$, which we call the mean embedding of $p$. Embeddings of probability measures into reproducing kernel Hilbert spaces are well established in the statistics literature: see Berlinet and Thomas-Agnan (2004, Chapter 4) for further detail and references. We begin by establishing conditions under which the mean embedding $\mu_p$ exists (Fukumizu et al., 2004, p. 93), (Sriperumbudur et al., 2010b, Theorem 1).

2. The empirical MMD defined below has an upward bias; we will define an unbiased statistic in the following section.
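As a rough numerical illustration of the mean embedding, the sketch below checks the identity $\mathbf{E}_x f = \langle f, \mu_p \rangle_{\mathcal{H}}$ with the population quantities replaced by their sample versions: the sample average of $f$ over $X$ on one side, and the inner product with the empirical embedding $\mu_X = \frac{1}{m}\sum_i \phi(x_i)$ on the other. The Gaussian kernel, the bandwidth `sigma`, the expansion points and coefficients, and all function names are assumptions made for illustration only.

```python
import math
import random

def gaussian_kernel(x, y, sigma=1.0):
    """Assumed kernel: k(x, y) = exp(-(x - y)^2 / (2 sigma^2)) for scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def mean_embedding_at(t, sample, kernel=gaussian_kernel):
    """Empirical mean embedding evaluated at t: mu_X(t) = (1/m) sum_i k(x_i, t)."""
    return sum(kernel(x, t) for x in sample) / len(sample)

def rkhs_function(t, centers, alphas, kernel=gaussian_kernel):
    """f(t) = sum_j alpha_j k(z_j, t), a function in the RKHS of the kernel."""
    return sum(a * kernel(z, t) for z, a in zip(centers, alphas))

random.seed(0)
X = [random.gauss(0.0, 1.0) for _ in range(200)]   # hypothetical sample from p
centers, alphas = [-1.0, 0.5, 2.0], [0.3, -1.2, 0.7]  # hypothetical f

# Left-hand side: the sample average of f over X.
avg_f = sum(rkhs_function(x, centers, alphas) for x in X) / len(X)

# Right-hand side: <f, mu_X>_H = sum_j alpha_j mu_X(z_j), by the reproducing property.
inner = sum(a * mean_embedding_at(z, X) for z, a in zip(centers, alphas))
```

The two quantities agree up to floating-point rounding, since both are the same double sum over kernel evaluations taken in a different order.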

Lemma 3 If $k(\cdot, \cdot)$ is measurable and $\mathbf{E}_x \sqrt{k(x, x)} < \infty$ then $\mu_p \in \mathcal{H}$.

Proof The linear operator $T_p f := \mathbf{E}_x f$ for all $f \in \mathcal{F}$ is bounded under the assumption, since

$$|T_p f| = |\mathbf{E}_x f| \le \mathbf{E}_x |f| = \mathbf{E}_x |\langle f, \phi(x) \rangle_{\mathcal{H}}| \le \mathbf{E}_x \left( \sqrt{k(x, x)} \, \|f\|_{\mathcal{H}} \right).$$

Hence by the Riesz representer theorem, there exists a $\mu_p \in \mathcal{H}$ such that $T_p f = \langle f, \mu_p \rangle_{\mathcal{H}}$. If we set $f = \phi(t) = k(t, \cdot)$, we obtain $\mu_p(t) = \langle \mu_p, k(t, \cdot) \rangle_{\mathcal{H}} = \mathbf{E}_x k(t, x)$: in other words, the mean embedding of the distribution $p$ is the expectation under $p$ of the canonical feature map.

We next show that the MMD may be expressed as the distance in $\mathcal{H}$ between mean embeddings (Borgwardt et al., 2006).

Lemma 4 Assume the condition in Lemma 3 for the existence of the mean embeddings $\mu_p$, $\mu_q$ is satisfied. Then

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \|\mu_p - \mu_q\|_{\mathcal{H}}^2.$$

Proof

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \left[ \sup_{\|f\|_{\mathcal{H}} \le 1} \left( \mathbf{E}_x[f(x)] - \mathbf{E}_y[f(y)] \right) \right]^2 = \left[ \sup_{\|f\|_{\mathcal{H}} \le 1} \langle \mu_p - \mu_q, f \rangle_{\mathcal{H}} \right]^2 = \|\mu_p - \mu_q\|_{\mathcal{H}}^2.$$

We now establish a condition on the RKHS $\mathcal{H}$ under which the mean embedding $\mu_p$ is injective, which indicates that $\mathrm{MMD}[\mathcal{F}, p, q]$ is a metric³ on the Borel probability measures on $\mathcal{X}$. Evidently, this property will not hold for all $\mathcal{H}$: for instance, a polynomial RKHS of degree two cannot distinguish between distributions with the same mean and variance, but different kurtosis (Sriperumbudur et al., 2010b, Example 3). The MMD is a metric, however, when $\mathcal{H}$ is a universal RKHS, defined on a compact metric space $\mathcal{X}$. Universality requires that $k(\cdot, \cdot)$ be continuous, and $\mathcal{H}$ be dense in $C(\mathcal{X})$ with respect to the $L_\infty$ norm. Steinwart (2001) proves that the Gaussian and Laplace RKHSs are universal.

Theorem 5 Let $\mathcal{F}$ be a unit ball in a universal RKHS $\mathcal{H}$, defined on the compact metric space $\mathcal{X}$, with associated continuous kernel $k(\cdot, \cdot)$. Then $\mathrm{MMD}[\mathcal{F}, p, q] = 0$ if and only if $p = q$.

Proof The proof follows Cortes et al. (2008, Supplementary Appendix), whose approach is clearer than the original proof of Gretton et al. (2008a, p. 4).⁴ First, it is clear that $p = q$ implies $\mathrm{MMD}[\mathcal{F}, p, q]$ is zero. We now prove the converse. By the universality of $\mathcal{H}$, for any given $\epsilon > 0$ and $f \in C(\mathcal{X})$ there exists a $g \in \mathcal{H}$ such that

$$\|f - g\|_\infty \le \epsilon.$$

We next make the expansion

$$|\mathbf{E}_x f(x) - \mathbf{E}_y f(y)| \le |\mathbf{E}_x f(x) - \mathbf{E}_x g(x)| + |\mathbf{E}_x g(x) - \mathbf{E}_y g(y)| + |\mathbf{E}_y g(y) - \mathbf{E}_y f(y)|.$$

The first and third terms satisfy

$$|\mathbf{E}_x f(x) - \mathbf{E}_x g(x)| \le \mathbf{E}_x |f(x) - g(x)| \le \epsilon.$$

Next, write

$$\mathbf{E}_x g(x) - \mathbf{E}_y g(y) = \langle g, \mu_p - \mu_q \rangle_{\mathcal{H}} = 0,$$

since $\mathrm{MMD}[\mathcal{F}, p, q] = 0$ implies $\mu_p = \mu_q$. Hence

$$|\mathbf{E}_x f(x) - \mathbf{E}_y f(y)| \le 2\epsilon$$

for all $f \in C(\mathcal{X})$ and $\epsilon > 0$, which implies $p = q$ by Lemma 1.

While our result establishes the mapping $\mu_p$ is injective for universal kernels on compact domains, this result can also be shown in more general cases. Fukumizu et al. (2008) introduce the notion of characteristic kernels, these being kernels for which the mean map is injective. Fukumizu et al. establish that Gaussian and Laplace kernels are characteristic on $\mathbb{R}^d$, and thus that the associated MMD is a metric on distributions for this domain. Sriperumbudur et al. (2008, 2010b) and Sriperumbudur et al. (2011a) further explore the properties of characteristic kernels, providing a simple condition to determine whether translation invariant kernels are characteristic, and investigating the relation between universal and characteristic kernels on non-compact domains.

Given we are in an RKHS, we may easily obtain an expression for the squared MMD, $\|\mu_p - \mu_q\|_{\mathcal{H}}^2$, in terms of kernel functions, and a corresponding unbiased finite sample estimate.

Lemma 6 Given $x$ and $x'$ independent random variables with distribution $p$, and $y$ and $y'$ independent random variables with distribution $q$, the squared population MMD is

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \mathbf{E}_{x, x'}\left[k(x, x')\right] - 2\,\mathbf{E}_{x, y}\left[k(x, y)\right] + \mathbf{E}_{y, y'}\left[k(y, y')\right],$$

where $x'$ is an independent copy of $x$ with the same distribution, and $y'$ is an independent copy of $y$. An unbiased empirical estimate is a sum of two U-statistics and a sample average,

$$\mathrm{MMD}_u^2[\mathcal{F}, X, Y] = \frac{1}{m(m-1)} \sum_{i=1}^{m} \sum_{j \neq i}^{m} k(x_i, x_j) + \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i}^{n} k(y_i, y_j) - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j). \tag{3}$$

When $m = n$, a slightly simpler empirical estimate may be used. Let $Z := (z_1, \ldots, z_m)$ be $m$ i.i.d. random variables, where $z := (x, y) \sim p \times q$ (i.e., $x$ and $y$ are independent). An unbiased estimate of $\mathrm{MMD}^2$ is

$$\mathrm{MMD}_u^2[\mathcal{F}, X, Y] = \frac{1}{m(m-1)} \sum_{i \neq j}^{m} h(z_i, z_j), \tag{4}$$

which is a one-sample U-statistic with

$$h(z_i, z_j) := k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i).$$

Proof Starting from the expression for $\mathrm{MMD}^2[\mathcal{F}, p, q]$ in Lemma 4,

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \|\mu_p - \mu_q\|_{\mathcal{H}}^2 = \langle \mu_p, \mu_p \rangle_{\mathcal{H}} + \langle \mu_q, \mu_q \rangle_{\mathcal{H}} - 2 \langle \mu_p, \mu_q \rangle_{\mathcal{H}} = \mathbf{E}_{x, x'} \langle \phi(x), \phi(x') \rangle_{\mathcal{H}} + \mathbf{E}_{y, y'} \langle \phi(y), \phi(y') \rangle_{\mathcal{H}} - 2\,\mathbf{E}_{x, y} \langle \phi(x), \phi(y) \rangle_{\mathcal{H}}.$$

The proof is completed by applying $\langle \phi(x), \phi(x') \rangle_{\mathcal{H}} = k(x, x')$; the empirical estimates follow straightforwardly, by replacing the population expectations with their corresponding U-statistics and sample averages. This statistic is unbiased following Serfling (1980, Chapter 5).

Note that $\mathrm{MMD}_u^2$ may be negative, since it is an unbiased estimator of $(\mathrm{MMD}[\mathcal{F}, p, q])^2$. The only terms missing to ensure nonnegativity, however, are $h(z_i, z_i)$, which were removed to remove spurious correlations between observations. Consequently we have the bound

$$\mathrm{MMD}_u^2 + \frac{1}{m(m-1)} \sum_{i=1}^{m} \left[ k(x_i, x_i) + k(y_i, y_i) - 2 k(x_i, y_i) \right] \ge 0.$$

Moreover, while the empirical statistic for $m = n$ is an unbiased estimate of $\mathrm{MMD}^2$, it does not have minimum variance, since we ignore the cross-terms $k(x_i, y_i)$, of which there are $O(n)$. From (3), however, we see the minimum variance estimate is almost identical (Serfling, 1980, Section 5.1.4).

The biased statistic in (2) may also be easily computed following the above reasoning. Substituting the empirical estimates $\mu_X := \frac{1}{m} \sum_{i=1}^{m} \phi(x_i)$ and $\mu_Y := \frac{1}{n} \sum_{i=1}^{n} \phi(y_i)$ of the feature space means based on respective samples $X$ and $Y$, we obtain

$$\mathrm{MMD}_b[\mathcal{F}, X, Y] = \left[ \frac{1}{m^2} \sum_{i,j=1}^{m} k(x_i, x_j) - \frac{2}{mn} \sum_{i,j=1}^{m,n} k(x_i, y_j) + \frac{1}{n^2} \sum_{i,j=1}^{n} k(y_i, y_j) \right]^{1/2}. \tag{5}$$

Note that the U-statistics of (3) have been replaced by V-statistics. Intuitively we expect the empirical test statistic $\mathrm{MMD}[\mathcal{F}, X, Y]$, whether biased or unbiased, to be small if $p = q$, and large if the distributions are far apart. It costs $O((m+n)^2)$ time to compute both statistics.

2.3 Witness Function of the MMD for RKHSs

We define the witness function $f^*$ to be the RKHS function attaining the supremum in (1), and its empirical estimate $\hat{f}^*$ to be the function attaining the supremum in (2). From the reasoning in Lemma 4, it is clear that

$$f^*(t) \propto \langle \phi(t), \mu_p - \mu_q \rangle_{\mathcal{H}} = \mathbf{E}_x[k(x, t)] - \mathbf{E}_y[k(y, t)],$$

$$\hat{f}^*(t) \propto \langle \phi(t), \mu_X - \mu_Y \rangle_{\mathcal{H}} = \frac{1}{m} \sum_{i=1}^{m} k(x_i, t) - \frac{1}{n} \sum_{i=1}^{n} k(y_i, t),$$

where we have defined $\mu_X = \frac{1}{m} \sum_{i=1}^{m} \phi(x_i)$, and $\mu_Y$ by analogy. The result follows since the unit vector $v$ maximizing $\langle v, x \rangle_{\mathcal{H}}$ in a Hilbert space is $v = x / \|x\|_{\mathcal{H}}$.

We illustrate the behavior of MMD in Figure 1 using a one-dimensional example. The data $X$ and $Y$ were generated from distributions $p$ and $q$ with equal means and variances, with $p$ Gaussian and $q$ Laplacian. We chose $\mathcal{F}$ to be the unit ball in a Gaussian RKHS. The empirical estimate $\hat{f}^*$ of the function $f^*$ that witnesses the MMD (in other words, the function maximizing the mean discrepancy in (1)) is smooth, negative where the Laplace density exceeds the Gaussian density (at the center and tails), and positive where the Gaussian density is larger. The magnitude of $\hat{f}^*$ is a direct reflection of the amount by which one density exceeds the other, insofar as the smoothness constraint permits it.

[Figure 1 plot: probability densities of $p$ (Gauss) and $q$ (Laplace), and the witness function $\hat{f}^*$, plotted against $t$.]

Figure 1: Illustration of the function maximizing the mean discrepancy in the case where a Gaussian is being compared with a Laplace distribution. Both distributions have zero mean and unit variance. The function $\hat{f}^*$ that witnesses the MMD has been scaled for plotting purposes, and was computed empirically on the basis of $2 \times 10^4$ samples, using a Gaussian kernel with $\sigma = 0.5$.

3. Background Material

We now present three background results. First, we introduce the terminology used in statistical hypothesis testing. Second, we demonstrate via an example that even for tests which have asymptotically no error, we cannot guarantee performance at any fixed sample size without making assumptions about the distributions. Third, we review some alternative statistics used in comparing distributions, and the associated two-sample tests (see also Section 7 for an overview of additional integral probability metrics).

3.1 Statistical Hypothesis Testing

Having described a metric on probability distributions (the MMD) based on distances between their Hilbert space embeddings, and empirical estimates (biased and unbiased) of this metric, we address the problem of determining whether the empirical MMD shows a statistically significant difference between distributions. To this end, we briefly describe the framework of statistical hypothesis testing as it applies in the present context, following Casella and Berger (2002, Chapter 8). Given i.i.d. samples $X \sim p$ of size $m$ and $Y \sim q$ of size $n$, the statistical test, $T(X, Y) : \mathcal{X}^m \times \mathcal{X}^n \to \{0, 1\}$ is used to distinguish between the null hypothesis $H_0 : p = q$ and the alternative hypothesis $H_A : p \neq q$. This is achieved by comparing the test statistic⁵ $\mathrm{MMD}[\mathcal{F}, X, Y]$ with a particular threshold: if the threshold is exceeded, then the test rejects the null hypothesis (bearing in mind that a zero population MMD indicates $p = q$). The acceptance region of the test is thus defined as the set of real numbers below the threshold.

Since the test is based on finite samples, it is possible that an incorrect answer will be returned. A Type I error is made when $p = q$ is rejected based on the observed samples, despite the null hypothesis having generated the data. Conversely, a Type II error occurs when $p = q$ is accepted despite the underlying distributions being different. The level $\alpha$ of a test is an upper bound on the probability of a Type I error: this is a design parameter of the test which must be set in advance, and is used to determine the threshold to which we compare the test statistic (finding the test threshold for a given $\alpha$ is the topic of Sections 4 and 5). The power of a test against a particular member of the alternative class $H_A$ (i.e., a specific $(p, q)$ such that $p \neq q$) is the probability of correctly rejecting $p = q$ in this instance, that is, one minus the Type II error probability. A consistent test achieves a level $\alpha$, and a Type II error of zero, in the large sample limit. We will see that the tests proposed in this paper are consistent.

3.2 A Negative Result

Even if a test is consistent, it is not possible to distinguish distributions with high probability at a given, fixed sample size (i.e., to provide guarantees on the Type II error), without prior assumptions as to the nature of the difference between $p$ and $q$. This is true regardless of the two-sample test used. There are several ways to illustrate this, which each give insight into the kinds of differences that might be undetectable for a given number of samples. The following example⁶ is one such illustration.

Example 1 Assume we have a distribution $p$ from which we have drawn $m$ i.i.d. observations. We construct a distribution $q$ by drawing $m^2$ i.i.d. observations from $p$, and defining a discrete distribution over these $m^2$ instances with probability $m^{-2}$ each. It is easy to check that if we now draw $m$ observations from $q$, there is at least a $\binom{m^2}{m} \frac{m!}{m^{2m}} > 1 - e^{-1} > 0.63$ probability that we thereby obtain an $m$ sample from $p$. Hence no test will be able to distinguish samples from $p$ and $q$ in this case. We could make the probability of detection arbitrarily small by increasing the size of the sample from which we construct $q$.

3.3 Previous Work

We next give a brief overview of some earlier approaches to the two sample problem for multivariate data. Since our later experimental comparison is with respect to certain of these methods, we give abbreviated algorithm names in italics where appropriate: these should be used as a key to the tables in Section 8.

3. According to Dudley (2002, p. 26) a metric $d(x, y)$ satisfies the following four properties: symmetry, triangle inequality, $d(x, x) = 0$, and $d(x, y) = 0 \implies x = y$. A pseudo-metric only satisfies the first three properties.
4. Note that the proof of Cortes et al. (2008) requires an application of the dominated convergence theorem, rather than using the Riesz representation theorem to show the existence of the mean embeddings $\mu_p$ and $\mu_q$ as we did in Lemma 3.
5. This may be biased or unbiased.
6. This is a variation of a construction for independence tests, which was suggested in a private communication by John Langford.
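The test procedure of Section 3.1 (compute a statistic, compare it with a threshold) can be sketched in a few lines for scalar data. This is a minimal illustration rather than the authors' Matlab implementation: the Gaussian kernel, its bandwidth `sigma`, and the names `mmd_biased`, `mmd2_unbiased` and `two_sample_test` are assumptions, and the threshold is left as an input because its derivation is the subject of later sections. The two statistics follow Equations (5) and (3) respectively, at the quadratic cost noted in Section 2.2.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Assumed kernel: k(x, y) = exp(-(x - y)^2 / (2 sigma^2)) for scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def mmd_biased(X, Y, kernel=gaussian_kernel):
    """Biased estimate MMD_b of Equation (5), built from V-statistics."""
    m, n = len(X), len(Y)
    kxx = sum(kernel(a, b) for a in X for b in X) / (m * m)
    kyy = sum(kernel(a, b) for a in Y for b in Y) / (n * n)
    kxy = sum(kernel(a, b) for a in X for b in Y) / (m * n)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(kxx - 2.0 * kxy + kyy, 0.0))

def mmd2_unbiased(X, Y, kernel=gaussian_kernel):
    """Unbiased estimate MMD_u^2 of Equation (3); it may be negative."""
    m, n = len(X), len(Y)
    kxx = sum(kernel(X[i], X[j]) for i in range(m) for j in range(m) if i != j)
    kyy = sum(kernel(Y[i], Y[j]) for i in range(n) for j in range(n) if i != j)
    kxy = sum(kernel(a, b) for a in X for b in Y)
    return kxx / (m * (m - 1)) + kyy / (n * (n - 1)) - 2.0 * kxy / (m * n)

def two_sample_test(X, Y, threshold, statistic=mmd_biased):
    """Return 1 (reject H0: p = q) if the statistic exceeds the threshold, else 0."""
    return 1 if statistic(X, Y) > threshold else 0
```

Calling `mmd2_unbiased` on two identical small samples gives a negative value, which matches the remark after Lemma 6 that the unbiased estimate is not constrained to be nonnegative.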

3.3.1 $L_2$ DISTANCE BETWEEN PARZEN WINDOW ESTIMATES

The prior work closest to the current approach is the Parzen window-based statistic of Anderson et al. (1994). We begin with a short overview of the Parzen window estimate and its properties (Silverman, 1986), before proceeding to a comparison with the RKHS approach. We assume a distribution $p$ on $\mathbb{R}^d$, which has an associated density function $f_p$. The Parzen window estimate of this density from an i.i.d. sample $X$ of size $m$ is

$$\hat{f}_p(x) = \frac{1}{m} \sum_{i=1}^{m} \kappa(x_i - x), \quad \text{where } \kappa \text{ satisfies } \int_{\mathcal{X}} \kappa(x)\,dx = 1 \text{ and } \kappa(x) \ge 0.$$

We may rescale $\kappa$ according to $\frac{1}{h^d} \kappa\left(\frac{x}{h}\right)$ for a bandwidth parameter $h$. To simplify the discussion, we use a single bandwidth $h_{m+n}$ for both $\hat{f}_p$ and $\hat{f}_q$. Assuming $m/n$ is bounded away from zero and infinity, consistency of the Parzen window estimates for $f_p$ and $f_q$ requires

$$\lim_{m,n \to \infty} h_{m+n}^d = 0 \quad \text{and} \quad \lim_{m,n \to \infty} (m+n)\, h_{m+n}^d = \infty. \tag{6}$$

We now show the $L_2$ distance between Parzen windows density estimates is a special case of the biased MMD in Equation (5). Denote by $D_r(p, q) := \|f_p - f_q\|_r$ the $L_r$ distance between the densities $f_p$ and $f_q$ corresponding to the distributions $p$ and $q$, respectively. For $r = 1$ the distance $D_r(p, q)$ is known as the Lévy distance (Feller, 1971), and for $r = 2$ we encounter a distance measure derived from the Renyi entropy (Gokcay and Principe, 2002).

Assume that $\hat{f}_p$ and $\hat{f}_q$ are given as kernel density estimates with kernel $\kappa(x - x')$, that is, $\hat{f}_p(x) = \frac{1}{m} \sum_{i=1}^{m} \kappa(x_i - x)$ and $\hat{f}_q(y)$ is defined by analogy. In this case

$$D_2(\hat{f}_p, \hat{f}_q)^2 = \int \left[ \frac{1}{m} \sum_{i=1}^{m} \kappa(x_i - z) - \frac{1}{n} \sum_{i=1}^{n} \kappa(y_i - z) \right]^2 dz = \frac{1}{m^2} \sum_{i,j=1}^{m} k(x_i - x_j) + \frac{1}{n^2} \sum_{i,j=1}^{n} k(y_i - y_j) - \frac{2}{mn} \sum_{i,j=1}^{m,n} k(x_i - y_j),$$

where $k(x - y) = \int \kappa(x - z)\, \kappa(y - z)\, dz$. By its definition $k(x - y)$ is an RKHS kernel, as it is an inner product between $\kappa(x - z)$ and $\kappa(y - z)$ on the domain $\mathcal{X}$.

We now describe the asymptotic performance of a two-sample test using the statistic $D_2(\hat{f}_p, \hat{f}_q)^2$. We consider the power of the test under local departures from the null hypothesis. Anderson et al. (1994) define these to take the form

$$f_q = f_p + \delta g, \tag{7}$$

where $\delta \in \mathbb{R}$, and $g$ is a fixed, bounded, integrable function chosen to ensure that $f_q$ is a valid density for sufficiently small $|\delta|$. Anderson et al. consider two cases: the kernel bandwidth converging to zero with increasing sample size, ensuring consistency of the Parzen window estimates of $f_p$ and $f_q$; and the case of a fixed bandwidth. In the former case, the minimum distance with which the test can discriminate $f_p$ from $f_q$ is⁷ $\delta = (m+n)^{-1/2} h_{m+n}^{-d/2}$. In the latter case, this minimum distance is $\delta = (m+n)^{-1/2}$, under the assumption that the Fourier transform of the kernel $\kappa$ does not vanish on an interval (Anderson et al., 1994, Section 2.4), which implies the kernel $k$ is characteristic (Sriperumbudur et al., 2010b). The power of the $L_2$ test against local alternatives is greater when the kernel is held fixed, since for any rate of decrease of $h_{m+n}$ with increasing sample size, $\delta$ will decrease more slowly than for a fixed kernel.

An RKHS-based approach generalizes the $L_2$ statistic in a number of important respects. First, we may employ a much larger class of characteristic kernels that cannot be written as inner products between Parzen windows: several examples are given by Steinwart (2001, Section 3) and Micchelli et al. (2006, Section 3) (these kernels are universal, hence characteristic). We may further generalize to kernels on structured objects such as strings and graphs (Schölkopf et al., 2004), as done in our experiments (Section 8). Second, even when the kernel may be written as an inner product of Parzen windows on $\mathbb{R}^d$, the $D_2^2$ statistic with fixed bandwidth no longer converges to an $L_2$ distance between probability density functions, hence it is more natural to define the statistic as an integral probability metric for a particular RKHS, as in Definition 2. Indeed, in our experiments, we obtain good performance in experimental settings where the dimensionality greatly exceeds the sample size, and density estimates would perform very poorly⁸ (for instance the Gaussian toy example in Figure 5B, for which performance actually improves when the dimensionality increases; and the microarray data sets in Table 1). This suggests it is not necessary to solve the more difficult problem of density estimation in high dimensions to do two-sample testing.

Finally, the kernel approach leads us to establish consistency against a larger class of local alternatives to the null hypothesis than that considered by Anderson et al. In Theorem 13, we prove consistency against a class of alternatives encoded in terms of the mean embeddings of $p$ and $q$, which applies to any domain on which RKHS kernels may be defined, and not only densities on $\mathbb{R}^d$. This more general approach also has interesting consequences for distributions on $\mathbb{R}^d$: for instance, a local departure from $H_0$ occurs when $p$ and $q$ differ at increasing frequencies in their respective characteristic functions. This class of local alternatives cannot be expressed in the form $\delta g$ for fixed $g$, as in (7). We discuss this issue further in Section 5.

3.3.2 MMD FOR MULTINOMIALS

Assume a finite domain $\mathcal{X} := \{1, \ldots, d\}$, and define the random variables $x$ and $y$ on $\mathcal{X}$ such that $p_i := P(x = i)$ and $q_j := P(y = j)$. We embed $x$ into an RKHS $\mathcal{H}$ via the feature mapping $\phi(x) := e_x$, where $e_s$ is the unit vector in $\mathbb{R}^d$ taking value 1 in dimension $s$, and zero in the remaining entries. The kernel is the usual inner product on $\mathbb{R}^d$. In this case,

$$\mathrm{MMD}^2[\mathcal{F}, p, q] = \|p - q\|_{\mathbb{R}^d}^2 = \sum_{i=1}^{d} (p_i - q_i)^2. \tag{8}$$

Harchaoui et al. (2008, Section 1, long version) note that this $L_2$ statistic may not be the best choice for finite domains, citing a result of Lehmann and Romano (2005, Theorem 14.3.2) that Pearson's Chi-squared statistic is optimal for the problem of goodness of fit testing for multinomials.⁹ It would be of interest to establish whether an analogous result holds for two-sample testing in a wider class of RKHS feature spaces.

3.3.3 FURTHER MULTIVARIATE TWO-SAMPLE TESTS

Biau and Gyorfi (2005) (Biau) use as their test statistic the $L_1$ distance between discretized estimates of the probabilities, where the partitioning is refined as the sample size increases. This space partitioning approach becomes difficult or impossible for high dimensional problems, since there are too few points per bin. For this reason, we use this test only for low-dimensional problems in our experiments.

A generalisation of the Wald-Wolfowitz runs test to the multivariate domain was proposed and analysed by Friedman and Rafsky (1979) and Henze and Penrose (1999) (FR Wolf), and involves counting the number of edges in the minimum spanning tree over the aggregated data that connect points in $X$ to points in $Y$. The resulting test relies on the asymptotic normality of the test statistic, and is not distribution-free under the null hypothesis for finite samples (the test threshold depends on $p$, as with our asymptotic test in Section 5; by contrast, our tests in Section 4 are distribution-free). The computational cost of this method using Kruskal's algorithm is $O((m+n)^2 \log(m+n))$, although more modern methods improve on the $\log(m+n)$ term: see Chazelle (2000) for details. Friedman and Rafsky (1979) claim that calculating the matrix of distances, which costs $O((m+n)^2)$, dominates their computing time; we return to this point in our experiments (Section 8). Two possible generalisations of the Kolmogorov-Smirnov test to the multivariate case were studied by Bickel (1969) and Friedman and Rafsky (1979). The approach of Friedman and Rafsky (FR Smirnov) in this case again requires a minimal spanning tree, and has a similar cost to their multivariate runs test.

A more recent multivariate test was introduced by Rosenbaum (2005). This entails computing the minimum distance non-bipartite matching over the aggregate data, and using the number of pairs containing a sample from both $X$ and $Y$ as a test statistic. The resulting statistic is distribution-free under the null hypothesis at finite sample sizes, in which respect it is superior to the Friedman-Rafsky test; on the other hand, it costs $O((m+n)^3)$ to compute. Another distribution-free test (Hall) was proposed by Hall and Tajvidi (2002): for each point from $p$, it requires computing the closest points in the aggregated data, and counting how many of these are from $q$ (the procedure is repeated for each point from $q$ with respect to points from $p$). As we shall see in our experimental comparisons, the test statistic is costly to compute; Hall and Tajvidi consider only tens of points in their experiments.

4. Tests Based on Uniform Convergence Bounds

In this section, we introduce two tests for the two-sample problem that have exact performance guarantees at finite sample sizes, based on uniform convergence bounds. The first, in Section 4.1, uses the McDiarmid (1989) bound on the biased MMD statistic, and the second, in Section 4.2, uses a Hoeffding (1963) bound for the unbiased statistic.

7. Formally, define $s_\alpha$ as a threshold for the statistic $D_2(\hat{f}_p, \hat{f}_q)^2$, chosen to ensure the test has level $\alpha$, and let $\delta = (m+n)^{-1/2} h_{m+n}^{-d/2} c$ for some fixed $c \neq 0$. When $m, n \to \infty$ such that $m/n$ is bounded away from 0 and $\infty$, and assuming conditions (6), the limit $\pi(c) := \lim_{(m+n) \to \infty} \Pr_{H_A}\left( D_2(\hat{f}_p, \hat{f}_q)^2 > s_\alpha \right)$ is well-defined, and satisfies $\alpha < \pi(c) < 1$ for $0 < |c| < \infty$, and $\pi(c) \to 1$ as $c \to \infty$.
8. The $L_2$ error of a kernel density estimate converges as $O(n^{-4/(4+d)})$ when the optimal bandwidth is used (Wasserman, 2006, Section 6.5).
9. A goodness of fit test determines whether a sample from $p$ is drawn from a known target multinomial $q$. Pearson's Chi-squared statistic weights each term in the sum (8) by its corresponding $q_i^{-1}$.
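To make the contrast drawn in Section 3.3.2 and footnote 9 concrete, the following sketch evaluates the multinomial MMD of Equation (8) next to the same sum with each term weighted by $q_i^{-1}$, as in Pearson's Chi-squared statistic. The probability vectors and function names are hypothetical, and the weighted sum is shown without the sample-size factor a full goodness of fit statistic would carry.

```python
def mmd2_multinomial(p, q):
    """Squared MMD of Equation (8): sum_i (p_i - q_i)^2."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def pearson_weighted(p, q):
    """The sum in Equation (8) with each term weighted by 1 / q_i."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

# Hypothetical multinomials on d = 3 categories; the last category is rare under q.
p = [0.5, 0.4, 0.1]
q = [0.5, 0.45, 0.05]

unweighted = mmd2_multinomial(p, q)
weighted = pearson_weighted(p, q)
```

In this example the two nonzero terms contribute equally to the unweighted sum, whereas the weighted sum is dominated by the rare category, which is one way to see why the two statistics can rank departures from $q$ differently.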

13 A KERNEL TWO-SAMPLE TEST 4. Boud o the Biased Statistic ad Test We establish two properties of the MMD, fro which we derive a hypothesis test. First, we show that regardless of whether or ot p=q, the epirical MMD coverges i probability at rate O+ ) ) to its populatio value. This shows the cosistecy of statistical tests based o the MMD. Secod, we give probabilistic bouds for large deviatios of the epirical MMD i the case p=q. These bouds lead directly to a threshold for our first hypothesis test. We begi by establishig the covergece of MMD b [F,X,Y] to MMD[F, p,q]. The followig theore is proved i A.. Theore 7 Let p, q, X,Y be defied as i Proble, ad assue 0 kx, y) K. The ) } Pr X,Y { MMD b [F,X,Y] MMD[F, p,q] > K/) +K/) + ε where Pr X,Y deotes the probability over the -saple X ad -saple Y. exp ε K+) Our ext goal is to refie this result i a way that allows us to defie a test threshold uder the ull hypothesis p = q. Uder this circustace, the costats i the expoet are slightly iproved. The followig theore is proved i Appedix A.3. Theore 8 Uder the coditios of Theore 7 where additioally p=q ad =, MMD b [F,X,Y] E x,x [kx,x) kx,x )] + ε K/) / + ε, } {{ } } {{ } B F,p) B F,p) both with probability at least exp ε 4K ). I this theore, we illustrate two possible bouds B F, p) ad B F, p) o the bias i the epirical estiate 5). The first iequality is iterestig iasuch as it provides a lik betwee the bias boud B F, p) ad kerel size for istace, if we were to use a Gaussia kerel with large σ, the kx,x) ad kx,x ) would likely be close, ad the bias sall). I the cotext of testig, however, we would eed to provide a additioal boud to show covergece of a epirical estiate of B F, p) to its populatio equivalet. Thus, i the followig test for p=q based o Theore 8, we use B F, p) to boud the bias. 0 Corollary 9 A hypothesis test of level α for the ull hypothesis p=q, that is, for MMD[F, p,q]=0, has the acceptace regio MMD b [F,X,Y]< K/ + ) logα. 
We emphasize that this test is distribution-free: the test threshold does not depend on the particular distribution that generated the sample. Theorem 7 guarantees the consistency of the test against fixed alternatives, and that the Type II error probability decreases to zero at rate $O(m^{-1/2})$, assuming $m = n$. To put this convergence rate in perspective, consider a test of whether two normal distributions have equal means, given they have unknown but equal variance (Casella and Berger, 2002, Exercise 8.41). In this case, the test statistic has a Student-t distribution with $n + m - 2$ degrees of freedom, and its Type II error probability converges at the same rate as our test.

It is worth noting that bounds may be obtained for the deviation between population mean embeddings $\mu_p$ and the empirical embeddings $\mu_X$ in a completely analogous fashion. The proof requires symmetrization by means of a ghost sample, that is, a second set of observations drawn from the same distribution. While not the focus of the present paper, such bounds can be used to perform inference based on moment matching (Altun and Smola, 2006; Dudík and Schapire, 2006; Dudík et al., 2004).

10. Note that we use a tighter bias bound than Gretton et al. (2007a).

4.2 Bound on the Unbiased Statistic and Test

The previous bounds are of interest since the proof strategy can be used for general function classes with well behaved Rademacher averages (see Sriperumbudur et al., 2010a). When $\mathcal{F}$ is the unit ball in an RKHS, however, we may very easily define a test via a convergence bound on the unbiased statistic $\mathrm{MMD}_u^2$ in Lemma 4. We base our test on the following theorem, which is a straightforward application of the large deviation bound on U-statistics of Hoeffding (1963, p. 25).

Theorem 10 Assume $0 \le k(x_i,x_j) \le K$, from which it follows $-2K \le h(z_i,z_j) \le 2K$. Then

$$\Pr_{X,Y}\left\{ \mathrm{MMD}_u^2(\mathcal{F},X,Y) - \mathrm{MMD}^2(\mathcal{F},p,q) > t \right\} \le \exp\left( \frac{-t^2 m_2}{8K^2} \right),$$

where $m_2 := \lfloor m/2 \rfloor$ (the same bound applies for deviations of $-t$ and below).

A consistent statistical test for $p = q$ using $\mathrm{MMD}_u^2$ is then obtained.

Corollary 11 A hypothesis test of level $\alpha$ for the null hypothesis $p = q$ has the acceptance region

$$\mathrm{MMD}_u^2 < (4K/\sqrt{m}) \sqrt{\log(\alpha^{-1})}.$$

This test is distribution-free. We now compare the thresholds of the above test with that in Corollary 9. We note first that the threshold for the biased statistic applies to an estimate of MMD, whereas that for the unbiased statistic is for an estimate of $\mathrm{MMD}^2$. Squaring the former threshold to make the two quantities comparable, the squared threshold in Corollary 9 decreases as $m^{-1}$, whereas the threshold in Corollary 11 decreases as $m^{-1/2}$. Thus for sufficiently large $m$ (see footnote 11), the McDiarmid-based threshold will be lower (and the associated test statistic is in any case biased upwards), and its Type II error will be better for a given Type I bound. This is confirmed in our Section 8 experiments.
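The comparison between the two thresholds can be checked numerically. The sketch below squares the Corollary 9 threshold, evaluates the Corollary 11 threshold, and searches for the smallest m at which the former falls below the latter. The function names and the search limit `m_max` are illustrative assumptions; the kernel bound K cancels in the comparison.

```python
import math

def squared_biased_threshold(m, alpha=0.05, K=1.0):
    """Square of the Corollary 9 threshold, comparable to an estimate of MMD^2."""
    return (2.0 * K / m) * (1.0 + math.sqrt(2.0 * math.log(1.0 / alpha))) ** 2

def unbiased_threshold(m, alpha=0.05, K=1.0):
    """Corollary 11 threshold for the unbiased statistic MMD^2_u."""
    return (4.0 * K / math.sqrt(m)) * math.sqrt(math.log(1.0 / alpha))

def crossover(alpha=0.05, K=1.0, m_max=10_000):
    """Smallest m for which the squared biased threshold is the lower of the two."""
    for m in range(1, m_max + 1):
        if squared_biased_threshold(m, alpha, K) < unbiased_threshold(m, alpha, K):
            return m
    return None

m_star = crossover(alpha=0.05)  # evaluates to 12 for alpha = 0.05
```

Because the first threshold scales as 1/m and the second as 1/sqrt(m), the ordering never reverses once the crossover point has been passed.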
Note, however, that the rate of convergence of the squared, biased MMD estimate to its population value remains at $1/\sqrt{m}$ (bearing in mind we take the square of a biased estimate, where the bias term decays as $1/\sqrt{m}$).

Finally, we note that the bounds we obtained in this section and the last are rather conservative for a number of reasons: first, they do not take the actual distributions into account. In fact, they are finite sample size, distribution-free bounds that hold even in the worst case scenario. The bounds could be tightened using localization, moments of the distribution, etc.: see, for example, Bousquet et al. (2005) and de la Peña and Giné (1999). Any such improvements could be plugged straight into Theorem 19. Second, in computing bounds rather than trying to characterize the distribution of $\mathrm{MMD}(\mathcal{F},X,Y)$ explicitly, we force our test to be conservative by design. In the following we aim for an exact characterization of the asymptotic distribution of $\mathrm{MMD}(\mathcal{F},X,Y)$ instead of a bound. While this will not satisfy the uniform convergence requirements, it leads to superior tests in practice.

11. In the case of $\alpha = 0.05$, this is $m \ge 12$.

5. Test Based on the Asymptotic Distribution of the Unbiased Statistic

We propose a third test, which is based on the asymptotic distribution of the unbiased estimate of $\mathrm{MMD}^2$ in Lemma 6. This test uses the asymptotic distribution of $\mathrm{MMD}_u^2$ under $\mathcal{H}_0$, which follows from results of Anderson et al. (1994, Appendix) and Serfling (1980, Section 5.5.2): see Appendix B.1 for the proof.

Theorem 12 Let $\tilde{k}(x_i,x_j)$ be the kernel between feature space mappings from which the mean embedding of $p$ has been subtracted,

$$\tilde{k}(x_i,x_j) := \left\langle \phi(x_i) - \mu_p, \phi(x_j) - \mu_p \right\rangle_{\mathcal{H}} = k(x_i,x_j) - \mathbf{E}_x k(x_i,x) - \mathbf{E}_x k(x,x_j) + \mathbf{E}_{x,x'} k(x,x'), \qquad (9)$$

where $x'$ is an independent copy of $x$ drawn from $p$. Assume $\tilde{k} \in L_2(\mathcal{X} \times \mathcal{X}, p \times p)$ (i.e., the centred kernel is square integrable, which is true for all $p$ when the kernel is bounded), and that for $t = m + n$, $\lim_{m,n \to \infty} m/t \to \rho_x$ and $\lim_{m,n \to \infty} n/t \to \rho_y := (1 - \rho_x)$ for fixed $0 < \rho_x < 1$. Then under $\mathcal{H}_0$, $\mathrm{MMD}_u^2$ converges in distribution according to

$$t\,\mathrm{MMD}_u^2[\mathcal{F},X,Y] \xrightarrow{D} \sum_{l=1}^{\infty} \lambda_l \left[ \left( \rho_x^{-1/2} a_l - \rho_y^{-1/2} b_l \right)^2 - (\rho_x \rho_y)^{-1} \right], \qquad (10)$$

where $a_l \sim \mathcal{N}(0,1)$ and $b_l \sim \mathcal{N}(0,1)$ are infinite sequences of independent Gaussian random variables, and the $\lambda_i$ are eigenvalues of

$$\int_{\mathcal{X}} \tilde{k}(x,x') \psi_i(x)\, dp(x) = \lambda_i \psi_i(x').$$

We illustrate the MMD density under both the null and alternative hypotheses by approximating it empirically for $p = q$ and $p \ne q$. Results are plotted in Figure 2.

Our goal is to determine whether the empirical test statistic $\mathrm{MMD}_u^2$ is so large as to be outside the $1 - \alpha$ quantile of the null distribution in (10), which gives a level $\alpha$ test. Consistency of this test against local departures from the null hypothesis is provided by the following theorem, proved in Appendix B.2.

Theorem 13 Define $\rho_x$, $\rho_y$, and $t$ as in Theorem 12, and write $\mu_q = \mu_p + g_t$, where $g_t \in \mathcal{H}$ is chosen such that $\mu_p + g_t$ remains a valid mean embedding, and $\|g_t\|_{\mathcal{H}}$ is made to approach zero as $t \to \infty$ to describe local departures from the null hypothesis. Then $\|g_t\|_{\mathcal{H}} = c\,t^{-1/2}$ is the minimum distance between $\mu_p$ and $\mu_q$ distinguishable by the test.

An example of a local departure from the null hypothesis is described earlier in the discussion of the $L_2$ distance between Parzen window estimates (Section 3.3.1).
The class of local alternatives considered in Theorem 13 is more general, however: for instance, Sriperumbudur et al. (2010b, Section 4) and Harchaoui et al. (2008, Section 5, long version) give examples of classes of perturbations $g_t$ with decreasing RKHS norm. These perturbations have the property that $p$ differs from $q$ at increasing frequencies, rather than simply with decreasing amplitude.

[Figure 2: two histogram panels, "Empirical MMD_u^2 density under H0" (left) and "Empirical MMD_u^2 density under the alternative" (right); horizontal axis MMD_u^2, vertical axis Prob. density.]

Figure 2: Left: Empirical distribution of the MMD under $\mathcal{H}_0$, with $p$ and $q$ both Gaussians with unit standard deviation, using 50 samples from each. Right: Empirical distribution of the MMD under $\mathcal{H}_A$, with $p$ a Laplace distribution with unit standard deviation, and $q$ a Laplace distribution with standard deviation $3\sqrt{2}$, using 100 samples from each. In both cases, the histograms were obtained by computing 2000 independent instances of the MMD.

One way to estimate the $1 - \alpha$ quantile of the null distribution is using the bootstrap on the aggregated data, following Arcones and Giné (1992). Alternatively, we may approximate the null distribution by fitting Pearson curves to its first four moments (Johnson et al., 1994, Section 18.8). Taking advantage of the degeneracy of the U-statistic, we obtain for $m = n$

$$\mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^2 \right) = \frac{2}{m(m-1)} \mathbf{E}_{z,z'}\left[ h^2(z,z') \right]$$

and

$$\mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^3 \right) = \frac{8(m-2)}{m^2 (m-1)^2} \mathbf{E}_{z,z'}\left[ h(z,z')\, \mathbf{E}_{z''}\left( h(z,z'') h(z',z'') \right) \right] + O(m^{-4})$$

(see Appendix B.3), where $h(z,z')$ is defined in Lemma 6, $z = (x,y) \sim p \times q$ where $x$ and $y$ are independent, and $z'$, $z''$ are independent copies of $z$. The fourth moment $\mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^4 \right)$ is not computed, since it is both very small, $O(m^{-4})$, and expensive to calculate, $O(m^4)$. Instead, we replace the kurtosis (see footnote 12) with a lower bound due to Wilkins (1944), $\mathrm{kurt}\left( \mathrm{MMD}_u^2 \right) \ge \left( \mathrm{skew}\left( \mathrm{MMD}_u^2 \right) \right)^2 + 1$. In Figure 3, we illustrate the Pearson curve fit to the null distribution: the fit is good in the upper quantiles of the distribution, where the test threshold is computed. Finally, we note that two alternative empirical estimates of the null distribution have more recently been proposed by Gretton et al. (2009): a consistent estimate, based on an empirical computation of the eigenvalues $\lambda_l$ in (10); and an alternative Gamma approximation to the null distribution, which has a smaller computational cost but is generally less accurate. Further detail and experimental comparisons are given by Gretton et al.

12. The kurtosis is defined in terms of the fourth and second moments as $\mathrm{kurt}\left( \mathrm{MMD}_u^2 \right) = \mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^4 \right) / \left[ \mathbf{E}\left( \left[ \mathrm{MMD}_u^2 \right]^2 \right) \right]^2$.
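One possible way to approximate the upper null quantile by resampling the aggregated data is sketched below. It is a permutation variant chosen for illustration, not necessarily the exact bootstrap procedure of Arcones and Giné. The sketch assumes m = n, a Gaussian kernel, and the U-statistic kernel h(z_i, z_j) = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i); all function names, the number of resamples, and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_gram(z, sigma=1.0):
    """Gaussian kernel Gram matrix on the aggregated sample z (rows are points)."""
    sq_norms = np.sum(z**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def off_diagonal_sum(a):
    """Sum of all entries of a square matrix except the diagonal."""
    return a.sum() - np.trace(a)

def mmd2_unbiased(gram, idx_x, idx_y):
    """Unbiased MMD^2 estimate for m = n: average of h(z_i, z_j) over i != j."""
    m = len(idx_x)
    kxx = gram[np.ix_(idx_x, idx_x)]
    kyy = gram[np.ix_(idx_y, idx_y)]
    kxy = gram[np.ix_(idx_x, idx_y)]
    total = off_diagonal_sum(kxx) + off_diagonal_sum(kyy) - 2.0 * off_diagonal_sum(kxy)
    return float(total / (m * (m - 1)))

def resampled_null_quantile(gram, m, alpha=0.05, n_resamples=500, seed=0):
    """(1 - alpha) quantile of the statistic under random relabelling of the pooled data."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_resamples)
    for b in range(n_resamples):
        perm = rng.permutation(2 * m)
        stats[b] = mmd2_unbiased(gram, perm[:m], perm[m:])
    return float(np.quantile(stats, 1.0 - alpha))

# Hypothetical example: two 1-D Gaussians whose means differ by 2.
rng = np.random.default_rng(1)
m = 100
x = rng.normal(0.0, 1.0, size=(m, 1))
y = rng.normal(2.0, 1.0, size=(m, 1))
gram = gaussian_gram(np.vstack([x, y]))
stat = mmd2_unbiased(gram, np.arange(m), np.arange(m, 2 * m))
threshold = resampled_null_quantile(gram, m)
reject_null = stat > threshold
```

Computing the Gram matrix once on the pooled data and re-indexing it for each relabelling keeps every resample at O(m^2) additions with no further kernel evaluations.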

[Figure 3: plot titled "CDF of the MMD and Pearson fit"; horizontal axis $t$, vertical axis $P(\mathrm{MMD}_u^2 < t)$; curves: Emp. CDF and Pearson.]

Figure 3: Illustration of the empirical CDF of the MMD and a Pearson curve fit. Both $p$ and $q$ were Gaussian with zero mean and unit variance, and 50 samples were drawn from each. The empirical CDF was computed on the basis of 1000 randomly generated MMD values. To ensure the quality of fit was determined only by the accuracy of the Pearson approximation, the moments used for the Pearson curves were also computed on the basis of these 1000 samples. The MMD used a Gaussian kernel with $\sigma = 0.5$.

6. A Linear Time Statistic and Test

The MMD-based tests are already more efficient than the $O(m^2 \log m)$ and $O(m^3)$ tests described earlier (assuming $m = n$ for conciseness). It is still desirable, however, to obtain $O(m)$ tests which do not sacrifice too much statistical power. Moreover, we would like to obtain tests which have $O(1)$ storage requirements for computing the test statistic, in order to apply the test to data streams. We now describe how to achieve this by computing the test statistic using a subsampling of the terms in the sum. The empirical estimate in this case is obtained by drawing pairs from $X$ and $Y$ respectively without replacement.

Lemma 14 Define $m_2 := \lfloor m/2 \rfloor$, assume $m = n$, and define $h(z_1,z_2)$ as in Lemma 6. The estimator

$$\mathrm{MMD}_l^2[\mathcal{F},X,Y] := \frac{1}{m_2} \sum_{i=1}^{m_2} h\left( (x_{2i-1},y_{2i-1}), (x_{2i},y_{2i}) \right)$$

can be computed in linear time, and is an unbiased estimate of $\mathrm{MMD}^2[\mathcal{F},p,q]$.

While it is expected that $\mathrm{MMD}_l^2$ has higher variance than $\mathrm{MMD}_u^2$ (as we will see explicitly later), it is computationally much more appealing. In particular, the statistic can be used in stream computations with need for only $O(1)$ memory, whereas $\mathrm{MMD}_u^2$ requires $O(m)$ storage and $O(m^2)$ time to compute the kernel $h$ on all interacting pairs.

Since $\mathrm{MMD}_l^2$ is just the average over a set of random variables, Hoeffding's bound and the central limit theorem readily allow us to provide both uniform convergence and asymptotic statements with little effort. The first follows directly from Hoeffding (1963, Theorem 2).
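Before the bounds are stated, a minimal sketch of the linear time statistic of Lemma 14 may help. It consumes a stream of (x_i, y_i) pairs two at a time and keeps only a running sum, so memory use is constant. The Gaussian kernel, the form of h (k(x_1, x_2) + k(y_1, y_2) - k(x_1, y_2) - k(x_2, y_1)), and all names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel between two points (scalars or 1-D vectors)."""
    d = np.atleast_1d(np.asarray(a, dtype=float)) - np.atleast_1d(np.asarray(b, dtype=float))
    return float(np.exp(-np.dot(d, d) / (2.0 * sigma**2)))

def h(x1, y1, x2, y2, sigma=1.0):
    """Assumed U-statistic kernel evaluated on z1 = (x1, y1) and z2 = (x2, y2)."""
    return (gaussian_kernel(x1, x2, sigma) + gaussian_kernel(y1, y2, sigma)
            - gaussian_kernel(x1, y2, sigma) - gaussian_kernel(x2, y1, sigma))

def mmd2_linear(pairs, sigma=1.0):
    """Average of h over consecutive, non-overlapping pairs of (x_i, y_i) items.

    `pairs` may be any iterable (including a generator); a trailing unpaired
    item is ignored, matching m_2 = floor(m / 2).
    """
    total, count = 0.0, 0
    it = iter(pairs)
    for (x1, y1), (x2, y2) in zip(it, it):  # zip(it, it) yields items two at a time
        total += h(x1, y1, x2, y2, sigma)
        count += 1
    if count == 0:
        raise ValueError("need at least two (x, y) pairs")
    return total / count

# Hypothetical example: well separated 1-D Gaussians.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=2000)
y = rng.normal(3.0, 1.0, size=2000)
estimate = mmd2_linear(zip(x, y))
same_stream = mmd2_linear(zip(x, x))  # identical streams give h = 0 for every pair
```

Each observation enters exactly one evaluation of h, which is why the summands are independent and Hoeffding's bound and the central limit theorem apply directly.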

Theorem 15 Assume $0 \le k(x_i,x_j) \le K$. Then

$$\Pr_{X,Y}\left\{ \mathrm{MMD}_l^2(\mathcal{F},X,Y) - \mathrm{MMD}^2(\mathcal{F},p,q) > t \right\} \le \exp\left( \frac{-t^2 m_2}{8K^2} \right),$$

where $m_2 := \lfloor m/2 \rfloor$ (the same bound applies for deviations of $-t$ and below).

Note that the bound of Theorem 10 is identical to that of Theorem 15, which shows the former is rather loose. Next we invoke the central limit theorem (e.g., Serfling, 1980, Section 1.9).

Corollary 16 Assume $0 < \mathbf{E}\left( h^2 \right) < \infty$. Then $\mathrm{MMD}_l^2$ converges in distribution to a Gaussian according to

$$m^{1/2} \left( \mathrm{MMD}_l^2 - \mathrm{MMD}^2[\mathcal{F},p,q] \right) \xrightarrow{D} \mathcal{N}\left( 0, \sigma_l^2 \right),$$

where $\sigma_l^2 = 2\left[ \mathbf{E}_{z,z'} h^2(z,z') - \left[ \mathbf{E}_{z,z'} h(z,z') \right]^2 \right]$, and where we use the shorthand $\mathbf{E}_{z,z'} := \mathbf{E}_{z,z' \sim p \times q}$.

The factor of 2 arises since we are averaging over only $\lfloor m/2 \rfloor$ observations. It is instructive to compare this asymptotic distribution with that of the quadratic time statistic $\mathrm{MMD}_u^2$ under $\mathcal{H}_A$, when $m = n$. In this case, $\mathrm{MMD}_u^2$ converges in distribution to a Gaussian according to

$$m^{1/2} \left( \mathrm{MMD}_u^2 - \mathrm{MMD}^2[\mathcal{F},p,q] \right) \xrightarrow{D} \mathcal{N}\left( 0, \sigma_u^2 \right),$$

where $\sigma_u^2 = 4\left( \mathbf{E}_z\left[ \left( \mathbf{E}_{z'} h(z,z') \right)^2 \right] - \left[ \mathbf{E}_{z,z'} h(z,z') \right]^2 \right)$ (Serfling, 1980, Section 5.5). Thus for $\mathrm{MMD}_u^2$, the asymptotic variance is (up to scaling) the variance of $\mathbf{E}_{z'}[h(z,z')]$, whereas for $\mathrm{MMD}_l^2$ it is $\mathrm{Var}_{z,z'}[h(z,z')]$.

We end by noting another potential approach to reducing the cost of computing an empirical MMD estimate, by using a low rank approximation to the Gram matrix (Fine and Scheinberg, 2001; Williams and Seeger, 2001; Smola and Schölkopf, 2000). An incremental computation of the MMD based on such a low rank approximation would require $O(md)$ storage and $O(md)$ computation (where $d$ is the rank of the approximate Gram matrix which is used to factorize both matrices) rather than $O(m)$ storage and $O(m^2)$ operations. That said, it remains to be determined what effect this approximation would have on the distribution of the test statistic under $\mathcal{H}_0$, and hence on the test threshold.

7. Related Metrics and Learning Problems

The present section discusses a number of topics related to the maximum mean discrepancy, including metrics on probability distributions using non-RKHS function classes (Sections 7.1 and 7.2), the relation with set kernels and kernels on probability measures (Section 7.3), an extension to kernel measures of independence (Section 7.4), a two-sample statistic using a distribution over witness functions (Section 7.5), and a connection to outlier detection (Section 7.6).

7.1 The MMD in Other Function Classes

The definition of the maximum mean discrepancy is by no means limited to RKHS. In fact, any function class $\mathcal{F}$ that comes with uniform convergence guarantees and is sufficiently rich will enjoy the above properties. Below, we consider the case where the scaled functions in $\mathcal{F}$ are dense in $C(\mathcal{X})$ (which is useful for instance when the functions in $\mathcal{F}$ are norm constrained).

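Definition 17 Let $\mathcal{F}$ be a subset of some vector space. The star $S[\mathcal{F}]$ of a set $\mathcal{F}$ is

$$S[\mathcal{F}] := \left\{ \alpha f \mid f \in \mathcal{F} \text{ and } \alpha \in [0,\infty) \right\}.$$

Theorem 18 Denote by $\mathcal{F}$ the subset of some vector space of functions from $\mathcal{X}$ to $\mathbb{R}$ for which $S[\mathcal{F}] \cap C(\mathcal{X})$ is dense in $C(\mathcal{X})$ with respect to the $L_\infty(\mathcal{X})$ norm. Then $\mathrm{MMD}[\mathcal{F},p,q] = 0$ if and only if $p = q$, and $\mathrm{MMD}[\mathcal{F},p,q]$ is a metric on the space of probability distributions. Whenever the star of $\mathcal{F}$ is not dense, the MMD defines a pseudo-metric space.

Proof It is clear that $p = q$ implies $\mathrm{MMD}[\mathcal{F},p,q] = 0$. The proof of the converse is very similar to that of Theorem 5. Define $\mathcal{H} := S(\mathcal{F}) \cap C(\mathcal{X})$. Since by assumption $\mathcal{H}$ is dense in $C(\mathcal{X})$, for every $f \in C(\mathcal{X})$ and $\varepsilon > 0$ there exists an $h^* \in \mathcal{H}$ satisfying $\|h^* - f\|_\infty < \varepsilon$. Write $h^* := \alpha^* g^*$, where $g^* \in \mathcal{F}$. By assumption, $\mathbf{E}_x g^* - \mathbf{E}_y g^* = 0$. Thus we have the bound

$$\left| \mathbf{E}_x f(x) - \mathbf{E}_y f(y) \right| \le \left| \mathbf{E}_x f(x) - \mathbf{E}_x h^*(x) \right| + \alpha^* \left| \mathbf{E}_x g^*(x) - \mathbf{E}_y g^*(y) \right| + \left| \mathbf{E}_y h^*(y) - \mathbf{E}_y f(y) \right| \le 2\varepsilon$$

for all $f \in C(\mathcal{X})$ and $\varepsilon > 0$, which implies $p = q$ by Lemma 1.

To show $\mathrm{MMD}[\mathcal{F},p,q]$ is a metric, it remains to prove the triangle inequality. We have

$$\sup_{f \in \mathcal{F}} \left| \mathbf{E}_p f - \mathbf{E}_q f \right| + \sup_{g \in \mathcal{F}} \left| \mathbf{E}_q g - \mathbf{E}_r g \right| \ge \sup_{f \in \mathcal{F}} \left[ \left| \mathbf{E}_p f - \mathbf{E}_q f \right| + \left| \mathbf{E}_q f - \mathbf{E}_r f \right| \right] \ge \sup_{f \in \mathcal{F}} \left| \mathbf{E}_p f - \mathbf{E}_r f \right|.$$

Note that any uniform convergence statements in terms of $\mathcal{F}$ allow us immediately to characterize an estimator of $\mathrm{MMD}(\mathcal{F},p,q)$ explicitly. The following result shows how (this reasoning is also the basis for the proofs in Section 4, although here we do not restrict ourselves to an RKHS).

Theorem 19 Let $\delta \in (0,1)$ be a confidence level and assume that for some $\varepsilon(\delta,m,\mathcal{F})$ the following holds for samples $\{x_1,\ldots,x_m\}$ drawn from $p$:

$$\Pr_X\left\{ \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \frac{1}{m} \sum_{i=1}^{m} f(x_i) \right| > \varepsilon(\delta,m,\mathcal{F}) \right\} \le \delta.$$

In this case we have that

$$\Pr_{X,Y}\left\{ \left| \mathrm{MMD}[\mathcal{F},p,q] - \mathrm{MMD}_b[\mathcal{F},X,Y] \right| > 2\varepsilon(\delta/2,m,\mathcal{F}) \right\} \le \delta,$$

where $\mathrm{MMD}_b[\mathcal{F},X,Y]$ is taken from Definition 2.

Proof The proof works simply by using convexity and suprema as follows:

$$\left| \mathrm{MMD}[\mathcal{F},p,q] - \mathrm{MMD}_b[\mathcal{F},X,Y] \right| = \left| \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \mathbf{E}_y[f] \right| - \sup_{f \in \mathcal{F}} \left| \frac{1}{m} \sum_{i=1}^{m} f(x_i) - \frac{1}{n} \sum_{i=1}^{n} f(y_i) \right| \right|$$
$$\le \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \mathbf{E}_y[f] - \frac{1}{m} \sum_{i=1}^{m} f(x_i) + \frac{1}{n} \sum_{i=1}^{n} f(y_i) \right|$$
$$\le \sup_{f \in \mathcal{F}} \left| \mathbf{E}_x[f] - \frac{1}{m} \sum_{i=1}^{m} f(x_i) \right| + \sup_{f \in \mathcal{F}} \left| \mathbf{E}_y[f] - \frac{1}{n} \sum_{i=1}^{n} f(y_i) \right|.$$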

Bounding each of the two terms via a uniform convergence bound proves the claim.

This shows that $\mathrm{MMD}_b[\mathcal{F},X,Y]$ can be used to estimate $\mathrm{MMD}[\mathcal{F},p,q]$, and that the quantity is asymptotically unbiased.

Remark 20 (Reduction to Binary Classification) As noted by Friedman (2003), any classifier which maps a set of observations $\{z_i, l_i\}$ with $z_i \in \mathcal{X}$ on some domain $\mathcal{X}$ and labels $l_i \in \{\pm 1\}$, for which uniform convergence bounds exist on the convergence of the empirical loss to the expected loss, can be used to obtain a similarity measure on distributions: simply assign $l_i = 1$ if $z_i \in X$ and $l_i = -1$ for $z_i \in Y$ and find a classifier which is able to separate the two sets. In this case maximization of $\mathbf{E}_x[f] - \mathbf{E}_y[f]$ is achieved by ensuring that as many $z \sim p(z)$ as possible correspond to $f(z) = 1$, whereas for as many $z \sim q(z)$ as possible we have $f(z) = -1$. Consequently neural networks, decision trees, boosted classifiers and other objects for which uniform convergence bounds can be obtained can be used for the purpose of distribution comparison. Metrics and divergences on distributions can also be defined explicitly starting from classifiers. For instance, Sriperumbudur et al. (2009) show the MMD minimizes the expected risk of a classifier with linear loss on the samples $X$ and $Y$, and Ben-David et al. (2007, Section 4) use the error of a hyperplane classifier to approximate the A-distance between distributions (Kifer et al., 2004). Reid and Williamson (2011) provide further discussion and examples.

7.2 Examples of Non-RKHS Function Classes

Other function spaces $\mathcal{F}$ inspired by the statistics literature can also be considered in defining the MMD. Indeed, Lemma 1 defines an MMD with $\mathcal{F}$ the space of bounded continuous real-valued functions, which is a Banach space with the supremum norm (Dudley, 2002, p. 158). We now describe two further metrics on the space of probability distributions, namely the Kolmogorov-Smirnov and Earth Mover's distances, and their associated function classes.

7.2.1 KOLMOGOROV-SMIRNOV STATISTIC

The Kolmogorov-Smirnov (K-S) test is probably one of the most famous two-sample tests in statistics.
It works for random variables $x \in \mathbb{R}$ (or any other set for which we can establish a total order). Denote by $F_p(x)$ the cumulative distribution function of $p$ and let $F_X(x)$ be its empirical counterpart,

$$F_p(z) := \Pr\{ x \le z \text{ for } x \sim p \} \quad \text{and} \quad F_X(z) := \frac{1}{|X|} \sum_{i=1}^{m} 1_{x_i \le z}.$$

It is clear that $F_p$ captures the properties of $p$. The Kolmogorov metric is simply the $L_\infty$ distance $\|F_X - F_Y\|_\infty$ for two sets of observations $X$ and $Y$. Smirnov (1939) showed that for $p = q$ the limiting distribution of the empirical cumulative distribution functions satisfies

$$\lim_{m,n \to \infty} \Pr_{X,Y}\left\{ \left[ \frac{mn}{m+n} \right]^{1/2} \|F_X - F_Y\|_\infty > x \right\} = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 x^2} \quad \text{for } x \ge 0,$$

which is distribution independent. This allows for an efficient characterization of the distribution under the null hypothesis $\mathcal{H}_0$. Efficient numerical approximations to this series can be found in numerical analysis handbooks (Press et al., 1994). The distribution under the alternative $p \ne q$, however, is unknown.
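The two ingredients of the K-S test described above, the sup-distance between empirical CDFs and Smirnov's limiting tail series, can be sketched as follows. The function names and the truncation length `terms` are illustrative assumptions; the truncated series is only reliable for x that is not too close to zero, and returning 1.0 for x <= 0 is a convention adopted here, since the series does not converge at x = 0.

```python
import numpy as np

def ks_statistic(x, y):
    """Sup-distance between the empirical CDFs of two 1-D samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.concatenate([x, y])  # the supremum is attained at a sample point
    fx = np.searchsorted(x, grid, side="right") / x.size  # fraction of x_i <= z
    fy = np.searchsorted(y, grid, side="right") / y.size  # fraction of y_i <= z
    return float(np.max(np.abs(fx - fy)))

def smirnov_tail(x, terms=100):
    """Truncated series 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 x^2)."""
    if x <= 0:
        return 1.0  # convention chosen for this sketch
    j = np.arange(1, terms + 1)
    return float(2.0 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * j**2 * x**2)))

def ks_asymptotic_p_value(x, y):
    """Approximate p-value using the scaling [mn / (m + n)]^(1/2)."""
    m, n = len(x), len(y)
    scaled = np.sqrt(m * n / (m + n)) * ks_statistic(x, y)
    return smirnov_tail(scaled)

# Hypothetical example with clearly different samples.
rng = np.random.default_rng(0)
p_value = ks_asymptotic_p_value(rng.normal(0.0, 1.0, 300), rng.normal(1.0, 1.0, 300))
```

Because the limiting distribution does not depend on p, the same series serves every continuous null distribution, which is the sense in which the test is distribution independent.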

A Comparison of Hypothesis Testing Methods for the Mean of a Log-Normal Distribution

A Comparison of Hypothesis Testing Methods for the Mean of a Log-Normal Distribution World Applied Scieces Joural (6): 845-849 ISS 88-495 IDOSI Publicatios A Copariso of Hypothesis Testig ethods for the ea of a og-oral Distributio 3 F. egahdari K. Abdollahezhad ad A.A. Jafari Islaic Azad

More information

3. Covariance and Correlation

3. Covariance and Correlation Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics

More information

The Binomial Multi- Section Transformer

The Binomial Multi- Section Transformer 4/15/21 The Bioial Multisectio Matchig Trasforer.doc 1/17 The Bioial Multi- Sectio Trasforer Recall that a ulti-sectio atchig etwork ca be described usig the theory of sall reflectios as: where: Γ ( ω

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

arxiv:0903.5136v2 [math.pr] 13 Oct 2009

arxiv:0903.5136v2 [math.pr] 13 Oct 2009 First passage percolatio o rado graphs with fiite ea degrees Shakar Bhaidi Reco va der Hofstad Gerard Hooghiestra October 3, 2009 arxiv:0903.536v2 [ath.pr 3 Oct 2009 Abstract We study first passage percolatio

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

CHAPTER 4: NET PRESENT VALUE

CHAPTER 4: NET PRESENT VALUE EMBA 807 Corporate Fiace Dr. Rodey Boehe CHAPTER 4: NET PRESENT VALUE (Assiged probles are, 2, 7, 8,, 6, 23, 25, 28, 29, 3, 33, 36, 4, 42, 46, 50, ad 52) The title of this chapter ay be Net Preset Value,

More information

Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study. Normal and t Distributions. Density Plot. Normal Distributions Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

Class Meeting # 16: The Fourier Transform on R n

Class Meeting # 16: The Fourier Transform on R n MATH 18.152 COUSE NOTES - CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

Sequences II. Chapter 3. 3.1 Convergent Sequences

Sequences II. Chapter 3. 3.1 Convergent Sequences Chapter 3 Sequeces II 3. Coverget Sequeces Plot a graph of the sequece a ) = 2, 3 2, 4 3, 5 + 4,...,,... To what limit do you thik this sequece teds? What ca you say about the sequece a )? For ǫ = 0.,

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

7. Sample Covariance and Correlation

7. Sample Covariance and Correlation 1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

CDAS: A Crowdsourcing Data Analytics System

CDAS: A Crowdsourcing Data Analytics System CDAS: A Crowdsourcig Data Aalytics Syste Xua Liu,MeiyuLu, Beg Chi Ooi, Yaya She,SaiWu, Meihui Zhag School of Coputig, Natioal Uiversity of Sigapore, Sigapore College of Coputer Sciece, Zhejiag Uiversity,

More information

Infinite Sequences and Series

Infinite Sequences and Series CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

THE HEIGHT OF q-binary SEARCH TREES

THE HEIGHT OF q-binary SEARCH TREES THE HEIGHT OF q-binary SEARCH TREES MICHAEL DRMOTA AND HELMUT PRODINGER Abstract. q biary search trees are obtaied from words, equipped with the geometric distributio istead of permutatios. The average

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

4.1 Sigma Notation and Riemann Sums

4.1 Sigma Notation and Riemann Sums 0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas

More information

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009) 18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

More information

MARTINGALES AND A BASIC APPLICATION

MARTINGALES AND A BASIC APPLICATION MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measure-theoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this

More information

Chapter 5: Inner Product Spaces

Chapter 5: Inner Product Spaces Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Distributed Storage Allocations for Optimal Delay

Distributed Storage Allocations for Optimal Delay Distributed Storage Allocatios for Optial Delay Derek Leog Departet of Electrical Egieerig Califoria Istitute of echology Pasadea, Califoria 925, USA derekleog@caltechedu Alexadros G Diakis Departet of

More information

Unit 20 Hypotheses Testing

Unit 20 Hypotheses Testing Uit 2 Hypotheses Testig Objectives: To uderstad how to formulate a ull hypothesis ad a alterative hypothesis about a populatio proportio, ad how to choose a sigificace level To uderstad how to collect

More information

Stretch Factor of Curveball Routing in Wireless Network: Cost of Load Balancing

Stretch Factor of Curveball Routing in Wireless Network: Cost of Load Balancing Stretch Factor of urveball outig i Wireless Network: ost of Load Balacig Fa Li Yu Wag The Uiversity of North arolia at harlotte, USA Eail: {fli, yu.wag}@ucc.edu Abstract outig i wireless etworks has bee

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

Lesson 17 Pearson s Correlation Coefficient

Lesson 17 Pearson s Correlation Coefficient Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) -types of data -scatter plots -measure of directio -measure of stregth Computatio -covariatio of X ad Y -uique variatio i X ad Y -measurig

More information

GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling
