Lecture Notes on Nonparametrics


Bruce E. Hansen, University of Wisconsin, Spring 2009

1 Introduction

Parametric means finite-dimensional. Non-parametric means infinite-dimensional.

The differences are profound. Typically, parametric estimates converge at an $n^{-1/2}$ rate. Non-parametric estimates typically converge at a rate slower than $n^{-1/2}$.

Typically, in parametric models there is no distinction between the true model and the fitted model. In contrast, non-parametric methods typically distinguish between the true and fitted models.

Non-parametric methods make the complexity of the fitted model depend upon the sample. The more information there is in the sample (i.e., the larger the sample size), the greater the degree of complexity of the fitted model. Taking this seriously requires a distinct distribution theory.

Non-parametric theory acknowledges that fitted models are approximations, and therefore are inherently misspecified. Misspecification implies estimation bias. Typically, increasing the complexity of a fitted model decreases this bias but increases the estimation variance. Nonparametric methods acknowledge this trade-off and attempt to set model complexity to minimize an overall measure of fit, typically mean-squared error (MSE).

There are many nonparametric statistical objects of potential interest, including density functions (univariate and multivariate), density derivatives, conditional density functions, conditional distribution functions, regression functions, median functions, quantile functions, and variance functions. Sometimes these nonparametric objects are of direct interest. Sometimes they are of interest only as an input to a second-stage estimation problem. If this second-stage problem is described by a finite dimensional parameter we call the estimation problem semiparametric.

Nonparametric methods typically involve some sort of approximation or smoothing method. Some of the main methods are called kernels, series, and splines.

Nonparametric methods are typically indexed by a bandwidth or tuning parameter which controls the degree of complexity. The choice of bandwidth is often critical to implementation. Data-dependent rules for determination of the bandwidth are therefore essential for nonparametric methods.
Nonparametric methods which require a bandwidth, but do not have an explicit data-dependent rule for selecting the bandwidth, are incomplete. Unfortunately this is quite common, due to the difficulty in developing rigorous rules for bandwidth selection. Often in these cases the bandwidth is selected based on a related statistical problem. This is a feasible yet worrisome compromise.

Many nonparametric problems are generalizations of univariate density estimation. We will start with this simple setting, and explore its theory in considerable detail.

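2 Kernel Density Estimation

2.1 Discrete Estimator

Let $X$ be a random variable with continuous distribution $F(x)$ and density $f(x) = \frac{d}{dx}F(x)$. The goal is to estimate $f(x)$ from a random sample $\{X_1, \ldots, X_n\}$.

The distribution function $F(x)$ is naturally estimated by the EDF $\hat F(x) = n^{-1}\sum_{i=1}^n 1(X_i \le x)$. It might seem natural to estimate the density $f(x)$ as the derivative of $\hat F(x)$, $\frac{d}{dx}\hat F(x)$, but this estimator would be a set of mass points, not a density, and as such is not a useful estimate of $f(x)$.

Instead, consider a discrete derivative. For some small $h > 0$, let
$$\hat f(x) = \frac{\hat F(x+h) - \hat F(x-h)}{2h}.$$
We can write this as
$$\frac{1}{2nh}\sum_{i=1}^n 1\left(x - h < X_i \le x + h\right) = \frac{1}{2nh}\sum_{i=1}^n 1\left(\frac{|X_i - x|}{h} \le 1\right) = \frac{1}{nh}\sum_{i=1}^n k\left(\frac{X_i - x}{h}\right)$$
where
$$k(u) = \begin{cases} \frac{1}{2}, & |u| \le 1 \\ 0, & |u| > 1 \end{cases}$$
is the uniform density function on $[-1, 1]$. (The middle equality holds up to the event $X_i = x - h$, which has probability zero.)

The estimator $\hat f(x)$ counts the percentage of observations which are close to the point $x$. If many observations are near $x$, then $\hat f(x)$ is large. Conversely, if only a few $X_i$ are near $x$, then $\hat f(x)$ is small. The bandwidth $h$ controls the degree of smoothing.

$\hat f(x)$ is a special case of what is called a kernel estimator. The general case is
$$\hat f(x) = \frac{1}{nh}\sum_{i=1}^n k\left(\frac{X_i - x}{h}\right)$$
where $k(u)$ is a kernel function.

2.2 Kernel Functions

A kernel function $k(u): \mathbb{R} \to \mathbb{R}$ is any function which satisfies $\int_{-\infty}^{\infty} k(u)\,du = 1$.

A non-negative kernel satisfies $k(u) \ge 0$ for all $u$. In this case, $k(u)$ is a probability density function.

The moments of a kernel are $\kappa_j(k) = \int_{-\infty}^{\infty} u^j k(u)\,du$.

A symmetric kernel function satisfies $k(u) = k(-u)$ for all $u$. In this case, all odd moments are zero. Most nonparametric estimation uses symmetric kernels, and we focus on this case.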
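The equivalence between the discrete derivative of the EDF and the uniform-kernel estimator of Section 2.1 can be checked numerically. The following is a minimal sketch in plain Python. The function names (`edf`, `uniform_kernel`, `kde`, `discrete_derivative`) and the toy sample are illustrative assumptions, not part of the notes.

```python
def edf(x, data):
    """Empirical distribution function: share of observations <= x."""
    return sum(1 for xi in data if xi <= x) / len(data)


def uniform_kernel(u):
    """Uniform density on [-1, 1]."""
    return 0.5 if abs(u) <= 1 else 0.0


def kde(x, data, h, kernel):
    """Kernel density estimate (1/(n h)) * sum_i k((X_i - x) / h)."""
    return sum(kernel((xi - x) / h) for xi in data) / (len(data) * h)


def discrete_derivative(x, data, h):
    """Two-sided difference quotient of the EDF with step h."""
    return (edf(x + h, data) - edf(x - h, data)) / (2 * h)


# Hypothetical toy sample; no observation lies exactly on x - h or x + h.
sample = [0.1, 0.4, 0.45, 0.9, 1.3]
f_kernel = kde(0.5, sample, 0.2, uniform_kernel)
f_diff = discrete_derivative(0.5, sample, 0.2)
```

Passing a different `kernel` function to `kde` gives the general kernel estimator. Only the uniform kernel reproduces the difference quotient of the EDF.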

The order of a kernel, $\nu$, is defined as the order of the first non-zero moment. For example, if $\kappa_1(k) = 0$ and $\kappa_2(k) > 0$ then $k$ is a second-order kernel and $\nu = 2$. If $\kappa_1(k) = \kappa_2(k) = \kappa_3(k) = 0$ but $\kappa_4(k) \ne 0$ then $k$ is a fourth-order kernel and $\nu = 4$. The order of a symmetric kernel is always even.

Symmetric non-negative kernels are second-order kernels.

A kernel is a higher-order kernel if $\nu > 2$. These kernels will have negative parts and are not probability densities. They are also referred to as bias-reducing kernels.

Common second-order kernels are listed in the following table.

Table 1: Common Second-Order Kernels

Kernel         Equation                                                          R(k)                 kappa_2(k)   eff(k)
Uniform        $k_0(u) = \frac{1}{2}\,1(|u| \le 1)$                              $1/2$                $1/3$        1.0758
Epanechnikov   $k_1(u) = \frac{3}{4}(1-u^2)\,1(|u| \le 1)$                       $3/5$                $1/5$        1.0000
Biweight       $k_2(u) = \frac{15}{16}(1-u^2)^2\,1(|u| \le 1)$                   $5/7$                $1/7$        1.0061
Triweight      $k_3(u) = \frac{35}{32}(1-u^2)^3\,1(|u| \le 1)$                   $350/429$            $1/9$        1.0135
Gaussian       $k_\phi(u) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{u^2}{2}\right)$   $1/(2\sqrt{\pi})$    $1$          1.0513

In addition to the kernel formula we have listed its roughness $R(k)$, second moment $\kappa_2(k)$, and its efficiency $eff(k)$, the last of which will be defined later. The roughness of a function is
$$R(g) = \int_{-\infty}^{\infty} g(u)^2\,du.$$

The most commonly used kernels are the Epanechnikov and the Gaussian.

The kernels in Table 1 are special cases of the polynomial family
$$k_s(u) = \frac{(2s+1)!!}{2^{s+1}\,s!}\left(1-u^2\right)^s 1(|u| \le 1)$$
where the double factorial means $(2s+1)!! = (2s+1)(2s-1)\cdots 5 \cdot 3 \cdot 1$. The Gaussian kernel is obtained by taking the limit as $s \to \infty$ after rescaling. The kernels with higher $s$ are smoother, yielding estimates $\hat f(x)$ which are smoother and possess more derivatives. Estimates using the Gaussian kernel have derivatives of all orders.

For the purpose of nonparametric estimation the scale of the kernel is not uniquely defined. That is, for any kernel $k(u)$ we could have defined the alternative kernel $k^*(u) = b^{-1}k(u/b)$ for some constant $b > 0$. These two kernels are equivalent in the sense of producing the same density estimator, so long as the bandwidth is rescaled. That is, if $\hat f(x)$ is calculated with kernel $k$ and bandwidth $h$, it is numerically identical to a calculation with kernel $k^*$ and bandwidth $h^* = h/b$. Some authors use different definitions for the same kernels. This can cause confusion unless you are attentive.
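The roughness and second-moment entries for the polynomial family can be reproduced by numerical integration. The sketch below uses a simple midpoint rule in plain Python. The helper names (`double_factorial`, `poly_kernel`, `integrate`, `roughness`, `moment`) and the number of integration steps are assumptions made for illustration.

```python
import math


def double_factorial(m):
    """m!! = m (m-2) (m-4) ... down to 1, for odd m."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def poly_kernel(s):
    """Polynomial-family kernel k_s(u) = c_s (1 - u^2)^s on [-1, 1]."""
    c = double_factorial(2 * s + 1) / (2 ** (s + 1) * math.factorial(s))
    return lambda u: c * (1.0 - u * u) ** s if abs(u) <= 1 else 0.0


def integrate(g, a=-1.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (i + 0.5) * width) for i in range(steps))


def roughness(k):
    """R(k) = integral of k(u)^2."""
    return integrate(lambda u: k(u) ** 2)


def moment(k, j):
    """kappa_j(k) = integral of u^j k(u)."""
    return integrate(lambda u: u ** j * k(u))


# (R(k), kappa_2(k)) for s = 0, 1, 2, 3: Uniform, Epanechnikov, Biweight, Triweight.
constants = {s: (roughness(poly_kernel(s)), moment(poly_kernel(s), 2)) for s in range(4)}
```

The same helpers apply to any compactly supported kernel on $[-1, 1]$. The Gaussian kernel would need a wider integration range.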

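Higher-order kernels are obtained by multiplying a second-order kernel by a $(\nu/2 - 1)$th order polynomial in $u^2$. Explicit formulae for the general polynomial family can be found in B. Hansen (Econometric Theory, 2005), and for the Gaussian family in Wand and Schucany (Canadian Journal of Statistics, 1990). 4th and 6th order kernels of interest are given in Tables 2 and 3.

Table 2: Fourth-Order Kernels

Kernel         Equation                                                                     R(k)                   kappa_4(k)   eff(k)
Epanechnikov   $k_{4,1}(u) = \frac{15}{8}\left(1 - \frac{7}{3}u^2\right)k_1(u)$            $5/4$                  $-1/21$      1.0000
Biweight       $k_{4,2}(u) = \frac{7}{4}\left(1 - 3u^2\right)k_2(u)$                       $805/572$              $-1/33$      1.0056
Triweight      $k_{4,3}(u) = \frac{27}{16}\left(1 - \frac{11}{3}u^2\right)k_3(u)$          $3780/2431$            $-3/143$     1.0134
Gaussian       $k_{4,\phi}(u) = \frac{1}{2}\left(3 - u^2\right)k_\phi(u)$                  $27/(32\sqrt{\pi})$    $-3$         1.0729

Table 3: Sixth-Order Kernels

Kernel         Equation                                                                                     R(k)                      kappa_6(k)   eff(k)
Epanechnikov   $k_{6,1}(u) = \frac{175}{64}\left(1 - 6u^2 + \frac{33}{5}u^4\right)k_1(u)$                  $1575/832$                $5/429$      1.0000
Biweight       $k_{6,2}(u) = \frac{315}{128}\left(1 - \frac{22}{3}u^2 + \frac{143}{15}u^4\right)k_2(u)$    $29295/14144$             $1/143$      1.0048
Triweight      $k_{6,3}(u) = \frac{297}{128}\left(1 - \frac{26}{3}u^2 + 13u^4\right)k_3(u)$                $301455/134368$           $1/221$      1.0122
Gaussian       $k_{6,\phi}(u) = \frac{1}{8}\left(15 - 10u^2 + u^4\right)k_\phi(u)$                         $2265/(2048\sqrt{\pi})$   $15$         1.0871

2.3 Density Estimator

We now discuss some of the numerical properties of the kernel estimator
$$\hat f(x) = \frac{1}{nh}\sum_{i=1}^n k\left(\frac{X_i - x}{h}\right)$$
viewed as a function of $x$.

First, if $k(u)$ is non-negative then it is easy to see that $\hat f(x) \ge 0$. However, this is not guaranteed if $k$ is a higher-order kernel. That is, in this case it is possible that $\hat f(x) < 0$ for some values of $x$. When this happens it is prudent to zero-out the negative bits and then rescale:
$$\tilde f(x) = \frac{\hat f(x)\,1\left(\hat f(x) \ge 0\right)}{\int_{-\infty}^{\infty} \hat f(x)\,1\left(\hat f(x) \ge 0\right)dx}.$$
$\tilde f(x)$ is non-negative yet has the same asymptotic properties as $\hat f(x)$. Since the integral in the denominator is not analytically available this needs to be calculated numerically.

Second, $\hat f(x)$ integrates to one. To see this, first note that by the change-of-variables $u = (X_i - x)/h$, which has Jacobian $h$,
$$\int_{-\infty}^{\infty} \frac{1}{h}\,k\left(\frac{X_i - x}{h}\right)dx = \int_{-\infty}^{\infty} k(u)\,du = 1.$$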

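The change-of-variables $u = (X_i - x)/h$ will be used frequently, so it is useful to be familiar with this transformation. Thus
$$\int_{-\infty}^{\infty} \hat f(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{n}\sum_{i=1}^n \frac{1}{h}\,k\left(\frac{X_i - x}{h}\right)dx = \frac{1}{n}\sum_{i=1}^n \int_{-\infty}^{\infty} \frac{1}{h}\,k\left(\frac{X_i - x}{h}\right)dx = \frac{1}{n}\sum_{i=1}^n 1 = 1$$
as claimed. Thus $\hat f(x)$ is a valid density function when $k$ is non-negative.

Third, we can also calculate the numerical moments of the density $\hat f(x)$. Again using the change-of-variables $u = (X_i - x)/h$, so that $x = X_i - uh$, the mean of the estimated density is
$$\int_{-\infty}^{\infty} x \hat f(x)\,dx = \frac{1}{n}\sum_{i=1}^n \int_{-\infty}^{\infty} x\,\frac{1}{h}\,k\left(\frac{X_i - x}{h}\right)dx = \frac{1}{n}\sum_{i=1}^n \int_{-\infty}^{\infty} (X_i - uh)\,k(u)\,du = \frac{1}{n}\sum_{i=1}^n X_i \int_{-\infty}^{\infty} k(u)\,du - \frac{1}{n}\sum_{i=1}^n h \int_{-\infty}^{\infty} u\,k(u)\,du = \frac{1}{n}\sum_{i=1}^n X_i,$$
the sample mean of the $X_i$.

The second moment of the estimated density is
$$\int_{-\infty}^{\infty} x^2 \hat f(x)\,dx = \frac{1}{n}\sum_{i=1}^n \int_{-\infty}^{\infty} x^2\,\frac{1}{h}\,k\left(\frac{X_i - x}{h}\right)dx = \frac{1}{n}\sum_{i=1}^n \int_{-\infty}^{\infty} (X_i - uh)^2 k(u)\,du = \frac{1}{n}\sum_{i=1}^n X_i^2 - \frac{2}{n}\sum_{i=1}^n X_i h \int_{-\infty}^{\infty} u\,k(u)\,du + \frac{1}{n}\sum_{i=1}^n h^2 \int_{-\infty}^{\infty} u^2 k(u)\,du = \frac{1}{n}\sum_{i=1}^n X_i^2 + h^2 \kappa_2(k).$$
It follows that the variance of the density $\hat f(x)$ is
$$\int_{-\infty}^{\infty} x^2 \hat f(x)\,dx - \left(\int_{-\infty}^{\infty} x \hat f(x)\,dx\right)^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 + h^2\kappa_2(k) - \left(\frac{1}{n}\sum_{i=1}^n X_i\right)^2 = \hat\sigma^2 + h^2\kappa_2(k)$$
where $\hat\sigma^2$ is the sample variance. Thus the density estimate inflates the sample variance by the term $h^2\kappa_2(k)$.

These are the numerical mean and variance of the estimated density $\hat f(x)$, not its sampling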

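mean and variance.

2.4 Estimation Bias

It is useful to observe that expectations of kernel transformations can be written as integrals which take the form of a convolution of the kernel and the density function:
$$E\,\frac{1}{h}\,k\left(\frac{X_i - x}{h}\right) = \int_{-\infty}^{\infty} \frac{1}{h}\,k\left(\frac{z - x}{h}\right) f(z)\,dz.$$
Using the change-of-variables $u = (z - x)/h$, this equals
$$\int_{-\infty}^{\infty} k(u)\,f(x + hu)\,du.$$
By the linearity of the estimator we see
$$E \hat f(x) = \frac{1}{n}\sum_{i=1}^n E\,\frac{1}{h}\,k\left(\frac{X_i - x}{h}\right) = \int_{-\infty}^{\infty} k(u)\,f(x + hu)\,du.$$
The last expression shows that the expected value is an average of $f(z)$ locally about $x$.

This integral (typically) is not analytically solvable, so we approximate it using a Taylor expansion of $f(x + hu)$ in the argument $hu$, which is valid as $h \to 0$. For a $\nu$th-order kernel we take the expansion out to the $\nu$th term
$$f(x + hu) = f(x) + f^{(1)}(x)hu + \frac{1}{2}f^{(2)}(x)h^2u^2 + \frac{1}{3!}f^{(3)}(x)h^3u^3 + \cdots + \frac{1}{\nu!}f^{(\nu)}(x)h^\nu u^\nu + o(h^\nu).$$
The remainder is of smaller order than $h^\nu$ as $h \to 0$, which is written as $o(h^\nu)$. (This expansion assumes $f^{(\nu+1)}(x)$ exists.)

Integrating term by term and using $\int_{-\infty}^{\infty} k(u)\,du = 1$ and the definition $\int_{-\infty}^{\infty} k(u)u^j\,du = \kappa_j(k)$,
$$\int_{-\infty}^{\infty} k(u)\,f(x + hu)\,du = f(x) + f^{(1)}(x)h\kappa_1(k) + \frac{1}{2}f^{(2)}(x)h^2\kappa_2(k) + \frac{1}{3!}f^{(3)}(x)h^3\kappa_3(k) + \cdots + \frac{1}{\nu!}f^{(\nu)}(x)h^\nu\kappa_\nu(k) + o(h^\nu) = f(x) + \frac{1}{\nu!}f^{(\nu)}(x)h^\nu\kappa_\nu(k) + o(h^\nu)$$
where the second equality uses the assumption that $k$ is a $\nu$th order kernel (so $\kappa_j(k) = 0$ for $0 < j < \nu$).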
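The quality of this expansion can be illustrated in a case where the convolution integral is available in closed form. The sketch below assumes that $f$ is the standard normal density and $k$ is the Gaussian kernel (so $\nu = 2$ and $\kappa_2 = 1$). Then $\int k(u) f(x + hu)\,du$ is the $N(0, 1 + h^2)$ density, and the leading bias term is $\frac{1}{2} f^{(2)}(x) h^2$. The function names and the values of $x$ and $h$ are illustrative assumptions.

```python
import math


def normal_pdf(x, variance=1.0):
    """Density of N(0, variance) at x."""
    return math.exp(-0.5 * x * x / variance) / math.sqrt(2 * math.pi * variance)


def expected_kde_gaussian(x, h):
    """E f_hat(x) when f is N(0,1) and k is the Gaussian kernel.

    Convolving N(0,1) with a N(0, h^2) kernel gives the N(0, 1 + h^2) density.
    """
    return normal_pdf(x, 1.0 + h * h)


def leading_bias(x, h):
    """(1/2) f''(x) h^2 kappa_2, with kappa_2 = 1 and f''(x) = (x^2 - 1) phi(x)."""
    return 0.5 * (x * x - 1.0) * normal_pdf(x) * h * h


x0, h = 0.0, 0.1
exact_bias = expected_kde_gaussian(x0, h) - normal_pdf(x0)
approx_bias = leading_bias(x0, h)
```

At the mode the density is concave, so both quantities are negative: smoothing pulls the estimate down where the curvature is most negative.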

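This means that
$$E \hat f(x) = \frac{1}{n}\sum_{i=1}^n E\,\frac{1}{h}\,k\left(\frac{X_i - x}{h}\right) = f(x) + \frac{1}{\nu!}f^{(\nu)}(x)h^\nu\kappa_\nu(k) + o(h^\nu).$$
The bias of $\hat f(x)$ is then
$$Bias(\hat f(x)) = E \hat f(x) - f(x) = \frac{1}{\nu!}f^{(\nu)}(x)h^\nu\kappa_\nu(k) + o(h^\nu).$$
For second-order kernels, this simplifies to
$$Bias(\hat f(x)) = \frac{1}{2}f^{(2)}(x)h^2\kappa_2(k) + O(h^4).$$
For second-order kernels, the bias is increasing in the square of the bandwidth. Smaller bandwidths imply reduced bias. The bias is also proportional to the second derivative of the density $f^{(2)}(x)$. Intuitively, the estimator $\hat f(x)$ smooths data local to $X_i = x$, so is estimating a smoothed version of $f(x)$. The bias results from this smoothing, and is larger the greater the curvature in $f(x)$.

When higher-order kernels are used (and the density has enough derivatives), the bias is proportional to $h^\nu$, which is of lower order than $h^2$. Thus the bias of estimates using higher-order kernels is of lower order than estimates from second-order kernels, and this is why they are called bias-reducing kernels. This is the advantage of higher-order kernels.

2.5 Estimation Variance

Since the kernel estimator is a linear estimator, and $k\left(\frac{X_i - x}{h}\right)$ is iid,
$$var\left(\hat f(x)\right) = \frac{1}{nh^2}\,var\left(k\left(\frac{X_i - x}{h}\right)\right) = \frac{1}{nh^2}\,E\,k\left(\frac{X_i - x}{h}\right)^2 - \frac{1}{n}\left(\frac{1}{h}\,E\,k\left(\frac{X_i - x}{h}\right)\right)^2.$$
From our analysis of bias we know that $\frac{1}{h}E\,k\left(\frac{X_i - x}{h}\right) = f(x) + o(1)$ so the second term is $O\left(\frac{1}{n}\right)$. For the first term, write the expectation as an integral, make a change-of-variables and a first-order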

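Taylor expansion
$$\frac{1}{h}\,E\,k\left(\frac{X_i - x}{h}\right)^2 = \frac{1}{h}\int_{-\infty}^{\infty} k\left(\frac{z - x}{h}\right)^2 f(z)\,dz = \int_{-\infty}^{\infty} k(u)^2 f(x + hu)\,du = \int_{-\infty}^{\infty} k(u)^2\left(f(x) + O(h)\right)du = f(x)R(k) + O(h)$$
where $R(k) = \int_{-\infty}^{\infty} k(u)^2\,du$ is the roughness of the kernel. Together, we see
$$var\left(\hat f(x)\right) = \frac{f(x)R(k)}{nh} + O\left(\frac{1}{n}\right).$$
The remainder $O\left(\frac{1}{n}\right)$ is of smaller order than the $O\left(\frac{1}{nh}\right)$ leading term, since $h^{-1} \to \infty$.

2.6 Mean-Squared Error

A common and convenient measure of estimation precision is the mean-squared error
$$MSE(\hat f(x)) = E\left(\hat f(x) - f(x)\right)^2 = Bias(\hat f(x))^2 + var\left(\hat f(x)\right) \simeq \left(\frac{1}{\nu!}f^{(\nu)}(x)h^\nu\kappa_\nu(k)\right)^2 + \frac{f(x)R(k)}{nh} = \frac{\kappa_\nu^2(k)}{(\nu!)^2}f^{(\nu)}(x)^2h^{2\nu} + \frac{f(x)R(k)}{nh} = AMSE(\hat f(x)).$$
Since this approximation is based on asymptotic expansions this is called the asymptotic mean-squared-error (AMSE). Note that it is a function of the sample size $n$, the bandwidth $h$, the kernel function (through $\kappa_\nu$ and $R(k)$), and varies with $x$ as $f^{(\nu)}(x)$ and $f(x)$ vary.

Notice as well that the first term (the squared bias) is increasing in $h$ and the second term (the variance) is decreasing in $nh$. For $MSE(\hat f(x))$ to decline as $n \to \infty$ both of these terms must get small. Thus as $n \to \infty$ we must have $h \to 0$ and $nh \to \infty$. That is, the bandwidth must decrease, but not at a rate faster than sample size. This is sufficient to establish the pointwise consistency of the estimator. That is, for all $x$, $\hat f(x) \to_p f(x)$ as $n \to \infty$. We call this pointwise convergence as it is valid for each $x$ individually. We discuss uniform convergence later.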

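A global measure of precision is the asymptotic mean integrated squared error (AMISE)
$$AMISE = \int_{-\infty}^{\infty} AMSE(\hat f(x))\,dx = \frac{\kappa_\nu^2(k)}{(\nu!)^2}R\left(f^{(\nu)}\right)h^{2\nu} + \frac{R(k)}{nh}$$
where $R(f^{(\nu)}) = \int_{-\infty}^{\infty} \left(f^{(\nu)}(x)\right)^2 dx$ is the roughness of $f^{(\nu)}$.

2.7 Asymptotically Optimal Bandwidth

The AMISE formula expresses the MSE as a function of $h$. The value of $h$ which minimizes this expression is called the asymptotically optimal bandwidth. The solution is found by taking the derivative of the AMISE with respect to $h$ and setting it equal to zero:
$$\frac{d}{dh}AMISE = \frac{d}{dh}\left(\frac{\kappa_\nu^2(k)}{(\nu!)^2}R\left(f^{(\nu)}\right)h^{2\nu} + \frac{R(k)}{nh}\right) = 2\nu h^{2\nu-1}\frac{\kappa_\nu^2(k)}{(\nu!)^2}R\left(f^{(\nu)}\right) - \frac{R(k)}{nh^2} = 0$$
with solution
$$h_0 = C_\nu(k, f)\,n^{-1/(2\nu+1)}$$
$$C_\nu(k, f) = R\left(f^{(\nu)}\right)^{-1/(2\nu+1)} A_\nu(k)$$
$$A_\nu(k) = \left(\frac{(\nu!)^2 R(k)}{2\nu\,\kappa_\nu^2(k)}\right)^{1/(2\nu+1)}.$$

The optimal bandwidth is proportional to $n^{-1/(2\nu+1)}$. We say that the optimal bandwidth is of order $O\left(n^{-1/(2\nu+1)}\right)$. For second-order kernels the optimal rate is $O\left(n^{-1/5}\right)$. For higher-order kernels the rate is slower, suggesting that bandwidths are generally larger than for second-order kernels. The intuition is that since higher-order kernels have smaller bias, they can afford a larger bandwidth.

The constant of proportionality $C_\nu(k, f)$ depends on the kernel through the function $A_\nu(k)$ (which can be calculated from Table 1), and the density through $R(f^{(\nu)})$ (which is unknown).

If the bandwidth is set to $h_0$, then with some simplification the AMISE equals
$$AMISE_0(k) = (1 + 2\nu)\left(\frac{R\left(f^{(\nu)}\right)\kappa_\nu^2(k)R(k)^{2\nu}}{(\nu!)^2(2\nu)^{2\nu}}\right)^{1/(2\nu+1)} n^{-2\nu/(2\nu+1)}.$$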
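The bandwidth formula and the closed form for the minimized AMISE can be cross-checked numerically when $R(f^{(\nu)})$ is known. The sketch below assumes a Gaussian kernel ($\nu = 2$, $\kappa_2 = 1$, $R(k) = 1/(2\sqrt{\pi})$) and a standard normal $f$, for which $R(f^{(2)}) = 3/(8\sqrt{\pi})$. The function names and the sample size $n = 500$ are illustrative assumptions.

```python
import math


def amise(h, n, nu, kappa_nu, rough_k, rough_f_nu):
    """AMISE(h) = kappa^2 / (nu!)^2 * R(f^(nu)) * h^(2 nu) + R(k) / (n h)."""
    bias_sq = (kappa_nu / math.factorial(nu)) ** 2 * rough_f_nu * h ** (2 * nu)
    variance = rough_k / (n * h)
    return bias_sq + variance


def optimal_bandwidth(n, nu, kappa_nu, rough_k, rough_f_nu):
    """h_0 = [(nu!)^2 R(k) / (2 nu kappa^2 R(f^(nu)) n)]^(1 / (2 nu + 1))."""
    a = math.factorial(nu) ** 2 * rough_k / (2 * nu * kappa_nu ** 2 * rough_f_nu)
    return (a / n) ** (1.0 / (2 * nu + 1))


def amise_at_optimum(n, nu, kappa_nu, rough_k, rough_f_nu):
    """Closed form for the AMISE evaluated at h_0."""
    inner = (rough_f_nu * kappa_nu ** 2 * rough_k ** (2 * nu)
             / (math.factorial(nu) ** 2 * (2 * nu) ** (2 * nu)))
    return (1 + 2 * nu) * inner ** (1.0 / (2 * nu + 1)) * n ** (-2.0 * nu / (2 * nu + 1))


# Gaussian kernel and N(0,1) density; n is an arbitrary illustrative sample size.
n = 500
args = (n, 2, 1.0, 1 / (2 * math.sqrt(math.pi)), 3 / (8 * math.sqrt(math.pi)))
h0 = optimal_bandwidth(*args)
amise_h0 = amise(h0, *args)
```

In this special case $h_0 = (4/3)^{1/5} n^{-1/5} \approx 1.06\,n^{-1/5}$, which is the familiar normal-reference constant.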

For second-order kernels, this equals
$$AMISE_0(k) = \frac{5}{4}\left(\kappa_2^2(k)R(k)^4R\left(f^{(2)}\right)\right)^{1/5} n^{-4/5}.$$
As $\nu$ gets large, the convergence rate approaches the parametric rate $n^{-1}$. Thus, at least asymptotically, the slow convergence of nonparametric estimation can be mitigated through the use of higher-order kernels.

This seems a bit magical. What's the catch? For one, the improvement in convergence rate requires that the density is sufficiently smooth that derivatives exist up to the $(\nu + 1)$th order. As the density becomes increasingly smooth, it is easier to approximate by a low-dimensional curve, and gets closer to a parametric-type problem. This is exploiting the smoothness of $f$, which is inherently unknown. The other catch is that there is some evidence that the benefits of higher-order kernels only develop when the sample size is fairly large. My sense is that in small samples, a second-order kernel would be the best choice, in moderate samples a 4th order kernel, and in larger samples a 6th order kernel could be used.

2.8 Asymptotically Optimal Kernel

Given that we have picked the kernel order, which kernel should we use? Examining the expression $AMISE_0$ we can see that for fixed $\nu$ the choice of kernel affects the asymptotic precision through the quantity $\kappa_\nu(k)R(k)^\nu$. All else equal, AMISE will be minimized by selecting the kernel which minimizes this quantity. As we discussed earlier, only the shape of the kernel is important, not its scale, so we can set $\kappa_\nu = 1$. Then the problem reduces to minimization of $R(k) = \int_{-\infty}^{\infty} k(u)^2\,du$ subject to the constraints $\int_{-\infty}^{\infty} k(u)\,du = 1$ and $\int_{-\infty}^{\infty} u^\nu k(u)\,du = 1$. This is a problem in the calculus of variations. It turns out that the solution is a scaled version of $k_{\nu,1}$ (see Muller (Annals of Statistics, 1984)). As the scale is irrelevant, this means that for estimation of the density function, the higher-order Epanechnikov kernel $k_{\nu,1}$ with optimal bandwidth yields the lowest possible AMISE. For this reason, the Epanechnikov kernel is often called the optimal kernel.
To compare kernels, the relative efficiency of a kernel $k$ is defined as
$$eff(k) = \left(\frac{AMISE_0(k)}{AMISE_0(k_{\nu,1})}\right)^{(1+2\nu)/2\nu} = \frac{\left(\kappa_\nu^2(k)\right)^{1/2\nu}R(k)}{\left(\kappa_\nu^2(k_{\nu,1})\right)^{1/2\nu}R(k_{\nu,1})}.$$
The ratio of the AMISEs is raised to the power $(1 + 2\nu)/2\nu$ because, for large $n$, the AMISE will be the same whether we use $n$ observations with kernel $k_{\nu,1}$ or $n \cdot eff(k)$ observations with kernel $k$. Thus the penalty $eff(k)$ is expressed as a percentage of observations.

The efficiencies of the various kernels are given in Tables 1-3. Examining the second-order kernels, we see that relative to the Epanechnikov kernel, the uniform kernel pays a penalty of about 7%, the Gaussian kernel a penalty of about 5%, the Triweight kernel about 1.4%, and the Biweight

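kernel less than 1%. Examining the 4th and 6th-order kernels, we see that the relative efficiency of the Gaussian kernel deteriorates, while that of the Biweight and Triweight slightly improves.

The differences are not big. Still, the calculation suggests that the Epanechnikov and Biweight kernel classes are good choices for density estimation.

2.9 Rule-of-Thumb Bandwidth

The optimal bandwidth depends on the unknown quantity $R(f^{(\nu)})$. Silverman proposed that we try the bandwidth computed by replacing $R(f^{(\nu)})$ in the optimal formula by $R(g_{\hat\sigma}^{(\nu)})$ where $g_\sigma$ is a reference density (a plausible candidate for $f$) and $\hat\sigma$ is the sample standard deviation. The standard choice is to set $g_{\hat\sigma} = \phi_{\hat\sigma}$, the $N(0, \hat\sigma^2)$ density. The idea is that if the true density is normal, then the computed bandwidth will be optimal. If the true density is reasonably close to the normal, then the bandwidth will be close to optimal. While not a perfect solution, it is a good place to start looking.

For any density $g$, if we set $g_\sigma(x) = \sigma^{-1}g(x/\sigma)$, then $g_\sigma^{(\nu)}(x) = \sigma^{-1-\nu}g^{(\nu)}(x/\sigma)$. Thus
$$R\left(g_\sigma^{(\nu)}\right)^{-1/(2\nu+1)} = \left(\int g_\sigma^{(\nu)}(x)^2\,dx\right)^{-1/(2\nu+1)} = \left(\sigma^{-2-2\nu}\int g^{(\nu)}(x/\sigma)^2\,dx\right)^{-1/(2\nu+1)} = \left(\sigma^{-1-2\nu}\int g^{(\nu)}(x)^2\,dx\right)^{-1/(2\nu+1)} = \sigma\,R\left(g^{(\nu)}\right)^{-1/(2\nu+1)}.$$
Furthermore,
$$R\left(\phi^{(\nu)}\right)^{-1/(2\nu+1)} = 2\left(\frac{\pi^{1/2}\,\nu!}{(2\nu)!}\right)^{1/(2\nu+1)}.$$
Thus
$$R\left(\phi_{\hat\sigma}^{(\nu)}\right)^{-1/(2\nu+1)} = 2\hat\sigma\left(\frac{\pi^{1/2}\,\nu!}{(2\nu)!}\right)^{1/(2\nu+1)}.$$
The rule-of-thumb bandwidth is then $h = \hat\sigma\,C_\nu(k)\,n^{-1/(2\nu+1)}$ where
$$C_\nu(k) = R\left(\phi^{(\nu)}\right)^{-1/(2\nu+1)}A_\nu(k) = 2\left(\frac{\pi^{1/2}(\nu!)^3R(k)}{2\nu\,(2\nu)!\,\kappa_\nu^2(k)}\right)^{1/(2\nu+1)}.$$
We collect these constants in Table 4.

Table 4: Rule of Thumb Constants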

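Kernel         $\nu = 2$   $\nu = 4$   $\nu = 6$
Epanechnikov   2.34        3.03        3.53
Biweight       2.78        3.39        3.84
Triweight      3.15        3.72        4.13
Gaussian       1.06        1.08        1.08

Silverman Rule-of-Thumb: $h = \hat\sigma\,C_\nu(k)\,n^{-1/(2\nu+1)}$ where $\hat\sigma$ is the sample standard deviation, $\nu$ is the order of the kernel, and $C_\nu(k)$ is the constant from Table 4.

If a Gaussian kernel is used, this is often simplified to $h = \hat\sigma\,n^{-1/(2\nu+1)}$. In particular, for the standard second-order normal kernel, $h = \hat\sigma\,n^{-1/5}$.

2.10 Density Derivatives

Consider the problem of estimating the $r$th derivative of the density:
$$f^{(r)}(x) = \frac{d^r}{dx^r}f(x).$$
A natural estimator is found by taking derivatives of the kernel density estimator. This takes the form
$$\hat f^{(r)}(x) = \frac{d^r}{dx^r}\hat f(x) = \frac{(-1)^r}{nh^{1+r}}\sum_{i=1}^n k^{(r)}\left(\frac{X_i - x}{h}\right)$$
where
$$k^{(r)}(x) = \frac{d^r}{dx^r}k(x)$$
and the factor $(-1)^r$ arises because the argument $(X_i - x)/h$ is decreasing in $x$. This estimator only makes sense if $k^{(r)}(x)$ exists and is non-zero. Since the Gaussian kernel has derivatives of all orders this is a common choice for derivative estimation.

The asymptotic analysis of this estimator is similar to that of the density, but with a couple of extra wrinkles and noticeably different results. First, to calculate the bias we observe that
$$E\,\frac{(-1)^r}{h^{1+r}}\,k^{(r)}\left(\frac{X_i - x}{h}\right) = (-1)^r\int_{-\infty}^{\infty} \frac{1}{h^{1+r}}\,k^{(r)}\left(\frac{z - x}{h}\right)f(z)\,dz.$$
To simplify this expression we use integration by parts. As the integral of $h^{-1}k^{(r)}\left(\frac{z - x}{h}\right)$ with respect to $z$ is $k^{(r-1)}\left(\frac{z - x}{h}\right)$, we find that the above expression equals
$$(-1)^{r-1}\int_{-\infty}^{\infty} \frac{1}{h^{r}}\,k^{(r-1)}\left(\frac{z - x}{h}\right)f^{(1)}(z)\,dz.$$
Repeating this a total of $r$ times, we obtain
$$\int_{-\infty}^{\infty} \frac{1}{h}\,k\left(\frac{z - x}{h}\right)f^{(r)}(z)\,dz.$$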

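Next, apply the change of variables $u = (z - x)/h$ to obtain
$$\int_{-\infty}^{\infty} k(u)\,f^{(r)}(x + hu)\,du.$$
Now expand $f^{(r)}(x + hu)$ in a $\nu$th-order Taylor expansion about $x$, and integrate the terms to find that the above equals
$$f^{(r)}(x) + \frac{1}{\nu!}f^{(r+\nu)}(x)h^\nu\kappa_\nu(k) + o(h^\nu)$$
where $\nu$ is the order of the kernel. Hence the asymptotic bias is
$$Bias(\hat f^{(r)}(x)) = E \hat f^{(r)}(x) - f^{(r)}(x) = \frac{1}{\nu!}f^{(r+\nu)}(x)h^\nu\kappa_\nu(k) + o(h^\nu).$$
This of course presumes that $f$ is differentiable of order at least $r + \nu + 1$.

For the variance, we find
$$var\left(\hat f^{(r)}(x)\right) = \frac{1}{nh^{2+2r}}\,var\left(k^{(r)}\left(\frac{X_i - x}{h}\right)\right) = \frac{1}{nh^{2+2r}}\,E\,k^{(r)}\left(\frac{X_i - x}{h}\right)^2 - \frac{1}{n}\left(\frac{1}{h^{1+r}}\,E\,k^{(r)}\left(\frac{X_i - x}{h}\right)\right)^2$$
$$= \frac{1}{nh^{2+2r}}\int_{-\infty}^{\infty} k^{(r)}\left(\frac{z - x}{h}\right)^2 f(z)\,dz - \frac{1}{n}f^{(r)}(x)^2 + o\left(\frac{1}{n}\right)$$
$$= \frac{1}{nh^{1+2r}}\int_{-\infty}^{\infty} k^{(r)}(u)^2 f(x + hu)\,du + O\left(\frac{1}{n}\right)$$
$$= \frac{f(x)}{nh^{1+2r}}\int_{-\infty}^{\infty} k^{(r)}(u)^2\,du + O\left(\frac{1}{nh^{2r}}\right)$$
$$= \frac{f(x)R(k^{(r)})}{nh^{1+2r}} + O\left(\frac{1}{nh^{2r}}\right).$$
The AMSE and AMISE are
$$AMSE(\hat f^{(r)}(x)) = \frac{f^{(r+\nu)}(x)^2h^{2\nu}\kappa_\nu^2(k)}{(\nu!)^2} + \frac{f(x)R(k^{(r)})}{nh^{1+2r}}$$
and
$$AMISE(\hat f^{(r)}(x)) = \frac{R\left(f^{(r+\nu)}\right)h^{2\nu}\kappa_\nu^2(k)}{(\nu!)^2} + \frac{R(k^{(r)})}{nh^{1+2r}}.$$
Note that the order of the bias is the same as for estimation of the density. But the variance is now of order $O\left(\frac{1}{nh^{1+2r}}\right)$ which is much larger than the $O\left(\frac{1}{nh}\right)$ found earlier.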
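For the first derivative ($r = 1$) with a Gaussian kernel, the estimator can be written out and compared with a finite-difference derivative of the density estimate itself. This is a minimal sketch: it assumes $k = \phi$, so that $k^{(1)}(u) = -u\phi(u)$, and the function names, toy sample, and step size `eps` are illustrative.

```python
import math


def phi(u):
    """Standard normal density, used here as the kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde(x, data, h):
    """Gaussian-kernel density estimate at x."""
    return sum(phi((xi - x) / h) for xi in data) / (len(data) * h)


def kde_first_derivative(x, data, h):
    """f_hat'(x) = (-1) / (n h^2) * sum_i k'((X_i - x) / h), with k'(u) = -u phi(u)."""
    total = sum(-((xi - x) / h) * phi((xi - x) / h) for xi in data)
    return -total / (len(data) * h ** 2)


sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.7]   # hypothetical data
x0, h, eps = 0.5, 0.6, 1e-5
analytic = kde_first_derivative(x0, sample, h)
numeric = (kde(x0 + eps, sample, h) - kde(x0 - eps, sample, h)) / (2 * eps)
```

The two values agree up to finite-difference error. Both also depend on the chosen bandwidth, which for derivative estimation should generally be larger than the one used for the density.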

The asymptotically optimal bandwidth is
$$h_r = C_{r,\nu}(k, f)\,n^{-1/(1+2r+2\nu)}$$
$$C_{r,\nu}(k, f) = R\left(f^{(r+\nu)}\right)^{-1/(1+2r+2\nu)}A_{r,\nu}(k)$$
$$A_{r,\nu}(k) = \left(\frac{(1+2r)(\nu!)^2R(k^{(r)})}{2\nu\,\kappa_\nu^2(k)}\right)^{1/(1+2r+2\nu)}.$$
Thus the optimal bandwidth converges at a slower rate than for density estimation. Given this bandwidth, the rate of convergence for the AMISE is $O\left(n^{-2\nu/(2r+2\nu+1)}\right)$, which is slower than the $O\left(n^{-2\nu/(2\nu+1)}\right)$ rate obtained when $r = 0$ (that is, $O\left(n^{-4/5}\right)$ for $\nu = 2$).

We see that we need a different bandwidth for estimation of derivatives than for estimation of the density. This is a common situation which arises in nonparametric analysis. The optimal amount of smoothing depends upon the object being estimated, and the goal of the analysis.

The AMISE with the optimal bandwidth is
$$AMISE(\hat f^{(r)}(x)) = (1 + 2r + 2\nu)\left(\frac{\kappa_\nu^2(k)\,R\left(f^{(r+\nu)}\right)}{(\nu!)^2(1+2r)}\right)^{(2r+1)/(1+2r+2\nu)}\left(\frac{R\left(k^{(r)}\right)}{2\nu}\right)^{2\nu/(1+2r+2\nu)} n^{-2\nu/(1+2r+2\nu)}.$$

We can also ask the question of which kernel function is optimal, and this is addressed by Muller (1984). The problem amounts to minimizing $R(k^{(r)})$ subject to a moment condition, and the solution is to set $k$ equal to $k_{\nu,r+1}$, the polynomial kernel of $\nu$th order and exponent $r + 1$. Thus to estimate a first derivative it is optimal to use a member of the Biweight class and for a second derivative a member of the Triweight class.

The relative efficiency of a kernel $k$ is then
$$eff(k) = \left(\frac{AMISE_0(k)}{AMISE_0(k_{\nu,r+1})}\right)^{(1+2\nu+2r)/2\nu} = \left(\frac{\kappa_\nu^2(k)}{\kappa_\nu^2(k_{\nu,r+1})}\right)^{(1+2r)/2\nu}\frac{R\left(k^{(r)}\right)}{R\left(k_{\nu,r+1}^{(r)}\right)}.$$
The relative efficiencies of the various kernels are presented in Table 5. (The Epanechnikov kernel is not considered as it is inappropriate for derivative estimation, and similarly the Biweight kernel for $r = 2$.) In contrast to the case $r = 0$, we see that the Gaussian kernel is highly inefficient, with the efficiency loss increasing with $r$ and $\nu$. These calculations suggest that when estimating density derivatives it is important to use the appropriate kernel.

Table 5: Relative Efficiency eff(k)

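                      Biweight   Triweight   Gaussian
$r = 1$   $\nu = 2$   1.0000     1.0185      1.2191
          $\nu = 4$   1.0000     1.0159      1.2753
          $\nu = 6$   1.0000     1.0136      1.3156
$r = 2$   $\nu = 2$              1.0000      1.4689
          $\nu = 4$              1.0000      1.5592
          $\nu = 6$              1.0000      1.6275

The Silverman Rule-of-Thumb may also be applied to density derivative estimation. Again using the reference density $g = \phi_{\hat\sigma}$, we find the rule-of-thumb bandwidth is $h = C_{r,\nu}(k)\,\hat\sigma\,n^{-1/(2r+2\nu+1)}$ where
$$C_{r,\nu}(k) = 2\left(\frac{\pi^{1/2}(1+2r)(\nu!)^2(r+\nu)!\,R\left(k^{(r)}\right)}{2\nu\,\kappa_\nu^2(k)\,(2r+2\nu)!}\right)^{1/(2r+2\nu+1)}.$$
The constants $C_{r,\nu}$ are collected in Table 6. For all kernels, the constants $C_{r,\nu}$ are similar but slightly decreasing as $r$ increases.

Table 6: Rule of Thumb Constants

                      Biweight   Triweight   Gaussian
$r = 1$   $\nu = 2$   2.49       2.83        0.97
          $\nu = 4$   3.18       3.49        1.03
          $\nu = 6$   3.44       3.96        1.04
$r = 2$   $\nu = 2$              2.70        0.94
          $\nu = 4$              3.35        1.00
          $\nu = 6$              3.84        1.02

2.11 Multivariate Density Estimation

Now suppose that $X_i$ is a $q$-vector and we want to estimate its density $f(x) = f(x_1, \ldots, x_q)$. A multivariate kernel estimator takes the form
$$\hat f(x) = \frac{1}{n|H|}\sum_{i=1}^n K\left(H^{-1}(X_i - x)\right)$$
where $K(u)$ is a multivariate kernel function depending on a bandwidth vector $H = (h_1, \ldots, h_q)'$ and $|H| = h_1h_2\cdots h_q$. Here $H^{-1}(X_i - x)$ denotes the vector whose $j$th element is $(X_{ji} - x_j)/h_j$. A multivariate kernel satisfies
$$\int K(u)\,(du) = \int K(u)\,du_1\cdots du_q = 1.$$
Typically, $K(u)$ takes the product form:
$$K(u) = k(u_1)\,k(u_2)\cdots k(u_q).$$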
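A product-kernel estimator of this form takes only a few lines. The sketch below assumes a Gaussian univariate kernel and represents each observation as a tuple of length $q$. The name `product_kde` and its arguments are illustrative, not taken from the notes.

```python
import math


def phi(u):
    """Standard normal density, used as the univariate kernel k."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def product_kde(x, data, bandwidths, kernel=phi):
    """f_hat(x) = 1 / (n |H|) * sum_i prod_j k((X_ij - x_j) / h_j)."""
    det_h = math.prod(bandwidths)               # |H| = h_1 * ... * h_q
    total = 0.0
    for obs in data:
        total += math.prod(kernel((xij - xj) / hj)
                           for xij, xj, hj in zip(obs, x, bandwidths))
    return total / (len(data) * det_h)


# Hypothetical bivariate sample and evaluation point.
sample = [(0.0, 0.0), (1.0, -0.5), (-0.3, 0.8)]
value = product_kde((0.2, 0.1), sample, (0.5, 0.7))
```

With $q = 1$ the function reduces to the univariate kernel estimator, and with a single observation at the evaluation point it returns $k(0)^q / |H|$.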

As in the univariate case, $\hat f(x)$ has the property that it integrates to one, and is non-negative if $K(u) \ge 0$. When $K(u)$ is a product kernel then the marginal densities of $\hat f(x)$ equal univariate kernel density estimators with kernel functions $k$ and bandwidths $h_j$.

With some work, you can show that when $K(u)$ takes the product form, the bias of the estimator is
$$Bias(\hat f(x)) = \frac{\kappa_\nu(k)}{\nu!}\sum_{j=1}^q \frac{\partial^\nu}{\partial x_j^\nu}f(x)\,h_j^\nu + o\left(h_1^\nu + \cdots + h_q^\nu\right)$$
and the variance is
$$var\left(\hat f(x)\right) = \frac{f(x)R(K)}{n|H|} + O\left(\frac{1}{n}\right) = \frac{f(x)R(k)^q}{nh_1h_2\cdots h_q} + O\left(\frac{1}{n}\right).$$
Hence the AMISE is
$$AMISE\left(\hat f(x)\right) = \frac{\kappa_\nu^2(k)}{(\nu!)^2}\int\left(\sum_{j=1}^q \frac{\partial^\nu}{\partial x_j^\nu}f(x)\,h_j^\nu\right)^2(dx) + \frac{R(k)^q}{nh_1h_2\cdots h_q}.$$

There is no closed-form solution for the bandwidth vector which minimizes this expression. However, even without doing so, we can make a couple of observations.

First, the AMISE depends on the kernel function only through $R(k)$ and $\kappa_\nu(k)$, so it is clear that for any given $\nu$, the optimal kernel minimizes $R(k)$, which is the same as in the univariate case.

Second, the optimal bandwidths will all be of order $n^{-1/(2\nu+q)}$ and the optimal AMISE of order $n^{-2\nu/(2\nu+q)}$. These rates are slower than the univariate ($q = 1$) case. The fact that dimension has an adverse effect on convergence rates is called the curse of dimensionality. Many theoretical papers circumvent this problem through the following trick. Suppose you need the AMISE of the estimator to converge at a rate $O\left(n^{-1/2}\right)$ or faster. This requires $2\nu/(2\nu + q) > 1/2$, or $q < 2\nu$. For second-order kernels ($\nu = 2$) this restricts the dimension to be 3 or less. What some authors will do is slip in an assumption of the form: "Assume $f(x)$ is differentiable of order $\nu + 1$ where $\nu > q/2$," and then claim that their results hold for all $q$. The trouble is that what the author is doing is imposing greater smoothness as the dimension increases. This doesn't really avoid the curse of dimensionality, rather it hides it behind what appears to be a technical assumption. The bottom line is that nonparametric objects are much harder to estimate in higher dimensions, and that is why it is called a curse.

To derive a rule-of-thumb, suppose that $h_1 = h_2 = \cdots = h_q = h$. Then
$$AMISE\left(\hat f(x)\right) = \frac{\kappa_\nu^2(k)R\left(\nabla^\nu f\right)}{(\nu!)^2}h^{2\nu} + \frac{R(k)^q}{nh^q}$$

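where
$$\nabla^\nu f(x) = \sum_{j=1}^q \frac{\partial^\nu}{\partial x_j^\nu}f(x).$$
We find that the optimal bandwidth is
$$h_0 = \left(\frac{(\nu!)^2\,q\,R(k)^q}{2\nu\,\kappa_\nu^2(k)\,R\left(\nabla^\nu f\right)}\right)^{1/(2\nu+q)} n^{-1/(2\nu+q)}.$$
For a rule-of-thumb bandwidth, we replace $f$ by the multivariate normal density $\phi$. We can calculate that
$$R\left(\nabla^\nu\phi\right) = \frac{q}{\pi^{q/2}2^{q+\nu}}\left((2\nu-1)!! + (q-1)\left((\nu-1)!!\right)^2\right).$$
Making this substitution, we obtain $h_0 = C_\nu(k, q)\,n^{-1/(2\nu+q)}$ where
$$C_\nu(k, q) = \left(\frac{\pi^{q/2}\,2^{q+\nu-1}(\nu!)^2R(k)^q}{\nu\,\kappa_\nu^2(k)\left((2\nu-1)!! + (q-1)\left((\nu-1)!!\right)^2\right)}\right)^{1/(2\nu+q)}.$$
Now this assumed that all variables had unit variance. Rescaling the bandwidths by the standard deviation of each variable, we obtain the rule-of-thumb bandwidth for the $j$th variable:
$$h_j = \hat\sigma_j\,C_\nu(k, q)\,n^{-1/(2\nu+q)}.$$
Numerical values for the constants $C_\nu(k, q)$ are given in Table 7 for $q = 2, 3, 4$.

Table 7: Rule of Thumb Constants

            Kernel         $q = 2$   $q = 3$   $q = 4$
$\nu = 2$   Epanechnikov   2.20      2.12      2.07
            Biweight       2.61      2.52      2.46
            Triweight      2.96      2.86      2.80
            Gaussian       1.00      0.97      0.95
$\nu = 4$   Epanechnikov   3.12      3.20      3.27
            Biweight       3.50      3.59      3.67
            Triweight      3.84      3.94      4.03
            Gaussian       1.12      1.16      1.19
$\nu = 6$   Epanechnikov   3.69      3.83      3.96
            Biweight       4.02      4.18      4.32
            Triweight      4.33      4.50      4.66
            Gaussian       1.13      1.18      1.23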
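The rule-of-thumb constants can be recomputed from the formula for $C_\nu(k, q)$. Setting $q = 1$ recovers the univariate constants of Table 4. The sketch below is one way to do this in plain Python. The function names, the dictionary layout, and the use of the $n - 1$ divisor for the sample standard deviation are assumptions made for illustration.

```python
import math


def double_factorial(m):
    """m!! for odd m (returns 1 for m <= 1)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def rule_of_thumb_constant(nu, kappa_nu, rough_k, q=1):
    """C_nu(k, q) for the normal reference density; q = 1 is the univariate case."""
    numerator = (math.pi ** (q / 2) * 2 ** (q + nu - 1)
                 * math.factorial(nu) ** 2 * rough_k ** q)
    denominator = nu * kappa_nu ** 2 * (
        double_factorial(2 * nu - 1) + (q - 1) * double_factorial(nu - 1) ** 2)
    return (numerator / denominator) ** (1.0 / (2 * nu + q))


def rule_of_thumb_bandwidth(data, nu, kappa_nu, rough_k):
    """Univariate h = sigma_hat * C_nu(k) * n^(-1 / (2 nu + 1))."""
    n = len(data)
    mean = sum(data) / n
    sigma_hat = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return sigma_hat * rule_of_thumb_constant(nu, kappa_nu, rough_k) * n ** (-1.0 / (2 * nu + 1))


# Second-order kernel constants (kappa_2, R(k)) taken from Table 1.
gauss = dict(nu=2, kappa_nu=1.0, rough_k=1 / (2 * math.sqrt(math.pi)))
epan = dict(nu=2, kappa_nu=0.2, rough_k=0.6)
c_gauss_q1 = rule_of_thumb_constant(q=1, **gauss)
c_gauss_q2 = rule_of_thumb_constant(q=2, **gauss)
c_epan_q1 = rule_of_thumb_constant(q=1, **epan)
c_epan_q2 = rule_of_thumb_constant(q=2, **epan)
```

Because only $\kappa_\nu^2$ enters the formula, the sign of $\kappa_\nu$ for higher-order kernels does not matter when calling `rule_of_thumb_constant`.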

2.12 Least-Squares Cross-Validation

Rule-of-thumb bandwidths are a useful starting point, but they are inflexible and can be far from optimal.

Plug-in methods take the formula for the optimal bandwidth, and replace the unknowns by estimates, e.g. $R(\hat f^{(\nu)})$. But these initial estimates themselves depend on bandwidths. And each situation needs to be individually studied. Plug-in methods have been thoroughly studied for univariate density estimation, but are less well developed for multivariate density estimation and other contexts.

A flexible and generally applicable data-dependent method is cross-validation. This method attempts to make a direct estimate of the squared error, and pick the bandwidth which minimizes this estimate. In many senses the idea is quite close to model selection based on an information criterion, such as Mallows or AIC.

Given a bandwidth $h$ and density estimate $\hat f(x)$ of $f(x)$, define the mean integrated squared error (MISE)
$$MISE(h) = \int\left(\hat f(x) - f(x)\right)^2(dx) = \int \hat f(x)^2\,(dx) - 2\int \hat f(x)f(x)\,(dx) + \int f(x)^2\,(dx).$$
Optimally, we want $\hat f(x)$ to be as close to $f(x)$ as possible, and thus for $MISE(h)$ to be as small as possible.

As $MISE(h)$ is unknown, cross-validation replaces it with an estimate. The goal is to find an estimate of $MISE(h)$, and find the $h$ which minimizes this estimate.

As the third term in the above expression does not depend on the bandwidth $h$, it can be ignored.

The first term can be directly calculated. For the univariate case
$$\int \hat f(x)^2\,dx = \int\left(\frac{1}{nh}\sum_{i=1}^n k\left(\frac{X_i - x}{h}\right)\right)^2 dx = \frac{1}{n^2h^2}\sum_{i=1}^n\sum_{j=1}^n \int k\left(\frac{X_i - x}{h}\right)k\left(\frac{X_j - x}{h}\right)dx.$$
The convolution of $k$ with itself is $\bar k(x) = \int k(u)\,k(x - u)\,du = \int k(u)\,k(u - x)\,du$ (by symmetry of $k$). Then making the change of variables $u = (X_i - x)/h$,
$$\frac{1}{h}\int k\left(\frac{X_i - x}{h}\right)k\left(\frac{X_j - x}{h}\right)dx = \int k(u)\,k\left(u - \frac{X_i - X_j}{h}\right)du = \bar k\left(\frac{X_i - X_j}{h}\right).$$

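Hence
$$\int \hat f(x)^2\,dx = \frac{1}{n^2h}\sum_{i=1}^n\sum_{j=1}^n \bar k\left(\frac{X_i - X_j}{h}\right).$$
Discussion of $\bar k(x)$ can be found in the following section.

In the multivariate case,
$$\int \hat f(x)^2\,dx = \frac{1}{n^2|H|}\sum_{i=1}^n\sum_{j=1}^n \bar K\left(H^{-1}(X_i - X_j)\right)$$
where $\bar K(u) = \bar k(u_1)\cdots\bar k(u_q)$.

The second term in the expression for $MISE(h)$ depends on $f(x)$ so is unknown and must be estimated. An integral with respect to $f(x)$ is an expectation with respect to the random variable $X_i$. While we don't know the true expectation, we have the sample, so can estimate this expectation by taking the sample average. In general, a reasonable estimate of the integral $\int g(x)f(x)\,dx$ is $\frac{1}{n}\sum_{i=1}^n g(X_i)$, suggesting the estimate $\frac{1}{n}\sum_{i=1}^n \hat f(X_i)$. In this case, however, the function $\hat f(x)$ is itself a function of the data. In particular, it is a function of the observation $X_i$. A way to clean this up is to replace $\hat f(X_i)$ with the leave-one-out estimate $\hat f_{-i}(X_i)$, where
$$\hat f_{-i}(x) = \frac{1}{(n-1)|H|}\sum_{j \ne i} K\left(H^{-1}(X_j - x)\right)$$
is the density estimate computed without observation $X_i$, and thus
$$\hat f_{-i}(X_i) = \frac{1}{(n-1)|H|}\sum_{j \ne i} K\left(H^{-1}(X_j - X_i)\right).$$
That is, $\hat f_{-i}(X_i)$ is the density estimate at $x = X_i$, computed with the observations except $X_i$. We end up suggesting to estimate $\int \hat f(x)f(x)\,dx$ with
$$\frac{1}{n}\sum_{i=1}^n \hat f_{-i}(X_i) = \frac{1}{n(n-1)|H|}\sum_{i=1}^n\sum_{j \ne i} K\left(H^{-1}(X_j - X_i)\right).$$
It turns out that this is an unbiased estimate, in the sense that
$$E\left(\frac{1}{n}\sum_{i=1}^n \hat f_{-i}(X_i)\right) = E\left(\int \hat f(x)f(x)\,dx\right).$$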

To see this, the LHS is
$$E \hat f_{-n}(X_n) = E\left(E\left(\hat f_{-n}(X_n) \mid X_1, \ldots, X_{n-1}\right)\right) = E\int \hat f_{-n}(x)f(x)\,(dx) = \int E\left(\hat f(x)\right)f(x)\,(dx) = E\int \hat f(x)f(x)\,(dx),$$
the second-to-last equality exchanging expectation and integration, and since $E\hat f(x)$ depends only on the bandwidth, not the sample size.

Together, the least-squares cross-validation criterion is
$$CV(h_1, \ldots, h_q) = \frac{1}{n^2|H|}\sum_{i=1}^n\sum_{j=1}^n \bar K\left(H^{-1}(X_i - X_j)\right) - \frac{2}{n(n-1)|H|}\sum_{i=1}^n\sum_{j \ne i} K\left(H^{-1}(X_j - X_i)\right).$$
Another way to write this is
$$CV(h_1, \ldots, h_q) = \frac{\bar K(0)}{n|H|} + \frac{1}{n^2|H|}\sum_{i=1}^n\sum_{j \ne i} \bar K\left(H^{-1}(X_i - X_j)\right) - \frac{2}{n(n-1)|H|}\sum_{i=1}^n\sum_{j \ne i} K\left(H^{-1}(X_j - X_i)\right)$$
$$\simeq \frac{R(k)^q}{n|H|} + \frac{1}{n^2|H|}\sum_{i=1}^n\sum_{j \ne i}\left(\bar K\left(H^{-1}(X_i - X_j)\right) - 2K\left(H^{-1}(X_j - X_i)\right)\right)$$
using $\bar K(0) = \bar k(0)^q$ and $\bar k(0) = \int k(u)^2\,du = R(k)$, and the approximation is by replacing $n - 1$ by $n$.

The cross-validation bandwidth vector is the value $\hat h_1, \ldots, \hat h_q$ which minimizes $CV(h_1, \ldots, h_q)$. The cross-validation function is a complicated function of the bandwidths, so this needs to be done numerically.

In the univariate case, $h$ is one-dimensional and this is typically done by plotting (a grid search). Pick a lower and upper value $[h_1, h_2]$, define a grid on this set, and compute $CV(h)$ for each $h$ in the grid. A plot of $CV(h)$ against $h$ is a useful diagnostic tool.

The $CV(h)$ function can be misleading for small values of $h$. This arises when there is data rounding. Some authors define the cross-validation bandwidth as the largest local minimizer of $CV(h)$ (rather than the global minimizer). This can also be avoided by picking a sensible initial range $[h_1, h_2]$. The rule-of-thumb bandwidth can be useful here. If $h_0$ is the rule-of-thumb bandwidth, then use $h_1 = h_0/3$ and $h_2 = 3h_0$ or similar.

As we discussed above, $CV(h_1, \ldots, h_q) + \int f(x)^2\,(dx)$ is an unbiased estimate of $MISE(h)$. This by itself does not mean that $\hat h$ is a good estimate of $h_0$, the minimizer of $MISE(h)$, but it

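turns out that this is indeed the case. That is,
$$\frac{\hat h - h_0}{h_0} \to_p 0.$$
Thus, $\hat h$ is asymptotically close to $h_0$, but the rate of convergence is very slow.

The CV method is quite flexible, as it can be applied for any kernel function.

If the goal, however, is estimation of density derivatives, then the CV bandwidth $\hat h$ is not appropriate. A practical solution is the following. Recall that the asymptotically optimal bandwidth for estimation of the density takes the form $h_0 = C_\nu(k, f)\,n^{-1/(2\nu+1)}$ and that for the $r$th derivative is $h_r = C_{r,\nu}(k, f)\,n^{-1/(1+2r+2\nu)}$. Thus if the CV bandwidth $\hat h$ is an estimate of $h_0$, we can estimate $C_\nu(k, f)$ by $\hat C_\nu = \hat h\,n^{1/(2\nu+1)}$. We also saw (at least for the normal reference family) that $C_{r,\nu}(k, f)$ was relatively constant across $r$. Thus we can replace $C_{r,\nu}(k, f)$ with $\hat C_\nu$ to find
$$\hat h_r = \hat C_\nu\,n^{-1/(1+2r+2\nu)} = \hat h\,n^{1/(2\nu+1) - 1/(1+2r+2\nu)} = \hat h\,n^{\left((1+2r+2\nu) - (2\nu+1)\right)/\left((2\nu+1)(1+2r+2\nu)\right)} = \hat h\,n^{2r/\left((2\nu+1)(1+2r+2\nu)\right)}.$$
Alternatively, some authors use the rescaling
$$\hat h_r = \hat h^{(1+2\nu)/(1+2r+2\nu)}.$$

2.13 Convolution Kernels

If $k(x) = \phi(x)$ then $\bar k(x) = \exp(-x^2/4)/\sqrt{4\pi}$.

When $k(x)$ is a higher-order Gaussian kernel, Wand and Schucany (Canadian Journal of Statistics, 1990, p. 201) give an expression for $\bar k(x)$.

For the polynomial class, because the kernel $k(u)$ has support on $[-1, 1]$, it follows that $\bar k(x)$ has support on $[-2, 2]$ and for $x \ge 0$ equals $\bar k(x) = \int_{x-1}^{1} k(u)\,k(x - u)\,du$. This integral can be easily solved using algebraic software (Maple, Mathematica), but the expression can be rather cumbersome.

For the 2nd order Epanechnikov, Biweight and Triweight kernels, for $0 \le x \le 2$,
$$\bar k_1(x) = \frac{3}{160}(2 - x)^3\left(x^2 + 6x + 4\right)$$
$$\bar k_2(x) = \frac{5}{3584}(2 - x)^5\left(x^4 + 10x^3 + 36x^2 + 40x + 16\right)$$
$$\bar k_3(x) = \frac{35}{1757184}(2 - x)^7\left(5x^6 + 70x^5 + 404x^4 + 1176x^3 + 1616x^2 + 1120x + 320\right)$$
These functions are symmetric, so the values for $x < 0$ are found by $\bar k(x) = \bar k(-x)$.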
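Putting the univariate cross-validation criterion of Section 2.12 together with the Gaussian convolution kernel $\bar k(x) = \exp(-x^2/4)/\sqrt{4\pi}$ gives a short grid-search routine. The sketch below is one possible implementation under those assumptions. The simulated sample, the seed, the grid size, and the function names are all illustrative choices. The search range $[h_0/3, 3h_0]$ around the rule-of-thumb bandwidth follows the suggestion made in Section 2.12.

```python
import math
import random


def phi(u):
    """Standard normal density (second-order Gaussian kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def phi_bar(u):
    """Convolution of phi with itself, i.e. the N(0, 2) density."""
    return math.exp(-0.25 * u * u) / math.sqrt(4 * math.pi)


def cv_criterion(h, data):
    """Univariate least-squares cross-validation criterion CV(h)."""
    n = len(data)
    first = sum(phi_bar((xi - xj) / h) for xi in data for xj in data) / (n * n * h)
    leave_one_out = sum(phi((xj - xi) / h)
                        for i, xi in enumerate(data)
                        for j, xj in enumerate(data) if j != i) / (n * (n - 1) * h)
    return first - 2.0 * leave_one_out


def cv_bandwidth(data, h_low, h_high, grid_size=40):
    """Grid search for the minimizer of CV(h) on [h_low, h_high]."""
    step = (h_high - h_low) / (grid_size - 1)
    grid = [h_low + i * step for i in range(grid_size)]
    values = [cv_criterion(h, data) for h in grid]
    best = min(range(grid_size), key=values.__getitem__)
    return grid[best], grid, values


random.seed(0)                                    # illustrative simulated sample
sample = [random.gauss(0.0, 1.0) for _ in range(100)]
n = len(sample)
mean = sum(sample) / n
sigma_hat = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
h_rot = 1.06 * sigma_hat * n ** (-0.2)            # rule-of-thumb starting point
h_cv, grid, values = cv_bandwidth(sample, h_rot / 3, 3 * h_rot)
```

Plotting `values` against `grid` gives the diagnostic plot of $CV(h)$. The double sum makes each evaluation $O(n^2)$, so for large samples a coarser grid or a binned approximation may be preferable.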

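For the 4th and 6th order Epanechnikov kernels, for $0 \le x \le 2$,
$$\bar k_{4,1}(x) = \frac{5}{2048}(2 - x)^3\left(7x^6 + 42x^5 + 48x^4 - 160x^3 - 144x^2 + 96x + 64\right)$$
$$\bar k_{6,1}(x) = \frac{105}{3407872}(2 - x)^3\left(495x^{10} + 2970x^9 + 2052x^8 - 19368x^7 - 32624x^6 + 53088x^5 + 68352x^4 - 48640x^3 - 46720x^2 + 11520x + 7680\right)$$

2.14 Asymptotic Normality

The kernel estimator is the sample average
$$\hat f(x) = \frac{1}{n}\sum_{i=1}^n \frac{1}{|H|}K\left(H^{-1}(X_i - x)\right).$$
We can therefore apply the central limit theorem.

But the convergence rate is not $\sqrt{n}$. We know that
$$var\left(\hat f(x)\right) = \frac{f(x)R(k)^q}{nh_1h_2\cdots h_q} + O\left(\frac{1}{n}\right),$$
so the convergence rate is $\sqrt{nh_1h_2\cdots h_q}$. When we apply the CLT we scale by this, rather than the conventional $\sqrt{n}$.

As the estimator is biased, we also center at its expectation, rather than the true value. Thus
$$\sqrt{nh_1h_2\cdots h_q}\left(\hat f(x) - E\hat f(x)\right) = \frac{\sqrt{nh_1h_2\cdots h_q}}{n}\sum_{i=1}^n\left(\frac{1}{|H|}K\left(H^{-1}(X_i - x)\right) - E\,\frac{1}{|H|}K\left(H^{-1}(X_i - x)\right)\right)$$
$$= \frac{\sqrt{h_1h_2\cdots h_q}}{\sqrt{n}}\sum_{i=1}^n\left(\frac{1}{|H|}K\left(H^{-1}(X_i - x)\right) - E\,\frac{1}{|H|}K\left(H^{-1}(X_i - x)\right)\right) = \frac{1}{\sqrt{n}}\sum_{i=1}^n Z_{ni}$$
where
$$Z_{ni} = \sqrt{h_1h_2\cdots h_q}\left(\frac{1}{|H|}K\left(H^{-1}(X_i - x)\right) - E\,\frac{1}{|H|}K\left(H^{-1}(X_i - x)\right)\right).$$
We see that
$$var(Z_{ni}) \simeq f(x)R(k)^q.$$
Hence by the CLT,
$$\sqrt{nh_1h_2\cdots h_q}\left(\hat f(x) - E\hat f(x)\right) \to_d N\left(0, f(x)R(k)^q\right).$$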

We also know that
$$E(\hat f(x)) = f(x) + \frac{\kappa_\nu(k)}{\nu!}\sum_{j=1}^q \frac{\partial^\nu}{\partial x_j^\nu}f(x)\,h_j^\nu + o\left(h_1^\nu + \cdots + h_q^\nu\right).$$
So another way of writing this is
$$\sqrt{nh_1h_2\cdots h_q}\left(\hat f(x) - f(x) - \frac{\kappa_\nu(k)}{\nu!}\sum_{j=1}^q \frac{\partial^\nu}{\partial x_j^\nu}f(x)\,h_j^\nu\right) \to_d N\left(0, f(x)R(k)^q\right).$$
In the univariate case this is
$$\sqrt{nh}\left(\hat f(x) - f(x) - \frac{\kappa_\nu(k)}{\nu!}f^{(\nu)}(x)h^\nu\right) \to_d N\left(0, f(x)R(k)\right).$$
This expression is most useful when the bandwidth is selected to be of optimal order, that is $h = Cn^{-1/(2\nu+1)}$, for then $\sqrt{nh}\,h^\nu = C^{\nu+1/2}$ and we have the equivalent statement
$$\sqrt{nh}\left(\hat f(x) - f(x)\right) \to_d N\left(C^{\nu+1/2}\frac{\kappa_\nu(k)}{\nu!}f^{(\nu)}(x),\; f(x)R(k)\right).$$
This says that the density estimator is asymptotically normal, with a non-zero asymptotic bias and variance.

Some authors play a dirty trick, by using the assumption that $h$ is of smaller order than the optimal rate, e.g. $h = o\left(n^{-1/(2\nu+1)}\right)$. For then they obtain the result
$$\sqrt{nh}\left(\hat f(x) - f(x)\right) \to_d N\left(0, f(x)R(k)\right).$$
This appears much nicer. The estimator is asymptotically normal, with mean zero! There are several costs. One, if the bandwidth is really selected to be sub-optimal, the estimator is simply less precise. A sub-optimal bandwidth results in a slower convergence rate. This is not a good thing. The reduction in bias is obtained at an increase in variance. Another cost is that the asymptotic distribution is misleading. It suggests that the estimator is unbiased, which is not honest. Finally, it is unclear how to pick this sub-optimal bandwidth. I call this assumption a dirty trick, because it is slipped in by authors to make their results cleaner and derivations easier. This type of assumption should be avoided.

2.15 Pointwise Confidence Intervals

The asymptotic distribution may be used to construct pointwise confidence intervals for $f(x)$. In the univariate case conventional confidence intervals take the form
$$\hat f(x) \pm 2\left(\frac{\hat f(x)R(k)}{nh}\right)^{1/2}.$$

These are not necessarily the best choice, since the variance is proportional to the mean. This set has the unfortunate property that it can contain negative values, for example.

Instead, consider constructing the confidence interval by inverting a test statistic. To test $H_0 : f(x) = f_0$, a t-ratio is

$$t(f_0) = \frac{\hat f(x) - f_0}{\sqrt{f_0 R(k) / (nh)}}.$$

We reject $H_0$ if $|t(f_0)| > 2$. By the no-rejection rule, an asymptotic 95% confidence interval for $f$ is the set of $f_0$ which do not reject, i.e. the set of $f$ such that $|t(f)| \le 2$. This is

$$C(x) = \left\{ f : \left| \frac{\hat f(x) - f}{\sqrt{f R(k) / (nh)}} \right| \le 2 \right\}.$$

This set must be found numerically.
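As a rough illustration of the two constructions, the sketch below computes the conventional interval and then finds the inverted set by a plain grid search over candidate values of $f$. It assumes a univariate Gaussian kernel, for which $R(k) = 1/(2\sqrt{\pi})$. The function names, the grid size, the search range and the example inputs are all hypothetical choices made for this sketch, not part of the notes.

```python
import math

# Roughness R(k) = integral of k(u)^2 du; this value is for the Gaussian kernel.
R_GAUSS = 1.0 / (2.0 * math.sqrt(math.pi))

def conventional_interval(f_hat, n, h, roughness=R_GAUSS):
    """f_hat +/- 2 * sqrt(f_hat * R(k) / (n h)); the lower end may be negative."""
    half_width = 2.0 * math.sqrt(f_hat * roughness / (n * h))
    return f_hat - half_width, f_hat + half_width

def t_ratio(f0, f_hat, n, h, roughness=R_GAUSS):
    """t(f0) = (f_hat - f0) / sqrt(f0 * R(k) / (n h)), defined for f0 > 0."""
    return (f_hat - f0) / math.sqrt(f0 * roughness / (n * h))

def inverted_interval(f_hat, n, h, roughness=R_GAUSS, grid_size=200_000):
    """Grid search for the end points of the set {f > 0 : |t(f)| <= 2}."""
    # Generous upper end for the search grid (an illustrative choice).
    search_max = (conventional_interval(f_hat, n, h, roughness)[1]
                  + 4.0 * roughness / (n * h))
    lower, upper = None, None
    for j in range(1, grid_size + 1):
        f0 = search_max * j / grid_size
        if abs(t_ratio(f0, f_hat, n, h, roughness)) <= 2.0:
            if lower is None:
                lower = f0  # first accepted grid point
            upper = f0      # last accepted grid point so far
    if lower is None:
        raise ValueError("no grid point satisfies |t(f)| <= 2; refine the grid")
    return lower, upper

if __name__ == "__main__":
    # Made-up inputs: a small density estimate with a modest effective sample size.
    f_hat, n, h = 0.01, 100, 0.5
    print("conventional:", conventional_interval(f_hat, n, h))
    print("inverted:    ", inverted_interval(f_hat, n, h))
```

With these illustrative inputs the conventional lower end falls below zero, while the inverted set contains only positive values because the search is restricted to $f > 0$.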

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study. Normal and t Distributions. Density Plot. Normal Distributions Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

1. C. The formula for the confidence interval for a population mean is: x t, which was

1. C. The formula for the confidence interval for a population mean is: x t, which was s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value

More information

Determining the sample size

Determining the sample size Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors

More information

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

Hypothesis testing using complex survey data

Hypothesis testing using complex survey data Hypotesis testig usig complex survey data A Sort Course preseted by Peter Ly, Uiversity of Essex i associatio wit te coferece of te Europea Survey Researc Associatio Prague, 5 Jue 007 1 1. Objective: Simple

More information

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals Overview Estimatig the Value of a Parameter Usig Cofidece Itervals We apply the results about the sample mea the problem of estimatio Estimatio is the process of usig sample data estimate the value of

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

5: Introduction to Estimation

5: Introduction to Estimation 5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample

More information

Exploratory Data Analysis

Exploratory Data Analysis 1 Exploratory Data Aalysis Exploratory data aalysis is ofte the rst step i a statistical aalysis, for it helps uderstadig the mai features of the particular sample that a aalyst is usig. Itelliget descriptios

More information

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval

More information

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown Z-TEST / Z-STATISTIC: used to test hypotheses about µ whe the populatio stadard deviatio is kow ad populatio distributio is ormal or sample size is large T-TEST / T-STATISTIC: used to test hypotheses about

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chi-square (χ ) distributio.

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

Theorems About Power Series

Theorems About Power Series Physics 6A Witer 20 Theorems About Power Series Cosider a power series, f(x) = a x, () where the a are real coefficiets ad x is a real variable. There exists a real o-egative umber R, called the radius

More information

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the. Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

WHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER?

WHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER? WHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER? JÖRG JAHNEL 1. My Motivatio Some Sort of a Itroductio Last term I tought Topological Groups at the Göttige Georg August Uiversity. This

More information

Our aim is to show that under reasonable assumptions a given 2π-periodic function f can be represented as convergent series

Our aim is to show that under reasonable assumptions a given 2π-periodic function f can be represented as convergent series 8 Fourier Series Our aim is to show that uder reasoable assumptios a give -periodic fuctio f ca be represeted as coverget series f(x) = a + (a cos x + b si x). (8.) By defiitio, the covergece of the series

More information

A Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as:

A Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as: A Test of Normality Textbook Referece: Chapter. (eighth editio, pages 59 ; seveth editio, pages 6 6). The calculatio of p values for hypothesis testig typically is based o the assumptio that the populatio

More information

A Recursive Formula for Moments of a Binomial Distribution

A Recursive Formula for Moments of a Binomial Distribution A Recursive Formula for Momets of a Biomial Distributio Árpád Béyi beyi@mathumassedu, Uiversity of Massachusetts, Amherst, MA 01003 ad Saverio M Maago smmaago@psavymil Naval Postgraduate School, Moterey,

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

Universal coding for classes of sources

Universal coding for classes of sources Coexios module: m46228 Uiversal codig for classes of sources Dever Greee This work is produced by The Coexios Project ad licesed uder the Creative Commos Attributio Licese We have discussed several parametric

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1)

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1) BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS.. Radom Sample. The radom variables X,X 2,..., X are called a radom sample of size from the populatio f(x if X,X 2,..., X are mutually idepedet

More information

Solving Logarithms and Exponential Equations

Solving Logarithms and Exponential Equations Solvig Logarithms ad Epoetial Equatios Logarithmic Equatios There are two major ideas required whe solvig Logarithmic Equatios. The first is the Defiitio of a Logarithm. You may recall from a earlier topic:

More information

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal)

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal) 6 Parametric (theoretical) probability distributios. (Wilks, Ch. 4) Note: parametric: assume a theoretical distributio (e.g., Gauss) No-parametric: o assumptio made about the distributio Advatages of assumig

More information

1. MATHEMATICAL INDUCTION

1. MATHEMATICAL INDUCTION 1. MATHEMATICAL INDUCTION EXAMPLE 1: Prove that for ay iteger 1. Proof: 1 + 2 + 3 +... + ( + 1 2 (1.1 STEP 1: For 1 (1.1 is true, sice 1 1(1 + 1. 2 STEP 2: Suppose (1.1 is true for some k 1, that is 1

More information

A STRATIFIED SAMPLING PLAN FOR BILLING ACCURACY IN HEALTHCARE SYSTEMS

A STRATIFIED SAMPLING PLAN FOR BILLING ACCURACY IN HEALTHCARE SYSTEMS A STRATIFIED SAMPLING PLAN FOR BILLING ACCURACY IN HEALTHCARE SYSTEMS Jiracai Buddakulsomsiri a Partaa Partaadee b Swatatra Kacal a a Departmet of Idustrial ad Maufacturig Systems Egieerig, Uiversity of

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here). BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

Estimating Probability Distributions by Observing Betting Practices

Estimating Probability Distributions by Observing Betting Practices 5th Iteratioal Symposium o Imprecise Probability: Theories ad Applicatios, Prague, Czech Republic, 007 Estimatig Probability Distributios by Observig Bettig Practices Dr C Lych Natioal Uiversity of Irelad,

More information

THE HEIGHT OF q-binary SEARCH TREES

THE HEIGHT OF q-binary SEARCH TREES THE HEIGHT OF q-binary SEARCH TREES MICHAEL DRMOTA AND HELMUT PRODINGER Abstract. q biary search trees are obtaied from words, equipped with the geometric distributio istead of permutatios. The average

More information

Quadrat Sampling in Population Ecology

Quadrat Sampling in Population Ecology Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may

More information

Chapter 7: Confidence Interval and Sample Size

Chapter 7: Confidence Interval and Sample Size Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum

More information

Sampling Distribution And Central Limit Theorem

Sampling Distribution And Central Limit Theorem () Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

Section 11.3: The Integral Test

Section 11.3: The Integral Test Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult

More information

Building Blocks Problem Related to Harmonic Series

Building Blocks Problem Related to Harmonic Series TMME, vol3, o, p.76 Buildig Blocks Problem Related to Harmoic Series Yutaka Nishiyama Osaka Uiversity of Ecoomics, Japa Abstract: I this discussio I give a eplaatio of the divergece ad covergece of ifiite

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized?

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized? 5.4 Amortizatio Questio 1: How do you fid the preset value of a auity? Questio 2: How is a loa amortized? Questio 3: How do you make a amortizatio table? Oe of the most commo fiacial istrumets a perso

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

THE ABRACADABRA PROBLEM

THE ABRACADABRA PROBLEM THE ABRACADABRA PROBLEM FRANCESCO CARAVENNA Abstract. We preset a detailed solutio of Exercise E0.6 i [Wil9]: i a radom sequece of letters, draw idepedetly ad uiformly from the Eglish alphabet, the expected

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

Chapter 14 Nonparametric Statistics

Chapter 14 Nonparametric Statistics Chapter 14 Noparametric Statistics A.K.A. distributio-free statistics! Does ot deped o the populatio fittig ay particular type of distributio (e.g, ormal). Sice these methods make fewer assumptios, they

More information

TO: Users of the ACTEX Review Seminar on DVD for SOA Exam MLC

TO: Users of the ACTEX Review Seminar on DVD for SOA Exam MLC TO: Users of the ACTEX Review Semiar o DVD for SOA Eam MLC FROM: Richard L. (Dick) Lodo, FSA Dear Studets, Thak you for purchasig the DVD recordig of the ACTEX Review Semiar for SOA Eam M, Life Cotigecies

More information

Systems Design Project: Indoor Location of Wireless Devices

Systems Design Project: Indoor Location of Wireless Devices Systems Desig Project: Idoor Locatio of Wireless Devices Prepared By: Bria Murphy Seior Systems Sciece ad Egieerig Washigto Uiversity i St. Louis Phoe: (805) 698-5295 Email: bcm1@cec.wustl.edu Supervised

More information

Math C067 Sampling Distributions

Math C067 Sampling Distributions Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters

More information

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find 1.8 Approximatig Area uder a curve with rectagles 1.6 To fid the area uder a curve we approximate the area usig rectagles ad the use limits to fid 1.4 the area. Example 1 Suppose we wat to estimate 1.

More information

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean 1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.

More information

Modified Line Search Method for Global Optimization

Modified Line Search Method for Global Optimization Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

arxiv:1506.03481v1 [stat.me] 10 Jun 2015

arxiv:1506.03481v1 [stat.me] 10 Jun 2015 BEHAVIOUR OF ABC FOR BIG DATA By Wetao Li ad Paul Fearhead Lacaster Uiversity arxiv:1506.03481v1 [stat.me] 10 Ju 2015 May statistical applicatios ivolve models that it is difficult to evaluate the likelihood,

More information

Basic Elements of Arithmetic Sequences and Series

Basic Elements of Arithmetic Sequences and Series MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic

More information

Confidence Intervals

Confidence Intervals Cofidece Itervals Cofidece Itervals are a extesio of the cocept of Margi of Error which we met earlier i this course. Remember we saw: The sample proportio will differ from the populatio proportio by more

More information

3. Greatest Common Divisor - Least Common Multiple

3. Greatest Common Divisor - Least Common Multiple 3 Greatest Commo Divisor - Least Commo Multiple Defiitio 31: The greatest commo divisor of two atural umbers a ad b is the largest atural umber c which divides both a ad b We deote the greatest commo gcd

More information

Partial Di erential Equations

Partial Di erential Equations Partial Di eretial Equatios Partial Di eretial Equatios Much of moder sciece, egieerig, ad mathematics is based o the study of partial di eretial equatios, where a partial di eretial equatio is a equatio

More information

THE TWO-VARIABLE LINEAR REGRESSION MODEL

THE TWO-VARIABLE LINEAR REGRESSION MODEL THE TWO-VARIABLE LINEAR REGRESSION MODEL Herma J. Bieres Pesylvaia State Uiversity April 30, 202. Itroductio Suppose you are a ecoomics or busiess maor i a college close to the beach i the souther part

More information

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009) 18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

More information

THE problem of fitting a circle to a collection of points

THE problem of fitting a circle to a collection of points IEEE TRANACTION ON INTRUMENTATION AND MEAUREMENT, VOL. XX, NO. Y, MONTH 000 A Few Methods for Fittig Circles to Data Dale Umbach, Kerry N. Joes Abstract Five methods are discussed to fit circles to data.

More information

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork Solutios to Selected Problems I: Patter Classificatio by Duda, Hart, Stork Joh L. Weatherwax February 4, 008 Problem Solutios Chapter Bayesia Decisio Theory Problem radomized rules Part a: Let Rx be the

More information

Lecture 3. denote the orthogonal complement of S k. Then. 1 x S k. n. 2 x T Ax = ( ) λ x. with x = 1, we have. i = λ k x 2 = λ k.

Lecture 3. denote the orthogonal complement of S k. Then. 1 x S k. n. 2 x T Ax = ( ) λ x. with x = 1, we have. i = λ k x 2 = λ k. 18.409 A Algorithmist s Toolkit September 17, 009 Lecture 3 Lecturer: Joatha Keler Scribe: Adre Wibisoo 1 Outlie Today s lecture covers three mai parts: Courat-Fischer formula ad Rayleigh quotiets The

More information

ARTICLE IN PRESS. Statistics & Probability Letters ( ) A Kolmogorov-type test for monotonicity of regression. Cecile Durot

ARTICLE IN PRESS. Statistics & Probability Letters ( ) A Kolmogorov-type test for monotonicity of regression. Cecile Durot STAPRO 66 pp: - col.fig.: il ED: MG PROD. TYPE: COM PAGN: Usha.N -- SCAN: il Statistics & Probability Letters 2 2 2 2 Abstract A Kolmogorov-type test for mootoicity of regressio Cecile Durot Laboratoire

More information

Factoring x n 1: cyclotomic and Aurifeuillian polynomials Paul Garrett <garrett@math.umn.edu>

Factoring x n 1: cyclotomic and Aurifeuillian polynomials Paul Garrett <garrett@math.umn.edu> (March 16, 004) Factorig x 1: cyclotomic ad Aurifeuillia polyomials Paul Garrett Polyomials of the form x 1, x 3 1, x 4 1 have at least oe systematic factorizatio x 1 = (x 1)(x 1

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

More information

4.3. The Integral and Comparison Tests

4.3. The Integral and Comparison Tests 4.3. THE INTEGRAL AND COMPARISON TESTS 9 4.3. The Itegral ad Compariso Tests 4.3.. The Itegral Test. Suppose f is a cotiuous, positive, decreasig fuctio o [, ), ad let a = f(). The the covergece or divergece

More information
