Proceedings of the 3rd WSEAS Int. Conf. on RENEWABLE ENERGY SOURCES
ISSN: 1790-5095, ISBN: 978-960-474-093-2

On The Comparison of Several Goodness of Fit Tests: With Application to Wind Speed Data

FAZNA ASHAHABUDDIN, KAMARULZAMAN IBRAHIM, AND ABDUL AZIZ JEMAIN
School of Mathematical Sciences, Faculty of Science and Technology,
Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor
MALAYSIA
azna@ukm.my, kamarulz@ukm.my & azizj@ukm.my

Abstract: - In this paper a study is conducted to investigate the power of several goodness of fit tests, namely Kolmogorov-Smirnov (KS), Anderson-Darling (AD), Cramer-von-Mises (CVM) and a proposed modification of the Kolmogorov-Smirnov goodness of fit test which incorporates a variance stabilizing transformation (F). The performances of these selected tests are studied and applied using wind speed data. This study shows that the proposed test (F) performs better than the other GOF tests.

Key-Words: - Empirical distribution function, goodness-of-fit, order statistics, wind speed, GEV

1 Introduction
Assessing the goodness of fit of a proposed probability model is a fundamental concern in the application of statistical methods. Goodness of fit tests have many applications in applied statistics, and many works have been carried out to compare the efficiency of several goodness of fit test procedures. Goodness of fit (GOF) tests measure the degree of agreement between the distribution of observed sample data and a theoretical statistical distribution. The problem involves a comparison of the empirical distribution function (EDF) for a set of ordered observations of size $n$, say $F_n(y_{(i:n)})$, with a particular theoretical distribution with known parameters, denoted as $F_0(y_{(i:n)})$. The problem can be formulated as a test of the hypothesis

$$H_0: F(y) = F_0(y_{(i:n)})$$

where $F_0$ is the hypothesized continuous cumulative distribution function (cdf) with known parameters, against

$$H_1: F(y) \neq F_0(y_{(i:n)})$$

Numerous GOF test methods based on the empirical distribution function have been developed over the years by various researchers; see, for example, Green and Hegazy [1], Stephens and D'Agostino [2], Gunes et al. [3], Swanepoel et al. [4], Zhang [5] and Zhang & Wu [6]. Green and Hegazy [1] proposed some modifications of the original EDF tests using various empirical distributions $F_n(y_{(i:n)})$ and found that their modifications perform better than the original GOF tests. Gunes et al. [3] studied the performance of their proposed modifications for the Inverse Gaussian distribution and, quite recently, Zhang [5] proposed modifications based on the likelihood ratio. Zhang [5] claimed that his tests are the best overall under several popular alternative distributions.

In this paper, the performance of several goodness of fit tests, namely Kolmogorov-Smirnov (KS), Anderson-Darling (AD), Cramer-von-Mises (CVM) and Zhang's [5] modified version of the KS test ($Z_K$), is investigated. In addition, a modified goodness of fit test which incorporates a variance stabilizing transformation (F) is proposed. The performances of these selected tests are illustrated using wind speed data.

2 Goodness of Fit Test
In order to test whether or not a random sample of size $n$, denoted as $y_1, y_2, \ldots, y_n$, comes from a particular distribution, for example a standard normal $N(0,1)$, the null hypothesis $H_0: F(y) = N(0,1)$ is tested against the alternative hypothesis $H_1: F(y) \neq N(0,1)$. To study the degree of discrepancy between the EDF and the theoretical distribution, various GOF statistics have been proposed in the literature. The GOF tests that are of particular interest in this study include the Kolmogorov-Smirnov (KS), Anderson-Darling (AD), Cramer-von-Mises (CVM) as given by [2],
Zhang's test ($Z_K$) [5] and a proposed modification of KS which incorporates the variance stabilizing transformation (F).

The popular KS test is defined as

$$KS = \max(D^{+}, D^{-}) \tag{1}$$

where

$$D^{+} = \max_{1 \le i \le n}\left[\frac{i}{n} - F_0(y_{(i:n)})\right], \qquad D^{-} = \max_{1 \le i \le n}\left[F_0(y_{(i:n)}) - \frac{i-1}{n}\right]$$

and $F_0(y_{(i:n)}) = P(Y \le y_{(i:n)})$, $i = 1, 2, \ldots, n$, is the cumulative probability of the $i$-th order statistic. The respective Anderson-Darling and Cramer-von-Mises tests are defined as

$$AD = -n - 2\sum_{i=1}^{n}\left[\frac{i-0.5}{n}\log F_0(y_{(i:n)}) + \left(1 - \frac{i-0.5}{n}\right)\log\left(1 - F_0(y_{(i:n)})\right)\right] \tag{2}$$

and

$$CVM = \sum_{i=1}^{n}\left[F_0(y_{(i:n)}) - \frac{i-0.5}{n}\right]^2 + \frac{1}{12n} \tag{3}$$

The test proposed by [5] is given by

$$Z_K = \max_{1 \le i \le n}\left[\left(i - \tfrac{1}{2}\right)\log\frac{i - \tfrac{1}{2}}{n F_0(y_{(i:n)})} + \left(n - i + \tfrac{1}{2}\right)\log\frac{n - i + \tfrac{1}{2}}{n\left(1 - F_0(y_{(i:n)})\right)}\right] \tag{4}$$

An alternative GOF test is offered besides the statistics shown in equations (1) to (4). The proposed modified statistic, which applies a variance stabilizing transformation to the modified KS statistic of Green and Hegazy [1] and is called F, is defined as

$$F = \max(D_a, D_b) \tag{5}$$

where $D_a$ and $D_b$ are the maxima over $i = 1, \ldots, n$ of the discrepancies between the $\sin^{-1}$-transformed hypothesized probabilities $F_0(y_{(i:n)})$ and the correspondingly transformed empirical plotting positions.

The performances of the above tests are investigated using simulations. To test the hypothesis, the following simulation steps are taken:
(i) Generate a random sample of size $n$ from the selected distribution under the null hypothesis (the parameters are assumed known).
(ii) Sort the sample in ascending order to obtain the order statistics $y_{(1:n)}, y_{(2:n)}, \ldots, y_{(n:n)}$.
(iii) Assuming that the null hypothesis is true, calculate $F_0(y_{(i:n)})$.
(iv) To study the degree of discrepancy between the EDF, i.e. $F_n(y_{(i:n)})$, and $F_0(y_{(i:n)})$, calculate the GOF statistics in equations (1) to (5).
(v) Repeat steps (i) to (iv) 5… times to generate 5… independent test statistics for each GOF test.
(vi) Obtain the 90th, 95th and 99th percentiles of the ordered sample of test statistics under the null hypothesis. These values are the critical values against which the test statistics obtained under the assumed alternative hypotheses are compared.
(vii) Evaluate the performance of the test statistics based on the power of each test. The power is calculated by determining the proportion of times the test statistic, computed under the assumption that the alternative hypothesis is true, falls in the rejection region.

For calculating the power of the test, the simulation is repeated 1,… times. The power of each test is the proportion of times the null hypothesis is rejected.

3 Example using Wind speed data
To illustrate the above procedures we use wind speed data taken from [8]. The data are shown in Table 1 below.

Table 1: Annual maximum wind speed data in miles per hour, Brownsville, Texas, 1947-1977
32, 33, 34, 34, 35, 36, 37, 37, 38, 38, 39, 39, 40, 40, 41, 41, 42, 42, 43, 43, 43, 44, 44, 46, 46, 48, 48, 49, 51, 53, 53, 53, 56, 63, 66

The quantile-quantile plot in Fig. 1 is close to a straight line, which indicates that the GEV distribution fits Brownsville's wind speed data closely.
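The statistics in equations (1) to (4) and simulation steps (i) to (vi) of Section 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard textbook forms of the Kolmogorov-Smirnov, Anderson-Darling, Cramer-von-Mises and Zhang statistics, all function names are hypothetical, and the repetition count `reps` is a free parameter. It also relies on the fact that, under the null hypothesis with known parameters, the values $F_0(y)$ are uniform on (0, 1), so uniform samples stand in for steps (i) and (iii). The proposed statistic of equation (5) is not included.

```python
import numpy as np

def ks_stat(u):
    """Equation (1): max(D+, D-). u holds the sorted values F0(y_(i:n))."""
    n = len(u)
    i = np.arange(1, n + 1)
    return max(np.max(i / n - u), np.max(u - (i - 1) / n))

def ad_stat(u):
    """Equation (2): Anderson-Darling statistic."""
    n = len(u)
    w = (np.arange(1, n + 1) - 0.5) / n
    return -n - 2.0 * np.sum(w * np.log(u) + (1.0 - w) * np.log1p(-u))

def cvm_stat(u):
    """Equation (3): Cramer-von-Mises statistic."""
    n = len(u)
    w = (np.arange(1, n + 1) - 0.5) / n
    return np.sum((u - w) ** 2) + 1.0 / (12.0 * n)

def zk_stat(u):
    """Equation (4): Zhang's likelihood-ratio version of KS."""
    n = len(u)
    a = np.arange(1, n + 1) - 0.5   # i - 1/2
    b = n - a                       # n - i + 1/2
    return np.max(a * np.log(a / (n * u)) + b * np.log(b / (n * (1.0 - u))))

def null_critical_values(stat, n, reps, rng, levels=(0.90, 0.95, 0.99)):
    """Steps (i) to (vi): simulate the null distribution of a statistic
    and return its 90th, 95th and 99th percentiles."""
    sims = np.empty(reps)
    for r in range(reps):
        # Assumption: F0(Y) is Uniform(0, 1) under H0, so sorted uniforms
        # play the role of the sorted F0(y_(i:n)).
        u = np.sort(rng.uniform(size=n))
        sims[r] = stat(u)
    return np.quantile(sims, levels)

if __name__ == "__main__":
    rng = np.random.default_rng(2009)  # arbitrary seed, for reproducibility
    tests = [("KS", ks_stat), ("AD", ad_stat), ("CVM", cvm_stat), ("ZK", zk_stat)]
    for name, stat in tests:
        print(name, null_critical_values(stat, n=50, reps=2000, rng=rng))
```

Passing the statistic as a function keeps the simulation loop identical for every test, which mirrors the way the same steps are applied to each GOF statistic.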
Fig. 1 Quantile-Quantile Plot For Brownsville Wind Speed Data (axes: wind speed (miles/hr) and q(x))

The Generalized Extreme Value (GEV) distribution is given by

$$f(y) = \alpha^{-1} e^{-(1-k)x - e^{-x}} \tag{6}$$

where

$$x = \begin{cases} -k^{-1}\log\{1 - k(y - \xi)/\alpha\}, & k \neq 0 \\ (y - \xi)/\alpha, & k = 0 \end{cases}$$

with range $\xi + \alpha/k \le y < \infty$ if $k < 0$; $-\infty < y \le \xi + \alpha/k$ if $k > 0$; $-\infty < y < \infty$ if $k = 0$.

The location ($\xi$), scale ($\alpha$) and shape ($k$) parameters for the wind speed data are estimated using the L-moment method introduced by Hosking [7]. The estimated parameters are $\hat{\xi} = 39.78$, $\hat{\alpha} = 6.26$ and $\hat{k} = -0.037$. The data have mean = 43.63, variance = 449, L-skewness $\tau_3 = 0.1937$ and L-kurtosis $\tau_4 = 0.159$. The fitted distribution is shown in Fig. 2(a).

A simulation study was carried out to test the hypothesis $H_0: F(y) = GEV(\xi_0, \alpha_0, k_0)$ against $H_1: F(y) = GEV(\xi, \alpha, k)$, where $\xi$, $\alpha$ and $k$ are the location, scale and shape parameters respectively. The following alternative hypotheses are considered against the null hypothesis to allow for differences in location and scale in the alternative distributions:
(i) GEV($\xi = 39.78$, $\alpha = 5.5$, $k = -0.037$)
(ii) GEV($\xi = 39.78$, $\alpha = 4$, $k = -0.037$)
(iii) GEV($\xi = 42$, $\alpha = 6.26$, $k = -0.037$)
(iv) GEV($\xi = 45$, $\alpha = 6.26$, $k = -0.037$)
(v) GEV($\xi = 42$, $\alpha = 5.5$, $k = -0.037$)
(vi) GEV($\xi = 45$, $\alpha = 6$, $k = -0.037$)

The graphs in Fig. 2(a) to Fig. 2(c) show the shapes of the distribution under the alternative hypotheses in relation to the hypothesized distribution as the location and scale change. Figure 2(a) shows the changes in the shape of the distribution under the alternative hypotheses when the scale parameter varies. Figure 2(b) shows the distributions under the alternative hypotheses shifted to the right when the location varies. Figure 2(c) shows the shape of the distributions under the alternative hypotheses when both the location and scale parameters vary.

Fig. 2(a) Shape of the distribution under alternative hypotheses (i) and (ii). Curves: Ho: GEV(39.78, 6.26, -0.037); (i) Ha: GEV(39.78, 5.5, -0.037); (ii) Ha: GEV(39.78, 4, -0.037). Horizontal axis: wind speed (miles/hour), Brownsville.

Fig. 2(b) Shape of the distribution under alternative hypotheses (iii) and (iv). Curves: Ho: GEV(39.78, 6.26, -0.037); (iii) Ha: GEV(42, 6.26, -0.037); (iv) Ha: GEV(45, 6.26, -0.037). Horizontal axis: wind speed (miles/hour), Brownsville.

Fig. 2(c) Shape of the distribution under alternative hypotheses (v) and (vi). Curves: Ho: GEV(39.78, 6.26, -0.037); (v) Ha: GEV(42, 5.5, -0.037); (vi) Ha: GEV(45, 6, -0.037). Horizontal axis: wind speed (miles/hour), Brownsville.
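As an illustration of how the power of one test could be estimated for the GEV hypotheses of this section, the following Python sketch draws samples by inverting the GEV cdf and applies the KS statistic of equation (1). It is a sketch under assumptions, not the authors' code: the function names are hypothetical, the cdf $F(y) = \exp(-e^{-x})$ is the standard GEV form in Hosking's parameterization [7], the null parameters are read as (39.78, 6.26, -0.037), the 5% level is chosen for illustration, and the repetition count `reps` is a free parameter.

```python
import numpy as np

def gev_quantile(p, xi, alpha, k):
    """Inverse cdf of the GEV with location xi, scale alpha, shape k."""
    x = -np.log(-np.log(p))  # reduced variate, from F = exp(-exp(-x))
    if k == 0.0:
        return xi + alpha * x
    return xi + alpha * (1.0 - np.exp(-k * x)) / k

def gev_cdf(y, xi, alpha, k):
    """GEV cdf F(y) = exp(-exp(-x)), x defined as in equation (6)."""
    y = np.asarray(y, dtype=float)
    if k == 0.0:
        return np.exp(-np.exp(-(y - xi) / alpha))
    # t equals exp(-k x); clipping at 0 handles points outside the support
    t = np.maximum(1.0 - k * (y - xi) / alpha, 0.0)
    with np.errstate(divide="ignore"):
        return np.exp(-t ** (1.0 / k))

def ks_rows(u):
    """KS statistic of equation (1) for each row of sorted probabilities."""
    n = u.shape[1]
    i = np.arange(1, n + 1)
    return np.maximum((i / n - u).max(axis=1), (u - (i - 1) / n).max(axis=1))

def ks_power(null, alt, n, reps, rng, level=0.95):
    """Estimate the power of the KS test of H0: GEV(*null) when the data
    actually come from GEV(*alt)."""
    def simulate(params):
        # Sample from GEV(*params), then evaluate F0 under the null.
        y = gev_quantile(rng.uniform(size=(reps, n)), *params)
        u = np.sort(gev_cdf(y, *null), axis=1)
        return ks_rows(u)

    critical = np.quantile(simulate(null), level)      # null critical value
    return float(np.mean(simulate(alt) > critical))    # rejection rate

if __name__ == "__main__":
    rng = np.random.default_rng(2009)  # arbitrary seed
    null = (39.78, 6.26, -0.037)       # assumed reading of the fitted GEV
    alt = (45.0, 6.26, -0.037)         # alternative (iv), location shift
    for n in (50, 100, 150):
        print(n, ks_power(null, alt, n, reps=2000, rng=rng))
```

Calling `ks_power(null, null, ...)` returns the empirical size of the test, which should sit near the nominal 5% and gives a quick check that the critical value is being simulated correctly.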
4 Simulation Results
The simulation results for comparison purposes are shown in Fig. 3(i) to Fig. 3(vi). Fig. 3(i) and Fig. 3(ii) show the results when the scale parameter is changed but the location and shape are the same as in the hypothesized distribution. In both cases F is found to be the most powerful and performs better than all other tests. In the cases of Fig. 3(iii) and Fig. 3(iv), where the differences are due to a different location but the same scale and shape parameters, F performs better than the other tests when the sample size is small, i.e. up to about 50 (see Fig. 3(iii)); for larger samples, two of the other tests outperform F. In Fig. 3(iv), F performs better initially for smaller values of $n$, but as the sample size increases two of the other tests perform as well as F, followed by the remaining two. However, when both the location and scale differ, the proposed test F still outperforms all other tests, as shown in Fig. 3(v) and Fig. 3(vi).

5 Conclusion
In this paper a new GOF test which incorporates a variance stabilizing transformation is introduced. The proposed modified GOF test, i.e. F, is found to perform better than the other GOF tests in most of the cases investigated. Further analysis needs to be carried out to study the properties of the proposed test under various distributions.

References:
[1] J.R. Green and Y.A.S. Hegazy, 1976. Powerful Modified Goodness of fit test. Journal of the American Statistical Association, Vol. 71, No. 353, pp. 204-209.
[2] R.B. D'Agostino and M.A. Stephens, 1986. Goodness of fit techniques. Marcel Dekker, New York.
[3] H. Gunes, D.C. Dietz, P.F. Auclair and A.H. Moore, 1997. Modified Goodness of Fit tests for the Inverse Gaussian. Computational Statistics & Data Analysis, Vol. 24, pp. 63-67.
[4] J.W.H. Swanepoel and C.V. Graan, 2001. Goodness of fit tests based on Estimated Expectations of Probability Integral Transformed Order Statistics. Ann. Inst. Statist. Math., Vol. 54, No. 3, pp. 531-542.
[5] J. Zhang, 2002. Powerful goodness of fit tests based on the likelihood ratio. Journal of Royal Statist. Soc. B, No. 64, Part 2, pp. 281-294.
[6] J. Zhang and Y. Wu, 2002. Beta Approximation to the Distribution of Kolmogorov-Smirnov Statistics. Ann. Inst. Statist. Math., Vol. 54, No. 3, pp. 577-584.
[7] J.R.M. Hosking, 1990. Analysis and Estimation of Distributions using Linear Combinations of Order Statistics. Journal of the Royal Statistical Society Series B, Vol. 52, No. 1, pp. 105-124.
[8] E. Simiu, M.J. Changery, and J.J. Filliben, 1979. Extreme wind speeds at 129 stations in the contiguous United States. Building Science Series 118, National Bureau of Standards, Washington, DC.
Fig. 3 Power comparison of GOF tests against various alternative hypotheses. Panels:
(i) Ho: GEV(39.78, 6.26, -0.037) vs Ha: GEV(39.78, 5.5, -0.037)
(ii) Ho: GEV(39.78, 6.26, -0.037) vs Ha: GEV(39.78, 4, -0.037)
(iii) Ho: GEV(39.78, 6.26, -0.037) vs Ha: GEV(42, 6.26, -0.037)
(iv) Ho: GEV(39.78, 6.26, -0.037) vs Ha: GEV(45, 6.26, -0.037)
(v) Ho: GEV(39.78, 6.26, -0.037) vs Ha: GEV(42, 5.5, -0.037)
(vi) Ho: GEV(39.78, 6.26, -0.037) vs Ha: GEV(45, 6, -0.037)