On the Theory and Application of Model Misspecification Tests in Geodesy


Institut für Geodäsie und Geoinformation der Universität Bonn

Inaugural dissertation for the award of the academic degree Doktor-Ingenieur (Dr.-Ing.) of the Hohe Landwirtschaftliche Fakultät of the Rheinische Friedrich-Wilhelms-Universität zu Bonn, submitted in May 2007 by Dipl.-Ing. Boris Kargoll from Karlsruhe

Main referee (Hauptberichterstatter): Prof. Dr. techn. W.-D. Schuh
Co-referee (Mitberichterstatter): Prof. Dr. rer. nat. H.-P. Helfrich
Date of the oral examination: June 2007
This dissertation is published electronically on the Hochschulschriftenserver of the ULB Bonn. Year of publication: 2007

Abstract

Many geodetic testing problems concerning parametric hypotheses may be formulated within the framework of testing linear constraints imposed on a linear Gauss-Markov model. Although geodetic standard tests for such problems are computationally convenient and intuitively sound, no rigorous attempt has yet been made to derive them from a unified theoretical foundation or to establish optimality of such procedures. Another shortcoming of current geodetic testing theory is that no standard approach exists for tackling analytically more complex testing problems, concerning for instance unknown parameters within the weight matrix.

To address these problems, it is proven that, under the assumption of normally distributed observations, various geodetic standard tests, such as Baarda's or Pope's test for outliers, multivariate significance tests, deformation tests, or tests concerning the specification of the a priori variance factor, are uniformly most powerful (UMP) within the class of invariant tests. UMP invariant tests are proven to be equivalent to likelihood ratio tests and Rao's score tests. It is also shown that the computation of many geodetic standard tests may be simplified by transforming them into Rao's score tests.

Finally, testing problems concerning unknown parameters within the weight matrix, such as autoregressive correlation parameters or overlapping variance components, are addressed. It is shown that, although strictly optimal tests do not exist in such cases, corresponding tests based on Rao's score statistic are reasonable and computationally convenient diagnostic tools for deciding whether such parameters are significant or not. The thesis concludes with the derivation of a parametric test of normality as another application of Rao's score test.
Zur Theorie und Anwendung von Modell-Misspezifikationstests in der Geodäsie

Zusammenfassung (translated from the German)

As far as the testing of parametric hypotheses is concerned, many geodetic testing problems can be represented in the form of a Gauss-Markov model with linear restrictions. Although geodetic standard tests are computationally simple and intuitively reasonable, no rigorous attempt has so far been made to derive such tests from a unified theoretical basis or to justify their optimality. A further deficit in the current understanding of geodetic testing theory is that no standard procedure exists for solving analytically more complex testing problems, which concern, for example, unknown parameters in the weight matrix.

To address these problems, it is proven that, under the assumption of normally distributed observations, various geodetic standard tests, such as Baarda's or Pope's outlier test, multivariate significance tests, deformation tests, or tests concerning the specification of the a priori variance factor, are all uniformly most powerful (UMP) invariant tests. It is further proven that UMP invariant tests are equivalent to likelihood ratio tests and Rao's score tests. In addition, it is shown that the computation of many geodetic standard tests can be simplified by formulating them as Rao's score tests.

Finally, testing problems are treated that concern unknown parameters within the weight matrix, for example autoregressive correlation parameters or overlapping variance components. In such cases no tests exist that are best in the strict sense. It is shown, however, that corresponding tests based on Rao's score statistic are sensible and computationally inexpensive diagnostic tools for determining whether parameters such as those mentioned above are significant or not. The dissertation closes with the derivation of a parametric test of normality as a further application of Rao's score test.


Contents

1 Introduction
  Objective
  Outline
2 Theory of Hypothesis Testing
  The observation model
  The testing problem
  The test decision
  The size and power of a test
  Best critical regions
  Most powerful (MP) tests
  Reduction to sufficient statistics
  Uniformly most powerful (UMP) tests
  Reduction to invariant statistics
  Uniformly most powerful invariant (UMPI) tests
  Reduction to the Likelihood Ratio and Rao's Score statistic
3 Theory and Applications of Misspecification Tests in the Normal Gauss-Markov Model
  Introduction
  Derivation of optimal tests concerning parameters of the functional model
  Reparameterization of the test problem
  Centering of the hypotheses
  Full decorrelation/homogenization of the observations
  Reduction to minimal sufficient statistics with elimination of nuisance parameters
  Reduction to a maximal invariant statistic
  Back-substitution
  Equivalent forms of the UMPI test concerning parameters of the functional model
  Application 1: Testing for outliers
  Baarda's test
  Pope's test
  Application 2: Testing for extensions of the functional model
  Application 3: Testing for point displacements
  Derivation of an optimal test concerning the variance factor
4 Applications of Misspecification Tests in Generalized Gauss-Markov models
  Introduction
  Application 5: Testing for autoregressive correlation
  Application 6: Testing for overlapping variance components
  Application 7: Testing for non-normality of the observation errors
5 Conclusion and Outlook
6 Appendix: Datasets
  Dam Dataset
  Gravity Dataset
References

1 Introduction

1.1 Objective

Hypothesis testing is the foundation of all critical model analyses. Particularly relevant to geodesy is the practice of model misspecification testing, which has the objective of determining whether a given observation model accurately describes the physical reality of the data. Examples of common testing problems include how to detect outliers, how to determine whether estimated parameter values or changes thereof are significant, or how to verify the measurement accuracy of a given instrument. Geodesists know how to handle such problems intuitively using standard parameter tests, but it often remains unclear in what mathematical sense these tests are optimal.

The first goal of this thesis is to develop a theoretical foundation which allows establishing optimality of such tests. The approach will be based on the theory of Neyman and Pearson (1928, 1933), whose celebrated fundamental lemma defines an optimal test as one which is most powerful among all tests with some particular significance level. As this concept is applicable only to very simple problems, tests must be considered that are most powerful in a wider sense. An intuitively appealing way to do so is based on the fact that complex testing problems may often be reduced to simple problems by exploiting symmetries. One mathematical description of symmetry is invariance, whose application to testing problems then leads to invariant tests. In this context, a uniformly most powerful invariant test defines a test which is optimal among all invariant tests available in the given testing problem. In this thesis, it will be demonstrated for the first time that many geodetic standard tests fit into this framework and share the property of being uniformly most powerful among invariant tests.

In order to be useful in practical situations, a testing procedure should not only be optimal, but it must also be computationally manageable. It is well known that hypothesis tests have different mathematical descriptions, which may vary considerably in computational complexity.
Most geodetic standard tests are usually derived from likelihood ratio tests (see, for instance, Koch, 1999; Teunissen, 2000). An alternative, oftentimes much simpler representation is based on Rao's (1948) score test, which has not been acknowledged as such by geodesists although it has found its way into geodetic practice, for instance, via Baarda's outlier test. To shed light on this important topic, it is another major intent of this thesis to describe Rao's score method in a general and systematic way, and to demonstrate what types of geodetic testing problems are ideally handled by this technique.

1.2 Outline

The following Section 2 of this thesis begins with a review of classical testing theory. The focus is on parametric testing problems, that is, hypotheses to be tested are propositions concerning parameters of the data's probability distribution. We will then follow the classical approach of considering tests with fixed significance level and maximum power. In this context, the Neyman-Pearson Lemma and the resulting idea of a most powerful test will be explained, and the concept of a uniformly most powerful test will be introduced. The subsequent definition of sufficiency will play a central role in reducing the complexity of testing problems. Following this, we will examine more complex problems that require a simplification going beyond sufficiency. For this purpose, we will use the principle of invariance, which is the mathematical description of symmetry. We will see that invariant tests are tests with power distributed symmetrically over the space of parameters. This leads us to the notion of a uniformly most powerful invariant (UMPI) test, which is a designated optimal test among such invariant tests. Finally, we will explore the relationships of UMPI tests to likelihood ratio tests and Rao's score tests.

Section 3 extends the ideas developed in Section 2 to address the general problem of testing linear hypotheses in the Gauss-Markov model with normally distributed observations.
Here we focus on the case in which the design matrix is of full rank and where the weight matrix is known. Then, the testing problem will be reduced by sufficiency and invariance, and UMPI tests derived for the two cases where the variance of unit weight is either known or unknown a priori. Emphasis will be placed on demonstrating further that these UMPI tests correspond to the tests already used in geodesy. Another key result of this section will be to show how all these tests are formulated as likelihood ratio and Rao's score tests. The section concludes with a discussion of various geodetic testing problems. It will be shown that many standard tests used so far, such as Baarda's and Pope's outlier test, multivariate parameter tests, deformation tests, or tests concerning the variance of unit weight, are optimal (UMPI) in a statistical sense, but that computational complexity can often be effectively reduced by using equivalent Rao's score tests instead.

Section 4 addresses a number of testing problems in generalized Gauss-Markov models for which no UMPI tests exist, because reductions by sufficiency and invariance are not effective. The first problem considered will be testing for first-order autoregressive correlation. Rao's score test will be derived, and its power against several simple alternative hypotheses will be determined by carrying out a Monte Carlo simulation. The second application of this section will treat the case of testing for a single overlapping variance component, for which Rao's score test will be once again derived. The final problem consists of testing whether observations follow a normal distribution. In this situation, Rao's score test will be shown to lead to a test which measures the deviation of the sample's skewness and kurtosis from the theoretical values of a normal distribution.

Finally, Section 5 highlights the main conclusions of this work and gives an outlook on promising extensions to the theory and applications of the approach presented in this thesis.

2 Theory of Hypothesis Testing

2.1 The observation model

Let us assume that some data vector $y = [y_1, \ldots, y_n]'$ is subject to a statistical analysis. As this thesis is concerned rather with exploring theoretical aspects of such analyses, it will be useful to see this data vector as one of many potential realizations of a vector $Y$ of observables $Y_1, \ldots, Y_n$. This is reflected by the fact that measuring the same quantity multiple times does not result in identical data values, but rather in some frequency distribution of values according to some random mechanism. In geodesy, quantities that are subject to observation or measurement usually have a geometrical or physical meaning. In this sense, $Y$, or its realization $y$, will be viewed as being incorporated in some kind of model and thereby connected to some other quantities or parameters.

Parametric observation models may be set up for multiple reasons. They are often used as a way to reduce great volumes of raw data to low-dimensional approximating functions. A model might also be used simply because the quantity of primary interest is not directly observable, but must be derived from other data. In reality, both aspects often go hand in hand.

To give these explanations a mathematical expression, let the random vector $Y$ with values in $\mathbb{R}^n$ be part of a linear model

$$Y = X\beta + E, \tag{2.1-1}$$

where $\beta \in \mathbb{R}^m$ denotes a vector of unknown non-stochastic parameters and $X \in \mathbb{R}^{n \times m}$ a known matrix of non-stochastic coefficients reflecting the functional relationship. It will be assumed throughout that $\operatorname{rank} X = m$ and that $n > m$, so that (2.1-1) constitutes a genuine adjustment problem. $E$ represents a real-valued random vector of unknown disturbances or errors, whose expectation and covariance matrix are assumed to satisfy

$$E\{E\} = 0 \quad \text{and} \quad \Sigma\{E\} = \sigma^2 P_\omega^{-1}. \tag{2.1-2}$$

We will occasionally refer to these two conditions as the Markov conditions. The weight matrix $P_\omega$ may be a function of unknown parameters $\omega$, which allows for certain types of correlation and variance-change (or heteroscedasticity) models regarding the errors. Whenever such parameters do not appear, we will use $P$ to denote the weight matrix.
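The view of a data vector as one of many possible realizations of the linear model can be made concrete with a short simulation. The following Python sketch is only an illustration under stated assumptions: the design matrix, the parameter values, the standard deviation, and the choice of an identity weight matrix with normally distributed errors are hypothetical and are not taken from the thesis.

```python
import random

random.seed(1)  # fixed seed so that the illustration is reproducible

# Hypothetical example: n = 5 observations, m = 2 parameters (offset and trend).
X = [[1.0, float(t)] for t in range(5)]   # known design matrix (assumed)
beta = [100.0, 0.002]                     # "true" parameters (assumed)
sigma = 0.003                             # standard deviation (assumed), weight matrix P = I


def simulate_observations():
    """Draw one realization y of Y = X*beta + E with E ~ N(0, sigma^2 * I)."""
    return [
        sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) + random.gauss(0.0, sigma)
        for row in X
    ]


# Two repetitions of the same measurement campaign give different data vectors.
y_first = simulate_observations()
y_second = simulate_observations()
print(y_first)
print(y_second)
```

Both vectors scatter around the same functional part $X\beta$, which is the sense in which an observed $y$ is treated as a realization of the random vector $Y$.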
To make the following testing procedures operable, these linear model specifications must be accompanied by certain assumptions regarding the type of probability distribution considered for $Y$. For this purpose, it will be assumed that any such distribution $P$ may be defined by a parametric density function

$$f(y; \beta, \sigma^2, \omega, c), \tag{2.1-3}$$

which possibly depends on additional unknown shape parameters $c$ controlling, for instance, the skewness and kurtosis of the distribution. Now, let the vector $\theta := [\beta', \sigma^2, \omega', c']'$ comprise the totality of unknown parameters taking values in some $u$-dimensional space $\Theta$. The parameter space $\Theta$ then corresponds to a collection of densities

$$\mathcal{F} = \{ f(y; \theta) : \theta \in \Theta \}, \tag{2.1-4}$$

which in turn defines the contemplated collection of distributions

$$\mathcal{W} = \{ P_\theta : \theta \in \Theta \}. \tag{2.1-5}$$

Example 2.1: An angle has been independently observed $n$ times. Each observation $Y_1, \ldots, Y_n$ is assumed to follow a distribution that belongs to the class of normal distributions

$$\mathcal{W} = \{ N(\mu, \sigma^2) : \mu \in \mathbb{R},\ \sigma^2 \in \mathbb{R}^+ \} \tag{2.1-6}$$

with mean $\mu$ and variance $\sigma^2$, or in short notation $Y_i \sim N(\mu, \sigma^2)$. The relationship between $Y = [Y_1, \ldots, Y_n]'$ and the mean parameter $\mu$ constitutes the simplest form of a linear model (2.1-1), where $X$ is an $n \times 1$ vector of ones and $\beta$ equals the single parameter $\mu$. Furthermore, as the observations are independent with constant mean and variance, the joint normal density function $f(y; \mu, \sigma^2)$ may be decomposed (i.e. factorized) into the product

$$f(y; \mu, \sigma^2) = \prod_{i=1}^{n} f(y_i; \mu, \sigma^2) \tag{2.1-7}$$

of identical univariate normal density functions defined by

$$f(y_i; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2} \left( \frac{y_i - \mu}{\sigma} \right)^2 \right\} \quad (y_i \in \mathbb{R},\ \mu \in \mathbb{R},\ \sigma^2 \in \mathbb{R}^+,\ i = 1, \ldots, n). \tag{2.1-8}$$

Therefore, the class of densities $\mathcal{F}$ considered for $Y$ may be written as

$$\mathcal{F} = \left\{ (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \mu)^2 \right\} : [\mu, \sigma^2]' \in \Theta \right\} \tag{2.1-9}$$

with two-dimensional parameter space $\Theta = \mathbb{R} \times \mathbb{R}^+$.

2.2 The testing problem

The goal of any parametric statistical inference is to extract information from the given data $y$ about the unknown true parameters $\theta$, which refer to the unknown true probability distribution $P_\theta$ and the true density function $f(y; \theta)$ with respect to the observables $Y$. For this purpose, we will assume that $\theta$, $P_\theta$, and $f(y; \theta)$ are unique and identifiable elements of $\Theta$, $\mathcal{W}$, and $\mathcal{F}$, respectively.

While estimation aims at determining the numerical values of $\theta$, that is, selecting one specific element from $\Theta$, the goal of testing is somewhat simpler in that one only seeks to determine whether $\theta$ is an element of a subset $\Theta_0$ of $\Theta$ or not. Despite this seemingly great difference between the purpose of estimation and testing, which is reflected by a separate treatment of both topics in most statistical textbooks, certain concepts from estimation will turn out to be indispensable for the theory of testing. As this thesis is focused on testing, the necessary estimation methodology will be introduced without a detailed analysis thereof.

In order to formulate the test problem, a non-empty and genuine subset $\Theta_0 \subset \Theta$ (corresponding to some $\mathcal{W}_0 \subset \mathcal{W}$ and $\mathcal{F}_0 \subset \mathcal{F}$) must be specified. Then, the null hypothesis is defined as the proposition

$$H_0: \theta \in \Theta_0. \tag{2.2-10}$$

When the null hypothesis is such that $\Theta_0$ represents one point $\theta_0$ within the parameter space $\Theta$, then the elements of $\theta_0$ assign unique numerical values to all the elements in $\theta$, and (2.2-10) simplifies to the proposition

$$H_0: \theta = \theta_0. \tag{2.2-11}$$

In this case, $H_0$ is called a simple null hypothesis. On the other hand, if at least one element of $\theta$ is assigned a whole range of values, say $\mathbb{R}^+$, then $H_0$ is called a composite null hypothesis. In such a case, an equality relation as in (2.2-11) can clearly not be established for all the parameters in $\theta$.
Unknown parameters whose true values are not uniquely fixed under $H_0$ are also called nuisance parameters.

Example 2.2 (Example 2.1 continued): On the basis of given observed numerical values $y = [y_1, \ldots, y_n]'$, we want to test whether the observed angle is an exact right angle (100 gon) or not. Let us investigate three different scenarios:

1. If $\sigma$ is known a priori to take the true value $\sigma_0$, then $\Theta = \mathbb{R}$ is one-dimensional, and under the null hypothesis $H_0: \mu = 100$ the subset $\Theta_0$ shrinks to the single point $\Theta_0 = \{100\}$. Hence, $H_0$ is a simple null hypothesis by definition.
2. If $\mu$ and $\sigma$ are both unknown, then the null hypothesis, written as $H_0: \mu = 100$ $(\sigma \in \mathbb{R}^+)$, leaves the nuisance parameter $\sigma$ unspecified. Therefore, the subset $\Theta_0 = \{(100, \sigma) : \sigma \in \mathbb{R}^+\}$ does not specify a single point, but an interval of values. Consequently, $H_0$ is composite under this scenario.
3. If the question is whether the observed angle is 100 gon and the standard deviation is really 3 mgon (e.g. as promised by the producer of the instrument), then the null hypothesis $H_0: \mu = 100,\ \sigma = 0.003$ refers to the single point $\Theta_0 = \{(100, 0.003)\}$ within $\Theta$. In that case, $H_0$ is seen to be simple.

2.3 The test decision

Imagine that the space $S$ of all possible observations $y$ consists of two complementary regions: a region of acceptance $S_A$, which consists of all values that support a certain null hypothesis $H_0$, and a region of rejection or critical region $S_C$, which comprises all the observations that contradict $H_0$ in some sense. A test decision could then be based simply on observing whether some given data values $y$ are in $S_A$ (which would imply acceptance of $H_0$), or whether $y \in S_C$ (which would result in rejection of $H_0$).

It will be necessary to perceive any test decision as the realization of a random variable $\phi$ which, as a function of $Y$, takes the value 1 in case of rejection and 0 in case of acceptance of $H_0$. This mapping, defined as

$$\phi(y) = \begin{cases} 1, & \text{if } y \in S_C, \\ 0, & \text{if } y \in S_A, \end{cases} \tag{2.3-12}$$

is also called a test or critical function, for it indicates whether a given observation $y$ falls into the critical region or not. (2.3-12) can be viewed as the mathematical implementation of a binary decision rule, which is typical for test problems. This notion now allows for the more formal definition of the regions $S_A$ and $S_C$ as

$$S_C = \phi^{-1}(1) = \{ y \in S \mid \phi(y) = 1 \}, \tag{2.3-13}$$

$$S_A = \phi^{-1}(0) = \{ y \in S \mid \phi(y) = 0 \}. \tag{2.3-14}$$

Example 2.3 (Ex. 2.2 continued): For simplicity, let $Y$ be a single observation of an angle, which is assumed to be normally distributed with unknown mean $\mu$ and known standard deviation $\sigma = \sigma_0 = 3$ mgon. To test the hypothesis that the observed angle is a right angle ($H_0: \mu = 100$), an engineer suggests the following decision rule: Reject $H_0$ when the observed angle deviates from 100 gon by at least five times the standard deviation. The critical function reads

$$\phi(y) = \begin{cases} 1, & \text{if } y \le 99.985 \text{ or } y \ge 100.015, \\ 0, & \text{if } 99.985 < y < 100.015. \end{cases} \tag{2.3-15}$$

The critical region is given by $S_C = (-\infty, 99.985] \cup [100.015, +\infty)$, and the region of acceptance by $S_A = (99.985, 100.015)$.

Due to the random and binary nature of a test, two different types of error may occur. The error of the first kind or Type I error arises when the data $y$ truly stems from a distribution in $\mathcal{W}_0$ (specified by $H_0$), but happens to fall into the region of rejection $S_C$. Consequently, $H_0$ is falsely rejected.
The error of the second kind or Type II error occurs when the data $y$ does not stem from a distribution in $\mathcal{W}_0$, but is an element of the region of acceptance $S_A$. Clearly, $H_0$ is then accepted by mistake. From Example 2.3 it is not clear whether the suggested decision rule is in fact reasonable. The following subsection will demonstrate how the two above errors can be measured and how they can be used to find optimal decision rules.

2.4 The size and power of a test

As any test (2.3-12) is itself a random variable derived from the observations $Y$, it is straightforward to ask for the probabilities with which these errors occur. Since tests with small error probabilities appear to be more desirable than tests with large errors, it is natural to use these probabilities in order to find optimal test procedures. For this purpose, let $\alpha$ denote the probability of a Type I error, and $\beta$ (not to be confused with the parameter $\beta$ of the linear model (2.1-1)) the probability of a Type II error. Instead of $\beta$, it is more common to use the complementary quantity $\Pi := 1 - \beta$, called the power of a test.

When $H_0$ is simple, i.e. when all the unknown parameter values are specified by $H_0$, then the numerical value for $\alpha$ may be computed from (2.3-12) by

$$\alpha = P_{\theta_0}[\phi(Y) = 1] = P_{\theta_0}(Y \in S_C) = \int_{S_C} f(y; \theta_0)\, dy. \tag{2.4-16}$$

From (2.4-16) it becomes evident why $\alpha$ is also called the size of the critical region, because its value represents the area under the density function measured over $S_C$. Notice that for composite $H_0$, the value for $\alpha$ will generally depend on the values of the nuisance parameters. In that case, it is appropriate to define $\alpha$ as a function with

$$\alpha(\theta) = P_\theta[\phi(Y) = 1] = P_\theta(Y \in S_C) = \int_{S_C} f(y; \theta)\, dy \quad (\theta \in \Theta_0). \tag{2.4-17}$$
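The binary decision rule of Example 2.3 can be written down directly as a small function. The following Python sketch is only an illustration: the bounds 99.985 gon and 100.015 gon follow from the rule "100 gon plus or minus five times 3 mgon", while the function name and the sample values are hypothetical.

```python
# Bounds of the acceptance region of Example 2.3: 100 gon -/+ 5 * 0.003 gon.
LOWER, UPPER = 99.985, 100.015  # gon


def phi(y: float) -> int:
    """Critical function: 1 if y lies in the critical region S_C, else 0."""
    return 1 if (y <= LOWER or y >= UPPER) else 0


# Hypothetical observed angles in gon (not taken from the thesis).
observations = [100.000, 100.014, 100.015, 99.980]
decisions = [phi(y) for y in observations]
print(decisions)  # [0, 0, 1, 1]
```

A value of 1 means that the observation falls into $S_C$ and $H_0$ is rejected; a value of 0 means that it falls into $S_A$ and $H_0$ is accepted.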

Example 2.4 (Example 2.3 continued): What is the size of the critical region or the probability of the Type I error for the test defined by (2.3-15)? Recall that $\mu_0 = 100$ is the value assigned to $\mu$ by $H_0$ and that $\sigma_0 = 0.003$ is the fixed value for $\sigma$ assumed as known a priori. Then, after transforming $Y$ into an $N(0, 1)$-distributed random variable, the values of the standard normal distribution function $\Phi$ may be obtained from statistical tables (see, for instance, Kreyszig, 1998) to answer the above question:

$$\begin{aligned}
\alpha = P_{\theta_0}(Y \in S_C) &= N_{\mu_0,\sigma_0^2}(Y \le 99.985 \text{ or } Y \ge 100.015) \\
&= 1 - N_{\mu_0,\sigma_0^2}(99.985 < Y < 100.015) \\
&= 1 - N_{0,1}\left( \frac{99.985 - \mu_0}{\sigma_0} < \frac{Y - \mu_0}{\sigma_0} < \frac{100.015 - \mu_0}{\sigma_0} \right) \\
&= 1 - [\Phi(5) - \Phi(-5)] \approx 0.
\end{aligned}$$

If $\sigma$ was unknown, then the numerical value of $\alpha$ would depend on the value of $\sigma$.

Let us finish the discussion of the size of a test by observing in Fig. 2.1 that different choices of the critical region may have the same total probability mass.

Fig. 2.1 Let $N(\mu_0, \sigma_0^2)$ denote the distribution of a single observation $Y$ under a simple $H_0$ with known and fixed variance $\sigma_0^2$. This figure presents four out of infinitely many different ways to specify a critical region $S_C$ of fixed size $\alpha$.
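The size computed in Example 2.4 can be checked numerically without statistical tables. The sketch below is an illustration only; it evaluates $\Phi$ through the error function of Python's standard library. The two additional values of $\sigma$ (0.005 and 0.0075 gon) are hypothetical and merely show how the size of the same critical region changes when $\sigma$ is not fixed.

```python
import math


def Phi(z: float) -> float:
    """Standard normal distribution function, expressed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def size(sigma: float, mu0: float = 100.0,
         lower: float = 99.985, upper: float = 100.015) -> float:
    """Probability that Y ~ N(mu0, sigma^2) falls outside (lower, upper)."""
    return 1.0 - (Phi((upper - mu0) / sigma) - Phi((lower - mu0) / sigma))


alpha = size(0.003)  # sigma_0 = 3 mgon as in Example 2.4
print(f"alpha for sigma = 0.003: {alpha:.2e}")

# Hypothetical other values of sigma: the same region has a different size.
for sigma in (0.005, 0.0075):
    print(f"alpha for sigma = {sigma}: {size(sigma):.4f}")
```

With $\sigma_0 = 0.003$ the size is of the order of $10^{-7}$, which is the sense in which it is "approximately 0"; for larger $\sigma$ the identical region is hit far more often under $H_0$.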

The computation of the probability of a Type II error is more intricate than that of $\alpha$, because the premise of a false $H_0$ does not tell us anything about which distribution we should use to measure the event that $y$ is in $S_A$. For this very reason, an alternative class of distributions $\mathcal{W}_1 \subset \mathcal{W}$ must be specified which contains the true distribution if $H_0$ is false. If we let $\mathcal{W}_1$ be represented by a corresponding non-empty parameter subset $\Theta_1 \subset \Theta$, then we may define the alternative hypothesis as

$$H_1: \theta \in \Theta_1 \quad (\Theta_1 \subset \Theta,\ \Theta_1 \cap \Theta_0 = \emptyset), \tag{2.4-18}$$

which may be simple or composite in analogy to $H_0$. The condition $\Theta_1 \cap \Theta_0 = \emptyset$ is necessary to avoid ambiguities due to overlapping hypotheses.

Example 2.5 (Example 2.2 continued): For testing the right angle hypothesis $H_0: \mu = 100$, we will assume that $\sigma = \sigma_0 = 0.003$ is fixed and known. Let us consider the following three situations.

1. Imagine that a map indicates that the observed angle is a right angle, while a second older map gives a value of say 100.018 gon. In this case, the data $y$ could be used to test $H_0$ against the alternative $H_1: \mu = 100.018$. $\Theta_1 = \{100.018\}$ represents one point in $\Theta$, hence $H_1$ is simple.
2. If the right angle hypothesis is doubtful but there is evidence that the angle can definitely not be smaller than 100 gon, then the appropriate alternative reads $H_1: \mu > 100$, which is now composite due to $\Theta_1 = \{\mu : \mu > 100\}$, and it is called one-sided, because the alternative values for $\mu$ are elements of a single interval.
3. When no prior information regarding potential alternative angle sizes is available, then $H_1: \mu \neq 100$ is a reasonable choice, as we will see later. Since the alternative values for $\mu$ are split up into two intervals separated by the value under $H_0$, we speak of a two-sided (composite) $H_1$.

With the specification of an alternative subspace $\Theta_1 \subset \Theta$, which the unknown true parameter $\theta$ is assumed to be an element of if $H_0$ is false, the probability of a Type II error follows to be either

$$\beta = P_{\theta_1}[\phi(Y) = 0] = P_{\theta_1}(Y \in S_A) = \int_{S_A} f(y; \theta_1)\, dy \tag{2.4-19}$$

if $H_1$ is simple (i.e. if $\theta_1$ is the unique element of $\Theta_1$), or

$$\beta(\theta) = P_\theta[\phi(Y) = 0] = P_\theta(Y \in S_A) = \int_{S_A} f(y; \theta)\, dy \quad (\theta \in \Theta_1) \tag{2.4-20}$$

if $\Theta_1$ is composed of multiple elements.
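As simple alternatives are rarely encountered in practical situations, the general notation of (2.4-20) will be maintained. As already mentioned, it is more common to use the power of a test, defined as

$$\Pi(\theta) := 1 - P_\theta(Y \in S_A) = P_\theta(Y \in S_C) = P_\theta[\phi(Y) = 1] \quad (\theta \in \Theta_1). \tag{2.4-21}$$

The numerical values of $\Pi$ may be interpreted as the probabilities of avoiding a Type II error. When designing a test, it will be useful to determine the probability of rejecting $H_0$ as a function defined over the entire parameter space $\Theta$. Such a function may be defined as

$$Pf(\theta) := P_\theta[\phi(Y) = 1] = P_\theta(Y \in S_C) \quad (\theta \in \Theta) \tag{2.4-22}$$

and will be called the power function of a test. Clearly, this function will in particular produce the sizes $\alpha$ for all $\theta \in \Theta_0$ and the power values $\Pi$ for all $\theta \in \Theta_1$. For all the other values of $\theta$, this function will provide the hypothetical power of the test if the true parameter is neither assumed to be an element of $\Theta_0$, nor of $\Theta_1$.

Example 2.6 (Example 2.5 continued): Recall that the size of this test turned out to be approximately 0, as Ex. 2.4 demonstrated. Let us now ask what the power of the test would be for testing $H_0: \mu = 100$ against $H_1: \mu = \mu_1 = 100.018$ with $\sigma = \sigma_0 = 0.003$ known a priori. Using (2.4-21), we obtain

$$\begin{aligned}
\Pi = 1 - P_{\mu_1,\sigma_0}(Y \in S_A) &= 1 - N_{\mu_1,\sigma_0^2}(99.985 < Y < 100.015) \\
&= 1 - N_{0,1}\left( \frac{99.985 - \mu_1}{0.003} < \frac{Y - \mu_1}{0.003} < \frac{100.015 - \mu_1}{0.003} \right) \\
&= 1 - [\Phi(-1) - \Phi(-11)] \approx 0.8413.
\end{aligned}$$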
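The power function (2.4-22) of the test from Example 2.3 is easy to evaluate numerically. The following Python sketch is an illustration only; the function names are hypothetical, and the alternative 100.018 gon is the value for which the standardized bounds become $-1$ and $-11$.

```python
import math


def Phi(z: float) -> float:
    """Standard normal distribution function, expressed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_function(mu: float, sigma: float = 0.003,
                   lower: float = 99.985, upper: float = 100.015) -> float:
    """Pf(mu): probability that Y ~ N(mu, sigma^2) falls into S_C."""
    return 1.0 - (Phi((upper - mu) / sigma) - Phi((lower - mu) / sigma))


size_under_h0 = power_function(100.0)     # equals the size alpha of the test
power_under_h1 = power_function(100.018)  # power against mu_1 = 100.018 gon

print(f"Pf(100.000) = {size_under_h0:.2e}")
print(f"Pf(100.018) = {power_under_h1:.4f}")
```

Evaluated at the value under $H_0$, the same function returns the size of the test, which is why a single power function is enough to describe both error probabilities.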

Notice that the larger the difference between $\mu_0$ and $\mu_1$, the larger the power becomes. For instance, if $H_1$ had been specified as $\mu_1 = 100.021$, then the power would increase to $\Pi \approx 0.977$, and for $\mu_1 = 100.024$ the power would already be very close to 1. This is intuitively understandable, because very similar hypotheses are expected to be harder to separate on the basis of some observed data than extremely different hypotheses. Figure 2.2 illustrates this point.

Fig. 2.2 The probability of a Type II error ($\beta = 1 - \Pi$) becomes smaller as the distance $\mu_1 - \mu_0$ (with identical variance $\sigma_0^2$) between the null hypothesis $H_0$ and the alternative $H_1$ increases.

Another important observation to make in this context is that, unfortunately, the errors of the first and second kind cannot be minimized independently. For instance, when the critical region $S_C$ is extended towards $\mu_0$ (Fig. 2.3, left to right), then clearly its size becomes larger. In doing this, $S_A$ shrinks, and the error of the second kind becomes smaller. This effect is explained by the fact that both errors are measured in complementary regions and thereby affect each other's size. Therefore, no critical function can exist that minimizes both error probabilities simultaneously. The purpose of the following subsection is to present a practical solution to resolve this conflict.

Fig. 2.3 Let $N(\mu_0, \sigma_0^2)$ and $N(\mu_1, \sigma_0^2)$ denote the distributions of a single observation $Y$ under simple $H_0$ and $H_1$, respectively. Changing the $S_C$/$S_A$ partitioning of the observation space (abscissa) necessarily causes an increase in probability of one error type and a decrease in probability of the other type.
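Both observations, the growth of the power with the distance between the hypotheses and the conflict between the two error probabilities, can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the alternatives 100.018, 100.021 and 100.024 gon and the rule "reject when $|y - 100| \ge c\,\sigma_0$" with $c = 5$ follow the running example, whereas the other values of $c$ are hypothetical and chosen only to show the trend.

```python
import math


def Phi(z: float) -> float:
    """Standard normal distribution function, expressed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


MU0, SIGMA0 = 100.0, 0.003  # gon; values of the running example


def rejection_probability(mu: float, c: float) -> float:
    """P(|Y - MU0| >= c * SIGMA0) for Y ~ N(mu, SIGMA0^2)."""
    lower, upper = MU0 - c * SIGMA0, MU0 + c * SIGMA0
    return 1.0 - (Phi((upper - mu) / SIGMA0) - Phi((lower - mu) / SIGMA0))


# 1) Power of the c = 5 rule against increasingly distant alternatives.
powers = {mu1: rejection_probability(mu1, 5.0)
          for mu1 in (100.018, 100.021, 100.024)}
for mu1, power in powers.items():
    print(f"mu_1 = {mu1:.3f}: power = {power:.4f}")

# 2) Trade-off: widening S_A (larger c) lowers alpha but raises beta.
tradeoff = []
for c in (3.0, 4.0, 5.0, 6.0):  # c = 5 is the rule of Example 2.3
    alpha = rejection_probability(MU0, c)
    beta = 1.0 - rejection_probability(100.018, c)
    tradeoff.append((c, alpha, beta))
    print(f"c = {c:.0f}: alpha = {alpha:.2e}, beta = {beta:.4f}")
```

The second loop makes the conflict explicit: every change of the partitioning that lowers one error probability raises the other, so one of the two has to be fixed before the other can be optimized.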

2.5 Best critical regions

As pointed out in the previous section, shifting the critical region and making one error type more unlikely always causes the other error to become more probable. Therefore, the probabilities of Type I and Type II errors cannot be minimized simultaneously. One way to resolve this conflict is to keep the probability of a Type I error fixed at a relatively small value and to seek a critical region that minimizes the probability of a Type II error, or equivalently that maximizes the power of the test.

To make the mathematical concepts necessary for this procedure intuitively understandable, examples will be given mainly with respect to the class of observation models (2.1-6) introduced in Example 2.1. The remainder of this Section 2.5 is organized such that tests with best critical regions will be constructed for testing problems that are progressively complex within that class of models. The determination of optimal critical regions in the context of the general linear model (2.1-1) with general parametric densities as in (2.1-3) will be subject of detailed investigations in Sections 3 and 4.

2.5.1 Most powerful (MP) tests

The simplest kind of problem for which a critical region with optimal power may exist is that of testing a simple $H_0: \theta = \theta_0$ against a simple alternative hypothesis $H_1: \theta = \theta_1$ involving a single unknown parameter. Using definitions (2.4-16) and (2.4-21), the problem is to find a set $S_C$ such that the restriction

$$\int_{S_C} f(y; \theta_0)\, dy = \alpha \tag{2.5-23}$$

is satisfied, where $\alpha$ as a given size is also called the significance level, and

$$\int_{S_C} f(y; \theta_1)\, dy \ \text{ is a maximum.} \tag{2.5-24}$$

Such a critical region will be called the best critical region (BCR), and a test based on the BCR will be denoted as most powerful (MP) for testing $H_0$ against $H_1$ at level $\alpha$. A solution to this problem may be found on the basis of the following lemma of Neyman and Pearson (see, for instance, Rao, 1973).

Theorem 2.1 (Neyman-Pearson Lemma). Suppose that $f(y; \theta_0)$ and $f(y; \theta_1)$ are two densities defined on a space $S$. Let $S_C \subset S$ be any critical region with

$$\int_{S_C} f(y; \theta_0)\, dy = \alpha, \tag{2.5-25}$$

where $\alpha$ has a given value.
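If there exists a constant $k_\alpha$ such that for the region $S_C^* \subset S$ with

$$\frac{f(y; \theta_1)}{f(y; \theta_0)} > k_\alpha \ \text{ if } y \in S_C^*, \qquad \frac{f(y; \theta_1)}{f(y; \theta_0)} < k_\alpha \ \text{ if } y \notin S_C^*, \tag{2.5-26}$$

condition (2.5-25) is satisfied, then

$$\int_{S_C^*} f(y; \theta_1)\, dy \ \ge \int_{S_C} f(y; \theta_1)\, dy. \tag{2.5-27}$$

Notice that when $f(y; \theta_0)$ and $f(y; \theta_1)$ are densities under simple hypotheses $H_0$ and $H_1$, and if the conditions (2.5-25) and (2.5-26) hold for some $k_\alpha$, then $S_C^*$ denotes the BCR for testing $H_0$ versus $H_1$ at fixed level $\alpha$, because (2.5-27) is equivalent to the desired maximum power condition (2.5-24). Also observe that (2.5-26) then defines the MP test, which may be written as

$$\phi(y) = \begin{cases} 1 & \text{if } f(y; \theta_1)/f(y; \theta_0) > k_\alpha, \\ 0 & \text{if } f(y; \theta_1)/f(y; \theta_0) < k_\alpha. \end{cases} \tag{2.5-28}$$

This condition (2.5-28) expresses that in order for a test to be most powerful, the critical region $S_C$ must comprise all the observations $y$ for which the so-called density ratio $f(y; \theta_1)/f(y; \theta_0)$ is larger than some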

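$\alpha$-dependent number $k_\alpha$. This can be explained by the following intuitions of Stuart et al. (1999). Using definition (2.4-21), the power may be rewritten in terms of the density ratio as

$$\Pi = \int_{S_C} f(y; \theta_1)\, dy = \int_{S_C} \frac{f(y; \theta_1)}{f(y; \theta_0)}\, f(y; \theta_0)\, dy.$$

Since $\alpha$ has a fixed value, maximizing $\Pi$ is equivalent to maximizing the quantity

$$\frac{\Pi}{\alpha} = \frac{\displaystyle\int_{S_C} \frac{f(y; \theta_1)}{f(y; \theta_0)}\, f(y; \theta_0)\, dy}{\displaystyle\int_{S_C} f(y; \theta_0)\, dy}.$$

In order for a test to have maximum power, its critical region $S_C$ must clearly include all the observations $y$,

1. for which the integral value in the denominator equals $\alpha$, and
2. for which the density ratio in the numerator produces the largest possible values, whose lower bound may be defined as the number $k_\alpha$ (with the values of the additional factor $f(y; \theta_0)$ fixed by condition 1).

These are the very conditions given by the Neyman-Pearson Lemma. A more formal proof may be found, for instance, in Teunissen (2000). The following example demonstrates how the BCR may be constructed for a simple test problem by applying the Neyman-Pearson Lemma.

Example 2.7: Test of the normal mean with known variance - Simple alternatives. Let $Y_1, \ldots, Y_n$ be independently and normally distributed observations with common unknown mean $\mu$ and common known standard deviation $\sigma = \sigma_0$. What is the BCR for a test of the simple null hypothesis $H_0: \mu = \mu_0$ against the simple alternative hypothesis $H_1: \mu = \mu_1$ at level $\alpha$? (It is assumed that $\mu_0$, $\mu_1$, $\sigma_0$ and $\alpha$ have fixed numerical values.)

In order to construct the BCR, we will first try to find a number $k_\alpha$ such that condition (2.5-26) about the density ratio $f(y; \theta_1)/f(y; \theta_0)$ holds. As the observations are independently distributed with common mean $\mu$ and variance $\sigma_0^2$, the factorized form of the joint normal density function $f(y)$ according to Example 2.1 may be applied. This yields the expression

$$\frac{f(y; \theta_1)}{f(y; \theta_0)} = \frac{\displaystyle\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left\{ -\frac{1}{2} \left( \frac{y_i - \mu_1}{\sigma_0} \right)^2 \right\}}{\displaystyle\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left\{ -\frac{1}{2} \left( \frac{y_i - \mu_0}{\sigma_0} \right)^2 \right\}} = \frac{(2\pi\sigma_0^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} (y_i - \mu_1)^2 \right\}}{(2\pi\sigma_0^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} (y_i - \mu_0)^2 \right\}} \tag{2.5-29}$$

for the density ratio. An application of the ordinary binomial formula allows us to split off a factor that does not depend on $\mu$, that is

$$\frac{f(y; \theta_1)}{f(y; \theta_0)} = \frac{(2\pi\sigma_0^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} y_i^2 \right\} \exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} (-2 y_i \mu_1 + \mu_1^2) \right\}}{(2\pi\sigma_0^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} y_i^2 \right\} \exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} (-2 y_i \mu_0 + \mu_0^2) \right\}}. \tag{2.5-30}$$

Now, the first two factors in the numerator and denominator cancel out due to their independence of $\mu$. Rearranging the remaining terms leads to

$$\frac{f(y; \theta_1)}{f(y; \theta_0)} = \frac{\exp\left\{ \frac{\mu_1}{\sigma_0^2} \sum_{i=1}^{n} y_i - \frac{n \mu_1^2}{2\sigma_0^2} \right\}}{\exp\left\{ \frac{\mu_0}{\sigma_0^2} \sum_{i=1}^{n} y_i - \frac{n \mu_0^2}{2\sigma_0^2} \right\}} \tag{2.5-31}$$

$$= \exp\left\{ \frac{\mu_1}{\sigma_0^2} \sum_{i=1}^{n} y_i - \frac{\mu_0}{\sigma_0^2} \sum_{i=1}^{n} y_i - \frac{n \mu_1^2}{2\sigma_0^2} + \frac{n \mu_0^2}{2\sigma_0^2} \right\} \tag{2.5-32}$$

$$= \exp\left\{ \frac{\mu_1 - \mu_0}{\sigma_0^2} \sum_{i=1}^{n} y_i - \frac{n(\mu_1^2 - \mu_0^2)}{2\sigma_0^2} \right\}, \tag{2.5-33}$$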

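which reveals two remarkable facts: (1) the simplified density ratio depends on the observations only through their sum $\sum y_i$, and (2) the density ratio, as an exponential function, is a positive number. Therefore, we may choose another positive number $k_\alpha$ such that

\[
\exp\left\{\frac{\mu_1-\mu_0}{\sigma_0^2}\sum y_i - \frac{n}{2\sigma_0^2}\left(\mu_1^2-\mu_0^2\right)\right\} > k_\alpha \tag{2.5-34}
\]

holds. Taking natural logarithms on both sides of this inequality yields

\[
\frac{\mu_1-\mu_0}{\sigma_0^2}\sum y_i - \frac{n}{2\sigma_0^2}\left(\mu_1^2-\mu_0^2\right) > \ln k_\alpha
\]

or, after multiplication by $\sigma_0^2$ and expansion of the left side by $n/n$,

\[
n(\mu_1-\mu_0)\,\frac{1}{n}\sum y_i > \sigma_0^2 \ln k_\alpha + \frac{n}{2}\left(\mu_1^2-\mu_0^2\right).
\]

Depending on whether $\mu_1 > \mu_0$ or $\mu_1 < \mu_0$, the sample mean $\bar y = \frac{1}{n}\sum y_i$ must satisfy

\[
\bar y > \frac{\sigma_0^2 \ln k_\alpha + \frac{n}{2}(\mu_1^2-\mu_0^2)}{n(\mu_1-\mu_0)} =: k'_\alpha \quad (\text{if } \mu_1 > \mu_0)
\]

or

\[
\bar y < \frac{\sigma_0^2 \ln k_\alpha + \frac{n}{2}(\mu_1^2-\mu_0^2)}{n(\mu_1-\mu_0)} =: k'_\alpha \quad (\text{if } \mu_1 < \mu_0)
\]

in order for the second condition (2.5-26) of the Neyman-Pearson Lemma to hold. Note that the quantities $\sigma_0$, $n$, $\mu_0$, $\mu_1$ are all constants fixed a priori, and $k_\alpha$ is a constant whose exact value is still to be determined. Thus, $k'_\alpha$ is itself an unknown constant.

Now, in order for the first condition (2.5-25) of the Neyman-Pearson Lemma to hold in addition, $S_C$ must have size $\alpha$ under the null hypothesis. As mentioned above, the critical region $S_C$ may be constructed solely by inspecting the value $\bar y$, which may be viewed as the outcome of the random variable $\bar Y := \frac{1}{n}\sum Y_i$. Under $H_0$, $\bar Y$ is normally distributed with expectation $\mu_0$ (identical to the expectation of each of the original observations $Y_1,\ldots,Y_n$) and standard deviation $\sigma_0/\sqrt{n}$. Therefore, the size is determined by

\[
\alpha = \begin{cases} N_{\mu_0,\sigma_0^2/n}\{\bar Y > k'_\alpha\} & \text{if } \mu_1 > \mu_0, \\ N_{\mu_0,\sigma_0^2/n}\{\bar Y < k'_\alpha\} & \text{if } \mu_1 < \mu_0. \end{cases}
\]

It will be more convenient to standardize $\bar Y$ because this allows us to evaluate the size in terms of the standard normal distribution. The condition to be satisfied by $k'_\alpha$ then reads

\[
\alpha = \begin{cases} N_{0,1}\left\{\frac{\bar Y-\mu_0}{\sigma_0/\sqrt{n}} > \frac{k'_\alpha-\mu_0}{\sigma_0/\sqrt{n}}\right\} & \text{if } \mu_1 > \mu_0, \\[1ex] N_{0,1}\left\{\frac{\bar Y-\mu_0}{\sigma_0/\sqrt{n}} < \frac{k'_\alpha-\mu_0}{\sigma_0/\sqrt{n}}\right\} & \text{if } \mu_1 < \mu_0, \end{cases}
\]

or, using the standard normal distribution function $\Phi$,

\[
\alpha = \begin{cases} 1-\Phi\left(\frac{k'_\alpha-\mu_0}{\sigma_0/\sqrt{n}}\right) & \text{if } \mu_1 > \mu_0, \\[1ex] \Phi\left(\frac{k'_\alpha-\mu_0}{\sigma_0/\sqrt{n}}\right) & \text{if } \mu_1 < \mu_0. \end{cases}
\]

Rewriting this as

\[
\Phi\left(\frac{k'_\alpha-\mu_0}{\sigma_0/\sqrt{n}}\right) = \begin{cases} 1-\alpha & \text{if } \mu_1 > \mu_0, \\ \alpha & \text{if } \mu_1 < \mu_0 \end{cases}
\]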

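allows us to determine the argument of $\Phi$ by applying the inverse standard normal distribution function $\Phi^{-1}$ to the previous equation, which yields

\[
\frac{k'_\alpha-\mu_0}{\sigma_0/\sqrt{n}} = \begin{cases} \Phi^{-1}(1-\alpha) & \text{if } \mu_1 > \mu_0, \\ \Phi^{-1}(\alpha) & \text{if } \mu_1 < \mu_0, \end{cases}
\]

from which the constant $k'_\alpha$ is obtained as

\[
k'_\alpha = \begin{cases} \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha) & \text{if } \mu_1 > \mu_0, \\ \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(\alpha) & \text{if } \mu_1 < \mu_0, \end{cases}
\qquad\text{or}\qquad
k'_\alpha = \begin{cases} \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha) & \text{if } \mu_1 > \mu_0, \\ \mu_0 - \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha) & \text{if } \mu_1 < \mu_0. \end{cases}
\]

Consequently, depending on the sign of $\mu_1-\mu_0$, there are two different values for $k'_\alpha$ that satisfy the first condition (2.5-25) of the Neyman-Pearson Lemma. When $\mu_1 > \mu_0$, the BCR is seen to consist of all the observations $y \in S$ for which

\[
\bar y > \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha), \tag{2.5-35}
\]

and when $\mu_1 < \mu_0$, the BCR reads

\[
\bar y < \mu_0 - \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha). \tag{2.5-36}
\]

In the first case ($\mu_1 > \mu_0$), the MP test is given by

\[
\phi_u(y) = \begin{cases} 1 & \text{if } \bar y > \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha), \\ 0 & \text{if } \bar y < \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha), \end{cases} \tag{2.5-37}
\]

and in the second case ($\mu_1 < \mu_0$), the MP test is

\[
\phi_l(y) = \begin{cases} 1 & \text{if } \bar y < \mu_0 - \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha), \\ 0 & \text{if } \bar y > \mu_0 - \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha). \end{cases} \tag{2.5-38}
\]

Observe that the critical regions depend solely on the value of the one-dimensional random variable $\bar Y$, which, as a function of the observations $Y$, is also called a statistic. As this statistic appears in the specific context of hypothesis testing, we will speak of $\bar Y$ as a test statistic. We see from this that it is not necessary to actually specify an $n$-dimensional region $S_C$ used as the BCR; the BCR may be expressed conveniently in terms of one-dimensional intervals. For this purpose, let $(c_u, +\infty)$ and $(-\infty, c_l)$ denote the critical regions with respect to the sample mean $\bar y$ as defined by (2.5-35) and (2.5-36). The real constants

\[
c_u := \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha) \tag{2.5-39}
\]

and

\[
c_l := \mu_0 - \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha) \tag{2.5-40}
\]

are called the upper critical value and the lower critical value corresponding to the BCR for testing $H_0$ versus $H_1$. In a practical situation, it will be clear from the numerical specification of $H_1$ which of the tests (2.5-37) and (2.5-38) should be applied. Then, the test is carried out by computing the mean $\bar y$ of the given data $y$ and by checking how large its value is in comparison to the critical value of (2.5-37) or (2.5-38), respectively.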
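The construction in Example 2.7 can be sketched numerically. The following is a minimal illustration, not part of the original derivation: the function names `critical_values` and `mp_test` are hypothetical, and the standard normal quantile $\Phi^{-1}$ is taken from Python's standard library.

```python
from math import sqrt
from statistics import NormalDist


def critical_values(mu0, sigma0, n, alpha):
    """Return (c_l, c_u), the lower and upper critical values of (2.5-40) and (2.5-39)."""
    z = NormalDist().inv_cdf(1.0 - alpha)   # Phi^{-1}(1 - alpha)
    half_width = sigma0 / sqrt(n) * z
    return mu0 - half_width, mu0 + half_width


def mp_test(y, mu0, mu1, sigma0, alpha):
    """Most powerful test of H0: mu = mu0 against the simple H1: mu = mu1.

    Returns 1 (reject H0) or 0 (do not reject), following (2.5-37) and (2.5-38).
    """
    n = len(y)
    ybar = sum(y) / n
    c_l, c_u = critical_values(mu0, sigma0, n, alpha)
    if mu1 > mu0:
        return int(ybar > c_u)   # upper one-sided critical region
    return int(ybar < c_l)       # lower one-sided critical region


# Hypothetical data: four observations, sigma0 = 1, level 0.05.
decision = mp_test([0.9, 1.4, 1.1, 1.6], mu0=0.0, mu1=1.0, sigma0=1.0, alpha=0.05)
```

Note that the value of $\mu_1$ enters only through its position relative to $\mu_0$, which is the property exploited later when composite alternatives are considered.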

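Example 2.8: A most powerful test about the Beta distribution. Let $Y_1,\ldots,Y_n$ be independently and $B(\alpha,\beta)$-distributed observations on $[0,1]$ with common unknown parameter $\bar\alpha$ (which in this case is not to be confused with the size or level of the test) and common known parameter $\bar\beta = 1$ (not to be confused with the probability of a Type II error). What is the BCR for a test of the simple null hypothesis $H_0: \bar\alpha = \alpha_0 = 1$ against the simple alternative hypothesis $H_1: \bar\alpha = \alpha_1 = 2$ at level $\alpha$?

The density function of the univariate Beta distribution in standard form is defined by

\[
f(y;\alpha,\beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, y^{\alpha-1}(1-y)^{\beta-1} \quad (0<y<1;\ \alpha,\beta>0), \tag{2.5-41}
\]

see Johnson and Kotz (1970b, p. 37) or Koch (1999, p. 5). Notice that (2.5-41) simplifies under $H_0$ to

\[
f(y;\alpha_0) = \frac{\Gamma(2)}{\Gamma(1)\Gamma(1)}\, y^{0}(1-y)^{0} = 1 \quad (0<y<1), \tag{2.5-42}
\]

and under $H_1$ to

\[
f(y;\alpha_1) = \frac{\Gamma(3)}{\Gamma(2)\Gamma(1)}\, y^{1}(1-y)^{0} = 2y \quad (0<y<1), \tag{2.5-43}
\]

where we used the facts that $\Gamma(1) = \Gamma(2) = 1$ and $\Gamma(3) = 2$. The density (2.5-42) defines the so-called uniform distribution with parameters $a = 0$ and $b = 1$, see Johnson and Kotz (1970b, p. 57) or Koch (2000).

We may now proceed as in Example 2.7 and determine the BCR by using the Neyman-Pearson Lemma (Theorem 2.1). For $n$ independent observations, the joint density may be written as the product of the individual univariate densities, which results in the density ratio

\[
\frac{f(y;\alpha_1)}{f(y;\alpha_0)} = \prod_{i=1}^n 2y_i \Big/ \prod_{i=1}^n 1 = 2^n \prod_{i=1}^n y_i, \tag{2.5-44}
\]

where we assumed that each observation is strictly within the interval $(0,1)$. As the density ratio is a positive number, we may choose a number $k_\alpha$ such that $2^n \prod y_i > k_\alpha$ holds. Division by $2^n$ and taking both sides to the power of $1/n$ yields the equivalent inequality

\[
\left(\prod_{i=1}^n y_i\right)^{1/n} > \left(2^{-n} k_\alpha\right)^{1/n}.
\]

Now we have found a seemingly convenient condition about the sample's geometric mean $\breve Y := \left(\prod Y_i\right)^{1/n}$ rather than about the entire sample $Y$ itself. Then the second condition (2.5-26), or equivalently (2.5-28), of the Neyman-Pearson Lemma gives

\[
\phi(y) = \begin{cases} 1 & \text{if } \breve y > \left(2^{-n} k_\alpha\right)^{1/n} =: k'_\alpha, \\ 0 & \text{if } \breve y < \left(2^{-n} k_\alpha\right)^{1/n} =: k'_\alpha. \end{cases}
\]

To ensure that $\phi$ has some specified level $\alpha$, the first condition (2.5-25) of the Neyman-Pearson Lemma requires that $\alpha$ equals the probability under $H_0$ that the geometric mean exceeds $k'_\alpha$.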
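Unfortunately, in contrast to the arithmetic mean $\bar Y$ of independent normal variables, the geometric mean $\breve Y$ of independent standard uniform variables does not have a standard distribution. However, as Stuart and Ord (2003, p. 393) demonstrate in an example, the statistic

\[
U := -n \ln \breve Y = -\ln \prod_{i=1}^n Y_i = -\sum_{i=1}^n \ln Y_i
\]

follows a Gamma distribution $G(b,p)$ with $b = 1$ and $p = n$, as defined in Koch (1999). Because $U$ is a decreasing function of the geometric mean, the event $\breve Y > k'_\alpha$ is the same as the event $U < k''_\alpha$ with $k''_\alpha := -n \ln k'_\alpha$. Thus the first Neyman-Pearson condition reads

\[
\alpha = G_{1,n}\{U < k''_\alpha\} = F_{G(1,n)}(k''_\alpha),
\]

from which the critical value follows to be $k''_\alpha = F^{-1}_{G(1,n)}(\alpha)$, which may be obtained in MATLAB by executing the command CV = gaminv(α, n, 1). In summary, the MP test is given by

\[
\phi(y) = \begin{cases} 1 & \text{if } u(y) = -\sum \ln y_i < k''_\alpha = -n \ln k'_\alpha = F^{-1}_{G(1,n)}(\alpha), \\ 0 & \text{if } u(y) = -\sum \ln y_i > k''_\alpha = -n \ln k'_\alpha = F^{-1}_{G(1,n)}(\alpha). \end{cases}
\]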
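As a hedged illustration of Example 2.8, the Gamma quantile can also be computed without a statistics toolbox. The sketch below relies on the fact that a Gamma distribution with integer shape $n$ and unit rate (an Erlang distribution) has a closed-form distribution function, and inverts it by bisection. The function names are hypothetical, and the rejection rule assumes that $H_0$ is rejected for large geometric means, that is, for small values of $U = -\sum \ln y_i$.

```python
from math import exp, log


def erlang_cdf(x, n):
    """Distribution function of a Gamma variable with integer shape n and rate 1."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= x / k          # x**k / k!
        total += term
    return 1.0 - exp(-x) * total


def erlang_quantile(prob, n):
    """Invert erlang_cdf by bisection (a fixed number of halvings keeps it simple)."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, n) < prob:
        hi *= 2.0              # enlarge the bracket until it contains the quantile
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, n) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def beta_mp_test(y, alpha):
    """Reject H0 (uniform data) in favour of the density 2y when U is small."""
    n = len(y)
    u = -sum(log(v) for v in y)
    return int(u < erlang_quantile(alpha, n))
```

For a single observation ($n = 1$) the rule reduces to rejecting $H_0$ when $y > 1 - \alpha$, which has probability $\alpha$ under the uniform distribution.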

2.5.2 Reduction to sufficient statistics

We saw in Example 2.7 that applying the conditions of the Neyman-Pearson Lemma to derive the BCR led to a condition about the sample mean $\bar y$ rather than about the original data $y$. We might say that it was sufficient to use the mean value of the data for testing a hypothesis about the parameter $\mu$ of the normal distribution. This raises the important question of whether it is always possible to reduce the data in such a way.

To generalize this idea, let $F = \{f(y;\theta) : \theta \in \Theta\}$ be a collection of densities where the parameter $\theta$ is unknown. Further, let each $f(y;\theta)$ depend on the value of a random function or statistic $T(Y)$ which is independent of $\theta$. If any inference about $\theta$, be it estimation or testing, depends on the observations $Y$ only through the value of $T(Y)$, then this statistic will be called sufficient for $\theta$.

This qualitative definition of sufficiency can be interpreted such that a sufficient statistic captures all the relevant information that the data contain about the unknown parameters. The point is that the data might carry additional information that does not contribute anything to solving the estimation or test problem. The following classical example highlights this distinction between information that is essential and information that is completely negligible for estimating an unknown parameter.

Example 2.9: Sufficient statistic in Bernoulli's random experiment. Let $Y_1,\ldots,Y_n$ denote independent binary observations within an idealized setting of Bernoulli's random experiment (see, for instance, Lehmann, 1959a). The probability $p$ of the elementary event success ($y_i = 1$) is assumed to be unknown, but valid for all observations. The probability of the second possible outcome failure ($y_i = 0$) is then $1-p$. Now, it is intuitively clear that in order to estimate the unknown success rate $p$, it is completely sufficient to know how many successes $T(y) := \sum y_i$ occurred in total within $n$ trials. The additional information regarding which specific observations were successes or failures does not contribute anything useful for determining the success rate $p$.
In this sense the use of the statistic $T(Y)$ reduces the $n$ data to a single value which carries all the essential information required to determine $p$.

The concept of sufficiency provides a convenient tool to achieve a data reduction without any loss of information about the unknown parameters. The definition above, however, is not easily applicable when one has to deal with specific estimation or testing problems. As a remedy, Neyman's Factorization Theorem gives an easy-to-check condition for the existence of a sufficient statistic in any given parametric inference problem.

Theorem 2.2 (Neyman's Factorization Theorem). Let $F = \{f(y;\theta) : \theta \in \Theta\}$ be a collection of densities for a sample $Y = (Y_1,\ldots,Y_n)$. A vector of statistics $T(Y)$ is sufficient for $\theta$ if and only if there exist functions $g(T(Y);\theta)$ and $h(y)$ such that

\[
f(y;\theta) = g(T(y);\theta) \cdot h(y) \tag{2.5-45}
\]

holds for all $\theta \in \Theta$ and all $y \in S$.

Proof. A deeper understanding of the sufficiency concept involves an investigation into conditional probabilities, which is beyond the scope of this thesis. The reader familiar with conditional probabilities is referred to Lehmann and Romano (2005) for a proof of this theorem.

It is easy to see that the trivial choice $T(y) := y$, $g(T(y);\theta) := f(y;\theta)$ and $h(y) := 1$ is always possible, but achieves no data reduction. Far more useful is the fact that any reversible function of a sufficient statistic is also sufficient for $\theta$ (cf. Casella and Berger, 2002, p. 8). In particular, multiplying a sufficient statistic by constants yields again a sufficient statistic. The following example will now establish sufficient statistics for the normal density with both parameters $\mu$ and $\sigma^2$ unknown.

Example 2.10: Suppose that observations $Y_1,\ldots,Y_n$ are independently and normally distributed with common unknown mean $\mu$ and common unknown variance $\sigma^2$. Let the sample mean and variance be defined as $\bar Y = \sum Y_i/n$ and $S^2 = \sum (Y_i-\bar Y)^2/(n-1)$, respectively. The joint normal density can then be written as

\[
f(y;\mu,\sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(y_i-\mu)^2}{2\sigma^2}\right\}
= (2\pi\sigma^2)^{-n/2} \exp\left\{-\frac{n\mu^2}{2\sigma^2} + \frac{\mu}{\sigma^2}\sum y_i - \frac{1}{2\sigma^2}\sum y_i^2\right\}
\]
\[
= (2\pi\sigma^2)^{-n/2} \exp\left\{-\frac{n}{2\sigma^2}(\bar y-\mu)^2 - \frac{n-1}{2\sigma^2}\, s^2\right\} \cdot I_{\mathbb{R}^n}(y),
\]

where $T(Y) := [\bar Y, S^2]'$ is sufficient for $(\mu,\sigma^2)$ and $h(y) := I_{\mathbb{R}^n}(y) = 1$ with $I$ as the indicator function.
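The factorization in Example 2.10 can be checked numerically. The following sketch, with hypothetical function names, evaluates the joint normal log-density once from the raw data and once from the pair $(\bar y, s^2)$ alone. It assumes the sample variance is defined with divisor $n-1$.

```python
from math import log, pi, sqrt


def loglik_raw(y, mu, sigma2):
    """Joint normal log-density evaluated from all n observations."""
    n = len(y)
    return -0.5 * n * log(2.0 * pi * sigma2) - sum((v - mu) ** 2 for v in y) / (2.0 * sigma2)


def sample_mean_var(y):
    """Sample mean and sample variance (divisor n - 1)."""
    n = len(y)
    ybar = sum(y) / n
    s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
    return ybar, s2


def loglik_sufficient(n, ybar, s2, mu, sigma2):
    """The same log-density, computed only from n and the sufficient statistics."""
    return -0.5 * n * log(2.0 * pi * sigma2) - (n * (ybar - mu) ** 2 + (n - 1) * s2) / (2.0 * sigma2)


# Two different hypothetical samples sharing the same mean (2) and variance (1).
a = 1.0 / sqrt(3.0)
y1 = [1.0, 2.0, 3.0]
y2 = [2.0 - a, 2.0 - a, 2.0 + 2.0 * a]
```

Both samples yield the same density value for every choice of $(\mu, \sigma^2)$, which is what it means for the data to enter only through $T(y)$.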

The great practical value of Neyman's Factorization Theorem in connection with hypothesis testing lies in the simple fact that any density ratio will automatically simplify in the same way as in Example 2.7 from (2.5-30) to (2.5-31). What generally happens is that the factor $h(y)$ is the same for $\theta_0$ and $\theta_1$ due to its independence of any parameters, and thereby cancels out in the ratio, that is,

\[
\frac{f(y;\theta_1)}{f(y;\theta_0)} = \frac{g(T(y);\theta_1)\, h(y)}{g(T(y);\theta_0)\, h(y)} = \frac{g(T(y);\theta_1)}{g(T(y);\theta_0)} \quad (\text{for all } y \in S). \tag{2.5-46}
\]

In addition, this ratio will now be a function of the observations $Y$ through a statistic $T(Y)$ which is usually low-dimensional, such as $[\bar Y, S^2]'$ in Example 2.10. This usually reduces the complexity and dimensionality of the test problem greatly.

Example 2.7 revisited. Instead of starting the derivation of the BCR by setting up the density ratio $f(y;\theta_1)/f(y;\theta_0)$ of the raw data as in (2.5-29), we could save time by first reducing $Y$ to the sufficient statistic $T(Y) = \bar Y$ and by applying (2.5-46) in connection with the distribution $N(\mu, \sigma_0^2/n)$ of the sample mean. Then

\[
\frac{g(\bar y;\theta_1)}{g(\bar y;\theta_0)}
= \frac{(2\pi\sigma_0^2/n)^{-1/2} \exp\left\{-\frac{(\bar y-\mu_1)^2}{2\sigma_0^2/n}\right\}}{(2\pi\sigma_0^2/n)^{-1/2} \exp\left\{-\frac{(\bar y-\mu_0)^2}{2\sigma_0^2/n}\right\}}
= \exp\left\{-\frac{n}{2\sigma_0^2}(\bar y-\mu_1)^2 + \frac{n}{2\sigma_0^2}(\bar y-\mu_0)^2\right\}
= \exp\left\{\frac{n}{\sigma_0^2}(\mu_1-\mu_0)\,\bar y - \frac{n}{2\sigma_0^2}\left(\mu_1^2-\mu_0^2\right)\right\}
\]

leads to (2.5-33) more directly, since $n\bar y = \sum y_i$.

We have seen so far that the sample mean is sufficient when $\mu$ is the only unknown parameter, and that the sample mean and variance are jointly sufficient when $\mu$ and $\sigma^2$ are unknown. Now, what is the maximal reduction generally possible for data that are generated by a more complex observation model, such as those introduced earlier? Clearly, when a parametric estimation or testing problem comprises $u$ unknown parameters that are not redundant, then a reduction from $n > u$ observations to $u$ corresponding statistics appears to be maximal. It is difficult to give clear-cut conditions that would encompass all possible statistical models and that would also be easily comprehensible without going into too many mathematical details. Therefore, the problem will be addressed only by providing a working definition and a practical theorem, which will be applicable to most of the test problems in this thesis.
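Now, to be more specific, we will call a sufficient statistic $T(Y)$ minimally sufficient if, for any other sufficient statistic $T'(Y)$, $T(Y)$ is a function of $T'(Y)$. As this definition is rather impractical, the following theorem of Lehmann and Scheffé will be a useful tool.

Theorem 2.3 (Lehmann-Scheffé). Let $f(y;\theta)$ denote the joint density function of observations $Y$. Suppose there exists a statistic $T(Y)$ such that, for every two data points $y_1$ and $y_2$, the ratio $f(y_1;\theta)/f(y_2;\theta)$ is constant as a function of $\theta$ if and only if $T(y_1) = T(y_2)$. Then $T(Y)$ is minimally sufficient for $\theta$.

Proof. See Casella and Berger (2002).

Example 2.11: Suppose that observations $Y_1,\ldots,Y_n$ are independently and normally distributed with common unknown mean $\mu$ and common unknown variance $\sigma^2$. Let $y_1$ and $y_2$ be two data points, and let $(\bar y_1, s_1^2)$ and $(\bar y_2, s_2^2)$ be the corresponding values of the sample mean $\bar Y$ and variance $S^2$. To prove that the sample mean and variance are minimally sufficient statistics, the ratio of densities is rewritten as

\[
\frac{f(y_1;\mu,\sigma^2)}{f(y_2;\mu,\sigma^2)}
= \frac{\prod_{i=1}^n (2\pi\sigma^2)^{-1/2} \exp\left\{-\frac{(y_{1,i}-\mu)^2}{2\sigma^2}\right\}}{\prod_{i=1}^n (2\pi\sigma^2)^{-1/2} \exp\left\{-\frac{(y_{2,i}-\mu)^2}{2\sigma^2}\right\}}
= \frac{(2\pi\sigma^2)^{-n/2} \exp\left\{-\left[n(\bar y_1-\mu)^2 + (n-1)s_1^2\right]/(2\sigma^2)\right\}}{(2\pi\sigma^2)^{-n/2} \exp\left\{-\left[n(\bar y_2-\mu)^2 + (n-1)s_2^2\right]/(2\sigma^2)\right\}}
\]
\[
= \exp\left\{\left[-n(\bar y_1^2-\bar y_2^2) + 2n\mu(\bar y_1-\bar y_2) - (n-1)(s_1^2-s_2^2)\right]/(2\sigma^2)\right\}.
\]

As this ratio is constant only if $\bar y_1 = \bar y_2$ and $s_1^2 = s_2^2$, the statistic $T(Y) = (\bar Y, S^2)$ is indeed minimally sufficient. The observations $Y$ cannot be reduced beyond $T(Y)$ without losing relevant information.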
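The criterion of Theorem 2.3 can be illustrated with a small numerical experiment. This is a sketch with hypothetical names and data, assuming the normal model of Example 2.11: the log density ratio of two samples is evaluated at several parameter values, and it stays constant only when the two samples share the same mean and variance.

```python
from math import log, pi, sqrt


def log_density(y, mu, sigma2):
    """Joint normal log-density of the sample y."""
    n = len(y)
    return -0.5 * n * log(2.0 * pi * sigma2) - sum((v - mu) ** 2 for v in y) / (2.0 * sigma2)


def log_ratio(y_a, y_b, mu, sigma2):
    """Logarithm of f(y_a; theta) / f(y_b; theta)."""
    return log_density(y_a, mu, sigma2) - log_density(y_b, mu, sigma2)


a = 1.0 / sqrt(3.0)
y1 = [1.0, 2.0, 3.0]
y2 = [2.0 - a, 2.0 - a, 2.0 + 2.0 * a]   # same mean and variance as y1
y3 = [1.0, 2.0, 6.0]                     # different mean and variance

thetas = [(0.0, 1.0), (2.0, 0.5), (-1.0, 4.0)]
same_stats = [log_ratio(y1, y2, mu, s2) for mu, s2 in thetas]
diff_stats = [log_ratio(y1, y3, mu, s2) for mu, s2 in thetas]
```

The values in `same_stats` coincide (up to rounding) across all parameter values, whereas those in `diff_stats` change with $(\mu, \sigma^2)$.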

2.5.3 Uniformly most powerful (UMP) tests

The concept of the BCR for testing a simple $H_0$ against a simple $H_1$ about a single parameter, as defined by the Neyman-Pearson Lemma, is unsatisfactory insofar as the great majority of test problems involves composite alternatives. The question to be addressed in this subsection is how a BCR may be defined for such problems.

Let us start with the basic premise that we seek an optimal critical function for testing the simple null

\[
H_0: \theta = \theta_0 \tag{2.5-47}
\]

versus a composite alternative hypothesis

\[
H_1: \theta \in \Theta_1, \tag{2.5-48}
\]

where the set of parameter values $\Theta_1$ and $\theta_0$ are disjoint subsets of a one-dimensional parameter space $\Theta$. The most straightforward way to establish optimality under these conditions is to determine the BCR for testing $H_0$ against a fixed simple $H_1': \theta = \theta_1$ for an arbitrary $\theta_1 \in \Theta_1$ and to check whether the resulting BCR is independent of the specific value $\theta_1$. If this is the case, then all the values $\theta_1 \in \Theta_1$ produce the same BCR, because $\theta_1$ was selected arbitrarily. This critical region that all the simple alternatives $H_1'$ in

\[
H_1 = \{H_1': \theta = \theta_1 \text{ with } \theta_1 \in \Theta_1\} \tag{2.5-49}
\]

have in common may then be defined as the BCR for testing a simple $H_0$ against a composite $H_1$. A test based on such a BCR is called uniformly most powerful (UMP) for testing $H_0$ versus $H_1$ at level $\alpha$.

Now, it seems rather cumbersome to derive the BCR for a composite $H_1$ by applying the conditions of the Neyman-Pearson Lemma to each simple $H_1' \in H_1$. The following theorem replaces this infeasible procedure by conditions that can be verified more directly. These conditions say that in order for a UMP test to exist, (1) the test problem may have only one unknown parameter, (2) the alternative hypothesis must be one-sided, and (3) each distribution in $W$ must have a so-called monotone density ratio. The third condition means that, for all $\theta_1 > \theta_0$ with $\theta_0, \theta_1 \in \Theta$, the ratio $f(y;\theta_1)/f(y;\theta_0)$ (or the ratio $g(t;\theta_1)/g(t;\theta_0)$ in terms of the sufficient statistic $T(Y)$) must be a strictly monotonic function of $T(Y)$. The following example will illuminate this issue.
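Example 2.12: To show that the normal distribution $N(\mu, \sigma_0^2)$ with unknown $\mu$ and known $\sigma_0^2$ has a monotone density ratio, we may directly inspect the simplified density ratio (2.5-33) from Example 2.7. We see immediately that the ratio is an increasing function of $T(y) := \sum y_i$ when $\mu_1 > \mu_0$.

Theorem 2.4. Let $W$ be a class of distributions with a one-dimensional parameter space and monotone density ratio in some statistic $T(Y)$.

1. Suppose that $H_0: \theta = \theta_0$ is to be tested against the upper one-sided alternative $H_1: \theta > \theta_0$. Then, there exists a UMP test $\phi_u$ at level $\alpha$ and a constant $C$ with

\[
\phi_u(T(y)) := \begin{cases} 1, & \text{if } T(y) > C, \\ 0, & \text{if } T(y) < C \end{cases} \tag{2.5-50}
\]

and

\[
P_{\theta_0}\{\phi_u(T(Y)) = 1\} = \alpha. \tag{2.5-51}
\]

2. For testing $H_0$ against the lower one-sided alternative $H_1: \theta < \theta_0$, there exists a UMP test $\phi_l$ at level $\alpha$ and a constant $C$ with

\[
\phi_l(T(y)) := \begin{cases} 1, & \text{if } T(y) < C, \\ 0, & \text{if } T(y) > C \end{cases} \tag{2.5-52}
\]

and

\[
P_{\theta_0}\{\phi_l(T(Y)) = 1\} = \alpha. \tag{2.5-53}
\]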

Proof. To prove 1, consider first the case of a simple alternative $H_1': \theta = \theta_1$ for some $\theta_1 > \theta_0$. With the values for $\theta_0$ and $\theta_1$ fixed, the density ratio can be written as

\[
\frac{f(y;\theta_1)}{f(y;\theta_0)} = \frac{g(T(y);\theta_1)}{g(T(y);\theta_0)} = h(T(y)),
\]

that is, as a function of the observations alone. According to the Neyman-Pearson Lemma (Theorem 2.1), the ratio must be large enough, i.e. $h(T(y)) > k$ with $k$ depending on $\alpha$. Now, if $T(y_1) < T(y_2)$ holds for some $y_1, y_2 \in S$, then certainly also $h(T(y_1)) \le h(T(y_2))$ due to the assumption that the density ratio is monotone in $T(Y)$. In other words, the observation $y_2$ is in both cases at least as suitable as $y_1$ for making the ratio $h$ sufficiently large. In this way, the BCR may be equally well constructed from all the data $y \in S$ for which $T(y)$ is large enough, for instance $T(y) > C$, where the constant $C$ must be determined such that the size of this BCR equals the prescribed value $\alpha$. As these implications are true regardless of the exact value $\theta_1$, the BCR will be the same for all the simple alternatives with $\theta_1 > \theta_0$. Therefore, the test (2.5-50) is UMP. The proof of 2 follows the same sequence of arguments with all inequalities reversed.

The next theorem is of great practical value as it ensures that most of the standard distributions used in hypothesis testing have a monotone density ratio even in their non-central forms.

Theorem 2.5. The following one-parameter distributions (with possibly additional known parameters $\mu_0$, $\sigma_0^2$, $p$ and known degrees of freedom $f$, $f_1$, $f_2$) have a density with monotone density ratio in some statistic $T$:

1. Multivariate independent normal distributions $N(\mathbf{1}\mu, \sigma_0^2 I)$ and $N(\mathbf{1}\mu_0, \sigma^2 I)$,
2. Gamma distribution $G(b, p)$,
3. Beta distribution $B(\alpha, \beta)$,
4. Non-central Student distribution $t(f, \lambda)$,
5. Non-central Chi-squared distribution $\chi^2(f, \lambda)$,
6. Non-central Fisher distribution $F(f_1, f_2, \lambda)$.

Proof. The proofs of 1 and 2 may be elegantly based on the more general result that any density that is a member of the one-parameter exponential family, defined by

\[
f(y;\theta) = h(y)\, c(\theta) \exp\{w(\theta)\, T(y)\}, \quad h(y) \ge 0,\ c(\theta) \ge 0 \tag{2.5-54}
\]

(cf. Olive, 2006, for more details), has a monotone density ratio (see Lehmann and Romano, 2005, p. 67), and that the normal and Gamma distributions can be written in this form (2.5-54).

1. The density function of $N(\mathbf{1}\mu, \sigma_0^2 I)$ (2.5-29) can be rewritten as

\[
f(y;\mu) = (2\pi\sigma_0^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma_0^2}\sum y_i^2\right\} \exp\left\{-\frac{n\mu^2}{2\sigma_0^2}\right\} \exp\left\{\frac{\mu}{\sigma_0^2}\sum y_i\right\},
\]

where $h(y) := (2\pi\sigma_0^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma_0^2}\sum y_i^2\right\}$, $c(\theta) := \exp\left\{-\frac{n\mu^2}{2\sigma_0^2}\right\}$, $w(\theta) := \frac{\mu}{\sigma_0^2}$, and $T(y) := \sum y_i$ satisfy (2.5-54). Similarly, the density function of $N(\mathbf{1}\mu_0, \sigma^2 I)$ reads, in terms of (2.5-54),

\[
f(y;\sigma^2) = I_{\mathbb{R}^n}(y)\,(2\pi\sigma^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma^2}\sum (y_i-\mu_0)^2\right\},
\]

where $h(y)$ corresponds to the indicator function $I_{\mathbb{R}^n}(y)$ with definite value one, $c(\theta) := (2\pi\sigma^2)^{-n/2}$, $w(\theta) := -\frac{1}{2\sigma^2}$, and $T(y) := \sum (y_i-\mu_0)^2$.

2. The Gamma distribution, as defined in Koch (1999), with known parameter $p$ may be directly written as

\[
f(y;b) = \frac{y^{p-1}}{\Gamma(p)}\, b^p \exp\{-by\} \quad (b > 0,\ p > 0,\ y \in \mathbb{R}^+),
\]

where $h(y) := y^{p-1}/\Gamma(p)$, $c(\theta) := b^p$, $w(\theta) := -b$, and $T(y) := y$ satisfy (2.5-54).

3.-6. The proofs for these distributions are lengthy and may be obtained from Lehmann and Romano (2005, p. 4 and 37).
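The exponential-family representation (2.5-54) for the normal distribution with known variance can be verified numerically. In the sketch below (hypothetical function names; $\sigma_0$ denotes the known standard deviation), the product form of the joint density is compared with $h(y)\,c(\theta)\exp\{w(\theta)T(y)\}$, and the density ratio is expressed as a function of $T = \sum y_i$ alone.

```python
from math import exp, pi, sqrt


def normal_joint_density(y, mu, sigma0):
    """Product of n univariate normal densities with mean mu and std sigma0."""
    value = 1.0
    for v in y:
        value *= exp(-((v - mu) ** 2) / (2.0 * sigma0 ** 2)) / (sqrt(2.0 * pi) * sigma0)
    return value


def expfam_density(y, mu, sigma0):
    """The same density written as h(y) * c(theta) * exp(w(theta) * T(y))."""
    n = len(y)
    var = sigma0 ** 2
    h = (2.0 * pi * var) ** (-n / 2.0) * exp(-sum(v * v for v in y) / (2.0 * var))
    c = exp(-n * mu * mu / (2.0 * var))
    w = mu / var
    t = sum(y)
    return h * c * exp(w * t)


def density_ratio(t, n, mu0, mu1, sigma0):
    """Ratio f(y; mu1) / f(y; mu0) as a function of T = sum(y) only; h(y) cancels."""
    var = sigma0 ** 2
    return exp((mu1 - mu0) / var * t - n * (mu1 ** 2 - mu0 ** 2) / (2.0 * var))
```

For $\mu_1 > \mu_0$ the function `density_ratio` increases in `t`, which is the monotone density ratio property used in Theorem 2.4.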

Example 2.13: Test of the normal mean with known variance - Composite alternatives. We are now in a position to extend Example 2.7 and seek BCRs for composite alternative hypotheses. For demonstration purposes, both the raw definition of a UMP test and the more convenient Theorem 2.4 will be applied if possible. Let us first look at the formal statement of the test problems.

Let $Y_1,\ldots,Y_n$ be independently and normally distributed observations with common unknown mean $\mu$ and common known variance $\sigma^2 = \sigma_0^2$. Do UMP tests for testing the simple null hypothesis $H_0: \mu = \mu_0$ against the composite alternative hypotheses

1. $H_1: \mu > \mu_0$,
2. $H_1: \mu < \mu_0$,
3. $H_1: \mu \ne \mu_0$

exist at level $\alpha$, and if so, what are the BCRs? It is assumed that $\mu_0$, $\sigma_0$ and $\alpha$ have fixed numerical values.

1. Recall from Example 2.7 that the BCR for the test of $H_0: \mu = \mu_0$ against the simple $H_1': \mu = \mu_1$ with $\mu_1 > \mu_0$ is given by all the observations satisfying

\[
\bar y > \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha).
\]

Evidently, the critical region is the same for all the simple alternatives $\{H_1': \mu = \mu_1 \text{ with } \mu_1 > \mu_0\}$, because it is independent of $\mu_1$. Therefore, the critical function (2.5-37) is UMP for testing $H_0$ against the composite alternative $H_1: \mu > \mu_0$.

The following alternative proof makes direct use of Theorem 2.4. In Example 2.12, the normal distribution $N(\mu, \sigma_0^2)$ with known variance was already demonstrated to have a monotone density ratio in the sufficient statistic $\sum Y_i$ or in $T(Y) := \sum Y_i/n$ as a reversible function thereof. As the current testing problem is about a single parameter, a one-sided $H_1$, and a class of distributions with monotone density ratio, all the conditions of Theorem 2.4 are satisfied. It remains to find a constant $C$ such that the critical region (2.5-50) has size $\alpha$ according to condition (2.5-51). It is found easily because we know already that $T(Y)$ is distributed as $N(\mu_0, \sigma_0^2/n)$ under $H_0$, so that

\[
\alpha = P_{\mu_0,\sigma_0^2/n}\{\phi(Y) = 1\} = P_{\mu_0,\sigma_0^2/n}\{Y \in S_C\} = N_{\mu_0,\sigma_0^2/n}\{T(Y) > C\}
= 1 - N_{0,1}\left\{\frac{T(Y)-\mu_0}{\sigma_0/\sqrt{n}} < \frac{C-\mu_0}{\sigma_0/\sqrt{n}}\right\}
= 1 - \Phi\left(\frac{C-\mu_0}{\sigma_0/\sqrt{n}}\right),
\]

from which $C$ follows to be $C = \mu_0 + \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha)$. Note that the number $C$ would change to $C = n\mu_0 + \sqrt{n}\,\sigma_0\,\Phi^{-1}(1-\alpha)$ if $\sum Y_i$ was used as the sufficient statistic instead of $\sum Y_i/n$, because the mean and variance of the normal distribution are affected by the factor $1/n$.

2. The proof of existence and determination of the BCR of a UMP test for testing $H_0$ versus $H_1: \mu < \mu_0$ is analogous to the first case above. All the conditions required by Theorem 2.4 are satisfied, and the constant $C$ appearing in the UMP test (2.5-52) and satisfying (2.5-53) is now found to be $C = \mu_0 - \frac{\sigma_0}{\sqrt{n}}\Phi^{-1}(1-\alpha)$.

3. In this case, there is no common BCR for testing $H_0$ against $H_1: \mu \ne \mu_0$. Although the BCRs (2.5-35) and (2.5-36) do not individually depend on the value of $\mu_1$, they differ in sign through the location of $\mu_1$ relative to $\mu_0$. Consequently, there is no UMP test for the two-sided alternative. This fact is also reflected by Theorem 2.4, which requires the alternative to be one-sided.

2.5.4 Reduction to invariant statistics

We will now tackle the problem of testing a generally multi-parameter and composite null hypothesis $H_0: \theta \in \Theta_0$ against a possibly composite and two-sided alternative $H_1: \theta \in \Theta_1$ with the usual assumption that $\Theta_0$ and $\Theta_1$ are non-empty and disjoint subsets of the parameter space $\Theta$, which is connected to a parametric family of densities $F = \{f(y;\theta) : \theta \in \Theta\}$, or equivalently $F_T = \{f_T(T(y);\theta) : \theta \in \Theta\}$, when minimally sufficient statistics $T(Y)$ are used as ersatz observations for $Y$. Recall that sufficiency only reduces the dimension of the observation space, whereas it always leaves the parameter space unchanged. The problem is now that no UMP test exists when the parameter space is multi-dimensional or when the alternative hypothesis is two-sided, because the conditions of Theorem 2.4 would be violated. To overcome this serious limitation, we will investigate a reduction technique that may be applied in addition to a reduction by sufficiency, and that will oftentimes produce a simplified test problem for which a UMP test then exists.

Since any reduction beyond minimal sufficiency is necessarily bound to cause a loss of relevant information, it is essential to understand what kind of information may be safely discarded in a given test problem, and what the equivalent mathematical transformation is. The following example gives a first demonstration of the nature of such transformations.

Example 2.14: Recall from Example 2.13 that there exists no UMP test for testing $H_0: \mu = \mu_0 = 0$ against the two-sided $H_1: \mu \ne \mu_0 = 0$ (with $\sigma^2 = \sigma_0^2$ known), as the one-sidedness condition of Theorem 2.4 is violated. However, if we discard the sign of the sample mean, i.e. if we only measure the absolute deviation of the sample mean from $\mu_0$, and if we use the sign-insensitive statistic $\bar Y^2$ instead of $\bar Y$, then the problem becomes one of testing $H_0: \mu^2 = 0$ against the one-sided $H_1: \mu^2 > 0$. This is so because $\bar Y \sim N(\mu, \sigma_0^2/n)$ implies that $\frac{n}{\sigma_0^2}\bar Y^2$ has a non-central chi-squared distribution $\chi^2(1,\lambda)$ with one degree of freedom and non-centrality parameter $\lambda = \frac{n}{\sigma_0^2}\mu^2$ (see Koch, 1999, p. 7).
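Then, $\mu = 0$ is equivalent to $\lambda = 0$ under $H_0$, and $\mu \ne 0$ is equivalent to $\lambda > 0$ under $H_1$. As the transformed test problem is about a one-sided alternative and a test statistic with a monotone density ratio (by virtue of Theorem 2.5, item 5), the UMP test according to (2.5-50) of Theorem 2.4 is given by

\[
\phi\left(\frac{n}{\sigma_0^2}\bar y^2\right) := \begin{cases} 1, & \text{if } \frac{n}{\sigma_0^2}\bar y^2 > C, \\ 0, & \text{if } \frac{n}{\sigma_0^2}\bar y^2 < C, \end{cases} \tag{2.5-55}
\]

where, according to condition (2.5-51), $C$ is fixed such that the size of (2.5-55) equals the prescribed value $\alpha$. Using the definition of the size given in Section 2.4 and the fact that $\frac{n}{\sigma_0^2}\bar Y^2$ has a central chi-squared distribution with one degree of freedom under $H_0$, this is

\[
\alpha = 1 - \chi^2_{1,0}\left\{\frac{n}{\sigma_0^2}\bar Y^2 < C\right\} = 1 - F_{\chi^2(1,0)}(C),
\]

which yields as the critical value $C = F^{-1}_{\chi^2(1,0)}(1-\alpha)$. We will call the transformed problem of testing $H_0: \lambda = 0$ against $H_1: \lambda > 0$ the invariance-reduced testing problem, and the corresponding test (2.5-55) based on Theorem 2.4 the UMP invariant test. It will be interesting to compare the power function of this test with the power functions of the UMP tests for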

the one-sided alternatives from Example 2.13. Using the definition of the power function given in Section 2.4, the power function of the invariant test (2.5-55) reads

\[
Pf(\mu) = 1 - \chi^2_{1,\,n\mu^2/\sigma_0^2}\left\{\frac{n}{\sigma_0^2}\bar Y^2 < C\right\} = 1 - F_{\chi^2(1,\,n\mu^2/\sigma_0^2)}\left(F^{-1}_{\chi^2(1,0)}(1-\alpha)\right).
\]

The power functions of the upper and lower one-sided UMP tests derived in Example 2.13 (here with the specific value $\mu_0 = 0$) are found to be

\[
Pf_1(\mu) = 1 - \Phi\left(\Phi^{-1}(1-\alpha) - \frac{\sqrt{n}}{\sigma_0}\mu\right)
\quad\text{and}\quad
Pf_2(\mu) = \Phi\left(-\Phi^{-1}(1-\alpha) - \frac{\sqrt{n}}{\sigma_0}\mu\right),
\]

respectively.

[Figure: power function plotted against $\mu$; curves: one-sided left, one-sided right, two-sided.]

Fig. 2.4 Power functions for the two UMP tests at level $\alpha = 0.05$ about $H_1: \mu < 0$ (lower one-sided) and $H_1: \mu > 0$ (upper one-sided), and a UMP invariant test about $H_1: \mu \ne 0$ reduced to $H_1: \mu^2 > 0$.

Figure 2.4 shows that each of the UMP tests has slightly higher power within its one-sided $\Theta_1$-domain than the invariance-reduced test for the originally two-sided alternative. Observe that each of the UMP one-sided tests would have practically zero power when the value of the true parameter is unexpectedly on the other side of the parameter space. On the other hand, the invariance-reduced test guarantees reasonable power throughout the entire two-sided parameter space.

Clearly, the power function of the invariance-reduced test (2.5-55) is symmetrical with respect to $\mu = 0$, because the sign of the sample mean, and consequently that of the mean parameter, is not being considered. Therefore, we might say that this test has been designed to be equally sensitive in both directions away from $\mu_0 = 0$. In mathematical terminology, one would say that the test is invariant under sign changes $\bar Y \to \pm\bar Y$, and $\bar Y^2$ is a sign-invariant statistic, i.e. a random variable whose value remains unchanged when the sign of $\bar Y$ changes. Notice that the one-sidedness condition of Theorem 2.4 is restored by virtue of the fact that the parameter $\lambda = \frac{n}{\sigma_0^2}\mu^2$ of the new test statistic $\frac{n}{\sigma_0^2}\bar Y^2$ is now non-negative, thereby resulting in a one-sided $H_1$. The crucial point is, however, that the hypotheses of the reduced testing problem remain equivalent to the original hypotheses.

Reduction by invariance is not only suitable for transforming a test problem about a two-sided $H_1$ into one about a one-sided $H_1$.
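The three power functions compared above can be evaluated with a few lines of code. The sketch below uses hypothetical function names and assumes $\mu_0 = 0$. For one degree of freedom, the non-central chi-squared probabilities are expressed through the standard normal distribution function, using the fact that $\frac{n}{\sigma_0^2}\bar Y^2$ is the square of a normal variable with mean $\sqrt{n}\mu/\sigma_0$ and unit variance, so that $\sqrt{C} = \Phi^{-1}(1-\alpha/2)$.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power_upper(mu, sigma0, n, alpha):
    """Power of the UMP test against H1: mu > 0."""
    return 1.0 - _STD.cdf(_STD.inv_cdf(1.0 - alpha) - sqrt(n) * mu / sigma0)


def power_lower(mu, sigma0, n, alpha):
    """Power of the UMP test against H1: mu < 0."""
    return _STD.cdf(-_STD.inv_cdf(1.0 - alpha) - sqrt(n) * mu / sigma0)


def power_invariant(mu, sigma0, n, alpha):
    """Power of the UMP invariant test based on n * ybar**2 / sigma0**2."""
    root_c = _STD.inv_cdf(1.0 - alpha / 2.0)   # square root of the chi-squared critical value
    delta = sqrt(n) * mu / sigma0              # square root of the non-centrality parameter
    return 1.0 - (_STD.cdf(root_c - delta) - _STD.cdf(-root_c - delta))
```

Evaluating these functions on a grid of $\mu$ values reproduces the qualitative behaviour described for Fig. 2.4: all three equal $\alpha$ at $\mu = 0$, the invariant test is symmetric in $\mu$, and each one-sided test is slightly more powerful than the invariant test on its own side only.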
In fact, we will see that the concept of invariance may also be applied to transform a testing problem involving multiple unknown parameters into a test problem with a single unknown parameter, as required by Theorem 2.4.

To make this approach operable within the framework of general linear models, a number of definitions and theorems will be introduced now. To begin with, it will be assumed throughout the remainder of this section that the original observations $Y$ with sample space $S$ have been reduced to minimally sufficient ersatz observations $T(Y)$ with values in $S_T$

and with a collection of densities $F_T$. In fact, Arnold (1985) showed that any inference based on the following invariance principles is exactly the same for $Y$ and $T(Y)$.

Then, let us consider an invertible transformation $g$ of the ersatz observations from $S_T$ to $S_T$ (such as the sign change of the sample mean $T(Y) = \bar Y$ in Example 2.14). Typically, such a statistic $g(T)$ will induce a corresponding transformation $\bar g(\theta)$ of the parameters from $\Theta$ to $\Theta$ (such as $\bar Y \to \pm\bar Y$ induces $\mu \to \pm\mu$ in Example 2.14).

What kind of transformation $g$ is suitable for reducing a test problem in a meaningful way? According to Arnold (1981), the first desideratum is that any transformation $g$ with induced $\bar g$ leaves the hypotheses of the given test problem unchanged. In other words, we require that

\[
\bar g(\Theta) := \{\bar g(\theta) : \theta \in \Theta\} = \Theta \tag{2.5-56}
\]
\[
\bar g(\Theta_0) := \{\bar g(\theta) : \theta \in \Theta_0\} = \Theta_0 \tag{2.5-57}
\]
\[
\bar g(\Theta_1) := \{\bar g(\theta) : \theta \in \Theta_1\} = \Theta_1 \tag{2.5-58}
\]
\[
g(T) \text{ has a density function in } \{f_T(g(t);\bar g(\theta)) : \bar g(\theta) \in \Theta\} \tag{2.5-59}
\]

holds (see also Cox and Hinkley, 1974, p. 57). If such a transformation of the testing problem exists, we will say that the testing problem is invariant under $g$ (with induced $\bar g$). In Example 2.14 we have seen that the hypotheses in terms of the parameter $\mu$ are equivalent to the hypotheses in terms of the new parameter $\lambda = \bar g(\mu)$ when the reversal of the sign is used as transformation $g$.

The second desideratum (cf. Arnold, 1981) is that any transformation $g$ with induced $\bar g$ leaves the test decision, that is, the critical function $\phi$, unchanged. Mathematically, this is expressed by the condition

\[
\phi(g(t)) = \phi(t) \quad (\text{for all } t \in S_T). \tag{2.5-60}
\]

If this is the case for some transformation $g$, we will say that the critical function or test is invariant under $g$ (with induced $\bar g$).

The first desideratum, which defines an invariant test problem, may also be interpreted such that if we observe $g(t)$ (with some density function $f_T(g(t);\bar g(\theta))$) rather than $t$ (with density function $f_T(t;\theta)$), and if the hypotheses are equivalent in the sense that $H_0: \theta \in \Theta_0 \Leftrightarrow \bar g(\theta) \in \Theta_0$ and $H_1: \theta \in \Theta_1 \Leftrightarrow \bar g(\theta) \in \Theta_1$, then the test problem about the transformed data $g(T)$ is clearly the same as that in terms of the original data $T$.
Then it seems logical to apply a decision rule $\phi$ which yields the same result no matter if $g(t)$ or $t$ has been observed. But this is the very proposition of the second desideratum, which says that $\phi(g(t))$ should equal $\phi(t)$.

Example 2.14 constitutes the rare case that a test problem is invariant under a single transformation $g$. Usually, test problems are invariant under a certain collection $G$ of invertible transformations $g$ within the data domain with a corresponding collection $\bar G$ of invertible transformations $\bar g$ within the parameter domain. The following proposition reflects a very useful fact about such collections of transformations (see Arnold, 1981).

Proposition 2.1. If a test problem is invariant under some invertible transformations $g \in G$, $g_1 \in G$, and $g_2 \in G$ from a space $S_T$ to $S_T$ (with induced transformations $\bar g \in \bar G$, $\bar g_1 \in \bar G$, and $\bar g_2 \in \bar G$ from a space $\Theta$ to $\Theta$), then it is also invariant under the inverse transformation $g^{-1}$ and the composition $g_1 \circ g_2$ of two transformations, and the induced transformations are $\overline{g^{-1}} = \bar g^{-1}$ and $\overline{g_1 \circ g_2} = \bar g_1 \circ \bar g_2$, respectively.

If a test problem remains invariant under each $g \in G$ with induced $\bar g \in \bar G$, then this proposition says that both $G$ and $\bar G$ are closed under compositions and inverses (which will again be transformations on $S_T$ and $\Theta$, respectively). In that case, $G$ and $\bar G$ are said to be groups.

Let us now investigate how invariant tests may be generally constructed. We have seen in Example 2.14 that a reasonable test may be based on an invariant statistic, which remains unchanged by transformations in $G$ (such as $M(\bar Y) := \bar Y^2$ under $g(\bar Y) = \pm\bar Y$). Clearly, any statistic $M(T)$ that is to be invariant under a collection $G$ of transformations on $S_T$ must satisfy

\[
M(T) = M(g(T)) \tag{2.5-61}
\]

for all $g \in G$. However, the invariance condition (2.5-61) alone does not necessarily guarantee that a test which is based on such a statistic $M(T)$ is itself invariant. In fact, whenever two data points $t_1$ and $t_2$ from $S_T$ produce the same value $M(t_1) = M(t_2)$ for the invariant statistic, the additional condition

\[
t_1 = g(t_2) \tag{2.5-62}
\]

is required to hold for some $g \in G$. An invariant statistic which also satisfies (2.5-62) is called a maximal invariant.
Condition (2.5-62) ensures that $G$ is the largest collection under which the testing problem is invariant.
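The invariance condition (2.5-60) for a critical function can be illustrated with the tests of Example 2.14. The sketch below uses hypothetical function names and assumes $\mu_0 = 0$. The critical value of the invariant test is obtained from the identity $F^{-1}_{\chi^2(1,0)}(1-\alpha) = \left(\Phi^{-1}(1-\alpha/2)\right)^2$.

```python
from math import sqrt
from statistics import NormalDist


def phi_upper(ybar, sigma0, n, alpha):
    """Upper one-sided UMP test of H0: mu = 0; it is not invariant under sign changes."""
    c_u = sigma0 / sqrt(n) * NormalDist().inv_cdf(1.0 - alpha)
    return int(ybar > c_u)


def phi_invariant(ybar, sigma0, n, alpha):
    """Test based on the sign-invariant statistic n * ybar**2 / sigma0**2."""
    statistic = n * ybar ** 2 / sigma0 ** 2
    critical_value = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    return int(statistic > critical_value)


def sign_change(ybar):
    """The transformation g of Example 2.14."""
    return -ybar
```

Applying `sign_change` before `phi_invariant` never alters the decision, whereas the decision of `phi_upper` flips for sample means that are far from zero.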

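Example .5: As in Example ., let $T(Y) := [\bar{Y}, S^2]'$ be the vector of jointly sufficient statistics for independently and normally distributed observations $Y_1,\dots,Y_n$ with common unknown mean $\mu$ and common unknown variance $\sigma^2$. The problem of testing $H_0: \mu = 0$ versus $H_1: \mu \neq 0$ is invariant under the transformation

$$g\begin{pmatrix}\bar{Y}\\ S^2\end{pmatrix} = \begin{pmatrix}-\bar{Y}\\ S^2\end{pmatrix},$$

which we will write in the form $g(\bar{Y},S^2) = (-\bar{Y},S^2)$ for convenience. To see this, we first notice that $g$ induces the transformation $\bar{g}(\mu,\sigma^2) = (-\mu,\sigma^2)$, because $-\bar{Y} \sim N(-\mu,\sigma^2/n)$, while $S^2$ and thus its distribution remain unchanged. With $\Theta = \mathbb{R}\times\mathbb{R}^+$, $\Theta_0 = \{0\}\times\mathbb{R}^+$ and $\Theta_1 = (\mathbb{R}\setminus\{0\})\times\mathbb{R}^+$, we obtain

$$\bar{g}(\Theta_0) = \{\bar{g}(\mu,\sigma^2) : \mu = 0,\ \sigma^2 \in \mathbb{R}^+\} = \{(0,\sigma^2) : \sigma^2 \in \mathbb{R}^+\} = \Theta_0,$$
$$\bar{g}(\Theta_1) = \{\bar{g}(\mu,\sigma^2) : \mu \neq 0,\ \sigma^2 \in \mathbb{R}^+\} = \{(-\mu,\sigma^2) : \mu \neq 0,\ \sigma^2 \in \mathbb{R}^+\} = \Theta_1.$$

Due to $\Theta = \Theta_0 \cup \Theta_1$, $\bar{g}(\Theta) = \Theta$ also holds. Thus, the above testing problem is invariant under the transformation $g$.

Consider now the statistic $M(T) = M(\bar{Y},S^2) := \bar{Y}^2/S^2$. This statistic is invariant because of

$$M(g(\bar{Y},S^2)) = M(-\bar{Y},S^2) = \frac{(-\bar{Y})^2}{S^2} = \frac{\bar{Y}^2}{S^2} = M(\bar{Y},S^2).$$

Let us now investigate the question whether $M$ is also maximally invariant. Suppose that $t_1 = [\bar{y}_1, s_1^2]'$ and $t_2 = [\bar{y}_2, s_2^2]'$ are two realizations of $T(Y)$. Then $M(t_1) = M(t_2)$ is seen to hold e.g. for $\bar{y}_2 = 2\bar{y}_1$ and $s_2^2 = 4s_1^2$ because of

$$M(t_2) = M(\bar{y}_2, s_2^2) = M(2\bar{y}_1, 4s_1^2) = \frac{4\bar{y}_1^2}{4s_1^2} = \frac{\bar{y}_1^2}{s_1^2} = M(\bar{y}_1, s_1^2) = M(t_1).$$

However, the necessary condition $t_2 = g(t_1)$ is not satisfied, since $g(\bar{y}_1, s_1^2) = (-\bar{y}_1, s_1^2) \neq (2\bar{y}_1, 4s_1^2) = t_2$. Consequently, $M$ must be invariant under a larger group of transformations than $g$. Indeed, $M$ can be shown to be maximally invariant under the group of transformations defined by

$$g_c(\bar{Y},S^2) = (c\bar{Y}, c^2 S^2) \qquad (c \neq 0),$$

which includes the above transformation with $c = -1$. Arnold (1981, Section .5) demonstrates a technique for proving maximality, which shall be outlined here as well. First, we assume that $t_1 = [\bar{y}_1, s_1^2]'$ and $t_2 = [\bar{y}_2, s_2^2]'$ are two realizations of $T(Y)$ for which $M(t_1) = M(t_2)$ holds. If we find some $c$ for which $t_1 = g_c(t_2)$ is satisfied, then $M$ follows to be maximally invariant. Observe next that, using the above definition of $M$, the assumption $M(\bar{y}_1, s_1^2) = M(\bar{y}_2, s_2^2)$ is equivalent to $\bar{y}_1^2/s_1^2 = \bar{y}_2^2/s_2^2$ or $\bar{y}_1^2/\bar{y}_2^2 = s_1^2/s_2^2$. Then, if we define $c := \bar{y}_1/\bar{y}_2$, we see immediately that $\bar{y}_1 = c\bar{y}_2$ and $s_1^2 = c^2 s_2^2$, and we have

$$t_1 = \begin{pmatrix}\bar{y}_1\\ s_1^2\end{pmatrix} = \begin{pmatrix}c\bar{y}_2\\ c^2 s_2^2\end{pmatrix} = g_c\begin{pmatrix}\bar{y}_2\\ s_2^2\end{pmatrix} = g_c(t_2)$$

as desired.

The following proposition from Arnold (1981, p. 3) ensures that maximal invariants exist generally.

Proposition .. For any group G of invertible transformations g from an arbitrary space $S_T$ to $S_T$ there exists a maximal invariant.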
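The invariance and maximality argument of Example .5 can be checked numerically. The following Python sketch is illustrative only: the helper names (`sufficient_stats`, `M`, `g`) and all numerical values are assumptions made for this illustration and do not appear in the original text.

```python
import math
import statistics


def sufficient_stats(y):
    """Return (sample mean, sample variance with divisor n - 1)."""
    return statistics.fmean(y), statistics.variance(y)


def M(ybar, s2):
    """Candidate maximal invariant M = ybar^2 / s^2."""
    return ybar ** 2 / s2


def g(c, ybar, s2):
    """Group action g_c(ybar, s^2) = (c * ybar, c^2 * s^2), c != 0."""
    return c * ybar, c * c * s2


# Hypothetical sample, chosen only for illustration.
y = [1.2, -0.4, 2.5, 0.9, 1.7]
ybar, s2 = sufficient_stats(y)

# Invariance: M is unchanged under g_c, including the sign change c = -1.
invariant = all(
    math.isclose(M(*g(c, ybar, s2)), M(ybar, s2)) for c in (-1.0, 2.0, 0.5)
)

# Maximality: two points with equal M are linked by g_c with c = ybar1 / ybar2.
t1, t2 = (-6.0, 18.0), (3.0, 4.5)
c = t1[0] / t2[0]
linked = math.isclose(M(*t1), M(*t2)) and g(c, *t2) == t1
print(invariant, linked)
```

The second check mirrors the proof technique outlined above: equality of M on two points is enough to construct the group element that maps one point onto the other.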

The next theorem provides the maximal invariants for some groups of transformations that will be particularly useful for reducing testing problems.

Theorem .6. Let $\mathbf{T}$ be a random vector, $S$ and $T$ random variables, and $c$ a positive real number. Then,

1. $M(T,S^2) = T^2/S^2$ is a maximal invariant statistic under the group G of scale changes $g(T,S^2) = (cT, c^2S^2)$ $(c > 0)$.
2. $M(T,S) = T/S$ is a maximal invariant statistic under the group G of scale changes $g(T,S) = (c^2T, c^2S)$ $(c > 0)$.
3. $M(T) = T^2$ is a maximal invariant statistic under the sign change $g(T) = -T$.
4. $M(T,S) = (T^2,S)$ is a maximal invariant under the sign change $g(T,S) = (-T,S)$.
5. $M(\mathbf{T}) = \mathbf{T}'\mathbf{T}$ is a maximal invariant statistic under the group G of orthogonal transformations $g(\mathbf{T}) = \Gamma\mathbf{T}$ (.5-67), where $\Gamma$ is an arbitrary orthogonal matrix.
6. $M(\mathbf{T},S) = (\mathbf{T}'\mathbf{T},S)$ is a maximal invariant statistic under the group G of orthogonal transformations $g(\mathbf{T},S) = (\Gamma\mathbf{T},S)$ (.5-68), where $\Gamma$ is an arbitrary orthogonal matrix.

Proof.

1. See Example .5.
2. $M(T,S)$ is an invariant statistic because of $M(g(T,S)) = M(c^2T, c^2S) = \frac{c^2T}{c^2S} = \frac{T}{S} = M(T,S)$. To prove maximality, suppose that $M(t_1,s_1) = M(t_2,s_2)$ holds. From this, the equivalent conditions $t_1/s_1 = t_2/s_2$ and $t_1/t_2 = s_1/s_2$ follow. Defining $c^2 := t_1/t_2$ results in $t_1 = c^2t_2$ and $s_1 = c^2s_2$, that is, $(t_1,s_1) = (c^2t_2, c^2s_2) = g(t_2,s_2)$ as required.
3. Invariance of $M(T)$ follows from $M(g(T)) = M(-T) = (-T)^2 = T^2 = M(T)$. Then, let $t_1$ and $t_2$ be two realizations of $T$ for which $M(t_1) = M(t_2)$ holds. This equation is equivalent to $t_1^2 = t_2^2$, which is satisfied by $t_1 = -t_2$. Hence, $t_1 = g(t_2)$, which proves that $M(T)$ is a maximally invariant statistic under $g$.
4. The proof of this fact follows from the same line of reasoning as 3.
5. As any orthogonal matrix satisfies $\Gamma'\Gamma = I$, we obtain $M(g(\mathbf{T})) = M(\Gamma\mathbf{T}) = (\Gamma\mathbf{T})'(\Gamma\mathbf{T}) = \mathbf{T}'\Gamma'\Gamma\mathbf{T} = \mathbf{T}'\mathbf{T} = M(\mathbf{T})$, which shows that $M(\mathbf{T})$ is an invariant statistic. To prove maximality, let $t_1$ and $t_2$ be two non-zero realizations of $\mathbf{T}$ for which $M(t_1) = M(t_2)$, or equivalently, $t_1't_1 = t_2't_2$ holds. This condition expresses that the vectors $t_1$ and $t_2$ must have equal length. Then, there always exists an orthogonal transformation $\Gamma t_2$ which does not change the length of $t_2$ (see Meyer, 2000, Characterization #4 regarding the matrix P, p. 3), that is, which satisfies $t_1't_1 = t_2't_2$ for the vector $t_1 = \Gamma t_2$.
6. The proof of this fact follows from the same line of reasoning as 5.
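The existence claim used in part 5 of the proof can be made concrete. The sketch below builds one particular orthogonal matrix, a Householder reflection, that maps a vector onto another vector of equal length. The choice of a Householder reflection and the names `householder_map` and `matvec` are assumptions of this illustration; the original text only asserts that a suitable orthogonal transformation exists.

```python
import math


def householder_map(t_from, t_to):
    """Return Gamma = I - 2 v v' / (v'v) with v = t_to - t_from.

    For vectors of equal length this reflection satisfies Gamma t_from = t_to.
    """
    v = [a - b for a, b in zip(t_to, t_from)]
    vv = sum(x * x for x in v)
    n = len(v)
    if vv == 0.0:  # identical vectors: the identity matrix already works
        return [[float(i == j) for j in range(n)] for i in range(n)]
    return [[float(i == j) - 2.0 * v[i] * v[j] / vv for j in range(n)]
            for i in range(n)]


def matvec(A, x):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


t1 = [3.0, 0.0, 4.0]   # length 5
t2 = [0.0, 5.0, 0.0]   # length 5, so t1't1 = t2't2
gamma = householder_map(t2, t1)
image = matvec(gamma, t2)

# Orthogonality check: the Gram matrix of the columns should be the identity.
cols = list(zip(*gamma))
gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
print(image)
```

Because the two vectors have equal length, the reflection maps one exactly onto the other, which is the step that turns equality of the statistic T'T into membership in the same orbit of the group.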

.5.5 Uniformly most powerful invariant (UMPI) tests

Let us begin the current section with the following definition. Since any group G of transformations acting on the observation space $S_T$ induces a corresponding group $\bar{G}$ of transformations acting on the parameter space $\Theta$, there will exist a maximal invariant $M(\theta)$ under $\bar{G}$, which will be called the parameter maximal invariant.

Theorem .7. $\phi(T)$ is an invariant critical function if and only if there exists $\phi^*(M(T))$ such that $\phi(t) = \phi^*(M(t))$ holds for every $t \in S_T$. Then, the distribution of $M(T)$ depends only on $M(\theta)$, the induced maximal invariant under $\bar{G}$.

Proof. See Arnold (1981, p. 3).

From Theorem .7 it becomes evident that we may restrict attention to the invariance-reduced problem of testing

$$H_0: M(\theta) \in M(\Theta_0) \quad\text{against}\quad H_1: M(\theta) \in M(\Theta_1)$$

based on maximally invariant statistics $M(T)$ with distribution depending on parameters $M(\theta)$. If a complete reduction by invariance is possible, then $M(T)$ will be a scalar test statistic depending on a single parameter $M(\theta)$, and the transformed spaces $M(\Theta_0)$ and $M(\Theta_1)$ will represent a single point (simple $H_0$) and a one-sided interval (one-sided $H_1$), respectively. Given that such a one-dimensional test statistic $M(T)$ has a monotone density ratio, all the requirements of Theorem .4 are satisfied by the fully invariance-reduced test problem. The UMP critical function for the invariant test problem then reads

$$\phi(M(t)) := \begin{cases} 1, & \text{if } M(t) > C,\\ 0, & \text{if } M(t) < C \end{cases} \qquad (.5\text{-}69)$$

if $H_1$ is an upper one-sided alternative, and

$$\phi(M(t)) := \begin{cases} 1, & \text{if } M(t) < C,\\ 0, & \text{if } M(t) > C \end{cases} \qquad (.5\text{-}7)$$

if $H_1$ is a lower one-sided alternative hypothesis. In both cases the critical value must satisfy the condition

$$P_{\theta_0}\{\phi(M(T)) = 1\} = \alpha, \qquad (.5\text{-}7)$$

which guarantees that the test $\phi$ has fixed level $\alpha$. Recall also that $t = T(y)$ contains the values of the sufficient statistics $T$ at the observed data $y$. Since such a test (if it exists, as presumed here) is UMP at level $\alpha$ for the invariance-reduced test problem using group G, it is UMP among all tests that are invariant under G. Therefore, $\phi$ will be called the UMP invariant (UMPI) test at level $\alpha$ for testing the original hypotheses $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$.
Parenthesis: Let us return for a moment to Example .4 and the problem of testing $H_0: \mu = 0$ versus $H_1: \mu \neq 0$. If we inspect the power function of the UMPI test in Figure .4, we see that it does not fall below the level $\alpha$. This generally desirable property of a test is called unbiasedness. Arnold (1981, Theorem .3) states that any UMPI test is also unbiased. Without going into details, it should be mentioned that there exist testing problems for which no UMPI tests exist, but for which a test can be found which is UMP within the class of all unbiased tests. However, we will not be concerned with such testing problems in this thesis. Instead, the reader interested in the concept of such optimally unbiased (UMPU) tests is referred to Koch (1999, p. 77), where conditions for the existence of UMPU tests are given, or to Lehmann and Romano (2005, Chapters 4 and 5) for a detailed discussion of that topic.
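The unbiasedness property mentioned in the parenthesis can be illustrated with a short computation. The sketch below assumes the familiar two-sided test of a normal mean with known variance, which rejects when the standardized absolute sample mean exceeds the upper alpha/2 quantile of N(0,1). The function name `power_two_sided`, the sample size, and the grid of means are hypothetical choices made only for this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_sided(mu, n, sigma0, alpha):
    """Power of the test rejecting H0: mu = 0 when sqrt(n)*|ybar|/sigma0 > z."""
    z = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    delta = (n ** 0.5) * mu / sigma0          # mean of the standardized statistic
    # P(Z + delta > z) + P(Z + delta < -z) for Z ~ N(0, 1)
    return (1.0 - STD_NORMAL.cdf(z - delta)) + STD_NORMAL.cdf(-z - delta)


alpha = 0.05
grid = [k / 10.0 for k in range(-30, 31)]     # candidate true means
powers = [power_two_sided(mu, n=10, sigma0=1.0, alpha=alpha) for mu in grid]
power_at_null = power_two_sided(0.0, n=10, sigma0=1.0, alpha=alpha)
print(min(powers), power_at_null)
```

On this grid the power attains its minimum at the null value, where it equals the level, and it grows symmetrically as the true mean moves away from zero in either direction.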

Example .6 (Example .4 restated): Test of the normal mean with known variance - Two-sided alternative. Let $Y_1,\dots,Y_n$ be independently and normally distributed observations with common unknown mean $\mu$ and common known variance $\sigma^2 = \sigma_0^2$. What is the best critical region for a test of the simple null hypothesis $H_0: \mu = \mu_0$ against the two-sided alternative hypothesis $H_1: \mu \neq \mu_0$ at level $\alpha$?

This example is slightly more general than Example .4, because the hypotheses are not centered around 0. However, the simple transformation $Y_i' = Y_i - \mu_0$ of the original random variables $Y_i$ into variables $Y_i'$ solves this technical problem. Such a procedure is justified in light of the fact that the distribution of $Y \sim N(\mu\mathbf{1}, \sigma_0^2 I)$ is transformed into $N((\mu-\mu_0)\mathbf{1}, \sigma_0^2 I)$ for $Y'$ without changing the second moment. Thus the true mean of the $Y_i'$ is now $\mu' := \mu - \mu_0$, and the hypotheses become $H_0: \mu' = \mu - \mu_0 = 0$ and $H_1: \mu' = \mu - \mu_0 \neq 0$, respectively. Therefore, we may restrict ourselves to the simple case $\mu_0 = 0$, knowing that a test problem about $\mu_0 \neq 0$ may always be centered by transforming the observations. It should be mentioned here that the cases of correlated and/or heteroscedastic observations will be discussed in the context of the normal Gauss-Markov model in Section 3.

Now, the test problem

$$Y \sim N(\mu\mathbf{1}, \sigma_0^2 I), \qquad H_0: \mu = 0 \ \text{ against } \ H_1: \mu \neq 0 \qquad (.5\text{-}7)$$

does not allow for a UMP test since $H_1$ is two-sided. This fact does not change after reducing the problem about $Y$ to the equivalent test problem

$$T(Y) = \bar{Y} \sim N(\mu, \sigma_0^2/n), \qquad H_0: \mu = 0 \ \text{ against } \ H_1: \mu \neq 0 \qquad (.5\text{-}73)$$

about the sample mean $\bar{Y}$ used as a sufficient statistic $T(Y)$ for $\mu$. However, by using the invariance principle, this test problem may be transformed into a problem about a one-sided $H_1$. To be more specific, the test problem is invariant under sign changes $g(\bar{Y}) = -\bar{Y} \sim N(-\mu, \sigma_0^2/n)$, and the induced transformation acting on the parameter space is obviously $\bar{g}(\mu) = -\mu$. Due to $\bar{g}(\Theta_0) = \Theta_0$ with $\Theta_0$ degenerating to the single point $\mu = 0$, $\bar{g}(\Theta_1) = \Theta_1$ with $\Theta_1 = \mathbb{R}\setminus\{0\}$, and $\bar{g}(\Theta) = \Theta$, the problem is indeed invariant under $g$. From Theorem .6-3 it follows that $M(\bar{Y}) = \bar{Y}^2$ is a maximal invariant under sign changes.
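To obtain a test statistic with a standard distribution, it is more convenient to use the standardized sample mean $\frac{\sqrt{n}}{\sigma_0}\bar{Y} \sim N\!\left(\frac{\sqrt{n}}{\sigma_0}\mu, 1\right)$ as a sufficient statistic, which is possible because any reversible function of a sufficient statistic is itself sufficient. Then, the maximally invariant test statistic $M(Y) = \frac{n}{\sigma_0^2}\bar{Y}^2$ has a non-central chi-squared distribution $\chi^2(1,\lambda)$ with one degree of freedom and non-negative non-centrality parameter $\lambda = \frac{n}{\sigma_0^2}\mu^2$. Now, it is easily seen that the new testing problem

$$M(Y) = \frac{n}{\sigma_0^2}\bar{Y}^2 \sim \chi^2(1,\lambda) \ \text{ with } \ \lambda = \frac{n}{\sigma_0^2}\mu^2, \qquad H_0: \lambda = 0 \ \text{ against } \ H_1: \lambda > 0 \qquad (.5\text{-}74)$$

is equivalent to the original problem of testing $H_0: \mu = 0$ against $H_1: \mu \neq 0$, because $\mu = 0$ is equivalent to $\lambda = 0$ when $H_0$ is true, and $\mu \neq 0$ is equivalent to $\lambda > 0$ when $H_1$ is true. As this reduced testing problem involves only one unknown parameter $\lambda$ (which corresponds to $M(\theta)$ in Theorem .7) and a one-sided $H_1$, and since the distribution of $M(Y)$ has a monotone density ratio by virtue of Theorem .5-5, a UMP test $\phi$ exists as a consequence of Theorem .4 with

$$\phi(y) := \begin{cases} 1, & \text{if } M(y) = \frac{n}{\sigma_0^2}\bar{y}^2 > k^{\chi^2(1)}_{1-\alpha},\\ 0, & \text{if } M(y) = \frac{n}{\sigma_0^2}\bar{y}^2 < k^{\chi^2(1)}_{1-\alpha}, \end{cases} \qquad (.5\text{-}75)$$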

and critical value $k^{\chi^2(1,\lambda=0)}_{1-\alpha}$. Recall that the critical value is always computed under the assumption of a true $H_0$, which is why $\lambda = 0$. Then, $\phi$ is also the UMPI test at level $\alpha$ for the original test problem. This test may be written equivalently in terms of the $N(0,1)$-distributed test statistic $\sqrt{M(Y)}$, that is,

$$\phi(y) := \begin{cases} 1, & \text{if } \sqrt{M(y)} = \frac{\sqrt{n}\,|\bar{y}|}{\sigma_0} > k^{N(0,1)}_{1-\alpha/2},\\ 0, & \text{if } \sqrt{M(y)} = \frac{\sqrt{n}\,|\bar{y}|}{\sigma_0} < k^{N(0,1)}_{1-\alpha/2}, \end{cases} \qquad (.5\text{-}76)$$

with critical value $k^{N(0,1)}_{1-\alpha/2} = \Phi^{-1}(1-\alpha/2)$.

Example .7: Test of the normal mean with unknown variance - Two-sided alternative. Let $Y_1,\dots,Y_n$ be independently and normally distributed observations with common unknown mean $\mu$ and common unknown variance $\sigma^2$. What is the best critical region for a test of the composite null hypothesis $H_0: \mu = \mu_0$ $(\sigma^2 > 0)$ against the two-sided alternative hypothesis $H_1: \mu \neq \mu_0$ $(\sigma^2 > 0)$ at level $\alpha$?

As demonstrated in Example .6, it will be sufficient to consider $\mu_0 = 0$ without loss of generality. We have seen in Example . that the observations $Y$ may be reduced without loss of information to the jointly sufficient statistic $T(Y) = [\bar{Y}, S^2]'$, where $S^2 = \frac{1}{n-1}\sum_{i=1}^n (Y_i - \bar{Y})^2$ denotes the sample variance. Therefore, the given test problem $Y \sim N(\mu\mathbf{1}, \sigma^2 I)$, $H_0: \mu = 0$ against $H_1: \mu \neq 0$, may be written as

$$T_1(Y) = \bar{Y} \sim N(\mu, \sigma^2/n), \quad T_2(Y) = (n-1)S^2/\sigma^2 \sim \chi^2(n-1), \qquad H_0: \mu = 0 \ \text{ against } \ H_1: \mu \neq 0. \qquad (.5\text{-}77)$$

In the present case we are not only faced with the problem of a two-sided $H_1$ (which we already learnt to handle in Example .6), but with the additional challenge of a two-dimensional parameter space. Let us investigate both problems separately by finding suitable groups of transformations first. We will then combine the results later on to obtain the final solution to the test problem.

To begin with, it is easily verified that the test problem (.5-77) in terms of $T_1(Y)$ and $T_2(Y)$ is invariant under the group $G_1$ of sign changes acting on the sample mean. With $g_1(\bar{Y},S^2) = (-\bar{Y},S^2)$, the induced transformation is identified as $\bar{g}_1(\mu,\sigma^2) = (-\mu,\sigma^2)$ due to the change in distribution $\bar{Y} \sim N(\mu,\sigma^2/n) \rightarrow -\bar{Y} \sim N(-\mu,\sigma^2/n)$. As already explained in Example .6, sign changes $\bar{g}_1(\mu,\sigma^2)$ do not affect the hypotheses as they are symmetrical about 0.
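According to Theorem .6-4, maximal invariants under $g_1$ are $[\bar{Y}^2, S^2]$, or $M_1(\bar{Y},S^2) := [\frac{n}{\sigma^2}\bar{Y}^2, S^2]$ after rescaling, which leads to the reduced problem

$$M_{1,1}(Y) = \frac{n}{\sigma^2}\bar{Y}^2 \sim \chi^2(1,\lambda) \ \text{ with } \ \lambda = \frac{n}{\sigma^2}\mu^2, \quad M_{1,2}(Y) = (n-1)S^2/\sigma^2 \sim \chi^2(n-1), \qquad H_0: \lambda = 0 \ \text{ against } \ H_1: \lambda > 0. \qquad (.5\text{-}78)$$

Although the alternative hypothesis is now one-sided, there are still two statistics for the two unknown parameters $\lambda$ and $\sigma^2$. Therefore, we conclude that reduction by sign invariance alone does not go far enough in this case.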

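In addition to being sign-invariant, the test problem (.5-77) can also be shown to be scale-invariant, that is, invariant under the group $G_2$ of scale changes $g_2(\bar{Y},S^2) = (c\bar{Y}, c^2S^2)$. This transformation arises when the scale of the original observations $Y$ is changed by multiplying them with a positive constant $c$, because the distribution of $cY$ changes to $N(c\mu\mathbf{1}, c^2\sigma^2 I)$, and the sufficient statistics to $c\bar{Y}$ and $c^2S^2$, respectively. Evidently, $G_2$ induces $\bar{G}_2$ with $\bar{g}_2(\mu,\sigma^2) = (c\mu, c^2\sigma^2)$, because of the transitions in distribution

$$\bar{Y} \sim N(\mu, \sigma^2/n) \ \rightarrow \ c\bar{Y} \sim N(c\mu, c^2\sigma^2/n),$$
$$S^2 \sim G\!\left(\tfrac{n-1}{2}, \tfrac{2\sigma^2}{n-1}\right) \ \rightarrow \ c^2S^2 \sim G\!\left(\tfrac{n-1}{2}, \tfrac{2c^2\sigma^2}{n-1}\right).$$

That the test problem is indeed scale-invariant is seen from the fact that $\bar{G}_2$ with $c > 0$ does not change the hypotheses due to

$$\bar{g}_2(\Theta_0) = \{\bar{g}_2(\mu,\sigma^2) : \mu = 0,\ \sigma^2 \in \mathbb{R}^+\} = \{(c\cdot 0, c^2\sigma^2) : \sigma^2 \in \mathbb{R}^+\} = \{(0, \sigma_c^2) : \sigma_c^2 \in \mathbb{R}^+\} = \Theta_0$$

and

$$\bar{g}_2(\Theta_1) = \{\bar{g}_2(\mu,\sigma^2) : \mu \in \mathbb{R}\setminus\{0\},\ \sigma^2 \in \mathbb{R}^+\} = \{(c\mu, c^2\sigma^2) : \mu \in \mathbb{R}\setminus\{0\},\ \sigma^2 \in \mathbb{R}^+\} = \{(\mu_c, \sigma_c^2) : \mu_c \in \mathbb{R}\setminus\{0\},\ \sigma_c^2 \in \mathbb{R}^+\} = \Theta_1.$$

By virtue of Theorem .6- a maximal invariant (rescaled by $\sqrt{n}$) under $G_2$ is given by

$$M_2(Y) = \frac{\sqrt{n}\,\bar{Y}}{\sqrt{S^2}} = \frac{\sqrt{n}\,\bar{Y}}{S}, \qquad (.5\text{-}79)$$

which has a $t(n-1,\lambda)$-distribution with $n-1$ degrees of freedom and non-centrality parameter $\lambda = \sqrt{n}\,\mu/\sigma$. The resulting test problem

$$M_2(Y) = \frac{\sqrt{n}\,\bar{Y}}{S} \sim t(n-1,\lambda), \qquad H_0: \lambda = 0 \ \text{ against } \ H_1: \lambda \neq 0 \qquad (.5\text{-}8)$$

now has a reduced parameter space in light of the single parameter $\lambda$. However, due to $\sigma > 0$, the originally two-sided alternative $H_1: \mu \neq 0$ is only equivalent to $\lambda \neq 0$, because a negative $\mu$ will cause a negative $\lambda$, and a positive $\mu$ leads to a positive $\lambda$. In summary, reduction by scale invariance could successfully produce an equivalent test problem about one single parameter, but the problem concerning the two-sidedness of $H_1$ could not be resolved.

Parenthesis: Since the present test problem is invariant under two different groups of transformations, and since neither group simplifies the problem far enough, it is logical to seek a maximal invariant as a test statistic that corresponds to a test problem which is invariant under both groups. The following theorem is of great practical value as it allows us to determine a maximal invariant step by step.

Theorem .8. Let $G_1$ and $G_2$ be two groups of transformations from $S_T$ to $S_T$ and let G be the smallest group containing $G_1$ and $G_2$.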
Suppose that $M_1(T)$ is a maximal invariant under $G_1$ and that $M_1(T)$ satisfies $M_1(g_2(T)) = \hat{g}_2(M_1(T))$. Further, let $M_2(T)$ be a maximal invariant under the group $\hat{G}_2$ of transformations $\hat{g}_2$. Then $M(T) = M_2(M_1(T))$ is a maximal invariant under G.

Proof. See Stuart et al. (1999, p. 97).

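Example .7 (continued): Let us now combine these complementary results. Theorem .8 allows us to determine the maximal invariant under the union G of the two sub-groups $G_1$ and $G_2$ sequentially. $M_1$ as the maximal invariant under the group $G_1$ of sign changes with $g_1(\bar{Y},S^2) = (-\bar{Y},S^2)$ can be shown to satisfy $M_1(g_2(T)) = \hat{g}_2(M_1(T))$, because there exists a transformation $\hat{g}_2$ such that $M_1(g_2(\bar{Y},S^2)) = (c^2\bar{Y}^2, c^2S^2) = \hat{g}_2(\bar{Y}^2,S^2)$, where $\hat{G}_2$ is the group of transformations $\hat{g}_2$ from Theorem .6-, which gives $M_2(\bar{Y}^2,S^2) = \bar{Y}^2/S^2$ as its maximal invariant. It follows from Theorem .8 that

$$M(\bar{Y},S^2) = M_2(M_1(\bar{Y},S^2)) = \bar{Y}^2/S^2 \qquad (.5\text{-}8)$$

is the total maximal invariant under the union of $G_1$ and $G_2$. Now recall that we may always use rescaled versions of maximal invariants. Then, due to $\frac{n}{\sigma^2}\bar{Y}^2 \sim \chi^2(1,\lambda)$ with $\lambda = \frac{n}{\sigma^2}\mu^2$ and $(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$, the ratio

$$M(Y) := \frac{\frac{n}{\sigma^2}\bar{Y}^2}{\frac{(n-1)S^2}{\sigma^2}\big/(n-1)} = \frac{n\bar{Y}^2}{S^2} \qquad (.5\text{-}8)$$

follows an $F(1,n-1,\lambda)$-distribution. Since $\mu = 0$ is equivalent to $\lambda = 0$ and $\mu \neq 0$ to $\lambda > 0$ with arbitrary $\sigma^2 > 0$, the hypotheses of the original test problem (.5-77) are equivalent to $H_0: \lambda = 0$ versus $H_1: \lambda > 0$. In summary, the fully reduced, both sign- and scale-invariant test problem reads

$$M(Y) = \frac{n\bar{Y}^2}{S^2} \sim F(1,n-1,\lambda) \ \text{ with } \ \lambda = \frac{n}{\sigma^2}\mu^2, \qquad (.5\text{-}83)$$
$$H_0: \lambda = 0 \ \text{ against } \ H_1: \lambda > 0. \qquad (.5\text{-}84)$$

As Theorem .5-6 shows that the non-central F-distribution with known degrees of freedom and unknown non-centrality parameter $\lambda$ has a monotone density ratio, all three conditions (one unknown parameter, one-sided $H_1$, and a test statistic with monotone density ratio) for the existence of a UMP test are satisfied, and Theorem .4 gives the best test

$$\phi(y) := \begin{cases} 1, & \text{if } M(y) = \frac{n\bar{y}^2}{s^2} > k^{F(1,n-1)}_{1-\alpha},\\ 0, & \text{if } M(y) = \frac{n\bar{y}^2}{s^2} < k^{F(1,n-1)}_{1-\alpha}, \end{cases} \qquad (.5\text{-}85)$$

with critical value $k^{F(1,n-1)}_{1-\alpha}$. By definition it follows that $\phi$ is the UMPI test for the original test problem. This test is usually given in terms of $\sqrt{M(y)}$, which has Student's distribution $t(n-1)$, that is,

$$\phi(y) := \begin{cases} 1, & \text{if } \sqrt{M(y)} = \frac{\sqrt{n}\,|\bar{y}|}{s} > k^{t(n-1)}_{1-\alpha/2},\\ 0, & \text{if } \sqrt{M(y)} = \frac{\sqrt{n}\,|\bar{y}|}{s} < k^{t(n-1)}_{1-\alpha/2}, \end{cases} \qquad (.5\text{-}86)$$

with critical value $k^{t(n-1)}_{1-\alpha/2}$.

The purpose of Examples .6 and .7 was to demonstrate that the standard tests concerning the mean with the variance either known (.5-76) or unknown (.5-86) are optimal within the class of invariant tests.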
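The two examples can be mirrored in a few lines of Python. This is a minimal sketch under stated assumptions: the observations are hypothetical, the function names are invented for this illustration, and only the standard library is used, so the critical value of the F- or t-distribution is not computed. The sketch shows that the chi-squared and the normal form of the known-variance test give the same decision (the 1 - alpha quantile of chi-squared with one degree of freedom is the square of the 1 - alpha/2 quantile of N(0,1)), and that the fully reduced statistic of the unknown-variance case is unchanged by sign and scale changes and equals the squared t-statistic.

```python
import math
import statistics
from statistics import NormalDist


def known_variance_test(y, sigma0, alpha):
    """Return (chi-square-form decision, normal-form decision) for H0: mu = 0."""
    n = len(y)
    ybar = statistics.fmean(y)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    m = n * ybar ** 2 / sigma0 ** 2            # chi^2(1, lambda) statistic
    reject_chi2 = m > z ** 2                   # chi^2(1) quantile equals z^2
    reject_norm = math.sqrt(n) * abs(ybar) / sigma0 > z
    return reject_chi2, reject_norm


def f_statistic(y):
    """M(y) = n * ybar^2 / s^2 with the sample variance s^2 (divisor n - 1)."""
    return len(y) * statistics.fmean(y) ** 2 / statistics.variance(y)


y = [0.8, 1.9, -0.3, 1.4, 2.2, 0.6]            # hypothetical observations
decisions = known_variance_test(y, sigma0=1.0, alpha=0.05)

m = f_statistic(y)
m_sign = f_statistic([-v for v in y])          # sign change, c = -1
m_scale = f_statistic([3.0 * v for v in y])    # scale change, c = 3
t_stat = math.sqrt(len(y)) * statistics.fmean(y) / statistics.stdev(y)
print(decisions, m, t_stat ** 2)
```

A full implementation of the unknown-variance test would compare `m` with a quantile of the central F-distribution with 1 and n - 1 degrees of freedom, which requires a statistics library beyond the Python standard library.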
Equipped with these tools for reducing the space of observations and the space of parameters to one-dimensional intervals, we could now proceed and investigate more complex test problems, which occur very often in the context of linear models. Linear models essentially constitute generalizations of the observation model $N(\mu\mathbf{1}, \sigma^2 I)$ to $N(X\beta, \sigma^2 P^{-1})$, where $X\beta$ represents a possibly multi-dimensional and non-constant mean, and where $\sigma^2 P^{-1}$ indicates that the observations might be correlated and of non-constant variance. However, to keep the theoretical explanations short, the reader interested in tests within the context of linear models is referred to Section 3. At this point we will continue the current section by presenting convenient one-step reduction techniques that are oftentimes equivalent to a sequential reduction by sufficiency and invariance.

Reduction to the Likelihood Ratio and Rao's Score statistic

At the beginning of the current Section .5, where we discussed the case of testing against a simple $H_1$, we have seen that the density ratio $f(y;\theta_1)/f(y;\theta_0)$, used as a test statistic for the MP test as defined in (.5-6) by the Neyman-Pearson Lemma, may be simplified such that a single sufficient statistic can be used as an equivalent test statistic. Whenever a test problem has a one-sided $H_1$ (but still only one single unknown parameter), then that sufficient statistic must have a distribution with a monotone density ratio in order for a UMP test to exist. If multiple parameters are unknown, then our approach was to shrink the dimension of the parameter space to 1 in order to have a one-dimensional test statistic at one's disposal. Such a test statistic was derived as the maximal invariant under a group of transformations, which led us to tests that are UMP among all tests invariant under these transformations.

As a test problem may be invariant under numerous sub-groups of transformations (such as the one in Example .7), a manual step-wise reduction of a test problem by invariance can become quite cumbersome. Therefore, we will investigate ways for obtaining a UMPI test in a more direct manner. We will see that there are in fact two equivalent methods for reducing a test problem about $n$ observations (or equally well about $m$ minimal sufficient statistics) and $m$ unknown parameters to a one-parameter problem with a one-sided $H_1$.

Reduction to the Likelihood Ratio statistic. Let us consider the problem of testing $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$ on the basis of observations $Y$ with true density function in $F = \{f(y;\theta) : \theta \in \Theta\}$. Inspection of the density ratio $f(y;\theta_1)/f(y;\theta_0)$, used for testing a simple $H_0$ versus a simple $H_1$ by the Neyman-Pearson Lemma, reveals that this quantity is not unique anymore if the hypotheses are composite, i.e. if $\theta_0$ and $\theta_1$ are elements of intervals $\Theta_0$ and $\Theta_1$. In that case it would not be clear at which values $\theta_0$ and $\theta_1$ the density ratio should be evaluated.
This situation is of course not improved if the densities comprise multiple unknown parameters $\theta_0$ and $\theta_1$. One approach to removing the ambiguity of the density ratio consists in taking the maximum value of the density function over $\Theta_0$ and over $\Theta_1$, that is, to determine the value of

$$\frac{\max_{\theta\in\Theta_1} f(y;\theta)}{\max_{\theta\in\Theta_0} f(y;\theta)}. \qquad (.5\text{-}87)$$

Since the densities in (.5-87) are now treated as functions of $\theta$ rather than of $y$, it is necessary to switch the arguments, or formally to introduce a new function from $\Theta$ to $\mathbb{R}$, which is defined as

$$L(\theta;y) := f(y;\theta), \qquad (.5\text{-}88)$$

and which treats $y$ as given constants. $L$ is called the likelihood function for $y$, and the fraction

$$\frac{\max_{\theta\in\Theta_0} L(\theta;y)}{\max_{\theta\in\Theta_1} L(\theta;y)} \qquad (.5\text{-}89)$$

denotes the generalized likelihood ratio. Notice that in this definition the numerator and denominator have been switched with respect to the generalized density ratio (.5-87). To take this change into account when comparing this ratio with the critical value, we only need to switch the </>-relation accordingly. If the hypotheses are such that $\Theta = \Theta_0 \cup \Theta_1$, then we may modify (.5-89) slightly into

$$GLR := \frac{\max_{\theta\in\Theta_0} L(\theta;y)}{\max_{\theta\in\Theta} L(\theta;y)}. \qquad (.5\text{-}9)$$

The only difference between (.5-9) and (.5-89) is that (.5-9) may take the value 1, because $\Theta_0$ is a subset of $\Theta$ (see also Koch, 1999, p. 79, for a discussion of the properties of the generalized likelihood ratio).

All the examples discussed so far and all the applications to be investigated in Sections 3 and 4 allow us to rewrite the hypotheses $H_0: \theta \in \Theta_0$ and $H_1: \theta \in \Theta_1$ in the form of linear constraints (restrictions)

$$H_0: H\theta = w \quad\text{versus}\quad H_1: H\theta \begin{Bmatrix} < \\ \neq \\ > \end{Bmatrix} w, \qquad (.5\text{-}9)$$

where $H$ is an $r \times u$ matrix of known constants with rank $r$, and where $w$ is an $r \times 1$ vector of known constants.

Example .8: In the Examples .3, .4, and .6 we considered the problems of testing the specification $H_0: \mu = \mu_0$ of the mean parameter against the alternative specifications $H_1: \mu < \mu_0$, $H_1: \mu > \mu_0$, and $H_1: \mu \neq \mu_0$ on the basis of normally distributed observations with known variance. These hypotheses may be rewritten in the form (.5-9) by using the vectors/matrices $\theta := [\mu]$, $H := [1]$, and $w := [\mu_0]$, which are all scalars in this case. In Example .7, we investigated the problem of testing $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$ in the same class of normal distributions, but with unknown variance. These hypotheses are expressed as in (.5-9) by defining $\theta := [\mu, \sigma^2]'$, $H := [1, 0]$, and $w := [\mu_0]$.

When the hypotheses are given in terms of linear restrictions (.5-9), then the maxima in (.5-9) may be interpreted in the following way. The value $\tilde\theta$ for which the likelihood function in the numerator of (.5-9) attains its maximum over $\Theta_0$, or equivalently for which the constraint $H\tilde\theta = w$ holds, is called the restricted maximum likelihood (ML) estimate for $\theta$. On the other hand, the value $\hat\theta$ for which the likelihood function in the denominator of (.5-9) attains its maximum over the entire parameter space $\Theta$ denotes the unrestricted maximum likelihood (ML) estimate for $\theta$. If we assume that the likelihood function is at least twice differentiable with positive definite Hessian, then the restricted ML estimate $\tilde\theta$ is obtained as the solution of

$$\frac{\partial}{\partial\theta}\left[L(\theta;y) - k'(H\theta - w)\right] = 0, \qquad (.5\text{-}9)$$

where $k$ denotes an $r \times 1$ vector of unknown Lagrange multipliers. The unrestricted ML estimate $\hat\theta$ follows as the solution of the likelihood equation

$$\frac{\partial}{\partial\theta} L(\theta;y) = 0. \qquad (.5\text{-}93)$$

Then, rewriting (.5-9) in terms of the ML estimators and the random vector $Y$ yields

$$GLR(Y) = \frac{L(\tilde\theta;Y)}{L(\hat\theta;Y)}. \qquad (.5\text{-}94)$$

This is the reciprocal of the statistic that Koch (1999, Chap. 4.) and Teunissen (2000, Chap. 3) use to derive the test of the general hypothesis in the normal Gauss-Markov model. In that case, which will also be addressed in detail in Section 3 of this thesis, the restricted and unrestricted ML estimates are equivalent to the restricted and unrestricted least squares estimates.
However, it shall already be mentioned here that there are important cases where the Gauss-Markov model is not restricted to the class of normal distributions, but where the likelihood function may depend on additional distribution parameters (see Application 7). For this reason, we will maintain the more general notation in terms of the likelihood function and the restricted/unrestricted ML estimates, and we will speak of restricted/unrestricted least squares estimates only if we apply a class of normal distributions.

In certain cases, it will be more convenient to use a logarithmic version of the GLR, that is,

$$-2\ln GLR = -2\ln\frac{L(\tilde\theta;Y)}{L(\hat\theta;Y)} = -2\left[\ln L(\tilde\theta;Y) - \ln L(\hat\theta;Y)\right]. \qquad (.5\text{-}95)$$

Due to the strictly increasing monotonicity of the natural logarithm, the estimates $\tilde\theta$ and $\hat\theta$ remain unchanged if, as in (.5-95), the so-called log-likelihood function

$$\mathcal{L}(\theta;y) := \ln L(\theta;y) \qquad (.5\text{-}96)$$

is maximized instead of the likelihood function. This property guarantees that the restricted ML estimate $\tilde\theta$ is also the solution of

$$\frac{\partial}{\partial\theta}\left[\mathcal{L}(\theta;y) - k'(H\theta - w)\right] = 0, \qquad (.5\text{-}97)$$

and that the unrestricted ML estimate $\hat\theta$ is the solution of the log-likelihood equation

$$\frac{\partial}{\partial\theta}\mathcal{L}(\theta;y) = 0. \qquad (.5\text{-}98)$$

One advantage of this approach is that the logarithm of a Gaussian density will cancel out with the exponential operator, which results in a function that is easier to handle (see Example .9). We call the statistic $T_{LR}$, defined by

$$T_{LR}(Y) := -2\ln GLR = -2\left[\mathcal{L}(\tilde\theta;Y) - \mathcal{L}(\hat\theta;Y)\right], \qquad (.5\text{-}99)$$

the likelihood ratio (LR) statistic. A test which uses (.5-94) as test statistic is called the generalized likelihood ratio (GLR) test, given by

$$\phi_{GLR}(y) = \begin{cases} 1, & \text{if } L(\tilde\theta;y)/L(\hat\theta;y) < k_\alpha,\\ 0, & \text{if } L(\tilde\theta;y)/L(\hat\theta;y) > k_\alpha, \end{cases} \qquad (.5\text{-})$$

where the critical value $k_\alpha$ is such that $\phi$ has level $\alpha$. Alternatively, the test

$$\phi_{LR}(y) = \begin{cases} 1, & \text{if } -2\left[\mathcal{L}(\tilde\theta;y) - \mathcal{L}(\hat\theta;y)\right] > k^*_\alpha,\\ 0, & \text{if } -2\left[\mathcal{L}(\tilde\theta;y) - \mathcal{L}(\hat\theta;y)\right] < k^*_\alpha, \end{cases} \qquad (.5\text{-})$$

based on the statistic (.5-99) is called the likelihood ratio (LR) test. Both tests are truly equivalent because both the corresponding statistics and critical values are strictly monotonic functions of each other.

It is easily verified that the test (.5-) is equivalent to the MP test (.5-6) of Neyman and Pearson if both $H_0$ and $H_1$ are simple hypotheses and if $\theta$ is a single parameter, because the maxima then equal the point values of the densities at $\Theta_0 = \{\theta_0\}$ and $\Theta_1 = \{\theta_1\}$, respectively. Furthermore, if $H_1$ is a one-sided hypothesis (with $\theta$ still being a single parameter) and if $Y$ has a density with monotone density ratio, then the GLR/LR test is also equal to the UMP test in Theorem .4 (see the lemma in Birkes, 1990). Even more importantly, if a test problem involves multiple parameters in a normal Gauss-Markov model, then the GLR/LR test is also identical to the UMPI test obtained from a step-wise reduction by invariance. We will demonstrate this fact, which has been proven by Lehmann (1959b), in the following simple example and in greater detail in Section 3.

Example .9 (Example .7 revisited): The LR test of the normal mean with unknown variance. Let $Y_1,\dots,Y_n$ be independently and normally distributed observations with common unknown mean $\mu$ and common unknown variance $\sigma^2$. What is the LR test for testing $H_0: \mu = \mu_0$ $(\sigma^2 > 0)$ versus $H_1: \mu \neq \mu_0$ $(\sigma^2 > 0)$ at level $\alpha$?
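Using the fact that the joint density of independently distributed observations is the product of the univariate densities, we obtain for the log-likelihood function

$$\mathcal{L}(\theta;y) := \ln f(y;\mu,\sigma^2) = \ln\prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i-\mu)^2}{2\sigma^2}\right\} = \ln\left[(2\pi\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i-\mu)^2\right\}\right]$$
$$= -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i-\mu)^2.$$

Let us first determine the unrestricted ML estimates for $\mu$ and $\sigma^2$ by applying (.5-98). From the first order conditions

$$\frac{\partial}{\partial\mu}\mathcal{L}(\theta;y) = \frac{1}{\sigma^2}\sum_{i=1}^n (y_i-\mu) = 0, \qquad (.5\text{-})$$
$$\frac{\partial}{\partial\sigma^2}\mathcal{L}(\theta;y) = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (y_i-\mu)^2 = 0, \qquad (.5\text{-}3)$$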

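we obtain the solutions

$$\hat\mu = \frac{1}{n}\sum_{i=1}^n y_i \quad\text{and}\quad \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (y_i-\hat\mu)^2.$$

Notice that $\hat\sigma^2$ differs from the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^n (y_i-\hat\mu)^2$. The restricted estimates result from solving (.5-97), that is,

$$\frac{\partial}{\partial\mu}\left[\mathcal{L}(\theta;y) - k(\mu-\mu_0)\right] = \frac{1}{\sigma^2}\sum_{i=1}^n (y_i-\mu) - k = 0,$$
$$\frac{\partial}{\partial\sigma^2}\left[\mathcal{L}(\theta;y) - k(\mu-\mu_0)\right] = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (y_i-\mu)^2 = 0,$$
$$\frac{\partial}{\partial k}\left[\mathcal{L}(\theta;y) - k(\mu-\mu_0)\right] = -(\mu-\mu_0) = 0.$$

The third equation reproduces the restriction, that is, $\tilde\mu = \mu_0$. Using this result, the second equation gives

$$\tilde\sigma^2 = \frac{1}{n}\sum_{i=1}^n (y_i-\mu_0)^2.$$

Substituting $\tilde\mu = \mu_0$ and $\tilde\sigma^2$ into the first equation results in the estimate

$$\tilde{k} = \frac{1}{\tilde\sigma^2}\sum_{i=1}^n (y_i-\mu_0) \qquad (.5\text{-}8)$$

for the Lagrange multiplier. To evaluate the test statistics based on the generalized likelihood ratio, we need to compute the likelihood function both at the unrestricted and the restricted estimates, which leads to

$$L(\hat\mu,\hat\sigma^2;y) = (2\pi\hat\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\hat\sigma^2}\sum_{i=1}^n (y_i-\hat\mu)^2\right\} = (2\pi)^{-n/2}(\hat\sigma^2)^{-n/2}\exp\left\{-\frac{n}{2}\right\}$$

and

$$L(\tilde\mu,\tilde\sigma^2;y) = (2\pi\tilde\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\tilde\sigma^2}\sum_{i=1}^n (y_i-\mu_0)^2\right\} = (2\pi)^{-n/2}(\tilde\sigma^2)^{-n/2}\exp\left\{-\frac{n}{2}\right\}.$$

With this, the GLR in (.5-94) takes the value

$$GLR(y) = \frac{L(\tilde\mu,\tilde\sigma^2;y)}{L(\hat\mu,\hat\sigma^2;y)} = \left(\frac{\hat\sigma^2}{\tilde\sigma^2}\right)^{n/2},$$

and the value of the LR statistic in (.5-99) becomes

$$T_{LR}(y) = -2\ln GLR = n\ln\frac{\tilde\sigma^2}{\hat\sigma^2}.$$

We will now show that $T_{LR}(Y)$ (thus also the GLR statistic) is equivalent to the statistic $M(Y)$ in (.5-83) of the UMPI test (.5-85) for testing $H_0: \mu = \mu_0 = 0$ $(\sigma^2 > 0)$ versus $H_1: \mu \neq \mu_0 = 0$ $(\sigma^2 > 0)$ at level $\alpha$. Recall that $M(Y) = n\bar{Y}^2/S^2$ with sample mean $\bar{Y} = \frac{1}{n}\sum_{i=1}^n Y_i$ and sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^n (Y_i-\bar{Y})^2$. Then, due to $\bar{Y} = \hat\mu$ and $(n-1)S^2 = n\hat\sigma^2$, we have

$$1 + \frac{M(Y)}{n-1} = 1 + \frac{n\bar{Y}^2}{(n-1)S^2} = 1 + \frac{\hat\mu^2}{\hat\sigma^2} = \frac{\hat\sigma^2 + \hat\mu^2}{\hat\sigma^2} = \frac{\frac{1}{n}\sum_{i=1}^n (y_i-\hat\mu)^2 + \hat\mu^2}{\hat\sigma^2} = \frac{\frac{1}{n}\sum_{i=1}^n y_i^2}{\hat\sigma^2} = \frac{\tilde\sigma^2}{\hat\sigma^2}.$$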
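The chain of equalities above implies that the LR statistic can be computed in three ways: directly from the log-likelihood values, from the ratio of the two variance estimates, and from the F-type statistic M. The following Python sketch checks this numerically for centered hypotheses (null value 0). The data and all variable names are assumptions made for this illustration only.

```python
import math


def log_likelihood(y, mu, sigma2):
    """Log-likelihood of i.i.d. N(mu, sigma2) observations."""
    n = len(y)
    rss = sum((v - mu) ** 2 for v in y)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - rss / (2.0 * sigma2))


y = [0.8, 1.9, -0.3, 1.4, 2.2, 0.6]   # hypothetical observations
n = len(y)
mu0 = 0.0                             # hypotheses centered about zero

# Unrestricted ML estimates.
mu_hat = sum(y) / n
sigma2_hat = sum((v - mu_hat) ** 2 for v in y) / n
# Restricted ML estimate of the variance (the mean is fixed at mu0).
sigma2_tilde = sum((v - mu0) ** 2 for v in y) / n

# F-type statistic M = n * ybar^2 / s^2 with s^2 = n * sigma2_hat / (n - 1).
s2 = n * sigma2_hat / (n - 1)
m = n * mu_hat ** 2 / s2

t_lr_direct = 2.0 * (log_likelihood(y, mu_hat, sigma2_hat)
                     - log_likelihood(y, mu0, sigma2_tilde))
t_lr_ratio = n * math.log(sigma2_tilde / sigma2_hat)
t_lr_from_m = n * math.log(1.0 + m / (n - 1))
print(t_lr_direct, t_lr_ratio, t_lr_from_m)
```

All three values agree up to rounding, which is the numerical counterpart of the statement that the LR statistic is a strictly increasing function of M.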

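Thus, $T_{LR}(Y) = n\ln\left(1 + \frac{M(Y)}{n-1}\right)$ is a strictly monotonically increasing function of $M(Y)$. If we transform the critical value $C = k^{F(1,n-1)}_{1-\alpha}$ of the UMPI test (.5-85) accordingly into $C^* = n\ln\left(1 + \frac{C}{n-1}\right)$, then the LR test

$$\phi_{LR}(y) := \begin{cases} 1, & \text{if } T_{LR}(y) = n\ln\frac{\tilde\sigma^2}{\hat\sigma^2} > C^*,\\ 0, & \text{if } T_{LR}(y) = n\ln\frac{\tilde\sigma^2}{\hat\sigma^2} < C^*, \end{cases} \qquad (.5\text{-}9)$$

will produce the same result as the UMPI test.

Reduction to Rao's Score statistic. Another way to formulate the Likelihood Ratio statistic (.5-99) results from applying a two-term Taylor series to the log-likelihood function. For this purpose, we will assume throughout this thesis that the first two derivatives of the log-likelihood function exist. If the unrestricted ML estimate is used as Taylor point, then we obtain

$$\mathcal{L}(\theta;y) = \mathcal{L}(\hat\theta;y) + (\theta-\hat\theta)'\left.\frac{\partial\mathcal{L}(\theta;y)}{\partial\theta}\right|_{\theta=\hat\theta} + \frac{1}{2}(\theta-\hat\theta)'\left.\frac{\partial^2\mathcal{L}(\theta;y)}{\partial\theta\,\partial\theta'}\right|_{\theta=\theta^*}(\theta-\hat\theta). \qquad (.5\text{-})$$

The vector of first partial derivatives

$$S(\theta;y) := \frac{\partial\mathcal{L}(\theta;y)}{\partial\theta} \qquad (.5\text{-})$$

is called the log-likelihood or efficient score. The Hessian matrix of second partial derivatives will be denoted by

$$H(\theta;y) := \frac{\partial^2\mathcal{L}(\theta;y)}{\partial\theta\,\partial\theta'}. \qquad (.5\text{-})$$

This matrix, which appears in the exact residual term of the Taylor series (.5-), is evaluated at possibly different points $\theta^*$ between $\theta$ and $\hat\theta$. Now, it follows from the log-likelihood equations (.5-98) that the score vector vanishes at $\hat\theta$, that is, $S(\hat\theta;y) = 0$. Then, evaluation of (.5-) at the restricted ML estimate $\tilde\theta$ yields

$$\mathcal{L}(\tilde\theta;y) = \mathcal{L}(\hat\theta;y) + \frac{1}{2}(\tilde\theta-\hat\theta)'H(\theta^*;y)(\tilde\theta-\hat\theta),$$

or

$$-2\left[\mathcal{L}(\tilde\theta;y) - \mathcal{L}(\hat\theta;y)\right] = -(\tilde\theta-\hat\theta)'H(\theta^*;y)(\tilde\theta-\hat\theta).$$

We will now use an argument by Stuart et al. (1999, p. 57) stating that, in terms of random variables, $H(\theta^*;Y) \approx E\{H(\tilde\theta;Y)\}$ for large $n$. Note that if the log-likelihood function is naturally given as a quadratic function of $\theta$, then the Hessian will be a matrix of constants. In that case, we will write the Hessian as $H_Y$, which is then identical to $E\{H_Y\}$. The expectation of the negative Hessian of the log-likelihood function, that is,

$$I(\theta;Y) := E\{-H(\theta;Y)\}, \qquad (.5\text{-}3)$$

is called the information matrix. With this, we obtain for the test statistic

$$T_{LR}(Y) = -2\left[\mathcal{L}(\tilde\theta;Y) - \mathcal{L}(\hat\theta;Y)\right] \approx (\tilde\theta-\hat\theta)'E\{-H(\tilde\theta;Y)\}(\tilde\theta-\hat\theta) = (\tilde\theta-\hat\theta)'I(\tilde\theta;Y)(\tilde\theta-\hat\theta). \qquad (.5\text{-}4)$$

In a second step we apply a one-term Taylor series to the score with Taylor point $\tilde\theta$, that is,

$$S(\theta;y) = S(\tilde\theta;y) + \left.\frac{\partial S(\theta;y)}{\partial\theta'}\right|_{\theta=\theta^*}(\theta-\tilde\theta).$$

Then we evaluate the score at the maximum likelihood estimate $\hat\theta$, which gives

$$S(\hat\theta;y) = 0 = S(\tilde\theta;y) + H(\theta^*;y)(\hat\theta-\tilde\theta).$$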

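Now, the same argument applies as above, i.e. $H(\theta^*;Y) \approx E\{H(\tilde\theta;Y)\}$ for large $n$, which gives

$$I(\tilde\theta;Y)(\hat\theta-\tilde\theta) \approx S(\tilde\theta;Y), \quad\text{or}\quad \hat\theta-\tilde\theta \approx I^{-1}(\tilde\theta;Y)\,S(\tilde\theta;Y).$$

Again, if the log-likelihood function is quadratic in $\theta$, then the above approximations become exact. Substituting the last equation for $\hat\theta-\tilde\theta$ into (.5-4) finally yields

$$T_{LR}(Y) \approx S'(\tilde\theta;Y)\,I^{-1}(\tilde\theta;Y)\,S(\tilde\theta;Y) =: T_{RS}(Y). \qquad (.5\text{-}5)$$

The statistic $T_{RS}(Y)$ is called the Rao's Score (RS) statistic (see Equation 6e.3.6 in Rao, 1973, p. 48), which was originally proposed in Rao (1948) for the problem of testing $H_0: \theta = \theta_0$ versus $H_1: \theta > \theta_0$ with a single unknown parameter $\theta$. In this one-dimensional case, Rao's Score statistic takes the simple form

$$T_{RS}(Y) = S^2(\theta_0;Y)/I(\theta_0;Y). \qquad (.5\text{-}6)$$

Example . (Example .7 revisited): The RS test of the normal mean with unknown variance. Let $Y_1,\dots,Y_n$ be independently and normally distributed observations with common unknown mean $\mu$ and common unknown variance $\sigma^2$. What is the RS test for testing $H_0: \mu = \mu_0$ $(\sigma^2 > 0)$ versus $H_1: \mu \neq \mu_0$ $(\sigma^2 > 0)$ at level $\alpha$?

To determine the value of Rao's Score statistic (.5-5), we need to determine the log-likelihood score and the inverse of the information matrix, and then evaluate these quantities at the restricted ML estimates. The first partial derivatives of the log-likelihood function with respect to $\mu$ and $\sigma^2$ have already been determined as (.5-) and (.5-3) in Example .9. Thus, the log-likelihood score vector follows to be

$$S(\theta;y) = \begin{pmatrix} \partial\mathcal{L}(\mu,\sigma^2;y)/\partial\mu \\ \partial\mathcal{L}(\mu,\sigma^2;y)/\partial\sigma^2 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sigma^2}\sum_{i=1}^n (y_i-\mu) \\ -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (y_i-\mu)^2 \end{pmatrix}.$$

The Hessian of the log-likelihood function comprises the second partial derivatives with respect to all unknown parameters. For the current example, these are

$$\frac{\partial^2\mathcal{L}(\mu,\sigma^2;y)}{\partial\mu\,\partial\mu} = -\frac{n}{\sigma^2}, \qquad \frac{\partial^2\mathcal{L}(\mu,\sigma^2;y)}{\partial\mu\,\partial\sigma^2} = \frac{\partial^2\mathcal{L}(\mu,\sigma^2;y)}{\partial\sigma^2\,\partial\mu} = -\frac{1}{\sigma^4}\sum_{i=1}^n (y_i-\mu),$$
$$\frac{\partial^2\mathcal{L}(\mu,\sigma^2;y)}{\partial\sigma^2\,\partial\sigma^2} = \frac{n}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{i=1}^n (y_i-\mu)^2.$$

Then, the information matrix follows to be

$$I(\theta;Y) = E\{-H(\theta;Y)\} = \begin{pmatrix} \frac{n}{\sigma^2} & \frac{1}{\sigma^4}\sum_{i=1}^n E\{Y_i-\mu\} \\ \frac{1}{\sigma^4}\sum_{i=1}^n E\{Y_i-\mu\} & -\frac{n}{2\sigma^4} + \frac{1}{\sigma^6}\sum_{i=1}^n E\{(Y_i-\mu)^2\} \end{pmatrix}.$$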

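Due to the definitions $E\{Y_i\} = \mu$ and $E\{(Y_i-\mu)^2\} = \sigma^2$ of the first moment and the second central moment, respectively, the off-diagonal components of the information matrix vanish, and we obtain

$$I(\theta;Y) = \begin{pmatrix} \frac{n}{\sigma^2} & 0 \\ 0 & \frac{n}{2\sigma^4} \end{pmatrix}.$$

Then, using the fact that the restricted ML estimate for $\mu$ is $\tilde\mu = \mu_0$, Rao's Score statistic in (.5-5) becomes

$$T_{RS}(Y) = S'(\tilde\mu,\tilde\sigma^2;Y)\,I^{-1}(\tilde\mu,\tilde\sigma^2;Y)\,S(\tilde\mu,\tilde\sigma^2;Y)$$
$$= \begin{pmatrix} \frac{1}{\tilde\sigma^2}\sum_{i=1}^n (Y_i-\mu_0) \\ -\frac{n}{2\tilde\sigma^2} + \frac{1}{2\tilde\sigma^4}\sum_{i=1}^n (Y_i-\mu_0)^2 \end{pmatrix}' \begin{pmatrix} \frac{\tilde\sigma^2}{n} & 0 \\ 0 & \frac{2\tilde\sigma^4}{n} \end{pmatrix} \begin{pmatrix} \frac{1}{\tilde\sigma^2}\sum_{i=1}^n (Y_i-\mu_0) \\ -\frac{n}{2\tilde\sigma^2} + \frac{1}{2\tilde\sigma^4}\sum_{i=1}^n (Y_i-\mu_0)^2 \end{pmatrix}$$
$$= \begin{pmatrix} \frac{1}{\tilde\sigma^2}\sum_{i=1}^n (Y_i-\mu_0) \\ 0 \end{pmatrix}' \begin{pmatrix} \frac{\tilde\sigma^2}{n} & 0 \\ 0 & \frac{2\tilde\sigma^4}{n} \end{pmatrix} \begin{pmatrix} \frac{1}{\tilde\sigma^2}\sum_{i=1}^n (Y_i-\mu_0) \\ 0 \end{pmatrix} = \frac{1}{n\tilde\sigma^2}\left(\sum_{i=1}^n (Y_i-\mu_0)\right)^2.$$

Two aspects are typical for Rao's Score statistic. Firstly, the log-likelihood score vanishes in the direction of $\sigma^2$. This happens necessarily because $\sigma^2$ is not restricted by $H_0$. Therefore, the ML estimate of such a free (unrestricted) parameter will certainly maximize the log-likelihood function in that direction. Secondly, the information matrix is diagonal, reflecting the fact that both parameters are determined independently. These two properties, which are true also for more complex testing problems such as for the applications in Sections 3 and 4, simplify the determination of Rao's Score statistic considerably.

If we recall from (.5-8) in Example .9 that $\frac{1}{\tilde\sigma^2}\sum_{i=1}^n (Y_i-\mu_0)$ is the estimator for the Lagrange multiplier, we may rewrite Rao's Score statistic in the form

$$T_{RS}(Y) = \frac{\tilde\sigma^2}{n}\left(\frac{1}{\tilde\sigma^2}\sum_{i=1}^n (Y_i-\mu_0)\right)^2 = \frac{\tilde\sigma^2}{n}\,\tilde{k}^2.$$

For this reason, Rao's Score statistic is often called the Lagrange Multiplier (LM) statistic, a term which was probably first used by Silvey (1959) and which is used typically in the field of econometrics.

Let us assume that the observations have been centered such that the hypotheses are about $\mu_0 = 0$, as demonstrated in Example .6. As for the relation between $T_{LR}$ and the UMPI test statistic derived in Example .9, we can show that $T_{RS}$ is not identical with the UMPI test statistic, but a strictly monotonic function thereof. Recall from Example .9 that $M(Y)/(n-1) = \hat\mu^2/\hat\sigma^2$ and $1 + M(Y)/(n-1) = (\hat\sigma^2 + \hat\mu^2)/\hat\sigma^2 = \tilde\sigma^2/\hat\sigma^2$. With this, we obtain

$$n\,\frac{M(Y)/(n-1)}{1 + M(Y)/(n-1)} = n\,\frac{\hat\mu^2/\hat\sigma^2}{\tilde\sigma^2/\hat\sigma^2} = \frac{n\hat\mu^2}{\tilde\sigma^2} = \frac{1}{n\tilde\sigma^2}\left(\sum_{i=1}^n Y_i\right)^2 = T_{RS}.$$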
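Therefore, Rao's Score test

$$\phi_{RS}(y) := \begin{cases} 1, & \text{if } T_{RS}(y) = \frac{\tilde\sigma^2}{n}\tilde{k}^2 > C^*,\\ 0, & \text{if } T_{RS}(y) = \frac{\tilde\sigma^2}{n}\tilde{k}^2 < C^*, \end{cases} \qquad (.5\text{-}7)$$

with critical value $C^* = n\,\frac{C/(n-1)}{1 + C/(n-1)}$ will be exactly the same as the UMPI test (.5-85) with critical value $C = k^{F(1,n-1)}_{1-\alpha}$.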
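The properties of Rao's Score statistic derived in this example can be verified numerically. The sketch below assumes centered hypotheses (null value 0) and hypothetical data; all variable names are chosen for this illustration. It evaluates the score vector and the inverse information matrix at the restricted estimates, confirms that the score component belonging to the variance vanishes, and compares the quadratic form with the Lagrange multiplier form and with the monotone function of the UMPI statistic M.

```python
import math

y = [0.8, 1.9, -0.3, 1.4, 2.2, 0.6]   # hypothetical observations
n = len(y)
mu0 = 0.0                             # hypotheses centered about zero

# Restricted ML estimates under H0: mu = mu0.
sigma2_tilde = sum((v - mu0) ** 2 for v in y) / n

# Score vector evaluated at the restricted estimates.
score_mu = sum(v - mu0 for v in y) / sigma2_tilde
score_sigma2 = (-n / (2.0 * sigma2_tilde)
                + sum((v - mu0) ** 2 for v in y) / (2.0 * sigma2_tilde ** 2))

# Diagonal of the inverse information matrix at the restricted estimates.
inv_info_mu = sigma2_tilde / n
inv_info_sigma2 = 2.0 * sigma2_tilde ** 2 / n

# Quadratic form S' I^{-1} S.
t_rs = score_mu ** 2 * inv_info_mu + score_sigma2 ** 2 * inv_info_sigma2

# Lagrange multiplier form: the multiplier estimate equals the mu-score.
k_tilde = score_mu
t_rs_lm = sigma2_tilde / n * k_tilde ** 2

# Relation to the UMPI statistic M = n * ybar^2 / s^2.
mu_hat = sum(y) / n
s2 = sum((v - mu_hat) ** 2 for v in y) / (n - 1)
m = n * mu_hat ** 2 / s2
t_rs_from_m = n * (m / (n - 1)) / (1.0 + m / (n - 1))
print(t_rs, t_rs_lm, t_rs_from_m)
```

Since the mapping from M to the score statistic is strictly increasing, any critical value for M translates into a critical value for the score statistic by the same mapping, which is why both tests reach identical decisions.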

3 Theory and Applications of Misspecification Tests in the Normal Gauss-Markov Model

3. Introduction

In this section we will consider a number of very common test problems within the context of the normal Gauss-Markov model (GMM)

$$Y = X\beta + E, \qquad \Sigma = \Sigma\{E\} = \sigma^2 P^{-1}$$

with normally distributed zero-mean errors $E$, known design matrix $X \in \mathbb{R}^{n\times m}$ of full rank, known positive definite weight matrix $P \in \mathbb{R}^{n\times n}$, and parameters $\beta \in \mathbb{R}^m$ and $\sigma^2 \in \mathbb{R}^+$, respectively. Thus, we may write the resulting class of distributions with respect to the observables $Y$ as

$$W = \{N(X\beta, \sigma^2P^{-1}) : \beta \in \mathbb{R}^m,\ \sigma^2 \in \mathbb{R}^+\}, \qquad (3.\text{-})$$

which corresponds to the space $\Theta = \mathbb{R}^m \times \mathbb{R}^+$ of parameters $\theta = (\beta,\sigma^2)$ and to the class

$$F = \{f(y;\beta,\sigma^2) : \beta \in \mathbb{R}^m,\ \sigma^2 \in \mathbb{R}^+\}$$

of multivariate normal density functions, defined by

$$f(y;\beta,\sigma^2) = (2\pi)^{-n/2}\left(\det\sigma^2P^{-1}\right)^{-1/2}\exp\left\{-\frac{1}{2\sigma^2}(y-X\beta)'P(y-X\beta)\right\} \qquad (3.\text{-})$$

(see Equation .5 in Koch, 1999, p. 7). We will further assume that the unknown true parameter vector $\theta$ is one element of $\Theta$. Frequently, the numerical value $\sigma_0^2$ for $\sigma^2$ is known a priori. In this case we will rewrite the class of distributions as

$$W = \{N(X\beta, \sigma_0^2P^{-1}) : \beta \in \mathbb{R}^m\}, \qquad (3.\text{-})$$

the space of parameters $\theta = \beta$ as $\Theta = \mathbb{R}^m$, and the corresponding class of density functions as

$$F = \{f(y;\beta) : \beta \in \mathbb{R}^m\} \qquad (3.\text{-}3)$$

with

$$f(y;\beta) = (2\pi)^{-n/2}\left(\det\sigma_0^2P^{-1}\right)^{-1/2}\exp\left\{-\frac{1}{2\sigma_0^2}(y-X\beta)'P(y-X\beta)\right\}.$$

Notice that, by setting $X := \mathbf{1}$ and $P := I$, we obtain the observation model used in Examples .6 and .7 (depending on whether $\sigma^2$ is known or unknown a priori).

As the parameter space comprises two types of parameters, we will naturally consider two categories of test problems. The first one is about testing the parameters $\beta$ appearing in the functional model (3.-8), and the second one is about testing the variance factor $\sigma^2$, which is part of the stochastic model. Solutions to these test problems are well known (see, for instance, Koch, 1999; Teunissen, 2000) and belong to the standard procedures of geodetic adjustment theory.
Therefore, rather tha to repeat commo kowledge, the purpose of this sectio is to recoceptualize these tests, i particular the test statistics, by derivig them as optimal procedures. For this purpose, we will exploit symmetry assumptios, that is, ivariace priciples with respect to the power fuctio i the same way as demostrated i Sectio for some simpler test problems. It will tur out that the stadard outlier tests, sigificace tests, tests of liear hypotheses, ad the test of the variace factor owe much of their uselfuless to the fact that they may all be derived as uiformly most powerful ivariat tests.

3.2 Derivation of optimal tests concerning parameters of the functional model

So far we have confined ourselves to problems where the hypothesis that parameters take particular values was to be tested against some simple or composite alternative. However, limitation to such problems is unnecessarily restrictive. There are situations where we would rather want to test whether a set of linear functions of the parameters takes particular values. As a common example, it is desired in deformation analysis to test whether differences of coordinates are zero, or whether they differ significantly (see Application 3). We have already seen earlier that hypotheses may take the form of linear constraints (restrictions) concerning the parameters to be tested. This model also fits conveniently into the framework of the normal GMM. In the current section, we shall restrict attention to hypotheses concerning parameters $\beta$ within the functional model $Y = X\beta + E$, which may then be written as constraints

\[ H_0 : H\beta = w \quad \text{versus} \quad H_1 : H\beta \neq w, \]

where $H \in \mathbb{R}^{m_2 \times m}$ with $m_2 \le m$ denotes a matrix of full rank.

This general model setup may be simplified in various ways before addressing the fundamental question of optimal procedures. The first step will be to reparameterize the GMM and the constraints such that the hypotheses become direct propositions about the values of the unknown parameters rather than about the values of functions thereof. Furthermore, to exploit symmetries within the parameter space effectively, it will also be convenient to center these transformed hypotheses about zero. Finally, we shall simplify the stochastic model $\Sigma\{E\} = \sigma^2 P^{-1}$ by transforming the observations into uncorrelated variables with constant variance. After carrying out these preprocessing steps, we will reduce the testing problem by sufficiency and invariance in a similar manner to the approach presented earlier for simpler problems. Then, after reversing the preprocessing steps, we will obtain, as the main result of this section, the UMPI test for testing the hypotheses $H_0 : H\beta = w$ versus $H_1 : H\beta \neq w$. The individual steps of this preprocessing and reduction procedure will now be carried out within the following subsections:

1. Reparameterization of the test problem.
2. Centering of the hypotheses.
3. Full decorrelation/homogenization of the observations.
4. Reduction to independent sufficient statistics with elimination of additional functional parameters.
5. Reduction to a maximal invariant statistic.
6. Back-substitution (reversal of steps 1 to 3).

3.2.1 Reparameterization of the test problem

Following Meissl (1982), we expand the $m_2 \times m$ matrix $H$ by some $m_1 \times m$ matrix $M$ into an invertible $m \times m$ block matrix and introduce new parameters $\beta_1^{(r)} \in \mathbb{R}^{m_1}$, where $m_1 := m - m_2$, and $\beta_2^{(r)} \in \mathbb{R}^{m_2}$ with

\[ \begin{bmatrix} \beta_1^{(r)} \\ \beta_2^{(r)} \end{bmatrix} := \begin{bmatrix} M \\ H \end{bmatrix} \beta. \]

Using the invertibility assumption we obtain the equivalent relation

\[ \beta = \begin{bmatrix} M \\ H \end{bmatrix}^{-1} \begin{bmatrix} \beta_1^{(r)} \\ \beta_2^{(r)} \end{bmatrix}. \]

Then, multiplying this equation with $X$ from the left yields

\[ X\beta = X \begin{bmatrix} M \\ H \end{bmatrix}^{-1} \begin{bmatrix} \beta_1^{(r)} \\ \beta_2^{(r)} \end{bmatrix} =: \begin{bmatrix} X_1^{(r)} & X_2^{(r)} \end{bmatrix} \begin{bmatrix} \beta_1^{(r)} \\ \beta_2^{(r)} \end{bmatrix}. \]

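Notice that this definition allows us to derive the following expression for the original design matrix $X$ from the implication

\[ X \begin{bmatrix} M \\ H \end{bmatrix}^{-1} =: \begin{bmatrix} X_1^{(r)} & X_2^{(r)} \end{bmatrix} \;\Longrightarrow\; X = \begin{bmatrix} X_1^{(r)} & X_2^{(r)} \end{bmatrix} \begin{bmatrix} M \\ H \end{bmatrix}. \]

Using the definition of $X_1^{(r)}$ and $X_2^{(r)}$ we may substitute the original functional model by

\[ X\beta = X_1^{(r)} \beta_1^{(r)} + X_2^{(r)} \beta_2^{(r)} \]

and, in light of the definition of the new parameters, the linear restriction by

\[ H\beta = \beta_2^{(r)}. \]

This reparameterization leads to an equivalent testing problem which involves the transformed version $X^{(r)} = [\, X_1^{(r)} \;\; X_2^{(r)} \,]$ of the original design matrix $X = [\, X_1 \;\; X_2 \,]$ in partitioned form. We will describe this simplified class of test problems by the new observation model

\[ Y \sim N\!\left( X_1^{(r)} \beta_1^{(r)} + X_2^{(r)} \beta_2^{(r)},\; \sigma^2 P^{-1} \right), \]

where the true value of $\sigma^2$ may be known or unknown a priori, and by the new hypotheses

\[ H_0 : \beta_2^{(r)} = w \quad \text{versus} \quad H_1 : \beta_2^{(r)} \neq w. \]

3.2.2 Centering of the hypotheses

Similarly to the data transformation used in the earlier example on testing a mean, we may subtract the generalized constant mean $X_2^{(r)} w$ from the data, that is,

\[ Y^{(c)} := Y - X_2^{(r)} w. \]

While this transformation leaves the covariance matrix (as the second central moment) unchanged, it changes the expectation to

\[ E\{ Y - X_2^{(r)} w \} = X_1^{(r)} \beta_1^{(r)} + X_2^{(r)} \beta_2^{(r)} - X_2^{(r)} w = X_1^{(r)} \beta_1^{(r)} + X_2^{(r)} \left( \beta_2^{(r)} - w \right). \]

Setting $\beta_2^{(rc)} := \beta_2^{(r)} - w$ leads to the new observation model

\[ Y^{(c)} \sim N\!\left( X_1^{(r)} \beta_1^{(r)} + X_2^{(r)} \beta_2^{(rc)},\; \sigma^2 P^{-1} \right), \]

where the true value of $\sigma^2$ may be known or unknown a priori. The hypotheses in terms of the ersatz parameters $\beta_2^{(rc)}$ are then evidently given by

\[ H_0 : \beta_2^{(rc)} = 0 \quad \text{versus} \quad H_1 : \beta_2^{(rc)} \neq 0. \]

3.2.3 Full decorrelation/homogenization of the observations

A Cholesky decomposition of the weight matrix into $P = GG'$, where $G$ stands for an invertible lower triangular matrix, allows for a full decorrelation or homogenization of the observations by virtue of the one-to-one transformation

\[ Y^{(ch)} := G' Y^{(c)} \]

(cf. Koch, 1999). The expectation of the transformed observables becomes

\[ E\{ G' Y^{(c)} \} = G' \left( X_1^{(r)} \beta_1^{(r)} + X_2^{(r)} \beta_2^{(rc)} \right) = G' X_1^{(r)} \beta_1^{(r)} + G' X_2^{(r)} \beta_2^{(rc)}. \]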
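The reparameterization and centering steps can be checked numerically. The following NumPy sketch is only an illustration under stated assumptions: the dimensions, the random data, and in particular the choice of the completing matrix `M` (rows spanning the null space of `H`) are hypothetical and not taken from the thesis, which only requires that the stacked matrix be invertible.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, illustration only
n, m, m2 = 8, 4, 2                      # hypothetical dimensions
m1 = m - m2
X = rng.normal(size=(n, m))             # design matrix
H = rng.normal(size=(m2, m))            # constraint matrix (full row rank)
w = rng.normal(size=m2)                 # right-hand side of H beta = w
beta = rng.normal(size=m)               # some parameter vector

# One possible choice of M (an assumption): rows spanning the null space
# of H, so that the stacked matrix T = [M; H] is invertible.
_, _, Vt = np.linalg.svd(H)
M = Vt[m2:, :]                          # shape (m1, m)
T = np.vstack([M, H])

Xr = X @ np.linalg.inv(T)               # reparameterized design X [M; H]^{-1}
Xr1, Xr2 = Xr[:, :m1], Xr[:, m1:]
beta_r = T @ beta                       # stacked new parameters
beta_r1, beta_r2 = beta_r[:m1], beta_r[m1:]

# Centering: subtracting Xr2 @ w moves the hypothesis to "second block = 0".
mean_centered = X @ beta - Xr2 @ w
beta_rc2 = beta_r2 - w
```

With this setup, `Xr @ beta_r` reproduces `X @ beta`, the second parameter block equals `H @ beta`, and the centered mean equals `Xr1 @ beta_r1 + Xr2 @ beta_rc2`.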

The fact that $Y^{(ch)}$ has covariance matrix $\sigma^2 I$ follows directly from an application of error propagation to the linear function $G' Y^{(c)}$, which yields

\[ \Sigma\{ G' Y^{(c)} \} = G' \,\Sigma\{ Y^{(c)} \}\, G = \sigma^2 G' P^{-1} G = \sigma^2 G' (GG')^{-1} G = \sigma^2 G' (G')^{-1} G^{-1} G = \sigma^2 I. \]

After introducing the transformed block design matrices $X_1^{(rh)} := G' X_1^{(r)}$ and $X_2^{(rh)} := G' X_2^{(r)}$, the transformed observation model then reads

\[ Y^{(ch)} \sim N\!\left( X_1^{(rh)} \beta_1^{(r)} + X_2^{(rh)} \beta_2^{(rc)},\; \sigma^2 I \right), \]

where the true value of $\sigma^2$ may be known or unknown a priori. As homogenization does not transform parameters, the hypotheses may still be written as

\[ H_0 : \beta_2^{(rc)} = 0 \quad \text{versus} \quad H_1 : \beta_2^{(rc)} \neq 0. \]

Whenever a test problem is given in the original form (model $Y \sim N(X\beta, \sigma^2 P^{-1})$ with constraints $H\beta = w$), in the reparameterized form, or in the centered form, it may be transformed directly into this homogenized form, which will turn out to be the most suitable structure for subsequent reductions by sufficiency and invariance.

3.2.4 Reduction to minimal sufficient statistics with elimination of nuisance parameters

To reduce the observations $Y^{(ch)}$ by sufficiency, we need to generalize the result of the earlier examples on sufficient statistics to the case of the linear model.

Proposition 3.1. In the normal Gauss-Markov model $Y \sim N(X\beta, \sigma^2 I)$, the least squares estimators

\[ X'X \hat\beta = X'Y \]

and

\[ (n - m)\, \hat\sigma^2 = (Y - X\hat\beta)'(Y - X\hat\beta) \]

constitute independently distributed and minimally sufficient statistics for $\beta$ and $\sigma^2$, respectively.

Proof. Using these estimates, the multivariate normal density may be rewritten as

\[ f(y; \beta, \sigma^2) = (2\pi)^{-n/2} \left(\det \sigma^2 I\right)^{-1/2} \exp\left\{ -\frac{1}{2\sigma^2} (y - X\beta)'(y - X\beta) \right\} = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \left[ (n - m)\hat\sigma^2 + (\hat\beta - \beta)' X'X (\hat\beta - \beta) \right] \right\} I_{\mathbb{R}^n}(y). \]

Thus, it follows from Neyman's Factorization Theorem that $\hat\beta$ and $(n - m)\hat\sigma^2$ are jointly sufficient statistics for $\beta$ and $\sigma^2$. Then, Arnold (1981) shows that these statistics are complete, which implies minimality (see Arnold, 1990), and independently distributed with

\[ \hat\beta \sim N\!\left( \beta, \sigma^2 (X'X)^{-1} \right) \quad \text{and} \quad (n - m)\, \hat\sigma^2 / \sigma^2 \sim \chi^2(n - m). \]

Next, we will rewrite this fundamental result in terms of partitioned parameters as demanded by the homogenized observation model.
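A small numerical sketch may help to see what homogenization does. It assumes an arbitrary positive definite weight matrix and random data (all names and values are hypothetical): the Cholesky factor `G` with `P = G G'` turns the cofactor matrix of `G'Y` into the identity, and ordinary least squares in the homogenized model coincides with weighted least squares in the original one.

```python
import numpy as np

rng = np.random.default_rng(1)          # arbitrary seed, illustration only
n, m = 6, 2                             # hypothetical dimensions
A = rng.normal(size=(n, n))
P = A @ A.T + n * np.eye(n)             # some positive definite weight matrix
X = rng.normal(size=(n, m))
Y = rng.normal(size=n)

G = np.linalg.cholesky(P)               # lower triangular factor, P = G @ G.T
Yh, Xh = G.T @ Y, G.T @ X               # homogenized observations and design

# Cofactor matrix of G'Y: G' P^{-1} G should be the identity matrix.
Q_h = G.T @ np.linalg.solve(P, G)

# Ordinary least squares on homogenized data vs. weighted least squares.
beta_h = np.linalg.solve(Xh.T @ Xh, Xh.T @ Yh)
beta_w = np.linalg.solve(X.T @ P @ X, X.T @ P @ Y)

# Variance factor estimate: (n - m) * s2 = residual sum of squares.
res_h = Yh - Xh @ beta_h
s2 = res_h @ res_h / (n - m)
```

The same variance factor estimate is obtained from the weighted residuals of the original model, `(Y - X @ beta_w) @ P @ (Y - X @ beta_w) / (n - m)`.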

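Proposition 3.2. In the normal linear Gauss-Markov model $Y \sim N(X_1\beta_1 + X_2\beta_2, \sigma^2 I)$ with partitioned parameters $\beta_1 \in \mathbb{R}^{m_1}$ and $\beta_2 \in \mathbb{R}^{m_2}$, the least squares estimators

\[ \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix} \begin{bmatrix} \hat\beta_1 \\ \hat\beta_2 \end{bmatrix} = \begin{bmatrix} X_1'Y \\ X_2'Y \end{bmatrix}, \quad \text{in short:} \quad \begin{bmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{bmatrix} \begin{bmatrix} \hat\beta_1 \\ \hat\beta_2 \end{bmatrix} = \begin{bmatrix} X_1'Y \\ X_2'Y \end{bmatrix}, \]

and

\[ (n - m_1 - m_2)\, \hat\sigma^2 = (Y - X_1\hat\beta_1 - X_2\hat\beta_2)'(Y - X_1\hat\beta_1 - X_2\hat\beta_2) \]

constitute minimally sufficient statistics for $\beta_1$, $\beta_2$, and $\sigma^2$, respectively. Furthermore, the statistics $[\hat\beta_1' \;\; \hat\beta_2']'$ and $\hat\sigma^2$ are independently distributed with

\[ \begin{bmatrix} \hat\beta_1 \\ \hat\beta_2 \end{bmatrix} \sim N\!\left( \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix},\; \sigma^2 \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}^{-1} \right) \quad \text{and} \quad (n - m_1 - m_2)\, \hat\sigma^2 / \sigma^2 \sim \chi^2(n - m_1 - m_2). \]

Before considering a reduction of $Y^{(ch)}$ to sufficient statistics we may notice that the homogenized observation model comprises functional parameters $\beta_1^{(r)}$ not subject to hypotheses. Therefore, we may eliminate them from the normal equations (cf. Schuh, 2006a) without changing the test problem itself. First we rewrite the partitioned normal equations in terms of reparameterized, centered, and homogenized quantities, that is,

\[ \begin{bmatrix} X_1^{(rh)\prime} X_1^{(rh)} & X_1^{(rh)\prime} X_2^{(rh)} \\ X_2^{(rh)\prime} X_1^{(rh)} & X_2^{(rh)\prime} X_2^{(rh)} \end{bmatrix} \begin{bmatrix} \hat\beta_1^{(r)} \\ \hat\beta_2^{(rc)} \end{bmatrix} = \begin{bmatrix} X_1^{(rh)\prime} Y^{(ch)} \\ X_2^{(rh)\prime} Y^{(ch)} \end{bmatrix}. \]

Isolation of $\hat\beta_1^{(r)}$ in the first equation yields

\[ \hat\beta_1^{(r)} = \left( X_1^{(rh)\prime} X_1^{(rh)} \right)^{-1} \left( X_1^{(rh)\prime} Y^{(ch)} - X_1^{(rh)\prime} X_2^{(rh)} \hat\beta_2^{(rc)} \right), \]

and after substitution into the second equation

\[ \hat\beta_2^{(rc)} = N_{22}^{(-1)} X_2^{(rh)\prime} Y^{(ch)} - N_{22}^{(-1)} X_2^{(rh)\prime} X_1^{(rh)} \left( X_1^{(rh)\prime} X_1^{(rh)} \right)^{-1} X_1^{(rh)\prime} Y^{(ch)} \]

with the (inverted) Schur complement

\[ N_{22}^{(-1)} := \left( X_2^{(rh)\prime} X_2^{(rh)} - X_2^{(rh)\prime} X_1^{(rh)} \left( X_1^{(rh)\prime} X_1^{(rh)} \right)^{-1} X_1^{(rh)\prime} X_2^{(rh)} \right)^{-1} \]

as abbreviation. We will not give $N_{22}^{(-1)}$ the index $rh$ because this matrix naturally refers to the model with two groups of parameters $\beta_1^{(r)}$ and $\beta_2^{(r)}$, and it may be written directly in terms of non-homogenized quantities, that is,

\[ N_{22}^{(-1)} = \left( X_2^{(r)\prime} P X_2^{(r)} - X_2^{(r)\prime} P X_1^{(r)} \left( X_1^{(r)\prime} P X_1^{(r)} \right)^{-1} X_1^{(r)\prime} P X_2^{(r)} \right)^{-1}. \]

The residuals in the homogenized observation model are defined as

\[ \hat E^{(rch)} = Y^{(ch)} - X_1^{(rh)} \hat\beta_1^{(r)} - X_2^{(rh)} \hat\beta_2^{(rc)} \]

and may, after substitution of the expression for $\hat\beta_1^{(r)}$, be written as

\[ \hat E^{(rch)} = \left( I - X_1^{(rh)} \left( X_1^{(rh)\prime} X_1^{(rh)} \right)^{-1} X_1^{(rh)\prime} \right) \left( Y^{(ch)} - X_2^{(rh)} \hat\beta_2^{(rc)} \right). \]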
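The elimination of the parameters not subject to the hypotheses can be illustrated with a short NumPy sketch. Dimensions and data are hypothetical; the sketch merely checks, for one random example, that the reduced normal equations built on the Schur complement give the same estimates and residuals as the joint solution, and that the inverse of the Schur complement is the lower-right block of the inverse normal equation matrix.

```python
import numpy as np

rng = np.random.default_rng(2)          # arbitrary seed, illustration only
n, m1, m2 = 10, 3, 2                    # hypothetical dimensions
X1 = rng.normal(size=(n, m1))           # columns of the nuisance parameters
X2 = rng.normal(size=(n, m2))           # columns of the tested parameters
Y = rng.normal(size=n)

N11, N12, N22 = X1.T @ X1, X1.T @ X2, X2.T @ X2

# Schur complement of N11 in the normal equation matrix.
S = N22 - N12.T @ np.linalg.solve(N11, N12)

# Reduced normal equations for the second parameter group.
rhs2 = X2.T @ Y - N12.T @ np.linalg.solve(N11, X1.T @ Y)
beta2 = np.linalg.solve(S, rhs2)
# Back-substitution for the eliminated group.
beta1 = np.linalg.solve(N11, X1.T @ Y - N12 @ beta2)

# Reference: joint solution of the full normal equations.
X = np.hstack([X1, X2])
beta_joint = np.linalg.solve(X.T @ X, X.T @ Y)
Qinv_22 = np.linalg.inv(X.T @ X)[m1:, m1:]

# Residuals: direct form and projector form.
proj1 = X1 @ np.linalg.solve(N11, X1.T)
res_direct = Y - X1 @ beta1 - X2 @ beta2
res_reduced = (np.eye(n) - proj1) @ (Y - X2 @ beta2)
```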

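If the true value of the variance of unit weight must be estimated, then the two alternative formulations for the residuals correspond to the following expressions for the estimator of $\sigma^2$:

\[ (n - m_1 - m_2)\, \hat\sigma^2_{(rch)} = \hat E^{(rch)\prime} \hat E^{(rch)} = \left( Y^{(ch)} - X_1^{(rh)} \hat\beta_1^{(r)} - X_2^{(rh)} \hat\beta_2^{(rc)} \right)' \left( Y^{(ch)} - X_1^{(rh)} \hat\beta_1^{(r)} - X_2^{(rh)} \hat\beta_2^{(rc)} \right) \]
\[ = \left( Y^{(ch)} - X_2^{(rh)} \hat\beta_2^{(rc)} \right)' \left( I - X_1^{(rh)} \left( X_1^{(rh)\prime} X_1^{(rh)} \right)^{-1} X_1^{(rh)\prime} \right) \left( Y^{(ch)} - X_2^{(rh)} \hat\beta_2^{(rc)} \right). \]

Instead of using the vector $\hat\beta_2^{(rc)}$ with possibly fully populated weight matrix, it will be much more convenient to operate with the fully decorrelated and homogenized vector

\[ \hat\beta_2^{(rch)} := G_2' \hat\beta_2^{(rc)}, \]

which is also sufficient for $\beta_2^{(rc)}$ as a one-to-one function of $\hat\beta_2^{(rc)}$. Here, $G_2$ represents the invertible lower triangular matrix obtained from the Cholesky factorization $P_{\hat\beta_2^{(rc)}} := \left( N_{22}^{(-1)} \right)^{-1} = G_2 G_2'$ of the weight matrix of $\hat\beta_2^{(rc)}$, that is, of the Schur complement of the normal equation matrix. After reducing the observation model by sufficiency, we now have a test problem about ersatz observations $[\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}]$ with reduced dimension $m_2 + 1$. The hypotheses

\[ H_0 : \beta_2^{(rch)} = 0 \quad \text{versus} \quad H_1 : \beta_2^{(rch)} \neq 0 \]

follow by observing that $\beta_2^{(rch)} = 0$ if and only if $\beta_2^{(rc)} = 0$, and $\beta_2^{(rch)} \neq 0$ if and only if $\beta_2^{(rc)} \neq 0$.

3.2.5 Reduction to a maximal invariant statistic

In this step we seek to reduce the test problem in terms of independent sufficient statistics by invariance in the same way as we did in the earlier example on testing a mean. The only difference will be that in the present case we cannot apply sign invariance, as the expectation is now given by a non-constant mean vector. Instead we will verify that the test problem is invariant under the group of orthogonal transformations acting on the mean vector $\beta_2^{(rch)}$.

Case 1: $\sigma^2 = \sigma_0^2$ known. Let us begin with the simpler case that the true value of the variance factor is known a priori. First we note that each orthogonal transformation

\[ g(\hat\beta_2^{(rch)}) = \Gamma \hat\beta_2^{(rch)} \]

in $G$ results in a change of distribution from $\hat\beta_2^{(rch)} \sim N(\beta_2^{(rch)}, \sigma_0^2 I)$ to $\Gamma\hat\beta_2^{(rch)} \sim N(\Gamma\beta_2^{(rch)}, \sigma_0^2 I)$, where the covariance matrix of $\Gamma\hat\beta_2^{(rch)}$ remains unchanged due to the property $\Gamma\Gamma' = I$ of any orthogonal matrix $\Gamma$. From this the induced transformation within the parameter domain is seen to be $\bar g(\beta_2^{(rch)}) = \Gamma\beta_2^{(rch)}$. Then, the earlier theorem on maximal invariants under orthogonal transformations gives $\hat\beta_2^{(rch)\prime} \hat\beta_2^{(rch)}$ as the maximal invariant with respect to the transformation $\Gamma\hat\beta_2^{(rch)}$.

We must still prove that the original test problem itself remains invariant under $G$ with induced group of transformations $\bar G$. This is indeed the case because

\[ \bar g(\Theta_0) = \bar g(\{0\}) = \{\Gamma 0\} = \{0\} = \Theta_0 \]

and

\[ \bar g(\Theta_1) = \{ \bar g(\beta_2^{(rch)}) : \beta_2^{(rch)} \in \mathbb{R}^{m_2} \setminus \{0\} \} = \{ \Gamma\beta_2^{(rch)} : \beta_2^{(rch)} \in \mathbb{R}^{m_2} \setminus \{0\} \} = \Theta_1 \]

leave the hypotheses unchanged. To formulate the invariant test problem we need to find the distribution of the maximal invariant. From $\hat\beta_2^{(rch)} \sim N(\beta_2^{(rch)}, \sigma_0^2 I)$ it follows that $\hat\beta_2^{(rch)\prime} \hat\beta_2^{(rch)} / \sigma_0^2 \sim \chi'^2(m_2, \lambda)$ with non-centrality parameter $\lambda = \beta_2^{(rch)\prime} \beta_2^{(rch)} / \sigma_0^2$ (see Koch, 1999). Notice now that $\lambda = 0$ if and only if $\beta_2^{(rch)} = 0$, and that $\lambda > 0$ if and only if $\beta_2^{(rch)} \neq 0$. Therefore, we may write the two-sided hypothesis testing problem in terms of the maximal parameter invariant $\lambda$, which will take a positive value if $H_1$ is true. Consequently, the invariant test problem

\[ M(Y) = \hat\beta_2^{(rch)\prime} \hat\beta_2^{(rch)} / \sigma_0^2 \sim \chi'^2(m_2, \lambda), \qquad H_0 : \lambda = 0 \quad \text{against} \quad H_1 : \lambda > 0, \]

with single parameter $\lambda = \beta_2^{(rch)\prime} \beta_2^{(rch)} / \sigma_0^2$ has a one-sided alternative hypothesis. Furthermore, the non-central $\chi^2$-distribution with fixed degrees of freedom has a monotone density ratio. Therefore, the theorem on UMP tests for one-sided problems with monotone density ratio is applicable, which gives the UMP test

\[ \phi(y) = \begin{cases} 1, & \text{if } M(y) > k^{\chi^2(m_2)}_{1-\alpha}, \\ 0, & \text{if } M(y) < k^{\chi^2(m_2)}_{1-\alpha}, \end{cases} \]

at level $\alpha$. It follows that $\phi$ is the UMPI test for testing $H_0 : H\beta = w$ versus $H_1 : H\beta \neq w$ in the original observation model $Y \sim N(X\beta, \sigma_0^2 P^{-1})$, in which the variance factor has been assumed to be known a priori.

Case 2: $\sigma^2$ unknown. In this case, the statistic $\hat\sigma^2_{(rch)}$ acts as an additional ersatz observation. The group $G_1$ of orthogonal transformations, acting on the generally multi-dimensional statistic $\hat\beta_2^{(rch)}$, is defined by

\[ g_1(\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}) = (\Gamma\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}) \]

and causes the distribution to change from $\hat\beta_2^{(rch)} \sim N(\beta_2^{(rch)}, \sigma^2 I)$ to $\Gamma\hat\beta_2^{(rch)} \sim N(\Gamma\beta_2^{(rch)}, \sigma^2 I)$. As the statistic $\hat\sigma^2_{(rch)}$ is not changed by any transformation $g_1 \in G_1$, its distribution also remains unchanged. The induced group of transformations follows to be $\bar g_1(\beta_2^{(rch)}, \sigma^2) = (\Gamma\beta_2^{(rch)}, \sigma^2)$. The corresponding theorem on maximal invariants gives $M_1(Y) = [\hat\beta_2^{(rch)\prime} \hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}]$ as the maximal invariants with respect to the transformation $\Gamma\hat\beta_2^{(rch)}$. The test problem is invariant because

\[ \bar g_1(\Theta_0) = \{ \bar g_1(\beta_2^{(rch)}, \sigma^2) : \beta_2^{(rch)} = 0, \sigma^2 \in \mathbb{R}^+ \} = \{ (\Gamma 0, \sigma^2) : \sigma^2 \in \mathbb{R}^+ \} = \{ (0, \sigma^2) : \sigma^2 \in \mathbb{R}^+ \} = \Theta_0 \]

and

\[ \bar g_1(\Theta_1) = \{ (\Gamma\beta_2^{(rch)}, \sigma^2) : \beta_2^{(rch)} \in \mathbb{R}^{m_2} \setminus \{0\}, \sigma^2 \in \mathbb{R}^+ \} = \Theta_1. \]

To further reduce $M_1(Y)$, observe that the test problem is also invariant under the group $G_2$ of scale changes

\[ g_2(\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}) = (c\,\hat\beta_2^{(rch)}, c^2 \hat\sigma^2_{(rch)}), \]

which induces the group $\bar G_2$ of parameter transformations $\bar g_2(\beta_2^{(rch)}, \sigma^2) = (c\,\beta_2^{(rch)}, c^2\sigma^2)$, because of the transitions in distribution

\[ \hat\beta_2^{(rch)} \sim N(\beta_2^{(rch)}, \sigma^2 I) \;\longrightarrow\; c\,\hat\beta_2^{(rch)} \sim N(c\,\beta_2^{(rch)}, c^2\sigma^2 I), \]
\[ \hat\sigma^2_{(rch)} \sim \frac{\sigma^2}{n - m}\, \chi^2(n - m) \;\longrightarrow\; c^2 \hat\sigma^2_{(rch)} \sim \frac{c^2\sigma^2}{n - m}\, \chi^2(n - m), \]

the latter being gamma distributions, with $m = m_1 + m_2$.

Next, we observe that

\[ M_1\!\left( g_2(\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}) \right) = M_1\!\left( c\,\hat\beta_2^{(rch)}, c^2\hat\sigma^2_{(rch)} \right) = \left( c^2 \hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)},\; c^2 \hat\sigma^2_{(rch)} \right) = \hat g_2\!\left( M_1(\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}) \right) \]

holds if

\[ \hat g_2\!\left( \hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)} \right) = \left( c^2 \hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)},\; c^2 \hat\sigma^2_{(rch)} \right) \]

defines the group $\hat G_2$ of scale changes. It follows that

\[ M_2\!\left( \hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)} \right) = \frac{\hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)}}{\hat\sigma^2_{(rch)}} \]

is a maximal invariant under $\hat G_2$. Then, the theorem on the stepwise construction of maximal invariants implies that

\[ M_2\!\left( M_1(\hat\beta_2^{(rch)}, \hat\sigma^2_{(rch)}) \right) = \frac{\hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)}}{\hat\sigma^2_{(rch)}} \]

is the statistic maximally invariant under orthogonal transformations and scale changes. Since $\hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)} / \sigma^2$ has a non-central $\chi^2$-distribution with $m_2$ degrees of freedom and $(n - m)\hat\sigma^2_{(rch)} / \sigma^2$ a central $\chi^2$-distribution with $n - m$ degrees of freedom (both statistics being independently distributed), the maximal invariant

\[ M(Y) := \frac{\hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)}}{m_2\, \hat\sigma^2_{(rch)}} \]

is distributed as $F'(m_2, n - m, \lambda)$ (see Koch, 1999). The invariant test problem is finally given by

\[ M(Y) = \frac{\hat\beta_2^{(rch)\prime}\hat\beta_2^{(rch)}}{m_2\, \hat\sigma^2_{(rch)}} \sim F'(m_2, n - m, \lambda), \qquad H_0 : \lambda = 0 \quad \text{against} \quad H_1 : \lambda > 0, \]

with $\lambda = \beta_2^{(rch)\prime}\beta_2^{(rch)} / \sigma^2$, which is about one single unknown parameter $\lambda$, a one-sided alternative hypothesis, and a test statistic whose distribution has a monotone density ratio. Therefore, there exists a UMP test for the invariance-reduced test problem at level $\alpha$, which is given by

\[ \phi(y) = \begin{cases} 1, & \text{if } M(y) > k^{F(m_2, n-m)}_{1-\alpha}, \\ 0, & \text{if } M(y) < k^{F(m_2, n-m)}_{1-\alpha}. \end{cases} \]

It follows that $\phi$ is the UMPI test for testing $H_0 : H\beta = w$ versus $H_1 : H\beta \neq w$ in the original observation model $Y \sim N(X\beta, \sigma^2 P^{-1})$, in which the variance factor has been assumed to be unknown a priori.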

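3.2.6 Back-substitution

The test statistic $M(Y)$ is inconvenient to compute as it comprises quantities transformed in multiple ways. Therefore, we will express $M(Y)$ in terms of the original quantities of the model $Y = X\beta + E$ with $\Sigma\{E\} = \sigma^2 P^{-1}$. This will be achieved in three steps reversing the transformations of Sections 3.2.1 to 3.2.3, where each step covers a particular equivalent form of the test problem often encountered in practice. Throughout, $\left( N_{22}^{(-1)} \right)^{-1}$ denotes the Schur complement $X_2'PX_2 - X_2'PX_1 (X_1'PX_1)^{-1} X_1'PX_2$ of the respective partitioned normal equation matrix (with $P = I$ in the homogenized case).

Case 1: $\sigma^2 = \sigma_0^2$ known.

Proposition 3.3. The invariant test statistic

\[ M(Y) = \hat\beta_2^{(rch)\prime} \hat\beta_2^{(rch)} / \sigma_0^2 \]

for the UMP test at level $\alpha$ regarding the hypotheses $H_0 : \lambda = 0$ versus $H_1 : \lambda > 0$, i.e. for the UMPI test at level $\alpha$ regarding the original hypotheses $H_0 : H\beta = w$ versus $H_1 : H\beta \neq w$, is identical to:

1. the test statistic

\[ M(Y) = \hat\beta_2^{(rc)\prime} \left( N_{22}^{(-1)} \right)^{-1} \hat\beta_2^{(rc)} / \sigma_0^2 \]

for the equivalent test problem

\[ Y^{(ch)} \sim N\!\left( X_1^{(rh)} \beta_1^{(r)} + X_2^{(rh)} \beta_2^{(rc)},\; \sigma_0^2 I \right), \qquad H_0 : \beta_2^{(rc)} = 0 \quad \text{versus} \quad H_1 : \beta_2^{(rc)} \neq 0, \]

which we will call the problem of testing the significance of additional parameters $\beta_2^{(rc)}$ with known variance factor $\sigma_0^2$, if the least squares estimator

\[ \hat\beta_2^{(rc)} = N_{22}^{(-1)} X_2^{(rh)\prime} Y^{(ch)} - N_{22}^{(-1)} X_2^{(rh)\prime} X_1^{(rh)} \left( X_1^{(rh)\prime} X_1^{(rh)} \right)^{-1} X_1^{(rh)\prime} Y^{(ch)} \]

with residuals

\[ \hat E^{(rch)} = \left( I - X_1^{(rh)} \left( X_1^{(rh)\prime} X_1^{(rh)} \right)^{-1} X_1^{(rh)\prime} \right) \left( Y^{(ch)} - X_2^{(rh)} \hat\beta_2^{(rc)} \right) \]

is used. Whenever a test problem is naturally given in this form, i.e. by

\[ Y \sim N(X_1\beta_1 + X_2\beta_2, \sigma_0^2 I), \qquad H_0 : \beta_2 = 0 \quad \text{versus} \quad H_1 : \beta_2 \neq 0, \]

which we will call the natural problem of testing the significance of additional parameters $\beta_2$ with known variance factor $\sigma_0^2$, then all the indices are omitted, in which case the test statistic of the UMPI test reads

\[ M(Y) = \hat\beta_2' \left( N_{22}^{(-1)} \right)^{-1} \hat\beta_2 / \sigma_0^2 \]

with least squares estimator

\[ \hat\beta_2 = N_{22}^{(-1)} X_2'Y - N_{22}^{(-1)} X_2'X_1 (X_1'X_1)^{-1} X_1'Y \]

and residuals

\[ \hat E = \left( I - X_1 (X_1'X_1)^{-1} X_1' \right) (Y - X_2\hat\beta_2). \]

2. the test statistic

\[ M(Y) = \left( \hat\beta_2^{(r)} - w \right)' \left( N_{22}^{(-1)} \right)^{-1} \left( \hat\beta_2^{(r)} - w \right) / \sigma_0^2 \]

for the equivalent test problem

\[ Y \sim N\!\left( X_1^{(r)} \beta_1^{(r)} + X_2^{(r)} \beta_2^{(r)},\; \sigma_0^2 P^{-1} \right), \qquad H_0 : \beta_2^{(r)} = w \quad \text{versus} \quad H_1 : \beta_2^{(r)} \neq w, \]

if the least squares estimator

\[ \hat\beta_2^{(r)} = N_{22}^{(-1)} X_2^{(r)\prime} P Y - N_{22}^{(-1)} X_2^{(r)\prime} P X_1^{(r)} \left( X_1^{(r)\prime} P X_1^{(r)} \right)^{-1} X_1^{(r)\prime} P Y \]

with residuals

\[ \hat E^{(r)} = \left( I - X_1^{(r)} \left( X_1^{(r)\prime} P X_1^{(r)} \right)^{-1} X_1^{(r)\prime} P \right) \left( Y - X_2^{(r)} \hat\beta_2^{(r)} \right) \]

is used. Whenever a test problem is naturally given in this form, i.e. by

\[ Y \sim N(X_1\beta_1 + X_2\beta_2, \sigma_0^2 P^{-1}), \qquad H_0 : \beta_2 = w \quad \text{versus} \quad H_1 : \beta_2 \neq w, \]

then all the indices are omitted, in which case the test statistic of the UMPI test reads

\[ M(Y) = (\hat\beta_2 - w)' \left( N_{22}^{(-1)} \right)^{-1} (\hat\beta_2 - w) / \sigma_0^2 \]

with least squares estimator

\[ \hat\beta_2 = N_{22}^{(-1)} X_2'PY - N_{22}^{(-1)} X_2'PX_1 (X_1'PX_1)^{-1} X_1'PY \]

and residuals

\[ \hat E = \left( I - X_1 (X_1'PX_1)^{-1} X_1'P \right) (Y - X_2\hat\beta_2). \]

3. the test statistic

\[ M(Y) = (H\hat\beta - w)' \left( H (X'PX)^{-1} H' \right)^{-1} (H\hat\beta - w) / \sigma_0^2 \]

for the original test problem

\[ Y \sim N(X\beta, \sigma_0^2 P^{-1}), \qquad H_0 : H\beta = w \quad \text{versus} \quad H_1 : H\beta \neq w, \]

if the least squares estimator $\hat\beta = (X'PX)^{-1} X'PY$ with residuals $\hat E = Y - X\hat\beta$ is used.

Proof. Part 1: Reversing the parameter homogenization $\hat\beta_2^{(rch)} = G_2'\hat\beta_2^{(rc)}$ yields

\[ M(Y) = \hat\beta_2^{(rch)\prime} \hat\beta_2^{(rch)} / \sigma_0^2 = \hat\beta_2^{(rc)\prime} G_2 G_2' \hat\beta_2^{(rc)} / \sigma_0^2 = \hat\beta_2^{(rc)\prime} \left( N_{22}^{(-1)} \right)^{-1} \hat\beta_2^{(rc)} / \sigma_0^2, \]

which proves the equality of the invariant statistic and the statistic of item 1. The hypotheses $H_0 : \lambda = 0$ versus $H_1 : \lambda > 0$ and $H_0 : \beta_2^{(rc)} = 0$ versus $H_1 : \beta_2^{(rc)} \neq 0$ have already been shown to be equivalent by virtue of invariance of the hypotheses (Section 3.2.5, Case 1). Furthermore, the least squares estimator $\hat\beta_2^{(rc)}$ of item 1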

52 46 3 THEORY AND APPLICATIONS OF MISSPECIFICATION TESTS IN THE NORMAL GMM is the sufficiet statistic i the observatio model itroduced i Sectio Part : Reversig the data homogeizatio to 3.-5 results i β rc = N X r G X r GG X r X r GG X r r X = N X r PY c N X r PX r X r PX r G G Y c r X PY c. Usig the o-cetered observatios 3.-3 ad the defiitio 3.-5 of the Schur complemet, we obtai β rc = N X r P Y X r w = N X r N PY N X r PX r X r PX r X r PX r = N X r PY N X r PX r = β r w. N X r PX r X r PX r X r PX r X r PX r X r PX r r X X r X r PY PX r r X P Y X r w w PY N N w r From Propositio 3. it follows that β is ideed the least squares estimator for β r i the partitioed observatio model Usig this result, the test statistics 3.-6 ad 3.-7 are idetical if the least squares estimators withi the correspodig observatio models are applied. Part 3: As a first step towards provig the third part of the propostio we will ow prove the idetity = HX PX H. N Usig the expressio 3.-7 for X we obtai X PX = [M H ] X r P = [M H ] X r [ ] X r X r M H X r PX r X r PX r X r PX r X r PX r M H. Ivertig both sided yields X PX = M H X r PX r X r PX r X r PX r X r PX r [ M H ] It follows that M H X PX [ M H ] = X r PX r X r PX r X r PX r X r PX r. After expadig the left side ad itroducig blocks of the total iverse, we have MX PX M MA PA H = X r PX r X r PX r HA PA M HA PA H X r PX r X r PX r. Usig the defiitio 3.-5 of the Schur complemet, the idetity HX PX H = X r PX r = N

53 3. Derivatio of optimal tests cocerig parameters of the fuctioal model 47 is see to hold. Iversio of this equatio provides us with the desired result HX PX H = N I a secod step we will prove the equality may be expressed as β = M H = M H X r PX r X r PX r β r = H β. Usig 3.-7 ad 3.-87, the least squares estimator X r PX r X r PX r X r PX r X r PX r X r PX r X r PX r Now, the pre-multiplied versio of this equatio, that is, M β = X r PX r X r PX r X r PY H X r PX r X r PX r X r [ M H ] [ M H ] X r X r PY X r PY X r clearly implies that H β = [ X r PX r X r PX r ] X r X r PY. Observig that X r PX r = N ad usig Equatio. i Koch 999, p. 33 to obtai the expressio X r PX r = N X r PX r X r PX r for the other block of the iverse, we obtai H β = [ ] N X r PX r X r PX r N X r X r PY = N X r PX r X r PX r X r PY + N X r r PY = β accordig to This proves that βr w N i.e. the statistics 3.-8 ad 3.-7 are idetical. βr w = H β HX w PX H H β w,

54 48 3 THEORY AND APPLICATIONS OF MISSPECIFICATION TESTS IN THE NORMAL GMM Case : σ ukow. Propositio 3.4. The ivariat test statistic MY = rch β β rch /m σ rch 3.-9 for the UMP test at level α regardig the hypotheses H : λ = versus H : λ >, 3.-9 i.e. for the UMPI test at level α regardig the origial hypotheses H : H β = w versus H : H β w, is idetical to:. the test statistic MY = rc β N for the equivalet test problem Y ch N βrc /m σ rch 3.-9 X rh β rc + X rh β rc,σ I H : β rc = versus H : β rc, which we will call the problem of testig the sigificace of additioal parameters β rc with ukow variace factor, if the least squares estimators β rc = N X rh Y ch N X rh X rh X rh X rh rh X Y ch σ rch = Ê rch Ê rch / m with residuals Ê rch = I X rh X rh X rh X rh Y ch X rh β r are used. Wheever a test problem is aturally give i the form ad 3.-94, i.e. by Y N X β + X β,σ I H : β = versus H : β, which we will call the atural problem of testig the sigificace of additioal parameters β with ukow variace factor, the all the idices are omitted, i which case the test statistic of the UMPI test reads MY = β N with least squares estimators β /m σ 3.- β = N X Y N X X X X X Y 3.- σ = Ê Ê/ m 3.- ad residuals Ê = I X X X X Y X β idetical to the test statistic MY = βr w N βr w /m σ r 3.-4

55 3. Derivatio of optimal tests cocerig parameters of the fuctioal model 49 for the equivalet test problem Y N X r βr + X r βr,σ P 3.-5 H : β r = w versus H : β r w 3.-6 if the least squares estimators β r = N X r PY N X r PX r X r PX r r X PY 3.-7 σ r = Ê r P Êr / m 3.-8 with residuals Ê r = I X r X r PX r r X P Y X r β r 3.-9 are used. Wheever a test problem is aturally give i the form 3.-5 ad 3.-6, i.e. by Y N X β + X β,σ P 3.- H : β = w versus H : β w 3.- the all the idices are omitted, i which case the test statistic of the UMPI test reads MY = β w with least squares estimators N β w /m σ 3.- β = N X PY N X PX X PX X PY 3.-3 σ = Ê P Ê/ m 3.-4 ad residuals Ê = I X X PX X Y P X β idetical to the test statistic MY =H β w HX PX H H β w/m σ, 3.-6 for the origial test problem Y N Xβ,σ P 3.-7 H : H β = w versus H : H β w 3.-8 if the least squares estimators β = X PX X PY 3.-9 σ = Ê P Ê/ m 3.- with residuals Ê = Y X β 3.- are used.

56 5 3 THEORY AND APPLICATIONS OF MISSPECIFICATION TESTS IN THE NORMAL GMM Proof. Part : Reversig the parameter homogeizatio by usig yields rch MY = β β rch /m σ rch rc = β G G rc β /m σ rch rc = β N βrc /m σ rch, which proves equality of 3.-9 ad The hypotheses 3.-9 ad have already bee show to be equivalet by virtue of ivariace of the hypotheses Sectio 3..5, Case. I additio, 3.-9 ad 3.- are the sufficiet statistics i the observatio model itroduced i Sectio rc r Part : I Part of the proof of Propositio 3.3, it was already show that β = β w. Now we show i additio that σ ch = σ. Reversig the homogeizatio 3.-37, ceterig 3.-3, ad the result β rc = β r w, the residuals may be rewritte as Ê rch = I P r X X r PX r r X P = I P r X X r PX r r X P = P I X r X r PX r r X P = P Ê r. P Y X r P Y X r Y X r w P r r X β w w Xr β r β r + X r w Therefore, σ rch = Ê rch Ê rch / m =Ê r P Êr / m = σ. I other words, the estimator for the variace of uit weight is ot affected by trasformatios that ivolve ceterig or homogeizatio of the observatio model. Propositio 3. shows that σ r is the least squares estimator of σ. Usig the above equalities, the test statistics 3.-9 ad 3.-4 are idetical if the least squares estimators withi the correspodig observatio models are applied. Part 3: I additio to the result give by Part 3 of the proof of Propositio 3.3, it remais to prove that also σ = σ r. From the equivalet expressios ad 3.-7 for the least squares estimator 3.-9 ad X, respectively, it follows that Ê = Y X β [ ] = Y X r X r M M H H [ = Y X r X r PX r X r + X r ] + X r N X r PY X r PX r X r PX r X r PX r X r PX r X r PX r X r + X r X r X r PY X r PX r X r Usig the idetity relatios Equatio. i Koch, 999, p. 
33 for submatrices of the iverse of a -block matrix, we obtai [ Ê = Y X r X r PX r +X r PX r X r PX r N X r PX r X r PX r ] X r PY + X r N X r PX r X r PX r X r PY +X r X r PX r X r PX r N X r PY X r N X r PY = Y X r = X r PY N X r PY N X r PX r X r PX r X r +X r X r PX r X r PX r I X r X r PX r X r P X r PX r X r N X r Y X r PY N β r PY X r PX r X r PX r X r = Êr PY accordig to This proves that the reparameterizatio does ot affect the estimator of the residuals. Due to σ = Ê P Ê/ m =Ê r P Êr / m = σ r, the reparameterizatio does ot chage the estimator of the variace factor either. This completes the proof of βr w N βr w /m σ r = H β HX w PX H H β w m σ.
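The identity established by the back-substitution (unknown variance factor) can be checked numerically. The following NumPy sketch is an illustration under assumptions: data, dimensions, and the particular completion of `H` by null-space rows are hypothetical. It evaluates the statistic once in the original model with constraints `H beta = w` and once in the reparameterized model, where the hypothesis concerns the second parameter block directly.

```python
import numpy as np

def f_statistic_original(X, P, Y, H, w):
    """(Hb - w)' [H (X'PX)^{-1} H']^{-1} (Hb - w) / (m2 * s2)."""
    n, m = X.shape
    m2 = H.shape[0]
    Qb = np.linalg.inv(X.T @ P @ X)
    b = Qb @ X.T @ P @ Y                # weighted least squares estimate
    e = Y - X @ b
    s2 = e @ P @ e / (n - m)            # unbiased variance factor estimate
    d = H @ b - w
    return d @ np.linalg.solve(H @ Qb @ H.T, d) / (m2 * s2)

def f_statistic_reparameterized(X, P, Y, H, w):
    """Same statistic after the substitution beta_r = [M; H] beta."""
    n, m = X.shape
    m2 = H.shape[0]
    m1 = m - m2
    _, _, Vt = np.linalg.svd(H)
    T = np.vstack([Vt[m2:, :], H])      # assumed completion: null-space rows
    Xr = X @ np.linalg.inv(T)
    Qr = np.linalg.inv(Xr.T @ P @ Xr)
    br = Qr @ Xr.T @ P @ Y
    e = Y - Xr @ br
    s2 = e @ P @ e / (n - m)
    d = br[m1:] - w                     # second block minus w
    # Qr[m1:, m1:] is the inverse of the Schur complement.
    return d @ np.linalg.solve(Qr[m1:, m1:], d) / (m2 * s2)

rng = np.random.default_rng(3)          # arbitrary seed, illustration only
n, m, m2 = 12, 4, 2                     # hypothetical dimensions
X = rng.normal(size=(n, m))
A = rng.normal(size=(n, n))
P = A @ A.T + n * np.eye(n)             # some positive definite weight matrix
H = rng.normal(size=(m2, m))
w = rng.normal(size=m2)
Y = rng.normal(size=n)

M_orig = f_statistic_original(X, P, Y, H, w)
M_rep = f_statistic_reparameterized(X, P, Y, H, w)
```

Both values agree up to rounding, so in practice the statistic may be computed from whichever form of the model is at hand; it would then be compared with the appropriate quantile of the F-distribution with `m2` and `n - m` degrees of freedom.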

3.2.7 Equivalent forms of the UMPI test concerning parameters of the functional model

We will now prove that the Generalized Likelihood Ratio test is equivalent to the UMPI test if the set of linear restrictions is tested. We will restrict attention to the problem of testing the significance of additional parameters (item 1 of Proposition 3.3), knowing that if the test problem naturally involves a set of linear restrictions $H\beta = w$ (item 3 of Proposition 3.3), it may be transformed into the first form.

Case 1: $\sigma^2 = \sigma_0^2$ known.

Proposition 3.5. For testing $H_0 : \beta_2 = 0$ versus $H_1 : \beta_2 \neq 0$ in the (possibly reparameterized, centered and homogenized) linear model $Y \sim N(X_1\beta_1 + X_2\beta_2, \sigma_0^2 I)$, the statistic

\[ M(Y) = \hat\beta_2' \left( X_2'X_2 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \right) \hat\beta_2 / \sigma_0^2 \]

of the UMPI test is identical to:

1. the Likelihood Ratio statistic

\[ T_{LR}(Y) = -2 \left( \mathcal{L}(\tilde\beta_1, \tilde\beta_2; Y) - \mathcal{L}(\hat\beta_1, \hat\beta_2; Y) \right), \]

where $\mathcal{L}$ denotes the log-likelihood function,

2. Rao's Score statistic

\[ T_{RS}(Y) = \tilde U' X_2 \left( X_2'X_2 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \right)^{-1} X_2' \tilde U / \sigma_0^2, \]

if the least squares estimators $\tilde\beta_1$ under the restrictions of $H_0$ (with residuals $\tilde U = Y - X_1\tilde\beta_1$) and $\hat\beta_1$, $\hat\beta_2$ unrestricted are used.

Proof. Part 1: The GLR/LR test compares, on the basis of given data, the likelihood of the observation model under $H_0$ to that of the model under $H_1$. If $H_0$ is true, then the above observation model is a Gauss-Markov model with restrictions $\beta_2 = 0$. As this restricted model may also be written as $Y \sim N(X_1\beta_1, \sigma_0^2 I)$, the restricted least squares estimators are given by

\[ X_1'X_1 \tilde\beta_1 = X_1'Y, \qquad \tilde\beta_2 = 0. \]

If, on the other hand, $H_1$ is true, then the observation model constitutes an unrestricted Gauss-Markov model, and the unrestricted least squares estimators read

\[ X_1'X_1 \hat\beta_1 + X_1'X_2 \hat\beta_2 = X_1'Y, \]
\[ X_2'X_1 \hat\beta_1 + X_2'X_2 \hat\beta_2 = X_2'Y. \]

Koch (1999) proved that, due to the normal distribution of the observations, these least squares estimators are identical to the maximum likelihood estimators.
Therefore, the GLR becomes, accordig to.5-94, GLRY = L β }, ; Y πσ / L β, β ; Y = exp Y X σ β X β Y X β X β } πσ / exp Y X σ β X β Y X β X β exp Y Y β σ = X Y β X Y + β X X β + β X X β + β X X β } exp Y Y β σ X Y β X Y + β X X β + β X X β + β } X X β = exp σ β X Y β X Y + β X X β + β X X β + β X X β + β X Y + β X Y β X X β β X X β β X X β }. Notice ow that β X Y equals β times Substitutio of this ad of 3.-6 yields GLRY = exp σ β X Y + β X X β + β X Y + β X X β + β X X β β X X β β X X β β X X β } = exp σ β X Y + β X X β + β X Y β X X β + β } X X β

58 5 3 THEORY AND APPLICATIONS OF MISSPECIFICATION TESTS IN THE NORMAL GMM The, substitutio of β =X X X Y from 3.-5 ad β =X X X Y X X β from 3.-7 leads to GLRY = exp Y σ X X X X Y + Y X X X X X X X X Y +X Y X X β X X X Y X Y X X β X X X X X X X Y X X β + β X X β } = exp β X X β β } X X X X X X β = exp σ β σ X X X X X X X X β }. Usig defiitios.5-95 ad.5-99, the LR statistic T LR Y = lglry = σ β X X X X X X X X β. is ideed equal to MY. Part : The RS test determies whether the estimates for the Gauss-Markov model with restrictios valid uder H satisfy the likelihood equatios for the urestricted Gauss-Markov model valid uder H. Therefore, we must first determie log-likelihood fuctio of the urestricted model, which is give by [ πσ / Lβ, β ; Y = l exp }] σ Y X β X β Y X β X β = lπσ σ Y X β X β Y X β X β. From this, the log-likelihood score.5- follows to be Sβ, β ; Y = Lβ, β ; Y β Lβ, β ; Y β = σ X Y X β X β X Y X β X β Evaluatig the score at the restricted least squares estimators β =X X X Y ad β = as i 3.-5 ad 3.-6, ad usig the correspodig residuals Ũ = Y X β, yields S β, ; Y = X Y X β σ X Y X β = σ X ũ The Hessia ad the iformatio of Y are the Lβ, β ; Y Hβ, β ; Y = β β Lβ, β ; Y β β Lβ, β ; Y β β Lβ, β ; Y β β = σ X X X X X X X X ad Iβ, β ; Y =E β,β Hβ, β ; Y } = X X X X σ. X X X X

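Now, using the definition of the RS statistic, we obtain

\[ T_{RS} = \begin{bmatrix} 0 \\ \frac{1}{\sigma_0^2} X_2'\tilde U \end{bmatrix}' \left( \frac{1}{\sigma_0^2} \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix} \right)^{-1} \begin{bmatrix} 0 \\ \frac{1}{\sigma_0^2} X_2'\tilde U \end{bmatrix} = \frac{1}{\sigma_0^2} \begin{bmatrix} 0 & \tilde U'X_2 \end{bmatrix} \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ X_2'\tilde U \end{bmatrix} \]
\[ = \frac{1}{\sigma_0^2}\, \tilde U' X_2 N_{22}^{(-1)} X_2' \tilde U = \frac{1}{\sigma_0^2}\, \tilde U' X_2 \left( X_2'X_2 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \right)^{-1} X_2' \tilde U, \]

where $N_{22}^{(-1)}$ denotes the lower-right block of the inverse normal equation matrix, that is, the inverse of the Schur complement. The statistic $T_{RS}$ is defined in terms of the restricted estimates $\tilde\beta_1$ through the residuals $\tilde U = Y - X_1\tilde\beta_1$. On the other hand, $M$ is a function of the unrestricted estimates $\hat\beta_2$. To show that both statistics are identical, we will express $M$ as a function of $\tilde\beta_1$. The first step is to combine the restricted normal equations $X_1'X_1\tilde\beta_1 = X_1'Y$ with the first of the unrestricted normal equations, $X_1'X_1\hat\beta_1 + X_1'X_2\hat\beta_2 = X_1'Y$, which yields

\[ X_1'X_1\hat\beta_1 + X_1'X_2\hat\beta_2 = X_1'X_1\tilde\beta_1, \]

which in turn implies that

\[ \hat\beta_1 = \tilde\beta_1 - (X_1'X_1)^{-1} X_1'X_2 \hat\beta_2. \]

Substitution of this result into the second unrestricted normal equation $X_2'X_1\hat\beta_1 + X_2'X_2\hat\beta_2 = X_2'Y$ gives

\[ X_2'X_1\tilde\beta_1 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \hat\beta_2 + X_2'X_2\hat\beta_2 = X_2'Y, \]

or, after rearranging terms,

\[ \left( X_2'X_2 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \right) \hat\beta_2 = X_2'Y - X_2'X_1\tilde\beta_1, \]

and finally

\[ \hat\beta_2 = \left( X_2'X_2 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \right)^{-1} X_2' (Y - X_1\tilde\beta_1) = N_{22}^{(-1)} X_2' \tilde U. \]

Now we can rewrite the statistic $M$ as

\[ M(Y) = \hat\beta_2' \left( N_{22}^{(-1)} \right)^{-1} \hat\beta_2 / \sigma_0^2 = \tilde U' X_2 N_{22}^{(-1)} \left( N_{22}^{(-1)} \right)^{-1} N_{22}^{(-1)} X_2' \tilde U / \sigma_0^2 = \tilde U' X_2 N_{22}^{(-1)} X_2' \tilde U / \sigma_0^2, \]

which completes the proof that $M = T_{RS}$.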
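For the case of a known variance factor, the identity of the three statistics can be verified on simulated data. The sketch below is illustrative only; the dimensions, the value `sigma0_sq`, and the random data are assumptions. It computes the invariant statistic from the unrestricted estimate, Rao's Score statistic from the restricted residuals, and the Likelihood Ratio statistic as minus two times the difference of the log-likelihoods.

```python
import numpy as np

rng = np.random.default_rng(4)          # arbitrary seed, illustration only
n, m1, m2 = 15, 3, 2                    # hypothetical dimensions
sigma0_sq = 0.5                         # variance factor assumed known
X1 = rng.normal(size=(n, m1))
X2 = rng.normal(size=(n, m2))
Y = rng.normal(size=n)
X = np.hstack([X1, X2])

# Schur complement X2'X2 - X2'X1 (X1'X1)^{-1} X1'X2.
S = X2.T @ X2 - X2.T @ X1 @ np.linalg.solve(X1.T @ X1, X1.T @ X2)

# Restricted fit (beta_2 = 0) and its residuals.
b_tilde1 = np.linalg.solve(X1.T @ X1, X1.T @ Y)
U_tilde = Y - X1 @ b_tilde1

# Unrestricted fit.
b_hat = np.linalg.solve(X.T @ X, X.T @ Y)
U_hat = Y - X @ b_hat
b_hat2 = b_hat[m1:]

def loglik(res):
    """Gaussian log-likelihood with known variance factor, given residuals."""
    return -0.5 * n * np.log(2 * np.pi * sigma0_sq) - res @ res / (2 * sigma0_sq)

M = b_hat2 @ S @ b_hat2 / sigma0_sq
T_RS = (X2.T @ U_tilde) @ np.linalg.solve(S, X2.T @ U_tilde) / sigma0_sq
T_LR = -2.0 * (loglik(U_tilde) - loglik(U_hat))
```

A practical consequence of the identity is that the score form only requires fitting the restricted model, which can be convenient when the unrestricted model is expensive to adjust.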

Case 2: $\sigma^2$ unknown.

Proposition 3.6. For testing $H_0 : \beta_2 = 0$ versus $H_1 : \beta_2 \neq 0$ in the (possibly reparameterized, centered and homogenized) linear model $Y \sim N(X_1\beta_1 + X_2\beta_2, \sigma^2 I)$, the UMPI test, based on the statistic

\[ M(Y) = \hat\beta_2' \left( X_2'X_2 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \right) \hat\beta_2 / (m_2\, \hat\sigma^2), \]

is equivalent to:

1. the Likelihood Ratio test, based on the statistic

\[ T_{LR}(Y) = n \ln\left( 1 + \frac{m_2}{n - m}\, M(Y) \right), \]

2. Rao's Score test, based on the statistic

\[ T_{RS}(Y) = \tilde U' X_2 \left( X_2'X_2 - X_2'X_1 (X_1'X_1)^{-1} X_1'X_2 \right)^{-1} X_2' \tilde U / \tilde\sigma^2_{ML} = n\, \frac{\frac{m_2}{n - m}\, M(Y)}{1 + \frac{m_2}{n - m}\, M(Y)}, \]

if the maximum likelihood estimators $\tilde\beta_1$, $\tilde\sigma^2_{ML}$ under the restriction of $H_0$ and $\hat\beta_1$, $\hat\beta_2$, $\hat\sigma^2_{ML}$ unrestricted are used.

Proof. Part 1: The only difference to Part 1 of Proposition 3.5 is that the variance factor must be estimated here in addition to the functional parameters. Now, if $H_0$ is true, then the observation model is a Gauss-Markov model with restrictions, which may be written in the simple form $Y \sim N(X_1\beta_1, \sigma^2 I)$. The restricted least squares estimators are given by

\[ X_1'X_1 \tilde\beta_1 = X_1'Y, \qquad \tilde\beta_2 = 0, \qquad \tilde\sigma^2 = (Y - X_1\tilde\beta_1)'(Y - X_1\tilde\beta_1) / (n - m_1). \]

Notice that the estimators $\tilde\beta_1$ and $\tilde\beta_2$ when the variance factor is unknown are exactly the same as the corresponding estimators when the variance factor is known. The alternative observation model under $H_1$ reads $Y \sim N(X_1\beta_1 + X_2\beta_2, \sigma^2 I)$, for which the least squares estimators are given by

\[ X_1'X_1 \hat\beta_1 + X_1'X_2 \hat\beta_2 = X_1'Y, \]
\[ X_2'X_1 \hat\beta_1 + X_2'X_2 \hat\beta_2 = X_2'Y, \]
\[ \hat\sigma^2 = (Y - X_1\hat\beta_1 - X_2\hat\beta_2)'(Y - X_1\hat\beta_1 - X_2\hat\beta_2) / (n - m). \]

Again, the fact that $\sigma^2$ must be estimated does not affect the structure of the estimators for $\beta_1$ and $\beta_2$. Consequently, the estimators $\hat\beta_1$ and $\hat\beta_2$ of the current model are identical to those of the model with known variance factor. It will also be useful to note that the variance estimators $\tilde\sigma^2$ and $\hat\sigma^2$ are divided by different scaling factors or degrees of freedom ($n - m_1$ and $n - m$, respectively), in light of the fact that the functional models under $H_0$ and $H_1$ have different numbers of functional parameters ($m_1$ parameters $\beta_1$ and $m = m_1 + m_2$ parameters $[\beta_1' \;\; \beta_2']'$, respectively).
As i Part of Propositio 3.5, the least squares estimators 3.-33, 3.-34, ad for the parameters of the fuctioal model are exactly the same as the maximum likelihood estimators. However, as far as the estimatio of σ is cocered, Koch 999, p. 6 shows that the ubiased least squares estimators ad differ from the correspodig biased maximum likelihood estimators, give by σ ML =Y X β Y X β / for the observatio model uder H,ad σ ML =Y X β X β Y X β X β / 3.-4 for the alterative observatio model uder H. However, if these maximum likelihood estimators are adjusted such that the scalig factor is replaced i each case by the correct degree of freedom, the we obtai the

61 3. Derivatio of optimal tests cocerig parameters of the fuctioal model 55 least squares estimators ad 3.-38, which we might call the bias-corrected maximum likelihood estimators of σ. After these prelimiary remarks, we may evaluate the Geeralized Likelihood Ratio, that is, π σ / ML exp GLRY = L β,, σ ; Y L β, β, σ ; Y = = σ ML σ ML } X σ MLY β X Y X β X } X σ MLY β X β Y X β X β } σ / = ML. π σ ML / exp / exp σ ML σ ML + σ ML σ ML By virtue of the defiitios.5-95 ad.5-99 regardig the Likelihood Ratio statistic, we obtai σ / T LR = lglry = l ML σ ML = l We will ow prove that l + m m M= l σ ML + m m M = + m = + β m β σ ML σ ML σ ML σ ML or, equivaletly, that + m m M = σ ML. X X X X X X X X β /m σ X X X X X X X X β / m σ. Usig the equality m σ = σ ML,weobtai + m m M = σ ML σ ML + β } X X X X X X X X β = σ ML Y X β X β Y X β X β + β } X X X X X X X X β. Observe ow that the equality β X Y = β X X β + β X X β whe is multiplied by β,so that some of terms cacel out, leadig to + m m M = Y σ ML Y Y X β + β X X β β } X X X X X X β. Notice that, by usig 3.-33, may be rewritte as β =X X X Y X X β = β X X X X β. Substitutio of this expressio allows for the simplificatio + m m M = [ Y σ ML Y Y X β X X X X β ] [ + β β X X X X ] [ β X X X X β ] β X X X X X X β } = X X Y Y Y X β + β } X X β σ ML = σ ML σ ML. From this follows directly the desired equality l + σ ML = Y σ ML X β Y X β m m M= l σ ML = T σ ML LR. The LR test is truly equivalet to the UMPI test because T LR is see to be a strictly mootoically icreasig fuctio of M. Therefore, if the UMPI test is based o the statistic T LR Y istead of MY, the the critical value may be trasformed accordigly by this fuctio, ad the trasformed regio of rejectio is equivalet to the origial regio. 
The difference between the case of $\sigma^2$ known a priori and the case of $\sigma^2$ unknown is that, in the first case, the test statistics $T_{LR}(Y)$ and $M(Y)$, hence their distributions, and therefore also the region of rejection remain unchanged, while in the second case, all of these quantities do change, but remain equivalent.

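Part 2: From the log-likelihood function

$$L(\beta_1,\beta_2,\sigma^2;Y) = \ln\left[(2\pi\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}(Y - X_1\beta_1 - X_2\beta_2)'(Y - X_1\beta_1 - X_2\beta_2)\right\}\right] = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}(Y - X_1\beta_1 - X_2\beta_2)'(Y - X_1\beta_1 - X_2\beta_2)$$

of the observation model including both $H_0$ and $H_1$, the log-likelihood score is derived as

$$S(\beta_1,\beta_2,\sigma^2;Y) = \begin{bmatrix} \partial L/\partial\beta_1 \\ \partial L/\partial\beta_2 \\ \partial L/\partial\sigma^2 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sigma^2}X_1'(Y - X_1\beta_1 - X_2\beta_2) \\ \frac{1}{\sigma^2}X_2'(Y - X_1\beta_1 - X_2\beta_2) \\ -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}(Y - X_1\beta_1 - X_2\beta_2)'(Y - X_1\beta_1 - X_2\beta_2) \end{bmatrix}.$$

If $U = Y - X_1\beta_1 - X_2\beta_2$ denote the residuals, then the Hessian and the information matrix read

$$H(\beta_1,\beta_2,\sigma^2;Y) = \begin{bmatrix} \frac{\partial^2 L}{\partial\beta_1\partial\beta_1'} & \frac{\partial^2 L}{\partial\beta_1\partial\beta_2'} & \frac{\partial^2 L}{\partial\beta_1\partial\sigma^2} \\ \frac{\partial^2 L}{\partial\beta_2\partial\beta_1'} & \frac{\partial^2 L}{\partial\beta_2\partial\beta_2'} & \frac{\partial^2 L}{\partial\beta_2\partial\sigma^2} \\ \frac{\partial^2 L}{\partial\sigma^2\partial\beta_1'} & \frac{\partial^2 L}{\partial\sigma^2\partial\beta_2'} & \frac{\partial^2 L}{\partial\sigma^2\partial\sigma^2} \end{bmatrix} = \begin{bmatrix} -\frac{1}{\sigma^2}X_1'X_1 & -\frac{1}{\sigma^2}X_1'X_2 & -\frac{1}{\sigma^4}X_1'U \\ -\frac{1}{\sigma^2}X_2'X_1 & -\frac{1}{\sigma^2}X_2'X_2 & -\frac{1}{\sigma^4}X_2'U \\ -\frac{1}{\sigma^4}U'X_1 & -\frac{1}{\sigma^4}U'X_2 & \frac{n}{2\sigma^4} - \frac{1}{\sigma^6}U'U \end{bmatrix}$$

and

$$I(\beta_1,\beta_2,\sigma^2;Y) = E\{-H(\beta_1,\beta_2,\sigma^2;Y)\} = \begin{bmatrix} \frac{1}{\sigma^2}X_1'X_1 & \frac{1}{\sigma^2}X_1'X_2 & \frac{1}{\sigma^4}X_1'E\{U\} \\ \frac{1}{\sigma^2}X_2'X_1 & \frac{1}{\sigma^2}X_2'X_2 & \frac{1}{\sigma^4}X_2'E\{U\} \\ \frac{1}{\sigma^4}E\{U\}'X_1 & \frac{1}{\sigma^4}E\{U\}'X_2 & -\frac{n}{2\sigma^4} + \frac{1}{\sigma^6}E\{U'U\} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sigma^2}X_1'X_1 & \frac{1}{\sigma^2}X_1'X_2 & 0 \\ \frac{1}{\sigma^2}X_2'X_1 & \frac{1}{\sigma^2}X_2'X_2 & 0 \\ 0 & 0 & \frac{n}{2\sigma^4} \end{bmatrix},$$

where $E\{U\} = 0$ and $E\{U'U\} = n\sigma^2$ by virtue of the Markov conditions. At the restricted estimates, the score with respect to $\beta_1$ and $\sigma^2$ vanishes, because $X_1'\tilde U = 0$ and $\tilde\sigma^2_{ML} = \tilde U'\tilde U/n$ for the restricted residuals $\tilde U = Y - X_1\tilde\beta_1$. From the definition of Rao's Score statistic, we therefore obtain

$$T_{RS}(Y) = S'(\tilde\beta_1, 0, \tilde\sigma^2_{ML}; Y)\, I^{-1}(\tilde\beta_1, 0, \tilde\sigma^2_{ML}; Y)\, S(\tilde\beta_1, 0, \tilde\sigma^2_{ML}; Y) = \begin{bmatrix} 0 \\ \frac{1}{\tilde\sigma^2_{ML}}X_2'\tilde U \\ 0 \end{bmatrix}' \begin{bmatrix} \frac{1}{\tilde\sigma^2_{ML}}X_1'X_1 & \frac{1}{\tilde\sigma^2_{ML}}X_1'X_2 & 0 \\ \frac{1}{\tilde\sigma^2_{ML}}X_2'X_1 & \frac{1}{\tilde\sigma^2_{ML}}X_2'X_2 & 0 \\ 0 & 0 & \frac{n}{2\tilde\sigma^4_{ML}} \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ \frac{1}{\tilde\sigma^2_{ML}}X_2'\tilde U \\ 0 \end{bmatrix} = \tilde U'X_2\left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)^{-1}X_2'\tilde U/\tilde\sigma^2_{ML},$$

as stated in the proposition. To prove (3.-3), abbreviate $N := X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2$ and rewrite $T_{RS}$ as

$$T_{RS}(Y) = \frac{1}{\tilde\sigma^2_{ML}}(Y - X_1\tilde\beta_1)'X_2 N^{-1}X_2'(Y - X_1\tilde\beta_1) = \frac{1}{\tilde\sigma^2_{ML}}(X_2'Y - X_2'X_1\tilde\beta_1)'N^{-1}(X_2'Y - X_2'X_1\tilde\beta_1),$$

and substitute $(X_1'X_1)^{-1}X_1'Y$ for $\tilde\beta_1$, that is,

$$T_{RS}(Y) = \frac{1}{\tilde\sigma^2_{ML}}\left(X_2'Y - X_2'X_1(X_1'X_1)^{-1}X_1'Y\right)'N^{-1}\left(X_2'Y - X_2'X_1(X_1'X_1)^{-1}X_1'Y\right).$$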

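Substitution of the estimator $\hat\beta_2 = N^{-1}\left(X_2'Y - X_2'X_1(X_1'X_1)^{-1}X_1'Y\right)$, with $N = X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2$, yields

$$T_{RS}(Y) = \frac{\hat\beta_2'N\hat\beta_2}{\tilde\sigma^2_{ML}}.$$

From Part 1 of the current proof, we already know that $1 + \frac{m_2}{n-m}M(Y) = \tilde\sigma^2_{ML}/\hat\sigma^2_{ML}$. Isolation of $\tilde\sigma^2_{ML}$ and substitution into $T_{RS}$ then results in

$$T_{RS}(Y) = \frac{\hat\beta_2'N\hat\beta_2}{\hat\sigma^2_{ML}\left(1 + \frac{m_2}{n-m}M(Y)\right)}.$$

Using the relationship $(n-m)\hat\sigma^2 = n\hat\sigma^2_{ML}$ between the least squares and the maximum likelihood estimator for $\sigma^2$, we obtain

$$T_{RS}(Y) = \frac{n\,\hat\beta_2'N\hat\beta_2}{(n-m)\hat\sigma^2\left(1 + \frac{m_2}{n-m}M(Y)\right)} = \frac{n\,\frac{m_2}{n-m}\,\hat\beta_2'N\hat\beta_2/(m_2\hat\sigma^2)}{1 + \frac{m_2}{n-m}M(Y)} = \frac{n\,\frac{m_2}{n-m}M(Y)}{1 + \frac{m_2}{n-m}M(Y)}.$$

As with the relationship between the statistics $T_{LR}$ and $M$, established in Part 1 of this proof, we see that the statistic $T_{RS}$ is a strictly monotonically increasing function of $M$. Therefore, the UMPI test may be based upon $T_{RS}$ instead of $M$ if the critical value is transformed according to this relationship. In this sense, we say that Rao's Score test is equivalent to the UMPI test. If the true value of $\sigma^2$ is known a priori, then Proposition 3.5 states that the statistics $T_{RS}$ and $M$, hence their distributions, and therefore the corresponding critical regions are identical.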
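The two relationships derived in this proof can be checked numerically. The following is a minimal sketch in Python, assuming the simplest possible setting that is not taken from the text: $X_1$ is a column of ones, $X_2$ is a single regressor $x$ (so $m = 2$, $m_2 = 1$), and the function name `lr_rs_check` is purely illustrative. In this setting the matrix $X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2$ reduces to the centered sum of squares of $x$.

```python
import math

def lr_rs_check(x, y):
    """Compute M, T_LR and T_RS for H0: slope = 0 in y_i = b1 + b2*x_i + u_i.

    Illustrative special case: X1 = column of ones, X2 = x, so m = 2, m2 = 1.
    """
    n = len(y)
    m, m2 = 2, 1
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Schur complement X2'X2 - X2'X1 (X1'X1)^-1 X1'X2 for this special case
    sxx = sum((xi - xbar) ** 2 for xi in x)
    # X2' times the restricted residuals (y_i - ybar)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b2 = sxy / sxx                      # unrestricted estimate of the tested parameter
    b1 = ybar - b2 * xbar
    rss_restricted = sum((yi - ybar) ** 2 for yi in y)
    rss_unrestricted = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    s2_hat = rss_unrestricted / (n - m)         # unbiased least squares estimator
    s2_ml_restricted = rss_restricted / n       # ML estimator under H0
    s2_ml_unrestricted = rss_unrestricted / n   # ML estimator under H1
    return {
        "M": b2 * sxx * b2 / (m2 * s2_hat),
        "T_LR": n * math.log(s2_ml_restricted / s2_ml_unrestricted),
        "T_RS": sxy * sxy / (sxx * s2_ml_restricted),
        "ratio": s2_ml_restricted / s2_ml_unrestricted,
    }

if __name__ == "__main__":
    res = lr_rs_check([0, 1, 2, 3, 4, 5], [0.1, 0.9, 2.2, 2.8, 4.1, 5.2])
    c = 1 / (6 - 2) * res["M"]
    print(res["ratio"], 1 + c)                 # both sides of the variance-ratio identity
    print(res["T_LR"], 6 * math.log(1 + c))    # T_LR as a function of M
    print(res["T_RS"], 6 * c / (1 + c))        # T_RS as a function of M
```

Each printed pair should agree up to rounding, which mirrors the statement that both statistics are strictly increasing functions of $M$.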

3.3 Application 1: Testing for outliers

Consider the Gauss-Markov model

$$Y = X\beta + Z\nabla + U, \qquad \Sigma = \Sigma\{U\} = \sigma^2 I,$$

where $X\beta$ represents the deterministic trend model underlying the observations $Y$, and where $Z\nabla$ denotes an additional mean shift model. Both design matrices $X \in \mathbb{R}^{n\times m_1}$ and $Z \in \mathbb{R}^{n\times m_2}$ are assumed to be known and of full rank. The mean shift model takes its simplest form if $Z$ is a vector with zeros in the components $1,\ldots,i-1,i+1,\ldots,n$ and a one in the $i$-th component, that is,

$$Z = [\,0 \;\cdots\; 0 \;\; 1 \;\; 0 \;\cdots\; 0\,]'.$$

Then, the mean shift parameter $\nabla$ may be viewed as a single additive outlier or gross error affecting observation $Y_i$. If a model for multiple outliers is desired, then $Z$ is simply expanded by additional columns. A test for a single or multiple outliers may then be based on the hypotheses

$$H_0: \nabla = 0 \quad\text{versus}\quad H_1: \nabla \neq 0.$$

Clearly, if $H_0$ is true, then no outliers are present in the data, and if $H_1$ is true, then at least one outlier is present. If the errors $U$ follow a normal distribution with expectation $E\{U\} = 0$, then we may rewrite the Gauss-Markov model as

$$Y \sim N(X\beta + Z\nabla, \sigma^2 I).$$

This observation model, together with the above hypotheses, is seen to constitute a natural problem of testing the significance of additional parameters, depending on whether the variance factor $\sigma^2$ is known or unknown a priori. We will investigate both cases separately in the subsequent Sections 3.3.1 and 3.3.2.

Example 3.1: Linear regression with a group of adjacent outliers. The following dataset (Fig. 3.1) has been analyzed by Rousseeuw and Leroy (2003, p. 6) in the context of outlier testing and robust parameter estimation. As the observations between 1964 and 1969 are seen to clearly mismatch the rest of the data, which may be approximated reasonably well by a straight line, we could take this fact into consideration by introducing additional mean shift parameters $\nabla_1,\ldots,\nabla_6$.

[Figure 3.1: Linear regression model with six additional adjacent mean shift parameters. Axes: Year versus Number of Calls.]

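The functional model may be written in the form of the mean shift model above. With $x_i$ denoting the regressor value (year) of observation $i$, the 24 observation equations read

$$Y_i = \beta_1 + \beta_2 x_i + U_i \quad (i = 1,\ldots,14 \text{ and } i = 21,\ldots,24), \qquad Y_{14+j} = \beta_1 + \beta_2 x_{14+j} + \nabla_j + U_{14+j} \quad (j = 1,\ldots,6),$$

so that the six adjacent observations $Y_{15},\ldots,Y_{20}$ each receive their own mean shift parameter.

3.3.1 Baarda's test

If the true value of the variance factor is known to take the a priori value $\sigma^2 = \sigma_0^2$, then Proposition 3.3 guarantees that there exists a UMPI test for the outlier test problem $Y \sim N(X\beta + Z\nabla, \sigma_0^2 I)$ with the hypotheses $H_0: \nabla = 0$ versus $H_1: \nabla \neq 0$. According to (3.-58), the UMPI test is given by

$$\phi(y) = \begin{cases} 1, & \text{if } M(y) > k^{\chi^2(m_2)}_{1-\alpha}, \\ 0, & \text{if } M(y) < k^{\chi^2(m_2)}_{1-\alpha}, \end{cases}$$

where $m_2$ denotes the number of modeled outliers, with statistic

$$M(Y) = \hat\nabla'\left(Z'Z - Z'X(X'X)^{-1}X'Z\right)\hat\nabla/\sigma_0^2$$

following from (3.-69), and least squares estimator

$$\hat\nabla = \left(Z'Z - Z'X(X'X)^{-1}X'Z\right)^{-1}\left(Z'Y - Z'X(X'X)^{-1}X'Y\right)$$

for the outliers. This test is called Baarda's test (Baarda, 1967, 1968). We may rewrite the estimator in the more common form

$$\hat\nabla = \left(Z'(I - X(X'X)^{-1}X')Z\right)^{-1}Z'(I - X(X'X)^{-1}X')Y = \left(Z'Q\{\tilde U\}Z\right)^{-1}Z'Q\{\tilde U\}Y,$$

where $Q\{\tilde U\}$ evidently denotes the cofactor matrix of the residuals $\tilde U = Y - X\tilde\beta$ in the outlier-free Gauss-Markov model

$$Y = X\beta + U, \qquad \Sigma = \Sigma\{U\} = \sigma_0^2 I.$$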

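As $Q\{\tilde U\}$ is also the projector onto the orthogonal complement of the column space of $X$, the identity $Q\{\tilde U\}Y = \tilde U$ is easily verified, and the estimator becomes

$$\hat\nabla = \left(Z'Q\{\tilde U\}Z\right)^{-1}Z'\tilde U.$$

Let us now look at the case where $Z$ contains only a single column, that is, where the observation model comprises a single outlier parameter $\nabla$. Then, observing that $Q\{\tilde U\}_{ii} = Z'Q\{\tilde U\}Z$ and $\tilde U_i = Z'\tilde U$, the estimator simplifies to

$$\hat\nabla = Q\{\tilde U\}_{ii}^{-1}\,\tilde U_i = \frac{\tilde U_i}{r_i},$$

where $r_i$, as the value of the $i$-th main diagonal element of $Q\{\tilde U\}$, denotes the partial redundancy of $Y_i$. Using the definition of $Q\{\tilde U\}$ and of the partial redundancy, as well as the scalar outlier estimator, the statistic $M(Y)$ takes the considerably simpler form

$$M(Y) = \hat\nabla\, Z'Q\{\tilde U\}Z\,\hat\nabla/\sigma_0^2 = \hat\nabla\, Q\{\tilde U\}_{ii}\,\hat\nabla/\sigma_0^2 = \frac{r_i\hat\nabla^2}{\sigma_0^2}.$$

Notice that this test statistic requires that the additional parameter $\nabla$ is estimated. However, this is not necessary as we may substitute $\tilde U_i/r_i$ for $\hat\nabla$, which yields the alternative test statistic

$$M(Y) = \frac{\tilde U_i^2}{r_i\sigma_0^2}.$$

It is instructive to see that this expression is nothing else than Rao's Score statistic given in Part 2 of the equivalence Proposition 3.5. To see this, rewrite (3.-4) first in terms of the matrix $X$ and the vector $Z$ as

$$T_{RS}(Y) = \tilde U'Z\left(Z'Z - Z'X(X'X)^{-1}X'Z\right)^{-1}Z'\tilde U/\sigma_0^2,$$

then substitute $Q\{\tilde U\}$, which yields

$$T_{RS}(Y) = \tilde U'Z\left(Z'Q\{\tilde U\}Z\right)^{-1}Z'\tilde U/\sigma_0^2 = \tilde U_i\,Q\{\tilde U\}_{ii}^{-1}\,\tilde U_i/\sigma_0^2 = \frac{\tilde U_i^2}{r_i\sigma_0^2}.$$

Therefore, both statistics $M(Y)$ and $T_{RS}(Y)$ are identical. However, Rao's Score statistic $T_{RS}(Y)$, being naturally based on the residuals of the outlier-free Gauss-Markov model, is more convenient to compute and simpler to implement than the invariance-reduced statistic $M(Y)$ based on the estimator of the outlier. We may directly apply the UMPI test in terms of Rao's Score statistic to the current outlier test problem and write

$$\phi_{Baarda}(y) = \begin{cases} 1, & \text{if } \dfrac{\tilde u_i^2}{r_i\sigma_0^2} > k^{\chi^2(1)}_{1-\alpha}, \\ 0, & \text{if } \dfrac{\tilde u_i^2}{r_i\sigma_0^2} < k^{\chi^2(1)}_{1-\alpha}, \end{cases}$$

which is the test proposed by Baarda (1967, p. 3). An alternative expression of Baarda's test for a single outlier may be written as

$$\phi_{Baarda}(y) = \begin{cases} 1, & \text{if } \dfrac{|\tilde u_i|}{\sqrt{r_i}\,\sigma_0} > k^{N(0,1)}_{1-\alpha/2}, \\ 0, & \text{if } \dfrac{|\tilde u_i|}{\sqrt{r_i}\,\sigma_0} < k^{N(0,1)}_{1-\alpha/2}. \end{cases}$$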
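The single-outlier form of Baarda's test needs only the residuals and partial redundancies of the outlier-free model. The sketch below illustrates this for a straight-line trend, which is an assumption made here for brevity; the function name `baarda_single_outlier`, the synthetic data, and the constant `CHI2_1_95` (the tabulated 0.95 quantile of the chi-square distribution with one degree of freedom, about 3.841) are illustrative and not taken from the text. For a straight line, the partial redundancy is $r_i = 1 - 1/n - (x_i - \bar x)^2/\sum_k (x_k - \bar x)^2$.

```python
CHI2_1_95 = 3.841  # tabulated 0.95 quantile of chi-square(1), rounded

def baarda_single_outlier(x, y, sigma0):
    """Return (residual, partial redundancy, test statistic) per observation.

    Sketch for the trend model y_i = b1 + b2*x_i + u_i with known sigma0.
    The statistic is u_i^2 / (r_i * sigma0^2), computed from the
    outlier-free adjustment only.
    """
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1 = ybar - b2 * xbar
    out = []
    for xi, yi in zip(x, y):
        u = yi - b1 - b2 * xi                        # residual of outlier-free model
        r = 1.0 - (1.0 / n + (xi - xbar) ** 2 / sxx)  # diagonal element of Q{U~}
        out.append((u, r, u * u / (r * sigma0 ** 2)))
    return out

if __name__ == "__main__":
    x = list(range(10))
    y = [2.0 + 0.5 * xi for xi in x]
    y[4] += 5.0                                      # synthetic gross error
    for i, (u, r, t) in enumerate(baarda_single_outlier(x, y, 1.0)):
        flag = "reject H0" if t > CHI2_1_95 else ""
        print(i, round(u / r, 3), round(t, 3), flag)  # u/r estimates the mean shift
```

In this synthetic example the ratio `u / r` at the contaminated observation recovers the injected shift, in line with the estimator $\hat\nabla = \tilde U_i/r_i$.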

Example 3.2: Testing for multiple outliers. The Gravity Dataset given in Appendix 6, which has been kindly communicated by Dr. Diethard Ruess, consists of $n = 91$ gravity differences between the old and the new Austrian gravity network (Österreichisches Schweregrundnetz, ÖSGN). To approximate this data, we use the polynomial

$$y_i = \beta_1 + \beta_2\varphi_i + \beta_3\lambda_i \qquad (i = 1,\ldots,91)$$

as functional model, where $\varphi_i$ denote the latitudes and $\lambda_i$ the longitudes in decimal degrees. The data is assumed to be uncorrelated and of constant standard deviation $\sigma_0 = .8$ mgal. Schuh (2006b) suggested that the observations $y_3, y_6, y_{10}, y_{42}, y_{45}, y_{78}, y_{87}, y_{89}$ are outliers. To test the data against this hypothesis, we add eight additional shift parameters $\nabla_1,\ldots,\nabla_8$ to the functional model. The observation equations then read

$$y_i = \beta_1 + \beta_2\varphi_i + \beta_3\lambda_i + u_i \qquad \text{for all } i \notin \{3, 6, 10, 42, 45, 78, 87, 89\},$$

and, for the eight suspected observations,

$$y_3 = \beta_1 + \beta_2\varphi_3 + \beta_3\lambda_3 + \nabla_1 + u_3,$$
$$y_6 = \beta_1 + \beta_2\varphi_6 + \beta_3\lambda_6 + \nabla_2 + u_6,$$
$$y_{10} = \beta_1 + \beta_2\varphi_{10} + \beta_3\lambda_{10} + \nabla_3 + u_{10},$$
$$y_{42} = \beta_1 + \beta_2\varphi_{42} + \beta_3\lambda_{42} + \nabla_4 + u_{42},$$
$$y_{45} = \beta_1 + \beta_2\varphi_{45} + \beta_3\lambda_{45} + \nabla_5 + u_{45},$$
$$y_{78} = \beta_1 + \beta_2\varphi_{78} + \beta_3\lambda_{78} + \nabla_6 + u_{78},$$
$$y_{87} = \beta_1 + \beta_2\varphi_{87} + \beta_3\lambda_{87} + \nabla_7 + u_{87},$$
$$y_{89} = \beta_1 + \beta_2\varphi_{89} + \beta_3\lambda_{89} + \nabla_8 + u_{89}.$$

Hence, the observation model is of the mean shift form, where each row of $X$ contains the entries $1, \varphi_i, \lambda_i$ and each column of $Z$ contains a single one in the row of the corresponding suspected observation. Estimation of the mean shift parameters yields

$$\hat\nabla = [.34, .33, .68, .944, ., .73, .756, .899]',$$

and the test statistic takes the value $M(y) = 54.46$, which is larger than $k^{\chi^2(8)}_{0.95} = 15.51$. Thus, Baarda's test rejects $H_0$, i.e. the vector of outlier parameters is significant. To carry out this test, we could also compute Rao's Score statistic via the estimated residuals $\tilde u = y - X\tilde\beta$ based on the restricted parameter estimates $\tilde\beta = (X'X)^{-1}X'y = [3.883, .86, .88]'$. This gives the identical result $T_{RS} = 54.46$.

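3.3.2 Pope's test

If, in contrast to the situation in Section 3.3.1, the variance factor is unknown a priori, then we must apply the corresponding part of Proposition 3.4, which states that there exists a UMPI test for the outlier test problem $Y \sim N(X\beta + Z\nabla, \sigma^2 I)$ with the hypotheses $H_0: \nabla = 0$ versus $H_1: \nabla \neq 0$. According to (3.-59), the UMPI test is given by

$$\phi(y) = \begin{cases} 1, & \text{if } M(y) > k^{F(m_2,\,n-m)}_{1-\alpha}, \\ 0, & \text{if } M(y) < k^{F(m_2,\,n-m)}_{1-\alpha}, \end{cases}$$

with statistic

$$M(Y) = \hat\nabla'\left(Z'Z - Z'X(X'X)^{-1}X'Z\right)\hat\nabla/(m_2\hat\sigma^2).$$

The least squares estimator for the outliers, rearranged as for Baarda's test by using the residuals $\tilde U = Y - X\tilde\beta$ of the non-extended model,

$$\hat\nabla = \left(Z'Q\{\tilde U\}Z\right)^{-1}Z'Q\{\tilde U\}Y,$$

follows from (3.-3), and the least squares estimator $\hat\sigma^2 = \hat U'\hat U/(n-m)$ for $\sigma^2$ from (3.-4), where the residuals of the extended model are given by $\hat U = (I - X(X'X)^{-1}X')(Y - Z\hat\nabla)$. If only a single outlier is modeled, in other words, if $Z$ has only $m_2 = 1$ column, then the estimator becomes

$$\hat\nabla = \frac{\tilde U_i}{r_i},$$

which is identical to the estimator for Baarda's test. The test statistic then simplifies to

$$M(Y) = \frac{r_i\hat\nabla^2}{\hat\sigma^2}.$$

The only difference between this statistic and its counterpart in Baarda's test is that the latter is based on the a priori variance factor $\sigma_0^2$ and the former on the estimator of the variance factor in the extended model. As the non-extended model is easier to adjust, it is more convenient to apply Rao's Score statistic, whose general form (3.-3) is simplified for the current test problem as follows:

$$T_{RS}(Y) = \tilde U'Z\left(Z'Z - Z'X(X'X)^{-1}X'Z\right)^{-1}Z'\tilde U/\tilde\sigma^2_{ML} = \frac{\tilde U_i^2}{r_i\tilde\sigma^2_{ML}}.$$

Because $T_{RS}$ is a strictly increasing function of $M$, the UMPI test in terms of Rao's Score statistic uses the correspondingly transformed critical value. With $k^F := k^{F(1,\,n-m)}_{1-\alpha}$ it takes the expression

$$\phi_{Pope}(y) = \begin{cases} 1, & \text{if } \dfrac{\tilde u_i^2}{r_i\tilde\sigma^2_{ML}} > \dfrac{n\,k^F}{n-m+k^F}, \\ 0, & \text{if } \dfrac{\tilde u_i^2}{r_i\tilde\sigma^2_{ML}} < \dfrac{n\,k^F}{n-m+k^F}, \end{cases}$$

which corresponds to the test proposed in Pope (1976, p. 7). Pope's test is sometimes written in a square-root version as

$$\phi_{Pope}(y) = \begin{cases} 1, & \text{if } \dfrac{|\tilde u_i|}{\sqrt{r_i}\,\tilde\sigma} > k^{\tau(1,\,n-m)}_{1-\alpha}, \\ 0, & \text{if } \dfrac{|\tilde u_i|}{\sqrt{r_i}\,\tilde\sigma} < k^{\tau(1,\,n-m)}_{1-\alpha}, \end{cases}$$

where $\tilde\sigma$ denotes the estimated standard deviation in the non-extended model and $k^{\tau(1,\,n-m)}_{1-\alpha}$ the critical value of the Tau distribution. Koch (1999, p. 33) shows that the Tau distribution is a function of the F distribution, and that both forms of Pope's test are identical.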

3.4 Application 2: Testing for extensions of the functional model

Suppose that we want to approximate given observables $Y$ by a Gauss-Markov model

$$Y_i = \sum_{j=0}^{p} B_j(x_i)\,a_j + U_i, \qquad \Sigma\{U\} = \sigma^2 I,$$

where $B_j(x_i)$ denote base functions evaluated at known, not necessarily equidistant, nodes or locations $x_i$ $(i = 1,\ldots,n)$ and $\beta = [a_0,\ldots,a_p]'$ the unknown parameters. Frequently used base functions $B_j$ $(j = 0,\ldots,p)$ are, for instance, polynomials, trigonometric functions, or spherical harmonics. Let us further assume that the errors $U_i$ $(i = 1,\ldots,n)$ are uncorrelated, homoscedastic, and normally distributed variables. If they were correlated and/or heteroscedastic with weight matrix $P$, then we would preprocess the observation equations by decorrelation and/or homogenization as described earlier.

In a practical situation with insufficient knowledge about the physical or geometrical relationship between the data and the nodes, it might not be clear how high the degree $p$ of, for example, a polynomial expansion should be. Let us say we believe that the degree of the expansion should be specified by $p_1$, but we would like to check whether the base function $B_{p_2}$ with $p_2 = p_1 + 1$ should be added to the model. One approach would be to estimate the parameters of the model up to degree $p_2$ and to perform a significance test of the parameter $a_{p_2}$. If $a_{p_2}$ turns out to be insignificant, then we carry out a new adjustment of the model with degree $p_1$. If we define $\beta_1 := [a_0,\ldots,a_{p_1}]'$ and $\beta_2 := a_{p_2}$, and if we let $X_1$ and $X_2$ contain the values of the base functions evaluated at the locations, then we may rewrite the model as

$$y = X_1\beta_1 + X_2\beta_2 + u, \qquad \Sigma\{U\} = \sigma^2 I,$$

and the significance test of $a_{p_2}$ is about the hypotheses $H_0: \beta_2 = 0$ versus $H_1: \beta_2 \neq 0$. As this problem is one of testing the significance of $m_2 = 1$ additional parameter $\beta_2$ with either known or unknown variance factor, the UMPI test is based on either the statistic

$$M(Y) = \hat\beta_2'\left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)\hat\beta_2/\sigma_0^2$$

or

$$M(Y) = \hat\beta_2'\left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)\hat\beta_2/(m_2\hat\sigma^2)$$

by virtue of Propositions 3.4/3.5.
If however, on the grounds of prior information, we favor the model up to degree $p_1$ over the model up to degree $p_2$, then it seems more reasonable to adjust the smaller model with $p_1$ first, and to estimate the additional parameters only after they have been verified to be significant. To implement such a significance test, which is a reversed version of the significance test based on $M(Y)$, we may use Rao's Score statistic

$$T_{RS}(Y) = \tilde U'X_2\left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)^{-1}X_2'\tilde U/\sigma_0^2$$

if the variance factor is known, or

$$T_{RS}(Y) = \tilde U'X_2\left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)^{-1}X_2'\tilde U/\tilde\sigma^2_{ML}$$

if the variance factor must be estimated, as defined in (3.-4). These statistics are based on the estimated residuals of the Gauss-Markov model with the restriction $\beta_2 = 0$, which is equivalent to the ordinary Gauss-Markov model

$$y = X_1\beta_1 + u, \qquad \Sigma\{U\} = \sigma^2 I.$$

Rao's Score statistics for testing the significance of additional base functions are merely a generalization of Baarda's and Pope's statistics for testing the significance of outliers. This fact has already been pointed out by Jaeger et al. (2005) in the context of testing the significance of additional transformation parameters.

Example 3.3: Testing for extension of a two-dimensional polynomial. We will demonstrate the two different approaches to significance testing of additional model parameters by further analyzing the Gravity Dataset from Example 3.2. For simplicity, the outlying observations $y_3, y_6, y_{10}, y_{42}, y_{45}, y_{78}, y_{87}, y_{89}$ will not be used, i.e. the corresponding rows are eliminated from $y$ and $X$. Now, we could consider

$$Y_i = a_0 + \varphi_i a_1 + \lambda_i a_2 \qquad (i = 1,\ldots,n)$$

as the model we favor under $H_0$, and the extension to degree 2,

$$Y_i = a_0 + \varphi_i a_1 + \lambda_i a_2 + \varphi_i^2 a_3 + \varphi_i\lambda_i a_4 + \lambda_i^2 a_5 \qquad (i = 1,\ldots,n),$$

as an alternative model specification. This model is simply a two-dimensional polynomial version of the base function model of this section. We see that the null model is obtained from the extended model if the additional parameters $a_3, a_4, a_5$ are restricted to zero. To test whether the null model is misspecified, we will rewrite the extended model in the partitioned form $y = X_1\beta_1 + X_2\beta_2 + u$ with $\sigma_0 = .8$ known a priori. In the present example, we define $\beta_1 := [a_0, a_1, a_2]'$ and $\beta_2 := [a_3, a_4, a_5]'$, and use the hypotheses $H_0: \beta_2 = 0$ versus $H_1: \beta_2 \neq 0$. The design matrix $X_1$ is obtained from $X$ in Example 3.2 by deleting rows as described above, and each row of $X_2$ contains the entries $\varphi_i^2, \varphi_i\lambda_i, \lambda_i^2$.

Notice the close similarity of this testing problem with the problem of testing for outliers in Example 3.2. The only difference is that the additional parameters $\beta_2$ appear in every single observation equation while each mean shift parameter $\nabla_i$ affects only a single observation. Here we could analogously estimate $\beta_2$ as in (3.-7) and then compute the value of the test statistic. This gives

$$\hat\beta_2 = \left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)^{-1}X_2'\left(I - X_1(X_1'X_1)^{-1}X_1'\right)y = [.87, .7, .44]'$$

and

$$M(y) = \hat\beta_2'\left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)\hat\beta_2/\sigma_0^2 = 7.74,$$

which is smaller than the critical value $k^{\chi^2(3)}_{0.95} = 7.81$. This test again yields the same result if only the parameters $\beta_1$ of the null model are estimated and if the corresponding residuals $\tilde u = y - X_1\tilde\beta_1$ are then used to determine the value of Rao's Score statistic. This would result in

$$\tilde\beta_1 = (X_1'X_1)^{-1}X_1'y = [4.459, ., .4]'$$

and

$$T_{RS}(y) = \tilde u'X_2\left(X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right)^{-1}X_2'\tilde u/\sigma_0^2 = 7.74.$$
The values in $\tilde\beta_1$ differ slightly from those in $\tilde\beta$ from Example 3.2 as a consequence of deleting observations. In summary, we could not reject the null model, i.e. the joint set of additional parameters could not be proven to be significant.

3.5 Application 3: Testing for point displacements

In Meissl (1982, Sect. 5.4) the following test problem is discussed. A leveling network has been measured twice (see Fig. 3.2), and the question is whether three of the points (7, 8, and 9), which are located on a dam, changed in height between both measurement campaigns. Point 1 has a fixed height, and the heights of the points 2, 3, 4, 5, and 6 are assumed to be unknown, but constant over time.

[Figure 3.2: A leveling network observed in a first campaign (left) and a second campaign (right) later in time.]

The general structure of the observation model is specified as

$$y = X\beta + u, \qquad \Sigma\{U\} = \sigma^2 I,$$

where the functional model comprises $n = 34$ leveling observations $y$ (i.e. observed height differences) with unknown accuracy $\sigma$ (see Appendix 6 for the numerical values). To allow for height displacements, the parameter vector

$$\beta = [H_2, H_3, H_4, H_5, H_6, H_7, H_8, H_9, H_7^{(2)}, H_8^{(2)}, H_9^{(2)}]'$$

contains the set of dam heights $H_7, H_8, H_9$ regarding the first campaign and the set $H_7^{(2)}, H_8^{(2)}, H_9^{(2)}$ for the possibly different points modeled with respect to the second campaign. Consider, for example, one height difference observed between point 8 and a neighboring point in the first campaign and the height difference between the same two points observed in the second campaign. The two observation equations have the same form, except that the first contains the height $H_8$ and the second the height $H_8^{(2)}$. Accordingly, the two corresponding rows of the design matrix differ only in the position of the coefficient belonging to the dam point: it stands in the column of $H_8$ for the first campaign and in the column of $H_8^{(2)}$ for the second campaign.

The mathematical formulation of the above question, whether the points 7, 8, and 9 shifted significantly, is given by the null hypothesis $H_7 = H_7^{(2)}$, $H_8 = H_8^{(2)}$, $H_9 = H_9^{(2)}$ versus the alternative hypothesis that at least one of these equalities does not hold, or in matrix notation by

$$H_0: H\beta = 0 \quad\text{versus}\quad H_1: H\beta \neq 0 \qquad\text{with}\quad H = [\,0_{3\times 5} \;\; -I_3 \;\; I_3\,].$$

For this testing problem, Meissl (1982) gives the statistic

$$M(Y) = (H\hat\beta)'\left(H(X'X)^{-1}H'\right)^{-1}H\hat\beta/(3\hat\sigma^2),$$

which is distributed as $F(3, n-m)$, because the model has three restrictions. As an additional result however, we may conclude from Proposition 3.43 that this statistic leads to an optimal invariant test, because the testing problem is of the general form with $P = I$ and $w = 0$. Consequently, using the test statistic (3.-6) with $m_2 = 3$ and the least squares estimates

$$\hat\beta = (X'X)^{-1}X'y, \qquad \hat u = y - X\hat\beta, \qquad \hat\sigma^2 = \hat u'\hat u/(n-m)$$

leads to the UMPI test. With the given numerical values in Appendix 6, the test statistic takes the value $M(y) = 53.6$, which exceeds for instance the critical value $k^{F(3,23)}_{0.95} = 3.03$. Therefore, we conclude that the data shows significant evidence for a shift in height.

We will now demonstrate how this testing problem is reparameterized into a problem of testing the significance of additional parameters. Notice first that the functional model may be partitioned into

$$y = X_1\beta_1 + X_2\beta_2 + u$$

with

$$\beta_1 = [H_2, H_3, H_4, H_5, H_6, H_7, H_8, H_9]', \qquad \beta_2 = [H_7^{(2)}, H_8^{(2)}, H_9^{(2)}]',$$

and with each row of $X$ split accordingly into a part belonging to $X_1$ (eight columns) and a part belonging to $X_2$ (three columns). Now, if we use

$$\Delta H_7 := H_7^{(2)} - H_7, \qquad \Delta H_8 := H_8^{(2)} - H_8, \qquad \Delta H_9 := H_9^{(2)} - H_9,$$

i.e. the transformed quantities $H\beta$, as parameters instead of $H_7^{(2)}$, $H_8^{(2)}$, and $H_9^{(2)}$, then the new parameter vector reads

$$\beta^{(r)} = [\beta_1^{(r)\prime}, \beta_2^{(r)\prime}]' = [H_2, H_3, H_4, H_5, H_6, H_7, H_8, H_9, \Delta H_7, \Delta H_8, \Delta H_9]'$$

with components

$$\beta_1^{(r)} = [H_2, H_3, H_4, H_5, H_6, H_7, H_8, H_9]', \qquad \beta_2^{(r)} = [\Delta H_7, \Delta H_8, \Delta H_9]'.$$

Clearly, the hypotheses $H_0: \beta_2^{(r)} = 0$ versus $H_1: \beta_2^{(r)} \neq 0$ in terms of the new parameters are identical to the original hypotheses. To see how the design matrix changes, we rewrite the observation equations. An observation of the first campaign involving point 8 remains unchanged, whereas in an observation of the second campaign the height $H_8^{(2)}$ is replaced by

$$H_8^{(2)} = (H_8^{(2)} - H_8) + H_8 = \Delta H_8 + H_8.$$

In the corresponding row of the design matrix with respect to the new parameters, the part belonging to $X_2$ is therefore unchanged, while the part belonging to $X_1$ receives an additional coefficient in the column of $H_8$. We see that $\beta_1^{(r)} = \beta_1$, that the rows of the first campaign are unchanged in both parts, and that the rows of the second campaign are unchanged in the part belonging to $X_2$.
For the present example these equations in fact hold for all rows of the design matrix, so that the reparameterized observation model is given by

$$y = X_1^{(r)}\beta_1 + X_2\beta_2^{(r)} + u, \qquad \Sigma = \sigma^2 I.$$

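According to Proposition 3.4-, the UMPI test for this reparameterized testing problem is based on the value of the statistic

$$M(y) = \hat\beta_2^{(r)\prime}\left(X_2'X_2 - X_2'X_1^{(r)}(X_1^{(r)\prime}X_1^{(r)})^{-1}X_1^{(r)\prime}X_2\right)\hat\beta_2^{(r)}/(3\hat\sigma^2)$$

with estimates

$$\hat\beta_2^{(r)} = \left(X_2'X_2 - X_2'X_1^{(r)}(X_1^{(r)\prime}X_1^{(r)})^{-1}X_1^{(r)\prime}X_2\right)^{-1}X_2'\left(I - X_1^{(r)}(X_1^{(r)\prime}X_1^{(r)})^{-1}X_1^{(r)\prime}\right)y,$$
$$\hat u = \left(I - X_1^{(r)}(X_1^{(r)\prime}X_1^{(r)})^{-1}X_1^{(r)\prime}\right)(y - X_2\hat\beta_2^{(r)}), \qquad \hat\sigma^2 = \hat u'\hat u/(n-m).$$

With the given data, the displacement parameters take the values $\hat\beta_2^{(r)} = [.44, .47, .56]'$, and the test statistic becomes $M(y) = 53.6$, which is of course the same value as determined above for Meissl's statistic. Notice that, as demonstrated in the proof of Proposition 3.4, the quantities $M(y)$, $\hat u$, and $\hat\sigma^2$ remain unchanged by the reparameterization of the observation equations. This transformation allows us to apply Proposition 3.6 and compute the value of Rao's Score statistic by

$$T_{RS}(y) = \tilde u'X_2\left(X_2'X_2 - X_2'X_1^{(r)}(X_1^{(r)\prime}X_1^{(r)})^{-1}X_1^{(r)\prime}X_2\right)^{-1}X_2'\tilde u/\tilde\sigma^2_{ML},$$

which requires the estimates

$$\tilde\beta_1 = (X_1^{(r)\prime}X_1^{(r)})^{-1}X_1^{(r)\prime}y, \qquad \tilde u = y - X_1^{(r)}\tilde\beta_1, \qquad \tilde\sigma^2_{ML} = \tilde u'\tilde u/n.$$

With the given data, we obtain $T_{RS}(y) = 29.74$. As the relationship between the statistics $M$ and $T_{RS}$ has been shown to be (3.-3), we may apply this formula to check the validity of the results and to compute the critical value valid for $T_{RS}$. With $n = 34$, $m_2 = 3$, and $n - m = 23$ we find that

$$T_{RS}(y) = \frac{34\cdot\frac{3}{23}M(y)}{1 + \frac{3}{23}M(y)} \qquad\text{and}\qquad k^{T_{RS}(3,23)}_{0.95} = \frac{34\cdot\frac{3}{23}k^{F(3,23)}_{0.95}}{1 + \frac{3}{23}k^{F(3,23)}_{0.95}} = 9.63.$$

As Rao's Score statistic is determined under the assumption that $H_0$ is true, the null hypothesis $H_0: \beta_2^{(r)} = 0$ acts as a restriction on the Gauss-Markov model, and thus eliminates the parameters $\beta_2^{(r)}$ from the reparameterized observation equations. In other words, if $H_0$ is true, then the Gauss-Markov model with restrictions

$$y = X_1^{(r)}\beta_1 + X_2\beta_2^{(r)} + u, \qquad \beta_2^{(r)} = 0, \qquad \Sigma = \sigma^2 I$$

is equivalent to the Gauss-Markov model

$$y = X_1^{(r)}\beta_1 + u, \qquad \Sigma = \sigma^2 I,$$

for which the estimates are given above. The main advantage of Rao's Score statistic $T_{RS}$ over $M$ in the reparameterized model is that the restricted estimates $\tilde\beta_1$ are clearly less complex to compute than the unrestricted estimates $\hat\beta_2^{(r)}$. Furthermore, $T_{RS}$ has an advantage over Meissl's form of $M$, because the restricted reparameterized Gauss-Markov model has fewer unknown parameters to be estimated than the original model.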
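The conversion between $M$ and $T_{RS}$ used for this check is easy to script. The sketch below is a minimal illustration with hypothetical function names (`t_rs_from_m`, `m_from_t_rs`); it assumes the relationship $T_{RS} = n\,c/(1+c)$ with $c = \frac{m_2}{n-m}M$ and takes the tabulated 0.95 quantile of the $F(3,23)$ distribution as approximately 3.03.

```python
def t_rs_from_m(M, n, m, m2):
    """Map the statistic M (or its critical value) to the T_RS scale."""
    c = m2 / (n - m) * M
    return n * c / (1.0 + c)

def m_from_t_rs(T, n, m, m2):
    """Inverse mapping; valid for 0 <= T < n because T_RS is bounded by n."""
    c = T / (n - T)
    return c * (n - m) / m2

if __name__ == "__main__":
    n, m, m2 = 34, 11, 3       # leveling example: 34 observations, 11 parameters
    k_f = 3.03                 # assumed rounded table value for F(3, 23) at 0.95
    print(round(t_rs_from_m(k_f, n, m, m2), 2))    # critical value on the T_RS scale
    print(round(t_rs_from_m(53.6, n, m, m2), 2))   # T_RS for the reported M(y)
```

Because the mapping is strictly increasing, comparing $M$ with the $F$ quantile and comparing $T_{RS}$ with the transformed quantile always lead to the same decision.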

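3.6 Derivation of an optimal test concerning the variance factor

So far we have only discussed testing problems concerning the parameters $\beta$ of the functional model $Y = X\beta + E$. In the current section, we will derive an optimal procedure for testing hypotheses concerning the variance factor $\sigma^2$ in the stochastic model $\Sigma = \Sigma\{E\} = \sigma^2 P^{-1}$. As usual we will assume that the design matrix $X \in \mathbb{R}^{n\times m}$ is known and of full rank, and that the weight matrix $P \in \mathbb{R}^{n\times n}$ is positive definite. If we decompose the weight matrix into $P = GG'$ as in Section 3..3, then the observations and the design matrix may be transformed into $Y^h := G'Y$ and $X^h := G'X$, where $Y^h$ has the covariance matrix $\sigma^2 I$. Let us now consider the problem of testing

$$H_0: \sigma^2 = \sigma_0^2 \quad\text{versus}\quad H_1: \sigma^2 > \sigma_0^2$$

in the observation model $Y^h \sim N(X^h\beta, \sigma^2 I)$. Such a testing problem arises if we suspect that the given measurement accuracy $\sigma_0^2$ is too optimistic. Proposition 3. allows us now to reduce $Y^h$ to the set of minimally sufficient statistics $T(Y^h) := [\hat\beta', \hat\sigma^2]'$ with

$$\hat\beta = (X^{h\prime}X^h)^{-1}X^{h\prime}Y^h \qquad\text{and}\qquad (n-m)\hat\sigma^2 = (Y^h - X^h\hat\beta)'(Y^h - X^h\hat\beta).$$

The reduced observation model then reads

$$\hat\beta \sim N\left(\beta, \sigma^2(X^{h\prime}X^h)^{-1}\right), \qquad (n-m)\hat\sigma^2 \sim \sigma^2\chi^2(n-m),$$

and the hypotheses are still given by $H_0: \sigma^2 = \sigma_0^2$ versus $H_1: \sigma^2 > \sigma_0^2$. This testing problem is invariant under the group $G$ of translations

$$g\left(\begin{bmatrix}\hat\beta \\ \hat\sigma^2\end{bmatrix}\right) = \begin{bmatrix}\hat\beta + a \\ \hat\sigma^2\end{bmatrix}$$

with $a \in \mathbb{R}^m$ acting on $\hat\beta$. Each of these transformations will cause a change of distribution from $\hat\beta \sim N(\beta, \sigma^2(X^{h\prime}X^h)^{-1})$ to $\hat\beta + a \sim N(\beta + a, \sigma^2(X^{h\prime}X^h)^{-1})$, while the second central moment of $\hat\beta$ and the distribution of $\hat\sigma^2$ remain unaffected by these translations. Thus, the induced group $\bar G$ of transformations within the parameter domain is given by

$$\bar g\left(\begin{bmatrix}\beta \\ \sigma^2\end{bmatrix}\right) = \begin{bmatrix}\beta + a \\ \sigma^2\end{bmatrix}.$$

Evidently, neither the parameter space nor the hypotheses are changed under $\bar G$. Moreover, we see that $\hat\sigma^2$, or more conveniently $M(Y^h) := (n-m)\hat\sigma^2$, is a maximal invariant under $G$. From the fact that $(n-m)\hat\sigma^2/\sigma^2 \sim \chi^2(n-m) = G\left(\frac{n-m}{2}, 2\right)$, where $G$ stands here for the Gamma distribution, it follows that $(n-m)\hat\sigma^2 \sim G\left(\frac{n-m}{2}, 2\sigma^2\right)$.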
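Thus, the invariant test problem

$$M(Y) = (n-m)\hat\sigma^2 \sim G\left(\tfrac{n-m}{2}, 2\sigma^2\right), \qquad H_0: \sigma^2 = \sigma_0^2 \quad\text{versus}\quad H_1: \sigma^2 > \sigma_0^2,$$

has one unknown parameter, a one-sided alternative hypothesis, and a test distribution with a monotone density ratio by virtue of Theorem .5-. For this reduced testing problem, Theorem .4 gives the UMP test

$$\phi(y) = \begin{cases} 1, & \text{if } M(y) > k^{G((n-m)/2,\,2\sigma_0^2)}_{1-\alpha}, \\ 0, & \text{if } M(y) < k^{G((n-m)/2,\,2\sigma_0^2)}_{1-\alpha}, \end{cases} \;=\; \begin{cases} 1, & \text{if } (n-m)\hat\sigma^2/\sigma_0^2 > k^{\chi^2(n-m)}_{1-\alpha}, \\ 0, & \text{if } (n-m)\hat\sigma^2/\sigma_0^2 < k^{\chi^2(n-m)}_{1-\alpha}, \end{cases}$$

which is UMPI for the original problem of testing $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 > \sigma_0^2$ in the observation model $Y^h \sim N(X^h\beta, \sigma^2 I)$. This test is the same as given in Koch (1999, Section 4..4), but it was shown here in addition that it is optimal within the class of all tests with equal power in each direction of $\beta$.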

4 Applications of Misspecification Tests in Generalized Gauss-Markov models

4.1 Introduction

In this section, we will look at testing problems where parameters of the distribution or stochastic model are hypothesized. In each of these problems, the null hypothesis states that the errors are distributed as $E \sim N(0, \sigma^2 I)$, whereas the alternative hypothesis represents one of the following types of model errors:

1. the errors are not uncorrelated, but correlated through an autoregressive process (Section 4.2);
2. the errors are not homoscedastic, but the variance changes according to an additive variance component (Section 4.3);
3. the errors are not normally distributed (Section 4.4).

In contrast to the testing problems in Section 3, where only the parameters of the functional model were subject to hypothesis, there will be no UMPI tests under the above scenarios, as these problems cannot be reduced to a single maximal invariant statistic. However, they can be given a suitable mathematical expression such that, at least, reductions to Likelihood Ratio or Rao's Score statistics are feasible. Although these statistics will then not be optimal in a strictly mathematical sense, one may hope that they will remain reasonable tools for detecting the above model errors. In this sense, Engle (1984) used the term diagnostic in his insightful review article.

The first step towards deriving such diagnostics for each of the above cases is to extend the mathematical model by estimable parameters that allow the data to be correlated, heteroscedastic, or non-normally distributed. $H_0$ then restricts these additional parameters to zero and thereby reduces the extended model to an ordinary normal Gauss-Markov model, while under $H_1$, these parameters remain unrestricted. Then, the Likelihood Ratio test compares the value of the likelihood function evaluated at the unrestricted ML estimate with the value of the likelihood function obtained at the restricted estimate.
Therefore, if the restriction under $H_0$ reduces the likelihood significantly, the test statistic will take a large value and thus indicate that $H_0$ should be rejected. On the other hand, if the restriction is reflected by the given data, then the restricted likelihood will be close to the unrestricted likelihood, which will probably cause the statistic to take an insignificant value. Rao's Score test, on the other hand, does not require computing the unrestricted ML estimates (which may be computationally expensive if, for instance, variance components are present), because it measures the extent to which the scores, i.e. the first partial derivatives of the log-likelihood function, differ from zero if the restricted estimates are used.

Although the testing procedures based on Rao's Score statistic will be computationally feasible and statistically powerful, we will have to deal with one inconvenience: in contrast to the testing problems in Section 3, where the distribution of Rao's Score statistic was always exact as a strictly monotonically increasing function of a $\chi^2$- or $F$-distribution, there will be no exact test distributions available for the problems above. Instead we will have to confine ourselves to using approximate test distributions, that is, to critical values which are valid asymptotically. Therefore, the tests stated in the current section should be applied only when a large number of observations is given. It is beyond the scope of this thesis to explain in detail the definition of asymptotic distribution as this would require a rather lengthy discussion of various types of convergence of random variables. The interested reader is therefore referred to Godfrey (1988, p. 3-5), who gives a proof and a more technical explanation of the following proposition.

Proposition 4.1. Suppose that $Y_1,\ldots,Y_n$ are identically distributed observations with true density function in $\mathcal{F} = \{f(y;\theta) : \theta \in \Theta\}$ and consider the problem of testing $H_0: H\bar\theta = w$ versus $H_1: H\bar\theta \neq w$, where $H$ is a known $p\times u$ matrix with $p < u$ and full row rank $p$.
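Then, under $H_0$, the asymptotic distribution of the LR and the equivalent RS statistic is given by

$$-2\left(L(\tilde\theta;Y) - L(\hat\theta;Y)\right) \approx S'(\tilde\theta;Y)\,I^{-1}(\tilde\theta;Y)\,S(\tilde\theta;Y) \overset{a}{\sim} \chi^2(p),$$

where $\tilde\theta$ is the ML estimator for $\theta$ restricted by $H_0$ and $\hat\theta$ the unrestricted ML estimator.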

4.2 Application 5: Testing for autoregressive correlation

We will now discuss a linear model

$$Y_i = X_i\beta + E_i \qquad (i = 1,\ldots,n),$$

where $X_i$ represents the $i$-th row of the design matrix $X \in \mathbb{R}^{n\times m}$ with $\text{rank}\,X = m$ and where $\beta \in \mathbb{R}^m$ denotes a vector of unknown functional parameters. We will assume that each error $E_i$ follows a first-order autoregressive error model/process, or AR(1) model/process, defined by

$$E_i = \alpha E_{i-1} + U_i \qquad (i = 1,\ldots,n),$$

in which $\alpha$ is an unknown parameter. Let the stochastic model of the errors $U_i$ be given by $\Sigma\{U\} = \sigma^2 I$.

If normally distributed error variables $E_i$ are to have expectations, variances, and covariances that are independent of the index $i$ (that are, for instance, independent of the absolute time or location), then we must require the AR(1) model to be weakly stationary up to second order. This requirement imposes certain restrictions on the specification of the numerical value for $\alpha$. Let us explore the nature of these restrictions by investigating stationarity with respect to the first moment. We may rewrite the AR(1) model as

$$E_i = \alpha(\alpha E_{i-2} + U_{i-1}) + U_i = \ldots = \alpha^{i-1}U_1 + \alpha^{i-2}U_2 + \ldots + U_i.$$

Taking the expected value of both sides of this equation yields

$$E\{E_i\} = \alpha^{i-1}E\{U_1\} + \alpha^{i-2}E\{U_2\} + \ldots + E\{U_i\}.$$

Under the condition of constant mean $\mu = E\{U_1\} = E\{U_2\} = \ldots = E\{U_n\}$, we obtain

$$E\{E_i\} = \mu\left(\alpha^{i-1} + \alpha^{i-2} + \ldots + 1\right) = \begin{cases} \mu\,\dfrac{1-\alpha^i}{1-\alpha}, & \text{for } \alpha \neq 1, \\ \mu\, i, & \text{for } \alpha = 1. \end{cases}$$

We see from this result that the error variables $E_i$ have constant mean $\mu_E = E\{E_1\} = E\{E_2\} = \ldots = E\{E_n\}$ only if $\mu = 0$ holds, because only this condition eliminates the dependence of $E\{E_i\}$ on the index $i$. In other words, an AR(1) model is weakly stationary up to order one if the mean $\mu$ of the independent errors $U_i$ $(i = 1,\ldots,n)$ is zero. Similarly, we obtain for the covariance of two variables separated by distance $h$

$$E\{E_iE_{i+h}\} = E\left\{(\alpha^{i-1}U_1 + \alpha^{i-2}U_2 + \ldots + U_i)(\alpha^{i+h-1}U_1 + \alpha^{i+h-2}U_2 + \ldots + U_{i+h})\right\} = E\left\{\alpha^{2i+h-2}U_1^2 + \alpha^{2i+h-4}U_2^2 + \ldots + \alpha^h U_i^2 + \alpha^{2i+h-3}U_1U_2 + \ldots\right\}.$$

Notice that, due to the stochastic model $\Sigma\{U\} = \sigma^2 I$, all the expected values of the mixed terms $\alpha^{2i+h-3}U_1U_2, \ldots$, i.e. all the covariances between any two distinct error variables $U_i$, $U_j$ $(i \neq j)$, are zero. Furthermore, the stochastic model expresses that the variances of all the $U_i$ are constant with $\sigma^2 = E\{U_1^2\} = E\{U_2^2\} = \ldots = E\{U_n^2\}$. With this, we may rewrite the covariance as

$$E\{E_iE_{i+h}\} = \sigma^2\left(\alpha^{2i+h-2} + \alpha^{2i+h-4} + \ldots + \alpha^h\right) = \begin{cases} \sigma^2\alpha^h\,\dfrac{1-\alpha^{2i}}{1-\alpha^2}, & \text{for } |\alpha| \neq 1, \\ \sigma^2\alpha^h\, i, & \text{for } |\alpha| = 1. \end{cases}$$

It is seen that $E\{E_iE_{i+h}\}$ is independent of $i$ only if $\sigma^2 = 0$. However, in that case all the errors $E_i$ would be exactly zero, which is a nonsensical requirement if the $E_i$ represent measurement errors. On the other hand, if $\sigma^2 > 0$, then we could resort to the following type of asymptotic stationarity. If $|\alpha| < 1$, then

$$\lim_{i\to\infty}\sigma^2\alpha^h\,\frac{1-\alpha^{2i}}{1-\alpha^2} = \frac{\sigma^2\alpha^h}{1-\alpha^2}.$$

It follows that the variance $(h = 0)$ and the covariance $(h > 0)$ of the errors $E_i$ are asymptotically independent of the index $i$.
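How quickly the covariance loses its dependence on the index $i$ can be illustrated with a few lines of code. The sketch below assumes, as in the expansion of the AR(1) process into $\alpha^{i-1}U_1 + \ldots + U_i$, that the process starts with $E_1 = U_1$; the function names are illustrative. It evaluates the finite sum, the closed form, and the limit for $|\alpha| < 1$.

```python
def ar1_covariance(alpha, sigma2, i, h):
    """E{E_i E_{i+h}} as the finite sum sigma2 * (alpha^(2i+h-2) + ... + alpha^h)."""
    return sigma2 * sum(alpha ** (2 * (i - k) + h) for k in range(1, i + 1))

def ar1_covariance_closed(alpha, sigma2, i, h):
    """Closed form of the same sum, valid for |alpha| != 1."""
    return sigma2 * alpha ** h * (1.0 - alpha ** (2 * i)) / (1.0 - alpha ** 2)

def ar1_covariance_limit(alpha, sigma2, h):
    """Limit for i -> infinity, valid for |alpha| < 1."""
    return sigma2 * alpha ** h / (1.0 - alpha ** 2)

if __name__ == "__main__":
    alpha, sigma2, h = 0.5, 2.0, 3
    for i in (1, 2, 5, 10, 40):
        print(i, ar1_covariance(alpha, sigma2, i, h),
              ar1_covariance_limit(alpha, sigma2, h))
```

For moderate values of $|\alpha|$ the finite-index covariance approaches its limit after only a few steps, whereas values of $|\alpha|$ close to one converge slowly.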

In summary, the conditions for the AR(1) model to be asymptotically weakly stationary up to order two are given by $E\{U_i\} = 0$ for all $i = 1,\ldots,n$, and $|\alpha| < 1$. If these conditions are presumed, the covariance matrix of the error variables $E$ is easily deduced from the limit $\sigma^2\alpha^h/(1-\alpha^2)$. If we let $h$ run from $0,\ldots,n-1$, we obtain

$$\Sigma\{E\} := \sigma^2Q_\alpha = \frac{\sigma^2}{1-\alpha^2}\begin{bmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1} \\ \alpha & 1 & \alpha & \cdots & \alpha^{n-2} \\ \vdots & & \ddots & & \vdots \\ \alpha^{n-1} & \alpha^{n-2} & \cdots & \alpha & 1 \end{bmatrix}, \qquad (4.-3)$$

where we take into account that the cofactor matrix $Q_\alpha$ of the autoregressive errors $E$ depends on the unknown parameter $\alpha$. According to Peracchi (2001, p. 77), the weight matrix is given by

$$P_\alpha = Q_\alpha^{-1} = \begin{bmatrix} 1 & -\alpha & & & \\ -\alpha & 1+\alpha^2 & -\alpha & & \\ & \ddots & \ddots & \ddots & \\ & & -\alpha & 1+\alpha^2 & -\alpha \\ & & & -\alpha & 1 \end{bmatrix}.$$

The weight matrix is tridiagonal and positive definite. We see that the errors $E$ are uncorrelated only if $\alpha = 0$. If $\alpha \neq 0$, then we say that the errors have serial correlations. This term was coined due to the fact that autoregressive error models have traditionally been applied to time series. To give a geodetic example, Schuh (1996, Chap. 3) used an autoregressive moving average (ARMA) model, which is a generalization of the AR(1) model, to obtain the covariance matrix of satellite data that have a band-limited error spectrum. Such data may be treated as a time series recorded along the satellite's orbit. Typically, such time series comprise very large numbers of observations, which justifies the use of asymptotic covariance matrices as above.

Now, the observation model under the additional assumption of normally distributed errors may be summarized as

$$Y \sim N(X\beta, \sigma^2P_\alpha^{-1}).$$

To keep the observation model as simple as possible, we might want to check whether serial correlation in the data is significant or not. Such a test may be based on the hypotheses

$$H_0: \bar\alpha = 0 \quad\text{versus}\quad H_1: \bar\alpha \neq 0.$$

For this purpose, we should clearly apply Rao's Score statistic, because it avoids estimation of the parameter $\alpha$. To see this, recall that Rao's Score statistic is based on the residuals of the Gauss-Markov model with restriction $H_0$. As this restriction reduces the observation model to the simpler model $Y \sim N(X\beta, \sigma^2 I)$, we will only have to estimate $\beta$ and $\sigma^2$.
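That the tridiagonal weight matrix is indeed the inverse of the AR(1) cofactor matrix can be verified directly for small dimensions. The following sketch builds both matrices as nested lists and multiplies them; the helper names `ar1_cofactor`, `ar1_weight`, and `matmul` are illustrative, and $n \geq 2$ and $|\alpha| < 1$ are assumed.

```python
def ar1_cofactor(alpha, n):
    """Q_alpha with entries alpha^|i-j| / (1 - alpha^2)."""
    return [[alpha ** abs(i - j) / (1.0 - alpha ** 2) for j in range(n)]
            for i in range(n)]

def ar1_weight(alpha, n):
    """Tridiagonal P_alpha: diagonal 1, 1+alpha^2, ..., 1+alpha^2, 1; off-diagonals -alpha."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 1.0 if i in (0, n - 1) else 1.0 + alpha ** 2
        if i + 1 < n:
            P[i][i + 1] = -alpha
            P[i + 1][i] = -alpha
    return P

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

if __name__ == "__main__":
    PQ = matmul(ar1_weight(0.6, 5), ar1_cofactor(0.6, 5))
    for row in PQ:
        print([round(v, 12) for v in row])   # should print the 5 x 5 identity
```

The sparsity of $P_\alpha$ is what makes the homogenization of AR(1) errors cheap even for very long series.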
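To derive Rao's Score statistic, we need to determine the log-likelihood score and the information with respect to the extended parameterization of the observation model above, and then to evaluate these quantities at the estimates of the restricted model. As the observations are now correlated and heteroscedastic, the joint density and consequently the log-likelihood do not factorize into the product of identical univariate densities. However, such a factorization is made possible quite easily through a transformation (homogenization) of the functional model. If we decompose the weight matrix into $P_\alpha = G_\alpha' G_\alpha$ with

$$G_\alpha = \begin{bmatrix} \sqrt{1-\alpha^2} & & & \\ -\alpha & 1 & & \\ & \ddots & \ddots & \\ & & -\alpha & 1 \end{bmatrix},$$

which may be verified directly by multiplication, and transform the observations and the design matrix by $Y^h = G_\alpha Y$ and $X^h = G_\alpha X$, then the homogenized observations $Y^h$ will have unity weight matrix. This transformation, which may also be written component-by-component as

$$Y_i^h = \begin{cases} \sqrt{1-\alpha^2}\,Y_1, & \text{for } i = 1, \\ Y_i - \alpha Y_{i-1}, & \text{for } i = 2,\dots,n, \end{cases} \qquad X_i^h = \begin{cases} \sqrt{1-\alpha^2}\,X_1, & \text{for } i = 1, \\ X_i - \alpha X_{i-1}, & \text{for } i = 2,\dots,n, \end{cases}$$

is also known as the Prais-Winsten transformation (see, for instance, Peracchi, 2001, p. 78). Now we may use the factorized form of the log-likelihood, that is

$$L(\theta;Y) = \ln\prod_{i=1}^n \left(2\pi\sigma^2\right)^{-1/2}\exp\left\{-\frac{(Y_i^h - X_i^h\beta)^2}{2\sigma^2}\right\} = \ln\left(2\pi\sigma^2\right)^{-n/2} - \frac{1}{2\sigma^2}\sum_{i=1}^n\left(Y_i^h - X_i^h\beta\right)^2.$$

Reversing the transformation, we obtain

$$\begin{aligned} L(\theta;Y) &= -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2 - \frac{1-\alpha^2}{2\sigma^2}(Y_1 - X_1\beta)^2 - \frac{1}{2\sigma^2}\sum_{i=2}^n\left(Y_i - \alpha Y_{i-1} - (X_i - \alpha X_{i-1})\beta\right)^2 \\ &= -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2 - \frac{1-\alpha^2}{2\sigma^2}(Y_1 - X_1\beta)^2 - \frac{1}{2\sigma^2}\sum_{i=2}^n\left[(Y_i - X_i\beta) - \alpha(Y_{i-1} - X_{i-1}\beta)\right]^2. \end{aligned}$$

Before determining the first partial derivatives, it will be convenient to expand the second sum and to move the parameter $\alpha$ outside the summation, which gives

$$\begin{aligned} L(\theta;Y) = {} & -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (Y_i - X_i\beta)^2 + \frac{\alpha}{\sigma^2}\sum_{i=2}^n (Y_i - X_i\beta)(Y_{i-1} - X_{i-1}\beta) \\ & - \frac{\alpha^2}{2\sigma^2}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)^2 + \frac{\alpha^2}{2\sigma^2}(Y_1 - X_1\beta)^2. \end{aligned}$$

The first partial derivatives of the log-likelihood function give the log-likelihood scores, that is

$$\begin{aligned} S_{\beta_j}(\theta;Y) := \frac{\partial L(\theta;Y)}{\partial\beta_j} = {} & \frac{1}{\sigma^2}\sum_{i=1}^n (Y_i - X_i\beta)X_{i,j} - \frac{\alpha^2}{\sigma^2}(Y_1 - X_1\beta)X_{1,j} - \frac{\alpha}{\sigma^2}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)X_{i,j} \\ & - \frac{\alpha}{\sigma^2}\sum_{i=2}^n (Y_i - X_i\beta)X_{i-1,j} + \frac{\alpha^2}{\sigma^2}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)X_{i-1,j}, \\ S_{\sigma^2}(\theta;Y) := \frac{\partial L(\theta;Y)}{\partial\sigma^2} = {} & -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (Y_i - X_i\beta)^2 - \frac{\alpha}{\sigma^4}\sum_{i=2}^n (Y_i - X_i\beta)(Y_{i-1} - X_{i-1}\beta) \\ & + \frac{\alpha^2}{2\sigma^4}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)^2 - \frac{\alpha^2}{2\sigma^4}(Y_1 - X_1\beta)^2, \\ S_{\alpha}(\theta;Y) := \frac{\partial L(\theta;Y)}{\partial\alpha} = {} & \frac{\alpha}{\sigma^2}(Y_1 - X_1\beta)^2 + \frac{1}{\sigma^2}\sum_{i=2}^n (Y_i - X_i\beta)(Y_{i-1} - X_{i-1}\beta) - \frac{\alpha}{\sigma^2}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)^2. \end{aligned}$$

The Hessian, which comprises the second partial derivatives, follows to be

$$\begin{aligned} H_{\beta_j\beta_k}(\theta;Y) := \frac{\partial^2 L(\theta;Y)}{\partial\beta_j\,\partial\beta_k} = {} & -\frac{1}{\sigma^2}\sum_{i=1}^n X_{i,k}X_{i,j} + \frac{\alpha^2}{\sigma^2}X_{1,k}X_{1,j} + \frac{\alpha}{\sigma^2}\sum_{i=2}^n X_{i-1,k}X_{i,j} \\ & + \frac{\alpha}{\sigma^2}\sum_{i=2}^n X_{i,k}X_{i-1,j} - \frac{\alpha^2}{\sigma^2}\sum_{i=2}^n X_{i-1,k}X_{i-1,j}, \\ H_{\beta_j\sigma^2}(\theta;Y) := \frac{\partial^2 L(\theta;Y)}{\partial\beta_j\,\partial\sigma^2} = {} & -\frac{1}{\sigma^4}\sum_{i=1}^n (Y_i - X_i\beta)X_{i,j} + \frac{\alpha^2}{\sigma^4}(Y_1 - X_1\beta)X_{1,j} + \frac{\alpha}{\sigma^4}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)X_{i,j} \\ & + \frac{\alpha}{\sigma^4}\sum_{i=2}^n (Y_i - X_i\beta)X_{i-1,j} - \frac{\alpha^2}{\sigma^4}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)X_{i-1,j}, \end{aligned}$$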

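$$\begin{aligned} H_{\beta_j\alpha}(\theta;Y) := \frac{\partial^2 L(\theta;Y)}{\partial\beta_j\,\partial\alpha} = {} & -\frac{2\alpha}{\sigma^2}(Y_1 - X_1\beta)X_{1,j} - \frac{1}{\sigma^2}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)X_{i,j} \\ & - \frac{1}{\sigma^2}\sum_{i=2}^n (Y_i - X_i\beta)X_{i-1,j} + \frac{2\alpha}{\sigma^2}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)X_{i-1,j}, \\ H_{\sigma^2\sigma^2}(\theta;Y) := \frac{\partial^2 L(\theta;Y)}{\partial\sigma^2\,\partial\sigma^2} = {} & \frac{n}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{i=1}^n (Y_i - X_i\beta)^2 + \frac{2\alpha}{\sigma^6}\sum_{i=2}^n (Y_i - X_i\beta)(Y_{i-1} - X_{i-1}\beta) \\ & - \frac{\alpha^2}{\sigma^6}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)^2 + \frac{\alpha^2}{\sigma^6}(Y_1 - X_1\beta)^2, \\ H_{\sigma^2\alpha}(\theta;Y) := \frac{\partial^2 L(\theta;Y)}{\partial\sigma^2\,\partial\alpha} = {} & -\frac{\alpha}{\sigma^4}(Y_1 - X_1\beta)^2 - \frac{1}{\sigma^4}\sum_{i=2}^n (Y_i - X_i\beta)(Y_{i-1} - X_{i-1}\beta) + \frac{\alpha}{\sigma^4}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)^2, \\ H_{\alpha\alpha}(\theta;Y) := \frac{\partial^2 L(\theta;Y)}{\partial\alpha\,\partial\alpha} = {} & \frac{1}{\sigma^2}(Y_1 - X_1\beta)^2 - \frac{1}{\sigma^2}\sum_{i=2}^n (Y_{i-1} - X_{i-1}\beta)^2. \end{aligned}$$

This gives for the information matrix, in terms of the errors $E_i = Y_i - X_i\beta$,

$$\begin{aligned} I_{\beta_j\beta_k}(\theta;Y) := -E\{H_{\beta_j\beta_k}(\theta;Y)\} = {} & \frac{1}{\sigma^2}\sum_{i=1}^n X_{i,k}X_{i,j} - \frac{\alpha^2}{\sigma^2}X_{1,k}X_{1,j} - \frac{\alpha}{\sigma^2}\sum_{i=2}^n X_{i-1,k}X_{i,j} - \frac{\alpha}{\sigma^2}\sum_{i=2}^n X_{i,k}X_{i-1,j} + \frac{\alpha^2}{\sigma^2}\sum_{i=2}^n X_{i-1,k}X_{i-1,j}, \\ I_{\beta_j\sigma^2}(\theta;Y) := -E\{H_{\beta_j\sigma^2}(\theta;Y)\} = {} & \frac{1}{\sigma^4}\sum_{i=1}^n E\{E_i\}X_{i,j} - \frac{\alpha^2}{\sigma^4}E\{E_1\}X_{1,j} - \frac{\alpha}{\sigma^4}\sum_{i=2}^n E\{E_{i-1}\}X_{i,j} \\ & - \frac{\alpha}{\sigma^4}\sum_{i=2}^n E\{E_i\}X_{i-1,j} + \frac{\alpha^2}{\sigma^4}\sum_{i=2}^n E\{E_{i-1}\}X_{i-1,j}, \\ I_{\beta_j\alpha}(\theta;Y) := -E\{H_{\beta_j\alpha}(\theta;Y)\} = {} & \frac{2\alpha}{\sigma^2}E\{E_1\}X_{1,j} + \frac{1}{\sigma^2}\sum_{i=2}^n E\{E_{i-1}\}X_{i,j} + \frac{1}{\sigma^2}\sum_{i=2}^n E\{E_i\}X_{i-1,j} - \frac{2\alpha}{\sigma^2}\sum_{i=2}^n E\{E_{i-1}\}X_{i-1,j}, \\ I_{\sigma^2\sigma^2}(\theta;Y) := -E\{H_{\sigma^2\sigma^2}(\theta;Y)\} = {} & -\frac{n}{2\sigma^4} + \frac{1}{\sigma^6}\sum_{i=1}^n E\{E_i^2\} - \frac{2\alpha}{\sigma^6}\sum_{i=2}^n E\{E_iE_{i-1}\} + \frac{\alpha^2}{\sigma^6}\sum_{i=2}^n E\{E_{i-1}^2\} - \frac{\alpha^2}{\sigma^6}E\{E_1^2\}, \\ I_{\sigma^2\alpha}(\theta;Y) := -E\{H_{\sigma^2\alpha}(\theta;Y)\} = {} & \frac{\alpha}{\sigma^4}E\{E_1^2\} + \frac{1}{\sigma^4}\sum_{i=2}^n E\{E_iE_{i-1}\} - \frac{\alpha}{\sigma^4}\sum_{i=2}^n E\{E_{i-1}^2\}, \\ I_{\alpha\alpha}(\theta;Y) := -E\{H_{\alpha\alpha}(\theta;Y)\} = {} & -\frac{1}{\sigma^2}E\{E_1^2\} + \frac{1}{\sigma^2}\sum_{i=2}^n E\{E_{i-1}^2\}. \end{aligned}$$

Evaluation of the scores at the restricted maximum likelihood estimates $\tilde\theta = [\,\tilde\beta'\ \ \tilde\sigma^2_{ML}\ \ \tilde\alpha\,]'$ with $\tilde\alpha = 0$ yields

$$\begin{aligned} S_{\beta_j}(\tilde\theta;Y) &= \frac{1}{\tilde\sigma^2_{ML}}\sum_{i=1}^n (Y_i - X_i\tilde\beta)X_{i,j} = 0, \\ S_{\sigma^2}(\tilde\theta;Y) &= -\frac{n}{2\tilde\sigma^2_{ML}} + \frac{1}{2\tilde\sigma^4_{ML}}\sum_{i=1}^n (Y_i - X_i\tilde\beta)^2 = 0, \\ S_{\alpha}(\tilde\theta;Y) &= \frac{1}{\tilde\sigma^2_{ML}}\sum_{i=2}^n (Y_i - X_i\tilde\beta)(Y_{i-1} - X_{i-1}\tilde\beta) = n\,\tilde\rho_1, \end{aligned}$$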

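where we used the orthogonality relation $(Y - X\tilde\beta)'x_j = 0$, in which $x_j$ denotes the $j$-th column of the design matrix $X$, the maximum likelihood variance estimator $\tilde\sigma^2_{ML} = \sum_{i=1}^n (Y_i - X_i\tilde\beta)^2/n$, and the autocorrelation estimator

$$\tilde\rho_1 = \frac{\sum_{i=2}^n (Y_i - X_i\tilde\beta)(Y_{i-1} - X_{i-1}\tilde\beta)}{\sum_{i=1}^n (Y_i - X_i\tilde\beta)^2}$$

for lag $h = 1$. Under $H_0$ the observations are uncorrelated. Hence, the information matrix at $\tilde\theta$ is given by

$$I_{\beta_j\beta_k}(\tilde\theta;Y) = \frac{1}{\tilde\sigma^2_{ML}}\sum_{i=1}^n X_{i,k}X_{i,j}, \quad I_{\beta_j\sigma^2}(\tilde\theta;Y) = I_{\beta_j\alpha}(\tilde\theta;Y) = I_{\sigma^2\alpha}(\tilde\theta;Y) = 0, \quad I_{\sigma^2\sigma^2}(\tilde\theta;Y) = \frac{n}{2\tilde\sigma^4_{ML}}, \quad I_{\alpha\alpha}(\tilde\theta;Y) = n - 2.$$

To obtain the total score vector $S_\theta(\tilde\theta;Y)$ and the entire information matrix $I_{\theta\theta}(\tilde\theta;Y)$, we must express all the individual components in matrix notation. In particular, we set up the $m \times 1$ subvector $S_\beta(\tilde\theta;Y) = 0$ consisting of all the entries $S_{\beta_j}(\tilde\theta;Y)$ ($j = 1,\dots,m$), and the $m \times m$ submatrix $I_{\beta\beta}(\tilde\theta;Y) = \frac{1}{\tilde\sigma^2_{ML}}X'X$ containing all the elements $I_{\beta_j\beta_k}(\tilde\theta;Y)$. Then, we obtain for Rao's Score statistic

$$T_{RS} = S'(\tilde\beta,\tilde\sigma^2_{ML},\tilde\alpha;Y)\, I^{-1}(\tilde\beta,\tilde\sigma^2_{ML},\tilde\alpha;Y)\, S(\tilde\beta,\tilde\sigma^2_{ML},\tilde\alpha;Y) = \begin{bmatrix} 0 \\ 0 \\ n\tilde\rho_1 \end{bmatrix}' \begin{bmatrix} \frac{1}{\tilde\sigma^2_{ML}}X'X & 0 & 0 \\ 0 & \frac{n}{2\tilde\sigma^4_{ML}} & 0 \\ 0 & 0 & n-2 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 0 \\ n\tilde\rho_1 \end{bmatrix}.$$

The information matrix is block-diagonal, hence the three parameter groups are independent. We see immediately that

$$T_{RS} = \frac{n^2}{n-2}\,\tilde\rho_1^2 \approx n\,\tilde\rho_1^2,$$

where the approximation will certainly be sufficient for large $n$. This statistic is called the Durbin-Watson statistic (see, for example, Krämer and Sonnberger, 1986, p. 7) and asymptotically follows a $\chi^2(1)$-distribution under $H_0$ by the asymptotic distribution result for Rao's Score statistic. This statistic measures the size of the absolute value of the empirical autocorrelation for lag $h = 1$, which is a reasonable procedure in light of the fact that the cofactor matrix $Q_\alpha$ is dominated by the cofactors $\alpha$ on the secondary diagonal. In fact, if $|\alpha|$ is much smaller than 1, then the cofactors for the higher lags decay quite rapidly according to $\alpha^h$.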
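The resulting statistic only requires the residuals of the restricted (uncorrelated) model. The following sketch, in plain Python, computes the lag-1 autocorrelation together with the exact and approximate forms $n^2\tilde\rho_1^2/(n-2)$ and $n\tilde\rho_1^2$. The function names and the alternating residual vector are hypothetical and serve only as an illustration; 3.841 is the 0.95 quantile of the $\chi^2(1)$-distribution.

```python
def lag1_autocorrelation(residuals):
    """Empirical lag-1 autocorrelation: sum(u_i * u_{i-1}) / sum(u_i^2)."""
    numerator = sum(u * v for u, v in zip(residuals[1:], residuals[:-1]))
    denominator = sum(u * u for u in residuals)
    return numerator / denominator


def rao_score_ar1(residuals):
    """Return the pair (n^2 rho^2 / (n - 2), n rho^2) for the given residuals."""
    n = len(residuals)
    rho = lag1_autocorrelation(residuals)
    return n * n * rho * rho / (n - 2), n * rho * rho


CHI2_1_95 = 3.841  # 0.95 quantile of the chi-square distribution, 1 degree of freedom

# Hypothetical residuals with a strongly alternating (negatively correlated) pattern.
alternating = [1.0, -1.0] * 10
exact, approx = rao_score_ar1(alternating)
reject_h0 = approx > CHI2_1_95
print(exact, approx, reject_h0)
```

In practice the residuals would come from the least squares adjustment under $H_0$; the two forms of the statistic differ only by the factor $n/(n-2)$.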
The procedure explained here to obtain a significance test of the parameter $\alpha$ of an AR(1) model may be generalized quite easily to an AR(p) model, which is defined as

$$E_i = \alpha_1 E_{i-1} + \dots + \alpha_p E_{i-p} + U_i \quad (i = 1,\dots,n).$$

If a joint significance test is desired with respect to the parameters $\alpha_1,\dots,\alpha_p$ of this model, then the log-likelihood function, the log-likelihood score, and the information matrix may be obtained according to the derivations above. The main difference is that the empirical autocorrelations up to lag $p$ will appear in the score vector. In that case, Rao's Score statistic can be shown to take the form

$$T_{RS} = n\sum_{j=1}^p \tilde\rho_j^2,$$

which is also known as the Portmanteau or Box-Pierce statistic and which is asymptotically distributed as $\chi^2(p)$ (see, for instance, Peracchi, 2001, p. 367).
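As a small illustration of the Portmanteau form, the following sketch computes $n\sum_{j=1}^p \tilde\rho_j^2$ from a residual vector. The function names and the sample residuals are assumptions made for this example; the empirical autocorrelation for lag $j$ is taken, by analogy with the lag-1 case, as $\sum_{i=j+1}^n \tilde u_i\tilde u_{i-j} / \sum_{i=1}^n \tilde u_i^2$.

```python
def autocorrelation(residuals, lag):
    """Empirical autocorrelation of the residuals for the given lag (lag >= 1)."""
    n = len(residuals)
    numerator = sum(residuals[i] * residuals[i - lag] for i in range(lag, n))
    return numerator / sum(u * u for u in residuals)


def box_pierce(residuals, p):
    """Portmanteau (Box-Pierce) statistic n * sum of squared autocorrelations up to lag p."""
    n = len(residuals)
    return n * sum(autocorrelation(residuals, j) ** 2 for j in range(1, p + 1))


# Hypothetical residuals; 5.991 is the 0.95 quantile of chi-square with 2 degrees of freedom.
residuals = [1.0, -1.0] * 10
statistic = box_pierce(residuals, p=2)
print(statistic > 5.991)
```

For `p = 1` the function reduces to the approximate AR(1) statistic $n\tilde\rho_1^2$.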

Example 4.1: Linear regression model with AR(1) errors. Let us inspect the power of the exact Durbin-Watson test for detecting correlations following an AR(1) model by performing a Monte Carlo simulation. For this purpose, suppose that the functional model $Y_i = X_i\beta + E_i$ is represented by a straight line through the origin with $X_i = [1, i]$ ($i = 1,\dots,n$), so that the intercept component of the true parameter vector $\bar\beta$ is zero. The error variables are assumed to follow the autoregressive model $E_i = \alpha E_{i-1} + U_i$ with $\sigma^2 = 1$. The observations are uncorrelated under the null hypothesis, that is $H_0: \bar\alpha = 0$, whereas the alternative hypothesis allows for correlations according to a non-zero value of $\alpha$, that is $H_1: \bar\alpha \neq 0$. Depending on the given problem, the deterministic model could be far more complex than the one we adopted here, but the main purpose of this example is to explain how the power of a test is examined empirically.

Now, we would intuitively expect the power of a reasonable diagnostic to increase as the value of $|\alpha|$ becomes larger. To verify our intuition, we generate $M$ vectors of dimension $n$ of standard-normally distributed random numbers. These vectors $u^{(j)}$ ($j = 1,\dots,M$) represent random realizations of the uncorrelated error variables $U_i$. Then we transform these errors into possibly correlated errors $e_i$ by using a set of four parameter values for $\alpha$. The value $\alpha = 0$ represents the case of no correlations, that is of a true $H_0$. Two further values reflect moderate negative correlation, while the fourth value produces a strong negative correlation (see the figure at the end of this example). Now, adding the above linear trend to the error vectors $e^{(j)}$ yields the data vectors $y^{(j)}$ ($j = 1,\dots,M$), which will be tested in the following.

The first step in the testing procedure described in the current section consists in estimating the $M$ sets of line parameters by $\tilde\beta^{(j)} = (X'X)^{-1}X'y^{(j)}$, where potential correlations are neglected. From these estimates the residual vectors follow to be $\tilde u^{(j)} = y^{(j)} - X\tilde\beta^{(j)}$.
Then, we need to compute the autocorrelation for lag $h = 1$ with respect to each of these residual vectors, that is

$$\tilde\rho_1^{(j)} = \frac{\sum_{i=2}^n \tilde u_i^{(j)}\,\tilde u_{i-1}^{(j)}}{\sum_{i=1}^n \left(\tilde u_i^{(j)}\right)^2}.$$

This quantity is simply a standardized empirical version of the theoretical covariance $\sigma^2\alpha^h/(1-\alpha^2)$ of the AR(1) error model (see the right panel of the figure below). Finally, we determine the values of the Durbin-Watson statistic $T_{RS}^{(j)} = n\left(\tilde\rho_1^{(j)}\right)^2$ and compare these to the critical value of the $\chi^2(1)$-distribution, for instance at level 0.05. To obtain empirical values of the power function evaluated at the chosen values of $\alpha$, we only need to count how many times the test rejects $H_0$, i.e. determine

$$N_R = \#\left\{ T_{RS}^{(j)} > k^{\chi^2(1)}_{0.95} \;\middle|\; j = 1,\dots,M \right\},$$

and divide this number by the number $M$ of trials. Then, the ratio $N_R/M$ is an estimate for the probability $\Pi(\alpha)$ that $H_0$ is rejected, which we expected to depend on the value $\alpha$ of the autoregressive parameter. The results of this simulation are summarized in a table of the rejection rates $N_R/M$ for each value of $\alpha$. We see that for $\alpha = 0$, the level of the test is reproduced reasonably well, and that the power of the test is almost 1 for the strongest correlation considered. To obtain the finer details of the empirical power function, we would only have to extend this simulation to a finer grid of $\alpha$-values. Naturally, we could also improve on the accuracy of the power estimates by generating a higher number $M$ of random samples $u^{(j)}$.

Figure: A single realization of an AR(1) error process with the strongly negative parameter value superimposed on a linear trend (left; value versus node index); theoretical covariance function for the same process (right; covariance versus lag).
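The simulation steps of this example can be sketched in a few lines of plain Python. All numerical settings below (`n = 100`, `trials = 500`, the slope `0.1`, the tested values of `alpha`, the random seed, and the start value $E_0 = 0$ of the recursion) are assumptions chosen for illustration and are not the settings used in the thesis; the function names are likewise hypothetical.

```python
import random


def simulate_ar1_errors(alpha, n, rng):
    """Generate E_i = alpha * E_{i-1} + U_i with E_0 = 0 and U_i ~ N(0, 1)."""
    errors, previous = [], 0.0
    for _ in range(n):
        previous = alpha * previous + rng.gauss(0.0, 1.0)
        errors.append(previous)
    return errors


def line_fit_residuals(y):
    """Least squares residuals for the straight-line model y_i = b0 + b1 * i, i = 1..n."""
    n = len(y)
    t = range(1, n + 1)
    t_mean = (n + 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [yi - intercept - slope * ti for ti, yi in zip(t, y)]


def rejection_rate(alpha, n=100, trials=500, slope=0.1, seed=1):
    """Estimate N_R / M: the share of trials in which n * rho_1^2 exceeds the critical value."""
    rng = random.Random(seed)
    critical = 3.841  # 0.95 quantile of chi-square with 1 degree of freedom
    rejections = 0
    for _ in range(trials):
        e = simulate_ar1_errors(alpha, n, rng)
        y = [slope * (i + 1) + ei for i, ei in enumerate(e)]  # linear trend plus errors
        u = line_fit_residuals(y)
        rho = sum(a * b for a, b in zip(u[1:], u[:-1])) / sum(a * a for a in u)
        if n * rho * rho > critical:
            rejections += 1
    return rejections / trials


for alpha in (0.0, -0.1, -0.5):
    print(alpha, rejection_rate(alpha))
```

With these assumed settings the rejection rate for `alpha = 0.0` should stay near the nominal level, and it should grow as `alpha` moves away from zero; a finer grid of `alpha` values or a larger `trials` refines the empirical power function.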

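4.3 Application 6: Testing for overlapping variance components

Suppose that observations $Y$ are approximated by a linear model $Y = X\beta + E$ with normally distributed zero-mean errors $E$, known design matrix $X \in \mathbb{R}^{n\times m}$ of full rank, and parameters $\beta \in \mathbb{R}^{m\times 1}$. Regarding the stochastic model, we assume that the covariance matrix is written as

$$\Sigma\{U\} = \sigma^2 I + \gamma V,$$

where $V$ is a positive known diagonal matrix. From these specifications it follows that the errors are uncorrelated, and that they possibly have unequal variances

$$\sigma_i^2 := \sigma^2_{U_i} = \sigma^2 + \gamma V_{ii}.$$

The model represents a Gauss-Markov model with two overlapping variance components $\sigma^2$ and $\gamma$. This model is a particular version of the general Gauss-Markov model with $k$ unknown variance and covariance components defined in Koch (1999). A test about the hypotheses

$$H_0: \bar\gamma = 0 \quad\text{versus}\quad H_1: \bar\gamma > 0$$

is most conveniently based on Rao's Score statistic, which avoids computation of the additional parameter $\gamma$. Recall that Rao's Score statistic requires that the first and second partial derivatives with respect to all unknown parameters, i.e. the score vector and the information matrix, are evaluated at the maximum likelihood estimates under the restriction $H_0$. Therefore, we must first determine the log-likelihood function for $Y$. Using the definition of the density for the univariate normal distribution with parameters $\theta = [\,\beta'\ \ \sigma^2\ \ \gamma\,]'$, we find

$$\begin{aligned} L(\theta;Y) &= \ln\prod_{i=1}^n f(Y_i;\theta) = \sum_{i=1}^n \ln f(Y_i;\theta) = \sum_{i=1}^n \ln\left[\left(2\pi\sigma_i^2\right)^{-1/2}\exp\left\{-\frac{(Y_i - X_i\beta)^2}{2\sigma_i^2}\right\}\right] \\ &= -\frac{n}{2}\ln 2\pi - \frac{1}{2}\sum_{i=1}^n \ln\left(\sigma^2 + \gamma V_{ii}\right) - \frac{1}{2}\sum_{i=1}^n \frac{(Y_i - X_i\beta)^2}{\sigma^2 + \gamma V_{ii}}. \end{aligned}$$

The first partial derivatives of the log-likelihood function give the scores

$$\begin{aligned} S_{\beta_j}(\theta;Y) &:= \frac{\partial L(\theta;Y)}{\partial\beta_j} = \sum_{i=1}^n \frac{(Y_i - X_i\beta)X_{i,j}}{\sigma^2 + \gamma V_{ii}}, \\ S_{\sigma^2}(\theta;Y) &:= \frac{\partial L(\theta;Y)}{\partial\sigma^2} = -\frac{1}{2}\sum_{i=1}^n \frac{1}{\sigma^2 + \gamma V_{ii}} + \frac{1}{2}\sum_{i=1}^n \frac{(Y_i - X_i\beta)^2}{(\sigma^2 + \gamma V_{ii})^2}, \\ S_{\gamma}(\theta;Y) &:= \frac{\partial L(\theta;Y)}{\partial\gamma} = -\frac{1}{2}\sum_{i=1}^n \frac{V_{ii}}{\sigma^2 + \gamma V_{ii}} + \frac{1}{2}\sum_{i=1}^n \frac{(Y_i - X_i\beta)^2}{(\sigma^2 + \gamma V_{ii})^2}V_{ii}. \end{aligned}$$

The second partial derivatives follow to be

$$\begin{aligned} H_{\beta_j\beta_k}(\theta;Y) &:= \frac{\partial^2 L(\theta;Y)}{\partial\beta_j\,\partial\beta_k} = -\sum_{i=1}^n \frac{X_{i,j}X_{i,k}}{\sigma^2 + \gamma V_{ii}}, \\ H_{\beta_j\sigma^2}(\theta;Y) &:= \frac{\partial^2 L(\theta;Y)}{\partial\beta_j\,\partial\sigma^2} = -\sum_{i=1}^n \frac{(Y_i - X_i\beta)X_{i,j}}{(\sigma^2 + \gamma V_{ii})^2}, \\ H_{\beta_j\gamma}(\theta;Y) &:= \frac{\partial^2 L(\theta;Y)}{\partial\beta_j\,\partial\gamma} = -\sum_{i=1}^n \frac{(Y_i - X_i\beta)X_{i,j}}{(\sigma^2 + \gamma V_{ii})^2}V_{ii}, \\ H_{\sigma^2\sigma^2}(\theta;Y) &:= \frac{\partial^2 L(\theta;Y)}{\partial\sigma^2\,\partial\sigma^2} = \frac{1}{2}\sum_{i=1}^n \frac{1}{(\sigma^2 + \gamma V_{ii})^2} - \sum_{i=1}^n \frac{(Y_i - X_i\beta)^2}{(\sigma^2 + \gamma V_{ii})^3}, \\ H_{\sigma^2\gamma}(\theta;Y) &:= \frac{\partial^2 L(\theta;Y)}{\partial\sigma^2\,\partial\gamma} = \frac{1}{2}\sum_{i=1}^n \frac{V_{ii}}{(\sigma^2 + \gamma V_{ii})^2} - \sum_{i=1}^n \frac{(Y_i - X_i\beta)^2}{(\sigma^2 + \gamma V_{ii})^3}V_{ii}, \\ H_{\gamma\gamma}(\theta;Y) &:= \frac{\partial^2 L(\theta;Y)}{\partial\gamma\,\partial\gamma} = \frac{1}{2}\sum_{i=1}^n \frac{V_{ii}^2}{(\sigma^2 + \gamma V_{ii})^2} - \sum_{i=1}^n \frac{(Y_i - X_i\beta)^2}{(\sigma^2 + \gamma V_{ii})^3}V_{ii}^2. \end{aligned}$$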

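Using the Markov conditions $E\{E_i\} = 0$ and $E\{E_i^2\} = \sigma_i^2$ about the errors $E_i = Y_i - X_i\beta$, we obtain for the components of the information matrix:

$$\begin{aligned} I_{\beta_j\beta_k}(\theta;Y) &:= -E\{H_{\beta_j\beta_k}(\theta;Y)\} = \sum_{i=1}^n \frac{X_{i,j}X_{i,k}}{\sigma^2 + \gamma V_{ii}}, \\ I_{\beta_j\sigma^2}(\theta;Y) &:= -E\{H_{\beta_j\sigma^2}(\theta;Y)\} = \sum_{i=1}^n \frac{E\{Y_i - X_i\beta\}X_{i,j}}{(\sigma^2 + \gamma V_{ii})^2} = 0, \\ I_{\beta_j\gamma}(\theta;Y) &:= -E\{H_{\beta_j\gamma}(\theta;Y)\} = \sum_{i=1}^n \frac{E\{Y_i - X_i\beta\}X_{i,j}}{(\sigma^2 + \gamma V_{ii})^2}V_{ii} = 0, \\ I_{\sigma^2\sigma^2}(\theta;Y) &:= -E\{H_{\sigma^2\sigma^2}(\theta;Y)\} = -\frac{1}{2}\sum_{i=1}^n \frac{1}{(\sigma^2 + \gamma V_{ii})^2} + \sum_{i=1}^n \frac{E\{(Y_i - X_i\beta)^2\}}{(\sigma^2 + \gamma V_{ii})^3} = \frac{1}{2}\sum_{i=1}^n \frac{1}{(\sigma^2 + \gamma V_{ii})^2}, \\ I_{\sigma^2\gamma}(\theta;Y) &:= -E\{H_{\sigma^2\gamma}(\theta;Y)\} = -\frac{1}{2}\sum_{i=1}^n \frac{V_{ii}}{(\sigma^2 + \gamma V_{ii})^2} + \sum_{i=1}^n \frac{E\{(Y_i - X_i\beta)^2\}}{(\sigma^2 + \gamma V_{ii})^3}V_{ii} = \frac{1}{2}\sum_{i=1}^n \frac{V_{ii}}{(\sigma^2 + \gamma V_{ii})^2}, \\ I_{\gamma\gamma}(\theta;Y) &:= -E\{H_{\gamma\gamma}(\theta;Y)\} = -\frac{1}{2}\sum_{i=1}^n \frac{V_{ii}^2}{(\sigma^2 + \gamma V_{ii})^2} + \sum_{i=1}^n \frac{E\{(Y_i - X_i\beta)^2\}}{(\sigma^2 + \gamma V_{ii})^3}V_{ii}^2 = \frac{1}{2}\sum_{i=1}^n \frac{V_{ii}^2}{(\sigma^2 + \gamma V_{ii})^2}. \end{aligned}$$

Evaluation of the scores at the restricted maximum likelihood estimates $\tilde\theta = [\,\tilde\beta'\ \ \tilde\sigma^2_{ML}\ \ \tilde\gamma\,]'$ under $H_0$ (i.e. $\tilde\gamma = 0$) yields

$$\begin{aligned} S_{\beta_j}(\tilde\theta;Y) &= \sum_{i=1}^n \frac{(Y_i - X_i\tilde\beta)X_{i,j}}{\tilde\sigma^2_{ML}} = \sum_{i=1}^n \frac{\tilde U_i X_{i,j}}{\tilde\sigma^2_{ML}} = 0, \\ S_{\sigma^2}(\tilde\theta;Y) &= -\frac{n}{2\tilde\sigma^2_{ML}} + \frac{1}{2\tilde\sigma^4_{ML}}\sum_{i=1}^n (Y_i - X_i\tilde\beta)^2 = -\frac{n}{2\tilde\sigma^2_{ML}} + \frac{1}{2\tilde\sigma^4_{ML}}\sum_{i=1}^n \tilde U_i^2 = 0, \\ S_{\gamma}(\tilde\theta;Y) &= -\frac{1}{2\tilde\sigma^2_{ML}}\sum_{i=1}^n V_{ii} + \frac{1}{2\tilde\sigma^4_{ML}}\sum_{i=1}^n \tilde U_i^2 V_{ii} = \frac{1}{2\tilde\sigma^2_{ML}}\sum_{i=1}^n \left(\frac{\tilde U_i^2}{\tilde\sigma^2_{ML}} - 1\right)V_{ii} = \frac{1}{2\tilde\sigma^2_{ML}}\sum_{i=1}^n \bar U_i V_{ii} = \frac{1}{2\tilde\sigma^2_{ML}}\,\mathbf{1}'V\bar U, \end{aligned}$$

where we may use the standardized residuals

$$\bar U_i := \frac{\tilde U_i^2}{\tilde\sigma^2_{ML}} - 1 = \frac{\tilde U_i^2}{\tilde U'\tilde U/n} - 1$$

and the $n \times 1$ vector of ones $\mathbf{1}$ to allow for a more compact notation. Evaluation of the information at $\tilde\theta$ yields

$$\begin{aligned} I_{\beta\beta}(\tilde\theta;Y) &= \frac{1}{\tilde\sigma^2_{ML}}X'X \quad\text{with elements}\quad I_{\beta_j\beta_k}(\tilde\theta;Y) = \sum_{i=1}^n \frac{X_{i,j}X_{i,k}}{\tilde\sigma^2_{ML}}, \\ I_{\beta_j\sigma^2}(\tilde\theta;Y) &= I_{\beta_j\gamma}(\tilde\theta;Y) = 0, \\ I_{\sigma^2\sigma^2}(\tilde\theta;Y) &= \frac{n}{2\tilde\sigma^4_{ML}}, \\ I_{\sigma^2\gamma}(\tilde\theta;Y) &= \frac{1}{2\tilde\sigma^4_{ML}}\sum_{i=1}^n V_{ii} = \frac{\operatorname{tr}V}{2\tilde\sigma^4_{ML}} = \frac{1}{2\tilde\sigma^4_{ML}}\,\mathbf{1}'V\mathbf{1}, \\ I_{\gamma\gamma}(\tilde\theta;Y) &= \frac{1}{2\tilde\sigma^4_{ML}}\sum_{i=1}^n V_{ii}^2 = \frac{\operatorname{tr}(VV)}{2\tilde\sigma^4_{ML}} = \frac{1}{2\tilde\sigma^4_{ML}}\,\mathbf{1}'VV\mathbf{1}. \end{aligned}$$

For computational purposes, the expressions for $I_{\sigma^2\gamma}(\tilde\theta;Y)$ and $I_{\gamma\gamma}(\tilde\theta;Y)$ in terms of the trace of $V$ or $VV$ are more convenient to use. However, to construct the test statistic itself, we will use these equations in matrix notation.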

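Now we obtain for Rao's Score statistic

$$\begin{aligned} T_{RS} &= S'(\tilde\beta,\tilde\sigma^2_{ML},\tilde\gamma;Y)\, I^{-1}(\tilde\beta,\tilde\sigma^2_{ML},\tilde\gamma;Y)\, S(\tilde\beta,\tilde\sigma^2_{ML},\tilde\gamma;Y) \\ &= \begin{bmatrix} 0 \\ 0 \\ \frac{1}{2\tilde\sigma^2_{ML}}\mathbf{1}'V\bar U \end{bmatrix}' \begin{bmatrix} \frac{1}{\tilde\sigma^2_{ML}}X'X & 0 & 0 \\ 0 & \frac{n}{2\tilde\sigma^4_{ML}} & \frac{1}{2\tilde\sigma^4_{ML}}\mathbf{1}'V\mathbf{1} \\ 0 & \frac{1}{2\tilde\sigma^4_{ML}}\mathbf{1}'V\mathbf{1} & \frac{1}{2\tilde\sigma^4_{ML}}\mathbf{1}'VV\mathbf{1} \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 0 \\ \frac{1}{2\tilde\sigma^2_{ML}}\mathbf{1}'V\bar U \end{bmatrix} \\ &= \frac{1}{2}\begin{bmatrix} 0 & 0 & \bar U'V\mathbf{1} \end{bmatrix} \begin{bmatrix} 2\tilde\sigma^2_{ML}X'X & 0 & 0 \\ 0 & n & \mathbf{1}'V\mathbf{1} \\ 0 & \mathbf{1}'V\mathbf{1} & \mathbf{1}'VV\mathbf{1} \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 0 \\ \mathbf{1}'V\bar U \end{bmatrix}. \end{aligned}$$

As all the components of the log-likelihood score for the unrestricted parameters vanish, we only need to find the Schur complement of the block $\mathbf{1}'VV\mathbf{1}$, which is simple to compute due to the block-diagonality of the submatrix with respect to the parameter groups $\beta$ and $\sigma^2$. We obtain

$$\mathbf{1}'VV\mathbf{1} - \begin{bmatrix} 0 & \mathbf{1}'V\mathbf{1} \end{bmatrix} \begin{bmatrix} 2\tilde\sigma^2_{ML}X'X & 0 \\ 0 & n \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ \mathbf{1}'V\mathbf{1} \end{bmatrix} = \mathbf{1}'VV\mathbf{1} - \frac{1}{n}\,\mathbf{1}'V\mathbf{1}\,\mathbf{1}'V\mathbf{1} = \mathbf{1}'V\left[I - \frac{1}{n}\mathbf{1}\mathbf{1}'\right]V\mathbf{1}.$$

Rao's Score statistic for testing the significance of a single additive variance component $\gamma$ finally reads

$$T_{RS} = \frac{1}{2}\,\bar U'V\mathbf{1}\left(\mathbf{1}'V\left[I - \frac{1}{n}\mathbf{1}\mathbf{1}'\right]V\mathbf{1}\right)^{-1}\mathbf{1}'V\bar U.$$

As this test statistic has an approximate $\chi^2(1)$-distribution by the asymptotic distribution result for Rao's Score statistic, Rao's Score test is given by

$$\phi_{AVC}(y) = \begin{cases} 1, & \text{if } T_{RS} > k^{\chi^2(1)}_{1-\alpha}, \\ 0, & \text{if } T_{RS} < k^{\chi^2(1)}_{1-\alpha}. \end{cases}$$

Example 4.2: Significance testing of a distance-dependent variance component. Koch (1981) considered an additive heteroscedasticity model that explains the variances $\sigma_i^2$ of distance measurements $s_1,\dots,s_n$ by a constant part and a distance-dependent part. If we further assume the observations to be uncorrelated, then the stochastic model follows to be $\Sigma\{Y\} = \sigma^2 I + \gamma V$ with a diagonal matrix $V$ whose elements are determined by the measured distances $s_1,\dots,s_n$. We might desire a test of the null hypothesis that measured distances have constant accuracy $\sigma^2$ against the alternative hypothesis that there is a significant distance-dependent variance component superposing the variance $\sigma^2$. These hypotheses take the form $H_0: \bar\gamma = 0$ versus $H_1: \bar\gamma > 0$. Under the assumption of normally distributed observations, the resulting observation model reads $Y \sim N(X\beta, \sigma^2 I + \gamma V)$.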
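Because $V$ is diagonal, the final statistic reduces to simple sums over the diagonal elements: $\mathbf{1}'V\bar U = \sum_i V_{ii}\bar U_i$ and $\mathbf{1}'V[I - \frac{1}{n}\mathbf{1}\mathbf{1}']V\mathbf{1} = \sum_i V_{ii}^2 - (\sum_i V_{ii})^2/n$. The sketch below evaluates the statistic in plain Python. The function name and the sample residuals and diagonal elements are hypothetical; the residuals are assumed to stem from the least squares fit under $H_0$.

```python
def rao_score_variance_component(residuals, v_diag):
    """Score statistic for an additive variance component gamma with diagonal V.

    residuals: least squares residuals of the restricted model (gamma = 0)
    v_diag:    diagonal elements V_ii (must not all be equal, else the
               Schur complement in the denominator is zero)
    """
    n = len(residuals)
    sigma2_ml = sum(u * u for u in residuals) / n
    u_bar = [u * u / sigma2_ml - 1.0 for u in residuals]          # standardized residuals
    score_part = sum(v * ub for v, ub in zip(v_diag, u_bar))      # 1' V u_bar
    schur = sum(v * v for v in v_diag) - sum(v_diag) ** 2 / n     # 1' V [I - 11'/n] V 1
    return 0.5 * score_part ** 2 / schur


# Hypothetical data: residual magnitudes grow with the diagonal elements of V.
residuals = [0.5, -0.5, 1.0, -1.0, 2.0, -2.0, 3.0, -3.0]
v_diag = [1.0, 1.0, 2.0, 2.0, 4.0, 4.0, 6.0, 6.0]
t_rs = rao_score_variance_component(residuals, v_diag)
print(t_rs, t_rs > 3.841)  # compare with the 0.95 quantile of chi-square(1)
```

If all squared residuals are equal, every $\bar U_i$ vanishes and the statistic is zero, which reflects the absence of any evidence for a variance component tied to $V$.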

4.4 Application 7: Testing for non-normality of the observation errors

Let us consider the linear functional model

$$Y = X\beta + U,$$

where $X \in \mathbb{R}^{n\times m}$ denotes a known design matrix of full rank and $\beta \in \mathbb{R}^{m\times 1}$ a vector of unknown functional parameters. We will assume for now that the errors $U$ are uncorrelated and homoscedastic according to the stochastic model

$$\Sigma\{U\} = \sigma^2 I.$$

We have used such a Gauss-Markov model, for instance, in Section 3 to obtain the UMPI statistic $M$ for testing linear restrictions $H\bar\beta = w$. The exact $\chi^2$- or $F$-distribution of this test statistic has been derived from the basic premise that the error variables are normally distributed. This normality assumption becomes even more evident if we recall that we used the normal density/likelihood function to derive the Likelihood Ratio and Rao's Score statistic for that problem. If the errors do not follow a normal distribution, then these tests are not reliable anymore, because the exact distributions of these test statistics and therefore the critical values will be at least inaccurate, and the likelihood function will be misspecified. Therefore, if we have serious doubts about the normality of the error variables, then we should test this assumption. In this section, we will investigate a test of normality which fits into the framework of parametric testing problems, and which may be derived conveniently on the basis of Rao's Score statistic.

Let us start by recalling that the density of a univariate normal distribution is characterized by four parameters: a variable mean $\mu$, a variable variance $\sigma^2$, a constant skewness $\gamma_1 = 0$ reflecting symmetry about the mean, and a constant kurtosis $\gamma_2 = 0$ indicating a mesokurtic shape. The mean is then identical to the first moment $\mu = \int x f(x)\,dx$ and the variance identical to the second central moment $\mu_2 = \int (x-\mu)^2 f(x)\,dx$, whereas the skewness and kurtosis are based on the third and fourth central moments $\mu_3 = \int (x-\mu)^3 f(x)\,dx$ and $\mu_4 = \int (x-\mu)^4 f(x)\,dx$ through the relations

$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \quad\text{and}\quad \gamma_2 = \frac{\mu_4}{\mu_2^2} - 3$$

(see Stuart and Ord, 2003, p. 74 and 9).
A natural idea is now to estimate the skewness and kurtosis from the given data and to compare these estimates with the values ascribed to the normal distribution. On the one hand, if the empirical skewness turns out to be significantly smaller/larger than 0, then the errors will have a non-symmetrical distribution with a lower/upper tail that is heavier than for a normal distribution. On the other hand, if the empirical kurtosis is significantly smaller/larger than 0, then the errors will have a platykurtic/leptokurtic distribution with a flatter/sharper top (see Stuart and Ord, 2003, p. 9). Unfortunately, it is not clear how large these deviations from 0 must be to indicate significant non-normality, because we do not know the probability distribution of the estimators for $\gamma_1$ and $\gamma_2$.

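Nevertheless, this problem of testing $H_0: \bar\gamma_1 = 0, \bar\gamma_2 = 0$ versus $H_1: \bar\gamma_1 \neq 0, \bar\gamma_2 \neq 0$ may be tackled in a quite convenient way by considering Pearson's collection of distributions $\mathcal{W}_P$. The density function of each univariate distribution within $\mathcal{W}_P$ satisfies the differential equation

$$\frac{d}{du}\ln f(u; c_0, c_1, c_2) = \frac{c_1 - u}{c_0 - c_1 u + c_2 u^2} \quad (u \in \mathbb{R}),$$

where the parameters $c_0$, $c_1$, and $c_2$ determine the shape of the density function $f$, and where $u$ is a quantity measured about its mean, such as an error $u_i = y_i - X_i\beta$. We will see later that the density function of the centered normal distribution with $\mu = 0$ satisfies this differential equation for $c_0 = \sigma^2$, $c_1 = 0$, and $c_2 = 0$. Furthermore, the parameters $c_1$ and $c_2$ correspond to the skewness and kurtosis through the relations

$$c_1 = \frac{\gamma_1(\gamma_2 + 6)\sqrt{\mu_2}}{12\gamma_1^2 - 10\gamma_2 - 12} \quad\text{and}\quad c_2 = \frac{3\gamma_1^2 - 2\gamma_2}{12\gamma_1^2 - 10\gamma_2 - 12}$$

(see Equation 6.4 in Stuart and Ord, 2003, p. 7), so that the problem of testing $H_0: \bar\gamma_1 = 0, \bar\gamma_2 = 0$ versus $H_1: \bar\gamma_1 \neq 0, \bar\gamma_2 \neq 0$ is equivalent to testing $H_0: \bar c_1 = 0, \bar c_2 = 0$ versus $H_1: \bar c_1 \neq 0, \bar c_2 \neq 0$.

Let us now determine the general solution of the differential equation. Integration yields

$$\ln f(u; c_0, c_1, c_2) + k = \int \frac{c_1 - u}{c_0 - c_1 u + c_2 u^2}\,du,$$

which we may rewrite as

$$\ln f(u; c_0, c_1, c_2) + k = g(u; c_0, c_1, c_2) \quad (k \in \mathbb{R}),$$

where $k$ denotes the integration constant and

$$g(u; c_0, c_1, c_2) := \int \frac{c_1 - u}{c_0 - c_1 u + c_2 u^2}\,du$$

an antiderivative. Now, using $\exp(\ln u) = u$, the general solution follows to be

$$f(u; c_0, c_1, c_2) = \exp\{g(u; c_0, c_1, c_2) - k\} = \exp\{-k\}\exp\{g(u; c_0, c_1, c_2)\} =: k^*\exp\{g(u; c_0, c_1, c_2)\}.$$

The integration constant $k^*$ is determined by standardizing the area under $f$ to 1, which yields

$$1 = \int_{-\infty}^{+\infty} f(u; c_0, c_1, c_2)\,du = \int_{-\infty}^{+\infty} k^*\exp g(u; c_0, c_1, c_2)\,du = k^*\int_{-\infty}^{+\infty}\exp g(u; c_0, c_1, c_2)\,du.$$

Now, substituting $k^* = 1/\int_{-\infty}^{+\infty}\exp g(u; c_0, c_1, c_2)\,du$ into the general solution leads to

$$f(u; c_0, c_1, c_2) = \frac{\exp g(u; c_0, c_1, c_2)}{\int_{-\infty}^{+\infty}\exp g(u; c_0, c_1, c_2)\,du}$$

as the standardized solution of the differential equation.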
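The correspondence between the Pearson parameters and the skewness and kurtosis can be illustrated numerically. The sketch below evaluates $c_1$ and $c_2$ from $(\gamma_1, \gamma_2, \mu_2)$ in the sign convention of the differential equation above. The expression used for $c_0$, namely $c_0 = (1 - 3c_2)\mu_2$, is not quoted in the text; it is an assumption of this sketch that follows from multiplying the differential equation by its denominator times $u f$ and integrating by parts. The function name and the test distributions are chosen for illustration only.

```python
import math


def pearson_coefficients(gamma1, gamma2, mu2):
    """Return (c0, c1, c2) of the Pearson differential equation for given skewness
    gamma1, excess kurtosis gamma2, and variance mu2."""
    denom = 12.0 * gamma1 ** 2 - 10.0 * gamma2 - 12.0
    c1 = gamma1 * (gamma2 + 6.0) * math.sqrt(mu2) / denom
    c2 = (3.0 * gamma1 ** 2 - 2.0 * gamma2) / denom
    c0 = (1.0 - 3.0 * c2) * mu2  # assumption of this sketch, see lead-in
    return c0, c1, c2


# Normal distribution with variance 2: expect c0 equal to the variance, c1 = c2 = 0.
normal_c = pearson_coefficients(0.0, 0.0, 2.0)

# Centered gamma distribution with shape p and unit scale: variance p,
# skewness 2/sqrt(p), excess kurtosis 6/p. Its log-density has the derivative
# (-1 - u) / (p + u), i.e. c0 = p, c1 = -1, c2 = 0.
p = 4.0
gamma_c = pearson_coefficients(2.0 / math.sqrt(p), 6.0 / p, p)
print(normal_c, gamma_c)
```

The gamma case serves as an independent check, because its coefficients can be read off directly from the derivative of the log-density.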

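Pearson's collection of distributions comprises a large number of standard distributions and, as indicated earlier, the normal distribution is its most prominent member. Some particularly useful members of $\mathcal{W}_P$ are summarized by the following proposition.

Proposition. The following univariate distributions are members of Pearson's collection of distributions:
1. Centered normal distribution $N(0, \sigma^2)$,
2. Gamma distribution $G(b, p)$,
3. Beta distribution $B(\alpha, \beta)$,
4. Student distribution $t(k)$.

Proof. 1. To see that the density function $f(u; \sigma^2)$ of $N(0, \sigma^2)$ satisfies the differential equation, set $c_0 = \sigma^2$, $c_1 = c_2 = 0$ in the standardized solution, which becomes

$$f(u; \sigma^2) = \frac{\exp\left\{\int -\frac{u}{\sigma^2}\,du\right\}}{\int_{-\infty}^{+\infty}\exp\left\{\int -\frac{u}{\sigma^2}\,du\right\}du} = \frac{\exp\left\{-\frac{u^2}{2\sigma^2}\right\}}{\int_{-\infty}^{+\infty}\exp\left\{-\frac{u^2}{2\sigma^2}\right\}du}.$$

The integral in the denominator is solved by using

$$\int_0^{\infty}\exp\{-a^2x^2\}\,dx = \frac{\sqrt{\pi}}{2a} \quad (a > 0)$$

(see Bronstein and Semendjajew, 1991, p. 66; Integral 3), where in the given case $a = 1/(\sqrt{2}\,\sigma)$ is positive as a consequence of the fact that $\sigma > 0$ by definition. Also note that integration over $(-\infty, +\infty)$ doubles the value of this integral because of $\exp\{-a^2(-x)^2\} = \exp\{-a^2x^2\}$. Thus it follows that

$$\int_{-\infty}^{+\infty}\exp\left\{-\frac{u^2}{2\sigma^2}\right\}du = \sqrt{2\pi}\,\sigma,$$

from which the density function

$$f(u; \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{u^2}{2\sigma^2}\right\}$$

of the centered normal distribution is obtained. Proofs for 2.-4. are found, for instance, in Stuart and Ord (2003, Chap. 6).

Now, the log-likelihood function may be determined from the densities with respect to the errors $u_i = y_i - X_i\beta$ ($i = 1,\dots,n$). If these error variables are assumed to be independently distributed, then the joint density as a function of $y$ with additional parameters $\beta$ may be factorized as

$$f(y; \beta, c_0, c_1, c_2) = \prod_{i=1}^n \frac{\exp g(u_i; c_0, c_1, c_2)}{\int_{-\infty}^{+\infty}\exp g(u_i; c_0, c_1, c_2)\,du_i}.$$

Notice that the value of the integral in the denominator is only a function of $c_0$, $c_1$, and $c_2$, not of $\beta$, because $u_i$ acts there only as an integration variable. Defining the parameter vector as $\theta := [\,\beta'\ \ c_0\ \ c_1\ \ c_2\,]'$, the log-likelihood function follows to be

$$L(\theta; y) = \ln\prod_{i=1}^n \frac{\exp g(u_i; c_0, c_1, c_2)}{\int_{-\infty}^{+\infty}\exp g(u_i; c_0, c_1, c_2)\,du_i} = \sum_{i=1}^n\left[g(u_i; c_0, c_1, c_2) - \ln\int_{-\infty}^{+\infty}\exp g(u_i; c_0, c_1, c_2)\,du_i\right].$$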

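Taking the first partial derivatives with respect to the functional parameters $\beta_j$ ($j = 1,\dots,m$) yields

$$S_{\beta_j}(\theta; y) := \frac{\partial L(\theta; y)}{\partial\beta_j} = \sum_{i=1}^n \frac{\partial g(u_i; c_0, c_1, c_2)}{\partial\beta_j} = \sum_{i=1}^n \frac{\partial u_i}{\partial\beta_j}\,\frac{\partial}{\partial u_i}\int\frac{c_1 - u_i}{c_0 - c_1u_i + c_2u_i^2}\,du_i = -\sum_{i=1}^n X_{i,j}\,\frac{c_1 - u_i}{c_0 - c_1u_i + c_2u_i^2},$$

which defines the random variable

$$S_{\beta_j}(\theta; Y) = -\sum_{i=1}^n X_{i,j}\,\frac{c_1 - U_i}{c_0 - c_1U_i + c_2U_i^2}.$$

Peracchi (2001, p. 365) points out an elegant way to derive the score with respect to the parameters $c_j$ ($j = 0, 1, 2$), which shall be explained here in greater detail. Writing $g(u_i)$ for $g(u_i; c_0, c_1, c_2)$ and applying the partial derivative to both terms within the sum of the log-likelihood function gives

$$S_{c_j}(\theta; y) := \frac{\partial L(\theta; y)}{\partial c_j} = \sum_{i=1}^n\left[\frac{\partial g(u_i)}{\partial c_j} - \frac{\partial}{\partial c_j}\ln\int_{-\infty}^{+\infty}\exp g(u_i)\,du_i\right] = \sum_{i=1}^n\left[\frac{\partial g(u_i)}{\partial c_j} - \frac{1}{\int_{-\infty}^{+\infty}\exp g(u_i)\,du_i}\,\frac{\partial}{\partial c_j}\int_{-\infty}^{+\infty}\exp g(u_i)\,du_i\right].$$

Here we may interchange the integral and derivative, which results in

$$S_{c_j}(\theta; y) = \sum_{i=1}^n\left[\frac{\partial g(u_i)}{\partial c_j} - \frac{\int_{-\infty}^{+\infty}\exp g(u_i)\,\frac{\partial g(u_i)}{\partial c_j}\,du_i}{\int_{-\infty}^{+\infty}\exp g(u_i)\,du_i}\right].$$

The next step is to see that the integral in the denominator can be moved into the integral in the numerator, which allows us to apply the standardized solution for $f$, that is

$$S_{c_j}(\theta; y) = \sum_{i=1}^n\left[\frac{\partial g(u_i)}{\partial c_j} - \int_{-\infty}^{+\infty}\frac{\exp g(u_i)}{\int_{-\infty}^{+\infty}\exp g(u_i)\,du_i}\,\frac{\partial g(u_i)}{\partial c_j}\,du_i\right] = \sum_{i=1}^n\left[\frac{\partial g(u_i)}{\partial c_j} - \int_{-\infty}^{+\infty} f(u_i; c_0, c_1, c_2)\,\frac{\partial g(u_i)}{\partial c_j}\,du_i\right].$$

Finally, we may use the fact that the integral represents the expectation of the random variable $\partial g(U_i; c_0, c_1, c_2)/\partial c_j$, which leads to the result

$$S_{c_j}(\theta; Y) = \sum_{i=1}^n\left[\frac{\partial g(U_i; c_0, c_1, c_2)}{\partial c_j} - E\left\{\frac{\partial g(U_i; c_0, c_1, c_2)}{\partial c_j}\right\}\right]$$

as given in Peracchi. To compute the partial derivatives $\partial g(u_i; c_0, c_1, c_2)/\partial c_j$ regarding the antiderivative $g$ defined above, we may again interchange the derivative and the integral. Then we obtain

$$\begin{aligned} \frac{\partial g(u_i; c_0, c_1, c_2)}{\partial c_0} &= -\int\frac{c_1 - u_i}{(c_0 - c_1u_i + c_2u_i^2)^2}\,du_i, \\ \frac{\partial g(u_i; c_0, c_1, c_2)}{\partial c_1} &= \int\frac{(c_0 - c_1u_i + c_2u_i^2) + (c_1 - u_i)u_i}{(c_0 - c_1u_i + c_2u_i^2)^2}\,du_i = \int\frac{c_0 + c_2u_i^2 - u_i^2}{(c_0 - c_1u_i + c_2u_i^2)^2}\,du_i, \\ \frac{\partial g(u_i; c_0, c_1, c_2)}{\partial c_2} &= -\int\frac{(c_1 - u_i)u_i^2}{(c_0 - c_1u_i + c_2u_i^2)^2}\,du_i. \end{aligned}$$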

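Eventually, we will have to evaluate the scores at the ML estimates with the restrictions $H_0: \bar c_1 = \bar c_2 = 0$. Furthermore, the parameter $c_0$ will be identical to the variance $\sigma^2$ under these restrictions, as mentioned above. Then, evaluation of the partial derivatives at $c_1 = c_2 = 0$ with the parameters $c_0 = \sigma^2$ and $\beta$ remaining unspecified gives

$$\begin{aligned} \frac{\partial g(u_i; \sigma^2, 0, 0)}{\partial c_0} &= -\int\frac{-u_i}{\sigma^4}\,du_i = \frac{1}{\sigma^4}\int u_i\,du_i = \frac{u_i^2}{2\sigma^4}, \\ \frac{\partial g(u_i; \sigma^2, 0, 0)}{\partial c_1} &= \int\frac{\sigma^2 - u_i^2}{\sigma^4}\,du_i = \frac{1}{\sigma^2}\int du_i - \frac{1}{\sigma^4}\int u_i^2\,du_i = \frac{u_i}{\sigma^2} - \frac{u_i^3}{3\sigma^4}, \\ \frac{\partial g(u_i; \sigma^2, 0, 0)}{\partial c_2} &= -\int\frac{-u_i^3}{\sigma^4}\,du_i = \frac{1}{\sigma^4}\int u_i^3\,du_i = \frac{u_i^4}{4\sigma^4}. \end{aligned}$$

These quantities define random variables whose expectations, under the restrictions $H_0$, are given by

$$\begin{aligned} E\left\{\frac{\partial g(U_i; \sigma^2, 0, 0)}{\partial c_0}\right\} &= \frac{E\{U_i^2\}}{2\sigma^4} = \frac{1}{2\sigma^2}, \\ E\left\{\frac{\partial g(U_i; \sigma^2, 0, 0)}{\partial c_1}\right\} &= \frac{E\{U_i\}}{\sigma^2} - \frac{E\{U_i^3\}}{3\sigma^4} = 0, \\ E\left\{\frac{\partial g(U_i; \sigma^2, 0, 0)}{\partial c_2}\right\} &= \frac{E\{U_i^4\}}{4\sigma^4} = \frac{3}{4}, \end{aligned}$$

where we used the following facts about the moments of $U_i$: (1) $E\{U_i\} = 0$ by virtue of the first Markov condition; (2) $E\{U_i^2\} = \sigma^2 = \mu_2$ in light of the second Markov condition; (3) $c_1 = \gamma_1 = 0$ implies $E\{U_i^3\} = \mu_3 = 0$ because of the definition of the skewness; and (4) $c_2 = \gamma_2 = 0$ implies $E\{U_i^4\} = \mu_4 = 3\mu_2^2 = 3\sigma^4$ due to the definition of the kurtosis. This gives finally the components of the score

$$\begin{aligned} S_{\sigma^2}(\theta; Y) &= \sum_{i=1}^n\left(\frac{U_i^2}{2\sigma^4} - \frac{1}{2\sigma^2}\right) = \frac{1}{2\sigma^4}\sum_{i=1}^n U_i^2 - \frac{n}{2\sigma^2}, \\ S_{c_1}(\theta; Y) &= \sum_{i=1}^n\left(\frac{U_i}{\sigma^2} - \frac{U_i^3}{3\sigma^4}\right) = \frac{1}{\sigma^2}\sum_{i=1}^n U_i - \frac{1}{3\sigma^4}\sum_{i=1}^n U_i^3, \\ S_{c_2}(\theta; Y) &= \sum_{i=1}^n\left(\frac{U_i^4}{4\sigma^4} - \frac{3}{4}\right) = \frac{1}{4\sigma^4}\sum_{i=1}^n U_i^4 - \frac{3n}{4}. \end{aligned}$$

To construct Rao's Score statistic, the scores with respect to $\beta_j$, $\sigma^2$, $c_1$, and $c_2$ must be evaluated at the restricted ML estimates $\tilde\theta$. Under the restrictions $c_1 = c_2 = 0$, the Gauss-Markov model has normally distributed errors $U$. Therefore, the restricted ML estimator for $\beta$ is identical to the least squares estimator $\tilde\beta = (X'X)^{-1}X'Y$. The residuals are then estimated by $\tilde U = Y - X\tilde\beta$. This leads to the restricted ML estimator for $c_0 = \sigma^2$, that is $\tilde\sigma^2_{ML} = \tilde U'\tilde U/n$, which differs from the least squares estimator only in using the factor $1/n$ instead of $1/(n-m)$. Finally, we will also use the estimators $\tilde\mu_j = \sum_{i=1}^n \tilde U_i^j/n$ ($j = 2, 3, 4$) for the second, third, and fourth central moments (the first moment $\tilde\mu_1 = \sum_{i=1}^n \tilde U_i/n$ is zero as we assumed the presence of an intercept parameter). Exploiting the orthogonality between the $j$-th column of $X$ (i.e. the $j$-th row of $X'$) and the vector of estimated residuals, we obtain

$$\begin{aligned} S_{\beta_j}(\tilde\theta; Y) &= \frac{1}{\tilde\sigma^2_{ML}}\sum_{i=1}^n X_{i,j}\tilde U_i = 0, \\ S_{\sigma^2}(\tilde\theta; Y) &= \frac{1}{2\tilde\sigma^4_{ML}}\sum_{i=1}^n \tilde U_i^2 - \frac{n}{2\tilde\sigma^2_{ML}} = \frac{n\tilde\sigma^2_{ML}}{2\tilde\sigma^4_{ML}} - \frac{n}{2\tilde\sigma^2_{ML}} = 0, \\ S_{c_1}(\tilde\theta; Y) &= \frac{1}{\tilde\sigma^2_{ML}}\sum_{i=1}^n \tilde U_i - \frac{1}{3\tilde\sigma^4_{ML}}\sum_{i=1}^n \tilde U_i^3 = -\frac{n\tilde\mu_3}{3\tilde\sigma^4_{ML}}, \\ S_{c_2}(\tilde\theta; Y) &= \frac{1}{4\tilde\sigma^4_{ML}}\sum_{i=1}^n \tilde U_i^4 - \frac{3n}{4} = \frac{n\tilde\mu_4}{4\tilde\sigma^4_{ML}} - \frac{3n}{4}. \end{aligned}$$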

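As already mentioned earlier, we see that the score vanishes in the direction of the unrestricted parameters $\beta$ and $\sigma^2$, because the estimates for the unrestricted parameters are given the freedom to satisfy the corresponding likelihood equations exactly, i.e. to maximize the log-likelihood function in those directions.

The derivation of the second partial derivatives of the log-likelihood function with respect to all the parameters in $\theta$, or equivalently of the first partial derivatives of the scores, is very lengthy. Therefore, we will refer to the corresponding proposition in Bera and Jarque (1982), from which the information matrix at $\tilde\theta$ is obtained as

$$I(\tilde\theta; Y) = \begin{bmatrix} \frac{1}{\tilde\sigma^2_{ML}}X'X & 0 & 0 & 0 \\ 0 & \frac{n}{2\tilde\sigma^4_{ML}} & 0 & \frac{3n}{2\tilde\sigma^2_{ML}} \\ 0 & 0 & \frac{2n}{3\tilde\sigma^2_{ML}} & 0 \\ 0 & \frac{3n}{2\tilde\sigma^2_{ML}} & 0 & 6n \end{bmatrix}.$$

Now we obtain for Rao's Score statistic $T_{RS} = S'(\tilde\theta; Y)\,I^{-1}(\tilde\theta; Y)\,S(\tilde\theta; Y)$ with the score vector $S(\tilde\theta; Y) = \left[\,0'\ \ 0\ \ -\frac{n\tilde\mu_3}{3\tilde\sigma^4_{ML}}\ \ \frac{n\tilde\mu_4}{4\tilde\sigma^4_{ML}} - \frac{3n}{4}\,\right]'$. If we define the subvectors

$$S_1 := \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad S_2 := \begin{bmatrix} -\frac{n\tilde\mu_3}{3\tilde\sigma^4_{ML}} \\ \frac{n\tilde\mu_4}{4\tilde\sigma^4_{ML}} - \frac{3n}{4} \end{bmatrix},$$

the submatrices

$$I_{11} := \begin{bmatrix} \frac{1}{\tilde\sigma^2_{ML}}X'X & 0 \\ 0 & \frac{n}{2\tilde\sigma^4_{ML}} \end{bmatrix}, \qquad I_{12} = I_{21}' := \begin{bmatrix} 0 & 0 \\ 0 & \frac{3n}{2\tilde\sigma^2_{ML}} \end{bmatrix}, \qquad I_{22} := \begin{bmatrix} \frac{2n}{3\tilde\sigma^2_{ML}} & 0 \\ 0 & 6n \end{bmatrix},$$

and $I^{11}$, $I^{12}$, $I^{21}$, $I^{22}$ as the blocks of the total inverse $I^{-1}(\tilde\theta; Y)$ obtained from the Schur complements, then $T_{RS}$ follows to be

$$T_{RS} = S_1'I^{11}S_1 + S_1'I^{12}S_2 + S_2'I^{21}S_1 + S_2'I^{22}S_2 = S_2'I^{22}S_2.$$

With

$$\begin{aligned} I^{22} &= \left(I_{22} - I_{21}I_{11}^{-1}I_{12}\right)^{-1} = \left(\begin{bmatrix} \frac{2n}{3\tilde\sigma^2_{ML}} & 0 \\ 0 & 6n \end{bmatrix} - \begin{bmatrix} 0 & 0 \\ 0 & \frac{3n}{2\tilde\sigma^2_{ML}} \end{bmatrix}\begin{bmatrix} \tilde\sigma^2_{ML}(X'X)^{-1} & 0 \\ 0 & \frac{2\tilde\sigma^4_{ML}}{n} \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 0 & \frac{3n}{2\tilde\sigma^2_{ML}} \end{bmatrix}\right)^{-1} \\ &= \left(\begin{bmatrix} \frac{2n}{3\tilde\sigma^2_{ML}} & 0 \\ 0 & 6n \end{bmatrix} - \begin{bmatrix} 0 & 0 \\ 0 & \frac{9n}{2} \end{bmatrix}\right)^{-1} = \begin{bmatrix} \frac{3\tilde\sigma^2_{ML}}{2n} & 0 \\ 0 & \frac{2}{3n} \end{bmatrix}, \end{aligned}$$

which is a part of the result given in Proposition 3 by Bera and Jarque (1982), we obtain

$$\begin{aligned} T_{RS} &= \begin{bmatrix} -\frac{n\tilde\mu_3}{3\tilde\sigma^4_{ML}} \\ \frac{n\tilde\mu_4}{4\tilde\sigma^4_{ML}} - \frac{3n}{4} \end{bmatrix}'\begin{bmatrix} \frac{3\tilde\sigma^2_{ML}}{2n} & 0 \\ 0 & \frac{2}{3n} \end{bmatrix}\begin{bmatrix} -\frac{n\tilde\mu_3}{3\tilde\sigma^4_{ML}} \\ \frac{n\tilde\mu_4}{4\tilde\sigma^4_{ML}} - \frac{3n}{4} \end{bmatrix} = \frac{n\tilde\mu_3^2}{6\tilde\sigma^6_{ML}} + \frac{n\tilde\mu_4^2}{24\tilde\sigma^8_{ML}} - \frac{n\tilde\mu_4}{4\tilde\sigma^4_{ML}} + \frac{3n}{8} \\ &= \frac{n}{6}\,\frac{\tilde\mu_3^2}{\tilde\sigma^6_{ML}} + \frac{n}{24}\left(\frac{\tilde\mu_4}{\tilde\sigma^4_{ML}} - 3\right)^2. \end{aligned}$$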
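The entries of the quoted information matrix and the resulting inverse block can be cross-checked with exact arithmetic. The sketch below does not reproduce the derivation of Bera and Jarque; it relies on the assumption that, under the usual regularity conditions, the information per observation equals the covariance matrix of the score contributions $U^2/(2\sigma^4) - 1/(2\sigma^2)$, $U/\sigma^2 - U^3/(3\sigma^4)$, and $U^4/(4\sigma^4) - 3/4$ for normally distributed $U$. The $\beta$ block is left out because it decouples from the other parameters, and the value $\sigma^2 = 2$ as well as all names are chosen for illustration.

```python
from fractions import Fraction


def normal_moment(k, sigma2):
    """E{U^k} for U ~ N(0, sigma2): zero for odd k, (k-1)!! * sigma2^(k/2) for even k."""
    if k % 2 == 1:
        return Fraction(0)
    value = Fraction(1)
    for odd in range(1, k, 2):
        value *= odd
    return value * Fraction(sigma2) ** (k // 2)


def expected_product(p, q, sigma2):
    """E{p(U) q(U)} for polynomials stored as {power: coefficient} dictionaries."""
    return sum(a * b * normal_moment(i + j, sigma2)
               for i, a in p.items() for j, b in q.items())


sigma2 = Fraction(2)  # hypothetical variance used for the check
scores = [
    {2: 1 / (2 * sigma2 ** 2), 0: -1 / (2 * sigma2)},  # contribution to S_sigma^2
    {1: 1 / sigma2, 3: -1 / (3 * sigma2 ** 2)},        # contribution to S_c1
    {4: 1 / (4 * sigma2 ** 2), 0: Fraction(-3, 4)},    # contribution to S_c2
]

# Per-observation information as covariance of the zero-mean score contributions.
info = [[expected_product(p, q, sigma2) for q in scores] for p in scores]

# Schur complement of the (c1, c2) block after eliminating sigma^2.
schur = [[info[a][b] - info[a][0] * info[0][b] / info[0][0] for b in (1, 2)]
         for a in (1, 2)]
print(info)
print(schur)
```

Multiplying the per-observation values by $n$ gives the matrix entries, and inverting the diagonal Schur complement gives the block $I^{22}$ used in the statistic.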

Observe now that the definition of $\tilde\sigma^2_{ML}$ is identical to that of the empirical second central moment $\tilde\mu_2$. Then, if we define the empirical skewness

$$\tilde\gamma_1 = \frac{\tilde\mu_3}{\tilde\mu_2^{3/2}}$$

and the empirical kurtosis

$$\tilde\gamma_2 = \frac{\tilde\mu_4}{\tilde\mu_2^2} - 3,$$

which depend on the estimated residuals of the Gauss-Markov model through the empirical moments $\tilde\mu_j = \sum_{i=1}^n \tilde U_i^j/n$ ($j = 2,\dots,4$), then Rao's Score statistic takes its final form

$$T_{RS} = n\left(\frac{\tilde\gamma_1^2}{6} + \frac{\tilde\gamma_2^2}{24}\right).$$

Evidently, this statistic measures the absolute deviations of the data's skewness and kurtosis from the values 0, and thus compares how far the distribution of the estimated residuals differs from a normal distribution. This test of normality is also called the Jarque-Bera test (Bera and Jarque, 1982).

Example 4.3: Testing the Gravity Dataset for non-normality. In an example of Section 3 we considered a two-dimensional polynomial model with additional mean shift parameters. To check whether the errors $U$ in the model $Y = X\beta + Z\nabla + U$ follow a normal distribution, we first compute the residuals $\tilde u = y - X\tilde\beta - Z\tilde\nabla$ based on the least squares estimates

$$\begin{bmatrix} \tilde\beta \\ \tilde\nabla \end{bmatrix} = \begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix}^{-1}\begin{bmatrix} X'y \\ Z'y \end{bmatrix}.$$

Contrary to that earlier example, we use tildes instead of hats on top of the estimates, because here the mean shift parameters, which were proven to be significant, naturally belong to the functional model. The tildes indicate that the estimates have been determined under the restrictions $c_1 = c_2 = 0$. From these residuals we then obtain $\tilde\gamma_1 = .5$ for the empirical skewness and $\tilde\gamma_2 = .9$ for the kurtosis measure. With these values, Rao's Score statistic becomes $T_{RS} = .8$, which is insignificant in light of the critical value $k^{\chi^2(2)}_{0.95} = 5.99$. Therefore, we may assume the errors to be normally distributed, which is also roughly reflected by the following histogram plot.

Figure: Histogram of the estimated residuals.
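The final form of the statistic is straightforward to evaluate from a residual vector. The following plain-Python sketch computes the empirical skewness, the empirical kurtosis, and $T_{RS} = n(\tilde\gamma_1^2/6 + \tilde\gamma_2^2/24)$. As in the text, the moments are taken about zero, which presumes residuals from a model with an intercept. The function name and the small residual vector are hypothetical; 5.991 is the 0.95 quantile of the $\chi^2(2)$-distribution.

```python
def jarque_bera(residuals):
    """Return (T_RS, skewness, kurtosis) with kurtosis measured as mu4 / mu2^2 - 3."""
    n = len(residuals)
    mu2 = sum(u ** 2 for u in residuals) / n
    mu3 = sum(u ** 3 for u in residuals) / n
    mu4 = sum(u ** 4 for u in residuals) / n
    skewness = mu3 / mu2 ** 1.5
    kurtosis = mu4 / mu2 ** 2 - 3.0
    statistic = n * (skewness ** 2 / 6.0 + kurtosis ** 2 / 24.0)
    return statistic, skewness, kurtosis


# Hypothetical residuals, symmetric about zero.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
t_rs, skew, kurt = jarque_bera(residuals)
print(t_rs, skew, kurt, t_rs > 5.991)
```

For such a short vector the asymptotic $\chi^2(2)$ approximation is of course not reliable; the example only illustrates the arithmetic.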

5 Conclusion and Outlook

In the framework of the Gauss-Markov model with normally distributed errors, uniformly most powerful invariant tests generally exist. These tests have three equivalent formulations: (1) the form obtained from a direct application of invariance principles, (2) the likelihood ratio test, and (3) Rao's score test. Of the three, Rao's score test is easiest to compute for problems where significance testing is required. If the testing problem involves unknown parameters within the weight matrix, or if the errors do not follow a normal distribution, then no uniformly most powerful invariant tests exist. In these cases too, Rao's score test offers an attractive method that is both powerful and computationally convenient.

This thesis has demonstrated that hypothesis testing by applying Rao's score method is an effective, and in many cases optimal, approach for resolving a wide range of problems faced in geodetic model analysis. New satellite missions such as GOCE (Gravity Field and steady-state Ocean Circulation Explorer) require powerful and computationally feasible tests for diagnosing functional and stochastic models that are far more complex than the models considered in this thesis (cf. Lackner, 2006). To find convincing solutions to these challenges, it will be necessary for geodesists to further elaborate their understanding of statistical testing theory.

Looking at the methodology currently offered by mathematical statistics, some directions of further research are particularly promising. Rao's score approach can be applied to a full range of testing problem fields such as deformation analysis, time series analysis, or geostatistics, applications that have not yet been explored in modern geodetic literature. It is crucial that geodesists develop a stronger expertise in the asymptotic behaviour of statistical theories, such as given in Lehmann and Romano (2005, Part II).
This will be a necessary step towards assessing the quality of geodetic hypothesis tests, such as those presented in Section 4, for which no strict optimality criteria are applicable.

The scope of the theory presented in this thesis is restricted to a specific minimization problem regarding Type I and Type II error probabilities within the class of invariant tests. However, minimizing error probability does not correspond to a minimization of costs when one considers losses in work time, computational time, or even accuracy of estimated parameters. To overcome this limitation, hypothesis tests could be derived within the framework of decision theory, by minimizing a loss function which represents the expected loss/cost due to an erroneous test decision (cf. Lehmann and Romano, 2005, p. 59).

Finally, it is often argued that classical testing theory is too limited in that the test decision is always made on the premise of a true null hypothesis, and that a priori information with respect to the unknown parameters may not be used (see, for instance, Jaynes, 2003, Chapter 6). A theory which does allow the treatment of the null and the alternative hypothesis on equal grounds and the incorporation of a priori information is offered by Bayes statistics. Bayesian tests may be viewed as generalizations of likelihood ratio tests in that the likelihood ratio is extended by an a priori density with respect to the unknown parameters, which are treated as random variables (cf. Koch, 2000, Section 3.4). It would be highly instructive to formulate the model misspecification tests developed in this thesis within the Bayesian framework and to compare them in terms of testing power, applicability to a wide range of problems, and computational convenience.

6 Appendix: Datasets

6.1 Dam Dataset

The numerical values of the observation model $y = X\beta + u$, $\Sigma\{U\} = \sigma^2 I$, for the Dam dataset used in Application 3 of Sect. 3.5 are given by a system of observation equations of the form $y = X\,[H_i] + [u_i]$, with the heights $H_i$ as unknown parameters and the errors $u_i$.

6.2 Gravity Dataset

Table columns: Index $i$; Anomaly $dg$; Latitude [min]; Longitude [min].

References

Arnold, S.F. 1981. The theory of linear models and multivariate analysis. Wiley, New York.
Arnold, S.F. Sufficiency and invariance. Statistics & Probability Letters 3:
Arnold, S.F. 1990. Mathematical statistics. Prentice Hall, Englewood Cliffs, New Jersey.
Baarda, W. Statistical concepts in geodesy. Publications on Geodesy New Series, Vol., Number 4, Netherlands Geodetic Commission, Delft.
Baarda, W. A testing procedure for use in geodetic networks. Publications on Geodesy New Series, Vol., Number 5, Netherlands Geodetic Commission, Delft.
Bera, A.K. and C.M. Jarque 1982. Model specification tests. A simultaneous approach. Journal of Econometrics, :59-8.
Birkes, D. 1990. Generalized likelihood ratio tests and uniformly most powerful tests. The American Statistician, 44:
Bronstein, I.N. and K.A. Semendjajew 99. Taschenbuch der Mathematik. Teubner, Stuttgart.
Casella, G. and R.L. Berger 2002. Statistical inference. Second edition. Duxbury, Pacific Grove, California.
Cox, D.R. and D.V. Hinkley 1974. Theoretical statistics. Chapman and Hall, London.
Dudewicz, E.J. and S. Mishra 1988. Modern mathematical statistics. Wiley, New York.
Engle, R.F. Wald, likelihood ratio, and Lagrange multiplier tests in econometrics. Z. Griliches and M.D. Intriligator eds.: Handbook of Econometrics, Vol. :
Godfrey, L.G. Misspecification tests in econometrics. Cambridge University Press, Cambridge.
Hamilton, J.D. Time series analysis. Princeton University Press, Princeton.
Jaeger, R. et al. 2005. Klassische und robuste Ausgleichungsverfahren. Wichmann, Heidelberg.
Jaynes, E.T. 2003. Probability theory. The logic of science. Cambridge University Press, Cambridge.
Johnson, N.L. and S. Kotz 1970. Distributions in statistics: continuous univariate distributions, Vol.. Wiley, New York.
Johnson, N.L. and S. Kotz 1970. Distributions in statistics: continuous univariate distributions, Vol.. Wiley, New York.
Koch, K.-R. 1981. Varianz- und Kovarianzkomponentenschätzung für Streckenmessungen auf Eichlinien. Allgemeine Vermessungs-Nachrichten, 88:5-3.
Koch, K.-R. Parameter estimation and hypothesis testing in linear models. Springer, Heidelberg.
Koch, K.-R. 2000. Einführung in die Bayes-Statistik. Springer, Heidelberg.
Krämer, W. and H. Sonnberger 1986. The linear regression model under test. Physica, Heidelberg.
Kreyszig, E. Statistische Methoden und ihre Anwendungen. Vandenhoeck and Ruprecht, Göttingen.
Lackner, B. 2006. Datainspection and hypothesis tests of very long time series applied to GOCE satellite gravity gradiometry data. Dissertation, Graz University of Technology.
Lehmann, E.L. 1959a. Testing statistical hypotheses. First edition. Wiley, New York.
Lehmann, E.L. 1959b. Optimum invariant tests. Annals of Mathematical Statistics, 3:
Lehmann, E.L. and J.P. Romano 2005. Testing statistical hypotheses. Third edition. Springer, New York.
Meissl, P. 1982. Least squares adjustment - a modern approach. Mitteilungen der geodätischen Institute der Technischen Universität Graz, Vol. 43.
Meyer, C.D. 2000. Matrix analysis and applied linear algebra. SIAM.
Neyman, J. and E.S. Pearson 1928. On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika, A:75-4,
Neyman, J. and E.S. Pearson 1933. On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London, Series A, 3:
Olive, D.J. 2006. A course in statistical theory. Online document, Department of Mathematics, Southern Illinois University.
Peracchi, F. 2001. Econometrics. Wiley, New York.
Pope, A.J. The statistics of residuals and the detection of outliers. NOAA Technical Report NOS65 NGS, US Department of Commerce, National Geodetic Survey, Rockville, Maryland.
Rao, C.R. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proceedings of Cambridge Philosophical Society, 44:5-57.
Rao, C.R. Linear statistical inference and its applications. Wiley, New York.
Rousseeuw, P.J. and A.M. Leroy 2003. Robust regression and outlier detection. Wiley, New York.
Schuh, W.-D. Tailored numerical strategies for the global determination of the Earth's gravity field. Mitteilungen der geodätischen Institute der Technischen Universität Graz, Vol. 8.
Schuh, W.-D. 2006. Ausgleichungsrechnung und Statistik III. Lecture notes, Institute of Geodesy and Geoinformation, University of Bonn.
Schuh, W.-D. 2006. Seminar Robuste Parameterschätzung. Lecture notes, Institute of Geodesy and Geoinformation, University of Bonn.
Silvey, S.D. The Lagrangian multiplier test. Annals of Mathematical Statistics, 3:
Stuart, A., Ord, J.K., and S. Arnold 1999. The advanced theory of statistics, Vol. A: Classical inference and the linear model. Arnold, London.
Stuart, A. and J.K. Ord 2003. The advanced theory of statistics, Vol. : Distribution theory. Arnold, London.
Teunissen, P.J.G. 2000. Testing theory. Delft University Press, Delft.

Acknowledgment

This thesis is the result of my studies in the field of adjustment theory and statistics at the Institute of Geodesy and Geoinformation at the University of Bonn. This opportunity was granted to me by Prof. Dr. techn. W.-D. Schuh, whose supervision was the perfect mix of non-restrictive guidance and open-minded dialogue. I am deeply indebted to him for the time and energy he invested into supporting and discussing my ideas. I also want to thank Prof. Dr. rer. nat. H.-P. Helfrich for serving as second referee. Further, I acknowledge the support by the BMBF Geotechnologien programmes (Grants 3F39C and 3F4B). My very special thanks go to Jeramy Flora. Without her love, patience, and ideas throughout the years, this thesis and my own nervous system would not exist.

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

5: Introduction to Estimation

5: Introduction to Estimation 5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Inference on Proportion. Chapter 8 Tests of Statistical Hypotheses. Sampling Distribution of Sample Proportion. Confidence Interval

Inference on Proportion. Chapter 8 Tests of Statistical Hypotheses. Sampling Distribution of Sample Proportion. Confidence Interval Chapter 8 Tests of Statistical Hypotheses 8. Tests about Proportios HT - Iferece o Proportio Parameter: Populatio Proportio p (or π) (Percetage of people has o health isurace) x Statistic: Sample Proportio

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13 EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria

More information

A Mathematical Perspective on Gambling

A Mathematical Perspective on Gambling A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

MARTINGALES AND A BASIC APPLICATION

MARTINGALES AND A BASIC APPLICATION MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measure-theoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this

More information

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

Lesson 17 Pearson s Correlation Coefficient

Lesson 17 Pearson s Correlation Coefficient Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) -types of data -scatter plots -measure of directio -measure of stregth Computatio -covariatio of X ad Y -uique variatio i X ad Y -measurig

More information

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chi-square (χ ) distributio.

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10 FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10 [C] Commuicatio Measuremet A1. Solve problems that ivolve liear measuremet, usig: SI ad imperial uits of measure estimatio strategies measuremet strategies.

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

More information

Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study. Normal and t Distributions. Density Plot. Normal Distributions Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

1. C. The formula for the confidence interval for a population mean is: x t, which was

1. C. The formula for the confidence interval for a population mean is: x t, which was s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value

More information

Lesson 15 ANOVA (analysis of variance)

Lesson 15 ANOVA (analysis of variance) Outlie Variability -betwee group variability -withi group variability -total variability -F-ratio Computatio -sums of squares (betwee/withi/total -degrees of freedom (betwee/withi/total -mea square (betwee/withi

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich [email protected] [email protected] Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals Overview Estimatig the Value of a Parameter Usig Cofidece Itervals We apply the results about the sample mea the problem of estimatio Estimatio is the process of usig sample data estimate the value of

More information

5 Boolean Decision Trees (February 11)

5 Boolean Decision Trees (February 11) 5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics

More information

Class Meeting # 16: The Fourier Transform on R n

Class Meeting # 16: The Fourier Transform on R n MATH 18.152 COUSE NOTES - CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown Z-TEST / Z-STATISTIC: used to test hypotheses about µ whe the populatio stadard deviatio is kow ad populatio distributio is ormal or sample size is large T-TEST / T-STATISTIC: used to test hypotheses about

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

THE HEIGHT OF q-binary SEARCH TREES

THE HEIGHT OF q-binary SEARCH TREES THE HEIGHT OF q-binary SEARCH TREES MICHAEL DRMOTA AND HELMUT PRODINGER Abstract. q biary search trees are obtaied from words, equipped with the geometric distributio istead of permutatios. The average

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

Notes on exponential generating functions and structures.

Notes on exponential generating functions and structures. Notes o expoetial geeratig fuctios ad structures. 1. The cocept of a structure. Cosider the followig coutig problems: (1) to fid for each the umber of partitios of a -elemet set, (2) to fid for each the

More information

Determining the sample size

Determining the sample size Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors

More information

One-sample test of proportions

One-sample test of proportions Oe-sample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

Hypergeometric Distributions

Hypergeometric Distributions 7.4 Hypergeometric Distributios Whe choosig the startig lie-up for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you

More information

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here). BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly

More information

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction THE ARITHMETIC OF INTEGERS - multiplicatio, expoetiatio, divisio, additio, ad subtractio What to do ad what ot to do. THE INTEGERS Recall that a iteger is oe of the whole umbers, which may be either positive,

More information

Modified Line Search Method for Global Optimization

Modified Line Search Method for Global Optimization Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

More information

Chapter 7: Confidence Interval and Sample Size

Chapter 7: Confidence Interval and Sample Size Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum

More information

Universal coding for classes of sources

Universal coding for classes of sources Coexios module: m46228 Uiversal codig for classes of sources Dever Greee This work is produced by The Coexios Project ad licesed uder the Creative Commos Attributio Licese We have discussed several parametric

More information

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

THE TWO-VARIABLE LINEAR REGRESSION MODEL

THE TWO-VARIABLE LINEAR REGRESSION MODEL THE TWO-VARIABLE LINEAR REGRESSION MODEL Herma J. Bieres Pesylvaia State Uiversity April 30, 202. Itroductio Suppose you are a ecoomics or busiess maor i a college close to the beach i the souther part

More information

Chapter 14 Nonparametric Statistics

Chapter 14 Nonparametric Statistics Chapter 14 Noparametric Statistics A.K.A. distributio-free statistics! Does ot deped o the populatio fittig ay particular type of distributio (e.g, ormal). Sice these methods make fewer assumptios, they

More information

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork Solutios to Selected Problems I: Patter Classificatio by Duda, Hart, Stork Joh L. Weatherwax February 4, 008 Problem Solutios Chapter Bayesia Decisio Theory Problem radomized rules Part a: Let Rx be the

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

Entropy of bi-capacities

Entropy of bi-capacities Etropy of bi-capacities Iva Kojadiovic LINA CNRS FRE 2729 Site école polytechique de l uiv. de Nates Rue Christia Pauc 44306 Nates, Frace [email protected] Jea-Luc Marichal Applied Mathematics

More information

A Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as:

A Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as: A Test of Normality Textbook Referece: Chapter. (eighth editio, pages 59 ; seveth editio, pages 6 6). The calculatio of p values for hypothesis testig typically is based o the assumptio that the populatio

More information
