Maura, Department of Economics and Finance, Università Tor Vergata
Inverse distribution function
Theorem: Let U be a uniform random variable on (0, 1) and let X be a continuous random variable with cumulative distribution function (cdf) F(x). Define Y = F^{-1}(U). Then Y has cdf equal to F.
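A minimal sketch of how this theorem is used in practice (inverse transform sampling); the exponential cdf and its closed-form inverse are chosen here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: Exponential(rate=lam), with cdf F(x) = 1 - exp(-lam * x)
# and inverse F^{-1}(u) = -log(1 - u) / lam.
lam = 2.0
u = rng.uniform(0.0, 1.0, size=100_000)   # U ~ Uniform(0, 1)
y = -np.log(1.0 - u) / lam                # Y = F^{-1}(U)

# Sanity check: the sample mean should be close to 1/lam = 0.5.
print(y.mean())
```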
Why nonparametric statistics? While in many situations parametric assumptions are reasonable (e.g. the assumption of a Normal distribution for background noise), we often have no prior knowledge of the underlying distributions. In such situations, the use of parametric statistics can give misleading or even wrong results. We need statistical procedures that are insensitive to the model assumptions, in the sense that the procedures retain their properties in a neighborhood of the assumed model.
What is nonparametric inference? The basic idea of nonparametric inference is to use data to infer an unknown quantity while making as few assumptions as possible. Usually, this means using statistical models that are infinite-dimensional; indeed, a better name for nonparametric inference might be infinite-dimensional inference. It is difficult, however, to give a precise definition of nonparametric inference. For the purposes of this course, we will use the phrase nonparametric inference to refer to a set of modern statistical methods that aim to keep the underlying assumptions as weak as possible.
What is the advantage of nonparametric statistics? The rapid and continuous development of nonparametric statistical procedures over the past six decades is due to the following advantages enjoyed by nonparametric techniques:
- They require few assumptions about the underlying populations from which the data are obtained.
- They enable the user to obtain exact p-values for tests, exact coverage probabilities for confidence regions, and exact experimentwise error rates for multiple comparison procedures.
- They are (often) easy to understand.
- They are usually only slightly less efficient than their normal-theory competitors when the underlying populations are normal, and they can be mildly or wildly more efficient than these competitors when the underlying populations are not normal.
- They are insensitive to outliers.
What is the advantage of nonparametric statistics? Because many nonparametric procedures require only the ranks of the observations, rather than their actual magnitudes, they are applicable in many situations where normal-theory procedures cannot be used.
The empirical distribution function
We will begin with the problem of estimating a CDF (cumulative distribution function). Suppose X ~ F, where F(x) = P(X ≤ x) is a distribution function. The empirical distribution function \hat{F} is the CDF that puts mass 1/n at each data point x_i:
\hat{F}(x) = \frac{1}{n} \sum_{i=1}^{n} I(x_i \le x)
where I is the indicator function.
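A minimal sketch of this definition in code (NumPy assumed; the sample values are illustrative):

```python
import numpy as np

def ecdf(data, x):
    """Empirical cdf: fraction of observations <= x."""
    data = np.asarray(data)
    return np.mean(data <= x)

sample = np.array([0.3, 1.2, 0.7, 2.5, 1.9])
print(ecdf(sample, 1.0))  # 2 of 5 observations are <= 1.0 -> 0.4
```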
Properties of \hat{F}
At any fixed value of x:
E(\hat{F}(x)) = F(x)
Var(\hat{F}(x)) = \frac{1}{n} F(x)(1 - F(x))
Note that these two facts imply that \hat{F}(x) \to F(x) in probability. An even stronger form of convergence is given by the Glivenko-Cantelli theorem:
\sup_x |\hat{F}(x) - F(x)| \to 0 almost surely.
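A quick simulation consistent with the Glivenko-Cantelli theorem (a sketch, assuming a standard normal true F; scipy supplies its cdf): the sup distance between \hat{F} and F shrinks as n grows.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

for n in (100, 1_000, 10_000):
    x = np.sort(rng.normal(size=n))
    # Evaluating at the order statistics suffices: the maximal deviation
    # occurs at a jump of the step function F_hat.
    fhat_right = np.arange(1, n + 1) / n   # F_hat at x_(i), from the right
    fhat_left = np.arange(0, n) / n        # F_hat just below x_(i)
    d = np.max(np.maximum(fhat_right - norm.cdf(x), norm.cdf(x) - fhat_left))
    print(n, d)
```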
Nonparametric test
In order to employ the test proposed below, we make the supplementary (but mild) assumption that F is continuous. The hypothesis to be tested here is H_0 : F(x) = F_0(x), a given continuous d.f., against the alternative H_1 : F(x) ≠ F_0(x) (in the sense that F(x) ≠ F_0(x) for at least one x). Define the random variable D_n as
D_n = \sup_x |\hat{F}(x) - F_0(x)|.
Kolmogorov test
Idea: if the difference between the empirical and the theoretical distribution functions is severe, the null hypothesis H_0 is rejected.
Statistic: the probability distribution of D_n is not one of the well-known models. Its probabilities are given in a specific table for small n, while an asymptotic result is applied for large n.
Rule: critical region of the form D_n ≥ k.
Kolmogorov one-sample test
For this determination to be possible, we would have to know the distribution of D_n under H_0, or of some known multiple of it. It has been shown in the literature that
P(\sqrt{n} D_n \le x \mid H_0) \xrightarrow[n \to \infty]{} \sum_{j=-\infty}^{\infty} (-1)^j e^{-2 j^2 x^2}, \quad x > 0.
Thus for large n, the right-hand side of this equation may be used for the purpose of determining the critical region. The test employed above is known as the Kolmogorov one-sample test.
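A sketch combining the pieces above (assumptions: simulated data, a standard normal null F_0, and the series truncated at |j| ≤ 100): compute D_n by hand, approximate the p-value from the asymptotic series, and compare with scipy.stats.kstest.

```python
import numpy as np
from scipy.stats import norm, kstest

rng = np.random.default_rng(2)
data = rng.normal(size=200)

# Exact D_n against the hypothesized cdf F_0 = standard normal.
x = np.sort(data)
n = len(x)
f0 = norm.cdf(x)
d_n = np.max(np.maximum(np.arange(1, n + 1) / n - f0,
                        f0 - np.arange(0, n) / n))

# Asymptotic p-value: 1 - sum_j (-1)^j exp(-2 j^2 t^2) at t = sqrt(n) D_n,
# truncating the series at |j| <= 100 (ample for convergence).
t = np.sqrt(n) * d_n
j = np.arange(-100, 101)
p_asym = 1.0 - np.sum((-1.0) ** j * np.exp(-2.0 * j**2 * t**2))

print(d_n, p_asym)
print(kstest(data, "norm"))   # library version, for comparison
```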
Kolmogorov-Smirnov two-sample test
The testing problem just described is of limited practical importance. What arises naturally in practice are problems of the following type: let X_i, i = 1, ..., m, be i.i.d. r.v.'s with continuous but unknown d.f. F, and let Y_j, j = 1, ..., n, be i.i.d. r.v.'s with continuous but unknown d.f. G. The two random samples are assumed to be independent, and the hypothesis of interest here is H_0 : F = G. One possible alternative is the following: H_1 : F ≠ G (in the sense that F(x) ≠ G(x) for at least one x ∈ R).
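A minimal sketch of the two-sample test via scipy.stats.ks_2samp (the shifted-normal samples are illustrative, chosen so that H_0 is false):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(3)
x = rng.normal(loc=0.0, size=100)   # sample from F
y = rng.normal(loc=0.5, size=120)   # sample from G (shifted)

# Test statistic: sup_x |F_hat_m(x) - G_hat_n(x)|, with its p-value.
stat, pvalue = ks_2samp(x, y)
print(stat, pvalue)
```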
Robustness
Any statistical procedure should possess the following desirable features:
- It has reasonably high efficiency under the assumed model.
- It is robust, in the sense that small deviations from the assumed model should impair the performance only slightly.
- Somewhat larger deviations from the model should not cause a catastrophe.
Robustness
In addition to the classical concept of efficiency, new concepts are introduced to describe the local stability of a statistical procedure (the influence function and derived quantities) and its global reliability or safety (the breakdown point).
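A tiny numerical illustration of the breakdown idea (an assumed example, not from the slides): a single gross outlier can drag the sample mean arbitrarily far, while the sample median barely moves.

```python
import numpy as np

clean = np.array([9.8, 10.1, 10.0, 9.9, 10.2])
contaminated = np.append(clean, 1_000.0)   # one gross outlier

print(np.mean(clean), np.median(clean))                # 10.0, 10.0
print(np.mean(contaminated), np.median(contaminated))  # mean explodes; median ~10.05
```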
Sample median
Let x_(1), x_(2), ..., x_(n) denote a sample arranged in ascending order.
Definition. The (sample or empirical) median, denoted by Me, is given by
Me = \begin{cases} x_{((n+1)/2)} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right) & \text{if } n \text{ is even} \end{cases}
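A direct sketch of this definition (pure Python; the test values are illustrative):

```python
def sample_median(xs):
    """Median per the definition: the middle order statistic (odd n),
    or the average of the two middle order statistics (even n)."""
    s = sorted(xs)
    n = len(s)
    if n % 2 == 1:
        return s[(n + 1) // 2 - 1]          # x_((n+1)/2), 0-based index
    return (s[n // 2 - 1] + s[n // 2]) / 2  # (x_(n/2) + x_(n/2+1)) / 2

print(sample_median([3, 1, 2]))      # 2
print(sample_median([4, 1, 3, 2]))   # 2.5
```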