Exact Nonparametric Tests for Comparing Means - A Personal Summary

Karl H. Schlag
European University Institute
December 14

Economics Department, European University Institute, Via della Piazzuola 43, Florence, Italy

Abstract

We present an overview of exact nonparametric tests for comparing means of two random variables, based on two independent i.i.d. samples, when outcomes are known to be contained in some given bounded interval.

1 The Setting

A statistical inference problem is parametric if the data generating process is known up to a finite dimensional parameter space. It is nonparametric if it is not parametric (Fraser, 1957). A statistical inference problem is distribution-free if there are no assumptions on the distribution underlying the data beyond what is known; e.g., often one has some information about the possible support. So the Fisher exact test for comparing means of two binomial distributions is uniformly most powerful for a parametric and distribution-free statistical inference problem: it is known that observations are only 0 or 1, so the restriction to binomial distributions is implied by what is known and is not an assumption on the distribution. The null and alternative hypotheses used to prove that a permutation test is most powerful (see below) make the associated statistical inference problem nonparametric but not distribution-free.

A test is exact if its properties are derived for finite samples and not only for a sufficiently large sample size. Often "properties" refers to the size of the test (or an upper bound on its size). This means that there is a formula for the value of the size of the test, or a formula for an upper bound on the size of the test; "exact" simply means that these formulas are correct for any given sample size.

Assume that there are two underlying random variables $Y_1$ and $Y_2$ with unknown means $EY_1$ and $EY_2$, and two separate samples $(y_k^1)_{k=1}^{n}$ and $(y_k^2)_{k=1}^{n}$ of independent realizations of $Y_1$ and $Y_2$ respectively. There is an interval $[\ell, \omega]$ that is known to have the property that $Y_1, Y_2 \in [\ell, \omega]$. So the joint distribution $P$ is an element of $\Delta[\ell, \omega]^2$; let $P_1$ and $P_2$ denote the respective marginals. The data generating process can be identified with $P$. As we allow for any $P \in \Delta[\ell, \omega]^2$, our setting is nonparametric.
2 One-Sided Test

Consider the objective to choose an exact test of the null hypothesis $H_0: EY_1 \le EY_2$ (or $H_0: EY_1 = EY_2$) against the alternative hypothesis $H_1: EY_1 > EY_2$.

2.1 UMP

Note that there is no uniformly most powerful (UMP) test for our hypotheses. Nor is there a uniformly most powerful unbiased test. However there are most powerful tests and

other exact tests.

2.2 T Test

The two sample t-test is not an exact test in our setting, as it assumes that the underlying random variables are normally distributed. However neither $Y_1$ nor $Y_2$ can be normally distributed, as the ranges of $Y_1$ and $Y_2$ are bounded. Without knowing more about the random variables one cannot argue that the t-test can be applied if the sample is sufficiently large. It is only pointwise asymptotically level $\alpha$; i.e., for given $P$ and given $\varepsilon$ there is some $N(P, \varepsilon)$ such that if $n > N(P, \varepsilon)$ then the size of the test is within $\varepsilon$ of its nominal level. However, the notion of "sufficiently large" depends on $P$, and it turns out that $\{N(P, \varepsilon) : P \in \Delta[\ell, \omega]^2\}$ is an unbounded set. The two sample t-test is not uniformly asymptotically level: in fact, its size for any given sample size $n$ is equal to 1. For the argument one can easily adapt the constructive proof of Lehmann and Loh (1990), who show this statement for testing the mean of a single random variable. Hence the existence of a bound $N(P, \varepsilon)$ is useless. In simple words, it is the difference between pointwise and uniform convergence that makes the t-test inapplicable even if the sample is very large. The example that can be used to prove the above statement is a simple variation of the one in Lehmann and Loh (1990).
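This construction can be checked numerically. The sketch below (values of $\eta$ and $n$ are illustrative, and the function name is my own) computes the probability that every draw of $Y_2$ equals its upper mass point $\eta > 1/2$, where $Y_2$ puts mass $p = 1/(2\eta)$ on $\eta$ and the rest on $0$ so that $EY_2 = 1/2$; on that event both sample variances are zero and the sample means differ, so the t-test of equal means rejects.

```python
def prob_t_test_rejects(eta, n):
    """Probability that all n draws of Y2 equal eta, where Y2 puts mass
    p = 1/(2*eta) on eta and mass 1-p on 0, so that E[Y2] = 1/2 exactly.
    On this event both sample variances are 0 while the sample means
    differ, so the t-test of equal means rejects even though EY1 = EY2."""
    p = 1.0 / (2.0 * eta)
    return p ** n

n = 30
for eta in (0.6, 0.51, 0.5001):
    print(f"eta={eta}: wrongful rejection probability at least "
          f"{prob_t_test_rejects(eta, n):.4f}")
```

As $\eta \downarrow 1/2$ this probability tends to 1 for any fixed $n$, which is exactly the failure of uniform convergence described above.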
Assume that $Y_1 = 1/2$ almost surely and that $Y_2$ has two mass points, one at $\eta$ and the other at $0$, where $\eta$ is strictly larger than but very close to $1/2$ and the weights are such that $EY_2 = 1/2$. Notice that the closer $\eta$ is to $1/2$, the more mass $Y_2$ puts on $\eta$, as $EY_2 = 1/2$. Provided $\eta$ is sufficiently close to $1/2$, with large probability each of the $n$ realizations of $Y_2$ is equal to $\eta$. If only $Y_2 = \eta$ is observed then, since $\eta > 1/2$ and since the sample variances are both $0$, the null is rejected under the t-test even though $EY_1 = EY_2$. Since the probability of the event that only $\eta$ is observed can be made arbitrarily large by choosing $\eta$ very close to $1/2$, the probability of wrongly rejecting the null can be made arbitrarily close to $1$. Thus the size of the two sample t-test is equal to $1$. Notice that this example can easily be adjusted to apply only to continuous distributions without changing the conclusion. Notice also that the variance of any random variable in our setting is bounded. The phenomenon of this example is that we cannot rule out that most weight is put on one observation very close to $1/2$. The obvious reaction to the above example is to suggest to first test for normality of

the errors. Such tests would be able to discover the above counterexample. However this does not mean that they would make the t-test valid, given the current standing of research. The problem with tests for normality (e.g. Shapiro-Wilk) is that the null hypothesis asserts that the errors are normally distributed. To claim that the errors are normally distributed is a claim about not being able to reject the null. It can be very likely that normality cannot be rejected even though the errors are not normally distributed. This affects the size of the combined test, first of normality and then of $EY_1 \le EY_2$. In fact, without results on the probability of rejecting $EY_1 \le EY_2$ with the t-test when errors are not normally distributed, one has to work with the upper bound that the t-test always rejects $EY_1 \le EY_2$ whenever errors are not normal. But then the size of the combined test is dominated by the probability of wrongly not being able to reject that errors are normal. Now note that the maximal probability of wrongly not rejecting that errors are normal will be approximately equal to or larger than the probability of correctly accepting the null. This is because there are distributions that are not normal but are almost normal. So if the size of the test for normality is equal to 10% then the probability of thinking that errors are normally distributed even if they are not can be 90% or higher. In this case the best upper bound, given current research, on the size of the combined test is 90%. Note that this bound is not tight: when errors are almost normal then the size of the t-test will be close to its size when errors are normal.
This will be true (for an appropriate distance function) in the current framework, as we assume that payoffs belong to $[0, 1]$. In order to assess the size of the combined test one would have to derive it explicitly, in particular obtain finite sample results on the size of the t-test when errors are almost normal. As this has not been done, it is an open question whether tests for normality can be used to reinstate the t-test when payoffs are bounded.

2.3 Permutation Tests

Permutation tests are exact tests in terms of size. A permutation test is a most powerful unbiased test of $H_0: EY_1 = EY_2$ against the simple alternative hypothesis $H_1: P = \bar{P}$, where $\bar{P}$ is given with $E_{\bar{P}} Y_1 > E_{\bar{P}} Y_2$. Accordingly, all permutations of the data are considered in which the data is randomly assigned to one of the two actions, and the null is rejected if the true sample is within the fraction $\alpha$ of permutations under

which the alternative is most likely under $\bar{P}$. This is the theory. In practice, typically only the test statistic itself is specified, without mentioning whether it is most powerful relative to some alternative hypothesis. The test remains exact in terms of size. A popular test statistic is the difference between the two sample means. I have not come across any distribution with bounded support that can be specified for the alternative hypothesis that makes even this test most powerful. Note that the Wilcoxon rank sum test (or Mann-Whitney U test) is also a permutation test, which can be used to compare means and to compare distributions. It can be useful when data is given in the form of ranks. However I do not see the sense in first transforming the data into ranks and then applying this test, as the transformation throws away information. The reason is that I cannot imagine that it is most powerful against any specific distribution. The literature on permutation tests typically ignores the possibility of using permutation tests to create most powerful tests, as it is interested in distribution-free tests or at least in alternatives that are composite.

2.4 Aside on Methodology

Before we proceed with our presentation of tests we present some material on the methodology of selecting tests.

Multiple Testing on the Same Data

It is not good practice to evaluate multiple tests on the same data without correcting for this. The reason is that this typically increases the type I error underlying the final evaluation. To illustrate, assume that the null hypothesis could not be rejected with a first test but that it can be rejected based on a second test, where both tests are unbiased with the same size $\alpha$. Consider an underlying data generating process $P$ with $EY_1 = EY_2$.
As the first test is unbiased, the probability of wrongly rejecting the null using this test is equal to $\alpha$. Consequently, the probability of wrongly rejecting the null hypothesis under $P$ is equal to $\alpha + \varphi(P)$, where $\varphi(P)$ is the probability that the first test does not reject the null and the second test does. If the two tests are not identical then $\varphi(P) > 0$, and this two step procedure leads to a rejection of the null hypothesis with probability strictly larger than $\alpha$. More generally, if neither test is more powerful than the other then the sequential procedure of testing until

obtaining a rejection strictly increases the type I error. Of course one could mention how many other tests have been tried; however it is difficult to derive exactly how the type I error increases (the upper bound on the type I error when checking two tests is $2\alpha$). As one can hardly verify how many other tests were tried, it is useful to have an understanding of which tests should be applied to which setting. This is best done by comparing tests. When a uniformly most powerful test exists then there is no need to worry about multiple testing of the same data: one can say that there is a best test. However in our setting a UMP test does not exist. Permutation tests are most powerful given the assumptions on how the alternative hypothesis has been limited. As different alternative hypotheses satisfying $EY_1 > EY_2$ yield different tests, each is most powerful for its specific alternative. Again it can be difficult to justify which alternative is most plausible, and it is not good practice to try several alternatives.

Choice of a Test: Ex-Ante vs Ex-Post

When there is no uniformly most powerful test then it is important to first select a test and then to gather data. One problem with looking for a test after the data is gathered is illustrated above, as it is hard to check how many tests were run but not reported. A further problem is more fundamental and is due to the fact that size is an ex-ante concept, which means the test may not be conditioned on the observed data. Often, if not typically, one can design a test given the data gathered that will reject the null given the data. We illustrate. Consider the test that rejects the null if the difference between the two sample means is equal to some given $\delta \neq 0$: reject if $\bar{Y}_1 - \bar{Y}_2 = \delta$. Then the size of this test will be small if the sample size $n$ is sufficiently large. Note that it is important that $\delta$ is chosen before the data is gathered.
Otherwise $\delta$ could be set equal to the difference of the observed averages and the null would always be rejected; setting $\delta$ after the data is observed yields size equal to $1$. An objection to the above example could be that the suggested test is not plausible: one would expect that if the null is rejected for $\bar{Y}_1 - \bar{Y}_2 = \delta$ with $\delta > 0$ then it is also rejected for $\bar{Y}_1 - \bar{Y}_2 \ge \delta$. However this is only a heuristic argument. The real problem is that the above test is not credible, as it is not most powerful nor does it satisfy any other optimality principle.
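To make the permutation test of Section 2.3 concrete, here is a minimal Monte Carlo sketch using the difference of sample means as the test statistic. The function name, the sampling of permutations rather than full enumeration, and the add-one p value correction are my own illustrative choices; the exact test would enumerate all permutations.

```python
import random

def permutation_test(x, y, alpha=0.05, n_perm=2000, seed=0):
    """One-sided test of H0: EY1 <= EY2 against H1: EY1 > EY2 based on
    the permutation distribution of the difference of sample means.
    Monte Carlo approximation: permutations are sampled, not enumerated.
    Returns (reject, p_value)."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n = len(x)
    observed = sum(x) / len(x) - sum(y) / len(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of outcomes to the two samples
        stat = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        if stat >= observed:
            hits += 1
    # add-one correction keeps the estimated p value strictly positive
    p_value = (hits + 1) / (n_perm + 1)
    return p_value <= alpha, p_value

# clearly separated samples: the observed difference is extreme among permutations
reject, p = permutation_test([0.9] * 10, [0.1] * 10)
print(reject, p)
```

Note that the test is exact in size because under $EY_1 = EY_2$ (with exchangeable pooled data) every reassignment of outcomes to the two samples is equally likely.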

The Value of the P Value

It is standard practice to record p values when testing data. In particular one cites whether the null can be rejected at the 1% level or at the 5% level; in either case the null is rejected. P values above 5% typically lead to a statement of not being able to reject the null. Consider this practice. What is the real size of the test? In other words, how often will the null be wrongly rejected? Is the p value really the smallest level of significance leading to rejection of the null? Well, if the null is rejected as long as the p value is below 5% then the size is 5%. So regardless of whether or not the data has a p value below 1%: if the statistician would also have rejected the null had the p value been above 1% but below 5%, then the fact that the data has a p value below 1% is of no importance. The test has size 5%. The specific p value is useless for the given data; it only matters whether it is below or above 5%. In particular the p value is not the smallest level of significance leading to rejection of the null. The issue behind this phenomenon is the same as the one discussed in the previous section. The test is formulated, by identifying whether significance is at the 1% or the 5% level, after the data has been gathered. Size is an ex-ante concept that is a function of the test as specified before the data is observed; looking at the p value is an ex-post operation. If one could credibly commit to reject the null if and only if the p value is below 1%, then a p value below 1% would carry the information that the size of the test is equal to 1%. However, standard practice calls to reject if and only if the p value is below 5%; hence observing a p value of 1% or lower carries only the information that the p value is below 5%. It does not matter how much lower it is. Of course the p value has value for the general objective: when discovering that it is very low, one can be more ambitious when gathering similar data next time, e.g.
one can choose to gather less data.

The Role of the Null Hypothesis

Often the null hypothesis cannot be rejected, in which case we would like to argue that there is no treatment effect. This is where the type II error comes in handy. If a test is most powerful for some distribution satisfying the alternative then we know that the null could not be rejected more often under this alternative. However it would be even nicer to know explicitly what the type II error is. Moreover, how does one argue that the distribution underlying the alternative is well chosen, given that the type II error

depends on this distribution? One would wish to find some way of making inferences without making specific unverifiable assumptions on the distributions, to avoid these issues. We show how to do this now.

Comparing Tests

Even if there is no uniformly most powerful test one can still compare tests. This comparison will not build on making specific assumptions on the distributions underlying either the null or the alternative. The idea is to consider a worst case analysis centered around the parameters of interest, which here are the means. Specifically, for any given $\mu_1$ and $\mu_2$ with $\mu_1 > \mu_2$ one can consider all distributions $P$ with $EY_1 = \mu_1$ and $EY_2 = \mu_2$ and calculate the maximal type II error of wrongly not rejecting the null within this class. In other words, one can calculate the minimal power within this set. This leads to a minimal power function that depends on $\mu_1$ and $\mu_2$. Or, for any given $d > 0$, one can derive the maximal type II error among all distributions $P$ with $EY_1 \ge EY_2 + d$. This leads to a minimal power function that depends on $d$. Note that $\{P : EY_2 < EY_1 < EY_2 + d\}$ can be interpreted as an indifference zone, a set of underlying parameters where the decision maker does not care whether or not the null hypothesis is rejected. It is necessary to consider $d > 0$, as the type II error for $d = 0$ will be equal to $1 - \alpha$. For $d > 0$ one can typically make the type II error small provided the sample is sufficiently large; the smaller $d$, the larger the sample must be. Using these minimal power functions one can evaluate the event of not rejecting the null as a function of either $(\mu_1, \mu_2)$ or of $d$. One can then compare tests by comparing minimal power functions. We show in a different paper that there is a best test in terms of such minimal power functions. In other words, there is a uniformly most powerful test when inference is only based on the parameters of the distribution satisfying the alternative hypothesis. Such a test is called parameter most powerful.
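As an illustration of estimating a minimal power function, the following sketch computes the minimum power of a simple fixed-threshold test (reject when $\bar{y}_1 - \bar{y}_2 \ge c$; this test, the small support grid, and all Monte Carlo settings are illustrative assumptions of mine, not a test from this paper) over two-point distributions on $[0,1]$ with prescribed means $\mu_1$ and $\mu_2$. Two-point distributions are natural worst-case candidates, since they are the extreme points of the set of distributions with a given mean.

```python
import random

def draw_two_point(a, b, mean, n, rng):
    """n i.i.d. draws from the two-point distribution on {a, b} with the given mean."""
    p_b = (mean - a) / (b - a)  # mass placed on b so that the mean is exact
    return [b if rng.random() < p_b else a for _ in range(n)]

def min_power(mu1, mu2, n=30, c=0.2, reps=400, seed=1):
    """Estimated minimal power of the illustrative test 'reject iff
    ybar1 - ybar2 >= c' over a small grid of two-point marginals on [0, 1]
    with means mu1 and mu2 (a coarse stand-in for the full worst case)."""
    rng = random.Random(seed)
    supports1 = [(0.0, 1.0), (mu1 / 2, 1.0), (0.0, (mu1 + 1) / 2)]
    supports2 = [(0.0, 1.0), (mu2 / 2, 1.0), (0.0, (mu2 + 1) / 2)]
    powers = []
    for a1, b1 in supports1:
        for a2, b2 in supports2:
            rejections = 0
            for _ in range(reps):
                y1 = draw_two_point(a1, b1, mu1, n, rng)
                y2 = draw_two_point(a2, b2, mu2, n, rng)
                if sum(y1) / n - sum(y2) / n >= c:
                    rejections += 1
            powers.append(rejections / reps)
    return min(powers)

print(min_power(0.8, 0.3))    # means far apart: high minimal power
print(min_power(0.55, 0.45))  # means close together: low minimal power
```

Evaluating such a minimum over a grid of $(\mu_1, \mu_2)$ pairs traces out the minimal power function used to compare tests.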
A general approach followed in the literature (see Lehmann and Romano, 2005) is to limit the set of distributions belonging to the alternative hypothesis. So one sets $H_1: P \in K$ where $K \subset \{P : EY_1 > EY_2\}$. In this case, $\{P : EY_1 > EY_2\} \setminus K$ is the indifference zone. The alternative hypothesis only contains those distributions "differing so widely from those postulated by the null hypothesis that false acceptance of the null is a serious error" (Lehmann and Romano, 2005, p. 320). In

this sense it is natural to formulate the indifference zone in terms of the underlying means. For instance, one can fix $d > 0$ and set $H_1: EY_1 \ge EY_2 + d$. In this case $\{P : EY_2 < EY_1 < EY_2 + d\}$ is the indifference zone. A maximin test is a test that maximizes, over all tests of size $\alpha$, the minimum power over all $P \in K$. In particular, a parameter most powerful test is a maximin test when $K$ can be described in terms of the means only.

Unbiasedness

Unbiasedness is a property that is intuitive and which can add enough structure to find a best test. For instance, there is no UMP test for comparing the means of two Bernoulli distributed random variables, but there is a UMP unbiased test for that setting, called the Fisher exact test. However, lack of unbiasedness should not be a reason to reject a test. Unbiasedness means that the null should never be rejected with lower probability when the alternative is true than when the null is true. This sounds intuitive. However we are comparing very small probabilities in this statement, and there is no decision theoretic basis for doing so. When unbiasedness fails it is typically due to distributions belonging to the alternative that are very close to the null. In our setting we are worried about rejecting the null with probability smaller than $\alpha$ among the alternatives, but this typically happens when $EY_1 > EY_2$ with $EY_1 \approx EY_2$. Above we argued, however, that it is plausible not to worry too much about alternatives that are close to the null in the sense that $EY_2 < EY_1 < EY_2 + d$ for some small $d$. We now return to our presentation of exact tests.

2.5 Test Based on Hoeffding Bounds

In the following I construct a nonparametric test in which any distribution belongs either to the null or to the alternative hypothesis. Lower bounds on minimal power functions are easily derived. The test is not unbiased. I use the Hoeffding bounds, similar to what Bickel et al. (1989) did for testing the mean of a single sample.
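The test of this section can be sketched in a few lines, using the bound $e^{-nt^2/(\omega-\ell)^2}$ derived next; the function names are my own.

```python
import math

def hoeffding_threshold(n, alpha, lo=0.0, hi=1.0):
    """Critical value t* solving exp(-n t^2 / (hi - lo)^2) = alpha.
    Reject H0: EY1 <= EY2 when ybar1 - ybar2 >= t*."""
    return (hi - lo) * math.sqrt(math.log(1.0 / alpha) / n)

def type2_error_bound(n, alpha, d, lo=0.0, hi=1.0):
    """Upper bound exp(-n (d - t*)^2 / (hi - lo)^2) on the type II error
    when EY1 = EY2 + d; valid only for d > t*."""
    t_star = hoeffding_threshold(n, alpha, lo, hi)
    if d <= t_star:
        raise ValueError("bound requires d > t*")
    return math.exp(-n * (d - t_star) ** 2 / (hi - lo) ** 2)

# the numbers from the example in this section: n = 30, [0, 1], alpha = 0.05
t_star = hoeffding_threshold(30, 0.05)
print(round(t_star, 3))  # approximately 0.316
print(type2_error_bound(30, 0.05, 0.55))
```

The first print reproduces the critical value of the numerical example below; the second evaluates the type II bound at an illustrative $d$.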
Following a corollary of Hoeffding (1963, eq. 2.7),
$$\Pr\left((\bar{y}_1 - \bar{y}_2) - (EY_1 - EY_2) \ge t\right) \le e^{-nt^2/(\omega-\ell)^2}$$
holds for $t > 0$. So if $t^*$ solves $e^{-n t^{*2}/(\omega-\ell)^2} = \alpha$ then the rule is to reject the null when $\bar{y}_1 - \bar{y}_2 \ge t^*$. The nice thing about this approach is that one can calculate an upper bound on the type II error. If $EY_1 = EY_2 + d$ then
$$\Pr\left(\bar{y}_1 - \bar{y}_2 < t^*\right) = \Pr\left((\bar{y}_1 - \bar{y}_2) - d < t^* - d\right) \le e^{-n(d-t^*)^2/(\omega-\ell)^2}$$
for

$d > t^*$. For example, if $n = 30$, $[\ell, \omega] = [0, 1]$ and $\alpha = 0.05$, then one easily verifies that $t^* = \sqrt{\ln 20}/\sqrt{30} \approx 0.316$ and that the type II error is bounded above by $0.2$ when $d \ge 0.55$.

2.6 Using Tests Derived for a Single Sample

The above are the only deterministic exact nonparametric one-sided tests I know of that were explicitly constructed for the two sample problem. There are exact tests for the one sample problem (see Schlag, 2006) that of course can also be applied to our setting. One can pair the observations, redefine $Y = Y_1 - Y_2$ and test the null that $EY \le 0$ against the alternative that $EY > 0$, where it is known that $Y \in [-1, 1]$.

2.7 Parameter Most Powerful Tests

In a separate paper (Schlag, 2006) I have constructed a randomized exact unbiased test that I call parameter most powerful, as it minimizes the type II error whenever the set of alternatives can be described in terms of the underlying means only. It maximizes the minimal power function that depends only on $\mu_1$ and $\mu_2$. In a first step the data is randomly transformed into binary data. In the second step one applies the Fisher exact test, which is the uniformly most powerful unbiased test for the binary case. My test is randomized, as the transformation of the data is random, and hence the recommendation of the test is typically randomized: reject the null hypothesis with probability $x\%$ where typically $0 < x < 100$. The property of being parameter most powerful comes at the cost of being randomized. The advantage is that it can be used to evaluate the relative efficiency of other, deterministic tests.

References

[1] Bickel, P., J. Godfrey, J. Neter and H. Clayton (1989), Hoeffding Bounds for Monetary Unit Sampling in Auditing, Contributed Paper, I.S.I. Meeting, Paris.

[2] Fraser, D.A.S. (1957), Nonparametric Methods in Statistics, New York: John Wiley and Sons.

[3] Hoeffding, W. (1963), Probability Inequalities for Sums of Bounded Random Variables, J. Amer. Stat. Assoc. 58(301), 13-30.

[4] Lehmann, E.L. and W.-Y. Loh (1990), Pointwise versus Uniform Robustness of Some Large-Sample Tests and Confidence Intervals, Scandinavian Journal of Statistics 17, 177-187.

[5] Lehmann, E.L. and J.P. Romano (2005), Testing Statistical Hypotheses, 3rd edition, Springer.

[6] Schlag, K.H. (2006), Nonparametric Minimax Risk Estimates and Most Powerful Tests for Means, European University Institute, Mimeo.


Hypothesis testing S2

Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to

Median of the p-value Under the Alternative Hypothesis

Median of the p-value Under the Alternative Hypothesis Bhaskar Bhattacharya Department of Mathematics, Southern Illinois University, Carbondale, IL, USA Desale Habtzghi Department of Statistics, University

Permutation tests are similar to rank tests, except that we use the observations directly without replacing them by ranks.

Chapter 2 Permutation Tests Permutation tests are similar to rank tests, except that we use the observations directly without replacing them by ranks. 2.1 The two-sample location problem Assumptions: x

14.773 Problem Set 2 Due Date: March 7, 2013

14.773 Problem Set 2 Due Date: March 7, 2013 Question 1 Consider a group of countries that di er in their preferences for public goods. The utility function for the representative country i is! NX U i

MTH6120 Further Topics in Mathematical Finance Lesson 2

MTH6120 Further Topics in Mathematical Finance Lesson 2 Contents 1.2.3 Non-constant interest rates....................... 15 1.3 Arbitrage and Black-Scholes Theory....................... 16 1.3.1 Informal

The Binomial Distribution

The Binomial Distribution James H. Steiger November 10, 00 1 Topics for this Module 1. The Binomial Process. The Binomial Random Variable. The Binomial Distribution (a) Computing the Binomial pdf (b) Computing

5. Convergence of sequences of random variables

5. Convergence of sequences of random variables Throughout this chapter we assume that {X, X 2,...} is a sequence of r.v. and X is a r.v., and all of them are defined on the same probability space (Ω,

Homework Zero. Max H. Farrell Chicago Booth BUS41100 Applied Regression Analysis. Complete before the first class, do not turn in

Homework Zero Max H. Farrell Chicago Booth BUS41100 Applied Regression Analysis Complete before the first class, do not turn in This homework is intended as a self test of your knowledge of the basic statistical

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

Advanced Microeconomics Ordinal preference theory Harald Wiese University of Leipzig Harald Wiese (University of Leipzig) Advanced Microeconomics 1 / 68 Part A. Basic decision and preference theory 1 Decisions

Non-Parametric Tests (I)

Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

STRUTS: Statistical Rules of Thumb. Seattle, WA

STRUTS: Statistical Rules of Thumb Gerald van Belle Departments of Environmental Health and Biostatistics University ofwashington Seattle, WA 98195-4691 Steven P. Millard Probability, Statistics and Information

Basic concepts and introduction to statistical inference

Basic concepts and introduction to statistical inference Anna Helga Jonsdottir Gunnar Stefansson Sigrun Helga Lund University of Iceland (UI) Basic concepts 1 / 19 A review of concepts Basic concepts Confidence

Nonparametric Test Procedures

Nonparametric Test Procedures 1 Introduction to Nonparametrics Nonparametric tests do not require that samples come from populations with normal distributions or any other specific distribution. Hence

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

Chapter 2 Limits Functions and Sequences sequence sequence Example

Chapter Limits In the net few chapters we shall investigate several concepts from calculus, all of which are based on the notion of a limit. In the normal sequence of mathematics courses that students

A Simpli ed Axiomatic Approach to Ambiguity Aversion

A Simpli ed Axiomatic Approach to Ambiguity Aversion William S. Neilson Department of Economics University of Tennessee Knoxville, TN 37996-0550 wneilson@utk.edu March 2009 Abstract This paper takes the

Permutation & Non-Parametric Tests

Permutation & Non-Parametric Tests Statistical tests Gather data to assess some hypothesis (e.g., does this treatment have an effect on this outcome?) Form a test statistic for which large values indicate

Data Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) SS Analysis of Experiments - Introduction

Data Analysis Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) Prof. Dr. Dr. h.c. Dieter Rombach Dr. Andreas Jedlitschka SS 2014 Analysis of Experiments - Introduction

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

Chapter 1 Hypothesis Testing

Chapter 1 Hypothesis Testing Principles of Hypothesis Testing tests for one sample case 1 Statistical Hypotheses They are defined as assertion or conjecture about the parameter or parameters of a population,

Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric

Confidence Intervals and Hypothesis Testing

Name: Class: Date: Confidence Intervals and Hypothesis Testing Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The librarian at the Library of Congress

Statistics 641 - EXAM II - 1999 through 2003

Statistics 641 - EXAM II - 1999 through 2003 December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the

Math 317 HW #5 Solutions

Math 317 HW #5 Solutions 1. Exercise 2.4.2. (a) Prove that the sequence defined by x 1 = 3 and converges. x n+1 = 1 4 x n Proof. I intend to use the Monotone Convergence Theorem, so my goal is to show

Test Code MS (Short answer type) 2004 Syllabus for Mathematics. Syllabus for Statistics

Test Code MS (Short answer type) 24 Syllabus for Mathematics Permutations and Combinations. Binomials and multinomial theorem. Theory of equations. Inequalities. Determinants, matrices, solution of linear

Skewed Data and Non-parametric Methods

0 2 4 6 8 10 12 14 Skewed Data and Non-parametric Methods Comparing two groups: t-test assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted

Elements of probability theory

2 Elements of probability theory Probability theory provides mathematical models for random phenomena, that is, phenomena which under repeated observations yield di erent outcomes that cannot be predicted

Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

Optimal insurance contracts with adverse selection and comonotonic background risk

Optimal insurance contracts with adverse selection and comonotonic background risk Alary D. Bien F. TSE (LERNA) University Paris Dauphine Abstract In this note, we consider an adverse selection problem

WHERE DOES THE 10% CONDITION COME FROM?

1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

T adult = 96 T child = 114.

Homework Solutions Do all tests at the 5% level and quote p-values when possible. When answering each question uses sentences and include the relevant JMP output and plots (do not include the data in your

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D.

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D. In biological science, investigators often collect biological

I. Pointwise convergence

MATH 40 - NOTES Sequences of functions Pointwise and Uniform Convergence Fall 2005 Previously, we have studied sequences of real numbers. Now we discuss the topic of sequences of real valued functions.

The Variability of P-Values. Summary

The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report

Notes on Chapter 1, Section 2 Arithmetic and Divisibility

Notes on Chapter 1, Section 2 Arithmetic and Divisibility August 16, 2006 1 Arithmetic Properties of the Integers Recall that the set of integers is the set Z = f0; 1; 1; 2; 2; 3; 3; : : :g. The integers

HypoTesting. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Name: Class: Date: HypoTesting Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A Type II error is committed if we make: a. a correct decision when the

Stat 5102 Notes: Nonparametric Tests and. confidence interval

Stat 510 Notes: Nonparametric Tests and Confidence Intervals Charles J. Geyer April 13, 003 This handout gives a brief introduction to nonparametrics, which is what you do when you don t believe the assumptions

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem Time on my hands: Coin tosses. Problem Formulation: Suppose that I have

Difference tests (2): nonparametric

NST 1B Experimental Psychology Statistics practical 3 Difference tests (): nonparametric Rudolf Cardinal & Mike Aitken 10 / 11 February 005; Department of Experimental Psychology University of Cambridge

9Tests of Hypotheses. for a Single Sample CHAPTER OUTLINE

9Tests of Hypotheses for a Single Sample CHAPTER OUTLINE 9-1 HYPOTHESIS TESTING 9-1.1 Statistical Hypotheses 9-1.2 Tests of Statistical Hypotheses 9-1.3 One-Sided and Two-Sided Hypotheses 9-1.4 General

UCLA. Department of Economics Ph. D. Preliminary Exam Micro-Economic Theory

UCLA Department of Economics Ph. D. Preliminary Exam Micro-Economic Theory (SPRING 2011) Instructions: You have 4 hours for the exam Answer any 5 out of the 6 questions. All questions are weighted equally.

Induction. Margaret M. Fleck. 10 October These notes cover mathematical induction and recursive definition

Induction Margaret M. Fleck 10 October 011 These notes cover mathematical induction and recursive definition 1 Introduction to induction At the start of the term, we saw the following formula for computing

c 2008 Je rey A. Miron We have described the constraints that a consumer faces, i.e., discussed the budget constraint.

Lecture 2b: Utility c 2008 Je rey A. Miron Outline: 1. Introduction 2. Utility: A De nition 3. Monotonic Transformations 4. Cardinal Utility 5. Constructing a Utility Function 6. Examples of Utility Functions

On Importance of Normality Assumption in Using a T-Test: One Sample and Two Sample Cases

On Importance of Normality Assumption in Using a T-Test: One Sample and Two Sample Cases Srilakshminarayana Gali, SDM Institute for Management Development, Mysore, India. E-mail: lakshminarayana@sdmimd.ac.in

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

THE KRUSKAL WALLLIS TEST

THE KRUSKAL WALLLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23 rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON

LOGNORMAL MODEL FOR STOCK PRICES

LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as