Prep Course in Statistics Prof. Massimo Guidolin


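MSc. Finance & CLEFIN, A.A. 2013/2014

Prep Course in Statistics, Prof. Massimo Guidolin

A Few Review Questions and Problems concerning Hypothesis Testing and Confidence Intervals

SUGGESTION: try to approach the questions first, without looking at the answers. Occasionally the questions/problems explicitly send you to check out concepts and definitions that have not been explicitly discussed during the prep course. In this case, besides Casella and Berger's book, Wikipedia may be very useful, as always.

1. (Likelihood Ratio Test for a Normal Population Mean when the Variance is Unknown) Suppose $X_1, \ldots, X_n$ is a random sample from a $N(\mu, \sigma^2)$, but an experimenter is interested only in inferences about $\mu$, such as performing the one-sided mean test $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$. In this case we say that the parameter $\sigma^2$ is a nuisance parameter. Derive the LRT and show that the test based on the LRT statistic $\lambda(\mathbf{x})$ has the same structure as a test that defines the rejection region on the basis of a Student's t statistic.

Solution. The LRT statistic is (in this case, taking the max or the sup delivers the same value function; ask yourself why):

$$\lambda(\mathbf{x}) = \frac{\max_{\{\mu,\sigma^2:\ \mu \le \mu_0,\ \sigma^2 \ge 0\}} L(\mu,\sigma^2;\mathbf{x})}{\max_{\{\mu,\sigma^2:\ \sigma^2 \ge 0\}} L(\mu,\sigma^2;\mathbf{x})} = \frac{\max_{\{\mu,\sigma^2:\ \mu \le \mu_0,\ \sigma^2 \ge 0\}} L(\mu,\sigma^2;\mathbf{x})}{L(\hat\mu,\hat\sigma^2;\mathbf{x})}$$

where $\hat\mu = \bar x$ and $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2$ are the well-known ML estimators. At this point, if $\hat\mu \le \mu_0$ then the restricted maximum is the same as the unconstrained one and numerator and denominator will be identical; otherwise, the restricted maximum is $L(\mu_0, \hat\sigma_0^2; \mathbf{x})$, where $\hat\sigma_0^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \mu_0)^2$. Therefore:

$$\lambda(\mathbf{x}) = \begin{cases} 1 & \text{if } \hat\mu \le \mu_0 \\[4pt] \dfrac{L(\mu_0,\hat\sigma_0^2;\mathbf{x})}{L(\hat\mu,\hat\sigma^2;\mathbf{x})} = \left(\dfrac{\hat\sigma^2}{\hat\sigma_0^2}\right)^{n/2} & \text{if } \hat\mu > \mu_0 \end{cases}$$

because we know that $\hat\mu = \bar x$ and both likelihoods, evaluated at their respective variance estimators, share the factor $e^{-n/2}$. At this point, note that

$$\hat\sigma_0^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \mu_0)^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x + \bar x - \mu_0)^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2 + \frac{1}{n}\sum_{i=1}^n (\bar x - \mu_0)^2 + \frac{2}{n}\sum_{i=1}^n (x_i - \bar x)(\bar x - \mu_0).$$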

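Because $\sum_{i=1}^n (x_i - \bar x) = 0$, the cross term vanishes, so that

$$\hat\sigma_0^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2 + (\bar x - \mu_0)^2 = \hat\sigma^2 + (\bar x - \mu_0)^2.$$

As a result, because for $\bar x > \mu_0$

$$\lambda(\mathbf{x}) = \left(\frac{\hat\sigma^2}{\hat\sigma^2 + (\bar x - \mu_0)^2}\right)^{n/2} = \left(\frac{1}{1 + (\bar x - \mu_0)^2/\hat\sigma^2}\right)^{n/2},$$

the rejection region $\lambda(\mathbf{x}) \le c$ (with $0 < c < 1$) implies that if $\bar x \le \mu_0$ rejection is impossible, as one would expect, while if $\bar x > \mu_0$ we reject when

$$\frac{(\bar x - \mu_0)^2}{\hat\sigma^2} \ge c^{-2/n} - 1,$$

which will be the case when $(\bar x - \mu_0)/\hat\sigma$ is sufficiently large, exceeding a threshold determined by the fixed $c$. Now note that

$$\frac{\bar x - \mu_0}{\hat\sigma} = \frac{1}{\sqrt{n-1}}\underbrace{\frac{\bar x - \mu_0}{s/\sqrt{n}}}_{t_{n-1}\ \text{when}\ \mu = \mu_0}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar x)^2,$$

so a test that rejects when the Student's t statistic is large has the same structure as the rejection region determined above.

The lesson of this exercise is: besides being an extension of an example analyzed during the lecture, this problem shows that the structure of LRTs is independent (for a given and fixed [...]) of the specific statistical assumptions that are made.

2. (Computing and Optimizing a Normal Power Function) Let $X_1, \ldots, X_n$ be a random sample from a $N(\theta, \sigma^2)$ population, $\sigma^2$ known. An LRT of $H_0: \theta \le \theta_0$ versus $H_1: \theta > \theta_0$ is a test that rejects $H_0$ if

$$\frac{\bar X - \theta_0}{\sigma/\sqrt{n}} > c.$$

The constant $c$ can be any positive number.

(a) Compute the power function of this LRT.

(b) Suppose the experimenter wishes to have a maximum Type I Error probability of 0.1. Suppose, in addition, the experimenter wishes to have a maximum Type II Error probability of 0.2 if $\theta \ge \theta_0 + \sigma$. Show how to choose $c$ and $n$ to achieve these goals.

Solution. (a) From the definition of power function,

$$\beta(\theta) = P_\theta\!\left(\frac{\bar X - \theta_0}{\sigma/\sqrt{n}} > c\right) = P_\theta\!\left(\frac{\bar X - \theta}{\sigma/\sqrt{n}} > c + \frac{\theta_0 - \theta}{\sigma/\sqrt{n}}\right) = P\!\Big(\underbrace{Z}_{N(0,1)} > c + \frac{\theta_0 - \theta}{\sigma/\sqrt{n}}\Big).$$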
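The power function of part (a) can be checked numerically. The following is a minimal sketch using only the Python standard library; the names `power` and `design` are hypothetical helpers introduced here, and the targets 0.1 (Type I error at $\theta_0$) and 0.2 (Type II error at $\theta_0 + \sigma$) are assumed illustrative values for part (b).

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power(theta, theta0, sigma, n, c):
    """beta(theta) = P(Z > c + (theta0 - theta) / (sigma / sqrt(n)))."""
    return 1.0 - Z.cdf(c + (theta0 - theta) / (sigma / sqrt(n)))


def design(alpha, type2, shift_in_sigmas=1.0):
    """Pick c so that beta(theta0) = alpha, then the smallest integer n
    with beta(theta0 + shift * sigma) >= 1 - type2 (assumed targets)."""
    c = Z.inv_cdf(1.0 - alpha)                      # P(Z > c) = alpha
    root_n = (c - Z.inv_cdf(type2)) / shift_in_sigmas
    return c, ceil(root_n ** 2)


c, n = design(alpha=0.1, type2=0.2)
size_at_null = power(theta=0.0, theta0=0.0, sigma=1.0, n=n, c=c)
power_at_shift = power(theta=1.0, theta0=0.0, sigma=1.0, n=n, c=c)
print(f"c = {c:.3f}, n = {n}")
print(f"beta(theta0) = {size_at_null:.3f}, beta(theta0 + sigma) = {power_at_shift:.3f}")
```

Because the power function depends on $\theta$ only through $(\theta_0 - \theta)/(\sigma/\sqrt{n})$, the sketch can fix $\theta_0 = 0$ and $\sigma = 1$ without loss of generality.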

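As $\theta$ increases from $-\infty$ to $+\infty$, it is easy to see that this normal probability increases from 0 to 1. Therefore, it follows that $\beta(\theta)$ is an increasing function of $\theta$, with

$$\lim_{\theta \to -\infty} \beta(\theta) = \lim_{a \to +\infty} P(Z > a) = 0, \qquad \lim_{\theta \to +\infty} \beta(\theta) = \lim_{a \to -\infty} P(Z > a) = 1.$$

(b) As noted above, the power function of an LRT in which one rejects if $(\bar X - \theta_0)/(\sigma/\sqrt{n}) > c$ is

$$\beta(\theta) = P\!\left(Z > c + \frac{\theta_0 - \theta}{\sigma/\sqrt{n}}\right).$$

Because $\beta(\theta)$ is increasing in $\theta$, so that solutions to the equations below are unique, the requirements will be met if

$$\beta(\theta_0) = 0.1 \quad \text{and} \quad \beta(\theta_0 + \sigma) = \underbrace{0.8}_{1 - 0.2\ (\text{type II error})}.$$

By choosing $c = 1.28$, we achieve $\beta(\theta_0) = P(Z > 1.28) = 0.1$, regardless of $n$. Now we wish to choose $n$ so that $\beta(\theta_0 + \sigma) = P(Z > 1.28 - \sqrt{n}) = 0.8$. However, we know that $P(Z > -0.84) = 0.8$, so that setting $1.28 - \sqrt{n} = -0.84$ and solving for $n$ yields $n = 4.49$. Of course $n$ must be an integer. So choosing $c = 1.28$ and $n = 5$ yields a test with error probabilities controlled as specified by the experimenter.

3. (UMP Binomial Test Derived from Neyman-Pearson's Lemma) Notice that the conditions of Neyman-Pearson's lemma for simple hypotheses may be re-written as:

$$\mathbf{x} \in R \ \text{ if } \ \frac{f(\mathbf{x};\theta_1)}{f(\mathbf{x};\theta_0)} > k, \qquad \mathbf{x} \in R^c \ \text{ if } \ \frac{f(\mathbf{x};\theta_1)}{f(\mathbf{x};\theta_0)} < k,$$

for some $k \ge 0$ and $\alpha = P_{\theta_0}(\mathbf{X} \in R)$, assuming $f(\mathbf{x};\theta_0) \ne 0$. Let $X \sim \text{Bin}(2, \theta)$. Use the Neyman-Pearson lemma to show that in the test of $H_0: \theta = 1/2$ vs. $H_1: \theta = 3/4$, the test that rejects the null hypothesis if $X = 2$ is the UMP test at size $\alpha = 0.25$.

Solution. Calculating the ratios of PMFs featured in the lemma, we have that for the three possible realizations of $X$:

$$\frac{f(0; 3/4)}{f(0; 1/2)} = \frac{1/16}{4/16} = \frac{1}{4}, \qquad \frac{f(1; 3/4)}{f(1; 1/2)} = \frac{6/16}{8/16} = \frac{3}{4}, \qquad \frac{f(2; 3/4)}{f(2; 1/2)} = \frac{9/16}{4/16} = \frac{9}{4}.$$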
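The three likelihood ratios can be reproduced with exact rational arithmetic. This is a small sketch under the assumption that $X \sim \text{Bin}(2, \theta)$ with $\theta_0 = 1/2$ and $\theta_1 = 3/4$; `binom_pmf`, `rejection_region` and `size` are hypothetical helper names introduced for illustration.

```python
from fractions import Fraction
from math import comb


def binom_pmf(x, n, theta):
    """Binomial(n, theta) probability mass at x, kept exact when theta is a Fraction."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)


# Assumed setting: X ~ Bin(2, theta), H0: theta = 1/2 vs H1: theta = 3/4.
n_trials = 2
theta0, theta1 = Fraction(1, 2), Fraction(3, 4)

# Likelihood ratio f(x; theta1) / f(x; theta0) for each possible x.
ratios = {
    x: binom_pmf(x, n_trials, theta1) / binom_pmf(x, n_trials, theta0)
    for x in range(n_trials + 1)
}


def rejection_region(k):
    """Realizations whose likelihood ratio strictly exceeds the cutoff k."""
    return [x for x, r in ratios.items() if r > k]


def size(k):
    """Probability of the rejection region under the null hypothesis."""
    return sum(binom_pmf(x, n_trials, theta0) for x in rejection_region(k))


for x, r in ratios.items():
    print(f"x = {x}: ratio = {r}")
print("k = 1 rejects for x in", rejection_region(1), "with size", size(1))
```

Any cutoff strictly between the second and third ratio selects the same rejection region, which is why the size of the resulting test does not depend on the particular cutoff chosen in that range.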

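If we choose $3/4 < k < 9/4$, the Neyman-Pearson Lemma says that the test that rejects the null hypothesis if $X = 2$, and otherwise fails to reject the null of $\theta = 1/2$, is the UMP test at size $\alpha = P(X = 2; \theta = 1/2) = 0.25$. Note that the proof really consists of showing the existence of such a $k$, and any $k \in (3/4, 9/4)$ satisfies this condition.

4. (Deriving Normal One-Sided Confidence Bounds by Inversion) Let $X_1, \ldots, X_n$ be a random sample from a $N(\mu, \sigma^2)$ population. Construct a $1 - \alpha$ upper confidence bound for $\mu$; that is, we want a confidence interval of the form $C(\mathbf{x}) = (-\infty, U(\mathbf{x})]$.

Solution. To obtain such an interval by inversion, consider the one-sided tests (plural because as we change $\mu_0$ we derive different tests) of $H_0: \mu = \mu_0$ versus $H_1: \mu < \mu_0$. Note that we use the specification of $H_1$ to determine the form of the confidence interval: $H_1$ specifies "large" values of $\mu_0$, so the confidence set will contain "small" values, values less than a bound; thus, we will get an upper confidence bound. The size $\alpha$ LRT of $H_0$ versus $H_1$ rejects $H_0$ if

$$\frac{\bar X - \mu_0}{S/\sqrt{n}} < -t_{n-1,\alpha}.$$

Thus the non-rejection (abusing language, one may say acceptance) region for this test is

$$A(\mu_0) = \left\{\mathbf{x} : \bar x \ge \mu_0 - t_{n-1,\alpha}\frac{s}{\sqrt{n}}\right\}$$

and this occurs with probability $1 - \alpha$ under $H_0$. Inverting this expression to find an inequality that expresses $\mu_0$ as a function of sample statistics, we have

$$C(\mathbf{x}) = \left\{\mu_0 : \bar x + t_{n-1,\alpha}\frac{s}{\sqrt{n}} \ge \mu_0\right\},$$

which means that the desired one-sided confidence bound is $\left(-\infty,\ \bar x + t_{n-1,\alpha}\, s/\sqrt{n}\,\right]$.

5. (Confidence Interval for the Sample Variance) Let $X_1, \ldots, X_n$ be a random sample from a $N(\mu, \sigma^2)$ population. Construct $1 - \alpha$ confidence intervals for $\sigma^2$ and $\sigma$, respectively.

Solution. Because we know from lecture that

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1},$$

a $1 - \alpha$ confidence interval is defined by a choice of $a$ and $b$ such that

$$P\left\{a \le \frac{(n-1)S^2}{\sigma^2} \le b\right\} = P\left\{\chi^2_{n-1,1-\alpha/2} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{n-1,\alpha/2}\right\} = 1 - \alpha,$$

where $\chi^2_{n-1,1-\alpha/2}$ and $\chi^2_{n-1,\alpha/2}$ are the critical values under the chi-square distribution that leave probability $\alpha/2$ in the lower and upper tail, respectively. This choice splits the probability equally, putting $\alpha/2$ in each tail of the distribution (the chi-square distribution, however, is skewed, and it is not immediately clear that an equal probability split is optimal for a skewed distribution). We can invert this set to find

$$\left\{\sigma^2 : \frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}\right\}.$$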
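The equal-tails interval for $\sigma^2$ obtained by inversion can be computed on a small data set. The sketch below stays within the Python standard library, so it uses a hand-rolled chi-square CDF (series for the regularized lower incomplete gamma function) and a bisection quantile instead of a statistics library; the helper names and the data values are assumptions made for illustration.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the power series of the lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for k in range(1, 10_000):
        term *= z / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_quantile(p, df):
    """Value q with chi2_cdf(q, df) = p, located by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def variance_interval(sample, alpha=0.05):
    """Equal-tails 1 - alpha confidence interval for sigma^2 under normality."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    upper_crit = chi2_quantile(1.0 - alpha / 2.0, n - 1)  # alpha/2 in the upper tail
    lower_crit = chi2_quantile(alpha / 2.0, n - 1)        # alpha/2 in the lower tail
    return (n - 1) * s2 / upper_crit, (n - 1) * s2 / lower_crit


data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.2, 4.9]  # illustrative values
low, high = variance_interval(data)
print(f"95% interval for sigma^2: [{low:.3f}, {high:.3f}]")
```

Note that the larger critical value produces the lower endpoint, because the pivot $(n-1)S^2/\sigma^2$ is decreasing in $\sigma^2$.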

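Equivalently, taking square roots of the endpoints of the interval for $\sigma^2$, the interval for $\sigma$ is

$$\left\{\sigma : \sqrt{\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}}\right\}.$$

6. (MLEs and LRT-Based Tests in the Case of a Pareto Distribution) Let $X_1, \ldots, X_n$ be a random sample from a Pareto population with PDF

$$f(x; \theta, \nu) = \begin{cases} \dfrac{\theta \nu^\theta}{x^{\theta+1}} & \text{if } x \in [\nu, +\infty) \\[4pt] 0 & \text{otherwise} \end{cases} \;=\; \frac{\theta \nu^\theta}{x^{\theta+1}}\, I_{[\nu,+\infty)}(x), \qquad \theta > 0,\ \nu > 0.$$

(a) Find the MLEs of $\theta$ and $\nu$.

(b) Show that the LRT of $H_0: \theta = 1$ and $\nu$ unknown, versus $H_1: \theta \ne 1$ and $\nu$ unknown, has critical region of the form $\{\mathbf{x} : T(\mathbf{x}) \le c_1 \text{ or } T(\mathbf{x}) \ge c_2\}$, where $0 < c_1 < c_2$ and

$$T(\mathbf{x}) = \ln\left[\frac{\prod_{i=1}^n x_i}{(\min_i x_i)^n}\right]$$

(Hint: note that by construction $\prod_{i=1}^n x_i / (\min_i x_i)^n \ge 1$).

Solution. (a) The log-likelihood is

$$\ln L(\theta, \nu; \mathbf{x}) = \ln \prod_{i=1}^n \frac{\theta \nu^\theta}{x_i^{\theta+1}}\, I_{[\nu,+\infty)}(x_i) = \begin{cases} n \ln\theta + n\theta \ln\nu - (\theta+1) \ln \prod_{i=1}^n x_i & \text{if } \min_i x_i \ge \nu \\[4pt] -\infty & \text{if } \min_i x_i < \nu. \end{cases}$$

For any value of $\theta$, this is an increasing function of $\nu$ for $\nu \le \min_i x_i$. So both the restricted and unrestricted MLEs of $\nu$ are $\hat\nu = \min_i x_i$. This is a typical example of an MLE that follows the definition given in lecture, but that does not correspond to an interior (global) maximum of the log-likelihood function and that as such is NOT obtained by solving a first-order condition.

To find the MLE of $\theta$, for $\hat\nu = \min_i x_i$ set the derivative with respect to $\theta$ to zero, to obtain:

$$\frac{\partial \ln L(\theta, \hat\nu; \mathbf{x})}{\partial\theta} = \frac{n}{\theta} + n \ln \min_i x_i - \ln \prod_{i=1}^n x_i = 0 \ \Longrightarrow\ \hat\theta = \frac{n}{\ln\left[\left(\prod_{i=1}^n x_i\right)(\min_i x_i)^{-n}\right]} = \frac{n}{T(\mathbf{x})}.$$

Moreover,

$$\frac{\partial^2 \ln L(\hat\theta, \hat\nu; \mathbf{x})}{\partial\theta^2} = -\frac{n}{\hat\theta^2} < 0 \quad (\text{as } n > 0)$$

for all possible realizations of the sample, so that $\hat\theta$ indeed represents an (interior) MLE.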
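The closed-form MLEs of part (a) can be sanity-checked by simulation. The sketch below draws a Pareto sample by inverse-CDF sampling and evaluates $\hat\nu = \min_i x_i$ and $\hat\theta = n/T$; the function name `pareto_mle`, the seed, and the true parameter values are assumptions chosen only for illustration.

```python
import math
import random


def pareto_mle(sample):
    """Return (theta_hat, nu_hat, T) for a Pareto(theta, nu) sample.

    nu_hat is the sample minimum (a boundary maximum, not a first-order
    condition); theta_hat = n / T with T = sum(log x_i) - n * log(min x_i).
    T = 0 (all observations equal) would make theta_hat undefined.
    """
    n = len(sample)
    nu_hat = min(sample)
    t_stat = sum(math.log(x) for x in sample) - n * math.log(nu_hat)
    return n / t_stat, nu_hat, t_stat


# Assumed true parameters for the simulation.
random.seed(0)
true_theta, true_nu = 3.0, 2.0

# Inverse-CDF sampling: if U is uniform on (0, 1], then nu * U**(-1/theta) is Pareto.
sample = [
    true_nu * (1.0 - random.random()) ** (-1.0 / true_theta)
    for _ in range(20_000)
]

theta_hat, nu_hat, t_stat = pareto_mle(sample)

# First-order condition in theta, evaluated at the estimates (should be ~0).
n_obs = len(sample)
score = n_obs / theta_hat + n_obs * math.log(nu_hat) - sum(math.log(x) for x in sample)

print(f"theta_hat = {theta_hat:.3f}, nu_hat = {nu_hat:.4f}, score = {score:.2e}")
```

Since every observation is at least $\nu$, the estimate $\hat\nu$ can never fall below the true $\nu$, which the simulation makes visible.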

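(b) Under $H_0$, the MLE of $\theta$ is $\hat\theta_0 = 1$, and the MLE of $\nu$ is still $\hat\nu = \min_i x_i$. So, writing $T = T(\mathbf{x})$ and $\hat\theta = n/T$, the likelihood ratio statistic is

$$\lambda(\mathbf{x}) = \frac{L(1, \hat\nu; \mathbf{x})}{L(\hat\theta, \hat\nu; \mathbf{x})} = \frac{(\min_i x_i)^n \big/ \left(\prod_i x_i\right)^2}{(n/T)^n (\min_i x_i)^{n^2/T} \big/ \left(\prod_i x_i\right)^{n/T + 1}} = \left(\frac{T}{n}\right)^n \left[\frac{\prod_i x_i}{(\min_i x_i)^n}\right]^{n/T - 1} = \left(\frac{T}{n}\right)^n e^{\,n - T},$$

where the last step uses $\prod_i x_i / (\min_i x_i)^n = e^T$. Note now that

$$\frac{\partial \ln \lambda(\mathbf{x})}{\partial T} = \frac{\partial}{\partial T}\left\{n \ln T - n \ln n + n - T\right\} = \frac{n}{T} - 1,$$

so that $\lambda(\mathbf{x})$ is increasing for $T \le n$ and decreasing if $T \ge n$. Thus, $\lambda(\mathbf{x}) \le c$ is equivalent to $T \le c_1$ or $T \ge c_2$, for appropriately chosen constants $c_1$ and $c_2$.

7. (A Simple Application of the CLT) For a random sample $X_1, \ldots, X_n$ of Bernoulli($p$) variables, it is desired to test $H_0: p = 0.49$ versus $H_1: p = 0.51$. Use the Central Limit Theorem to determine, approximately, the sample size needed so that the two probabilities of error are both about 0.01. Use a test function that rejects $H_0$ if $\sum_{i=1}^n X_i$ is large.

Solution. The CLT tells us that

$$Z = \frac{\sum_{i=1}^n X_i - np}{\sqrt{np(1-p)}}$$

is approximately $N(0, 1)$. For a test that rejects $H_0$ when $\sum_{i=1}^n X_i > c$, we need to find $c$ and $n$ to satisfy

$$P\left(Z > \frac{c - 0.49\,n}{\sqrt{n (0.49)(0.51)}}\right) = 0.01 \quad \text{and} \quad P\left(Z > \frac{c - 0.51\,n}{\sqrt{n (0.51)(0.49)}}\right) = 0.99.$$

We thus want

$$\frac{c - 0.49\,n}{\sqrt{n (0.49)(0.51)}} \simeq 2.33 \quad \text{and} \quad \frac{c - 0.51\,n}{\sqrt{n (0.51)(0.49)}} \simeq -2.33.$$

Solving these equations gives $n = 13567$ and $c = 6783.5$.

8. (Abusing the Size in Practical Applications) One very striking abuse of size levels is to choose them after seeing the data and to choose them in such a way as to force rejection (or acceptance) of a null hypothesis. To see what the true Type I and Type II error probabilities of such a procedure are, calculate size and power of the following two trivial tests:

(a) Always reject $H_0$, no matter what data are obtained (equivalent to the practice of choosing the size level to force rejection of $H_0$).

(b) Always accept $H_0$, no matter what data are obtained (equivalent to the practice of choosing the size level to force acceptance of $H_0$).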

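Solution. (a) $P(\text{reject } H_0 \mid H_0 \text{ is true}) = 1 \Longrightarrow$ Type I error $= 1$; $P(\text{reject } H_0 \mid H_1 \text{ is true}) = 1 \Longrightarrow$ Type II error $= 0$. The test has size 1 and power 1.

(b) $P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0 \Longrightarrow$ Type I error $= 0$; $P(\text{reject } H_0 \mid H_1 \text{ is true}) = 0 \Longrightarrow$ Type II error $= 1$. The test has size 0 and power 0.

References

Casella, G., and R.L. Berger, Statistical Inference, Duxbury Press.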