# Hypothesis Testing, COMP 245 STATISTICS. Dr N A Heard



## Contents

1 Hypothesis Testing
  1.1 Introduction
  1.2 Error Rates and Power of a Test

2 Testing for a population mean
  2.1 Normal Distribution with Known Variance
    2.1.1 Duality with Confidence Intervals
  2.2 Normal Distribution with Unknown Variance

3 Testing for differences in population means
  3.1 Two Sample Problems
  3.2 Normal Distributions with Known Variances
  3.3 Normal Distributions with Unknown Variances

4 Goodness of Fit
  4.1 Count Data and Chi-Square Tests
  4.2 Proportions
  4.3 Model Checking
  4.4 Independence

## 1 Hypothesis Testing

### 1.1 Introduction

**Hypotheses**

Suppose we are going to obtain an i.i.d. random sample X = (X₁, ..., Xₙ) of a random variable X with an unknown distribution P_X. To proceed with modelling the underlying population, we might hypothesise probability models for P_X and then test whether such hypotheses seem plausible in light of the realised data x = (x₁, ..., xₙ). Or, more specifically, we might fix upon a parametric family P_{X|θ} with unknown parameter θ and then hypothesise values for θ.

Generically, let θ₀ denote a hypothesised value for θ. Then, after observing the data, we wish to test whether we can indeed reasonably assume θ = θ₀. For example, if X ~ N(µ, σ²) we may wish to test whether µ = 0 is plausible in light of the data x.

Formally, we define a null hypothesis H₀ as our hypothesised model of interest, and also specify an alternative hypothesis H₁ of rival models against which we wish to test H₀. Most often we simply test H₀: θ = θ₀ against H₁: θ ≠ θ₀. This is known as a two-sided test. In some situations it may be more appropriate to consider alternatives of the form H₁: θ > θ₀ or H₁: θ < θ₀, known as one-sided tests.

**Rejection Region for a Test Statistic**

To test the validity of H₀, we first choose a test statistic T(X) of the data for which we can find the distribution, P_T, under H₀. Then we identify a rejection region R ⊆ ℝ of low-probability values of T under the assumption that H₀ is true, so that P(T ∈ R | H₀) = α for some small probability α (typically 5%). A well-chosen rejection region will have relatively high probability under H₁, whilst retaining low probability under H₀. Finally, we calculate the observed test statistic t(x) for our observed data x. If t ∈ R, we reject the null hypothesis at the 100α% level.

**p-values**

For each possible significance level α ∈ (0, 1), a hypothesis test at the 100α% level will result in either rejecting or not rejecting H₀.
As α → 0 it becomes less and less likely that the null hypothesis will be rejected, as the rejection region becomes smaller and smaller. Similarly, as α → 1 it becomes more and more likely that the null hypothesis will be rejected. For a given data set and resulting test statistic, we might therefore be interested in identifying the critical significance level which marks the threshold between rejecting and not rejecting the null hypothesis. This is known as the p-value of the data. Smaller p-values suggest stronger evidence against H₀.

### 1.2 Error Rates and Power of a Test

**Test Errors**

There are two types of error in the outcome of a hypothesis test:

**Type I**: Rejecting H₀ when in fact H₀ is true. By construction, this happens with probability α. For this reason, the significance level of a hypothesis test is also referred to as the Type I error rate.

**Type II**: Not rejecting H₀ when in fact H₁ is true. The probability with which this type of error occurs depends on the unknown true value of θ, so to calculate values we plug in a plausible alternative value θ₁ ≠ θ₀ and let β = P(T ∉ R | θ = θ₁) be the probability of a Type II error.

**Power**

We define the power of a hypothesis test by 1 − β = P(T ∈ R | θ = θ₁). For a fixed significance level α, a well-chosen test statistic T and rejection region R will have high power; that is, they will maximise the probability of rejecting the null hypothesis when the alternative is true.

## 2 Testing for a population mean

### 2.1 Normal Distribution with Known Variance

**N(µ, σ²), σ² known**

Suppose X₁, ..., Xₙ are i.i.d. N(µ, σ²) with σ² known and µ unknown. We may wish to test whether µ = µ₀ for some specific value µ₀ (e.g. µ₀ = 0, µ₀ = 9.8). Then we can state our null and alternative hypotheses as

H₀: µ = µ₀;  H₁: µ ≠ µ₀.

Under H₀: µ = µ₀, we then know both µ and σ², so for the sample mean X̄ we have a known distribution for the test statistic

Z = (X̄ − µ₀)/(σ/√n) ~ N(0, 1).

So if we define our rejection region R to be the 100α% tails of the standard normal distribution,

R = (−∞, −z_{1−α/2}) ∪ (z_{1−α/2}, ∞) = {z : |z| > z_{1−α/2}},

we have P(Z ∈ R) = α under H₀. We thus reject H₀ at the 100α% significance level if our observed test statistic z = (x̄ − µ₀)/(σ/√n) ∈ R. The p-value is given by 2{1 − Φ(|z|)}.
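The z-test and power calculation above can be illustrated numerically. This is a minimal sketch, not part of the notes; the function names and the figures used to exercise them are purely illustrative.

```python
import numpy as np
from scipy.stats import norm

def z_test(x, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 when sigma is known."""
    n = len(x)
    z = (np.mean(x) - mu0) / (sigma / np.sqrt(n))
    z_crit = norm.ppf(1 - alpha / 2)          # z_{1-alpha/2}
    p_value = 2 * (1 - norm.cdf(abs(z)))      # p-value = 2{1 - Phi(|z|)}
    return z, abs(z) > z_crit, p_value

def power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the two-sided z-test against the alternative mu = mu1."""
    z_crit = norm.ppf(1 - alpha / 2)
    # Under mu = mu1 the statistic Z is N(shift, 1) with this mean:
    shift = (mu1 - mu0) / (sigma / np.sqrt(n))
    # 1 - beta = P(|Z| > z_crit) when Z ~ N(shift, 1)
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)
```

Increasing n or the distance |µ₁ − µ₀| increases the `shift` term, and hence the power, in line with the discussion above.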

(Figure: 5% rejection region for an N(0, 1) statistic; the shaded tails of the pdf φ(z) form R.)

#### 2.1.1 Duality with Confidence Intervals

There is a strong connection in this context between hypothesis testing and confidence intervals. Suppose we have constructed a 100(1 − α)% confidence interval for µ. Then this is precisely the set of values {µ₀} for which there would be insufficient evidence to reject a null hypothesis H₀: µ = µ₀ at the 100α% level.

**Example** A company makes packets of snack foods. The bags are labelled as weighing 454g; of course they won't all be exactly 454g, and let's suppose the variance of bag weights is known to be 70g². The following data show the mass in grams of 50 randomly sampled packets.

464, 450, 450, 456, 452, 433, 446, 446, 450, 447, 442, 438, 452, 447, 460, 450, 453, 456, 446, 433, 448, 450, 439, 452, 459, 454, 456, 454, 452, 449, 463, 449, 447, 466, 446, 447, 450, 449, 457, 464, 468, 447, 433, 464, 469, 457, 454, 451, 453, 443

Are these data consistent with the claim that the mean weight of packets is 454g?

1. We wish to test H₀: µ = 454 vs. H₁: µ ≠ 454. So set µ₀ = 454.
2. Although we have not been told that the packet weights are individually normally distributed, by the CLT we still have that the mean weight of the sample of packets is approximately normally distributed, and hence we still approximately have Z = (X̄ − µ₀)/(σ/√n) ~ N(0, 1).
3. x̄ = 451.22 and n = 50, so z = (x̄ − µ₀)/(σ/√n) = −2.350.
4. For a 5%-level significance test, we compare the statistic z = −2.350 with the rejection region R = (−∞, −z_{0.975}) ∪ (z_{0.975}, ∞) = (−∞, −1.96) ∪ (1.96, ∞). Clearly we have z ∈ R, and so at the 5% level we reject the null hypothesis that the mean packet weight is 454g.

5. At which significance levels would we have not rejected the null hypothesis? For a 1%-level significance test, the rejection region would have been R = (−∞, −z_{0.995}) ∪ (z_{0.995}, ∞) = (−∞, −2.576) ∪ (2.576, ∞). In that case z ∉ R, and so at the 1% level we would not have rejected the null hypothesis. The p-value is 2{1 − Φ(|z|)} = 2{1 − Φ(2.350)} = 0.019, and so we would only reject the null hypothesis for α > 1.9%.

### 2.2 Normal Distribution with Unknown Variance

**N(µ, σ²), σ² unknown**

Similarly, if σ² in the above example were unknown, we still have that

T = (X̄ − µ₀)/(s_{n−1}/√n) ~ t_{n−1}.

So for a test of H₀: µ = µ₀ vs. H₁: µ ≠ µ₀ at the 100α% level, the rejection region for our observed test statistic t = (x̄ − µ₀)/(s_{n−1}/√n) is

R = (−∞, −t_{n−1,1−α/2}) ∪ (t_{n−1,1−α/2}, ∞) = {t : |t| > t_{n−1,1−α/2}}.

(Figure: 5% rejection region for a t₅ statistic; the shaded tails of the pdf f(t) form R.)
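The one-sample t-test can be sketched in the same style as before; again this is an illustrative helper (names and sample data hypothetical), using `scipy.stats.t` for the t-distribution quantiles.

```python
import numpy as np
from scipy.stats import t as t_dist

def t_test(x, mu0, alpha=0.05):
    """Two-sided one-sample t-test of H0: mu = mu0, sigma unknown."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = x.std(ddof=1)                                 # bias-corrected s_{n-1}
    t_stat = (x.mean() - mu0) / (s / np.sqrt(n))
    t_crit = t_dist.ppf(1 - alpha / 2, df=n - 1)      # t_{n-1,1-alpha/2}
    p_value = 2 * (1 - t_dist.cdf(abs(t_stat), df=n - 1))
    return t_stat, abs(t_stat) > t_crit, p_value
```

The only changes from the z-test are the plug-in estimate s_{n−1} for σ and the t_{n−1} quantile in place of the normal one.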

**Example** Consider again the snack food weights example. There, we assumed the variance of bag weights was known to be 70. Without this, we could have estimated the variance by

s²_{n−1} = (1/(n − 1)) Σᵢ (xᵢ − x̄)².

The corresponding t-statistic, t = (x̄ − µ₀)/(s_{n−1}/√n) = −2.341, is then very similar to the z-statistic of before. And since n = 50, we compare with the t₄₉ distribution, which is approximately N(0, 1). So the hypothesis test results and p-value would be practically identical.

**Example** A particular piece of code takes a random time to run on a computer, but the average time is known to be 6 seconds. The programmer tries an alternative optimisation in compilation and wishes to know whether the mean run time has changed. To explore this, he runs the reoptimised code 16 times, obtaining a sample mean run time of 5.8 seconds and bias-corrected sample standard deviation of 1.2 seconds. Is the code any faster?

1. We wish to test H₀: µ = 6 vs. H₁: µ ≠ 6. So set µ₀ = 6.
2. Assuming the run times are approximately normal, T = (X̄ − µ₀)/(s_{n−1}/√n) ~ t_{n−1}. That is, (X̄ − 6)/(s_{n−1}/√16) ~ t₁₅. So we reject H₀ at the 100α% level if |T| > t_{15,1−α/2}.
3. x̄ = 5.8, s_{n−1} = 1.2 and n = 16, so t = (x̄ − µ₀)/(s_{n−1}/√n) = −0.667.
4. We have |t| = 2/3 << 2.13 = t_{15,0.975}, so we have insufficient evidence to reject H₀ at the 5% level.
5. In fact, the p-value for these data is 51.51%, so there is very little evidence to suggest the code is now any faster.
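The run-time example can be checked directly from its summary statistics alone; a short sketch using `scipy.stats.t` for the t₁₅ quantile and p-value:

```python
import numpy as np
from scipy.stats import t as t_dist

# Summary statistics from the run-time example above.
xbar, s, n, mu0 = 5.8, 1.2, 16, 6.0

t_stat = (xbar - mu0) / (s / np.sqrt(n))          # = -2/3
t_crit = t_dist.ppf(0.975, df=n - 1)              # t_{15,0.975} = 2.131
p_value = 2 * (1 - t_dist.cdf(abs(t_stat), df=n - 1))  # ~ 0.515
```

Since |t| = 0.667 is well inside (−2.131, 2.131), H₀ is not rejected at the 5% level, matching the conclusion in step 4.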

## 3 Testing for differences in population means

### 3.1 Two Sample Problems

**Samples from Two Populations**

Suppose, as before, we have a random sample X = (X₁, ..., X_{n₁}) from an unknown population distribution P_X. But now suppose we also have a further random sample Y = (Y₁, ..., Y_{n₂}) from a second, different population P_Y. Then we may wish to test hypotheses concerning the similarity of the two distributions P_X and P_Y. In particular, we are often interested in testing whether P_X and P_Y have equal means; that is, to test H₀: µ_X = µ_Y vs. H₁: µ_X ≠ µ_Y.

**Paired Data**

A special case is when the two samples X and Y are paired. That is, n₁ = n₂ = n and the data are collected as pairs (X₁, Y₁), ..., (Xₙ, Yₙ) so that, for each i, Xᵢ and Yᵢ are possibly dependent. For example, we might have a random sample of n individuals where Xᵢ represents the heart rate of the iᵗʰ person before light exercise and Yᵢ the heart rate of the same person afterwards. In this special case, for a test of equal means we can consider the sample of differences Z₁ = X₁ − Y₁, ..., Zₙ = Xₙ − Yₙ and test H₀: µ_Z = 0 using the single sample methods we have seen. In the above example, this would test whether light exercise causes a change in heart rate.

### 3.2 Normal Distributions with Known Variances

**N(µ_X, σ²_X), N(µ_Y, σ²_Y)**

Suppose X = (X₁, ..., X_{n₁}) are i.i.d. N(µ_X, σ²_X) with µ_X unknown; Y = (Y₁, ..., Y_{n₂}) are i.i.d. N(µ_Y, σ²_Y) with µ_Y unknown; and the two samples X and Y are independent. Then we still have that, independently,

X̄ ~ N(µ_X, σ²_X/n₁),  Ȳ ~ N(µ_Y, σ²_Y/n₂).

From this it follows that the difference in sample means satisfies

X̄ − Ȳ ~ N(µ_X − µ_Y, σ²_X/n₁ + σ²_Y/n₂),

and hence

{(X̄ − Ȳ) − (µ_X − µ_Y)} / √(σ²_X/n₁ + σ²_Y/n₂) ~ N(0, 1).

So under the null hypothesis H₀: µ_X = µ_Y, we have

Z = (X̄ − Ȳ) / √(σ²_X/n₁ + σ²_Y/n₂) ~ N(0, 1).

**N(µ_X, σ²_X), N(µ_Y, σ²_Y), with σ²_X, σ²_Y known**

So if σ²_X and σ²_Y are known, we immediately have a test statistic

z = (x̄ − ȳ) / √(σ²_X/n₁ + σ²_Y/n₂)

which we can compare against the quantiles of a standard normal. That is,

R = {z : |z| > z_{1−α/2}}

gives a rejection region for a hypothesis test of H₀: µ_X = µ_Y vs. H₁: µ_X ≠ µ_Y at the 100α% level.

### 3.3 Normal Distributions with Unknown Variances

**N(µ_X, σ²), N(µ_Y, σ²), with σ² unknown**

On the other hand, suppose σ²_X and σ²_Y are unknown. If we know that σ²_X = σ²_Y = σ², say, with σ² unknown, we can still proceed. We have

{(X̄ − Ȳ) − (µ_X − µ_Y)} / {σ √(1/n₁ + 1/n₂)} ~ N(0, 1),

and so, under H₀: µ_X = µ_Y,

(X̄ − Ȳ) / {σ √(1/n₁ + 1/n₂)} ~ N(0, 1),

but with σ unknown.

**Pooled Estimate of Population Variance**

We need an estimator for the variance using samples from two populations with different means. Just combining the samples together into one big sample would over-estimate the variance, since some of the variability in the samples would be due to the difference in µ_X and µ_Y. So we define the bias-corrected pooled sample variance

S²_{n₁+n₂−2} = {Σᵢ (Xᵢ − X̄)² + Σᵢ (Yᵢ − Ȳ)²} / (n₁ + n₂ − 2),

which is an unbiased estimator for σ². We can immediately see that S²_{n₁+n₂−2} is indeed unbiased for σ² by noting

S²_{n₁+n₂−2} = {(n₁ − 1)/(n₁ + n₂ − 2)} S²_{X,n₁−1} + {(n₂ − 1)/(n₁ + n₂ − 2)} S²_{Y,n₂−1};

that is, S²_{n₁+n₂−2} is a weighted average of the bias-corrected sample variances for the individual samples x and y, which are both unbiased estimates for σ².

Then, substituting S_{n₁+n₂−2} in for σ, we get

{(X̄ − Ȳ) − (µ_X − µ_Y)} / {S_{n₁+n₂−2} √(1/n₁ + 1/n₂)} ~ t_{n₁+n₂−2},

and so, under H₀: µ_X = µ_Y,

T = (X̄ − Ȳ) / {S_{n₁+n₂−2} √(1/n₁ + 1/n₂)} ~ t_{n₁+n₂−2}.

So we have a rejection region for a hypothesis test of H₀: µ_X = µ_Y vs. H₁: µ_X ≠ µ_Y at the 100α% level given by

R = {t : |t| > t_{n₁+n₂−2,1−α/2}}

for the statistic

t = (x̄ − ȳ) / {s_{n₁+n₂−2} √(1/n₁ + 1/n₂)}.

**Example** The same piece of C code was repeatedly run after compilation under two different C compilers, and the run times under each compiler were recorded. The sample mean and bias-corrected sample variance for Compiler 1 were 114s and 310s² respectively, and the corresponding figures for Compiler 2 were 94s and 290s². Both sets of data were each based on 15 runs. Suppose that Compiler 2 is a refined version of Compiler 1, so that if µ₁, µ₂ are the expected run times of the code under the two compilations, we might fairly assume µ₂ ≤ µ₁. Conduct a hypothesis test of H₀: µ₁ = µ₂ vs. H₁: µ₁ > µ₂ at the 5% level.

Until now we have mostly considered two-sided tests, that is, tests of the form H₀: θ = θ₀ vs. H₁: θ ≠ θ₀. Here we need to consider a one-sided test, which differs by the alternative hypothesis being of the form H₁: θ < θ₀ or H₁: θ > θ₀. This presents no extra methodological challenge and requires only a slight adjustment in the construction of the rejection region. We still use the t-statistic

t = (x̄ − ȳ) / {s_{n₁+n₂−2} √(1/n₁ + 1/n₂)},

where x̄, ȳ are the sample mean run times under Compilers 1 and 2 respectively. But now the one-sided rejection region becomes R = {t : t > t_{n₁+n₂−2,1−α}}.

First calculating the bias-corrected pooled sample variance, we get

s²_{n₁+n₂−2} = (14 × 310 + 14 × 290)/28 = 300.

(Note that since the sample sizes n₁ and n₂ are equal, the pooled estimate of the variance is simply the average of the individual estimates.) So

t = (x̄ − ȳ) / {s_{n₁+n₂−2} √(1/n₁ + 1/n₂)} = (114 − 94) / √{300 (1/15 + 1/15)} = 20/√40 = 3.16.

For a one-sided test we compare t = 3.16 with t_{28,0.95} = 1.701 and conclude that we reject the null hypothesis at the 5% level; the second compilation is significantly faster.
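The pooled calculation above can be reproduced from the summary statistics. This sketch treats the reported figures of 310 and 290 as the bias-corrected sample variances, which is consistent with the pooled value of 300 used in the example.

```python
import numpy as np
from scipy.stats import t as t_dist

# Summary statistics for the two compilers.
xbar1, var1, n1 = 114.0, 310.0, 15   # Compiler 1
xbar2, var2, n2 = 94.0, 290.0, 15    # Compiler 2

# Bias-corrected pooled sample variance: weighted average of the two.
var_pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)   # = 300
s_pooled = np.sqrt(var_pooled)

# One-sided test of H0: mu1 = mu2 vs H1: mu1 > mu2.
t_stat = (xbar1 - xbar2) / (s_pooled * np.sqrt(1 / n1 + 1 / n2))   # = 20/sqrt(40)
t_crit = t_dist.ppf(0.95, df=n1 + n2 - 2)                          # t_{28,0.95}
reject = t_stat > t_crit
```

Note the one-sided rejection region uses the 1 − α quantile rather than 1 − α/2, exactly as in the adjusted rejection region above.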

## 4 Goodness of Fit

### 4.1 Count Data and Chi-Square Tests

**Count Data**

The results in the previous sections relied upon the data being either normally distributed, or at least, through the CLT, having an approximately normally distributed sample mean. Tests were then developed for making inference on population means under those assumptions. These tests were very much model-based.

Another important but very different problem concerns model checking, which can be addressed through a more general consideration of count data for simple (discrete and finite) distributions. The following ideas can then be trivially extended to infinite-range discrete and continuous random variables by binning observed samples into a finite collection of predefined intervals.

**Samples from a Simple Random Variable**

Let X be a simple random variable taking values in the range {x₁, ..., x_k}, with probability mass function p_j = P(X = x_j), j = 1, ..., k. A random sample of size n from the distribution of X can be summarised by the observed frequency counts O = (O₁, ..., O_k) at the points x₁, ..., x_k (so Σⱼ O_j = n).

Suppose it is hypothesised that the true pmf {p_j} is from a particular parametric model, p_j = P(X = x_j | θ), j = 1, ..., k, for some unknown parameter p-vector θ. To test this hypothesis about the model, we first need to estimate the unknown parameters θ so that we are able to calculate the distribution of any statistic under the null hypothesis H₀: p_j = P(X = x_j | θ), j = 1, ..., k. Let ˆθ be such an estimator, obtained using the sample O. Then under H₀ we have estimated probabilities for the pmf, p̂_j = P(X = x_j | ˆθ), j = 1, ..., k, and so we are able to calculate estimated expected frequency counts E = (E₁, ..., E_k) by E_j = n p̂_j. (Note again we have Σⱼ E_j = n.) We then seek to compare the observed frequencies with the expected frequencies to test for goodness of fit.

**Chi-Square Test**

To test H₀: p_j = P(X = x_j | θ) vs. H₁: {p_j} unrestricted (0 ≤ p_j ≤ 1, Σⱼ p_j = 1), we use the chi-square statistic

X² = Σᵢ (Oᵢ − Eᵢ)² / Eᵢ.

If H₀ were true, then the statistic X² would approximately follow a chi-square distribution with ν = k − p − 1 degrees of freedom, where k is the number of values (categories) the simple r.v. X can take and p is the number of parameters being estimated (dim(θ)). For the approximation to be valid, we should have E_j ≥ 5 for all j. This may require some merging of categories.
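The goodness-of-fit computation can be sketched as a small helper; the function name and the die-roll counts used to exercise it below are illustrative only.

```python
import numpy as np
from scipy.stats import chi2

def chi_square_gof(observed, probs, n_params=0, alpha=0.05):
    """Chi-square goodness-of-fit test of H0: p_j = probs[j].

    n_params is the number of parameters estimated from the data, dim(theta).
    """
    observed = np.asarray(observed, dtype=float)
    expected = observed.sum() * np.asarray(probs, dtype=float)  # E_j = n * p_j
    x2 = ((observed - expected) ** 2 / expected).sum()
    df = len(observed) - n_params - 1                           # nu = k - p - 1
    crit = chi2.ppf(1 - alpha, df=df)                           # upper-tail quantile
    return x2, x2 > crit
```

For example, counts (10, 12, 8, 11, 9, 10) from 60 rolls of a die, tested against the fair-die model p_j = 1/6, give X² = 1.0 on 5 degrees of freedom, nowhere near the upper tail.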

(Figure: pdfs of the χ²_ν distribution for ν = 1, 2, 3, 5, 10.)

**Rejection Region**

Clearly larger values of X² correspond to larger deviations from the null hypothesis model; in particular, if X² = 0 the observed counts exactly match those expected under H₀. For this reason, we always perform a one-sided goodness of fit test using the χ² statistic, looking only at the upper tail of the distribution. Hence the rejection region for a goodness of fit hypothesis test at the 100α% level is given by

R = {x : x > χ²_{k−p−1,1−α}}.

### 4.2 Proportions

**Example** Each year, around 1.3 million people in the USA suffer adverse drug effects (ADEs). A study in the Journal of the American Medical Association (July 5, 1995) gave the causes of 95 ADEs below.

| Cause | Number of ADEs |
|---|---|
| Lack of knowledge of drug | 29 |
| Rule violation | 17 |
| Faulty dose checking | 13 |
| Slips | 9 |
| Other | 27 |

Test whether the true percentages of ADEs differ across the 5 causes.

Under the null hypothesis that the 5 causes are equally likely, we would have expected counts of 95/5 = 19 for each cause.

So our χ² statistic becomes

x² = (29 − 19)²/19 + (17 − 19)²/19 + (13 − 19)²/19 + (9 − 19)²/19 + (27 − 19)²/19 = 304/19 = 16.

We have not estimated any parameters from the data, so we compare x² with the quantiles of the χ²_{5−1} = χ²₄ distribution. Well, 16 > 9.49 = χ²_{4,0.95}, so we reject the null hypothesis at the 5% level; we have reason to suppose that there is a difference in the true percentages across the different causes.

### 4.3 Model Checking

**Example: Fitting a Poisson Distribution to Data**

Recall the example from the Discrete Random Variables chapter, where the number of particles emitted by a radioactive substance which reached a Geiger counter was measured for 608 time intervals, each of length 7.5 seconds. We fitted a Poisson(λ) distribution to the data by plugging in the sample mean number of counts (3.870) for the rate parameter λ (which we now know to be the MLE!).

Whilst the fitted Poisson(3.87) expected frequencies E(n_x) looked quite convincing to the eye against the observed frequencies O(n_x), at that time we had no formal method of quantitatively assessing the fit. However, we now know how to proceed: for each count x we tabulate the observed frequency O, the expected frequency E, and the contribution (O − E)²/E, and compute the statistic

x² = Σ (O − E)²/E,

which should be compared with a χ²₉ distribution (ν = k − p − 1 = 9 degrees of freedom, one parameter having been estimated from the data). Well, χ²_{9,0.95} = 16.91, so at the 5% level we do not reject the null hypothesis of a Poisson(3.87) model for the data.

### 4.4 Independence

**Contingency Tables**

Suppose we have two simple random variables X and Y which are jointly distributed with unknown probability mass function p_XY. We are often interested in trying to ascertain whether X and Y are independent; that is, to determine whether p_XY(x, y) = p_X(x) p_Y(y).

Let the ranges of the r.v.s X and Y be {x₁, ..., x_k} and {y₁, ..., y_l} respectively. Then an i.i.d. sample of size n from the joint distribution of (X, Y) can be represented by a list of counts n_ij (1 ≤ i ≤ k; 1 ≤ j ≤ l) of the number of times we observe the pair (xᵢ, y_j). Tabulating these data in the following way gives what is known as a k × l contingency table.

|       | y₁  | y₂  | ... | y_l | total |
|---|---|---|---|---|---|
| x₁   | n₁₁ | n₁₂ | ... | n₁ₗ | n₁. |
| x₂   | n₂₁ | n₂₂ | ... | n₂ₗ | n₂. |
| ...  |     |     |     |     |     |
| x_k  | n_k₁ | n_k₂ | ... | n_kl | n_k. |
| total | n.₁ | n.₂ | ... | n.ₗ | n   |

Note the row sums (n₁., n₂., ..., n_k.) represent the frequencies of x₁, x₂, ..., x_k in the sample (that is, ignoring the value of Y); similarly for the column sums (n.₁, n.₂, ..., n.ₗ) and y₁, ..., y_l.

Under the null hypothesis H₀: X and Y are independent, the expected values of the entries of the contingency table, conditional on the row and column sums, can be estimated by

n̂_ij = n_i. n._j / n,  1 ≤ i ≤ k, 1 ≤ j ≤ l.

To see this, consider the marginal distribution of X; we could estimate p_X(xᵢ) by p̂_i. = n_i./n, and similarly for p_Y(y_j) we get p̂._j = n._j/n. Then, under the null hypothesis of independence, p_XY(xᵢ, y_j) = p_X(xᵢ) p_Y(y_j), and so we can estimate p_XY(xᵢ, y_j) by p̂_ij = p̂_i. p̂._j = n_i. n._j / n².

Now that we have a set of expected frequencies to compare against our k × l observed frequencies, a χ² test can be performed. We are using the row and column sums to estimate our probabilities, and these contribute k − 1 and l − 1 free parameters respectively. So we compare our calculated x² statistic against a χ² distribution with

kl − {(k − 1) + (l − 1)} − 1 = (k − 1)(l − 1)

degrees of freedom. Hence the rejection region for a hypothesis test of independence in a k × l contingency table at the 100α% level is given by

R = {x : x > χ²_{(k−1)(l−1),1−α}}.
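The expected-count and degrees-of-freedom calculations for a k × l table can be sketched as follows; the helper name and the perfectly independent table used to exercise it are hypothetical (SciPy's `chi2_contingency` performs the same test directly).

```python
import numpy as np
from scipy.stats import chi2

def independence_test(table, alpha=0.05):
    """Chi-square test of independence for a k x l contingency table."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)   # row sums n_{i.}, shape (k, 1)
    col = table.sum(axis=0, keepdims=True)   # column sums n_{.j}, shape (1, l)
    expected = row @ col / n                 # nhat_ij = n_{i.} n_{.j} / n
    x2 = ((table - expected) ** 2 / expected).sum()
    k, l = table.shape
    df = (k - 1) * (l - 1)                   # (k-1)(l-1) degrees of freedom
    crit = chi2.ppf(1 - alpha, df=df)
    return x2, df, x2 > crit
```

A table whose rows are exact multiples of each other reproduces its own expected counts, giving X² = 0, the "perfect fit" end of the one-sided rejection region.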

**Example** An article in the International Journal of Sports Psychology (July-Sept 1990) evaluated the relationship between physical fitness and stress. 549 people were classified as having good, average, or poor fitness, and were also tested for signs of stress (yes or no), giving a 2 × 3 contingency table of observed counts (rows: stress, no stress; columns: poor, average, good fitness). Is there any relationship between stress and fitness?

Under independence we estimate the expected count in each cell by n̂_ij = n_i. n._j / 549, and hence calculate the χ² statistic

X² = Σᵢ (Oᵢ − Eᵢ)² / Eᵢ.

This should be compared with a χ² distribution with (2 − 1)(3 − 1) = 2 degrees of freedom. The calculated statistic falls below χ²_{2,0.95} = 5.99, so we have no significant evidence to suggest there is any relationship between fitness and stress.


### Null Hypothesis Significance Testing Signifcance Level, Power, t-tests. 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom

Null Hypothesis Significance Testing Signifcance Level, Power, t-tests 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom Simple and composite hypotheses Simple hypothesis: the sampling distribution is

### Lecture 8 Hypothesis Testing

Lecture 8 Hypothesis Testing Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech Midterm 1 Score 46 students Highest score: 98 Lowest

### Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Comparing Multiple Proportions, Test of Independence and Goodness of Fit Content Testing the Equality of Population Proportions for Three or More Populations Test of Independence Goodness of Fit Test 2

### Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG820). December 15, 2012.

Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG8). December 15, 12. 1. (3p) The joint distribution of the discrete random variables X and

### HypoTesting. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Name: Class: Date: HypoTesting Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A Type II error is committed if we make: a. a correct decision when the

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### Chapter 9: Hypothesis Testing Sections

Chapter 9: Hypothesis Testing Sections Skip: 9.2 Testing Simple Hypotheses Skip: 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.5 The t Test 9.6 Comparing the Means of Two Normal

### Hypothesis tests, confidence intervals, and bootstrapping

Hypothesis tests, confidence intervals, and bootstrapping Business Statistics 41000 Fall 2015 1 Topics 1. Hypothesis tests Testing a mean: H0 : µ = µ 0 Testing a proportion: H0 : p = p 0 Testing a difference

### Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

### Introduction to Hypothesis Testing OPRE 6301

Introduction to Hypothesis Testing OPRE 6301 Motivation... The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about

### C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Using pivots to construct confidence intervals. In Example 41 we used the fact that

Using pivots to construct confidence intervals In Example 41 we used the fact that Q( X, µ) = X µ σ/ n N(0, 1) for all µ. We then said Q( X, µ) z α/2 with probability 1 α, and converted this into a statement

### MATH 214 (NOTES) Math 214 Al Nosedal. Department of Mathematics Indiana University of Pennsylvania. MATH 214 (NOTES) p. 1/6

MATH 214 (NOTES) Math 214 Al Nosedal Department of Mathematics Indiana University of Pennsylvania MATH 214 (NOTES) p. 1/6 "Pepsi" problem A market research consultant hired by the Pepsi-Cola Co. is interested

### Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

### Maximum Likelihood Estimation

Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

### Measuring the Power of a Test

Textbook Reference: Chapter 9.5 Measuring the Power of a Test An economic problem motivates the statement of a null and alternative hypothesis. For a numeric data set, a decision rule can lead to the rejection

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### Confidence Intervals for the Difference Between Two Means

Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

### Chapter 8. Hypothesis Testing

Chapter 8 Hypothesis Testing Hypothesis In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing

### Test of Hypotheses. Since the Neyman-Pearson approach involves two statistical hypotheses, one has to decide which one

Test of Hypotheses Hypothesis, Test Statistic, and Rejection Region Imagine that you play a repeated Bernoulli game: you win \$1 if head and lose \$1 if tail. After 10 plays, you lost \$2 in net (4 heads

### Hypothesis Testing - II

-3σ -2σ +σ +2σ +3σ Hypothesis Testing - II Lecture 9 0909.400.01 / 0909.400.02 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S -3σ -2σ +σ +2σ +3σ Review: Hypothesis

### CHAPTER 15: Tests of Significance: The Basics

CHAPTER 15: Tests of Significance: The Basics The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 15 Concepts 2 The Reasoning of Tests of Significance

2. DATA AND EXERCISES (Geos2911 students please read page 8) 2.1 Data set The data set available to you is an Excel spreadsheet file called cyclones.xls. The file consists of 3 sheets. Only the third is

### Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

### THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

### Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

### 13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations.

13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations. Data is organized in a two way table Explanatory variable (Treatments)

### CHAPTER 11 SECTION 2: INTRODUCTION TO HYPOTHESIS TESTING

CHAPTER 11 SECTION 2: INTRODUCTION TO HYPOTHESIS TESTING MULTIPLE CHOICE 56. In testing the hypotheses H 0 : µ = 50 vs. H 1 : µ 50, the following information is known: n = 64, = 53.5, and σ = 10. The standardized

### Practice problems for Homework 12 - confidence intervals and hypothesis testing. Open the Homework Assignment 12 and solve the problems.

Practice problems for Homework 1 - confidence intervals and hypothesis testing. Read sections 10..3 and 10.3 of the text. Solve the practice problems below. Open the Homework Assignment 1 and solve the

### CHI-SQUARE: TESTING FOR GOODNESS OF FIT

CHI-SQUARE: TESTING FOR GOODNESS OF FIT In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

### Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### Principle of Data Reduction

Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

### 11-2 Goodness of Fit Test

11-2 Goodness of Fit Test In This section we consider sample data consisting of observed frequency counts arranged in a single row or column (called a one-way frequency table). We will use a hypothesis

### Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

### For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )

Probability Review 15.075 Cynthia Rudin A probability space, defined by Kolmogorov (1903-1987) consists of: A set of outcomes S, e.g., for the roll of a die, S = {1, 2, 3, 4, 5, 6}, 1 1 2 1 6 for the roll

### Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

### 1 Confidence intervals

Math 143 Inference for Means 1 Statistical inference is inferring information about the distribution of a population from information about a sample. We re generally talking about one of two things: 1.

### Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

### Testing: is my coin fair?

Testing: is my coin fair? Formally: we want to make some inference about P(head) Try it: toss coin several times (say 7 times) Assume that it is fair ( P(head)= ), and see if this assumption is compatible

### Chapter 9, Part A Hypothesis Tests. Learning objectives

Chapter 9, Part A Hypothesis Tests Slide 1 Learning objectives 1. Understand how to develop Null and Alternative Hypotheses 2. Understand Type I and Type II Errors 3. Able to do hypothesis test about population

### Chi Square for Contingency Tables

2 x 2 Case Chi Square for Contingency Tables A test for p 1 = p 2 We have learned a confidence interval for p 1 p 2, the difference in the population proportions. We want a hypothesis testing procedure

### E205 Final: Version B

Name: Class: Date: E205 Final: Version B Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The owner of a local nightclub has recently surveyed a random

### Section 12 Part 2. Chi-square test

Section 12 Part 2 Chi-square test McNemar s Test Section 12 Part 2 Overview Section 12, Part 1 covered two inference methods for categorical data from 2 groups Confidence Intervals for the difference of

### Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

### Lecture 1: t tests and CLT

Lecture 1: t tests and CLT http://www.stats.ox.ac.uk/ winkel/phs.html Dr Matthias Winkel 1 Outline I. z test for unknown population mean - review II. Limitations of the z test III. t test for unknown population

### NPTEL STRUCTURAL RELIABILITY

NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

### Basics of Statistical Machine Learning

CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

### Standard Deviation Estimator

CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

### Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review