Hypothesis Testing


Learning Objectives

After completing this module, the student will be able to
- carry out a statistical test of significance
- calculate the acceptance and rejection region
- calculate and interpret the p-value of a statistical test
- calculate and interpret type 1 and type 2 errors
- calculate the power of a test

Knowledge and Skills
- Concepts: null hypothesis, alternative, test statistic, rejection region, acceptance region, p-value, significance level, type 1 error, type 2 error, false positive, false negative, power of a test
- Resampling method
- Fisher's exact test

Prerequisites
- binomial distribution
- hypergeometric distribution
- Normal distribution
- Sample average
- Sample standard deviation
- macros in Excel

Funding: This work was partially supported by a HHMI Professors grant from the Howard Hughes Medical Institute.
Prologue

The problem of decision making is ubiquitous. Almost daily, you can read in the news about studies that lead to recommendations based on statistical evidence. The U.S. Department of Health and Human Services Agency for Healthcare Research and Quality (http://www.ahrq.gov/) provides health care recommendations, for instance, through its U.S. Preventive Services Task Force (http://www.ahrq.gov/clinic/uspstfix.htm), an independent panel of experts in primary care and prevention, which reviews research results and develops recommendations. These recommendations are based on analyses of tens or hundreds of clinical studies, and recommendations may change as new evidence accumulates over time.

Frequently, clinical studies are phrased in terms of hypothesis testing. For instance, if a new treatment for a disease is developed, we might wish to know whether it performs better than the current treatment. We set up a clinical trial where patients are randomly assigned to one or the other treatment. We then compare the number of successful treatments in each group. Let's assume that the two groups have the same number of patients. In order to conclude that the new treatment is better than the current treatment, we would need to demonstrate that the number of successful treatments in the new treatment group is larger than the number of successful treatments in the current treatment group. The question is how much larger the number of successful treatments in the new treatment group would need to be to convince other investigators that the new treatment is indeed better. These kinds of questions can be answered within the framework of hypothesis testing.

In-class Activity 1

Assume the current treatment for a disease is successful in 30% of all cases. A new treatment is being developed and a preliminary clinical trial showed that 5 out of 10 patients were successfully treated. Can you conclude that the new treatment is more successful?
If the new treatment were not better than the current treatment, we would hypothesize that the new treatment has probability 0.3 of being successful. Alternatively, if the new treatment is better than the current treatment, we would hypothesize that the new treatment has probability greater than 0.3 of being successful. If the new treatment has the same likelihood of success as the current treatment, namely probability 0.3, then the number of patients in the small clinical trial who are treated successfully under the new treatment is binomially distributed with 10 trials and success probability 0.3. The following table was created in Excel using the BINOMDIST function and shows this probability distribution:
x     P(X = x)
0     0.0282
1     0.1211
2     0.2335
3     0.2668
4     0.2001
5     0.1029
6     0.0368
7     0.0090
8     0.0014
9     0.0001
10    0.0000

We see that the probability of five or more successes when the success probability is 0.3 is $P(X \ge 5) = P(X = 5) + P(X = 6) + \dots + P(X = 10) \approx 0.150$. Thus, it is not unlikely to see 5 (or more) out of 10 patients recover when the probability of recovery is 0.3. We conclude that there is not enough evidence that the new treatment is better.

Discuss in your group the following questions:
1. Why did we add up the probabilities in the above example?
2. Would you be able to conclude definitively from this study that the new treatment isn't any better?
3. What would be your next step in determining whether the new treatment is better?
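The sum in In-class Activity 1 can also be checked outside Excel. What follows is a minimal Python sketch, not part of the original module: the function names `binom_pmf` and `upper_tail` are illustrative choices, and the code assumes only the binomial model stated above (10 trials, success probability 0.3).

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k, n, p):
    """Probability of k or more successes: P(X >= k)."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))

n, p = 10, 0.3  # number of patients and current success rate from the activity
for x in range(n + 1):
    print(f"{x:2d}  {binom_pmf(x, n, p):.4f}")
print(f"P(X >= 5) = {upper_tail(5, n, p):.4f}")  # about 0.1503
```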
In-class Activity 2

Suppose you have a coin in your pocket. You want to decide whether the coin is fair or biased. You hypothesize that the coin is fair. To test this hypothesis, you toss the coin 30 times. The number of heads is binomially distributed with the number of trials being 30 and the probability of heads (success) being 0.5. Below is the histogram of the probability distribution.

[Figure: histogram of the binomial distribution with 30 trials and success probability 0.5]

Suppose the experiment resulted in 18 heads and 12 tails. Discuss the following questions in your group:
1. What can you say about the coin? Is it a fair coin or a biased coin?
2. What would your conclusion be if the experiment resulted in 24 heads and 6 tails?
3. What criteria did you use to make the decision in each of the two cases?
4. Can you be sure that your decision is correct?
Some Theory

In both In-class Activities, you had to make a decision between two alternatives. In the first case, you needed to decide whether the new treatment was better than the current treatment. In the second case, you needed to decide whether the coin was fair or biased. In both cases, you relied on a probability model, and you based your decision on how likely the outcome of the experiment was compared to the expectation of the model. In both cases, there was also the possibility that you arrived at the wrong decision.

In the following, we will discuss the basic elements of hypothesis testing. We will use the example of the fair coin versus the biased coin because of its simplicity. One hypothesis is that the coin is fair, that is, that the probability of heads is 0.5. The alternative hypothesis is that the coin is biased, that is, the probability of heads is different from 0.5. We base our decision of whether or not the coin is fair on comparing the result of our experiment to what we expect based on a probabilistic model. Namely, if the fraction of heads in the experiment is close to 1/2, the experiment provides evidence for the coin being fair; if the fraction of heads is either low or high, the experiment provides evidence for the coin being biased.

The hypothesis "the coin is fair" is called the null hypothesis and is denoted by $H_0$. The alternative "the coin is biased" is denoted by $H_1$. (We will say more about which of the two hypotheses is the null hypothesis and which is the alternative later.) We summarize this as

$H_0: p = 0.5$
$H_1: p \neq 0.5$

We designed an experiment in which we tossed the coin thirty times. The data collected in the experiment provided evidence for or against the null hypothesis. The data in our experiment were the sequence of heads and tails in the thirty trials. From these data we can calculate a single number, namely the number of heads, which we can compare against what we would expect under the null hypothesis.
This single number is called the test statistic. A probabilistic model for tossing a fair coin allows us to calculate the probability distribution of the test statistic. Namely, under the null hypothesis, the number of heads is binomially distributed with 30 trials and success probability p = 0.5. In the experiment, we observed 18 heads. How likely is it that we observe 18 or more heads? If X denotes the number of heads, we are asking for $P(X \ge 18)$, which can be calculated by adding up the probabilities of the events $\{X = 18\}, \{X = 19\}, \dots, \{X = 30\}$. Refer to the spreadsheet (tab "Fair Coin") to verify that $P(X \ge 18) \approx 0.1808$.
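As a cross-check on the spreadsheet, the tail probability can be computed by counting. Under the null hypothesis all $2^{30}$ sequences of heads and tails are equally likely, so $P(X \ge 18)$ is the number of sequences with at least 18 heads divided by $2^{30}$. The sketch below is an illustration only; the function name `fair_coin_upper_tail` is an assumption of this example, not something from the module's spreadsheet.

```python
from fractions import Fraction
from math import comb

def fair_coin_upper_tail(k, n):
    """Exact P(X >= k) for the number of heads X in n tosses of a fair coin."""
    favorable = sum(comb(n, j) for j in range(k, n + 1))  # sequences with >= k heads
    return Fraction(favorable, 2**n)                      # all 2**n sequences equally likely

tail = fair_coin_upper_tail(18, 30)
print(tail, float(tail))  # the decimal value is about 0.1808
```

Working with exact fractions avoids rounding questions when the result is compared with the spreadsheet value.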
Since the alternative is two-sided, that is, p < 0.5 or p > 0.5, we will reject the null hypothesis if the number of heads is either too large or too small. We only calculated the probability of at least 18 heads. The probability distribution under the null hypothesis is symmetric and so we add to this the probability of the symmetric event "at most 12 heads", or $\{X \le 12\}$. We thus need to calculate the probability of the event of having either at least 18 heads or at most 12 heads:

$P(X \le 12 \text{ or } X \ge 18) = 0.1808 + 0.1808 = 0.3616$

This probability is called the p-value and denoted by p. Statisticians commonly use the following guidelines:
p < 0.01: strong evidence against $H_0$
0.01 < p < 0.05: moderate evidence against $H_0$
p > 0.10: little or no evidence against $H_0$

Since p = 0.3616, there is little or no evidence against $H_0$, and thus not sufficient evidence to reject the null hypothesis. If, instead of 18 heads, we observed 24 heads in the experiment, we find for the p-value

$P(X \le 6 \text{ or } X \ge 24) = (2)(0.0007) = 0.0014$

We conclude that there is strong evidence against $H_0$. We reject the null hypothesis and say that the result is highly statistically significant.

Rejection Region

In our example, we are looking for outcomes that have either a large or a small number of heads. We can define a set of extreme outcomes a priori so that if the outcome of the experiment is in this set, we reject the null hypothesis. The set of extreme outcomes is called the rejection region. Because the probability distribution is symmetric about 15, and there are 15 possible outcomes below 15, namely 0, 1, 2, ..., 14, and 15 possible outcomes above 15, namely 16, 17, ..., 30, we will define the rejection region in a symmetric way. Namely, we will identify a number a so that the rejection region is of the form $\{0, 1, 2, \dots, a\} \cup \{30 - a, 30 - a + 1, \dots, 30\}$.
If we are looking for moderate evidence, say, we want to test at the 0.05 significance level, we will choose a as large as possible so that each of the two sets has probability close to 0.025. Looking at the table of probabilities for the outcomes "Number of heads is equal to k", we see that if a = 9, we have

$P(X \le 9 \text{ or } X \ge 21) = 0.0214 + 0.0214 = 0.0428$

If we choose a larger value for a, the probability would exceed 0.05; a smaller value of a would make the probability smaller than it needs to be. Thus a = 9 is the best choice for defining the rejection region if we are interested in moderate evidence against the null hypothesis. We thus reject the null hypothesis if the number of heads in the experiment of tossing the coin thirty times is either less than or equal to 9 or greater than or equal to 21. The complement of the rejection region, called the acceptance region, is the set $\{10, 11, 12, \dots, 19, 20\}$. In the experiment, we observed 18 heads, which is in the acceptance region. We thus do not reject the null hypothesis.

Statisticians are careful about phrasing their conclusion. If the outcome of the experiment is unlikely under the null hypothesis, they reject the null hypothesis. If not, they will say that the null hypothesis cannot be rejected. Statisticians do not accept a null hypothesis. There is a big difference between saying "we do not reject a null hypothesis" and "we accept a null hypothesis". Just because the data are consistent with the null hypothesis does not mean that the null hypothesis is true; there could be many other reasons for getting a result that is consistent with the null hypothesis. That is, not rejecting a null hypothesis does not assert its truth. The null hypothesis merely withstood a challenge. As we will see shortly, rejecting a null hypothesis only means that the null hypothesis may not be true.

The histogram in the figure below indicates the acceptance region and rejection region.

[Figure: histogram of the number of heads with the acceptance region and rejection region marked]
In a two-sided test, the rejection region is the union of the two extreme events that are in the two ends of the distribution, which are called the tails of the distribution.
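The two-sided p-value and the choice of the cutoff a can both be expressed through one quantity, the probability of the symmetric event "at most a heads or at least 30 - a heads" under the null hypothesis. The following Python sketch is offered as an illustration under that reading; `two_sided_prob` and `largest_cutoff` are hypothetical helper names, not functions from the module's spreadsheet.

```python
from math import comb

N_TOSSES = 30  # number of tosses in the coin experiment

def pmf(k, n=N_TOSSES, p=0.5):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_prob(a, n=N_TOSSES):
    """P(X <= a or X >= n - a) under H0: p = 0.5, for a < n / 2."""
    lower = sum(pmf(k, n) for k in range(0, a + 1))
    upper = sum(pmf(k, n) for k in range(n - a, n + 1))
    return lower + upper

def largest_cutoff(alpha, n=N_TOSSES):
    """Largest a whose symmetric rejection region has probability <= alpha."""
    best = None
    for a in range(0, (n - 1) // 2 + 1):  # keep the two tails disjoint
        if two_sided_prob(a, n) <= alpha:
            best = a
    return best

print(f"p-value for 18 heads: {two_sided_prob(12):.4f}")  # about 0.3616
print(f"p-value for 24 heads: {two_sided_prob(6):.4f}")   # about 0.0014
print(f"cutoff a at the 0.05 level: {largest_cutoff(0.05)}")  # 9
```

Observing 18 heads corresponds to a = 12 and observing 24 heads to a = 6, which is why the same function serves for both the p-values and the rejection region.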
Type I Error

The probability of the rejection region in our experiment is 0.0428. That is, there is a 4.3% probability that we will reject the null hypothesis even though it is true. Erroneously rejecting the null hypothesis is called committing a type I error. The type I error leads to false positives. Since there is a positive probability that the null hypothesis is erroneously rejected, we can only conclude that the null hypothesis may not be true when we reject the null hypothesis.

Type II Error

The other possible error is not rejecting the null hypothesis when the alternative is true. This is called a type II error. The type II error leads to false negatives. The type II error can only be calculated if the alternative is sufficiently specified. In our example, we only said that the coin is biased under the alternative. There are infinitely many probability models that satisfy the assumption of the alternative, namely any binomial distribution with $p \neq 0.5$. For a fixed value of $p \neq 0.5$, we can calculate the type II error. For instance, let's assume p = 0.7. Then

$P(10 \le X \le 20 \mid p = 0.7) \approx 0.4112$

For larger values of p, this probability will be smaller. For instance, if p = 0.8, then

$P(10 \le X \le 20 \mid p = 0.8) \approx 0.0611$
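The type II error for a given alternative is simply the probability of landing in the acceptance region $\{10, \dots, 20\}$ when the true probability of heads is that alternative value. A short Python sketch of this calculation follows; the function name `type_ii_error` and the list of alternative values are choices made for this illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def type_ii_error(p_true, n=30, low=10, high=20):
    """P(low <= X <= high) when the true probability of heads is p_true.

    The defaults are the acceptance region found above for 30 tosses
    at the 0.05 significance level.
    """
    return sum(binom_pmf(k, n, p_true) for k in range(low, high + 1))

for p_true in (0.6, 0.7, 0.8, 0.9):  # illustrative alternatives
    beta = type_ii_error(p_true)
    print(f"p = {p_true:.1f}: type II error = {beta:.4f}, 1 - type II error = {1 - beta:.4f}")
```

Evaluating the same function at p = 0.5 gives the probability of the acceptance region under the null hypothesis, which is one minus the type I error.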
This means that the larger (or, by symmetry, the smaller) the probability of heads is, the better we will be able to detect whether a coin is biased. This is quantified by the power of the test, which is defined as 1 − type II error. The power of a test is therefore the probability of rejecting the null hypothesis when the alternative is true. The following table lists the power of the test for different values of the probability of heads.

[Table: power of the test by P(Heads)]

We can plot the power of the test as a function of the probability of heads:

[Figure: power of the test as a function of the probability of heads]

Hypothesis Testing through Resampling

In our example of testing whether a coin is fair, we were able to calculate the probability distribution under the null hypothesis exactly. In many applications, the probability distribution under the null hypothesis is not known and must be simulated. This approach is called the resampling method. We can illustrate this important method on our example.

In the spreadsheet (tab "Simulation"), we set up a simulation of 30 independent trials, each with probability 0.5 of success. If we denote a success by a 1 and a failure by a 0, then the syntax to accomplish this in Excel is =IF(RAND()<0.5,1,0) (see, for instance, Cell B4). If you add up the 30 values, you obtain the total number of successes in the 30 trials, which you can find in Cell E4. Write a macro that runs 500 such experiments and records the number of heads in each of the 500 runs in column I. If you want the type I error to be 5%, then since the test is two-sided, we need to determine the 2.5th and 97.5th percentiles of the simulation outcomes to find the acceptance region. Find the acceptance region and the corresponding rejection region. How does this compare to the exact calculations we did earlier?

Summary

Statistical hypothesis testing involves the following steps:
1. Formulation of a null hypothesis and an alternative.
2. Construction of a test statistic that can discriminate between the null hypothesis and the alternative. Calculate the probability distribution of the test statistic under the null hypothesis.
3. There are two ways to proceed from here. Either one allows us to control the type I error:
   a. Specification of the type I error and calculation of the rejection and acceptance region, followed by data collection and the decision of whether or not to reject the null hypothesis based on the data.
   b. Collection of data and calculation of the corresponding p-value, followed by the decision of whether or not to reject the null hypothesis. The p-value is the smallest significance level (type I error) at which the observed data would lead us to reject the null hypothesis.

Worked-out Example

Problem: A jury pool includes 50% women and 50% men. A jury of 12 people was selected from this pool and included 3 women. A newspaper commented on the biased selection process.
(a) Test the hypothesis that the jury selection was fair.
(b) Repeat the test assuming now that the jury only included 2 women.
Solution: The first part of the solution applies to both (a) and (b). The null hypothesis is that the jury selection was fair, that is, the proportion of women is 0.5. The alternative is that the selection process was biased against women. We thus choose for the alternative that the proportion of women is less than 0.5:

$H_0: p = 0.5$
$H_1: p < 0.5$

The next step is to find a test statistic and to calculate the probability distribution of the test statistic under the null hypothesis. We can choose as the test statistic the number of women on the jury. We denote the test statistic by X. The test statistic is binomially distributed with n, the number of trials, equal to 12, and p, the probability of success, equal to 0.5. The Excel function =BINOMDIST(k, n, p, FALSE) calculates the probability of k successes in n trials with success probability p. The last entry FALSE indicates that it calculates the probability mass function. To calculate the cumulative probability distribution, replace FALSE by TRUE. With n = 12 and p = 0.5, we obtain the following table:

k     P(X = k)
0     0.0002
1     0.0029
2     0.0161
3     0.0537
4     0.1208
5     0.1934
6     0.2256
7     0.1934
8     0.1208
9     0.0537
10    0.0161
11    0.0029
12    0.0002

(a) In this part of the problem, three women were on the jury. To calculate the p-value, we compute the probability of the event "three or fewer women":

$P(X \le 3) = 0.0002 + 0.0029 + 0.0161 + 0.0537 \approx 0.073$
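For readers working outside Excel, the behaviour described above for BINOMDIST can be imitated in a few lines. The Python function below is a sketch written for this example; the name `binomdist` and its argument order simply mirror the Excel call shown above, and it is not an official port of the Excel function.

```python
from math import comb

def binomdist(k, n, p, cumulative):
    """Binomial probability in the style of the Excel call described above.

    cumulative=False returns P(X = k); cumulative=True returns P(X <= k).
    """
    def pmf(j):
        return comb(n, j) * p**j * (1 - p)**(n - j)

    if cumulative:
        return sum(pmf(j) for j in range(0, k + 1))
    return pmf(k)

# Table of P(X = k) for a jury of 12 drawn from a pool with 50% women
for k in range(13):
    print(f"{k:2d}  {binomdist(k, 12, 0.5, False):.4f}")

# Probability of three or fewer women on the jury
print(f"P(X <= 3) = {binomdist(3, 12, 0.5, True):.4f}")  # 299/4096, about 0.0730
```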
Since the p-value is about 0.073, we conclude that there is not enough evidence to reject the null hypothesis. In about 7.3% of jury selections from this pool, we would expect to see three or fewer women. The result is not statistically significant.

(b) The situation is different if only two women had been selected. The probability of two or fewer women on the jury is

$P(X \le 2) = 0.0002 + 0.0029 + 0.0161 \approx 0.019$

Now, the p-value is only about 2%, which is statistically significant. We would now reject the null hypothesis.

Homework

1. A student committee composed of 20 upper division and lower division students needs to be assembled. One third of the student population is upper division students. The committee ends up having an equal number of upper division and lower division students. The lower division students, expecting a higher number of them on the committee, made the accusation that the selection process was biased against them. Test the hypothesis that the selection process was fair against the alternative that the selection process was biased against the lower division students.

2. In a cross between heterozygous plants of genotype Cc, we expect that 50% of the offspring are heterozygous (i.e., genotype Cc) and 50% are homozygous (i.e., either of genotype CC or of genotype cc). Among 14 plants, we find that 3 plants are homozygous and 11 plants are heterozygous. Test the hypothesis that the ratio of homozygous to heterozygous plants is 1:1 against the alternative that the ratio is different from 1:1.

3. Assume that the population distribution is normal with mean μ and standard deviation σ. We take a sample of size n from this population and calculate the sample average

$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$

We know that the distribution of $\bar{X}_n$ is then again normal with mean μ and standard deviation $\sigma / \sqrt{n}$. We can define a new random variable

$Z = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}}$

which is normally distributed with mean 0 and standard deviation 1. Suppose now that the average lifetime of a sample of 50 medical devices is 5 years and 8 months with population standard deviation of 4 months. Assume that the lifetime of this medical device is listed as 6 years. Test the hypothesis μ = 6 years against the alternative μ < 6 years at the 0.05 significance level by first calculating the rejection and acceptance regions for the 0.05 significance level. Calculate the power of this test.

4. For large samples, the sampling distributions can often be approximated by normal distributions even if the population distribution is not normal. Here is a typical example: A group of 100 students takes the ACT math test and has an average score of [value missing]. The standard deviation is σ = 3.2. The average score nationwide was [value missing]. Test whether the average score of this group of 100 students is lower than the national average.

5. Suppose you have a biased coin in your pocket. One side shows up with probability 0.3, the other with probability 0.7. Unfortunately, you don't remember which side is more likely. Here are the two scenarios:

Hypothesis A: P(Heads) = 0.3, P(Tails) = 0.7
Hypothesis B: P(Heads) = 0.7, P(Tails) = 0.3

To determine whether the probability of heads is 0.3 (Hypothesis A) or 0.7 (Hypothesis B), you toss the coin 9 times and record the number of heads. Here is the outcome of the experiment: H,T,T,T,H,H,T,T,H,T
(a) Based on this outcome, what do you think is the more likely scenario, Hypothesis A or Hypothesis B?
(b) The number of heads in the experiment is binomially distributed with 9 trials and probability of heads equal to 0.3 in Hypothesis A and 0.7 in Hypothesis B. Calculate the probabilities for the event "Number of Heads = k" under the two hypotheses for the ten possible values of k. For which values of k would you reject Hypothesis A? Calculate the probability of erroneously rejecting Hypothesis A.
Calculate the probability of not rejecting Hypothesis A when in fact Hypothesis B is true.

6. Focht et al. reported in a research article on the efficacy of duct tape vs cryotherapy in the treatment of Verruca vulgaris (the common wart) (Arch. Pediatr. Adolesc. Med. 2002; 156). Their objective was "[t]o determine if application of duct tape is as effective as cryotherapy in the treatment of common warts." They enrolled 61 patients into their study; 51 patients completed the study. The main outcome measure was complete resolution of the wart being studied. Patients were randomized to receive either cryotherapy (liquid nitrogen applied to the wart every two to three weeks) or application of duct tape for a maximum of two months. Of the 51 patients, 26 were treated with duct tape and 25 with cryotherapy. In the duct tape group, 22 had complete resolution of the wart; in the cryotherapy group, 15 patients had complete resolution of the wart. Here are the data in table form summarizing the outcome of the study:

                 Duct Tape   Cryotherapy   SUM
No Resolution        4           10         14
Resolution          22           15         37
SUM                 26           25         51

(a) What percentage of patients completing the study were treated with duct tape, and what percentage were treated with cryotherapy?
(b) To test whether duct tape is more effective than cryotherapy, we design a statistical test. The null hypothesis states that the two treatments are equally effective. The alternative is that duct tape therapy is more effective than cryotherapy. Under the null hypothesis, we can develop a probability model to calculate the probability that 22 patients in the duct tape group saw complete resolution. This is how: We have a group of 51 patients, which is the population. 37 patients saw successful resolution, which is the group of successes in the population. 26 patients are randomly assigned to the duct tape group, which is the sample size. The number of successes in the sample is 22. This is equivalent to the following urn problem, which we can solve using basic probability theory: An urn has a total of 51 balls, 37 of which are blue; the remainder are green. We take a sample of 26 balls at random from the urn. What is the likelihood that 22 balls are blue?
If we denote the number of successes in the sample by X, we can calculate the probability of this event using the hypergeometric distribution:

$P(X = 22) = \dfrac{\binom{37}{22}\binom{14}{4}}{\binom{51}{26}}$

Excel has a function that will calculate this probability: =HYPGEOMDIST(22,26,37,51). To calculate the p-value, we need to determine the probability of at least 22 complete resolutions in a sample of size 26 when the population size is 51 and the number of successes in the population is 37. Find this probability. What can you conclude? (The statistical test in this problem is called Fisher's exact test.)
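The urn calculation can be written out directly from binomial coefficients. The sketch below is one possible Python version, given as an illustration rather than as the module's intended solution; the function names are hypothetical, and the argument order is chosen to match the Excel call =HYPGEOMDIST(22,26,37,51) quoted above.

```python
from math import comb

def hypergeom_pmf(k, sample_size, successes_in_pop, pop_size):
    """P(X = k): k successes in a sample drawn without replacement."""
    failures_in_pop = pop_size - successes_in_pop
    ways = comb(successes_in_pop, k) * comb(failures_in_pop, sample_size - k)
    return ways / comb(pop_size, sample_size)

def hypergeom_upper_tail(k, sample_size, successes_in_pop, pop_size):
    """P(X >= k), the one-sided p-value used in Fisher's exact test."""
    k_max = min(sample_size, successes_in_pop)  # largest possible number of successes
    return sum(hypergeom_pmf(j, sample_size, successes_in_pop, pop_size)
               for j in range(k, k_max + 1))

# Duct tape study: 51 patients, 37 resolutions, 26 in the duct tape group, 22 observed
print(hypergeom_pmf(22, 26, 37, 51))
print(hypergeom_upper_tail(22, 26, 37, 51))
```

The tail function adds the probabilities of 22, 23, ..., 26 resolutions in the duct tape group, which is the sum the homework asks for.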
7. In 2006, another study on the efficacy of duct tape in treating warts was done to address some of the shortcomings of the first study, in particular the small sample size and the lack of a placebo group. The study by de Haen et al. (2006) on the efficacy of duct tape vs placebo in the treatment of Verruca vulgaris (warts) in primary school children (Arch. Pediatr. Adolesc. Med. 2006; 106) had 103 participants who completed treatment; 51 patients were treated with duct tape and the remaining 52 patients received a placebo treatment. After 6 weeks, the wart had disappeared in 8 of the children in the duct tape group and 3 of the children in the placebo group. Test whether the duct tape treatment is more effective.
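The resampling idea from the coin example carries over to two-group studies such as this one: under the null hypothesis the treatment labels are interchangeable, so we can repeatedly reassign patients to groups at random and record how often the treatment group does at least as well as observed. The Python sketch below illustrates this under stated assumptions; it is not the module's Excel macro, the function names are hypothetical, and the number of runs and the seed are arbitrary choices.

```python
import random

def simulate_group_successes(pop_size, successes, group_size, rng):
    """Randomly pick group_size patients and count the successes among them."""
    population = [1] * successes + [0] * (pop_size - successes)  # 1 = wart resolved
    return sum(rng.sample(population, group_size))

def resampled_p_value(observed, pop_size, successes, group_size,
                      n_runs=10000, seed=0):
    """Fraction of random reassignments with at least `observed` successes."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(
        simulate_group_successes(pop_size, successes, group_size, rng) >= observed
        for _ in range(n_runs)
    )
    return hits / n_runs

# Homework 7 setup: 103 children, 8 + 3 = 11 resolutions, 51 in the duct tape group
print(resampled_p_value(observed=8, pop_size=103, successes=11, group_size=51))
```

The simulated fraction estimates the same one-sided tail probability that the hypergeometric distribution gives exactly, so the two approaches can be compared against each other.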
More informationNovember 08, 2010. 155S8.6_3 Testing a Claim About a Standard Deviation or Variance
Chapter 8 Hypothesis Testing 8 1 Review and Preview 8 2 Basics of Hypothesis Testing 8 3 Testing a Claim about a Proportion 8 4 Testing a Claim About a Mean: σ Known 8 5 Testing a Claim About a Mean: σ
More informationBasic Statistics Self Assessment Test
Basic Statistics Self Assessment Test Professor Douglas H. Jones PAGE 1 A sodadispensing machine fills 12ounce cans of soda using a normal distribution with a mean of 12.1 ounces and a standard deviation
More informationProbability, Binomial Distributions and Hypothesis Testing Vartanian, SW 540
Probability, Binomial Distributions and Hypothesis Testing Vartanian, SW 540 1. Assume you are tossing a coin 11 times. The following distribution gives the likelihoods of getting a particular number of
More informationHomework 5 Solutions
Math 130 Assignment Chapter 18: 6, 10, 38 Chapter 19: 4, 6, 8, 10, 14, 16, 40 Chapter 20: 2, 4, 9 Chapter 18 Homework 5 Solutions 18.6] M&M s. The candy company claims that 10% of the M&M s it produces
More informationMath 3C Homework 3 Solutions
Math 3C Homework 3 s Ilhwan Jo and Akemi Kashiwada ilhwanjo@math.ucla.edu, akashiwada@ucla.edu Assignment: Section 2.3 Problems 2, 7, 8, 9,, 3, 5, 8, 2, 22, 29, 3, 32 2. You draw three cards from a standard