8-1 8-2 8-3 8-4 8-5 8-6



Similar documents
Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing

Mind on Statistics. Chapter 12

Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

Introduction to Hypothesis Testing OPRE 6301

MAT 155. Key Concept. February 03, S4.1 2_3 Review & Preview; Basic Concepts of Probability. Review. Chapter 4 Probability

Sample Size and Power in Clinical Trials

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

"Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1

Hypothesis Testing: Two Means, Paired Data, Two Proportions

Statistics 2014 Scoring Guidelines

WISE Power Tutorial All Exercises

Hypothesis Testing --- One Mean

Introduction to Quantitative Methods

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Review. March 21, S7.1 2_3 Estimating a Population Proportion. Chapter 7 Estimates and Sample Sizes. Test 2 (Chapters 4, 5, & 6) Results

Hypothesis Testing for Beginners

HYPOTHESIS TESTING WITH SPSS:

Fairfield Public Schools

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Descriptive Statistics

II. DISTRIBUTIONS distribution normal distribution. standard scores

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Correlational Research

Independent samples t-test. Dr. Tom Pierce Radford University

1 Hypothesis Testing. H 0 : population parameter = hypothesized value:

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Lecture Notes Module 1

What is a P-value? Ronald A. Thisted, PhD Departments of Statistics and Health Studies The University of Chicago

In the past, the increase in the price of gasoline could be attributed to major national or global

5/31/2013. Chapter 8 Hypothesis Testing. Hypothesis Testing. Hypothesis Testing. Outline. Objectives. Objectives

Testing Hypotheses About Proportions

p ˆ (sample mean and sample

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

1. Ethics in Psychological Research

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Chi Square Tests. Chapter Introduction

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

CHANCE ENCOUNTERS. Making Sense of Hypothesis Tests. Howard Fincher. Learning Development Tutor. Upgrade Study Advice Service

How To Test For Significance On A Data Set

Business Statistics, 9e (Groebner/Shannon/Fry) Chapter 9 Introduction to Hypothesis Testing

CALCULATIONS & STATISTICS

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

22. HYPOTHESIS TESTING

Simple Regression Theory II 2010 Samuel L. Baker

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

HYPOTHESIS TESTING: POWER OF THE TEST

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript

Week 3&4: Z tables and the Sampling Distribution of X

Review #2. Statistics

Basic Concepts in Research and Data Analysis

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

5.1 Identifying the Target Parameter

8 6 X 2 Test for a Variance or Standard Deviation

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section:

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Conditional Probability, Hypothesis Testing, and the Monty Hall Problem

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Consider a study in which. How many subjects? The importance of sample size calculations. An insignificant effect: two possibilities.

Chapter 8: Hypothesis Testing for One Population Mean, Variance, and Proportion

Unit 26 Estimation with Confidence Intervals

Hypothesis Testing. Reminder of Inferential Statistics. Hypothesis Testing: Introduction

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

SIMULATION STUDIES IN STATISTICS WHAT IS A SIMULATION STUDY, AND WHY DO ONE? What is a (Monte Carlo) simulation study, and why do one?

Mind on Statistics. Chapter 4

Name: (b) Find the minimum sample size you should use in order for your estimate to be within 0.03 of p when the confidence level is 95%.

Chapter 2. Hypothesis testing in one population

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Premaster Statistics Tutorial 4 Full solutions

Statistical tests for SPSS

Chapter 1 Introduction to Correlation

Binomial Sampling and the Binomial Distribution

Linear Models in STATA and ANOVA

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

MAT 155. Key Concept. September 22, S5.3_3 Binomial Probability Distributions. Chapter 5 Probability Distributions

Study Guide for the Final Exam

3. Mathematical Induction

Association Between Variables

Inclusion and Exclusion Criteria

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

UNDERSTANDING THE TWO-WAY ANOVA

HOW TO WRITE A LABORATORY REPORT

Projects Involving Statistics (& SPSS)

Using Excel for inferential statistics

ELEMENTARY STATISTICS

A) B) C) D)

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

6: Introduction to Hypothesis Testing

Transcription:

8-1 Review and Preview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-4 Testing a Claim About a Mean: s Known 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing a Claim About a Standard Deviation or Variance Hypothesis Testing 390

CHAPTER PROBLEM Does the MicroSort method of gender selection increase the likelihood that a baby will be a girl? Gender-selection methods are somewhat controversial. Some people believe that use of such methods should be prohibited, regardless of the reason. Others believe that limited use should be allowed for medical reasons, such as to prevent gender-specific hereditary disorders. For example, some couples carry X-linked recessive genes, so that a male child has a 50% chance of inheriting a serious disorder and a female child has no chance of inheriting the disorder. These couples may want to use a gender-selection method to increase the likelihood of having a baby girl so that none of their children inherit the disorder. Methods of gender selection have been around for many years. In the 1980s, ProCare Industries sold a product called Gender Choice. The product cost only $49.95, but the Food and Drug Administration told the company to stop distributing Gender Choice because there was no evidence to support the claim that it was 80% reliable. The Genetics & IVF Institute developed a newer gender-selection method called MicroSort. The Microsort XSORT method is designed to increase the likelihood of a baby girl, and the YSORT method is designed to increase the likelihood of a boy. Here is a statement from the MicroSort Web site: The Genetics & IVF Institute is offering couples the ability to increase the chance of having a child of the desired gender to reduce the probability of X-linked diseases or for family balancing. Stated simply, for a cost exceeding $3000, the Genetics & IVF Institute claims that it can increase the probability of having a baby of the gender that a couple prefers. As of this writing, the MicroSort method is undergoing clinical trials, but these results are available: Among 726 couples who used the XSORT method in trying to have a baby girl, 668 couples did have baby girls, for a success rate of 92.0%. Under normal circumstances with no special treatment, girls occur in 50% of births. (Actually, the current birth rate of girls is 48.79%, but we will use 50% to keep things simple.) These results provide us with an interesting question: Given that 668 out of 726 couples had girls, can we actually support the claim that the XSORT technique is effective in increasing the probability of a girl? Do we now have an effective method of gender selection?

392 Chapter 8 Hypothesis Testing 8-1 Review and Preview In Chapters 2 and 3 we used descriptive statistics when we summarized data using tools such as graphs, and statistics such as the mean and standard deviation. Methods of inferential statistics use sample data to make an inference or conclusion about a population. The two main activities of inferential statistics are using sample data to (1) estimate a population parameter (such as estimating a population parameter with a confidence interval), and (2) test a hypothesis or claim about a population parameter. In Chapter 7 we presented methods for estimating a population parameter with a confidence interval, and in this chapter we present the method of hypothesis testing. In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a procedure for testing a claim about a property of a population. The main objective of this chapter is to develop the ability to conduct hypothesis tests for claims made about a population proportion p, a population mean m, or a population standard deviation s. Here are examples of hypotheses that can be tested by the procedures we develop in this chapter: Genetics The Genetics & IVF Institute claims that its XSORT method allows couples to increase the probability of having a baby girl. Business A newspaper headline makes the claim that most workers get their jobs through networking. Medicine Medical researchers claim that when people with colds are treated with echinacea, the treatment has no effect. Aircraft Safety The Federal Aviation Administration claims that the mean weight of an airline passenger (including carry-on baggage) is greater than 185 lb, which it was 20 years ago. Quality Control When new equipment is used to manufacture aircraft altimeters, the new altimeters are better because the variation in the errors is reduced so that the readings are more consistent. (In many industries, the quality of goods and services can often be improved by reducing variation.) The formal method of hypothesis testing uses several standard terms and conditions in a systematic procedure. Study Hint: Start by clearly understanding Example 1 in Section 8-2, then read Sections 8-2 and 8-3 casually to obtain a general idea of their concepts, then study Section 8-2 more carefully to become familiar with the terminology. CAUTION When conducting hypothesis tests as described in this chapter and the following chapters, instead of jumping directly to procedures and calculations, be sure to consider the context of the data, the source of the data, and the sampling method used to obtain the sample data. (See Section 1-2.)

8-2 Basics of Hypothesis Testing 393 8-2 Basics of Hypothesis Testing Key Concept In this section we present individual components of a hypothesis test. In Part 1 we discuss the basic concepts of hypothesis testing. Because these concepts are used in the following sections and chapters, we should know and understand the following: How to identify the null hypothesis and alternative hypothesis from a given claim, and how to express both in symbolic form How to calculate the value of the test statistic, given a claim and sample data How to identify the critical value(s), given a significance level How to identify the P-value, given a value of the test statistic How to state the conclusion about a claim in simple and nontechnical terms In Part 2 we discuss the power of a hypothesis test. Part 1: Basic Concepts of Hypothesis Testing The methods presented in this chapter are based on the rare event rule (Section 4-1) for inferential statistics, so let s review that rule before proceeding. Rare Event Rule for Inferential Statistics If, under a given assumption, the probability of a particular observed event is extremely small, we conclude that the assumption is probably not correct. Following this rule, we test a claim by analyzing sample data in an attempt to distinguish between results that can easily occur by chance and results that are highly unlikely to occur by chance. We can explain the occurrence of highly unlikely results by saying that either a rare event has indeed occurred or that the underlying assumption is not correct. Let s apply this reasoning in the following example. 1 Gender Selection ProCare Industries, Ltd. provided a product called Gender Choice, which, according to advertising claims, allowed couples to increase your chances of having a girl up to 80%. Suppose we conduct an experiment with 100 couples who want to have baby girls, and they all follow the Gender Choice easy-to-use in-home system described in the pink package designed for girls. Assuming that Gender Choice has no effect and using only common sense and no formal statistical methods, what should we conclude about the assumption of no effect from Gender Choice if 100 couples using Gender Choice have 100 babies consisting of the following? a. 52 girls b. 97 girls Aspirin Not Helpful for Geminis and Libras Physician Richard Peto submitted an article to Lancet, a British medical journal. The article showed that patients had a better chance of surviving a heart attack if they were treated with aspirin within a few hours of their heart attacks. Lancet editors asked Peto to break down his results into subgroups to see if recovery worked better or worse for different groups, such as males or females. Peto believed that he was being asked to use too many subgroups, but the editors insisted. Peto then agreed, but he supported his objections by showing that when his patients were categorized by signs of the zodiac, aspirin was useless for Gemini and Libra heartattack patients, but aspirin is a lifesaver for those born under any other sign. This shows that when conducting multiple hypothesis tests with many different subgroups, there is a very large chance of getting some wrong results. a. We normally expect around 50 girls in 100 births. The result of 52 girls is close to 50, so we should not conclude that the Gender Choice product is effective. The result of 52 girls could easily occur by chance, so there isn t sufficient evidence to say that Gender Choice is effective, even though the sample proportion of girls is greater than 50%. continued

394 Chapter 8 Hypothesis Testing b. The result of 97 girls in 100 births is extremely unlikely to occur by chance. We could explain the occurrence of 97 girls in one of two ways: Either an extremely rare event has occurred by chance, or Gender Choice is effective. The extremely low probability of getting 97 girls suggests that Gender Choice is effective. In Example 1 we should conclude that the treatment is effective only if we get significantly more girls than we would normally expect. Although the outcomes of 52 girls and 97 girls are both greater than 50%, the result of 52 girls is not significant, whereas the result of 97 girls is significant. 2 Gender Selection The Chapter Problem includes the latest results from clinical trials of the XSORT method of gender selection. Instead of using the latest available results, we will use these results from preliminary trials of the XSORT method: Among 14 couples using the XSORT method, 13 couples had girls and one couple had a boy. We will proceed to formalize some of the analysis in testing the claim that the XSORT method increases the likelihood of having a girl, but there are two points that can be confusing: 1. Assume p 0.5: Under normal circumstances, with no treatment, girls occur in 50% of births. So p = 0.5 and a claim that the XSORT method is effective can be expressed as p 7 0.5. 2. Instead of P(exactly 13 girls), use P (13 or more girls): When determining whether 13 girls in 14 births is likely to occur by chance, use P (13 or more girls). (Stop for a minute and review the subsection of Using Probabilities to Determine When Results Are Unusual in Section 5-2.) Under normal circumstances the proportion of girls is p = 0.5, so a claim that the XSORT method is effective can be expressed as p 7 0.5. We support the claim of p 7 0.5 only if a result such as 13 girls is unlikely (with a small probability, such as less than or equal to 0.05). Using a normal distribution as an approximation to the binomial distribution (see Section 6-6), we find P (13 or more girls in 14 births) = 0.0016. Figure 8-1 shows that with a probability of 0.5, the outcome of 13 girls in 14 births is unusual, so we reject random chance as a reasonable explanation. We conclude that the proportion of girls born to couples using the XSORT method is significantly greater than the proportion that we expect with random chance. Here are the key components of this example: Claim: The XSORT method increases the likelihood of a girl. That is, p 7 0.5. Working assumption: The proportion of girls is p = 0.5 (with no effect from the XSORT method). The preliminary sample resulted in 13 girls among 14 births, so the sample proportion is pn = 13>14 = 0.929. Assuming that p = 0.5, we use a normal distribution as an approximation to the binomial distribution to find that P (at least 13 girls in 14 births) = 0.0016. (Using Table A-1 or calculations with the binomial probability distribution results in a probability of 0.001.) There are two possible explanations for the result of 13 girls in 14 births: Either a random chance event (with the very low probability of 0.0016) has occurred,

8-2 Basics of Hypothesis Testing 395 or the proportion of girls born to couples using the XSORT method is greater than 0.5. Because the probability of getting at least 13 girls by chance is so small (0.0016), we reject random chance as a reasonable explanation. The more reasonable explanation for 13 girls is that the XSORT method is effective in increasing the likelihood of girls. There is sufficient evidence to support a claim that the XSORT method is effective in producing more girls than expected by chance. 0.25 0.20 Figure 8-1 Probability Histogram for Numbers of Girls in 14 Births The probability of 13 or more girls is very small. Probability 0.15 0.10 The probability of 13 or more girls is very small. 0.05 0.00 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Number of Girls in 14 births We now proceed to describe the components of a formal hypothesis test, or test of significance. Many professional journals will include results from hypothesis tests, and they will use the same components described here. Working with the Stated Claim: Null and Alternative Hypotheses The null hypothesis (denoted by H 0 ) is a statement that the value of a population parameter (such as proportion, mean, or standard deviation) is equal to some claimed value. (The term null is used to indicate no change or no effect or no difference.) Here is a typical null hypothesis included in this chapter: H 0 : p = 0.5. We test the null hypothesis directly in the sense that we assume (or pretend) it is true and reach a conclusion to either reject it or fail to reject it. The alternative hypothesis (denoted by H 1 or H a or ) is the statement that the parameter has a value that somehow differs from the null hypothesis. For the methods of this chapter, the symbolic form of the alternative hypothesis must use one of these symbols: 6, 7, Z. Here are different examples of alternative hypotheses involving proportions: H 1 : p 7 0.5 H 1 : p 6 0.5 H 1 : p Z 0.5 Note About Always Using the Equal Symbol in H 0 : It is now rare, but the symbols and Ú are occasionally used in the null hypothesis. Professional statisticians H A H 0

396 Chapter 8 Hypothesis Testing and professional journals use only the equal symbol for equality. We conduct the hypothesis test by assuming that the proportion, mean, or standard deviation is equal to some specified value so that we can work with a single distribution having a specific value. Note About Forming Your Own Claims (Hypotheses): If you are conducting a study and want to use a hypothesis test to support your claim, the claim must be worded so that it becomes the alternative hypothesis (and can be expressed using only the symbols 6, 7, or Z). You can never support a claim that some parameter is equal to some specified value. For example, after completing the clinical trials of the XSORT method of gender selection, the Genetics & IVF Institute will want to demonstrate that the method is effective in increasing the likelihood of a girl, so the claim will be stated as p 7 0.5. In this context of trying to support the goal of the research, the alternative hypothesis is sometimes referred to as the research hypothesis. It will be assumed for the purpose of the hypothesis test that p = 0.5, but the Genetics & IVF Institute will hope that p = 0.5 gets rejected so that p 7 0.5 is supported. Supporting the alternative hypothesis of p 7 0.5 will support the claim that the XSORT method is effective. Note About Identifying H 0 and H 1 : Figure 8-2 summarizes the procedures for identifying the null and alternative hypotheses. Next to Figure 8-2 is an example using the claim that with the XSORT method, the likelihood of having a girl is greater than 0.5. Note that the original statement could become the null hypothesis, it could become the alternative hypothesis, or it might not be either the null hypothesis or the alternative hypothesis. Figure 8-2 Identifying H 0 and H 1 Start Example: The claim is that with the XSORT method, the likelihood of having a girl is greater than 0.5. This claim in symbolic form is p 7 0.5. If p 7 0.5 is false, the symbolic form that must be true is p 0.5. Identify the specific claim or hypothesis to be tested, and express it in symbolic form. Give the symbolic form that must be true when the original claim is false. H 1 : p 7 0.5 H 0 : p = 0.5 Using the two symbolic expressions obtained so far, identify the null hypothesis H 0 and the alternative hypothesis H 1 :? H 1 is the symbolic expression that does not contain equality.? H 0 is the symbolic expression that the parameter equals the fixed value being considered. (The original claim may or may not be one of the above two symbolic expressions.)

8-2 Basics of Hypothesis Testing 397 3 Identifying the Null and Alternative Hypotheses Consider the claim that the mean weight of airline passengers (including carryon baggage) is at most 195 lb (the current value used by the Federal Aviation Administration). Follow the three-step procedure outlined in Figure 8-2 to identify the null hypothesis and the alternative hypothesis. Refer to Figure 8-2, which shows the three-step procedure. Step 1: Express the given claim in symbolic form. The claim that the mean is at most 195 lb is expressed in symbolic form as m 195 lb. Step 2: If m 195 lb is false, then m 7 195 lb must be true. Step 3: Of the two symbolic expressions m 195 lb and m 7 195 lb, we see that m 7 195 lb does not contain equality, so we let the alternative hypothesis H 1 be m 7 195 lb. Also, the null hypothesis must be a statement that the mean equals 195 lb, so we let H 0 be m = 195 lb. Note that in this example, the original claim that the mean is at most 195 lb is neither the alternative hypothesis nor the null hypothesis. (However, we would be able to address the original claim upon completion of a hypothesis test.) Converting Sample Data to a Test Statistic The calculations required for a hypothesis test typically involve converting a sample statistic to a test statistic. The test statistic is a value used in making a decision about the null hypothesis. It is found by converting the sample statistic (such as the sample proportion pn, the sample mean x, or the sample standard deviation s) to a score (such as z, t, or x 2 ) with the assumption that the null hypothesis is true. In this chapter we use the following test statistics: Test statistic for proportion z = pn - p Test statistic for mean Test statistic for standard deviation A z = x - m s x 2 = 2n (n - 1)s2 The test statistic for a mean uses the normal or Student t distribution, depending on the conditions that are satisfied. For hypothesis tests of a claim about a population mean, this chapter will use the same criteria for using the normal or Student t distributions as described in Section 7-4. (See Figure 7-6 and Table 7-1.) pq n s 2 or t = x - m s 2n 4 Finding the Value of the Test Statistic Let s again consider the claim that the XSORT method of gender selection increases the likelihood of having a baby girl. Preliminary results from a test of the XSORT method of gender selection involved 14 couples who gave birth to 13 girls and 1 boy. Use the continued

398 Chapter 8 Hypothesis Testing Human Lie Detectors Researchers tested 13,000 people for their ability to determine when someone is lying. They found 31 people with exceptional skills at identifying lies. These human lie detectors had accuracy rates around 90%. They also found that federal officers and sheriffs were quite good at detecting lies, with accuracy rates around 80%. Psychology Professor Maureen O Sullivan questioned those who were adept at identifying lies, and she said that all of them pay attention to nonverbal cues and the nuances of word usages and apply them differently to different people. They could tell you eight things about someone after watching a two-second tape. It s scary, the things these people notice. Methods of statistics can be used to distinguish between people unable to detect lying and those with that ability. given claim and the preliminary results to calculate the value of the test statistic. Use the format of the test statistic given above, so that a normal distribution is used to approximate a binomial distribution. (There are other exact methods that do not use the normal approximation.) From Figure 8-2 and the example displayed next to it, the claim that the XSORT method of gender selection increases the likelihood of having a baby girl results in the following null and alternative hypotheses: H 0 : p = 0.5 and H 1 : p 7 0.5. We work under the assumption that the null hypothesis is true with p = 0.5. The sample proportion of 13 girls in 14 births results in pn = 13>14 = 0.929. Using p = 0.5, pn = 0.929, and n = 14, we find the value of the test statistic as follows: z = pn - p 0.929-0.5 = = 3.21 pq (0.5) (0.5) A n A 14 We know from previous chapters that a z score of 3.21 is unusual (because it is greater than 2). It appears that in addition to being greater than 0.5, the sample proportion of 13> 14 or 0.929 is significantly greater than 0.5. Figure 8-3 shows that the sample proportion of 0.929 does fall within the range of values considered to be significant because they are so far above 0.5 that they are not likely to occur by chance (assuming that the population proportion is p = 0.5). Figure 8-3 shows the test statistic of z = 3.21, and other components in Figure 8-3 are described as follows. Unusually high sample proportions Critical region: Area of 0.05 used as criterion for identifying unusually high sample proportions p 0.5 or z 0 Proportion of girls in 14 births z 1.645 Sample proportion of: p 0.929 Critical or value Test Statistic z 3.21 Figure 8-3 Critical Region, Critical Value, Test Statistic Tools for Assessing the Test Statistic: Critical Region, Significance Level, Critical Value, and P-Value The test statistic alone usually does not give us enough information to make a decision about the claim being tested. The following tools can be used to understand and interpret the test statistic.

8-2 Basics of Hypothesis Testing 399 The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis. For example, see the red-shaded critical region shown in Figure 8-3. The significance level (denoted by a) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. If the test statistic falls in the critical region, we reject the null hypothesis, so a is the probability of making the mistake of rejecting the null hypothesis when it is true. This is the same a introduced in Section 7-2, where we defined the confidence level for a confidence interval to be the probability 1 - a. Common choices for a are 0.05, 0.01, and 0.10, with 0.05 being most common. A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis. The critical values depend on the nature of the null hypothesis, the sampling distribution that applies, and the significance level of a. See Figure 8-3 where the critical value of z = 1.645 corresponds to a significance level of a = 0.05. (Critical values were formally defined in Section 7-2.) 5 Finding a Critical Value for Critical Region in the Right Tail Using a significance level of a = 0.05, find the critical z value for the alternative hypothesis H 1 : p 7 0.5 (assuming that the normal distribution can be used to approximate the binomial distribution). This alternative hypothesis is used to test the claim that the XSORT method of gender selection is effective, so that baby girls are more likely, with a proportion greater than 0.5. Refer to Figure 8-3. With H 1 : p 7 0.5, the critical region is in the right tail as shown. With a right-tailed area of 0.05, the critical value is found to be z = 1.645 (by using the methods of Section 6-2). If the right-tailed critical region is 0.05, the cumulative area to the left of the critical value is 0.95, and Table A-2 or technology show that the z score corresponding to a cumulative left area of 0.95 is z = 1.645. The critical value is z = 1.645 as shown in Figure 8-3. Two-Tailed Test: 0.025 0.025 6 Finding Critical Values for a Critical Region in Two Tails Using a significance level of a = 0.05, find the two critical z values for the alternative hypothesis H 1 : p Z 0.5 (assuming that the normal distribution can be used to approximate the binomial distribution). Refer to Figure 8-4(a). With H 1 : p Z 0.5, the critical region is in the two tails as shown. If the significance level is 0.05, each of the two tails has an area of 0.025 as shown in Figure 8-4(a). The left critical value of z = -1.96 corresponds to a cumulative left area of 0.025. (Table A-2 or technology result in z = -1.96 by using the methods of Section 6-2). The rightmost critical value of z = 1.96 is found from the cumulative left area of 0.975. (The rightmost critical value is z 0.975 = 1.96.) The two critical values are z = -1.96 and z = 1.96 as shown in Figure 8-4(a). z 1.96 z 0 z 1.96 (a) Left-Tailed Test: 0.05 z 1.645 z 0 (b) Right-Tailed Test: 0.05 z 0 z 1.645 (c) Figure 8-4 Finding Critical Values

400 Chapter 8 Hypothesis Testing The P-value (or p-value or probability value) is the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true. P-values can be found after finding the area beyond the test statistic. The procedure for finding P-values is given in Figure 8-5. The procedure can be summarized as follows: Critical region in the left tail: P-value = area to the left of the test statistic Critical region in the right tail: P-value = area to the right of the test statistic Critical region in two tails: P-value = twice the area in the tail beyond the test statistic The null hypothesis is rejected if the P-value is very small, such as 0.05 or less. Here is a memory tool useful for interpreting the P-value: If the P is low, the null must go. If the P is high, the null will fly. Start What type of test? Two-tailed Left-tailed Left Is the test statistic to the right or left of center? Right Right-tailed P-value area to the left of the test statistic P-value twice the area to the left of the test statistic P-value twice the area to the right of the test statistic P-value area to the right of the test statistic P-value P-value is twice this area. P-value is twice this area. P-value Test statistic Test statistic Test statistic Test statistic Figure 8-5 Procedure for Finding P-Values

8-2 Basics of Hypothesis Testing 401 CAUTION Don t confuse a P-value with a proportion p. Know this distinction: P-value = probability of getting a test statistic at least as extreme as the one representing sample data p = population proportion 7 Finding a P-Value for a Critical Region in the Right Tail Consider the claim that the XSORT method of gender selection increases the likelihood of having a baby girl, so that p 7 0.5. Use the test statistic z = 3.21 (found from 13 girls in 14 births, as in Example 4). First determine whether the given conditions result in a critical region in the right tail, left tail, or two tails, then use Figure 8-5 to find the P-value. Interpret the P-value. With a claim of p 7 0.5, the critical region is in the right tail, as shown in Figure 8-3. Using Figure 8-5 to find the P-value for a right-tailed test, we see that the P-value is the area to the right of the test statistic z = 3.21. Table A-2 (or technology) shows that the area to the right of z = 3.21 is 0.0007, so the P-value is 0.0007. The P-value of 0.0007 is very small, and it shows that there is a very small chance of getting the sample results that led to a test statistic of z = 3.21. It is very unlikely that we would get 13 (or more) girls in 14 births by chance. This suggests that the XSORT method of gender selection increases the likelihood that a baby will be a girl. Lie Detectors and the Law Why not require all criminal suspects to take lie detector tests and dispense with trials by jury? The Council of Scientific Affairs of the American Medical Association states, It is established that classification of guilty can be made with 75% to 97% accuracy, but the rate of false positives is often sufficiently high to preclude use of this (polygraph) test as the sole arbiter of guilt or innocence. A false positive is an indication of guilt when the subject is actually innocent. Even with accuracy as high as 97%, the percentage of false positive results can be 50%, so half of the innocent subjects incorrectly appear to be guilty. 8 Finding a P-Value for a Critical Region in Two Tails Consider the claim that with the XSORT method of gender selection, the likelihood of having a baby girl is different from p = 0.5, and use the test statistic z = 3.21 found from 13 girls in 14 births. First determine whether the given conditions result in a critical region in the right tail, left tail, or two tails, then use Figure 8-5 to find the P-value. Interpret the P-value. The claim that the likelihood of having a baby girl is different from p = 0.5 can be expressed as p Z 0.5, so the critical region is in two tails (as in Figure 8-4(a)). Using Figure 8-5 to find the P-value for a two-tailed test, we see that the P-value is twice the area to the right of the test statistic z = 3.21. We refer to Table A-2 (or use technology) to find that the area to the right of z = 3.21 is 0.0007. In this case, the P-value is twice the area to the right of the test statistic, so we have: P-value = 2 * 0.0007 = 0.0014 The P-value is 0.0014 (or 0.0013 if greater precision is used for the calculations). The small P-value of 0.0014 shows that there is a very small chance of getting the sample results that led to a test statistic of z = 3.21. This suggests that with the XSORT method of gender selection, the likelihood of having a baby girl is different from 0.5.

402 Chapter 8 Hypothesis Testing Sign used in H 1 : Two -tailed test Figure 8-6 Two-Tailed, Left-Tailed, Right-Tailed Tests Types of Hypothesis Tests: Two-Tailed, Left-Tailed, Right-Tailed The tails in a distribution are the extreme critical regions bounded by critical values. Determinations of P-values and critical values are affected by whether a critical region is in two tails, the left tail, or the right tail. It therefore becomes important to correctly characterize a hypothesis test as two-tailed, left-tailed, or right-tailed. Two-tailed test: The critical region is in the two extreme regions (tails) under the curve (as in Figure 8-4(a)). Left-tailed test: The critical region is in the extreme left region (tail) under the curve (as in Figure 8-4(b)). Right-tailed test: The critical region is in the extreme right region (tail) under the curve (as in Figure 8-4(c)). Hint: By examining the alternative hypothesis, we can determine whether a test is two-tailed, left-tailed, or right-tailed. The tail will correspond to the critical region containing the values that would conflict significantly with the null hypothesis. A useful check is summarized in Figure 8-6. Note that the inequality sign in H 1 points in the direction of the critical region. The symbol Z is often expressed in programming languages as 6 7, and this reminds us that an alternative hypothesis such as p Z 0.5 corresponds to a two-tailed test. Decisions and Conclusions The standard procedure of hypothesis testing requires that we directly test the null hypothesis, so our initial conclusion will always be one of the following: 1. Reject the null hypothesis. 2. Fail to reject the null hypothesis. Decision Criterion The decision to reject or fail to reject the null hypothesis is usually made using either the P-value method of testing hypotheses or the traditional method (or classical method). Sometimes, however, the decision is based on confidence intervals. In recent years, use of the P-value method has been increasing along with the inclusion of P-values in results from software packages. P-value method: Using the significance level a: If P-value a, reject H 0. If P-value 7 a, fail to reject H 0. Traditional method: If the test statistic falls within the critical region, reject H 0. If the test statistic does not fall within the critical region, fail to reject H 0. Another option: Instead of using a significance level such as a = 0.05, simply identify the P-value and leave the decision to the reader. Confidence intervals: A confidence interval estimate of a population parameter contains the likely values of that parameter. If a confidence interval does not include a claimed value of a population parameter, reject that claim.

8-2 Basics of Hypothesis Testing 403 Wording the Final Conclusion Figure 8-7 summarizes a procedure for wording the final conclusion in simple, nontechnical terms. Note that only one case leads to wording indicating that the sample data actually support the conclusion. If you want to support some claim, state it in such a way that it becomes the alternative hypothesis, and then hope that the null hypothesis gets rejected. CAUTION Never conclude a hypothesis test with a statement of reject the null hypothesis or fail to reject the null hypothesis. Always make sense of the conclusion with a statement that uses simple nontechnical wording that addresses the original claim. Accept/Fail to Reject A few textbooks continue to say accept the null hypothesis instead of fail to reject the null hypothesis. The term accept is somewhat misleading, because it seems to imply incorrectly that the null hypothesis has been proved, but we can never prove a null hypothesis. The phrase fail to reject says more correctly that the available evidence isn t strong enough to warrant rejection of the null hypothesis. In this text we use the terminology fail to reject the null hypothesis, instead of accept the null hypothesis. Start Wording of final conclusion Does the original claim contain the condition of equality? No (Original claim does not contain equality and becomes H 1 ) Yes (Original claim contains equality) Do you reject H 0? Do you reject H 0? No (Fail to reject H 0 ) No (Fail to reject H 0 ) Yes (Reject H 0 ) Yes (Reject H 0 ) There is sufficient evidence to warrant rejection of the claim that... (original claim). There is not sufficient evidence to warrant rejection of the claim that... (original claim). The sample data support the claim that... (original claim). There is not sufficient sample evidence to support the claim that... (original claim). (This is the only case in which the original claim is rejected.) (This is the only case in which the original claim is supported.) Figure 8-7 Wording of Final Conclusion

404 Chapter 8 Hypothesis Testing Large Sample Size Isn t Good Enough Biased sample data should not be used for inferences, no matter how large the sample is. For example, in Women and Love: A Cultural Revolution in Progress, Shere Hite bases her conclusions on 4500 replies that she received after mailing 100,000 questionnaires to various women s groups. A random sample of 4500 subjects would usually provide good results, but Hite s sample is biased. It is criticized for over-representing women who join groups and women who feel strongly about the issues addressed. Because Hite s sample is biased, her inferences are not valid, even though the sample size of 4500 might seem to be sufficiently large. Multiple Negatives When stating the final conclusion in nontechnical terms, it is possible to get correct statements with up to three negative terms. (Example: There is not sufficient evidence to warrant rejection of the claim of no difference between 0.5 and the population proportion. ) Such conclusions are confusing, so it is good to restate them in a way that makes them understandable, but care must be taken to not change the meaning. For example, instead of saying that there is not sufficient evidence to warrant rejection of the claim of no difference between 0.5 and the population proportion, better statements would be these: Fail to reject the claim that the population proportion is equal to 0.5. Unless stronger evidence is obtained, continue to assume that the population proportion is equal to 0.5. 9 Stating the Final Conclusion Suppose a geneticist claims that the XSORT method of gender selection increases the likelihood of a baby girl. This claim of p 7 0.5 becomes the alternative hypothesis, while the null hypothesis becomes p = 0.5. Further suppose that the sample evidence causes us to reject the null hypothesis of p = 0.5. State the conclusion in simple, nontechnical terms. Refer to Figure 8-7. Because the original claim does not contain equality, it becomes the alternative hypothesis. Because we reject the null hypothesis, the wording of the final conclusion should be as follows: There is sufficient evidence to support the claim that the XSORT method of gender selection increases the likelihood of a baby girl. Errors in Hypothesis Tests When testing a null hypothesis, we arrive at a conclusion of rejecting it or failing to reject it. Such conclusions are sometimes correct and sometimes wrong (even if we do everything correctly). Table 8-1 summarizes the two different types of errors that can be made, along with the two different types of correct decisions. We distinguish between the two types of errors by calling them type I and type II errors. Type I error: The mistake of rejecting the null hypothesis when it is actually true. The symbol a (alpha) is used to represent the probability of a type I error. Type II error: The mistake of failing to reject the null hypothesis when it is actually false. The symbol b (beta) is used to represent the probability of a type II error. Because it can be difficult to remember which error is type I and which is type II, we recommend a mnemonic device, such as routine for fun. Using only the consonants from those words (RouTiNe FoRFuN), we can easily remember that a type I error is RTN: Reject True Null (hypothesis), whereas a type II error is FRFN: Fail to Reject a False Null (hypothesis). Notation a (alpha) = probability of a type I error (the probability of rejecting the null hypothesis when it is true) b (beta) = probability of a type II error (the probability of failing to reject a null hypothesis when it is false)

8-2 Basics of Hypothesis Testing 405 Table 8-1 Type I and Type II Errors The null hypothesis is true True State of Nature The null hypothesis is false Decision We decide to reject the null hypothesis We fail to reject the null hypothesis Type I error (rejecting a true null hypothesis) P(type I error) = a Correct decision Correct decision Type II error (failing to reject a false null hypothesis) P(type II error) = b 10 Identifying Type I and Type II Errors Assume that we are conducting a hypothesis test of the claim that a method of gender selection increases the likelihood of a baby girl, so that the probability of a baby girl is p 7 0.5. Here are the null and alternative hypotheses: Give statements identifying the following. a. Type I error b. Type II error H 0 : p = 0.5 H 1 : p 7 0.5 a. A type I error is the mistake of rejecting a true null hypothesis, so this is a type I error: Conclude that there is sufficient evidence to support p 7 0.5, when in reality p = 0.5. That is, a type I error is made when we conclude that the gender selection method is effective when in reality it has no effect. b. A type II error is the mistake of failing to reject the null hypothesis when it is false, so this is a type II error: Fail to reject p = 0.5 (and therefore fail to support p 7 0.5) when in reality p 7 0.5. That is, a type II error is made if we conclude that the gender selection method has no effect, when it really is effective in increasing the likelihood of a baby girl. Controlling Type I and Type II Errors: One step in our standard procedure for testing hypotheses involves the selection of the significance level a (such as 0.05), which is the probability of a type I error. The values of a, b, and the sample size n are all related, so when you choose or determine any two of them, the third is automatically determined. One common practice is to select the significance level a, then select a sample size that is practical, so the value of b is determined. Generally try to use the largest a that you can tolerate, but for type I errors with more serious consequences, select smaller values of a. Then choose a sample size n as large as is reasonable, based on considerations of time, cost, and other relevant factors. Another common practice is to select a and b, so the required sample size n is automatically determined. (See Example 12 in Part 2 of this section.) Comprehensive Hypothesis Test In this section we describe the individual components used in a hypothesis test, but the following sections will combine those components in comprehensive procedures. We can test claims about population parameters by using the P-value method summarized in Figure 8-8, the traditional method summarized in Figure 8-9, or we can use a confidence interval, as described on page 407.

406 Chapter 8 Hypothesis Testing Figure 8-8 P-Value Method Figure 8-9 Traditional Method P-Value Method Traditional Method Start Start 1 Identify the specific claim or hypothesis to be tested, and put it in symbolic form. 1 Identify the specific claim or hypothesis to be tested, and put it in symbolic form. 2 Give the symbolic form that must be true when the original claim is false. 2 Give the symbolic form that must be true when the original claim is false. 3 Of the two symbolic expressions obtained so far, let the alternative hypothesis H 1 be the one not containing equality, so that H 1 uses the symbol or or. Let the null hypothesis H 0 be the symbolic expression that the parameter equals the fixed value being considered. 3 Of the two symbolic expressions obtained so far, let the alternative hypothesis H 1 be the one not containing equality, so that H 1 uses the symbol or or. Let the null hypothesis H 0 be the symbolic expression that the parameter equals the fixed value being considered. 4 Select the significance level based on the seriousness of a type 1 error. Make small if the consequences of rejecting a true H 0 are severe. The values of 0.05 and 0.01 are very common. 4 Select the significance level based on the seriousness of a type 1 error. Make small if the consequences of rejecting a true H 0 are severe. The values of 0.05 and 0.01 are very common. 5 Identify the statistic that is relevant to this test and determine its sampling distribution (such as normal, t, chi-square). 5 Identify the statistic that is relevant to this test and determine its sampling distribution (such as normal, t, chi-square). 6 Find the test statistic and find the P-value (see Figure 8-5). Draw a graph and show the test statistic and P-value. 6 Find the test statistic, the critical values, and the critical region. Draw a graph and include the test statistic, critical value(s), and critical region. 7 Reject H 0 if the P-value is less than or equal to the significance level. Fail to reject H 0 if the P-value is greater than. 7 Reject H 0 if the test statistic is in the critical region. Fail to reject H 0 if the test statistic is not in the critical region. 8 Restate this previous decision in simple, nontechnical terms, and address the original claim. 8 Restate this previous decision in simple, nontechnical terms, and address the original claim. Stop Stop Confidence Interval Method Construct a confidence interval with a confidence level selected as in Table 8-2. Because a confidence interval estimate of a population parameter contains the likely values of that parameter, reject a claim that the population parameter has a value that is not included in the confidence interval. Table 8-2 Significance 0.01 Level for 0.05 Hypothesis 0.10 Test Confidence Level for Confidence Interval Two-Tailed Test 99% 95% 90% One-Tailed Test 98% 90% 80%

8-2 Basics of Hypothesis Testing 407 Confidence Interval Method For two-tailed hypothesis tests construct a confidence interval with a confidence level of 1 - a; but for a one-tailed hypothesis test with significance level a, construct a confidence interval with a confidence level of 1-2a. (See Table 8-2 for common cases.) After constructing the confidence interval, use this criterion: A confidence interval estimate of a population parameter contains the likely values of that parameter. We should therefore reject a claim that the population parameter has a value that is not included in the confidence interval. CAUTION In some cases, a conclusion based on a confidence interval may be different from a conclusion based on a hypothesis test. See the comments in the individual sections that follow. The exercises for this section involve isolated components of hypothesis tests, but the following sections will involve complete and comprehensive hypothesis tests. Part 2: Beyond the Basics of Hypothesis Testing: The Power of a Test We use b to denote the probability of failing to reject a false null hypothesis, so P (type II error) = b. It follows that 1 - b is the probability of rejecting a false null hypothesis, and statisticians refer to this probability as the power of a test, and they often use it to gauge the effectiveness of a hypothesis test in allowing us to recognize that a null hypothesis is false. The power of a hypothesis test is the probability (1 - b) of rejecting a false null hypothesis. The value of the power is computed by using a particular significance level a and a particular value of the population parameter that is an alternative to the value assumed true in the null hypothesis. Note that in the above definition, determination of power requires a particular value that is an alternative to the value assumed in the null hypothesis. Consequently, a hypothesis test can have many different values of power, depending on the particular values of the population parameter chosen as alternatives to the null hypothesis. 11 Power of a Hypothesis Test Let s again consider these preliminary results from the XSORT method of gender selection: There were 13 girls among the 14 babies born to couples using the XSORT method. If we want to test the claim that girls are more likely (p 7 0.5) with the XSORT method, we have the following null and alternative hypotheses: H 0 : p = 0.5 H 1 : p 7 0.5 Let s use a = 0.05. In addition to all of the given test components, we need a particular value of p that is an alternative to the value assumed in the null hypothesis H 0 : p = 0.5. Using the given test components along with different alternative values of p, we get the following examples of power values. These values of power were found by using Minitab, and exact calculations are used instead of a normal approximation to the binomial distribution. continued

408 Chapter 8 Hypothesis Testing Specific Alternative Value of p b Power of Test (1 - b) 0.6 0.820 0.180 0.7 0.564 0.436 0.8 0.227 0.773 0.9 0.012 0.988 Based on the above list of power values, we see that this hypothesis test has power of 0.180 (or 18.0%) of rejecting H 0 : p = 0.5 when the population proportion p is actually 0.6. That is, if the true population proportion is actually equal to 0.6, there is an 18.0% chance of making the correct conclusion of rejecting the false null hypothesis that p = 0.5. That low power of 18.0% is not good. There is a 0.564 probability of rejecting p = 0.5 when the true value of p is actually 0.7. It makes sense that this test is more effective in rejecting the claim of p = 0.5 when the population proportion is actually 0.7 than when the population proportion is actually 0.6. (When identifying animals assumed to be horses, there s a better chance of rejecting an elephant as a horse (because of the greater difference) than rejecting a mule as a horse.) In general, increasing the difference between the assumed parameter value and the actual parameter value results in an increase in power, as shown in the above table. Because the calculations of power are quite complicated, the use of technology is strongly recommended. (In this section, only Exercises 46 48 involve power.) Power and the Design of Experiments Just as 0.05 is a common choice for a significance level, a power of at least 0.80 is a common requirement for determining that a hypothesis test is effective. (Some statisticians argue that the power should be higher, such as 0.85 or 0.90.) When designing an experiment, we might consider how much of a difference between the claimed value of a parameter and its true value is an important amount of difference. If testing the effectiveness of the XSORT genderselection method, a change in the proportion of girls from 0.5 to 0.501 is not very important. A change in the proportion of girls from 0.5 to 0.6 might be important. Such magnitudes of differences affect power. When designing an experiment, a goal of having a power value of at least 0.80 can often be used to determine the minimum required sample size, as in the following example. 12 Finding Sample Size Required to Achieve 80% Power Here is a statement similar to one in an article from the Journal of the American Medical Association: The trial design assumed that with a 0.05 significance level, 153 randomly selected subjects would be needed to achieve 80% power to detect a reduction in the coronary heart disease rate from 0.5 to 0.4. Before conducting the experiment, the researchers selected a significance level of 0.05 and a power of at least 0.80. They also decided that a reduction in the proportion of coronary heart disease from 0.5 to 0.4 is an important difference that they wanted to detect (by correctly rejecting the false null hypothesis). Using a significance level of 0.05, power of 0.80, and the alternative proportion of 0.4, technology such as Minitab is used to find that the required minimum sample size is 153. The researchers can then proceed by obtaining a sample of at least 153 randomly selected subjects. Due to factors such as dropout rates, the researchers are likely to need somewhat more than 153 subjects. (See Exercise 48.)

8-2 Basics of Hypothesis Testing 409 8-2 Basic Skills and Concepts Statistical Literacy and Critical Thinking 1. Hypothesis Test In reporting on an Elle> MSNBC.COM survey of 61,647 people, Elle magazine stated that just 20% of bosses are good communicators. Without performing formal calculations, do the sample results appear to support the claim that less than 50% of people believe that bosses are good communicators? What can you conclude after learning that the survey results were obtained over the Internet from people who chose to respond? 2. Interpreting P-Value When the clinical trial of the XSORT method of gender selection is completed, a formal hypothesis test will be conducted with the alternative hypothesis of p 7 0.5, which corresponds to the claim that the XSORT method increases the likelihood of having a girl, so that the proportion of girls is greater than 0.5. If you are responsible for developing the XSORT method and you want to show its effectiveness, which of the following P-values would you prefer: 0.999, 0.5, 0.95, 0.05, 0.01, 0.001? Why? 3. Proving that the Mean Equals 325 mg Bottles of Bayer aspirin are labeled with a statement that the tablets each contain 325 mg of aspirin. A quality control manager claims that a large sample of data can be used to support the claim that the mean amount of aspirin in the tablets is equal to 325 mg, as the label indicates. Can a hypothesis test be used to support that claim? Why or why not? 4. Supporting a Claim In preliminary results from couples using the Gender Choice method of gender selection to increase the likelihood of having a baby girl, 20 couples used the Gender Choice method with the result that 8 of them had baby girls and 12 had baby boys. Given that the sample proportion of girls is 8> 20 or 0.4, can the sample data support the claim that the proportion of girls is greater than 0.5? Can any sample proportion less than 0.5 be used to support a claim that the population proportion is greater than 0.5? Stating Conclusions About Claims. In Exercises 5 8, make a decision about the given claim. Use only the rare event rule stated in Section 8-2, and make subjective estimates to determine whether events are likely. For example, if the claim is that a coin favors heads and sample results consist of 11 heads in 20 flips, conclude that there is not sufficient evidence to support the claim that the coin favors heads (because it is easy to get 11 heads in 20 flips by chance with a fair coin). 5. Claim: A coin favors heads when tossed, and there are 90 heads in 100 tosses. 6. Claim: The proportion of households with telephones is greater than the proportion of 0.35 found in the year 1920. A recent simple random sample of 2480 households results in a proportion of 0.955 households with telephones (based on data from the U.S. Census Bureau). 7. Claim: The mean pulse rate (in beats per minute) of students of the author is less than 75. A simple random sample of students has a mean pulse rate of 74.4. 8. Claim: Movie patrons have IQ scores with a standard deviation that is less than the standard deviation of 15 for the general population. A simple random sample of 40 movie patrons results in IQ scores with a standard deviation of 14.8. Identifying H 0 and H 1 In Exercises 9 16, examine the given statement, then express the null hypothesis H 0 and alternative hypothesis H 1 in symbolic form. Be sure to use the correct symbol ( μ, p, S) for the indicated parameter. 9. The mean annual income of employees who took a statistics course is greater than $60,000. 10. The proportion of people aged 18 to 25 who currently use illicit drugs is equal to 0.20 (or 20%). 11. The standard deviation of human body temperatures is equal to 0.62 F. 12. The majority of college students have credit cards. 13. The standard deviation of duration times (in seconds) of the Old Faithful geyser is less than 40 sec.