Chapter 08. Introduction


Hypothesis testing may best be summarized as a decision-making process in which one attempts to arrive at a particular conclusion based upon "statistical" evidence. A typical hypothesis test contains two contradicting statements about the value of a population parameter of interest. These statements are called the null hypothesis, denoted by H₀, and the alternative hypothesis (also known as the research hypothesis), denoted by H₁. Since these two hypotheses are contradicting, at least one of them must be false. Hypothesis testing is the statistical process used to decide which statement (or hypothesis) appears to be true and which appears to be false. The evidence we use to determine which hypothesis is correct arrives in the form of randomly sampled data from the population (or populations) of interest.

The first step of any hypothesis test is to establish the null and alternative hypotheses, which in turn will help us to determine exactly what we are testing. In practice, the researcher is responsible for setting up the null and alternative hypotheses based on the type of research conducted. For us, setting up the null and alternative hypotheses will stem from the careful interpretation of a "claim" found in the description of a given research statement or question. Typically, a claim is associated with the alternative hypothesis, but it is occasionally associated with the null hypothesis.

If the wording of the claim suggests equality of any kind, then it is associated with the null hypothesis. For example, if the claim states a parameter is "equal to", "greater than or equal to" (at least), or "less than or equal to" (at most) a given value, then it is associated with the null hypothesis. Alternatively, if the claim specifically lacks equality, that is, it states a parameter is less than, greater than, or unequal to a given value, then the claim is associated with the alternative hypothesis (i.e., it will only contain <, >, or ≠).
For instance, if we claim "the average surface temperature of the water in the North Atlantic in September is greater than 38 °F" (i.e., µ > 38), then this claim addresses the alternative hypothesis because "greater than" does not imply equality. For this claim, the alternative hypothesis will look like H₁: µ > 38. When the claim is associated with the alternative hypothesis, a null hypothesis needs to be devised so that it contradicts the alternative hypothesis. An easy way to accomplish this is to simply set the parameter equal to the value already specified in the alternative hypothesis. For example, a null hypothesis for the current alternative hypothesis could simply be stated as H₀: µ = 38. Another option for the null hypothesis would be to use H₀: µ ≤ 38, which would also contradict the alternative hypothesis, and it too implies equality. However, in an attempt to simplify this process, we will use the equal sign (=) instead of the less than or equal to sign (≤) or the greater than or equal to sign (≥).

Let's look at another example. Suppose a researcher makes the following claim: "The mean age of a person diagnosed with type II diabetes is less than 29 years of age." In this instance, the claim made by the researcher is again the alternative hypothesis because "less than" does not imply equality. Here the alternative hypothesis will look like H₁: µ < 29. Additionally, if a researcher makes the claim "The mean age of a person diagnosed with type II diabetes is greater than 29 years of age," then the alternative hypothesis becomes H₁: µ > 29. If a claim indicates a parameter (or parameters) is not equal to some value, as in the statement "The mean age of a person diagnosed with type II diabetes is not 29 years of age," the alternative hypothesis would be listed as H₁: µ ≠ 29.

We now need to fit a null hypothesis to each of these three alternative hypotheses. Fortunately, by utilizing straight equality in the null hypothesis, we can create a single null hypothesis that can be used with any of the three previous alternative hypotheses, that being H₀: µ = 29. Regardless of which of the previous three alternative hypotheses we want to test, this one null hypothesis contradicts all of them. We can get away with a single null hypothesis because the remaining steps involved in a hypothesis test are determined by the type of inequality used in the alternative hypothesis, and not by the one used in the null hypothesis. This is precisely why we are able to use the equals sign (=) exclusively in the null hypothesis and let the alternative hypothesis contain either <, >, or ≠.

Occasionally, the null hypothesis is specified in the "claim" and the alternative hypothesis has to be created to fit the research scenario.
For example, consider the claim "the mean age of men getting married for the first time is at least 25 years old". The phrase "at least 25" implies "greater than or equal to 25". Thus, the alternative hypothesis needs to contradict this statement and therefore should be written such that the parameter is "less than 25". Again, it is the alternative hypothesis that plays a role in the completion of the hypothesis test, so whether the null hypothesis utilizes the "greater than or equal to" sign or just the "equal to" sign, the test will remain the same. As a result, either of the following sets of hypotheses would be correct for this claim:

H₀: µ ≥ 25 and H₁: µ < 25
H₀: µ = 25 and H₁: µ < 25

However, as mentioned before, for the sake of simplicity, we will utilize the null hypothesis that contains only the equal sign.

Additional examples of setting up the null and alternative hypotheses through interpretation of different claims about various parameters are given here.

Claim: The mean body temperature of a healthy adult is not 98.6 degrees F. The claim "is not 98.6" implies this statement is associated with the alternative hypothesis, which will contain the not equal to symbol. As a consequence, the null hypothesis will contain the "equal to" symbol. The only correct set of hypotheses is H₀: µ = 98.6 and H₁: µ ≠ 98.6.

Claim: The mean monthly student loan payment of graduates from the University of Oklahoma is thought to be more than $340. Since the claim states "more than," with no mention of equality, it must be the alternative hypothesis. Thus, appropriate null and alternative hypotheses are H₀: µ = 340 and H₁: µ > 340.

Claim: The population proportion of Democrats who will vote against their own party in the upcoming election is less than a stated value. Since the statement contains the phrase "less than", with no mention of equality, it is referencing the alternative hypothesis. Also notice the claim is about a proportion, not a mean. Writing p₀ for the value stated in the claim, appropriate null and alternative hypotheses for this scenario are H₀: p = p₀ and H₁: p < p₀.

Claim: The mean temperature in Nashville during the month of July is at least 84 degrees. In this case the claim that the temperature is "at least" 84 degrees contains equality because "at least" means "greater than or equal to". Therefore, this claim addresses the null hypothesis, and thus the alternative hypothesis must consist of "less than 84". The null and alternative hypotheses can be written as H₀: µ = 84 and H₁: µ < 84.

Claim: The standard deviation associated with the number of text messages sent by teenagers per day in the US equals 16. This claim states that the standard deviation associated with US teenagers and their texting habits equals 16, suggesting the statement is associated with the null hypothesis. Note that there was no mention of "greater than", "less than", "at most", or "at least" anywhere in the statement. This means the only way the alternative hypothesis can contradict the claim is if it contains a "not equal to" sign. Consequently, the null and alternative hypotheses are H₀: σ = 16 and H₁: σ ≠ 16.

We can also make claims about two (or more) population parameters. An example might be "in September, the mean surface temperature of the North Atlantic will not equal the mean surface temperature of the North Pacific", which represents the alternative hypothesis and is written as H₁: µ₁ ≠ µ₂, where µ₁ and µ₂ denote the two population means. The null hypothesis that contradicts this alternative hypothesis is H₀: µ₁ = µ₂. Hypothesis tests like these, involving two or more parameters, will be the focus of future chapters.

To recap, for each of the hypothesis tests conducted in this chapter (and in the following chapters), three simple rules can be referenced in helping us with our construction of the null and alternative hypotheses.

1. The null hypothesis is always associated with an equals sign.
2. The alternative hypothesis never contains an equals sign, but contains either <, >, or ≠.
3. The null and alternative hypotheses always contradict each other.
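The three rules above can be sketched as a small Python helper. This is only an illustration, not part of the original text: the function name `hypotheses_from_claim`, the ASCII relation strings (`"!="`, `">="`, `"<="`), and the `H0`/`H1` labels are assumptions chosen for the example.

```python
def hypotheses_from_claim(parameter, relation, value):
    """Return (null, alternative, claim_side) for a claim such as ('mu', '>=', 84).

    Hypothetical helper: claims containing equality go with the null
    hypothesis; claims with a strict inequality go with the alternative.
    """
    strict = {"<", ">", "!="}                       # no equality: claim is H1
    opposite = {"=": "!=", ">=": "<", "<=": ">"}    # equality: claim is H0
    if relation in strict:
        alt_sign, claim_side = relation, "alternative"
    elif relation in opposite:
        alt_sign, claim_side = opposite[relation], "null"
    else:
        raise ValueError(f"unknown relation: {relation!r}")
    # Rule 1: the null hypothesis is always written with an equals sign.
    null = f"H0: {parameter} = {value}"
    # Rules 2 and 3: the alternative uses <, > or != and contradicts the null.
    alternative = f"H1: {parameter} {alt_sign} {value}"
    return null, alternative, claim_side


print(hypotheses_from_claim("mu", ">=", 84))     # Nashville temperature claim
print(hypotheses_from_claim("sigma", "=", 16))   # text-message claim
```

The helper always writes the null hypothesis with a plain equals sign, following the simplification adopted in this chapter.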

Chapter 8.2. The Origin of Hypothesis Testing

Regardless of whether the claim coincides with the null or alternative hypothesis, when conducting a hypothesis test we always assume the null hypothesis is true and test the reliability of the null hypothesis with sample data. We test the null hypothesis because, by assuming it is true, we are able to utilize the pre-established properties of sampling distributions. The basic idea of testing the null hypothesis involves using sample data to calculate a statistic that estimates the parameter of interest. Then, based on the proximity of the statistic to the hypothesized value of the parameter, we decide whether or not there is sufficient evidence to conclude the null hypothesis is false. These underlying ideas might best be encapsulated by considering an example.

The University President Example: Suppose the president of a university hypothesizes that the average age of students attending her university is 20.5 years. The president's claim (or hypothesis) implies equality, so the null and alternative hypotheses are H₀: µ = 20.5 and H₁: µ ≠ 20.5, respectively. Note that because no specific indication of testing for less than, greater than, at most, or at least was given, the alternative hypothesis must be µ ≠ 20.5. One way to investigate our null hypothesis is to take a random sample of students from the university and calculate their average age. Recall that, due to sampling error, values of the sample mean, x̄, vary from sample to sample and serve only as a "good guess" of the value of the population mean. We do not expect the sample mean, x̄, to equal the population mean, µ, but if the null hypothesis is true, we do expect a vast majority of x̄'s to be reasonably close to µ = 20.5. Therefore, if the value of the sample mean age is close to 20.5, we will have little evidence to suggest that µ = 20.5 is not a viable statement.
However, if the value of x̄ begins to deviate substantially from 20.5, then we start to question the legitimacy of the null hypothesis. This raises the question: just how far does x̄ need to deviate from the value of µ stated in the null hypothesis before we begin to suspect the null hypothesis is incorrect? The answer depends greatly upon the spread of the distribution of x̄'s. Fortunately, by making use of results provided by the central limit theorem, the spread of the distribution of x̄'s can be estimated. Recall that if n is at least 30, the sampling distribution of the x̄'s will be approximately normally distributed with a mean of µ and a standard deviation of σ/√n.

For instance, suppose a random sample of 30 university students was selected and their average age was found to be 20.8. Additionally, assume it was known that the value of σ is 2.4 years. Thus, the standard deviation of the sampling distribution is 2.4/√30, or about 0.438 years. More importantly, if the president's claim that the average age of the students at her university is 20.5 years is true, we would expect about 95% of the sample means to be within 1.96(0.438) of 20.5, or between about 19.64 and 21.36 years. This is shown in Figure 8.1, with 95% of the sample means falling between the red vertical lines.
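The arithmetic in the university example can be checked with a short Python sketch. The function name `central_interval` is an assumption made for this illustration; the inputs (hypothesized mean 20.5, σ = 2.4, n = 30) come from the example, and the normal model is the one given by the central limit theorem.

```python
from math import sqrt
from statistics import NormalDist


def central_interval(mu0, sigma, n, coverage=0.95):
    """Return (standard_error, lower, upper) for sample means under H0.

    Assumes the sampling distribution of the mean is normal with
    mean mu0 and standard deviation sigma / sqrt(n).
    """
    se = sigma / sqrt(n)
    # z-value that leaves (1 - coverage) / 2 in each tail, about 1.96 for 95%
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return se, mu0 - z * se, mu0 + z * se


se, lower, upper = central_interval(mu0=20.5, sigma=2.4, n=30)
print(f"standard error = {se:.3f}")
print(f"central 95% of sample means: {lower:.2f} to {upper:.2f}")
```

A sample mean inside this interval is the kind of value sampling error alone would commonly produce if µ really were 20.5.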

If the value of our sample mean, x̄, falls within this central 95% of the sampling distribution, we fail to reject the null hypothesis because we assume the difference between x̄ and µ is the result of sampling error (recall, sampling error is the reason different samples provide different sample means). That is, when the value of our sample mean is between the values which define the central 95% of the sampling distribution, we lack sufficient evidence to suggest the true mean is not 20.5. This is equivalent to saying that, based on the evidence provided by our sample mean, we do not have enough evidence to reject the null hypothesis. In this instance, our sample mean of 20.8 clearly falls among the commonly expected values of x̄, as indicated by the green line in Figure 8.2. Therefore, a sample mean of 20.8 provides very little evidence against the null hypothesis, meaning there is insufficient evidence to suggest the average age of students at the university is not 20.5 years.

Notice the last statement said nothing about evidence in support of the null hypothesis; we just don't have enough evidence to say that it is false. That is, the difference between the sample mean and the hypothesized population mean was not large enough to convince us otherwise. On the other hand, if the value of our sample mean, x̄, was much further away from 20.5, say 21.5, we should be less inclined to believe that the null hypothesis is true. Namely, the veracity of the null hypothesis would be in question because the value of the sample mean (21.5) is so extreme compared to the stated value of 20.5 that there appears to be more than just sampling error present. This would indicate that the true average age is not 20.5, but probably some value larger than 20.5. In fact, we can see from Figure 8.3 that the sample mean of 21.5 is not located within the middle 95% of the distribution. Instead, it deviates greatly from the hypothesized center of 20.5. Although it is possible that sampling error is the only cause of this sample mean being so extreme, the probability is very small. Therefore, instead of assuming the large distance between the sample mean and hypothesized population mean is the rare case of extreme sampling error, we adopt the more believable idea that the population mean is really larger than 20.5 and there is just a little sampling error. When the value of a statistic (our sample mean in this case) is beyond what would be expected due to sampling error alone, we say that our result is statistically significant.
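The decision rule described for the two sample means (20.8 and 21.5) can be sketched as follows. The function name `two_tailed_decision` and the use of a standardized distance are assumptions for illustration; the rule itself is the "middle 95%" comparison from the example, with σ = 2.4 treated as known.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_decision(sample_mean, mu0, sigma, n, alpha=0.05):
    """Compare a sample mean with the central (1 - alpha) region under H0."""
    se = sigma / sqrt(n)                          # spread of the sample means
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    z = (sample_mean - mu0) / se                  # distance from mu0 in standard errors
    return "reject H0" if abs(z) > z_crit else "fail to reject H0"


for xbar in (20.8, 21.5):
    print(xbar, two_tailed_decision(xbar, mu0=20.5, sigma=2.4, n=30))
```

A result of "reject H0" corresponds to what the text calls a statistically significant difference.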


Chapter 8.3. Setting up a Hypothesis Test

Once the null and alternative hypotheses are established, the next step is to determine whether we are conducting a one-tailed test or a two-tailed test, that is, whether there are one or two rejection regions. The number of tails in our hypothesis test is determined by the alternative hypothesis because it reveals where the test statistic needs to fall in order to reject the null hypothesis.

For instance, say we hypothesize that the average age of persons diagnosed with type II diabetes is not 29 years old, giving us null and alternative hypotheses of H₀: µ = 29 and H₁: µ ≠ 29. In this case, there are two different scenarios which allow us to reject the null hypothesis. We could reject the idea that µ = 29 if the sample mean is very small compared to the hypothesized population mean or if the sample mean is very large compared to the hypothesized population mean. Either way, we expect more than just sampling error to be causing the large difference between the hypothesized and sample means (i.e., a statistically significant difference). As a result, we need to keep both tails of the sampling distribution labeled as potential "rejection regions," where a rejection region is defined as any area of the distribution typically not attributed to sampling error alone (see Figure 8.4). This is fittingly called a two-tailed test because both tails are potential "rejection regions."

However, what if the claim was along the lines of "the mean age of a person who is diagnosed with type II diabetes is less than 29 years"? In keeping with the claim, the appropriate set of null and alternative hypotheses is H₀: µ = 29 and H₁: µ < 29. From inspection of the alternative hypothesis, in order to convince anyone that the alternative hypothesis is true, we will need a sample mean that is much smaller than 29, such that its value is beyond the range of sample means accounted for by sampling error. This type of situation would enable us to suspect that the difference between the sample and hypothesized mean is due to more than just sampling error. Recognize that sample means greater than 29 will surely not convince anyone that the alternative hypothesis is true. We call this a one-tailed hypothesis test (or more formally, a left-tailed hypothesis test), as the only rejection region falls in the left tail. Therefore, we only reject the null hypothesis if the value of our sample mean falls in the left tail of the distribution, as shown in Figure 8.5.

In a similar fashion, if the null and alternative hypotheses were stated as H₀: µ = 29 and H₁: µ > 29, then the rejection region would fall in the right tail of the distribution, because the only way to reject the null hypothesis is to obtain a sample mean larger than 29 such that the value of the sample mean is beyond what would be considered sampling error (see Figure 8.6).

Since the rejection region is found in the right tail, this too is a one-tailed test, but more specifically, we call it a right-tailed hypothesis test.
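As a rough illustration of how the alternative hypothesis fixes the location of the rejection region, the sketch below returns the cutoffs on the standard normal (z) scale. The function name `rejection_region` and the labels `"less"`, `"greater"`, and `"two-sided"` are assumptions made for this example, as is the default significance level of 0.05.

```python
from statistics import NormalDist


def rejection_region(alternative, alpha=0.05):
    """Return (lower_cutoff, upper_cutoff) in z units; None means no cutoff on that side.

    'less'      -> left-tailed test  (H1 uses <)
    'greater'   -> right-tailed test (H1 uses >)
    'two-sided' -> two-tailed test   (H1 uses "not equal")
    """
    std = NormalDist()
    if alternative == "less":
        return (std.inv_cdf(alpha), None)          # all of alpha in the left tail
    if alternative == "greater":
        return (None, std.inv_cdf(1 - alpha))      # all of alpha in the right tail
    if alternative == "two-sided":
        z = std.inv_cdf(1 - alpha / 2)             # alpha split evenly between tails
        return (-z, z)
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("less", "greater", "two-sided"):
    print(alt, rejection_region(alt))
```

A standardized sample mean below the lower cutoff or above the upper cutoff would fall in a rejection region.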

Chapter 8.4. Making an Error

When conducting a hypothesis test, we must remember that we can never be absolutely certain which hypothesis is the correct one. When we complete a hypothesis test, we select what we think is the correct hypothesis, but we may be wrong, because we are using a sample to make an inference about an entire population. For instance, in the example regarding the mean age of students at a university, we rejected the null hypothesis when the sample mean (x̄ = 21.5) fell outside of the middle 95% of the distribution. However, even when the null hypothesis is true, there is still a chance, although small (2.5% for each tail for a total of 5%), of an estimate (our sample mean in this case) falling outside the central portion of the distribution (the area due to sampling error). When one of these rare yet possible estimates occurs, we reject the null hypothesis when in fact it should be retained. If we reject the null hypothesis but the null hypothesis is correct, we have made an error called a type I error.

The probability of a type I error is denoted by α (the Greek letter alpha), where α is also called the "level of significance" or the "type I error rate". The value of α is easily obtained, as it is the researcher (you) who decides what this value is. The value selected by the researcher always corresponds to the area in the rejection region(s). Thus, if we decide to use the middle 95% of the sampling distribution to account for sampling error, then that leaves 5% in the tails, or α = 0.05. Just as confidence intervals should never have a level of confidence lower than 90% and rarely one greater than 99%, the value of α should never rise above 0.10 and should rarely fall below 0.01. Regardless of the value we select for α, we need to determine it before collecting our data and conducting our hypothesis test.
If we let the results of the hypothesis test influence which alpha level we choose, what will keep us from selecting an alpha level that supports the result "we" desire, instead of the results given by our test? Thus, the level of alpha is always chosen a priori ("before the fact") and never ex post facto ("after the fact").

A second type of error arises whenever we fail to reject the null hypothesis, but the null hypothesis is actually false. This type of error is called a type II error. The probability of a type II error is denoted by β (the Greek letter beta). For instance, in reference to the mean age of a college student example, a type II error can occur when the sample mean falls within the middle 95% of the sampling distribution, such as x̄ = 20.8, but the population mean is not 20.5, as specified in the null hypothesis. Because the deviation of this sample mean from the hypothesized population mean could be attributed entirely to sampling error, we would have no substantial reason to reject the null hypothesis. Therefore, even if the true population mean is a value other than 20.5, the null hypothesis will not be rejected and a type II error will be committed. The table below summarizes the errors (and the non-errors) one can make when conducting a hypothesis test.

Unfortunately, we can never be sure of the exact value of β because calculating it requires us to know the true value of µ when the null hypothesis is wrong. If we knew the real value of µ, we would not bother conducting a hypothesis test. Although we will not be able to directly calculate the probability of a type II error, we will discuss ways in which the chance of committing a type II error can be reduced.

Chapter 8.5. Power

The complement of the probability of a type II error is called power. Power is the probability of rejecting the null hypothesis when the null hypothesis is indeed false. Thus, power represents the probability of making a good decision. Although power will not be discussed in detail in this text, the concept of power is important. If our hypothesis test has high power, then we will be more likely to make the correct decision of rejecting a false null hypothesis. Also, when making comparisons between different statistical tests designed to accomplish the same goal, the one with the highest power is generally preferred. Mathematically, power is denoted as 1 − β, where β is the probability of a type II error.
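Because β depends on the true mean, it can only be computed for a value we pretend to know. The sketch below does exactly that for a two-tailed z-test: it takes a hypothetical true mean (`mu_true`, an assumption supplied by the reader, not something a researcher would have) and returns power as 1 − β. The function name and the reuse of the university example's numbers are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist


def power_two_tailed(mu0, mu_true, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed z-test for a hypothetical true mean."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se   # "fail to reject" region
    sampling = NormalDist(mu_true, se)    # how sample means really behave
    beta = sampling.cdf(upper) - sampling.cdf(lower)      # P(fail to reject | H0 false)
    return 1 - beta


# Hypothetical scenario: H0 says 20.5, but the true mean is assumed to be 21.5.
print(f"power = {power_two_tailed(20.5, 21.5, sigma=2.4, n=30):.3f}")
```

When `mu_true` equals `mu0`, the same calculation returns α, which is the probability of rejecting a true null hypothesis.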

Chapter 8.6. Relationships Between Type I Error, Type II Error, and Power

In this section we will discuss how the probability of a type I error, the probability of a type II error, and power are related to one another. To do so, turn your attention to Figure 8.7, where the blue curve (on the left) represents the distribution with respect to the null hypothesis, which in this case is centered at zero (i.e., µ = 0). Likewise, the red curve (on the right) represents the distribution specified in the alternative hypothesis, µ ≠ 0 (or more specifically, µ = 2). In reality we would never know the specific value of the parameter under the alternative hypothesis (µ = 2 in this case), but knowing it makes it easier to discuss the relationship between power, the probability of a type I error, and the probability of a type II error.

To better understand the interconnections between type I error, type II error, and power, we need to adhere to the following rules. When referencing type I error, we are assuming the null hypothesis is the correct hypothesis. Therefore, when discussing type I error, we will consider only the blue curve in Figure 8.7. When referencing type II error or power, we are assuming the alternative hypothesis is the correct hypothesis; thus, we will consider only the red curve in Figure 8.7.

When the value of a sample mean, x̄, falls between the vertical green lines, we will not reject the null hypothesis. In this case, if the null hypothesis is true, then no error has been made. However, if the sample mean falls to the left of the lower green line or to the right of the upper green line and the null hypothesis is true, we will incorrectly reject the null hypothesis and a type I error will be committed. The probability of a type I error (denoted α) is represented by the area under the blue curve outside the green lines. In Figure 8.7 this area is labeled "Type I Error" and coincides with the rejection regions.

On the other hand, if the true value of the population mean is two (i.e., the alternative hypothesis is correct) and the sample mean falls between the green lines, then we fail to reject the null hypothesis and a type II error is committed. The type II error rate, β (i.e., the probability of committing a type II error), is represented by the area under the red curve that falls between the green lines. In Figure 8.7 this area is labeled "Type II Error". Finally, if the sample mean falls to the left of the lower green line or to the right of the upper green line, then no error has been committed. In fact, this outcome is desirable, as we are rejecting a false null hypothesis in support of a true alternative hypothesis. As stated earlier, the probability of correctly rejecting the null hypothesis is power, and it is represented by the area under the red curve to the left of the lower green line and to the right of the upper green line. We can see from Figure 8.7 that almost all of the power falls to the right of the upper green line, which makes sense, as the value of the mean under the alternative hypothesis is greater than the mean under the null hypothesis (two is greater than zero). Conversely, if the value of the mean under the alternative hypothesis was smaller than the value of the mean given by the null hypothesis, then most of the power would be found under the red curve and to the left of the lower green line.

In our continued quest to see how type I error, type II error, and power are all related, consider Figure 8.8, which is similar to Figure 8.7, except the type I error rate has been reduced (i.e., the vertical green lines have been moved further apart). It is a common misconception to think that reducing the probability of a type I error is always beneficial. How could it be a bad thing to lower your chance of committing an error?
The problem with reducing α, the type I error rate, is that you simultaneously increase the probability of a type II error. As displayed in Figure 8.8, when the type I error rate decreases, the area under the blue curve and between the green lines increases. When this happens, the area between the green lines and under the red curve also increases, consequently increasing β. In addition, when β increases, power decreases. This diminishes our ability to correctly reject the null hypothesis when it is false. It is also worth noting how much the probability of a type II error increased from a very small decrease in the probability of a type I error. By comparing Figure 8.7 and Figure 8.8, notice that it was not an equal exchange. For this example, when α was decreased only a little, β increased substantially.

Additionally, as displayed in Figure 8.9, if we reduce the type II error rate to increase power, we in turn increase the type I error rate. This is yet another example of "you can't get something for nothing." If you reduce the type I error rate, your type II error rate increases and power decreases. Similarly, if your type II error rate is decreased and power increased, you do so at the cost of increasing the type I error rate. This is why it is common to use a type I error rate that strikes a "happy middle ground", such as 0.05. A type I error rate of 0.05 is small enough to limit the chance of rejecting the null hypothesis incorrectly, but large enough to ensure a relatively manageable type II error rate and, hopefully, a decent amount of power. In general, it is recommended that we select a type I error rate (a.k.a. level of significance) between 0.10 and 0.01.

When a hypothesis test is a one-tailed test instead of a two-tailed test, the areas representing the type I error rate, the type II error rate, and power are all on one side of the distribution. Recall that a one-tailed test places the entire type I error rate (or rejection region) into either the left or right tail of the distribution. For example, the location of the type I error rate, type II error rate, and power for a right-tailed test are displayed in Figure 8.10.
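The trade-off between α and β can be tabulated for a setup like the one in Figure 8.7 (null mean 0, alternative mean 2). The text does not give the spread of the curves, so the standard error of 1.0 used below is purely an assumption, as is the function name `type_ii_error`; the printed values therefore illustrate the direction of the trade-off, not the figures' exact areas.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0=0.0, mu_true=2.0, se=1.0):
    """beta for a two-tailed test: area under the alternative curve between the cutoffs.

    se=1.0 is an assumed spread; the source does not state it.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se   # the "green lines"
    alt = NormalDist(mu_true, se)                          # the "red curve"
    return alt.cdf(upper) - alt.cdf(lower)


for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

Moving down the list, α shrinks while β grows and power falls, which is the exchange described above.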

To see firsthand the relationship between type I error, type II error, and power, the text provides an interactive Alpha-Beta tool. With this tool, you can control not only the type I error rate but also the sample size, the standard deviation of the distributions, and the distance between the means hypothesized in the null and alternative hypotheses. Once all of these values are determined, the tool displays the probability of a type II error and the power. It can be used to investigate how altering the values of these variables influences the probability of a type II error and power. However, when using this tool, keep in mind that the only variables the researcher would have control over in the "real world" are the sample size, the level of α, and possibly whether a one-tailed or two-tailed test is conducted. The standard deviation would only be estimated after the sample is taken, while the distance between the means, the probability of a type II error, and power are never known. In the interactive tool, the variables that a researcher would have control over are coded in blue, while the variables that would generally be unknown to the researcher are coded in red.

It may be helpful to increase your understanding of the relationship between type I error, type II error, and power by answering the following questions.

For a set standard deviation and distance between the means:
1. Which has higher power, a one-tailed or a two-tailed test?
2. What happens to the probability of a type II error as the probability of a type I error is decreased?
3. What happens to the power as the probability of a type I error is decreased?

4. What happens to the probability of a type II error as the sample size is increased?
5. What happens to the power as the sample size increases?

For a set sample size and level of alpha:
1. What happens to the probability of a type II error as the standard deviation increases?
2. What happens to the power as the standard deviation increases?
3. What happens to the probability of a type II error as the distance between the means increases?
4. What happens to the power as the distance between the means increases?

Answers (first set):
1. The one-tailed test has higher power.
2. The probability of a type II error increases.
3. The power decreases.
4. The probability of a type II error decreases.
5. The power increases.

Answers (second set):
1. The probability of a type II error increases.
2. The power decreases.
3. The probability of a type II error decreases.
4. The power increases.
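The answers above can be reproduced numerically with a small sketch that stands in for the interactive tool. It is not the tool itself: the function name `z_test_power`, the baseline numbers, and the choice of a right-tailed test for the one-tailed case (with the true mean lying above the hypothesized mean) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(distance, sigma, n, alpha=0.05, tails=2):
    """Power of a z-test when the true mean lies `distance` above the hypothesized mean."""
    std = NormalDist()
    shift = distance / (sigma / sqrt(n))       # distance in standard-error units
    if tails == 1:                             # right-tailed test
        return 1 - std.cdf(std.inv_cdf(1 - alpha) - shift)
    z_crit = std.inv_cdf(1 - alpha / 2)        # two-tailed test
    return (1 - std.cdf(z_crit - shift)) + std.cdf(-z_crit - shift)


base = dict(distance=1.0, sigma=2.4, n=30)     # hypothetical baseline
print("two-tailed baseline :", round(z_test_power(**base), 3))
print("one-tailed          :", round(z_test_power(**base, tails=1), 3))
print("smaller alpha (0.01):", round(z_test_power(**base, alpha=0.01), 3))
print("larger n (60)       :", round(z_test_power(1.0, 2.4, 60), 3))
print("larger sigma (4.0)  :", round(z_test_power(1.0, 4.0, 30), 3))
print("larger distance (2) :", round(z_test_power(2.0, 2.4, 30), 3))
```

In each line, the probability of a type II error is one minus the printed power, so it moves in the opposite direction. The one-tailed advantage holds only when the true mean lies in the direction of the chosen tail.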

Chapter 8.7. Hypothesis Tests about a Population Mean, the Right Way!

In section 8.2, we discussed one method of determining which hypothesis appears to be true. Recall that in section 8.2 the mean age of students at a university was thought to be 20.5 years. To test this theory, a random sample of 30 students was selected and their mean age was determined. If the sample mean age fell in the middle 95% of the sampling distribution, we failed to reject the null hypothesis, whereas if the sample mean fell in either tail (the rejection regions), the null hypothesis was rejected and the alternative hypothesis was thought to be correct. While this method accomplishes the goal, it does not reflect the process used by researchers.

First, in section 8.2, we assumed we knew the population standard deviation, σ. In more authentic situations, σ will almost never be known and must be estimated with the sample standard deviation, s. Similar to constructing confidence intervals (see section 7.14), when s is used to estimate σ, we base our calculations on the t-distribution instead of the z-distribution.

In addition, we do not usually utilize the sampling distribution alone to determine if our sample mean is "extreme enough" to reject the null hypothesis, as was done in section 8.2. Although this method suffices, it is missing a key aspect of hypothesis testing: a value that measures the "strength of the evidence against the null hypothesis". For instance, in the example from section 8.2 the null hypothesis, H₀: µ = 20.5, was rejected when the sample mean was 21.5 because it fell into one of the rejection regions. But this process never really indicates just how "far out" the sample mean was compared to the hypothesized population mean, or more importantly, how uncommon it would be to get a sample mean as extreme as or more extreme than 21.5 if the null hypothesis was true.
Obviously the evidence was strong enough to reject the null hypothesis, but think about how much stronger the evidence would have been if the sample mean turned out to be say 25.8 (if 21.5 is extreme, then 25.8 is really extreme). Introducing the P-Value In order to mathematically state just how strong, or weak, the evidence is against the null hypothesis, we calculate the probability of getting a sample mean (or any other estimate for that matter) that is at least as extreme as the one obtained assuming the null hypothesis is true. This probability is called the p-value, and is used to determine the degree in which the null hypothesis is either rejected or retained. Graphically, the p-value is the area in the tail(s) beyond the sample mean. Regardless of where the sample mean falls in the sampling distribution, if the test is a left-tailed test, then the p-value is associated with the area to the left of the sample mean as shown in Figure :16:47 PM]

Similarly, if the test is a right-tailed test, the p-value can be graphically represented by the area to the right of the sample mean, as shown in the accompanying figure.

Two-tailed tests are approached differently because the definition of the p-value states "... at least as extreme as the one obtained, in the direction of the alternative hypothesis ...". Thus, when the alternative hypothesis indicates the rejection region is in both tails (≠), the p-value is the area that extends outward toward both tails, beyond not just the sample mean but also beyond the location of its complement (the mirror image, for a symmetric distribution), called the "pseudo" sample mean. An example of a two-tailed test is illustrated in Figure 8.13, where the solid green line represents the actual sample mean and the dashed green line represents the corresponding pseudo sample mean. The p-value is then found by combining the area to the left of the pseudo sample mean with the area to the right of the actual sample mean. Of course, if the actual sample mean is positioned in the left tail instead of the right, then the p-value is found by combining the area to the left of the sample mean with the area to the right of the pseudo sample mean.

Note that if the sample mean is not in the rejection region, then the area beyond the sample mean and the pseudo sample mean will be larger than the area defined by the rejection region (which is equal to the level of significance, α). Therefore, if the p-value is larger than the level of significance, we fail to reject the null hypothesis. However, if the sample mean falls in a rejection region, then the area beyond the sample mean and pseudo sample mean will be less than the level of significance, causing us to reject the null hypothesis. Note also that, because of the pseudo sample mean, the p-value for a two-tailed test on a symmetric distribution is twice the p-value of the equivalent one-tailed test taken in the direction of the sample mean.

In general, smaller p-values (smaller probabilities) indicate stronger evidence against the null hypothesis, while larger p-values indicate weaker evidence against it. Thus, if your p-value is very small (smaller than, say, 0.001), then you have very strong evidence in support of rejecting the null hypothesis. Keep in mind, however, that the decision rule does not change: we reject the null hypothesis whenever the p-value is smaller than the stated level of significance. The p-value simply gives us an "idea" of how strong our evidence is against the null hypothesis.
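The three tail rules described above can be sketched in a few lines. This is a minimal illustration, assuming the statistic has already been standardized and follows a symmetric distribution centered at zero (a standard normal is used here purely for convenience); the function name `p_value` and the example statistic of 2.1 are hypothetical.

```python
from statistics import NormalDist


def p_value(stat, tail, dist=NormalDist()):
    """Tail area beyond a standardized statistic.

    `dist` is assumed to be symmetric about 0, so the "pseudo" value is
    simply the mirror image -stat.
    """
    if tail == "left":
        return dist.cdf(stat)
    if tail == "right":
        return 1 - dist.cdf(stat)
    if tail == "two":
        # area beyond the statistic plus the area beyond its mirror image
        return dist.cdf(-abs(stat)) + (1 - dist.cdf(abs(stat)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


stat = 2.1    # hypothetical standardized sample mean
alpha = 0.05  # level of significance

for tail in ("left", "right", "two"):
    p = p_value(stat, tail)
    decision = "reject" if p < alpha else "fail to reject"
    print(f"{tail}-tailed: p-value = {p:.4f} -> {decision} the null hypothesis")
```

For a statistic in the right tail, the two-tailed area is double the right-tailed area, while the left-tailed area is close to 1 because almost the whole distribution lies to the left of the statistic.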
Determining the P-Value

To determine the p-value (i.e., the strength of the evidence against the null hypothesis), we must first convert the sample mean into a "test statistic" by standardizing it. This standardized value is called a test statistic because it is the value that is used to test the null hypothesis. A sample mean can be standardized (turned into a test statistic) by using Equation 8.1:

t = (x̄ - μ) / (s / √n)    (Equation 8.1)

where x̄ is the sample mean, μ is the population mean stated in the null hypothesis, s is the sample standard deviation, and n is the sample size. Once the test statistic is found, we can then find the p-value using an appropriate interactive tool (or table). This process is similar to the method used in chapter 6, where we standardized x-scores by transforming them into z-scores and then found the probabilities associated with those z-scores. An example problem (or two) will hopefully make this process clearer, but first, let's consider the situations in which Equation 8.1 provides us with reliable p-values.

The assumptions for conducting a hypothesis test on a population mean (i.e., what we need to assume if we are to use Equation 8.1 appropriately) are:
1. The data were collected via a simple random sample.
2. The sample size must be large enough to ensure an approximately normal sampling distribution. According to the central limit theorem, we need a sample size adequate to make the sampling distribution approximately normal. Generally, a sample size of at least 30 will suffice.

Oxygen Intake Example: A research scientist claims that the mean oxygen intake per breath for smokers is less than 40.6 ml/kg. Based on a sample of 35 smokers, the mean oxygen intake was found to be 39.2 ml/kg with a sample standard deviation of 3 ml/kg. Obviously the sample mean, 39.2 ml/kg, is less than the stated 40.6 ml/kg. But is this difference statistically significant, i.e., large enough to statistically convince us that the mean oxygen intake is less than 40.6 ml/kg? Stated differently, is there enough evidence to support the claim based on a level of significance of α = 0.05?

Step 1: Determine the Null and Alternative Hypotheses
Because the researcher believes the mean oxygen intake is less than 40.6 ml/kg, but not equal to it, the claim is the alternative hypothesis. Therefore the null and alternative hypotheses can be written as μ = 40.6 and μ < 40.6 respectively. The alternative hypothesis indicates a left-tailed test is to be conducted, meaning the entire rejection region falls on the left side of the distribution. Since the level of significance is set at 0.05, the area in the rejection region will be 0.05. If the p-value turns out to be less than or equal to 0.05, the null hypothesis should be rejected; otherwise the null hypothesis should be retained.

Step 2: Calculate the Test Statistic
Using Equation 8.1, determine the test statistic (the t-score) by standardizing the sample mean of 39.2 ml/kg with a sample standard deviation of 3 ml/kg:

t = (39.2 - 40.6) / (3 / √35) ≈ -2.76

Step 3: Find the p-value
Next, determine the correct degrees of freedom for the t-distribution, and then find the area to the left of the test statistic. This is equivalent to finding the probability of obtaining a sample mean as extreme or more extreme than 39.2 when the true mean is assumed to be 40.6 ml/kg. The correct degrees of freedom are n - 1 = 34. The probability of being to the left of 39.2 (or being to the left of a t-score of -2.76 based on 34 degrees of freedom) can be found using the interactive p-value calculator for a t-distribution. This application (along with others) is available in the floating menu at the right of the screen.

To use the interactive p-value calculator, we first set the degrees of freedom to the correct value (34 for the current example). Once the degrees of freedom are set, we determine whether we are conducting a left-tailed test, a right-tailed test, or a two-tailed test. Since we are currently conducting a left-tailed test, we use the box marked "Left Tailed". From here, to find the p-value associated with our test statistic, we adjust the slider (corresponding to the p-value) until the value in the box labeled "Left Tailed" matches the value of the test statistic (-2.76); if we cannot get exactly -2.76, we get as close as possible. Once the "Left Tailed" box contains the value of the test statistic, the corresponding p-value can be read from the "Corresponding p-value" box. The p-value for this example turns out to be about 0.005.
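For readers without access to the interactive calculator, Steps 2 and 3 can be reproduced in code. The following is a sketch only: it assumes the test statistic takes the usual one-sample form (sample mean minus hypothesized mean, divided by s/√n), and it approximates the t-distribution area by numerically integrating the density with Simpson's rule. The helper names `t_pdf` and `t_cdf` are hypothetical, not part of the text or its tools.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry about 0.

    `steps` must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


# Values from the Oxygen Intake Example
x_bar, mu_0, s, n = 39.2, 40.6, 3.0, 35
alpha = 0.05

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))  # Step 2: test statistic
df = n - 1                                    # degrees of freedom
p_left = t_cdf(t_stat, df)                    # Step 3: area to the left

print(f"t = {t_stat:.2f}, df = {df}, left-tailed p-value = {p_left:.4f}")
print("reject the null hypothesis" if p_left <= alpha
      else "fail to reject the null hypothesis")
```

The numerical integration stands in for the slider-based lookup; a statistics library or a printed t-table would serve the same purpose.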

Step 4: Decide which Hypothesis Appears to be True
Since the p-value is 0.005, the probability of getting a sample mean at least as extreme as 39.2, assuming the population mean is 40.6, is about 0.005 (not very likely). Since this probability is so small, we are led to believe that the null hypothesis is false. Therefore we reject the null hypothesis and favor the alternative hypothesis instead. In summary, since the p-value is smaller than the stated alpha level of 0.05, we reject the null hypothesis in favor of the alternative hypothesis.

Step 5: Write a Statement(s) Explaining Your Conclusion
An example of a good concluding statement would be: "Based on a sample of 35 smokers, there is sufficient evidence (p-value = 0.005) to reject the notion that the mean oxygen intake for a smoker is at least 40.6 ml/kg (the null hypothesis), and therefore, we conclude that the mean oxygen intake for smokers is lower than 40.6 ml/kg (the alternative hypothesis)."

Professor Salary Example: A study conducted by the American Chemical Society claims that the average annual salary of tenured chemistry professors is $70,000. To test this claim, the Association of American Chemistry Professors randomly selected 52 tenured chemistry professors and found their mean salary to be $70,150 along with a standard deviation of $900. Investigate the claim using an alpha level of 0.05 (the level of significance).

Step 1: Determine the Null and Alternative Hypotheses

Since there was no indication of less than or greater than, the alternative hypothesis must contain the not-equal sign. Therefore, the null and alternative hypotheses are μ = $70,000 and μ ≠ $70,000 respectively. As a result, this is a two-tailed test.

Step 2: Calculate the Test Statistic
The sample mean was $70,150. Using the stated standard deviation of $900 and the sample size of 52, we calculate the test statistic using Equation 8.1:

t = (70,150 - 70,000) / (900 / √52) ≈ 1.20

Step 3: Find the P-Value
After determining the degrees of freedom, we can find the p-value using the interactive p-value calculator for a t-distribution. Since we are conducting a two-tailed test, and the test statistic is positive, we find the p-value based on the "Upper two-tailed" critical value (if the test statistic were negative we would use the "Lower two-tailed" critical value). Thus, based on 51 degrees of freedom, the p-value that corresponds to a test statistic of t = 1.2 is about 0.24.

Step 4: Decide which Hypothesis Appears to be True
Because the p-value, 0.24, is larger than the stated alpha level of 0.05, we fail to reject the null hypothesis. The data do not provide us with sufficient evidence to suggest the null hypothesis is false, as the difference between $70,150 and the hypothesized $70,000 was not statistically significant. The probability of getting a sample mean at least as extreme as $70,150 when the population mean is assumed to be $70,000 is 0.24, meaning such a result will happen in about one out of every four samples, which is quite common.

Step 5: Write a Statement(s) Explaining Your Conclusion
An example of a good concluding statement would be: "Based on a sample of 52 tenured chemistry professors and a 0.05 level of significance, we do not have enough evidence (p-value = 0.24) to reject the claim that the mean salary of tenured chemistry professors is $70,000."

The same results can be obtained by using a statistical software package such as Minitab. For instance, consider the Minitab output that corresponds to our Professor Salary Example. Notice that, among other things, Minitab provides us with the value of the test statistic and the corresponding p-value. Regardless of whether we use Minitab or conduct the test "by hand", our decision remains the same, i.e., since the p-value is greater than 0.05 we fail to reject the null hypothesis.
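The "one out of every four samples" reading of the p-value can also be checked by simulation. The sketch below is not the method used in the text or by Minitab; it is a Monte Carlo illustration that assumes salaries are normally distributed with a mean of $70,000 and a standard deviation of $900 when the null hypothesis is true. The function names, the number of trials, and the random seed are arbitrary choices.

```python
import math
import random


def t_statistic(sample, mu_0):
    """One-sample t statistic: (mean - mu_0) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu_0) / math.sqrt(variance / n)


def simulated_two_tailed_p(t_obs, mu_0, sigma, n, trials=5000, seed=1):
    """Fraction of simulated null-hypothesis samples whose t statistic is
    at least as extreme (in either tail) as the observed one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        if abs(t_statistic(sample, mu_0)) >= abs(t_obs):
            hits += 1
    return hits / trials


# Observed test statistic from the Professor Salary Example
t_obs = (70150 - 70000) / (900 / math.sqrt(52))

p_sim = simulated_two_tailed_p(t_obs, mu_0=70000, sigma=900, n=52)
print(f"observed t = {t_obs:.2f}, simulated two-tailed p-value = {p_sim:.3f}")
```

The simulated fraction should land near the two-tailed p-value reported in Step 3, with some run-to-run variation that shrinks as the number of trials grows.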

8.8 Hypothesis Testing about a Population Proportion

Not only can we test values of hypothesized population means, but we can also test the values of hypothesized population proportions. Recall that proportions specify the part or percent of a population that has a specific characteristic or trait. For instance, we might hypothesize about the population proportion of registered voters who will participate in an upcoming election, or about the proportion of grizzly bears in Denali National Park that are male.

There are very few differences between conducting a hypothesis test for a population proportion and conducting a hypothesis test for a population mean (see section 8.7), because the basic steps are essentially the same. Additionally, all previously discussed terms, such as type I error, p-value, and power, retain their exact meaning. The only difference is that we are concentrating on proportions instead of means, so the null and alternative hypotheses will be statements about a population proportion instead of a population mean. Our test statistic will also be calculated from proportions, and it follows a z-distribution (as when confidence intervals about a population proportion were created). Thus, our test statistic takes the form of a z-score, given in Equation 8.2:

z = (p̂ - P) / √(P(1 - P) / n)    (Equation 8.2)

In Equation 8.2, P is the value of the population proportion (our hypothesized value), and p̂ is the sample proportion, which is defined as the number of observations with the trait/characteristic of interest, x, out of the number in our sample, n, i.e., p̂ = x/n. However, before working on some examples, we first need to consider the requirements necessary for Equation 8.2 to provide reliable results.

The assumptions for conducting a hypothesis test on a population proportion are:
1. The data were collected via a simple random sample.
2. The sample size must be large enough to ensure an approximately normal distribution. An easy rule of thumb is that if both nP > 15 and n(1 - P) > 15 are true, then the sample size is adequate. Since we are testing the null hypothesis, we must use the proportion stated in the null hypothesis, not the sample proportion, when checking this assumption (other textbooks claim that 15 can be replaced by 10 or as little as 5, although research hints otherwise).

Voter Example: A political analyst states that fewer than 30% of the voting population will vote in an upcoming city election. To support his claim, he randomly samples 400 registered voters in the city and determines that 98 plan to vote in the upcoming election. Conduct an appropriate hypothesis test for the political analyst using a level of significance α.

Step 1: Determine the null and alternative hypotheses
Because the political analyst stated that less than 30% will vote (no equality implied), the claim must be the alternative hypothesis. Therefore the null and alternative hypotheses can be written as P = 0.30 and P < 0.30 respectively. Note that it is the political analyst's goal to disprove the null hypothesis, thus verifying his statement.

Step 2: Check the assumptions
Since P = 0.30, then (400)(0.3) = 120 and (400)(1 - 0.3) = 280. Since both are greater than 15, the sample size is sufficient to use the z-distribution.

Step 3: Calculate the test statistic and p-value
The value of the sample proportion is p̂ = 98/400 = 0.245 and the test statistic is:

z = (0.245 - 0.30) / √((0.30)(0.70) / 400) ≈ -2.40

Because the alternative hypothesis states "less than," this is a left-tailed test and the rejection region is contained in the left tail only. Thus, to calculate the p-value that corresponds to the test statistic, we must first activate the p-value calculator for the z-distribution interactive tool (found in the floating menu to the right of the screen). Once activated, the interactive tool should look similar to the accompanying figure.
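The voter calculation can be sketched in code as an alternative to the interactive tool. This is an illustration under stated assumptions: the test statistic is taken to be the usual one-proportion z-score built from the hypothesized proportion, the rule-of-thumb threshold of 15 is the one quoted in the assumptions, and the function name `proportion_z_test` is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_z_test(x, n, p_null, tail="left", min_count=15):
    """One-proportion z-test.

    x: number of observations with the trait, n: sample size,
    p_null: proportion stated in the null hypothesis.
    Returns (z, p_value). Raises ValueError if the sample-size rule of
    thumb (n*p_null and n*(1 - p_null) both above min_count) fails.
    """
    if n * p_null <= min_count or n * (1 - p_null) <= min_count:
        raise ValueError("sample too small for the normal approximation")
    p_hat = x / n                                  # sample proportion
    std_error = math.sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / std_error
    normal = NormalDist()
    if tail == "left":
        p_value = normal.cdf(z)
    elif tail == "right":
        p_value = 1 - normal.cdf(z)
    else:  # two-tailed
        p_value = 2 * normal.cdf(-abs(z))
    return z, p_value


# Values from the Voter Example: 98 of 400 sampled voters plan to vote
z, p_value = proportion_z_test(x=98, n=400, p_null=0.30, tail="left")
print(f"z = {z:.2f}, left-tailed p-value = {p_value:.4f}")
```

The decision then follows as in section 8.7: reject the null hypothesis if the p-value is smaller than the chosen level of significance, and otherwise fail to reject it.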


More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Online 12 - Sections 9.1 and 9.2-Doug Ensley Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 12 - Sections 9.1 and 9.2 1. Does a P-value of 0.001 give strong evidence or not especially strong

More information

8 6 X 2 Test for a Variance or Standard Deviation

8 6 X 2 Test for a Variance or Standard Deviation Section 8 6 x 2 Test for a Variance or Standard Deviation 437 This test uses the P-value method. Therefore, it is not necessary to enter a significance level. 1. Select MegaStat>Hypothesis Tests>Proportion

More information

3. Mathematical Induction

3. Mathematical Induction 3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)

More information

Hypothesis Testing --- One Mean

Hypothesis Testing --- One Mean Hypothesis Testing --- One Mean A hypothesis is simply a statement that something is true. Typically, there are two hypotheses in a hypothesis test: the null, and the alternative. Null Hypothesis The hypothesis

More information

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe Introduction to the Practice of Statistics Fifth Edition Moore, McCabe Section 5.1 Homework Answers 5.7 In the proofreading setting if Exercise 5.3, what is the smallest number of misses m with P(X m)

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

6 3 The Standard Normal Distribution

6 3 The Standard Normal Distribution 290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

More information

Statistical estimation using confidence intervals

Statistical estimation using confidence intervals 0894PP_ch06 15/3/02 11:02 am Page 135 6 Statistical estimation using confidence intervals In Chapter 2, the concept of the central nature and variability of data and the methods by which these two phenomena

More information

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

p ˆ (sample mean and sample

p ˆ (sample mean and sample Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can t just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics

More information

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7. THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

More information

Means, standard deviations and. and standard errors

Means, standard deviations and. and standard errors CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

More information

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics Period Date LAB : THE CHI-SQUARE TEST Probability, Random Chance, and Genetics Why do we study random chance and probability at the beginning of a unit on genetics? Genetics is the study of inheritance,

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015 Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

More information

Comparing Means in Two Populations

Comparing Means in Two Populations Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

More information

ELEMENTARY STATISTICS

ELEMENTARY STATISTICS ELEMENTARY STATISTICS Study Guide Dr. Shinemin Lin Table of Contents 1. Introduction to Statistics. Descriptive Statistics 3. Probabilities and Standard Normal Distribution 4. Estimates and Sample Sizes

More information

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

More information

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Name: 1. The basic idea behind hypothesis testing: A. is important only if you want to compare two populations. B. depends on

More information

MATH 140 Lab 4: Probability and the Standard Normal Distribution

MATH 140 Lab 4: Probability and the Standard Normal Distribution MATH 140 Lab 4: Probability and the Standard Normal Distribution Problem 1. Flipping a Coin Problem In this problem, we want to simualte the process of flipping a fair coin 1000 times. Note that the outcomes

More information

Stats Review Chapters 9-10

Stats Review Chapters 9-10 Stats Review Chapters 9-10 Created by Teri Johnson Math Coordinator, Mary Stangler Center for Academic Success Examples are taken from Statistics 4 E by Michael Sullivan, III And the corresponding Test

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Tests for One Proportion

Tests for One Proportion Chapter 100 Tests for One Proportion Introduction The One-Sample Proportion Test is used to assess whether a population proportion (P1) is significantly different from a hypothesized value (P0). This is

More information

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

Estimation and Confidence Intervals

Estimation and Confidence Intervals Estimation and Confidence Intervals Fall 2001 Professor Paul Glasserman B6014: Managerial Statistics 403 Uris Hall Properties of Point Estimates 1 We have already encountered two point estimators: th e

More information

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1.

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1. Lecture 6: Chapter 6: Normal Probability Distributions A normal distribution is a continuous probability distribution for a random variable x. The graph of a normal distribution is called the normal curve.

More information

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

Chapter 7 TEST OF HYPOTHESIS

Chapter 7 TEST OF HYPOTHESIS Chapter 7 TEST OF HYPOTHESIS In a certain perspective, we can view hypothesis testing just like a jury in a court trial. In a jury trial, the null hypothesis is similar to the jury making a decision of

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont To most people studying statistics a contingency table is a contingency table. We tend to forget, if we ever knew, that contingency

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

6: Introduction to Hypothesis Testing

6: Introduction to Hypothesis Testing 6: Introduction to Hypothesis Testing Significance testing is used to help make a judgment about a claim by addressing the question, Can the observed difference be attributed to chance? We break up significance

More information

Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!

Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice! Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!) Part A - Multiple Choice Indicate the best choice

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Chapter 8: Hypothesis Testing for One Population Mean, Variance, and Proportion

Chapter 8: Hypothesis Testing for One Population Mean, Variance, and Proportion Chapter 8: Hypothesis Testing for One Population Mean, Variance, and Proportion Learning Objectives Upon successful completion of Chapter 8, you will be able to: Understand terms. State the null and alternative

More information

Testing Hypotheses About Proportions

Testing Hypotheses About Proportions Chapter 11 Testing Hypotheses About Proportions Hypothesis testing method: uses data from a sample to judge whether or not a statement about a population may be true. Steps in Any Hypothesis Test 1. Determine

More information

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS 125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information

2 Sample t-test (unequal sample sizes and unequal variances)

2 Sample t-test (unequal sample sizes and unequal variances) Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

More information