Chapter Five. Hypothesis Testing: Concepts

The Purpose of Hypothesis Testing
An Initial Look at Hypothesis Testing
Formal Hypothesis Testing
    Introduction
    Null and Alternate Hypotheses
    Procedure for Formal Hypothesis Tests
    Examples
Errors in Hypothesis Testing
    Introduction
    False Positive Errors
    False Negative Errors
    Summary: Choosing the Confidence Level
Chapter Checkpoint

The Purpose of Hypothesis Testing

The purpose of obtaining measurements of a chemical system is usually to draw some conclusions about the properties of the system. One of the simplest uses of statistics, one that has largely concerned us to this point, is to obtain an estimate of the system properties through the use of confidence intervals. This is an aspect of statistical estimation theory. Now, however, we turn our attention to decision theory, where we learn how we can use measurement statistics to draw general conclusions about chemical systems. The following are examples of situations where we want to draw some kind of conclusion based on measurements:

- Two reactants are mixed, and the concentrations of the products are monitored as a function of time in order to determine the rate constant, k, of the reaction. You want to compare the result of your measurement with a value calculated from theory.

- You have just come up with a new synthetic procedure for a certain commercial product that you believe increases the yield over the currently accepted method. You measure the yield by both methods, and you find that your method gives a 65% yield while the older method gave a 60% yield. You must show that your method is actually superior to the older method, and that the increase in yield is not due to the uncertainty in the measured values.

For a more detailed example, consider the following situation. Let's say we obtain the following measurements of the pH of a particular solution:

    pH measurements: 9.5, 9.9, 9.8

Now we wish to know whether it is possible to state, with confidence, that the pH of the solution is less than 10. If we can assume that the measurements are unbiased, we can restate this question in a form that can be evaluated with statistics, namely: is it true that pH < 10?

Assuming no measurement bias, the fact that none of the measured pH values is greater than 10 seems to support the notion that the true pH of the solution is less than ten. However, since a measurement of pH is a random variable, there is always a chance that the actual pH is indeed greater than ten, and that the three measurements, by random chance, all happen to be less than 10, just as there is a chance that three coin flips in a row will come up tails, even though there is a fifty-fifty chance of getting heads on any single toss.

Our problem is this: at what point can we say that random variability is an unlikely explanation for the difference between the measured pH values and a fixed value (e.g., a pH of 10)? In other words, when do the measured values differ significantly from the fixed value? The meaning of the word "significantly" must be very clear: a statistically significant difference in the values is a greater difference than could reasonably be explained by random error. This is exactly the type of question that hypothesis testing answers. Hypothesis tests are sometimes called significance tests, since they detect significant differences in numbers, differences that are unlikely to be due to random chance.
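To make the pH question concrete before the formal machinery is introduced, here is a minimal sketch of the calculation, assuming Python with NumPy and SciPy are available (the text itself works in a spreadsheet/worksheet environment); the variable names are illustrative only.

    import numpy as np
    from scipy import stats

    ph = np.array([9.5, 9.9, 9.8])              # the three pH measurements
    mean = ph.mean()                             # sample mean
    se = ph.std(ddof=1) / np.sqrt(len(ph))       # standard error of the mean

    # Studentized distance of the mean from the fixed value 10
    t_obs = (mean - 10.0) / se

    # One-tailed P-value for the claim "true pH < 10":
    # probability of a mean this far below 10 if the true pH were exactly 10
    p_one_tailed = stats.t.cdf(t_obs, df=len(ph) - 1)
    print(t_obs, p_one_tailed)

A small P-value here would mean that random error alone is an unlikely explanation for measurements this far below 10; the rest of the chapter develops exactly this logic.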

An Initial Look at Hypothesis Testing

Let's use an example to help us see how we might derive conclusions using random variables (i.e., measurements).

Example 5.1

A cigarette manufacturer states that the nicotine level of its cigarettes is 14 mg per cigarette. You wish to test this claim. You collect a random sample of 5 cigarettes and test for nicotine content. The measured nicotine levels (in mg) of the cigarettes in the sample are 14.05, 14.33, 16.36, 18.55, … Do these measurements indicate a nicotine level different than that claimed by the manufacturer?

Basically, what we would like to do is test the following statement:

    Hypothesis: The true nicotine level of the cigarettes is different from that claimed (14 mg) by the manufacturer.

Let's calculate the mean of the measured nicotine levels:

    x_bar = mean(x) = 15.61 mg

So the mean measured level of nicotine in the five cigarettes was 15.61 mg/cigarette. Obviously, this value is somewhat larger than the nicotine level stated by the manufacturer. The question is, however: is the difference between the nicotine levels significant? Do we have any justification for challenging the nicotine level claimed by the manufacturer?

In order to answer this question, we need more information than simply the measurement average: we must also make use of the observed variability of the five measurements to construct a confidence interval.

    s_x = stdev(x) = 1.87 mg                 sample standard deviation
    se = s_x / sqrt(5) = 0.84 mg             standard error of the mean value
    t = 2.776                                critical t-value for 4 df at the 5% level
    width = t * se = 2.32 mg
    x_lower = x_bar - t * se = 13.29 mg      lower boundary of CI
    x_upper = x_bar + t * se = 17.93 mg      upper boundary of CI

In this instance, the 95% confidence interval is 15.61 ± 2.32 mg/cigarette. Recall exactly what this interval represents: assuming no bias, this range of values (13.29 to 17.93 mg) contains the true amount of nicotine in the cigarettes analyzed, with 95% probability.
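For reference, the same confidence-interval calculation can be written as a short function; this is a sketch assuming NumPy and SciPy (the worksheet above is the text's own calculation), with `measurements` standing in for any set of replicate values.

    import numpy as np
    from scipy import stats

    def t_confidence_interval(measurements, confidence=0.95):
        """Two-sided t confidence interval for the population mean."""
        x = np.asarray(measurements, dtype=float)
        n = len(x)
        mean = x.mean()
        se = x.std(ddof=1) / np.sqrt(n)              # standard error of the mean
        t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
        half_width = t_crit * se
        return mean - half_width, mean + half_width

    # For five replicate measurements, t_crit = t(4, 0.025) = 2.776,
    # so the interval is mean +/- 2.776 * se, as in the worksheet above.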

Since the confidence interval calculated from the measurements on five cigarettes includes 14 mg, we cannot support the original hypothesis that the manufacturer's claimed nicotine level is incorrect. In other words, the difference between the measurement mean of 15.61 mg and the manufacturer's stated level of 14 mg is not significant.

Note that we must be very careful in how we phrase our conclusion. Even though the confidence interval includes the value 14 mg, we have not proven that the manufacturer's claim is true. In other words:

- We do not prove that [nicotine] = 14 mg/cigarette. We can only state that there is a 95% probability that the true nicotine content is somewhere between 13.29 and 17.93 mg; our best estimate of the nicotine content is 15.61 mg.

- We cannot prove (with 95% probability) that [nicotine] ≠ 14 mg/cigarette, since the 95% confidence interval contains this value.

We have just had our first brush with hypothesis testing, where we use data (containing random error) from an experiment to test an assertion. This is obviously an important area of statistics, and one that we will discuss in detail.

Formal Hypothesis Testing

Introduction

In the last section, a confidence interval was constructed in order to test a specific hypothesis. In scientific endeavors, there are a wide variety of hypotheses that may need to be tested using the results of one or more experiments. In this section, we will formalize the procedure to be used in hypothesis testing. Although the procedure may seem a little rigid, it can be adapted to almost any situation. The price for the general applicability of the procedure is the use of somewhat abstract language and concepts.

Null and Alternate Hypotheses

All hypothesis tests actually involve at least two statements, called the null hypothesis (H0) and the alternate (or working) hypothesis (H1). A statistical hypothesis is an assertion or conjecture concerning one or more population parameters. Basically, this step is a translation from words to population parameters.

The null hypothesis, H0, will generally involve an equality and one or more population parameters. In our nicotine example, the null hypothesis would be:

    null hypothesis    H0: µx = 14 mg/cigarette

In other words, we accept as the null hypothesis the manufacturer's claim that each cigarette contains 14 mg of nicotine. If the null hypothesis is true, and if there is no bias in the measurements, then the population mean µx of all measurements will be 14 mg. As you can see, the null hypothesis involves a population parameter (µx, the population mean of the measurements) and a statement of equality. As we will stress time and again, the null hypothesis cannot be proven. It is assumed as fact unless the data prove otherwise.

The alternate hypothesis, H1, will be a statement involving the same population parameters, in such a way that H1 and H0 cannot both be true. Usually the alternate hypothesis involves one of the following relational operators: ≠, <, or >. For our example,

    alternate hypothesis    H1: µx ≠ 14 mg/cigarette    (two-tailed test)

Alternate hypotheses such as this one, with a not-equals (≠) relationship, result in two-tailed tests. This statement claims that the measurement population mean is not 14 mg; if we assume no measurement bias, this hypothesis disputes the manufacturer's claim of nicotine level.

The form of both hypotheses is very important, particularly that of the alternate hypothesis, because it is the alternate hypothesis that we test in the hypothesis test procedure. Suppose we actually suspect that the manufacturer is underestimating the nicotine level in the cigarettes; in this case, we would use the following alternate hypothesis:

    a different alternate hypothesis    H1: µx > 14 mg/cigarette    (one-tailed test)

or, H1: the true nicotine content is greater than 14 mg/cigarette.

This form of H1 would result in a slightly different hypothesis test. Alternate hypotheses such as this one, with a greater-than (>) or less-than (<) relationship, result in one-tailed tests.

In the hypothesis testing procedure, we assume that the null hypothesis is true, and it is not tested. The goal of the procedure is to test the assertion embodied by the alternate hypothesis, H1. If H1 is proven to be true, then obviously H0 will be false. This format is exactly the same as that of the US criminal legal system, as represented in the famous statement "innocent until proven guilty." In statistical hypothesis testing, H0 is assumed to be true unless H1 can be proven to be true with reasonable certainty.

Procedure for Formal Hypothesis Tests

For easy reference, here is a list of the steps in hypothesis testing; each step will be discussed in detail.

1. Form the null hypothesis, H0, and the alternate hypothesis, H1, in terms of statistical population parameters.
2. Choose the desired confidence level (this is also sometimes expressed as a significance level).
3. Choose a test statistic and calculate its value.
4. Calculate the critical values; alternately, determine the P-value of the test statistic.
5. State the conclusion clearly, avoiding statistical jargon.

Step 1: State the null hypothesis (H0) and the alternate hypothesis (H1)

We have described the null and alternate hypotheses. Formulating these is the most difficult but most crucial part of the test procedure. Remember that we begin with an assumption that H0 is true, and that we are trying to test H1. We may be interested in either proving or disproving H1. The following table gives the null hypotheses for three common statistical tests. Note that the null hypothesis always involves population parameters, and (in these cases) is expressed as an equality.

    Situation: comparison of a random variable, x, and a fixed value, k
    Null hypothesis: H0: µx = k
    Question answered: Is there a significant difference between the mean of some measurements and some fixed value?

    Situation: comparison of the means of two variables, x and y
    Null hypothesis: H0: µx = µy
    Question answered: Is there a significant difference between the means of two sets of measurements?

    Situation: comparison of the variances of two variables, x and y
    Null hypothesis: H0: σx² = σy²
    Question answered: Is there a significant difference between the variances of two sets of measurements?

The alternate hypotheses, H1, in these cases may involve an inequality (≠) or a relational operator (< or >). As discussed previously, the form of H1 determines whether we use a one-tailed or a two-tailed test.

Step 2: Choose the desired level of confidence/significance

Remember that any confidence interval has an associated confidence level. The purpose of a confidence interval is to bracket the possible values of a population parameter such as µx. Random variables always add a little spice (i.e., uncertainty) to any conclusion; there is always a chance that we are wrong, since random variables are, well, random. So the confidence level is needed to state the probability that the population parameter is truly contained within our confidence interval. It is a measure of how much we trust the interval, how confident we are in our result.

Since confidence intervals play a crucial role in hypothesis testing, it is not surprising that we generally choose a confidence level when testing assertions using the results of experiments, which are almost always random variables. The meaning of the confidence level in hypothesis testing is slightly different than in confidence intervals, however.

Consider our example. We have two competing hypotheses: H0: µx = 14 mg and H1: µx ≠ 14 mg. We are testing the alternate hypothesis, H1, and there are two possible outcomes:

1. We succeed in proving that H1 is true, in which case H0 is known to be false.
2. We fail to prove that H1 is true. [Remember: we cannot prove that H0 is true.]

The confidence level in hypothesis testing measures our certainty when we succeed in proving H1. It is the probability that the conclusion that H1 is true and H0 is false is correct. Let's assume that we want to test at the 95% level for our example. That means that, if our test proves that the nicotine level is not 14 mg, there is a 95% probability that our data have led us to the proper conclusion.

You might wonder: why wouldn't I want to be very certain in my conclusion? In other words, shouldn't I always choose a high confidence level in hypothesis testing (at least 95%, and maybe 99% or even 99.9%)?

We will defer a discussion of the appropriate confidence level in testing to later in the chapter. But for now, ask yourself this question: why don't you similarly always choose a high confidence level in constructing confidence intervals? A 95% confidence interval is commonly given; why not always use 99%, or 99.9%? What effect would that have on the confidence interval? There are both advantages and disadvantages to choosing high confidence levels, as we will discover.

In statistics, the term significance level is probably more common than confidence level in hypothesis testing. The significance level (SL) is directly related to the confidence level (CL): SL = 100% - CL. Thus, instead of testing at the 95% confidence level, we may instead test at the 5% significance level and arrive at the same conclusions. Although we will tend to use the term confidence level in this text, you should be familiar with both terms.

Step 3: Choose a test statistic and calculate its value

The next step in hypothesis testing is to choose a statistic (the test statistic) appropriate for testing the hypotheses. The test statistic (like any statistic) is a value calculated in some manner from the data. Since the data presumably contain random error, the test statistic will likewise be a random variable. There are two requirements for a test statistic:

1. Its probability distribution must be known; preferably, tables of critical values exist for the statistic.
2. The test statistic should result in a reasonably good (or "efficient") hypothesis test.

What factors might make one test better than another? Let's come back to that point in a little bit. In example 5.1, the null and alternate hypotheses both deal with the population mean µx of the measurements, so it would seem that we could use the sample mean of the measurements as the basis for the test statistic. In constructing a confidence interval for µx, the t-distribution is used (when σx is not known). This suggests that the following test statistic, T, could be used in this hypothesis test:

    possible test statistic    T = (x̄ - 14) / s(x̄)

The test statistic is the studentized sample mean. It has a t-distribution; if H0 is true, then µT = 0. The sample mean is not the only possible basis for the test statistic: we could instead use the sample median, or some other form of weighted average. It turns out that for normally distributed data, the studentized sample mean is the best test statistic to use for hypothesis tests such as example 5.1.

Let's calculate the observed value of the test statistic for the five measurements in example 5.1:

    T_obs = (x_bar - 14 mg) / se = 1.92    the "studentized" mean: the number of std devs of the mean from 14 mg

In this equation, se is the standard error of the sample mean, x_bar. According to the observed test statistic, the mean of the measurements, 15.61 mg/cigarette, is 1.92 standard deviations from the manufacturer's claimed value of 14 mg/cigarette.
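As a cross-check on the worksheet arithmetic, the studentized mean is the same statistic returned by SciPy's one-sample t-test; a sketch, assuming SciPy, using hypothetical replicate values rather than the example's data:

    import numpy as np
    from scipy import stats

    data = np.array([14.1, 14.3, 16.4, 18.6, 15.2])   # hypothetical replicates, not example 5.1's values
    claimed = 14.0                                     # manufacturer's claimed level, mg/cigarette

    se = data.std(ddof=1) / np.sqrt(len(data))
    t_obs = (data.mean() - claimed) / se               # studentized sample mean

    # ttest_1samp computes the same statistic, along with a two-tailed P-value
    t_check, p_two_tailed = stats.ttest_1samp(data, popmean=claimed)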

Step 4: Calculate the critical value(s) or the P-value

It is important to keep in mind that the null hypothesis, H0, is "innocent until proven guilty." The probability distribution of the test statistic, T, assuming that the null hypothesis is true, is called the null distribution. The next step in hypothesis testing is to calculate the critical value(s) of the null distribution. For two-tailed tests, such as the one we must use for example 5.1, there are two critical values. (One-tailed tests have only a single critical value.) The null distribution of T is a t-distribution with four degrees of freedom and a mean of zero. Recalling that we chose 95% as our confidence level, the critical values are

    T_crit = ± t(4, 0.025) = ±2.776

Figure 5.1: Decision criteria for the hypothesis test for example 5.1. If the observed test statistic is above the upper critical value or below the lower critical value, then we accept the alternate hypothesis, H1, and reject the null hypothesis, H0.

The critical values are the boundaries between two decision-making regions:

- the acceptance region, between the two critical values. If the test statistic assumes a value in this region, then the null hypothesis, H0, is accepted. We cannot prove the alternate hypothesis, H1, with the desired confidence level.

- the rejection region, where T_obs > T_upper or T_obs < T_lower. If the test statistic is in this region, then H0 is rejected and H1 is accepted. We have proven that H1 is true at the desired confidence level.
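The critical values themselves come straight from the t-distribution; a minimal sketch, assuming SciPy:

    from scipy import stats

    confidence = 0.95
    df = 4                                   # degrees of freedom for five measurements
    alpha = 1 - confidence

    # Two-tailed test: put alpha/2 in each tail of the null distribution
    t_upper = stats.t.ppf(1 - alpha / 2, df)     # approximately +2.776
    t_lower = -t_upper                           # approximately -2.776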

By inspecting the null distribution, we can see how the critical values are chosen, and we can understand the role of the confidence level in hypothesis testing. Figure 5.1 shows the situation for a two-tailed test at the 95% confidence level. We choose the critical values so that 95% of the area under the null distribution is between them. What this means is that, if the null hypothesis is true, there is a 95% probability that the observed test statistic will fall within the acceptance region.

It is not strictly necessary to calculate the critical values. An alternative approach makes use of the concept of the P-value, which has been mentioned before. The P-value can be interpreted in terms of the null distribution; in particular, for a two-tailed test, the P-value is

    two-tailed P-value    P_obs = P(T > |T_obs|) + P(T < -|T_obs|) = 2 · P(T > |T_obs|)

Consider example 5.1: the mean of five measurements of nicotine content was 15.61 mg/cigarette, which is 1.92 standard deviations from the manufacturer's claimed value. Most statistical programs and spreadsheets will calculate the P-value; for example 5.1, the two-tailed P-value is P_obs = 0.1266. In other words, if the null hypothesis were true, there is a 12.66% probability that we would obtain a sample mean that is farther than 1.92 standard deviations from 14 mg/cigarette (in either direction).

The P-value is used instead of (or in addition to) critical values. It indicates the weight of the evidence in favor of the alternate hypothesis: the smaller the P-value, the less likely it is that random variability can account for the observed data. To tie the P-value approach to the critical-region approach, consider this: the P-value tells us the maximum confidence level that we can adopt and still prove the alternate hypothesis. We calculate this value by

    maximum confidence level:    CL = 100% · (1 - P_obs)

where CL is the confidence level as a percentage. For example 5.1, if we choose a confidence level of 87.34% or less, then we can prove that the alternate hypothesis is true. Of course, a smaller confidence level means that we are less confident of our conclusion, so we want a P-value as small as possible.

We may interpret the P-value more directly in terms of the significance level: the P-value is the largest significance level at which we may accept the alternate hypothesis. Thus, in this example, we can prove H1 at the 12.66% significance level, at best. Remember: a smaller significance level means we are more certain of this conclusion.
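The two-tailed P-value defined above can be evaluated directly from the t-distribution; a sketch, assuming SciPy, using the observed statistic quoted for example 5.1:

    from scipy import stats

    t_obs = 1.9243     # observed studentized mean from example 5.1
    df = 4

    # Two-tailed P-value: probability of a statistic at least this far from zero, in either direction
    p_two_tailed = 2 * stats.t.sf(abs(t_obs), df)     # roughly 0.127

    # Largest confidence level at which H1 could still be accepted
    max_confidence = 100 * (1 - p_two_tailed)         # roughly 87%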

Aside: calculating P-values in Excel

When the null distribution is a t-distribution, the P-value is calculated in Excel with the TDIST() function:

    calculating P-values in Excel    P_obs = TDIST(T_obs, df, tails)

where T_obs is the observed value of the test statistic, df is the degrees of freedom of the t-distribution, and tails is either one or two (for 1- or 2-tailed P-values). For example 5.1, you would enter =TDIST(1.9243, 4, 2) into any cell to obtain the 2-tailed P-value. Other Excel functions would be needed when the null distribution does not follow a t-distribution.

Step 5: State the conclusion

After we decide whether to accept H0 or H1, we must state our conclusion in a manner that is accurate and yet can be understood by anyone who does not have a background in statistics. Essentially, we must translate our conclusions from statistic-ese (e.g., "reject H0," "accept H1") into normal language. We should give both our conclusion and the confidence level, even though the confidence level is most properly understood in a statistical framework. For example 5.1, we accepted H0; we couldn't prove H1. In other words, our conclusion would be:

    We cannot prove with 95% confidence that the nicotine level in the cigarettes is different than 14 mg/cigarette.

This statement sounds like poor English (basically a double negative), but the wording was very carefully chosen. We begin with the assumption that the cigarettes have 14 mg of nicotine, and we fail to prove otherwise. This is similar to a jury returning a verdict of "Not Guilty" in a criminal trial. Notice that the verdict is not that the defendant was innocent, simply that guilt was not proven beyond a reasonable doubt. In hypothesis testing, the level of reasonable doubt is determined when the confidence level is set.

Examples

Let's try another two-tailed test. This test is similar in nature to example 5.1.

Example 5.2

A certain analytical procedure is being tested for the presence of measurement bias. Twenty measurements are made on a solution whose concentration (in µM) has been certified; call the certified value ξx and the sample mean of the measurements x_bar. The RSD of the individual measurements is 5.0%. Is there any evidence of measurement bias?

First let's set up the null and alternate hypotheses:

    H0: µx = ξx    There is no bias in the measurements.
    H1: µx ≠ ξx    Bias exists (two-tailed test).

The standard deviation of the individual measurements is s_x = RSD · x_bar, and the standard error of the mean is std_err = s_x / sqrt(20).

Let's use the studentized mean as the test statistic, and calculate its observed value:

    T_obs = (x_bar - ξx) / std_err    the sample mean is this many standard errors from the true value
    P_obs = 0.3978                    the two-tailed P-value of the observed test statistic

Now we look up the critical values from the t-tables. For 19 degrees of freedom, a 95% confidence level, and a two-tailed test, the critical values are -2.093 and +2.093. Since the observed value of the test statistic is within the acceptance region, we must accept the null hypothesis. Thus, we cannot prove bias in these measurements at the 95% confidence level. Note: from the observed P-value for this example, we see that we can prove H1 with only 60.22% confidence, at best.

Now let's try a one-tailed test.

Example 5.3

It is suspected that a series of six tests of blood alcohol level proves that the alcohol level is above the legal limit of 0.10%. Do these measurements prove legal intoxication with 95% confidence?

As always, the first step is to set up the null and alternate hypotheses. In this case, we should use the following:

    null         H0: µx = 0.10 %    blood alcohol level at the legal limit (assuming no bias)
    alternate    H1: µx > 0.10 %    blood alcohol level above the legal limit

It may be a little difficult to see why the null hypothesis should be that the blood alcohol level is exactly 0.10%. In setting up the hypotheses, it is best always to ask yourself: what is it that I want to test? What are the possible conclusions? The answers to these questions determine the form of the alternate hypothesis; the null hypothesis will follow. For this example, we want to test whether or not the alcohol level is above the legal limit. Remember that the purpose of the statistical test procedure is actually to test the alternate hypothesis, so we would propose as the alternate hypothesis that the alcohol level is too high.

The nature of the testing procedure is such that we either prove or fail to prove this hypothesis; i.e., our conclusion will be either that we can prove that the alcohol level is too high (a "guilty" verdict) or that we cannot prove an excessive alcohol level ("not guilty"). These conclusions are proper for our intentions in this example. Since we propose µx > 0.10% as our alternate hypothesis, the corresponding null hypothesis is µx = 0.10%.

The other thing to notice about the form of H1 in this example is that it results in a one-tailed test. This will affect the critical value (and the P-value, if we calculate it). Let's continue with our testing procedure by calculating the observed test statistic:

    x_bar = mean(x)                       sample mean of the six measurements
    std_err = stdev(x) / sqrt(6)          standard error of the mean
    T_obs = (x_bar - 0.10%) / std_err     studentized measurement mean
    P_obs = 0.00379                       probability of seeing a value larger than T_obs

The P-value is standard output for many statistical programs. In this case, the one-tailed P-value is 0.379%, which means that we could prove H1 at the 99.62% confidence level if we desired; certainly at the 95% level we may reject H0 and accept H1. However, it is difficult to use t-tables to calculate P_obs, so we will confirm this decision using the critical value approach. For a one-tailed test, there is only a single critical value, as shown in the next figure.

Figure 5.2: An example of a one-tailed test (H0: µ = k, H1: µ > k). There is only a single critical value. The top panel shows the null distribution; the critical value is chosen such that the area under the curve to the left of the critical value equals the chosen confidence level (95% for this example). The lower panel shows the decision process: if the observed test statistic is larger than the critical value, T_obs > T_crit, then the null hypothesis is rejected and the alternate hypothesis is proven.

Recall that the null distribution is the probability distribution of the test statistic, T, assuming that H0 is true. As the upper panel shows, we must choose the critical value such that, for the null distribution,

    P(T < T_crit) = CL

where CL is the chosen confidence level. For our example, we have chosen a confidence level of 95%. We can determine the critical value from the t-tables:

    one-tailed critical value    T_crit = t(ν, α) = t(5, 0.05)

where ν is the appropriate number of degrees of freedom, and α is the area in the right tail of the t-distribution. We determine the value of α from the confidence level:

    CL = (1 - α) · 100%

For our example, the t-tables tell us that the critical value is T_crit = 2.015. The observed test statistic calculated above is larger than this critical value, so we reject the null hypothesis and accept the alternate hypothesis. Our conclusion is:

    Assuming no measurement bias, the data show that the blood alcohol level is above the legal limit (at the 95% confidence level).
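The whole one-tailed procedure of example 5.3 fits in a few lines; a sketch, assuming SciPy, with `measurements` standing in for the six blood-alcohol readings (the values shown are hypothetical):

    import numpy as np
    from scipy import stats

    # Hypothetical values standing in for the six blood-alcohol measurements (%)
    measurements = np.array([0.104, 0.108, 0.102, 0.110, 0.105, 0.107])
    legal_limit = 0.10

    n = len(measurements)
    se = measurements.std(ddof=1) / np.sqrt(n)
    t_obs = (measurements.mean() - legal_limit) / se

    # One-tailed test at the 95% confidence level: single critical value t(5, 0.05)
    t_crit = stats.t.ppf(0.95, df=n - 1)          # about 2.015
    p_one_tailed = stats.t.sf(t_obs, df=n - 1)

    accept_h1 = t_obs > t_crit    # True means the level is proven to exceed the limit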

Errors in Hypothesis Testing

Introduction

Since they involve random variables, there is always an element of uncertainty in hypothesis tests. Specifically, there is always a chance that the conclusion of a test is in error. This uncertainty is the reason that you must specify a confidence level when you perform statistical tests. Choosing the confidence level allows you to determine the degree of uncertainty in your test: basically, you can control the likelihood that your conclusion is correct. As we will see, the confidence level also indirectly determines the ability of the statistical test to detect small differences and label them as significant.

How can the conclusion from a hypothesis test be in error? For tests with a single null hypothesis, H0, and a single alternate hypothesis, H1, the following table shows all the possibilities:

    decision                          H1 is not true    H1 is true
    accept H0 ("negative" result)     correct           false negative
    accept H1 ("positive" result)     false positive    correct

Let's illustrate with an example. Let's say someone undergoes a pregnancy test. Now the reality of the matter is that the person either is pregnant or she isn't. The test will either decide in favor of pregnancy (called a positive test result) or will decide that the subject is not pregnant (a negative result). We can draw an analogy to statistical hypothesis tests. We begin with the assumption (the null hypothesis) that the subject is not pregnant. The alternate hypothesis, the one we want to test, is that the subject is pregnant. A conclusion in favor of pregnancy (H1 is accepted) is considered a positive test result; however, if the subject actually is not pregnant (H0 is actually true), then our conclusion is in error. This situation, an incorrect acceptance of H1, is called a false positive. On the other hand, if the conclusion of the test is that the subject is not pregnant (H0 is accepted), and this conclusion is in error (H1 is actually true), then the test gives a false negative.

In the remainder of this section, we will describe how to calculate the probability that the result of a hypothesis test is in error (either a false positive or a false negative).

False Positive Errors

All of the hypothesis tests presented so far in this chapter have been of the following type: the null hypothesis is

    H0: µx = k    the true measurement mean is some fixed value, k

while the alternate hypothesis is one of the following:

    H1: µx ≠ k    the true measurement mean is not some fixed value, k (a two-tailed test)
    H1: µx > k    the true measurement mean is larger than some fixed value, k (a one-tailed test)
    H1: µx < k    the true measurement mean is smaller than some fixed value, k (a one-tailed test)

The decision criterion of the test is the following: if the observed test statistic, T_obs, is outside of the interval defined by the critical value(s), then we reject H0 and accept H1. A false positive occurs when T_obs is outside the H0 acceptance region when, in fact, H0 is true. The probability of a false positive is controlled by choosing the appropriate confidence level in a statistical test. To be exact,

    CL = 1 - α

where CL is the chosen confidence level and α is the probability of a false positive. In other words, when testing at the 90% confidence level, there is a 10% chance of falsely accepting H1.

Let's imagine that we are comparing a mean value, µx, to a fixed value k. Unknown to us, the null hypothesis is actually true. The following figure shows the null distribution of the test statistic, i.e., the probability distribution of the test statistic when the null hypothesis is actually true.

Figure 5.3: Choosing the critical values for a two-tailed test. If T_obs occurs between the critical values, then the null hypothesis is accepted; if not, then H1 is accepted. The shaded area (α/2 in each tail) is the probability of a false positive: it is the probability that T_obs does not fall between the critical values even though H0 is true.

Now we can see how the critical values are chosen for two-tailed tests: each tail must contain an area of α/2, so that the total probability of a false positive is α, the desired value.
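The statement that α is the false-positive probability can be checked by simulation; a sketch, assuming NumPy and SciPy, that repeatedly generates data for which H0 is true and counts how often the two-tailed test rejects it:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, trials, confidence = 5, 20_000, 0.95
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)

    false_positives = 0
    for _ in range(trials):
        # Data generated with H0 true: population mean exactly equal to the fixed value
        x = rng.normal(loc=14.0, scale=1.0, size=n)
        t_obs = (x.mean() - 14.0) / (x.std(ddof=1) / np.sqrt(n))
        if abs(t_obs) > t_crit:
            false_positives += 1

    print(false_positives / trials)    # should come out close to alpha = 0.05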

Now let's consider the probability of a false positive error for a one-tailed test. In such a test, there is only a single critical value. Let's imagine that we are testing for values that are greater than a fixed value, k; in other words, our alternate hypothesis is H1: µx > k. The next figure shows the null distribution, together with the critical value and the probability of a false positive.

Figure 5.4: Choosing the critical value for a one-tailed test. If T_obs is less than the critical value, then the null hypothesis is accepted; if not, then H1 is accepted. The shaded area (α, in the single tail) is the probability of a false positive. Note that the critical value was chosen such that the probability of a false positive, α, is the same as in figure 5.3.

To summarize, we set the probability of false positive error when we choose the confidence level. We must then choose the critical values according to our desired value of α. This means that, for a two-tailed test, the area in each tail of the null distribution must be α/2; for a one-tailed test, the area in the single tail (since there is only one critical value) will be α.

False Negative Errors

A false negative occurs when we incorrectly accept H0 when we should actually reject H0 and accept H1. In other words, the alternate hypothesis is actually true, but the test statistic still falls within the acceptance region (so that the null hypothesis is accepted). The next figure shows the probability distribution of the test statistic when the alternate hypothesis is true.

Figure 5.5: The probability distribution (not the null distribution) of the test statistic in a situation where the alternate hypothesis is actually true (in this case, µx > k). If the test statistic nevertheless falls below the critical value shown in the figure, the null hypothesis will be accepted: this is a false negative error. The shaded area shows the probability, β, of this occurring.

As we see in the figure, even when the alternate hypothesis is true, there is some chance (β) that the test statistic will be less than the critical value. This chance is the probability of a false negative error, β. In order to calculate β, we must know the value of the population parameter, µx. We can always calculate the value of β for some hypothetical situation in which we postulate a value for the population parameter. This type of exercise gives us some idea of how sensitive our testing procedure is in situations where the alternate hypothesis is actually true. The next example illustrates this point.

Example 5.4

You wish to develop a procedure to test for bias in the analysis of fluoride in water. During the analytical procedure, three independent measurements are obtained on a sample and averaged to determine the fluoride concentration. The standard solution to be used in the test is known to contain 0.45 w/w% F, and the RSD of the entire analytical procedure is known to be 0.10 (i.e., 10% RSD for the average of the three measurements).

(a) What are the critical values that can be used to determine if there is bias in a measurement?
(b) What values of the population measurement mean, µx, would result in a 90% probability that bias will be detected? In other words, what bias would result in acceptance (with 90% probability) of the alternate hypothesis in part (a)?

The true fluoride concentration, ξx, of the standard solution is 0.45 w/w%. The analytical procedure in this situation consists of obtaining three measurements and averaging them to obtain a point estimate of the fluoride concentration. We can calculate the standard error of the mean of three measurements:

    ξx = 0.45 w/w%
    RSD = 0.10
    σ_overall = RSD · ξx = 0.045 w/w%    the true standard error (a population parameter) is known

The null and alternate hypotheses will be

    H0: µx = ξx    there is no measurement bias
    H1: µx ≠ ξx    measurement bias exists (two-tailed test)

One thing is different about this hypothesis test, compared to all the others we have done: the true (i.e., population) standard deviation of the mean, σ(x̄3), is known. Thus, the test statistic will be the standardized difference between the mean of three measurements and the true concentration of the solution:

    test statistic    T = (x̄3 - ξx) / σ(x̄3)

where x̄3 is the mean of 3 measurements. Assuming that x̄3 is normally distributed, T will follow a normal distribution with a standard deviation of one. The null distribution, which assumes that µx = ξx, will follow a z-distribution (i.e., a standard normal distribution).

Let's set our confidence level at 99%; in other words, we are limiting the probability of false positives to 1%: α = 0.01. Now we can find the critical values. From the z-tables, we see that z(0.005) ≈ 2.576 (you should verify this; the actual value is 2.5758, as reported by Excel). Our decision rules for this hypothesis test are:

- if -2.576 < T_obs < +2.576, then accept H0. We cannot prove measurement bias with 99% confidence.
- if T_obs < -2.576 or T_obs > +2.576, then reject H0 and accept H1. We can prove bias with 99% confidence.

In this instance, it is useful to note that there is an equivalent way of stating these decision rules: if the observed measurement mean, x̄3, is more than 2.576 standard errors from the true concentration, ξx, then we have evidence of bias.

    crit_lower = ξx - z_crit · σ_overall = 0.334 w/w%
    crit_upper = ξx + z_crit · σ_overall = 0.566 w/w%

Alternate decision rules:

- if 0.334 w/w% < x̄3 < 0.566 w/w%, then we must accept H0
- if x̄3 < 0.334 w/w% or x̄3 > 0.566 w/w%, then we reject H0 and accept H1 at the 99% confidence level

You should realize that these rules are not different than the first ones; they would result in exactly the same conclusion for a given set of data. They just give another way of looking at the hypothesis test process.

Now let's look at part (b). We want to find the measurement population mean, µx, that would result in a 90% chance that measurement bias would be detected. Let's imagine that there is actually a certain amount of positive bias in the measurements. The probability that the bias will actually be detected is the area under the probability distribution curve that lies above the upper critical value. In other words, if we want to find the minimum amount of positive bias that will be detected with 90% probability, we need to find the measurement mean, µx, that satisfies

    P(x̄3 > 0.566 w/w%) = 0.90

This situation is shown in the following figure.

Figure 5.6: The critical values associated with the decision rules for two-tailed bias detection at the 99% confidence level are represented by dashed vertical lines on the axis of the measurement mean (w/w%). The probability distribution describes the mean of three positively biased measurements and results in β = 0.10; in other words, for measurements described by this distribution, there is a 10% chance of a false negative result when testing for bias at the 99% confidence level.

From the z-tables, we know that z(0.90) = 1.2816, which leaves a right-tailed area of 0.10. We must solve for µx in the following expression:

    (x_crit - µx) / σ(x̄3) = -z(0.90)

where x_crit is the upper critical value for the testing procedure, and σ(x̄3) is the standard error of the mean of three measurements. Solving for µx gives

    µx = x_crit + z(0.90) · σ(x̄3)

This is the mean of the probability distribution shown in the figure. Substituting 0.566 w/w% for the critical value and a standard error of 0.045 w/w% gives µx = 0.624 w/w%. This corresponds to a bias, γx, of

    γx = µx - ξx = 0.174 w/w%

If you repeat this procedure to find the negative bias that gives β = 0.10, you will find that a bias of γx = -0.174 w/w% gives the desired false negative probability. In other words, our calculations tell us that when testing for bias at the 99% confidence level under these conditions, we have a 90% chance of detecting a bias of 0.174 w/w%. This is useful information.
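The arithmetic of example 5.4 is easy to reproduce; a sketch, assuming SciPy, using the values stated in the example:

    from scipy import stats

    true_conc = 0.45                       # certified fluoride level, w/w%
    sigma_mean = 0.10 * true_conc          # standard error of the mean of three measurements

    alpha = 0.01                           # 99% confidence level, two-tailed
    z_crit = stats.norm.ppf(1 - alpha / 2)          # about 2.576

    crit_lower = true_conc - z_crit * sigma_mean    # about 0.334
    crit_upper = true_conc + z_crit * sigma_mean    # about 0.566

    # Smallest positive bias detected with 90% probability (beta = 0.10)
    z_power = stats.norm.ppf(0.90)                  # about 1.2816
    mu_detectable = crit_upper + z_power * sigma_mean
    bias = mu_detectable - true_conc                # roughly 0.17 w/w%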

If, for example, the sensitivity of our hypothesis test for bias detection is unacceptable, then we have two options: lower our confidence level from 99% (which would decrease our critical values) or average more measurements to decrease our standard error. We could also try to improve the precision of our method, so that the standard deviation of the individual measurements is smaller.

Summary: Choosing the Confidence Level

Choosing the confidence level directly determines the critical values and the value of α, the probability of a false positive error. Let's consider a two-tailed test,

    H0: µx = k
    H1: µx ≠ k

for which there are two critical values on the axis of the test statistic. Choosing a larger confidence level will cause the critical values to move further apart. True, this means that there is less chance of a false positive error; however, the power of the test to detect small differences between µx and k is decreased. In other words, there is a greater chance of a false negative error (i.e., β has increased). Thus there is always a compromise to consider in choosing the confidence level; values of 95% and 99% are very common. A numerical illustration of this trade-off is sketched after the examples below. The value chosen may depend on the potential consequences of errors. Consider the following situations:

- In employee drug testing, no employer wants to deal with false accusations. In such a situation, a high confidence level (99% or even higher) might be appropriate, because the consequences of a false positive (wrongly accusing an employee of taking drugs) are perceived to be more severe than missing the borderline cases.

- In screening patients for HIV, the consequences of a false negative (incorrectly concluding that the patient is not infected) are very severe. In this case, the confidence level might be set relatively low. To be sure, there will be an increase in false positives, but a separate, independent test can be performed on these patients.
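To see the compromise numerically, the sketch below (assuming SciPy, and using the standard error from example 5.4 with a hypothetical fixed bias) computes the false-negative probability β at several confidence levels; β grows as the confidence level is raised.

    from scipy import stats

    sigma_mean = 0.045     # standard error of the mean from example 5.4, w/w%
    bias = 0.10            # hypothetical fixed positive bias, w/w%

    for confidence in (0.90, 0.95, 0.99):
        z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)
        # False negative: the biased mean still lands below the upper critical value
        # (the tiny probability of falling below the lower critical value is neglected)
        beta = stats.norm.cdf(z_crit - bias / sigma_mean)
        print(confidence, round(beta, 3))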

Chapter Checkpoint

The following terms/concepts were introduced in this chapter:

    acceptance region
    alternate hypothesis
    critical value
    false positive
    false negative
    hypothesis test
    null hypothesis
    null distribution
    one-tailed test
    P-value
    rejection region
    significance level
    significance test
    statistical hypothesis
    statistical significance
    test statistic
    two-tailed test

In addition to being able to understand and use these terms, after mastering this chapter you should be able to:

- use formal hypothesis testing procedures to determine if there is a significant difference between a normally distributed random variable and a fixed value, using either a one- or two-tailed test
- interpret P-values from a hypothesis test
- explain the trade-offs in choosing a confidence level


More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

6: Introduction to Hypothesis Testing

6: Introduction to Hypothesis Testing 6: Introduction to Hypothesis Testing Significance testing is used to help make a judgment about a claim by addressing the question, Can the observed difference be attributed to chance? We break up significance

More information

Estimation and Confidence Intervals

Estimation and Confidence Intervals Estimation and Confidence Intervals Fall 2001 Professor Paul Glasserman B6014: Managerial Statistics 403 Uris Hall Properties of Point Estimates 1 We have already encountered two point estimators: th e

More information

1 Hypothesis Testing. H 0 : population parameter = hypothesized value:

1 Hypothesis Testing. H 0 : population parameter = hypothesized value: 1 Hypothesis Testing In Statistics, a hypothesis proposes a model for the world. Then we look at the data. If the data are consistent with that model, we have no reason to disbelieve the hypothesis. Data

More information

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript JENNIFER ANN MORROW: Welcome to "Introduction to Hypothesis Testing." My name is Dr. Jennifer Ann Morrow. In today's demonstration,

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

Testing Hypotheses About Proportions

Testing Hypotheses About Proportions Chapter 11 Testing Hypotheses About Proportions Hypothesis testing method: uses data from a sample to judge whether or not a statement about a population may be true. Steps in Any Hypothesis Test 1. Determine

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

The Wilcoxon Rank-Sum Test

The Wilcoxon Rank-Sum Test 1 The Wilcoxon Rank-Sum Test The Wilcoxon rank-sum test is a nonparametric alternative to the twosample t-test which is based solely on the order in which the observations from the two samples fall. We

More information

1.5 Oneway Analysis of Variance

1.5 Oneway Analysis of Variance Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

Point and Interval Estimates

Point and Interval Estimates Point and Interval Estimates Suppose we want to estimate a parameter, such as p or µ, based on a finite sample of data. There are two main methods: 1. Point estimate: Summarize the sample by a single number

More information

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table ANALYSIS OF DISCRT VARIABLS / 5 CHAPTR FIV ANALYSIS OF DISCRT VARIABLS Discrete variables are those which can only assume certain fixed values. xamples include outcome variables with results such as live

More information

Paired 2 Sample t-test

Paired 2 Sample t-test Variations of the t-test: Paired 2 Sample 1 Paired 2 Sample t-test Suppose we are interested in the effect of different sampling strategies on the quality of data we recover from archaeological field surveys.

More information

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010 MONT 07N Understanding Randomness Solutions For Final Examination May, 00 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

8 6 X 2 Test for a Variance or Standard Deviation

8 6 X 2 Test for a Variance or Standard Deviation Section 8 6 x 2 Test for a Variance or Standard Deviation 437 This test uses the P-value method. Therefore, it is not necessary to enter a significance level. 1. Select MegaStat>Hypothesis Tests>Proportion

More information

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE 1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

More information

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals 1. Be able to explain the difference between the p-value and a posterior

More information

Unit 26 Estimation with Confidence Intervals

Unit 26 Estimation with Confidence Intervals Unit 26 Estimation with Confidence Intervals Objectives: To see how confidence intervals are used to estimate a population proportion, a population mean, a difference in population proportions, or a difference

More information

Introduction. Statistics Toolbox

Introduction. Statistics Toolbox Introduction A hypothesis test is a procedure for determining if an assertion about a characteristic of a population is reasonable. For example, suppose that someone says that the average price of a gallon

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Week 3&4: Z tables and the Sampling Distribution of X

Week 3&4: Z tables and the Sampling Distribution of X Week 3&4: Z tables and the Sampling Distribution of X 2 / 36 The Standard Normal Distribution, or Z Distribution, is the distribution of a random variable, Z N(0, 1 2 ). The distribution of any other normal

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST UNDERSTANDING THE DEPENDENT-SAMPLES t TEST A dependent-samples t test (a.k.a. matched or paired-samples, matched-pairs, samples, or subjects, simple repeated-measures or within-groups, or correlated groups)

More information

p ˆ (sample mean and sample

p ˆ (sample mean and sample Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can t just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics

More information

Two Related Samples t Test

Two Related Samples t Test Two Related Samples t Test In this example 1 students saw five pictures of attractive people and five pictures of unattractive people. For each picture, the students rated the friendliness of the person

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

Statistics 2014 Scoring Guidelines

Statistics 2014 Scoring Guidelines AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

More information

Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so: Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a

More information

What is a P-value? Ronald A. Thisted, PhD Departments of Statistics and Health Studies The University of Chicago

What is a P-value? Ronald A. Thisted, PhD Departments of Statistics and Health Studies The University of Chicago What is a P-value? Ronald A. Thisted, PhD Departments of Statistics and Health Studies The University of Chicago 8 June 1998, Corrections 14 February 2010 Abstract Results favoring one treatment over another

More information

1. How different is the t distribution from the normal?

1. How different is the t distribution from the normal? Statistics 101 106 Lecture 7 (20 October 98) c David Pollard Page 1 Read M&M 7.1 and 7.2, ignoring starred parts. Reread M&M 3.2. The effects of estimated variances on normal approximations. t-distributions.

More information

1.7 Graphs of Functions

1.7 Graphs of Functions 64 Relations and Functions 1.7 Graphs of Functions In Section 1.4 we defined a function as a special type of relation; one in which each x-coordinate was matched with only one y-coordinate. We spent most

More information

Math 4310 Handout - Quotient Vector Spaces

Math 4310 Handout - Quotient Vector Spaces Math 4310 Handout - Quotient Vector Spaces Dan Collins The textbook defines a subspace of a vector space in Chapter 4, but it avoids ever discussing the notion of a quotient space. This is understandable

More information

Topic #6: Hypothesis. Usage

Topic #6: Hypothesis. Usage Topic #6: Hypothesis A hypothesis is a suggested explanation of a phenomenon or reasoned proposal suggesting a possible correlation between multiple phenomena. The term derives from the ancient Greek,

More information

Experimental Analysis

Experimental Analysis Experimental Analysis Instructors: If your institution does not have the Fish Farm computer simulation, contact the project directors for information on obtaining it free of charge. The ESA21 project team

More information

8 Divisibility and prime numbers

8 Divisibility and prime numbers 8 Divisibility and prime numbers 8.1 Divisibility In this short section we extend the concept of a multiple from the natural numbers to the integers. We also summarize several other terms that express

More information

Chapter 7. One-way ANOVA

Chapter 7. One-way ANOVA Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Correlational Research

Correlational Research Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.

More information

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394 BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394 1. Does vigorous exercise affect concentration? In general, the time needed for people to complete

More information

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

More information

Chapter 6: Probability

Chapter 6: Probability Chapter 6: Probability In a more mathematically oriented statistics course, you would spend a lot of time talking about colored balls in urns. We will skip over such detailed examinations of probability,

More information

Simple Inventory Management

Simple Inventory Management Jon Bennett Consulting http://www.jondbennett.com Simple Inventory Management Free Up Cash While Satisfying Your Customers Part of the Business Philosophy White Papers Series Author: Jon Bennett September

More information