Running Head: SIGNIFICANCE TESTING AND CORRELATIONS

Testing the Statistical Significance of a Correlation

R. Michael Furr
Wake Forest University

Address correspondence to:
Mike Furr
Department of Psychology
Wake Forest University
Winston-Salem, NC 2706
furrrm@wfu.edu

Testing the Statistical Significance of a Correlation

Researchers from Psychology, Education, and other social and behavioral sciences are very concerned with statistical significance. If a researcher conducts a study and finds that the results are statistically significant, then he or she has greater confidence in the effects revealed by the study. When results are statistically significant, researchers are more likely to believe that the effects are real and not likely to have occurred by chance. The goal of science is to understand our physical or social world, and this occurs in part by being able to judge which research findings are real and which are flukes and red herrings.

This paper describes the procedures through which researchers determine whether the results of a study are statistically significant. It presents the logic, technical steps, and interpretation of a test of statistical significance, specifically for researchers examining a correlation between two variables. Many textbooks provide in-depth introductions to statistical significance, but there appear to be no sources that provide such an introduction in the context of correlations. Most introductory statistics textbooks in Psychology present the concepts and procedures of significance testing in the context of means, and the extension to correlations is usually very brief. Typically, the coverage of significance testing for correlations, if it is discussed at all, focuses on computational procedures, bypassing the conceptual foundations and interpretations. In fact, the organization of some introductory statistics textbooks implies that correlations and significance testing are completely separate issues. For example, a chapter on correlation might be in a section labeled "Descriptive Statistics" and the chapters related to significance testing might be included in a section labeled "Inferential Statistics."

Although general statistics textbooks omit deep coverage of the conceptual and practical foundations of significance testing for correlations, one might suspect that such coverage could be found in sources that focus specifically on correlational procedures (e.g., Archdeacon, 1994; Bobko, 2001; Chen & Popovich, 2002; Cohen & Cohen, 1983; Edwards, 1984; Ezekiel, 1941; Miles & Shevlin, 2001; Pedhazur, 1997). Unfortunately, these sources also omit in-depth discussions of basic concepts in significance testing. The more advanced sources naturally assume that readers already have a solid grasp

of basic concepts in significance testing. Unfortunately, even the more introductory sources provide little background in the basic concepts of statistical significance as related to correlations.

A number of potential problems arise from the fact that no sources provide in-depth discussions of significance testing as related to correlations. First, some budding researchers might be left with the mistaken and potentially confusing belief that correlations and significance tests are unrelated issues. Although the computation and interpretation of a correlation can proceed without reference to a significance test, correlations are rarely reported without an accompanying significance test. Second, even if researchers are aware that correlations can be tested for statistical significance, they might have difficulty connecting fundamental concepts in significance testing (e.g., parameters, confidence intervals, distributions of inferential statistics) to correlations. The existing sources make little effort to generalize concepts articulated in the context of means or frequencies to correlational analyses. Third, the existing sources create difficulty for course instructors who cover correlational analyses before other kinds of analyses. For example, some Psychology Departments divide their Research Methods and Statistics courses into a correlational semester and an experimental semester. If the correlational course is taken before the experimental course, then instructors who teach the correlational course face a dilemma. They can ignore significance testing of correlations, they can provide cursory coverage of it, or they can assign readings that present significance testing in the context of means or frequencies.

A solid understanding of significance testing as related to correlations may be particularly important as the field evolves in two ways. First, researchers are increasingly aware of the importance of effect sizes, such as correlations (American Psychological Association, 2001; Capraro & Capraro, 2003; Furr, 2004; Heldref Foundation, 1997; Kendall, 1997; Murphy, 1997; Rosenthal, Rosnow, & Rubin, 2000; Thompson, 1994, 1999; Wilkinson & APA Task Force on Statistical Inference, 1999). Second, many in the field recognize that regression, built on a correlational foundation, is a general approach to data analysis that can incorporate much of what is typically conceptualized as Analysis of Variance. As the awareness and use of effect sizes and correlational analytic procedures continue to grow, and as advanced

correlational procedures continue to emerge, researchers should have a solid understanding of the connections between correlations and significance testing. The current paper is intended to partially fill this hole in the basic statistical literature. It describes what statistical significance is about, presents fundamental concepts in evaluating statistical significance, and details the procedures for testing the statistical significance of a correlation.

Samples and Populations: Inferential Statistics

Imagine that Dr. Cartman wants to know whether the Scholastic Aptitude Test (SAT) is a valid predictor of college freshman performance at the local university. To address this issue, he recruits a sample of 200 freshmen from the university. The students give their consent for Dr. Cartman to have access to their academic records, from which he records their SAT scores and their first-year college Grade Point Average (GPA). Based on these data, Dr. Cartman finds that the correlation between SAT scores and GPA is .40, a positive correlation of moderate size. This correlation tells him that, within the sample, students with relatively high SAT scores tend to have relatively high GPAs (and students with relatively low SAT scores tend to have relatively low GPAs). Based on this finding in his sample, Dr. Cartman is tempted to conclude that the SAT is indeed a useful predictor of freshman GPA at the university.

But how much confidence should Dr. Cartman have in this conclusion? He might be hesitant to use the results found in a sample of 200 students to make an inference about whether SAT scores are correlated with GPAs in the entire freshman student body. The question of statistical significance arises from the fact that scientists would like to make conclusions about psychological phenomena, effects, differences, or relationships between variables as they exist in a large population (or populations) of people (or rats, monkeys, etc., depending on the scientist's area of expertise). For example, Dr. Cartman would like to make conclusions about whether SAT scores are correlated with GPAs in the entire freshman student body. Similarly, a clinical psychologist might be interested in whether a new drug generally helps to alleviate depression within the population of all people who might take the drug. Or a social psychologist hypothesizes that romantic

couples in which the partners have similar profiles of personality traits tend to be happier than couples in which the partners have dissimilar personalities. This researcher would be interested in concluding whether similarity and romantic happiness are generally correlated with each other within the population of all couples in romantic relationships.

Despite their desire to make conclusions about large populations, researchers generally study only samples of people recruited from the larger population of interest. In our example, Dr. Cartman would like to make conclusions about the entire freshman class at the university, but he is able to recruit a sample of only 200 students from the student body. Similarly, a clinical psychologist cannot study all people who might ever take a drug, and a social psychologist cannot study all romantic couples. Researchers such as Dr. Cartman gather data from relatively small samples of people, and they use the sample data to make inferences about the existence and size of psychological phenomena in the larger population from which the sample was drawn.

Researchers must be concerned about the accuracy with which they can use data from samples to make inferences about the population from which the sample was recruited. Dr. Cartman recognizes that the 200 students who happened to be in his study might not be perfectly representative of the entire freshman class. It is possible that, in the student body as a whole, there is no association between SAT scores and GPA. That is, in the population from which the sample of 200 students was drawn, the correlation between SAT and GPA could be zero. Even if the correlation in the population is exactly zero, Dr. Cartman could potentially obtain a random sample of students in which the correlation between SAT and GPA is not zero. His particular sample of 200 students might be unusual in some subtle way. Just by chance, Dr. Cartman might have recruited a sample in which people who scored relatively high on the SAT also tend to have relatively high GPAs. Thus researchers must be concerned about sampling error: the fact that a particular sample might not be perfectly representative of the population from which it was randomly drawn. The potential presence of sampling error means that researchers can never be totally confident that results found in a sample are perfectly representative of what is really going on in the population.
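To make sampling error tangible, the following short simulation (a sketch in Python with NumPy, not part of the original paper) draws repeated random samples of 200 cases from a population in which the true correlation is exactly zero, and shows that the sample correlations scatter around zero rather than landing on it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # sample size, matching Dr. Cartman's study

sample_rs = []
for _ in range(1000):
    # Draw two independent (uncorrelated) variables for n people,
    # so the population correlation rho is 0 by construction.
    x = rng.normal(size=n)
    y = rng.normal(size=n)
    # np.corrcoef returns the 2x2 correlation matrix; [0, 1] is r.
    sample_rs.append(np.corrcoef(x, y)[0, 1])

sample_rs = np.array(sample_rs)
print(f"Mean sample r: {sample_rs.mean():+.3f}")               # near 0
print(f"Largest |r| observed: {np.abs(sample_rs).max():.3f}")  # not 0
```

Even though the population correlation is zero by construction, individual samples routinely produce nonzero correlations; significance testing exists to judge when a sample correlation is too large to attribute to this kind of chance variation.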

The procedures for evaluating statistical significance help researchers determine how well their sample's results represent the population from which the sample was drawn. Roughly speaking, when we find that results are statistically significant, we have confidence in inferring that the effects observed in the sample represent the effects in the population as a whole. In our example, Dr. Cartman would want to know whether the correlation he found in the sample's data is statistically significant. If it is, then he will feel fairly confident in concluding that SAT scores are correlated with GPA in the student body as a whole. If his correlation is not found to be statistically significant, then he would not feel confident concluding that SAT scores are correlated with GPA in the student body as a whole. Because we use this process to help us make inferences about populations, the statistical terms and procedures involved in this process are called inferential statistics.

There are standard terminologies describing inferential statistics and the connections between samples and populations. We use the sample data to calculate a correlation between two variables, such as SAT and GPA. This correlation is labeled with an r, and it is called a descriptive statistic because it describes some aspect of the sample that is actually observed in our study. We then might use the correlation observed in the sample to estimate the correlation in the population from which the sample is drawn. Because we cannot study the entire population, we can only make an informed guess about the correlation as it exists in the population. The correlation in the population is labeled with the Greek letter rho (ρ), and it is called a parameter.

Statistical Hypotheses: Null and Alternative

Consider again Dr. Cartman, who wishes to know if the SAT is correlated with GPA in the entire freshman class. If Dr. Cartman is conducting a traditional significance test of a correlation, then he will consider two possibilities. One possibility, called the null hypothesis, is that the SAT is not correlated with GPA within the freshman student body. More technically, the null hypothesis states that the population correlation parameter (ρ) is zero. The null hypothesis is often labeled H0 and written as:

H0: ρ = 0

This expresses exactly what Dr. Cartman is doing: he will be testing the null hypothesis that the correlation in the population is equal to zero. The second possibility, called the alternative hypothesis, is that the SAT is correlated with GPA within the freshman student body. More technically, the alternative hypothesis states that the population correlation parameter (ρ) is not zero. The alternative hypothesis is often labeled H1 or HA and written as:

H1: ρ ≠ 0

The Decision About the Statistical Hypotheses

For traditional significance testing of a correlation, Dr. Cartman faces a decision between two competing hypotheses. In making this decision, Dr. Cartman has two options. First, he might reject the null hypothesis, thereby concluding that, in the population, the two variables are correlated with each other. In other words, the results of the analysis of his sample's data make him feel confident enough to conclude that the population correlation is some value other than zero. Second, he might fail to reject the null hypothesis, thereby concluding that, in the population, the two variables are not correlated with each other. In other words, the results of the analysis of his sample's data do not make him feel confident enough to conclude that the population correlation is a value other than zero. Note that, strictly speaking, both options are phrased in terms of rejecting the null hypothesis: Dr. Cartman can either reject the null or he can fail to reject the null. Researchers generally do not phrase the decision in terms of accepting the null or in terms of the alternative hypothesis. For these reasons, the traditional procedures are called null hypothesis significance testing.

The decision regarding the null hypothesis is tied to the notion of statistical significance. If a researcher rejects the null hypothesis, then the result is said to be statistically significant. If a researcher fails to reject the null hypothesis, then the result is said to be not statistically significant. Practically speaking, the default decision is to fail to reject the null hypothesis, treating the variables as uncorrelated in the population. Researchers reject the null hypothesis only when their

sample data make them confident enough to override the default decision and guess that the null hypothesis is incorrect.

In his sample of 200 freshmen, Dr. Cartman found a correlation of .40 between SAT and GPA. The question that Dr. Cartman faces is: do his sample findings make him confident enough to reject the null hypothesis that the correlation in the entire freshman student body is zero? Two issues arise when determining whether the sample findings make Dr. Cartman confident enough to reject the null hypothesis. First, how confident is Dr. Cartman that the null hypothesis is false? In other words, how confident should he be that, in the entire freshman class, SAT truly is correlated with GPA? The second issue is how confident he needs to be in order to actually reject the null hypothesis. Psychology and related sciences have reached a consensus regarding the degree of confidence that a researcher should have before rejecting a null hypothesis. These two issues are considered in turn, as part of a process called a t-test.

Testing the Null Hypothesis: What Affects Our Confidence?

Two main factors make Dr. Cartman more or less willing to conclude that there is a non-zero correlation between SAT and GPA in the entire freshman student body. One factor affecting his confidence is the size of the correlation in his sample. In his sample's data, Dr. Cartman found a correlation of r = .40, which represents a positive correlation of moderate size. But what if he had found that the correlation in his sample was much weaker, say only r = .12? Dr. Cartman recognizes that a correlation of only r = .12 is not very different from a correlation of zero. Therefore, he probably would not be very confident in concluding that the population correlation was anything but zero. In other words, if the correlation in the population is indeed zero (i.e., if ρ = 0), then it would not be very surprising to randomly draw a sample in which the observed correlation is small, only slightly different from zero. But what if Dr. Cartman had found that the correlation in his sample was very strong, say r = .80? A correlation of r = .80 is very far from zero; it expresses a very strong association between two variables. Therefore, he probably would be much more confident in concluding that the population correlation was not zero. In other words, if the correlation in the population is indeed zero (i.e., if ρ = 0), then it would be

very surprising to randomly draw a sample in which the correlation is so far away from zero. In sum, the size of the correlation in the sample will affect Dr. Cartman's confidence in concluding that the population correlation is anything but zero: larger sample correlations will increase his confidence that the population correlation is not zero.

The second factor affecting his confidence in rejecting the null is the size of the sample itself. In his study, Dr. Cartman was able to recruit 200 participants. But what if he had been able to recruit a small sample of only 15 participants? Dr. Cartman probably would not be very confident in making inferences about the entire freshman student body based on a study of only 15 participants. On the other hand, if he had been able to recruit a sample of 500 students (a much larger proportion of the population), then Dr. Cartman would be more comfortable in making inferences about the entire student body. Therefore, larger samples increase his confidence in making inferences about the population, and smaller samples decrease his confidence.

We can quantify the amount of confidence that a researcher should have in rejecting the null hypothesis that the correlation in the population is zero (i.e., H0: ρ = 0). We compute a t value, which is an inferential statistic that can be conceptualized roughly as an index of the degree of confidence in rejecting the null hypothesis. The formula for computing the t value reflects the two factors discussed above, the size of the correlation and the size of the sample:

t_OBSERVED = (r / √(1 - r²)) × √(N - 2)    (Equation 1)

The t_OBSERVED is the t value derived from the data observed in the actual sample of participants in the study, r is the correlation in the sample, and N is the number of participants in the sample. In Dr. Cartman's data:

t_OBSERVED = (.40 / √(1 - .40²)) × √(200 - 2)
t_OBSERVED = .436 × 14.071
t_OBSERVED = 6.135
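As a quick check on this arithmetic, here is a minimal Python sketch (an illustration, not part of the original paper) that implements Equation 1:

```python
import math

def t_observed(r: float, n: int) -> float:
    """Equation 1: t for testing H0: rho = 0, given sample r and sample size n."""
    return (r / math.sqrt(1 - r**2)) * math.sqrt(n - 2)

print(round(t_observed(0.40, 200), 3))  # ~6.141 (the paper's 6.135 reflects
                                        # the rounded intermediate value .436)
print(round(t_observed(0.12, 15), 3))   # ~0.436, the small-sample example below
```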

Large t values reflect more confidence in rejecting the null hypothesis. Consider the t value that would be obtained for a sample in which the correlation is only .12 and the sample size is only 15:

t_OBSERVED = (.12 / √(1 - .12²)) × √(15 - 2)
t_OBSERVED = .121 × 3.606
t_OBSERVED = .436

This t value is noticeably lower than the t value found in the larger sample with the larger correlation, and the lower t value reflects the lower confidence that we would have in rejecting the null hypothesis. In sum, effect size (i.e., the size of the correlation) and sample size are the two key factors affecting a researcher's confidence in rejecting the null hypothesis. These two factors are part of what researchers call the power of a significance test (Cohen, 1988). Larger effect sizes (correlations farther away from zero) and larger sample sizes increase our confidence in rejecting a null hypothesis, reflecting a powerful significance test.

Testing the Null Hypothesis: How Confident Do We Need to Be?

Once we know the factors influencing Dr. Cartman's confidence in rejecting the null hypothesis, we can consider the question of how confident he needs to be in order to reject the null. We have seen that larger correlations and larger samples produce greater confidence, as reflected in larger t values. But how large does a t value need to be in order for Dr. Cartman to decide to reject the null hypothesis that the population correlation between SAT and GPA is zero?

Confidence can be framed in terms of the probability that we would be making an error if we rejected the null hypothesis. Recall that a researcher never really knows whether the null hypothesis is true or false (because researchers typically cannot include entire populations in their studies). Researchers collect data on a sample that is drawn from the population of interest, and then they use the sample's data to make educated guesses about the population. But even the most well-educated guess could be incorrect. Significance testing is most directly concerned with what is called a Type I

error. A Type I error is made when a researcher rejects the null hypothesis when in fact the null hypothesis is true. That is, a researcher makes a Type I error when he or she concludes that two variables are correlated with each other in the population when, in reality, the two variables are not correlated with each other in the population. If Dr. Cartman rejects the null hypothesis in his study, then he is saying that there is a very low probability that he is making an incorrect rejection.

The probability of an event occurring (i.e., the probability that a mistake will be made) ranges from 0 to 1.0, with probabilities near zero meaning that the event is very unlikely to occur. Thus, a probability of 0 means that there is absolutely no chance that a mistake will be made, and a probability of 1.0 means that a mistake will definitely be made. Values between these two extremes reflect differing likelihoods of the event. A probability of .50 means that there is a 50% chance that a mistake will be made, and a probability of .05 means that there is only a 5% chance (a pretty remote chance) that a mistake will be made.

By convention, psychologists have adopted the probability of .05 as the criterion for determining how confident a researcher needs to be before rejecting the null hypothesis. Put another way, if Dr. Cartman finds that his study gives him a level of confidence associated with less than a 5% chance of an incorrect rejection of the null hypothesis, then he is allowed to reject the null hypothesis. Traditionally, psychologists have assumed that, if researchers are so confident in their results that they have such a small chance of making a Type I error, then they are allowed to reject the null hypothesis. Researchers often use the term alpha level when referring to the degree of confidence required to reject the null hypothesis. By convention, most significance tests in psychology are conducted with an alpha level of .05.

Statisticians have made connections between the observed t values computed earlier and the p value (alpha level) associated with incorrectly rejecting the null hypothesis. How can Dr. Cartman determine whether his observed t value allows him to be confident enough to assert that he has less than a 5% chance of making a Type I error? To do this, Dr. Cartman must identify the appropriate critical t value, which will tell him how large his observed t value must be in order for him to reject the null hypothesis in

his study. The critical t value that Dr. Cartman will use reflects a .05 probability of incorrectly rejecting the null hypothesis. It is the t value that is exactly associated with a 5% chance of making a Type I error.

To identify the appropriate critical t value, Dr. Cartman can refer to a table of critical t values. Many basic statistics textbooks and research methods textbooks include tables of critical t values, such as the one presented in Table 1 (see the end of this paper). Dr. Cartman must consider only two issues when identifying the critical t value for his study, and these two issues are reflected in the columns and rows of the table. Table 1 presents several columns of t values. These columns represent different degrees of confidence required, in terms of the probability of making a Type I error. Because psychology and similar sciences have traditionally adopted a probability level of .05 as the criterion for rejecting a null hypothesis, Dr. Cartman will typically be concerned only with the values in the column labeled .05. Table 1 also presents several rows, and each row represents a different sized study. The rows are labeled df, which stands for degrees of freedom. Degrees of freedom is linked to the number of participants in the sample; specifically, df = N - 2. Dr. Cartman determines that the degrees of freedom for his study is df = 198 (200 - 2 = 198). Referring to a table of critical t values, Dr. Cartman pinpoints the intersection of the .05 column and the appropriate row. The entry at this point in the table is 1.972; he will use this critical t value to help decide whether to reject the null hypothesis.
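Instead of consulting a printed table, the same critical value can be obtained from the t distribution directly. A minimal Python sketch (assuming SciPy is available; not part of the original paper):

```python
from scipy import stats

alpha = 0.05  # conventional Type I error rate
df = 200 - 2  # degrees of freedom for Dr. Cartman's study

# Two-tailed test: alpha is split across both tails, so look up 1 - alpha/2.
t_critical = stats.t.ppf(1 - alpha / 2, df)
print(round(t_critical, 3))  # 1.972, matching the Table 1 entry
```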

Testing the Null Hypothesis: Making the Decision

The decision about the null hypothesis is made by comparing an observed t value to the appropriate critical t value. If Dr. Cartman finds that the absolute value of his observed t value is larger than the critical t value, then he will decide to reject the null hypothesis. If Dr. Cartman finds that the absolute value of his observed t value is not larger than the critical t value, then he fails to reject the null hypothesis. In shorthand terms:

If |t_OBSERVED| > t_CRITICAL, then reject H0
If |t_OBSERVED| < t_CRITICAL, then fail to reject H0

In his case, Dr. Cartman rejects the null hypothesis, because the absolute value of his observed t value is larger than the critical t value (6.135 > 1.972). These statistically significant results tell Dr. Cartman that it is highly unlikely to find a correlation of .40 (a moderate effect size) in a sample of 200 participants (a fairly large sample) if the correlation in the population is zero. Therefore, he rejects the null hypothesis and concludes with confidence that the correlation in the population is probably not zero. That is, he concludes that in the entire freshman student body at his university, SAT scores are indeed correlated with GPA.

For a more general perspective, it might be worth considering other patterns of results. As a second example, imagine that a second researcher, Dr. Marsh, had obtained a correlation of .12 from a sample of 15 participants. In this case, the observed t value would be t_OBSERVED = .437. Looking at Table 1, in the .05 column and the row for df = 13 (df = 15 - 2), he finds that the critical t value is t_CRITICAL = 2.160. Here, the absolute value of t_OBSERVED is less than t_CRITICAL, so Dr. Marsh would fail to reject the null hypothesis. The effect size is small (i.e., it is not very different from zero), and the sample is small (only 15 people). With such weak results and such a small sample, Dr. Marsh is not confident enough to reject the idea that the correlation in the population is zero. Thus, his correlation is not statistically significant.

As a third example, imagine that Dr. Broflovski had obtained a negative correlation (say, r = -.40) from a sample of 200 participants. In this case, the observed t value would be t_OBSERVED = -6.135 (note that this is a negative observed t value). Looking at Table 1, in the .05 column and the row for df = 198, he finds that the critical t value is t_CRITICAL = 1.972. Here, the absolute value of t_OBSERVED is greater than t_CRITICAL (6.135 > 1.972), so Dr. Broflovski would reject the null hypothesis in this case. Dr. Broflovski has found a moderately-sized correlation (i.e., it is fairly different from zero) in a fairly large sample of participants. Dr. Broflovski concludes that he would be highly unlikely to find a moderate effect size in a fairly large sample if the correlation in the population were zero. Note that the direction of the correlation (positive or negative) does not make a difference in this example. Dr. Broflovski has conducted what is known as a

two-tailed test or a non-directional test. This means that he is testing the null hypothesis that the correlation in the population is zero. This hypothesis can be rejected if the correlation in the sample is positive or if it is negative; either way could convince him that the correlation in the population is not likely to be zero. Table 2 presents a summary of the steps in conducting a typical null hypothesis significance test of a correlation.
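The full decision procedure is easy to script. The sketch below (Python with SciPy; an illustration rather than part of the original paper) reproduces the three examples above and also reports the exact two-tailed p value that statistical software would print:

```python
import math
from scipy import stats

def test_correlation(r: float, n: int, alpha: float = 0.05) -> None:
    """Two-tailed null hypothesis significance test of H0: rho = 0."""
    t_obs = (r / math.sqrt(1 - r**2)) * math.sqrt(n - 2)  # Equation 1
    df = n - 2
    t_crit = stats.t.ppf(1 - alpha / 2, df)               # critical t value
    p = 2 * stats.t.sf(abs(t_obs), df)                    # exact two-tailed p
    decision = "reject H0" if abs(t_obs) > t_crit else "fail to reject H0"
    print(f"r={r:+.2f}, N={n}: t={t_obs:+.3f}, t_crit={t_crit:.3f}, "
          f"p={p:.3f} -> {decision}")

test_correlation(0.40, 200)   # Dr. Cartman: reject H0
test_correlation(0.12, 15)    # Dr. Marsh: fail to reject H0 (p is about .67)
test_correlation(-0.40, 200)  # Dr. Broflovski: reject H0
```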

Interpreting the Decision

A significance test comes down to a decision between two choices. Usually this decision concerns whether or not the correlation is zero in the population from which the sample has been drawn. Inferential statistics help us determine the likelihood that a given sample's results might have occurred either: a) because the sample is drawn from a population in which the correlation is not zero, or b) purely by chance, with the sample being drawn from a population in which the correlation is zero.

We reject the null hypothesis when the probability level associated with our results suggests that our results are unlikely to have occurred if the null hypothesis were true. Again, the primary example of Dr. Cartman shows this situation: he obtained a moderate correlation in a large sample. His significance test tells him that this result is unlikely to have occurred in this sample if indeed the correlation in the population is zero. He therefore concludes that the null hypothesis is false (i.e., he concludes that the population correlation is not zero), and decides to reject it.

We fail to reject the null hypothesis when the probability level associated with our results suggests that our results are not unlikely to have occurred if the null hypothesis were true. In the second example, Dr. Marsh found a weak correlation in a small sample. The significance test indicates that the results might very well occur even if the correlation in the population is zero. Dr. Marsh therefore concludes that the null hypothesis might not be false (i.e., the population correlation might well be zero), and so he decided not to reject it.

You are likely to hear a variety of different interpretations of a correlation that is statistically significant. For example, Dr. Cartman's results (r = .40, p < .05) from his sample of N = 200 might lead him to make statements such as:

- The correlation is significantly different from zero.
- It's unlikely that the sample came from a population in which the correlation is zero.
- In the population from which the sample was drawn, the two variables are probably associated with each other.
- The observed data are unlikely to have occurred by random chance.
- There is less than a .05 probability (i.e., a very small chance) that the results could have been obtained if the null hypothesis is true.
- He is 95% confident that the population correlation is not zero.
- If this study were done 100 times (each with a random sample of N = 200, drawn from a population in which the correlation is zero), we would get a correlation of magnitude .40 or stronger (i.e., |r| ≥ .40) fewer than 5 times.
- Given that the results are unlikely to have occurred if the null were true, the null is probably not true.

You are also likely to hear a variety of different interpretations of a correlation that is not statistically significant. For example, the second example (r = .12, p > .05, N = 15) might lead to statements such as:

- The sample's correlation is not significantly different from zero.
- It's not unlikely that the sample came from a population in which the correlation is zero.
- In the population from which the sample was drawn, the variables are likely to be uncorrelated with each other.
- The observed data might very well have occurred by random chance.

- There is more than a .05 probability (i.e., not a small chance) that the results could have been obtained even if the null hypothesis is true.
- He cannot be 95% confident that the population correlation is not zero.
- If this study were done 100 times (each with a random sample of N = 15, drawn from a population in which the correlation is zero), we would get a correlation of magnitude .12 or stronger (i.e., |r| ≥ .12) more than 5 times.
- Given that the results are not unlikely to have occurred if the null were true, the null might very well be true.

Experts in probability might take issue with some of the above interpretations, depending on their perspective on probability and logic. Nevertheless, many of the interpretations above, or close variations, are often used.

While considering the appropriate interpretations of significance tests, we should also consider at least two potential confusions. One point of confusion might concern a rejection of the null hypothesis. Dr. Cartman's sample correlation was r = .40, which was statistically significant. By rejecting the null hypothesis that ρ = 0, Dr. Cartman can conclude that the sample is probably not drawn from a population with a correlation of 0. But he should not conclude that the sample was drawn from a population with a correlation of ρ = .40. The sample might come from a population with a correlation of ρ = .40, but it also might come from a population with a correlation of ρ = .35 or ρ = .50, and so on. So, rejecting the null hypothesis means that the correlation in the population is probably not zero, but it does not indicate what the correlation in the population is likely to be.

A second potential point of confusion concerns the failure to reject the null hypothesis. In the second example, Dr. Marsh's sample correlation was r = .12, which was not statistically significant. Recall that the failure to reject the null hypothesis tells Dr. Marsh that the sample's results might very well have occurred if ρ = 0. So, Dr. Marsh can assume that the population correlation might be zero. In this case, Dr. Marsh should not conclude that the correlation in the population is zero. The sample's

results (r = .12) might also have occurred if ρ = .02, ρ = -.07, or ρ = .20. So, just because the population correlation might be zero, that does not mean that it is zero or that all other possibilities are less likely.

Confidence Intervals

As outlined above, a null hypothesis test is a very specific test. The results of the typical test allow us to make one inference about the population: the population correlation is either unlikely to be zero or it might well be zero. That is, are two variables likely to be associated with each other in the population or not? Although it is useful to evaluate the likelihood that the population correlation is zero, we can ask many other questions about the correlation in the population from which a sample was drawn. For example, what is our best guess about the actual correlation in the population? If the correlation in Dr. Cartman's sample is r = .40, then what is Dr. Cartman's best guess about the size of the correlation among the entire freshman student body? All that Dr. Cartman knows is that the sample correlation is .40; therefore, his most reasonable guess about the student body correlation is that it is ρ = .40. This guess about the specific value of the population correlation is called a point estimate, because he is estimating a single, specific point at which the population correlation lies.

Although the point estimate of the population correlation is an educated guess, Dr. Cartman is not sure that the population correlation is .40. He recognizes that his particular random sample of students might differ from the entire freshman student body in some ways, and these differences might mean that the correlation he finds in his sample is different from the correlation in the entire student body. Dr. Cartman might say that, although he is not sure that the population correlation is .40, he is fairly confident that the population correlation lies somewhere between .28 and .51. A confidence interval (CI) for a correlation is the range in which the population correlation is likely to lie, and it is estimated with a particular degree of confidence. For example, Dr. Cartman's range (.28 ≤ ρ ≤ .51) is a 95% CI. That is, he is 95% confident that the population correlation (ρ) is between .28 and .51.
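Although the paper treats CI computation as beyond its scope (see below), one common textbook approach uses the Fisher z transformation. The sketch below (Python with SciPy; an illustration, not the paper's own procedure) reproduces the two intervals quoted in this section to within rounding:

```python
import math
from scipy import stats

def correlation_ci(r: float, n: int, confidence: float = 0.95):
    """Approximate CI for a correlation via the Fisher z transformation."""
    z = math.atanh(r)                    # Fisher z of the sample r
    se = 1 / math.sqrt(n - 3)            # standard error of z
    z_crit = stats.norm.ppf((1 + confidence) / 2)
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # transform back to the r metric

print([round(v, 2) for v in correlation_ci(0.40, 200)])  # [0.28, 0.51]
print([round(v, 2) for v in correlation_ci(0.12, 15)])   # [-0.42, 0.6]
```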

Although a discussion of the calculation of a CI is beyond the scope of this paper, three important issues must be considered in interpreting a CI. First, the width of the CI reflects the precision of the estimate. Dr. Cartman's 95% CI ranges from .28 to .51, which is a span of 23 points on the correlational metric. But consider the second example, in which Dr. Marsh found a correlation of r = .12 in a sample of 15 participants. Dr. Marsh's 95% CI ranges from -.42 to .60 (i.e., -.42 ≤ ρ ≤ .60), which is a span of 102 points on the correlational metric. Note the difference between the two examples, illustrated in Figure 1. Dr. Cartman's estimate of the population correlation is a narrower range than is Dr. Marsh's estimate. A narrower range reflects a more precise and informative estimate. For a more familiar example, consider two weather predictions. One meteorologist predicts that the high temperature tomorrow will be somewhere between 60 and 70 degrees, a range of 10 degrees. Another meteorologist predicts that the high temperature tomorrow will be between 30 and 100 degrees, a much wider range of 70 degrees. Obviously, the first meteorologist's narrower range is a much more precise and useful prediction. Narrow CIs are more precise and informative than wide CIs.

A second important point regarding CIs is the effect of sample size on a CI. In computing a CI based on a sample's data, the size of the sample is directly related to the precision (i.e., width) of the CI. Large samples allow researchers to make relatively precise CI estimates. Consider again the fact that Dr. Cartman's CI is much more precise than Dr. Marsh's CI. The difference in precision is primarily due to the difference in the sample sizes of the two studies. Dr. Cartman's CI is based on 200 participants, but Dr. Marsh's CI is based on only 15 participants. The link between sample size and the width of a CI is conceptually related to the link between sample size and our confidence in rejecting a null hypothesis, discussed earlier. A relatively large sample includes a relatively large proportion of the population. Therefore, an estimate about the population is more precise when based on a large sample than when based on a smaller sample.

A third point regarding CIs is the link between a CI and the typical null hypothesis test. Recall that the typical hypothesis test of a correlation is the test of the null hypothesis that the correlation in the population is zero (H0: ρ = 0). We reject the null hypothesis when we are confident that there is less than

a 5% chance that the correlation in the population is indeed zero. A CI might be seen as the flip side of the significance test. Dr. Cartman's CI tells us to be 95% confident that the population correlation is within the range of .28 to .51. Put another way, Dr. Cartman's CI tells us that there is only a 5% chance that the population correlation is outside of the range of .28 to .51. Note that Dr. Cartman's CI does not include zero. The interval includes only positive values (i.e., it is entirely above zero), as illustrated in Figure 1. Therefore, the CI tells us to be 95% confident that the correlation in the population is not zero. In other words, it tells us that there is less than a 5% chance that the population correlation is zero. This parallels the outcome of Dr. Cartman's null hypothesis test, in which he rejected the hypothesis that the correlation in the population is zero.

In contrast, consider Dr. Marsh's CI, also illustrated in Figure 1. Dr. Marsh's CI does include zero; the CI ranges from a negative value at one end to a positive value at the other end. The fact that zero is within Dr. Marsh's 95% CI indicates that the correlation in the population from which his sample was drawn might very well be zero. Although Dr. Marsh is 95% confident that the population correlation is not -.50, -.85, .62, and so on (because these values are outside of his CI), he cannot be confident that the population correlation is not zero. In sum, the traditional null hypothesis significance test is directly related to CIs, and this relationship hinges on whether a CI includes zero.

Advanced Issues

The concepts, procedures, and examples in this paper reflect the most typical kind of significance test of a correlation, in which a researcher tests the null hypothesis that the population correlation is zero, at an alpha level of .05. Although this is the most typical kind of significance test, options exist for conducting other kinds of statistical tests. Details of such options and advanced issues are beyond the scope of this paper, but an overview might be useful.

Using Statistical Software

Statistical packages such as SPSS usually provide exact probability values associated with each significance test. Figure 2 presents SPSS output for Example 2 (Dr. Marsh's results), with labels

provided to aid interpretation. Note that SPSS labels the p value as "Sig. (2-tailed)." As shown in Figure 2, the exact p value is .67. The p values reported by statistical software are used to make decisions about the null hypothesis. If the p value is larger than .05 (as in Figure 2), then we would fail to reject the null hypothesis. If the p value is smaller than .05, then we would reject the null hypothesis. Therefore, if you use statistical software for correlational analysis, then you will not need to refer to a table of t values. Instead, you simply examine the exact p value and gauge whether it is greater than or less than .05.

Additional Significance Tests for a Correlation

By far, the most typical significance test of a correlation is a test of the null hypothesis that the population correlation is zero (H0: ρ = 0). This is the test most commonly reported in the psychological literature, and it is the default test, as reflected in the p values reported by statistical software packages such as SPSS or SAS. Despite this, we could test other null hypotheses involving correlations. We could test a null hypothesis that the population correlation is a specific value other than zero. For example, previous research might indicate that the correlation between Conscientiousness and Work Performance is .30 in the population, but we might hypothesize that some professions have an even stronger correlation between Conscientiousness and Work Performance. We could recruit a sample of accountants, measure their Conscientiousness, measure their Work Performance, and test the null hypothesis that the correlation in the population of accountants (i.e., the population from which our sample is drawn) is .30 (i.e., H0: ρ = .30). In this case, we believe that the correlation among accountants is not .30, which is reflected in the alternative hypothesis (H1: ρ ≠ .30). The significance test for this example would be conducted somewhat differently than the much more typical test outlined earlier, and many statistics textbooks describe the procedures.
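One common textbook procedure for such a test is again based on the Fisher z transformation. A hedged Python sketch (the sample values here are hypothetical, chosen only for illustration):

```python
import math
from scipy import stats

def test_nonzero_null(r: float, n: int, rho0: float) -> float:
    """Test H0: rho = rho0 (rho0 may be nonzero) via the Fisher z transformation.

    Returns the two-tailed p value.
    """
    z_r = math.atanh(r)          # Fisher z of the sample correlation
    z_rho0 = math.atanh(rho0)    # Fisher z of the hypothesized value
    se = 1 / math.sqrt(n - 3)    # standard error of z
    z_obs = (z_r - z_rho0) / se  # approximately standard normal under H0
    return 2 * stats.norm.sf(abs(z_obs))

# Hypothetical sample of 100 accountants with an observed r of .45:
p = test_nonzero_null(r=0.45, n=100, rho0=0.30)
print(round(p, 3))  # ~0.085: not below .05, so we would fail to reject H0: rho = .30
```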

Other p Values Besides .05

As described above, researchers have traditionally allowed themselves to reject null hypotheses when their analyses suggest that there is less than a 5% chance of making a Type I error (i.e., incorrectly rejecting a null hypothesis). Although the p value (alpha level) of .05 is the conventional point at which researchers reject a null hypothesis, researchers could consider using different p values. Researchers sometimes use an even stricter criterion, such as an alpha level of .01. Researchers who decide to use a p value of .01 would reject the null hypothesis only when their analyses suggest that there is less than a 1% chance of making a Type I error.

Using a different p value changes only Step 3 in the process of statistical significance testing, as illustrated in Table 2. In Step 3, the researcher would select a critical t value associated with a .01 alpha level. To identify the appropriate critical value, the researcher would refer to a table such as Table 1 and examine the column labeled .01 instead of the column labeled .05. The researcher would then proceed to Step 4 and Step 5, comparing the observed t value to the critical t value associated with the .01 alpha level. As shown in Table 1, the critical t value for a study conducted with an alpha of .01 is larger than the critical t value for a study conducted with an alpha of .05. In terms of the significance test, this difference means that researchers must be even more confident that the null hypothesis is incorrect. That is, a larger observed t value is required in order to reject the null hypothesis when using an alpha of .01.

Two-tailed vs One-tailed Tests

The examples described in this paper are based on two-tailed significance tests. These tests are designed to evaluate the null hypothesis that the population correlation is zero (H0: ρ = 0) against the alternative hypothesis that the population correlation is not zero (H1: ρ ≠ 0). For such two-tailed tests, the null hypothesis can be rejected if the sample correlation is positive or negative, that is, if the correlation is on either of the two sides of zero. These hypotheses are non-directional; they do not reflect any expectation that the population correlation is positive, for example. But researchers might have strong reasons to suspect that the correlation is in a specific direction. For example, Dr. Cartman might suspect that the correlation between SAT and GPA is positive. In such

cases, researchers could consider using a one-tailed significance test. For one-tailed tests, the hypotheses are framed differently. If Dr. Cartman hypothesized that the population correlation is positive, then he might conduct a one-tailed test in which he tests the null hypothesis that the population correlation is less than or equal to zero (H0: ρ ≤ 0) against the alternative hypothesis that the population correlation is greater than zero (H1: ρ > 0). These are known as directional hypotheses.

Conducting a one-tailed test changes Step 3 in the process of statistical significance testing, as illustrated in Table 2. In Step 3, the researcher would select a critical t value associated with a one-tailed test (at the alpha level that he or she has chosen, usually .05). To identify the appropriate critical value, the researcher would refer to a table of critical t values. For the sake of simplifying the earlier discussion of critical values, Table 1 does not include information for one-tailed tests; however, many textbooks include tables with columns that guide researchers to the appropriate critical t values for one-tailed tests. The researcher would then proceed to Step 4 and Step 5, comparing the observed t value to the critical t value associated with the one-tailed test.

Although researchers might use one-tailed tests, two-tailed tests are probably more common. One-tailed tests are often perceived as more liberal than two-tailed tests (e.g., Gravetter & Wallnau, 2004), allowing researchers to reject the null hypothesis more easily (although, in fact, the two approaches have equal probabilities of producing a Type I error). This perception arises from the fact that the critical t values used in one-tailed tests are smaller than the critical t values used in two-tailed tests. Consequently, researchers must meet a lower degree of confidence before rejecting the null hypothesis in a one-tailed test. Researchers tend to shy away from procedures that make it easier to reject a null hypothesis, preferring to take a more conservative approach. Put another way, researchers are reluctant to adopt procedures that might increase the probability of making a Type I error, and the use of one-tailed tests is often perceived as potentially increasing such errors. Therefore, despite the logic of one-tailed tests and directional hypotheses, two-tailed tests and non-directional hypotheses are used more frequently.
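Because Table 1 omits one-tailed values, it may help to see how the two cutoffs differ. A minimal sketch (Python with SciPy; not from the paper) computes both for Dr. Cartman's df = 198:

```python
from scipy import stats

alpha, df = 0.05, 198

# Two-tailed: alpha is split across both tails of the t distribution.
two_tailed = stats.t.ppf(1 - alpha / 2, df)  # ~1.972
# One-tailed: all of alpha sits in one tail, giving a smaller cutoff.
one_tailed = stats.t.ppf(1 - alpha, df)      # ~1.653

print(round(two_tailed, 3), round(one_tailed, 3))
```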

An Alternative Conceptualization of an Inferential Statistic

The conceptual approach to significance testing adopted in the current paper emphasizes the importance of effect size and sample size in determining statistical significance (see Equation 1, above). Textbooks usually present an alternative approach to significance testing. The alternative approach is very similar to the one outlined in the current paper, in that it proceeds through the same steps listed in Table 2 and produces the same result. However, the alternative approach uses a slightly different conceptual framework. Again, a full description of the alternative approach is beyond the scope of this paper, but a general familiarity could be useful.

The difference between the two approaches lies in the conceptualization of Step 2 (computing the observed t value). The alternative approach includes two components. First, we have greater confidence that the null hypothesis is incorrect when our sample's statistic (i.e., our observed correlation) is far away from what is predicted by the null hypothesis. Second, we have greater confidence in making inferences about the population from which the sample was drawn when our sample's statistic is a precise estimate of the population parameter. As outlined in many textbooks, the alternative approach conceptualizes an inferential statistic in the following way:

Observed t value = (observed value of the correlation in the sample - expected value of the correlation under the null hypothesis) / (standard error of the correlation)    (Equation 2)

or

t_OBSERVED = (r - ρ) / s_r

In this approach, the observed t value again reflects our confidence that the null hypothesis is incorrect; larger observed t values make us more likely to reject the null hypothesis. We will assume that we are conducting a test of the typical null hypothesis (H0: ρ = 0), that the correlation is zero in the population from which the sample was drawn. As in the approach described earlier, we are more likely to reject the null hypothesis when the observed correlation is far away from zero than when the observed correlation is close to zero. This is reflected in the numerator of the equation above: the difference between the observed correlation and the correlation that is proposed by the null hypothesis.
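To see that the two conceptualizations agree, note that under H0: ρ = 0 the standard error of r can be written as s_r = √((1 - r²)/(N - 2)), a standard textbook result supplied here because this excerpt does not define s_r explicitly. A short Python sketch confirms that Equation 2 then reproduces Equation 1:

```python
import math

def t_equation1(r: float, n: int) -> float:
    """Equation 1: t = (r / sqrt(1 - r^2)) * sqrt(N - 2)."""
    return (r / math.sqrt(1 - r**2)) * math.sqrt(n - 2)

def t_equation2(r: float, n: int, rho0: float = 0.0) -> float:
    """Equation 2: t = (r - rho0) / s_r, with s_r as defined above."""
    s_r = math.sqrt((1 - r**2) / (n - 2))  # standard error of r under H0
    return (r - rho0) / s_r

# Identical (up to floating point) for Dr. Cartman's data:
print(t_equation1(0.40, 200))  # ~6.141
print(t_equation2(0.40, 200))  # ~6.141
```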


More information

Correlational Research

Correlational Research Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.

More information

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but Test Bias As we have seen, psychological tests can be well-conceived and well-constructed, but none are perfect. The reliability of test scores can be compromised by random measurement error (unsystematic

More information

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST UNDERSTANDING THE DEPENDENT-SAMPLES t TEST A dependent-samples t test (a.k.a. matched or paired-samples, matched-pairs, samples, or subjects, simple repeated-measures or within-groups, or correlated groups)

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

Eight things you need to know about interpreting correlations:

Eight things you need to know about interpreting correlations: Research Skills One, Correlation interpretation, Graham Hole v.1.0. Page 1 Eight things you need to know about interpreting correlations: A correlation coefficient is a single number that represents the

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

This chapter will demonstrate how to perform multiple linear regression with IBM SPSS CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses Introduction to Hypothesis Testing 1 Hypothesis Testing A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population Hypothesis is stated in terms of the

More information

WISE Power Tutorial All Exercises

WISE Power Tutorial All Exercises ame Date Class WISE Power Tutorial All Exercises Power: The B.E.A.. Mnemonic Four interrelated features of power can be summarized using BEA B Beta Error (Power = 1 Beta Error): Beta error (or Type II

More information

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

More information

Analysis of Variance ANOVA

Analysis of Variance ANOVA Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript JENNIFER ANN MORROW: Welcome to "Introduction to Hypothesis Testing." My name is Dr. Jennifer Ann Morrow. In today's demonstration,

More information

Crosstabulation & Chi Square

Crosstabulation & Chi Square Crosstabulation & Chi Square Robert S Michael Chi-square as an Index of Association After examining the distribution of each of the variables, the researcher s next task is to look for relationships among

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Two Related Samples t Test

Two Related Samples t Test Two Related Samples t Test In this example 1 students saw five pictures of attractive people and five pictures of unattractive people. For each picture, the students rated the friendliness of the person

More information

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

More information

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals 1. Be able to explain the difference between the p-value and a posterior

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other 1 Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC 2 Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric

More information

Writing Thesis Defense Papers

Writing Thesis Defense Papers Writing Thesis Defense Papers The point of these papers is for you to explain and defend a thesis of your own critically analyzing the reasoning offered in support of a claim made by one of the philosophers

More information

Week 4: Standard Error and Confidence Intervals

Week 4: Standard Error and Confidence Intervals Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

The Kruskal-Wallis test:

The Kruskal-Wallis test: Graham Hole Research Skills Kruskal-Wallis handout, version 1.0, page 1 The Kruskal-Wallis test: This test is appropriate for use under the following circumstances: (a) you have three or more conditions

More information

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

More information

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE 1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

More information

Confidence Intervals for Cp

Confidence Intervals for Cp Chapter 296 Confidence Intervals for Cp Introduction This routine calculates the sample size needed to obtain a specified width of a Cp confidence interval at a stated confidence level. Cp is a process

More information

Fixed-Effect Versus Random-Effects Models

Fixed-Effect Versus Random-Effects Models CHAPTER 13 Fixed-Effect Versus Random-Effects Models Introduction Definition of a summary effect Estimating the summary effect Extreme effect size in a large study or a small study Confidence interval

More information

Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)

Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) 1 Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) I. Authors should report effect sizes in the manuscript and tables when reporting

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

More information

X = T + E. Reliability. Reliability. Classical Test Theory 7/18/2012. Refers to the consistency or stability of scores

X = T + E. Reliability. Reliability. Classical Test Theory 7/18/2012. Refers to the consistency or stability of scores Reliability It is the user who must take responsibility for determining whether or not scores are sufficiently trustworthy to justify anticipated uses and interpretations. (AERA et al., 1999) Reliability

More information

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Statistics: Correlation Richard Buxton. 2008. 1 Introduction We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Do

More information

CURVE FITTING LEAST SQUARES APPROXIMATION

CURVE FITTING LEAST SQUARES APPROXIMATION CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship

More information

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables. SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

Step 6: Writing Your Hypotheses Written and Compiled by Amanda J. Rockinson-Szapkiw

Step 6: Writing Your Hypotheses Written and Compiled by Amanda J. Rockinson-Szapkiw Step 6: Writing Your Hypotheses Written and Compiled by Amanda J. Rockinson-Szapkiw Introduction To determine if a theory has the ability to explain, predict, or describe, you conduct experimentation and

More information

Partial Estimates of Reliability: Parallel Form Reliability in the Key Stage 2 Science Tests

Partial Estimates of Reliability: Parallel Form Reliability in the Key Stage 2 Science Tests Partial Estimates of Reliability: Parallel Form Reliability in the Key Stage 2 Science Tests Final Report Sarah Maughan Ben Styles Yin Lin Catherine Kirkup September 29 Partial Estimates of Reliability:

More information

Test-Retest Reliability and The Birkman Method Frank R. Larkey & Jennifer L. Knight, 2002

Test-Retest Reliability and The Birkman Method Frank R. Larkey & Jennifer L. Knight, 2002 Test-Retest Reliability and The Birkman Method Frank R. Larkey & Jennifer L. Knight, 2002 Consultants, HR professionals, and decision makers often are asked an important question by the client concerning

More information

Types of Error in Surveys

Types of Error in Surveys 2 Types of Error in Surveys Surveys are designed to produce statistics about a target population. The process by which this is done rests on inferring the characteristics of the target population from

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Statistics 2014 Scoring Guidelines

Statistics 2014 Scoring Guidelines AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

More information

Variables Control Charts

Variables Control Charts MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. Variables

More information

A full analysis example Multiple correlations Partial correlations

A full analysis example Multiple correlations Partial correlations A full analysis example Multiple correlations Partial correlations New Dataset: Confidence This is a dataset taken of the confidence scales of 41 employees some years ago using 4 facets of confidence (Physical,

More information

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics Period Date LAB : THE CHI-SQUARE TEST Probability, Random Chance, and Genetics Why do we study random chance and probability at the beginning of a unit on genetics? Genetics is the study of inheritance,

More information

There is a simple equation for calculating dilutions. It is also easy to present the logic of the equation.

There is a simple equation for calculating dilutions. It is also easy to present the logic of the equation. Solutions: Dilutions. A. Dilutions: Introduction... 1 B. The dilution equation... 2 C. The logic of the dilution equation... 3 D. Should you memorize the dilution equation? -- Attention X11 students...

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information

Writing the Empirical Social Science Research Paper: A Guide for the Perplexed. Josh Pasek. University of Michigan.

Writing the Empirical Social Science Research Paper: A Guide for the Perplexed. Josh Pasek. University of Michigan. Writing the Empirical Social Science Research Paper: A Guide for the Perplexed Josh Pasek University of Michigan January 24, 2012 Correspondence about this manuscript should be addressed to Josh Pasek,

More information

Measurement in ediscovery

Measurement in ediscovery Measurement in ediscovery A Technical White Paper Herbert Roitblat, Ph.D. CTO, Chief Scientist Measurement in ediscovery From an information-science perspective, ediscovery is about separating the responsive

More information

Linear Models in STATA and ANOVA

Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

More information

Multivariate Analysis of Variance. The general purpose of multivariate analysis of variance (MANOVA) is to determine

Multivariate Analysis of Variance. The general purpose of multivariate analysis of variance (MANOVA) is to determine 2 - Manova 4.3.05 25 Multivariate Analysis of Variance What Multivariate Analysis of Variance is The general purpose of multivariate analysis of variance (MANOVA) is to determine whether multiple levels

More information

Lesson 9 Hypothesis Testing

Lesson 9 Hypothesis Testing Lesson 9 Hypothesis Testing Outline Logic for Hypothesis Testing Critical Value Alpha (α) -level.05 -level.01 One-Tail versus Two-Tail Tests -critical values for both alpha levels Logic for Hypothesis

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information

STATISTICS FOR PSYCHOLOGISTS

STATISTICS FOR PSYCHOLOGISTS STATISTICS FOR PSYCHOLOGISTS SECTION: STATISTICAL METHODS CHAPTER: REPORTING STATISTICS Abstract: This chapter describes basic rules for presenting statistical results in APA style. All rules come from

More information

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

More information

CHAPTER 14 NONPARAMETRIC TESTS

CHAPTER 14 NONPARAMETRIC TESTS CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences

More information

American Journal Of Business Education July/August 2012 Volume 5, Number 4

American Journal Of Business Education July/August 2012 Volume 5, Number 4 The Impact Of The Principles Of Accounting Experience On Student Preparation For Intermediate Accounting Linda G. Carrington, Ph.D., Sam Houston State University, USA ABSTRACT Both students and instructors

More information

Introduction to Hypothesis Testing OPRE 6301

Introduction to Hypothesis Testing OPRE 6301 Introduction to Hypothesis Testing OPRE 6301 Motivation... The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

An introduction to Value-at-Risk Learning Curve September 2003

An introduction to Value-at-Risk Learning Curve September 2003 An introduction to Value-at-Risk Learning Curve September 2003 Value-at-Risk The introduction of Value-at-Risk (VaR) as an accepted methodology for quantifying market risk is part of the evolution of risk

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

More information