CHAPTER 8 TESTING HYPOTHESES


Most of the information presented in the first seven chapters of this text has focused on the theoretical and mathematical foundations for a range of statistics, beginning with the mean, median, and mode and building up to the standard error of the mean and the development of confidence intervals. Each of these statistics is an extremely important tool in the process of conducting research, but as the introductory chapter explained, they are not the primary objective of statistical research. Instead, these tools are merely a means to an end. The end sought by statistical researchers is to answer specific research questions that relate to an area of observation or study.

The formulation of a research question begins with observation of the world around us. Political and social scientists observe the world of politics, noticing interesting events, actions, and characteristics. Research questions are then formed to provide a framework for investigating those events, actions, and characteristics in an effort to explain them and to predict what outcomes will be produced under similar circumstances in the future. A good research question should focus on a theoretical issue, not a statistical one, and identify the phenomenon that the researcher would like to understand more completely.

For example, a researcher interested in voter turnout may notice that she routinely sees many more older voters than young voters when she goes to cast her ballot. She wants to explore the issue scientifically to determine whether her observation reflects a larger trend in voting patterns. The following research question can then be formulated:

Is voter turnout influenced by the age of the voter?

The next step in the process involves the formulation of hypotheses. Hypotheses are proposed explanations of the phenomenon noted in the research question. Essentially,
a hypothesis is a story created by the researcher to explain what she has observed. Hypotheses suggest the existence of a cause-and-effect relationship between two or more factors. Unlike research questions, hypotheses must involve statistical relationships between events, actors, and/or groups that can be tested using the available statistical tools.

Statistical research involves two distinct types of hypotheses. The first type is called the research hypothesis. This is the proposed explanation or story that has been developed by the researcher. In our example, observation suggests that older voters may be more likely to vote than younger individuals. Therefore, one potential hypothesis could be:

Those over 65 are significantly more likely to vote than those under 25.

This hypothesis suggests that an analysis of voting patterns should reveal levels of turnout among older voters that are significantly higher than those of young voters.

Most statistical research involves the use of samples rather than entire populations. Even though sampling methods have become quite advanced and scientifically valid, they do reduce the overall precision that can be obtained in the data. The constant presence of sampling error threatens the reliability of observed differences in the data. For this reason, statistical researchers have chosen to establish a relatively high standard of evidence when testing hypotheses. Rather than beginning with the presumption that the research hypothesis is correct, statistical researchers are expected to begin testing with the assumption that the research hypothesis is incorrect, confirming it only when the supporting evidence is extremely strong. This approach is somewhat similar to the presumption of innocence employed within the criminal justice system.
Under that system, those accused are to be judged innocent unless prosecutors can prove beyond a reasonable doubt that they are guilty of a crime. Likewise, the
scientific researcher is expected to judge the research hypothesis false unless she is able to produce very strong evidence to the contrary.

Application of this principle requires the formulation of the second type of hypothesis: the null hypothesis. It is a negative statement of the research hypothesis, suggesting that there is no difference between groups or that there is no relationship between factors. The null hypothesis is the one that is subjected to statistical evaluation and either accepted or rejected as a result. When writing a null hypothesis, it is essential that the researcher clearly identify what will be tested and the nature of the test to be performed. To return to our previous example, the null hypothesis must make clear that our researcher is interested in voter turnout and that no statistically significant differences are expected between older and younger voters. A good example of a null hypothesis for this scenario would be:

There is no statistically significant difference in the mean level of turnout between voters over 65 and voters under 25.

A variety of statistical tools can be used to test hypotheses. When conducting these tests, researchers work with observed characteristics of actors, events, or other phenomena called variables. Variables are characteristics which can have two or more values.

Variables take a variety of forms in the field of statistics. Just as data can be nominal, ordinal, or interval in nature, variables can be defined on the basis of the type of data used to create them. Variables constructed using nominal data are referred to as categorical or nominal variables. One special type of categorical variable is the dummy variable. Dummy variables are variables which have exactly two possible values. A good example of a dummy variable is gender. Ordinal data produce ordinal variables. Interval data produce interval or scale variables.
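The idea of a dummy variable can be made concrete with a short sketch. The survey records, the field names, and the helper `to_dummy` below are all hypothetical and invented purely for illustration. The sketch shows one common way of coding a two-valued characteristic as 0 or 1, which has the convenient property that the mean of the coded variable equals the proportion of cases coded 1.

```python
# Hypothetical survey records; every name and value here is invented for illustration.
respondents = [
    {"age": 71, "voted": "yes"},
    {"age": 68, "voted": "yes"},
    {"age": 22, "voted": "no"},
    {"age": 24, "voted": "yes"},
    {"age": 19, "voted": "no"},
]


def to_dummy(value, label_coded_one):
    """Return 1 when value matches the chosen label, otherwise 0."""
    return 1 if value == label_coded_one else 0


# "voted" has exactly two possible values, so it can be coded as a dummy variable.
voted = [to_dummy(r["voted"], "yes") for r in respondents]

# Age is interval data; collapsing it to "over 65 or not" yields a second dummy variable.
over_65 = [1 if r["age"] > 65 else 0 for r in respondents]

# The mean of a 0/1 variable is the proportion of cases coded 1.
turnout_rate = sum(voted) / len(voted)

print(voted)         # 0/1 codes for each respondent
print(over_65)       # 0/1 codes for the age grouping
print(turnout_rate)  # share of respondents who voted
```

Which value is coded 1 is an arbitrary choice; what matters is that the coding is stated clearly so that results can be interpreted.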
Another means of differentiating between variables relates to the theoretical basis of the research being conducted. Hypotheses represent an attempt to explain differences in one
variable. The variable that researchers are attempting to explain is called the dependent variable. It is the variable that the researcher believes is influenced by one or more other variables, and it is the primary subject of the research process. Independent variables are the variables which are expected to produce change in the dependent variable. They are sometimes referred to as explanatory or test variables because the researcher believes that manipulating them will produce change in the dependent variable.

A third kind of variable which very frequently affects the results of statistical research is the intervening variable. The intervening variable is a variable which may produce a change in the dependent variable but is not known or controlled during a research project. Too many uncontrolled intervening variables may affect the research findings to the extent that the research project becomes valueless, because most of the variance in the dependent variable cannot be attributed to the manipulation of an independent variable.

The causation between independent and dependent variables can be illustrated by observing that an increase or decrease in the independent variable may result in an increase or decrease in the dependent variable. An increase in age (independent) may result in increased voter turnout (dependent). This is an example of direct causation between independent and dependent variables. The manipulation of the independent variable is the hypothesized cause of a change in the dependent variable. Changes in the dependent variable are the expected outcomes of changes in the independent variable.

In determining cause and effect, a researcher must exercise caution, because some causation between variables may seem apparent when it is in fact spurious.¹ For example, an increase in inflation may seem to be the cause of a rise in unemployment because these two economic variables frequently move up or down together. However, it would be extremely difficult to demonstrate statistically that increases in inflation are actually major causes of increases in unemployment. This condition may appear to exist in the functioning of the overall economy, but the two variables do not affect each other. Similarly, an increase in hand size may seem to suggest increased ability to solve math problems, but this is not the case: the population includes many babies, who have small hands and cannot solve math problems. Age, a third factor, accounts for both. Larger hand size does not cause greater math ability.

¹ A spurious relationship means that a phenomenon is explained by a third, unknown factor. In short, it is a false relationship.

The role of intervening variables is demonstrated in the example that an increase in age may result in an increased incidence of heart attack. Age may affect heart disease, yet variables other than age may have an even greater effect on heart problems. Intervening variables may or may not contribute to a change in the dependent variable, because their real effect is not known.

Once the research question, hypotheses, and relevant variables are identified and properly defined, statistical tools can be used to test the null hypothesis. The null hypothesis is generally created by researchers who fully expect to reject it and conclude that real differences exist between groups of cases. However, it is important to remember that accepting the null hypothesis can still be an extremely important research finding. In many areas, the idea of differences between groups is widely accepted, and research findings suggesting that such differences do not exist can have just as much impact on public policy and the advancement of science as findings that identify areas of significant difference.

This chapter has introduced the concept of hypothesis testing, explaining the importance of independent and dependent variables and demonstrating the role of the null hypothesis in statistical research.
Once the statistical analysis is performed, the researcher returns to the null hypothesis to draw a statistical conclusion. That conclusion then provides the basis for a
research conclusion that addresses the research or experimental hypothesis. The specific operations of the statistical tests that can be used to test hypotheses are outlined in the remaining chapters of this text.

Every time a hypothesis is tested for a research situation, the following MUST ALWAYS be applied:

(1) State the RESEARCH QUESTION.
(2) State the NULL HYPOTHESIS.
(3) Conduct the STATISTICAL ANALYSIS.
(4) Draw STATISTICAL CONCLUSIONS.
(5) Draw RESEARCH CONCLUSIONS.
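The five steps above can be sketched for the voter turnout example. This is only an illustration under stated assumptions: the turnout counts are hypothetical, the 0.05 significance level is an assumed convention, and the two-proportion z-test is one possible choice of test that is swapped in here for demonstration. The chapter itself leaves the choice of test to later chapters.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # proportion under the null of no difference
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# (1) Research question: is voter turnout influenced by the age of the voter?
# (2) Null hypothesis: no difference in turnout between voters over 65
#     and voters under 25.

# Hypothetical sample counts, invented for illustration only.
older_voted, older_n = 140, 200
younger_voted, younger_n = 90, 200
alpha = 0.05  # assumed significance level

# (3) Statistical analysis.
z, p_value = two_proportion_z_test(older_voted, older_n, younger_voted, younger_n)

# (4) Statistical conclusion: reject the null only when the evidence is strong.
reject_null = p_value < alpha

# (5) Research conclusion, stated in terms of the research hypothesis.
if reject_null:
    conclusion = "The data support higher turnout among voters over 65."
else:
    conclusion = "The data do not show a turnout difference between the groups."

print(round(z, 2), reject_null)
print(conclusion)
```

Keeping the statistical conclusion (step 4) separate from the research conclusion (step 5) mirrors the chapter's distinction: the test speaks only to the null hypothesis, and the researcher then interprets that result in light of the research hypothesis.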
EXERCISES CHAPTER 8

1. For each of the following situations, produce a research question, a research hypothesis, and a null hypothesis:

(A) Kelly notices that some Arkansas counties have laws prohibiting the consumption of alcohol while others do not. He decides to conduct research to determine whether such regulations have the potential to improve public safety.
(B) Jarod notices that public debate on the issue of education often focuses on the amount of money spent in this area of policy. He decides to conduct research to determine whether states that spend a lot of money provide a better education than states that spend relatively little.
(C) Jeff observes the tendency for supporters of state lotteries to argue that they are necessary in order to provide better funding for education. He wishes to determine if the presence of a lottery has an impact on education spending.

2. For each of the situations addressed above, assume that the null hypothesis was accepted. Draw the appropriate statistical and research conclusions.

3. For each of the situations addressed in question 1, assume that the null hypothesis was rejected. Draw the appropriate statistical and research conclusions.

4. Define the following terms:

(A) research question
(B) independent variable
(C) dependent variable
(D) spurious
(E) independent sample
(F) dependent sample
(G) null hypothesis
(H) research hypothesis
