Statistics: Module 2


Geert Verbeke
I-BioStat: Interuniversity Institute for Biostatistics and statistical Bioinformatics
K.U.Leuven & Hasselt University, Belgium
geert.verbeke@med.kuleuven.be
PhD Biomedical Sciences

Contents
1 The comparison of two means: Unpaired data
2 The comparison of two proportions: Unpaired data
3 The comparison of two means: Paired data
4 The comparison of two proportions: Paired data
5 Errors in statistics: Basic concepts
6 Errors in statistics: Practical implications
7 One-sided versus two-sided tests
8 Describing associations

9 Non-parametric statistics
Bibliography

Chapter 1
The comparison of two means: Unpaired data
Example
Confidence interval for the difference of two means
The unpaired t-test
Assumptions
Example: Survival times of cancer patients
Example from the biomedical literature

1.1 Example
Consider an experiment in which weight gain in rats on a high-protein diet is compared with weight gain in rats on a low-protein diet.
[Figure: group-specific histograms of weight gain]

[Table: group-specific summary statistics]
On average, there is an observed difference of 19g between the rats on a high-protein diet and those on a low-protein diet.
Is this observed difference sufficient evidence to conclude that diet indeed has an effect on weight gain?
It would be of interest to know how likely such a difference of 19g is to occur if weight gain were completely unrelated to the protein level of the diet.

Note that, strictly speaking, we have two populations, with a sample randomly drawn from each:
High protein rats: the hypothetical population of all rats that are given a high-protein diet
Low protein rats: the hypothetical population of all rats that are given a low-protein diet
From the first population, a random sample of $n_1 = 12$ rats was taken. From the second one, a random sample of $n_2 = 7$ rats was drawn.
The corresponding observed means are $\bar{x}_1 = 120$ and $\bar{x}_2 = 101$, respectively.
Because there is no relation between the observations taken from the first population and those taken from the second, we have unpaired data.

1.2 Confidence interval for the difference of two means
Let $\mu_1$ and $\mu_2$ be the (unknown) mean weight gain in the high and low protein population, respectively.
[Figure: weight-gain distributions of the low protein population (mean $\mu_2$) and the high protein population (mean $\mu_1$)]
Of interest is to draw inferences about $\mu_1 - \mu_2$.

As always, our estimate of $\mu_1 - \mu_2$ is $\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2 = 19$.
Based on the observed data, C.I.'s can be constructed for $\mu_1 - \mu_2$.
For example, a 95% C.I. for $\mu_1 - \mu_2$ is given by $[-2.19; 40.19]$.
The true difference $\mu_1 - \mu_2$ may or may not be in the interval $[-2.19; 40.19]$. However, if 100 similar experiments were conducted, then 95 out of the 100 corresponding C.I.'s would be expected to contain $\mu_1 - \mu_2$.
Hence, we are 95% confident that $\mu_1 - \mu_2$ lies within the interval $[-2.19; 40.19]$.

This C.I. shows that:
the estimate (19g) of $\mu_1 - \mu_2$ is very imprecise: the C.I. is very wide
the estimate is precise up to 21.19 units (half the width of the interval), with 95% chance
based on our data, it cannot be ruled out that $\mu_1 - \mu_2$ would be zero, i.e., that there would be no difference between both populations.
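The "95 out of 100 experiments" reading of a confidence interval can be checked by simulation. The sketch below is only an illustration under loudly simplified assumptions: the true difference (19) and the common standard deviation (20) are hypothetical values, the standard deviation is treated as known, and the interval therefore uses the normal quantile 1.96 instead of the t-distribution used for the interval in the text.

```python
import random
import statistics

random.seed(1)

TRUE_DIFF = 19.0   # hypothetical true value of mu_1 - mu_2
SIGMA = 20.0       # hypothetical common standard deviation, treated as known
N1, N2 = 12, 7     # sample sizes from the rat example
Z = 1.96           # 97.5% quantile of the standard normal distribution

# Standard error of the difference of two sample means when sigma is known.
se = SIGMA * (1 / N1 + 1 / N2) ** 0.5

reps = 2000
covered = 0
for _ in range(reps):
    high = [random.gauss(TRUE_DIFF, SIGMA) for _ in range(N1)]
    low = [random.gauss(0.0, SIGMA) for _ in range(N2)]
    diff = statistics.mean(high) - statistics.mean(low)
    # Does the interval of this simulated experiment contain the true difference?
    if diff - Z * se <= TRUE_DIFF <= diff + Z * se:
        covered += 1

coverage = covered / reps
print(f"Proportion of intervals containing the true difference: {coverage:.3f}")
```

Any single interval either contains the true difference or it does not; only the long-run proportion is close to 95%.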

1.3 The unpaired t-test
Often, it is of interest to test whether two populations have the same mean. This is translated into a set of hypotheses of the form:
$H_0: \mu_1 = \mu_2$ versus $H_A: \mu_1 \neq \mu_2$
We will reject the null hypothesis if the observed data deviate too much from what one would expect to see if the null hypothesis were correct.
Hence, we will reject $H_0$ if $\bar{x}_1$ is much larger than $\bar{x}_2$, or vice versa.
This is equivalent to rejecting $H_0$ if $|\bar{x}_1 - \bar{x}_2|$ is too large.

Question: How large is too large?
Answer: If the observed difference $|\bar{x}_1 - \bar{x}_2|$ is very unlikely to happen by pure chance.
We therefore calculate the probability p of observing, in a similar experiment, a mean difference between the groups of at least 19g, if $\mu_1 = \mu_2$.

In our example, this probability equals p = 0.076.
So, even if there is no relation at all between the protein content of the diet and weight gain, one can still expect to observe a difference of at least 19g in 7.6% of future similar experiments.
Since p = 0.076 > 0.05 = α, we consider this insufficient evidence to conclude that the protein level indeed affects weight gain.

Conclusion: There is no significant difference (p = 0.076) in weight gain between rats on a high protein level diet and rats on a low protein level diet.
The above testing procedure is called the unpaired t-test since unpaired data are analysed, and since the calculation of the p-value is based on the t-distribution.
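The reported estimate, confidence interval and p-value can be tied together with a small calculation. The following is a sketch, not the software output used in the course. It assumes that the interval $[-2.19; 40.19]$ was built with the equal-variance t-procedure on $n_1 + n_2 - 2 = 17$ degrees of freedom, with critical value 2.110 (the 97.5% quantile of that t-distribution). The standard error is recovered from the interval, and the two-sided p-value is obtained by numerical integration of the t-density (Simpson's rule), so that only the standard library is needed.

```python
import math

def t_density(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), integrating the density over [0, |t_stat|] with Simpson's rule."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t

n1, n2 = 12, 7
diff = 120 - 101                       # observed difference in means (19g)
half_width = (40.19 - (-2.19)) / 2     # half the width of the reported 95% C.I.
t_crit = 2.110                         # assumption: 97.5% quantile of t with 17 df
df = n1 + n2 - 2

se = half_width / t_crit               # standard error implied by the interval
t_stat = diff / se
p = two_sided_p(t_stat, df)

print(f"SE = {se:.2f}, t = {t_stat:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the recovered p-value agrees with the 7.6% quoted above, which also shows why the 95% interval contains zero: the test and the interval are two views of the same calculation.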

1.4 Assumptions
The calculation of the C.I., as well as the computation of the p-value, is based on the sampling distribution of $\bar{X}_1 - \bar{X}_2$, which describes what values for $\bar{x}_1 - \bar{x}_2$ can be expected if the experiment were repeated many times.
The sampling distribution of $\bar{X}_1 - \bar{X}_2$ is completely determined by the sampling distributions of $\bar{X}_1$ and $\bar{X}_2$.
In case of large samples, those distributions are known to be normal (CLT).
In small samples, this normality of $\bar{X}_1$ and $\bar{X}_2$ is only valid in cases where the original data are (approximately) normally distributed.

Therefore, in case of small samples, one assumes the outcome to be normally distributed in each group separately.
[Figure: normal distributions for the low protein group (mean $\mu_2$) and the high protein group (mean $\mu_1$)]

Conclusion:
Large samples: no assumptions
Small samples: normality in both groups
Note that the samples in our example were small ($n_1 = 12$ and $n_2 = 7$). Hence the histograms should be explored for any evidence against symmetry.

[Figure: group-specific histograms]
Note that, given the small sample sizes, assessment of symmetry is difficult.
This illustrates another drawback of small samples: assumptions are often needed, which are very hard to check based on the observed data.

Subject-matter knowledge can often help in deciding whether the underlying assumptions are realistic.
The unpaired t-test also implicitly assumes that both populations have the same variance.
This can be checked with a test for equality of variances, in which the following hypotheses are tested:
$H_0: \sigma_1^2 = \sigma_2^2$ versus $H_A: \sigma_1^2 \neq \sigma_2^2$
Most software packages automatically report the results from such a test, and even provide a corrected unpaired t-test, which corrects for the unequal variances.

The variances are not significantly different from each other (p = ), such that our original result remains valid.
Note that, since the variances are so similar, the corrected and uncorrected t-tests yield very similar results (p-values).
Often, non-equality of the variances is associated with non-normality of the data.

1.5 Example: Survival times of cancer patients
Based on data on survival times of cancer patients, we want to compare the survival times of stomach cancer patients with the survival times of colon cancer patients.
[Table: summary statistics]
We observe a large difference of 171 days in average survival time between both groups.

On the other hand, there is a lot of variability between the subjects in both groups.
Hence, it is not clear whether the observed difference of 171 days is sufficient evidence to conclude that survival times are indeed different for colon cancer patients and stomach cancer patients.
[Output: results of the unpaired t-test]
We do not find a significant difference between both groups with respect to the survival time (p = ).

However, the histograms suggest skewness in the data, such that the underlying assumption of normality becomes questionable.
The skewness in the direction of the large values suggests that a logarithmic (or similar) transformation might be useful:
X = survival time
Y = ln(X) = ln(survival time)

[Figure: histogram shapes and possible transformations]

[Figure: histograms of X and Y = ln(X), for stomach and colon cancer patients]

As before, assessing symmetry is difficult due to the small number of observations in both groups. However, the evidence against symmetry is much weaker now.
[Output: results of the unpaired t-test based on transformed data]
The observed difference between both groups is still not significant (p = ), but the p-value is very different from what we obtained before the transformation (p = ).
This illustrates that:
assumptions need to be checked
violation of assumptions can lead to serious errors

Note that this is another example where geometric means and standard deviations would be useful to describe the location and spread in survival times in the two cancer groups separately.
Survival time (days), geometric mean (geometric stand.dev.):
Stomach cancer: 144.03 (3.49), where 144.03 = exp(4.97) and 3.49 = exp(1.25)
Colon cancer: 314.19 (2.72), where 314.19 = exp(5.75) and 2.72 = exp(1.00)
These are very different from the arithmetic means and standard deviations that were reported before.

The fact that the formal test has been performed on the log-transformed survival times does not change the interpretation of the result.
If the log-transformed survival times are different for the two groups, then so are the untransformed survival times.
Hence, although the conclusion, strictly speaking, should be that there is no significant difference in log survival times, it will often be formulated as: there is no significant difference in survival times.
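The geometric means and standard deviations above follow from back-transforming the log-scale summaries. A minimal sketch, using only the log-scale means (4.97 and 5.75) and standard deviations (1.25 and 1.00) quoted in the text; the dictionary layout and variable names are illustrative choices:

```python
import math

# Mean and standard deviation of Y = ln(survival time), as quoted in the text.
log_summaries = {
    "stomach": {"mean_log": 4.97, "sd_log": 1.25},
    "colon": {"mean_log": 5.75, "sd_log": 1.00},
}

# Back-transform: exp(mean of logs) is the geometric mean,
# exp(SD of logs) is the geometric standard deviation (a multiplicative factor).
geometric = {
    group: {
        "geo_mean": math.exp(s["mean_log"]),
        "geo_sd": math.exp(s["sd_log"]),
    }
    for group, s in log_summaries.items()
}

for group, g in geometric.items():
    print(f"{group}: geometric mean = {g['geo_mean']:.2f} days, "
          f"geometric SD = {g['geo_sd']:.2f}")
```

Note that the geometric standard deviation is unitless: it describes spread as a factor around the geometric mean, not as a number of days.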

1.6 Example from the biomedical literature
Nissen et al. [1], Table 1:
Large samples
Similar variability in both groups
p < rather than p =

Kellett, Kellett, and Nordholm [2], Table 2:
Relatively small samples
Normality assumption NOT satisfied
Variances NOT equal
No reporting of the p-values

Chapter 2
The comparison of two proportions: Unpaired data
Example
The chi-squared test
Assumptions: The Fisher Exact test
Rows versus columns
Example: Case-control data
Example from the biomedical literature

2.1 Example
Consider data on sickness absence, collected on 585 employees with a similar job:
Gender    Sickness absence
          No     Yes
female    245    184
male       98     58

Research question: Is there a relation between absence and gender?
184/429 = 42.9% of the females, and 58/156 = 37.2% of the males have been absent.
This suggests that females are more often absent than males.
However, even if absence due to sickness is equally frequent amongst males and females, the above results could have occurred by pure chance.
It would therefore be of interest to calculate how likely it would be to observe such differences by pure chance.

Note that we again have two populations, with a sample randomly drawn from each:
Males: the hypothetical population of all male employees with similar job conditions
Females: the hypothetical population of all female employees with similar job conditions
From the first population, a random sample of $n_1 = 156$ males was taken. From the second one, a random sample of $n_2 = 429$ females was drawn.
Let $\pi_1$ and $\pi_2$ denote the proportion of males and females, respectively, with sickness absence in the total populations.
Then $\pi_1$ and $\pi_2$ can be estimated by their sample versions $\hat{\pi}_1 = 58/156 = 0.372$ and $\hat{\pi}_2 = 184/429 = 0.429$.
Because there is no relation between the observations taken from the first population and those taken from the second, we have unpaired data.

2.2 The chi-squared test
Often, it is of interest to test whether two populations have the same percentage of people with absence due to sickness. This is translated into a set of hypotheses of the form:
$H_0: \pi_1 = \pi_2$ versus $H_A: \pi_1 \neq \pi_2$
We will reject the null hypothesis if the observed data deviate too much from what one would expect to see if the null hypothesis were correct.
Hence, we will reject $H_0$ if $\hat{\pi}_1$ is much larger than $\hat{\pi}_2$, or vice versa.
This is equivalent to rejecting $H_0$ if $|\hat{\pi}_1 - \hat{\pi}_2|$ is too large.

Question: How large is too large?
Answer: If the observed difference $|\hat{\pi}_1 - \hat{\pi}_2|$ is very unlikely to happen by pure chance.
We therefore calculate the probability p of observing, in a similar experiment, a difference between the groups at least equal to $|\hat{\pi}_1 - \hat{\pi}_2| = |0.372 - 0.429| = 0.057$, if $\pi_1 = \pi_2$.

In our example, this probability equals p = 0.215.
So, even if there is no relation at all between gender and absence, one can still expect to observe a difference of at least 5.7% in 21.5% of future similar experiments.
Since p = 0.215 > 0.05 = α, we consider this insufficient evidence to conclude that the occurrence of sickness absence is related to gender.

Conclusion: There is no significant difference (p = 0.215) in prevalence of sickness absence between males and females.
The testing procedure needed for the comparison of proportions in unpaired data is called the chi-squared test since the calculation of the p-value is based on the chi-squared ($\chi^2$) distribution.
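The p-value of 0.215 can be reproduced with a few lines of code. This is a sketch under stated assumptions: the cell counts (245, 184, 98, 58) are derived from the reported fractions 184/429 and 58/156, the statistic is the usual Pearson chi-squared statistic for a 2 × 2 table without continuity correction, and the p-value uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    Returns the statistic and the p-value on 1 degree of freedom
    (no continuity correction).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df: P(chi2 >= stat) = P(|Z| >= sqrt(stat)) = erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Rows: female, male. Columns: no absence, absence.
stat, p = chi_squared_2x2(245, 184, 98, 58)
print(f"chi-squared = {stat:.3f}, p = {p:.3f}")
```

The shortcut via `math.erfc` only works for 1 degree of freedom, which is exactly the 2 × 2 case; larger tables would need the general chi-squared distribution.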

2.3 Assumptions: The Fisher Exact test
The calculation of the p-value is based on the sampling distribution of $\hat{\Pi}_1 - \hat{\Pi}_2$, which describes what values for $\hat{\pi}_1 - \hat{\pi}_2$ can be expected if the experiment were repeated many times.
Note that $\hat{\Pi}_1$ and $\hat{\Pi}_2$ are the sample averages $\bar{X}_1$ and $\bar{X}_2$ of the binary variable sickness absence.
Hence, for large samples, the sampling distribution of $\hat{\Pi}_1 - \hat{\Pi}_2$ directly follows from the CLT.
In small samples, the normality of $\hat{\Pi}_1$ and $\hat{\Pi}_2$ can be problematic, and an alternative calculation of the p-value is needed.

The Fisher Exact test provides an alternative way to calculate the p-value, without relying on the CLT or on the assumption of large samples.
As an example, we consider again data on sickness absence, but from a second, much smaller, company.
[Table: sickness absence (No, Yes) by gender (female, male) in the second company]
[Output: results of the chi-squared test and of the Fisher Exact test]

We observe considerable differences between both p-values, due to the (extremely) small sample sizes in both groups.
In larger samples, chi-squared and Fisher Exact produce much more similar p-values.
[Table: sickness absence in males and females, with the $\chi^2$ and Fisher Exact p-values, for several companies]

The Fisher Exact test is very time-consuming, and cannot be calculated for large samples, except with special software.
However, note that, for large samples, the chi-squared test remains possible, and yields results very similar to the ones that would have been obtained with the Fisher Exact test.
In practice, it is often standard to use Fisher Exact, unless computational restrictions require the use of chi-squared.
Conclusion:
Large samples: Chi-squared test
Small samples: Fisher Exact test
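To make the idea of an exact calculation concrete, the sketch below enumerates all 2 × 2 tables with the same margins and sums the hypergeometric probabilities of those that are at most as likely as the observed one. This is one common definition of the two-sided Fisher Exact p-value, and software packages may use a different two-sided rule. The example table is hypothetical and is not the small-company table from the slides.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher Exact p-value for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution. The p-value sums the probabilities of all tables that
    are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that ties are not lost to rounding error.
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical small table: 3 of 4 absent in one group, 1 of 4 in the other.
p_small = fisher_exact_two_sided(3, 1, 1, 3)
print(f"Fisher Exact two-sided p = {p_small:.4f}")
```

The number of tables to enumerate grows with the sample size, which is the computational burden mentioned above.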

2.4 Rows versus columns
When comparing two unpaired proportions, the data can always be summarized by a 2 × 2 table:
Gender    Sickness absence
          No       Yes
female    A        B        A + B
male      C        D        C + D
          A + C    B + D    A + B + C + D
in which A, B, C, and D represent the number of observations in each cell.
The hypothesis of interest was to compare the prevalence of sickness absence between males and females.

One can show that this is equivalent to comparing the percentage of males (females) between the employees with and without sickness absence:
$$\frac{B}{A+B} = \frac{D}{C+D} \iff \frac{C}{A+C} = \frac{D}{B+D}$$
Proof:
$$\frac{B}{A+B} = \frac{D}{C+D} \iff B(C+D) = D(A+B) \iff BC = AD \iff C(B+D) = D(A+C) \iff \frac{C}{A+C} = \frac{D}{B+D}$$
This implies that, for the analysis of a 2 × 2 table, rows and columns can be interchanged.
This is of interest for the analysis of case-control data.
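The interchangeability of rows and columns can also be checked numerically. The sketch below computes the Pearson chi-squared statistic from observed and expected counts for a table and for its transpose; the counts are those derived from the sickness absence example (184/429 and 58/156), and the helper names are illustrative.

```python
def chi2_stat(table):
    """Pearson chi-squared statistic: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def transpose(table):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*table)]

# Rows: female, male. Columns: no absence, absence.
by_gender = [[245, 184], [98, 58]]
by_absence = transpose(by_gender)

print("Row percentages (absent | gender):",
      [round(row[1] / sum(row), 3) for row in by_gender])
print("Row percentages (male | absence status):",
      [round(row[1] / sum(row), 3) for row in by_absence])
print("chi-squared:", round(chi2_stat(by_gender), 4), round(chi2_stat(by_absence), 4))
```

The two sets of percentages differ, but the test statistic is identical, so the test result does not depend on which variable defines the rows.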

2.5 Case-control data
We consider the data on cervical cancer, where the relationship between the occurrence of cervical cancer and the age at first pregnancy is studied.
Data were collected on 49 cancer cases and 317 non-cancer cases (controls).
All women were asked about their age at first pregnancy.
[Table: disease status (cervical cancer, control) by age at first pregnancy]

Research question: Is there a relation between cancer and age?
Of interest is to compare the prevalence of cancer between women with first pregnancy before the age of 25, and those with first pregnancy later.
However, correct estimation of these percentages would have required a sample of women with first pregnancy before the age of 25, and a sample of women with first pregnancy later.
This was not the setup of the present experiment, where a number of cases and a number of controls are randomly selected, and where all women are then questioned about their age at first pregnancy.

Such a design only allows correct estimation of the percentage of women with first pregnancy before the age of 25, for cases and controls separately.
However, since rows and columns can be interchanged, this is sufficient to answer our research question of interest.

For testing purposes, rows and columns can be interchanged, implying that the analysis of case-control data still answers the research question of interest.
For descriptive purposes, however, the choice between row and column percentages entirely depends on the design of the study.
In the above example on cervical cancer, the row-percentages (i.e., percentage of women with first pregnancy before the age of 25), for cancer cases and controls separately, are the only ones that reflect the case-control nature of the experiment.

2.6 Example from the biomedical literature
Zuskin et al. [3], p.173 and Table 1:

It is not clear when chi-squared is used, and when Fisher Exact is used.

Chapter 3
The comparison of two means: Paired data
Example
Confidence interval for the difference of two means
The paired t-test
The paired versus unpaired t-test
Example
Assumptions
Example from the biomedical literature

3.1 Example
We consider the Captopril example, where blood pressure was taken in 15 hypertensive patients, before and after administration of the drug Captopril.

[Table: Captopril dataset, with systolic (SBP) and diastolic (DBP) blood pressure before and after treatment for each patient, and the averages (mm Hg) for diastolic before, diastolic after, systolic before, and systolic after]

Research question: How does treatment affect BP?
As in the unpaired t-test, we might consider this a two-sample case, where a sample is taken from each of two populations:
Population 1: Patients without treatment
Population 2: Patients after treatment with Captopril
Let $\mu_1$ be the population average BP if no treatment is given, and let $\mu_2$ denote the population average BP after treatment.

[Figure: BP distributions after treatment (mean $\mu_2$) and without treatment (mean $\mu_1$)]
Interest is in inference for the difference $\mu = \mu_1 - \mu_2$.
The main difference when compared to the unpaired t-test is that each observation from the first sample now uniquely corresponds to one observation from the second sample, and vice versa.
Hence, we have paired data.

In the case of unpaired data, $\mu$ would be estimated by the difference between the two sample averages:
$\hat{\mu} = \hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2$
In the case of paired data, $\mu$ is estimated by the average of all subject-specific differences between BP's before and after treatment.
More specifically, the variable of interest becomes the difference X in BP before and after treatment:
$X = \mathrm{BP}_{\text{before}} - \mathrm{BP}_{\text{after}}$

The observed values $x_i$ for X can be calculated from the observed values of the BP in our sample.
[Table: for each patient, DBP before, DBP after, and the change $x_i$]
$\mu$ is the population mean of the variable X, and inference for $\mu$ can be based on the within-subject differences $x_i$, rather than on the original BP measurements.

3.2 Confidence interval for the difference of two means
For example, a 95% confidence interval for $\mu$ is given by [4.91; 13.63].
Other confidence levels (99%, 90%, ...) are possible as well.
The true average effect $\mu$ may or may not be in the interval [4.91; 13.63]. However, if 100 similar experiments were conducted, then 95 out of the 100 corresponding C.I.'s would be expected to contain $\mu$.
Hence, we are 95% confident that $\mu$ lies within the interval [4.91; 13.63].

This C.I. shows that:
the estimate (9.27mmHg) of $\mu$ is very imprecise: the C.I. is very wide
the estimate is precise up to 4.36 units, with 95% chance
based on our data and with 95% certainty, it can be ruled out that $\mu$ would be zero, i.e., that there would be no treatment effect at all.

3.3 The paired t-test
The hypothesis of interest is $H_0: \mu_1 = \mu_2$ versus $H_A: \mu_1 \neq \mu_2$.
This is equivalent to the following test about the mean of the difference X in blood pressure:
$H_0: \mu = 0$ versus $H_A: \mu \neq 0$
We will reject the null hypothesis if the observed data deviate too much from what one would expect to see if the null hypothesis were correct.
Hence, we will reject $H_0$ if $\bar{x}$ is much larger or smaller than 0.

This is equivalent to rejecting $H_0$ if $|\bar{x} - 0|$ is too large.
Question: How large is too large?
Answer: If the observed difference $|\bar{x} - 0|$ is very unlikely to happen by pure chance.
We therefore calculate the probability p of observing, in a similar experiment, an average observed effect of at least 9.27mmHg, if $\mu = 0$.
In our example, this probability equals p = 0.001.

So, if there were no treatment effect at all, one could expect to observe a difference of at least 9.27mmHg in only 0.1% of future similar experiments.
Since p = 0.001 < 0.05 = α, we consider this sufficient evidence to conclude that Captopril affects the diastolic BP.
Conclusion: There is a significant difference (p = 0.001) in diastolic BP before and after treatment with Captopril.
The testing procedure is called the paired t-test since paired data are analysed, and since the calculation of the p-value is based on the t-distribution.

3.4 The paired versus unpaired t-test
What if the Captopril data were analysed using an unpaired t-test?

[Output: results from the unpaired and the paired t-test, respectively]
Although both tests lead to a significant result, there is a serious difference in p-values, showing that ignoring the paired nature of the data can lead to wrong conclusions.

Conclusion: 15 × 2 measurements ≠ 30 × 1 measurement
In general, the analysis of an outcome measured multiple times per subject (repeated measures) requires different statistical procedures than when the outcome is measured only once for each subject.
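The effect of ignoring the pairing can be illustrated by computing both t-statistics on the same numbers. The data below are hypothetical before/after measurements for six subjects, invented for this illustration only (they are not the Captopril data), and the unpaired statistic uses the equal-variance (pooled) form.

```python
import math
import statistics

# Hypothetical before/after measurements on the same six subjects.
before = [110, 95, 120, 105, 130, 100]
after = [104, 90, 113, 99, 122, 96]
n = len(before)

# Unpaired analysis: treats the data as two independent samples of size n.
mean_diff = statistics.mean(before) - statistics.mean(after)
pooled_var = ((n - 1) * statistics.variance(before)
              + (n - 1) * statistics.variance(after)) / (2 * n - 2)
se_unpaired = math.sqrt(pooled_var * (1 / n + 1 / n))
t_unpaired = mean_diff / se_unpaired          # compared with t on 2n - 2 df

# Paired analysis: works on the n within-subject differences.
diffs = [b - a for b, a in zip(before, after)]
se_paired = statistics.stdev(diffs) / math.sqrt(n)
t_paired = statistics.mean(diffs) / se_paired  # compared with t on n - 1 df

print(f"unpaired t = {t_unpaired:.2f} (df = {2 * n - 2})")
print(f"paired   t = {t_paired:.2f} (df = {n - 1})")
```

Both analyses estimate the same mean difference, but the paired analysis removes the large between-subject variability from the standard error, which is why the two test statistics can differ so much.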

3.5 Example
Obviously, it is important to correctly account for the paired nature of the data.
In practice, this requires knowledge about the design of the study and the way data have been collected.
As an example, suppose interest is in testing for differences in BMI between males and females.
Suppose that BMI measurements are available for 100 males and 100 females. The unpaired t-test is the obvious choice for the analysis, provided all assumptions are satisfied.
Suppose now that the 100 males and females are taken from 100 married couples. Would this change the preferred method for analysis? YES!

3.6 Assumptions
The calculation of the C.I. as well as the computation of the p-value is based on the sampling distribution of $\bar{X}$, which describes what values for $\bar{x}$ can be expected if the experiment were repeated many times.
In large samples, this sampling distribution is normal (CLT).
In small samples, this normality is only valid in cases where the difference in BP is (approximately) normally distributed.
Therefore, in case of small samples, one assumes the difference X to be normally distributed.
Note that, in this context, the sample size refers to the number of pairs, not the number of observations in the data set.

Conclusion:
Large samples: no assumptions
Small samples: normality for the difference X
In our Captopril example, the sample size was small (n = 15). Hence the histogram of the observed differences should be explored for any evidence against symmetry.

[Figure: histogram of observed differences]
Assessment of symmetry is again difficult due to the small sample size, but there is no strong evidence for severe skewness.
Note that the normality assumption is with respect to the difference X, not the original measurements.

In our example, the original BP measurements (before and after treatment) are allowed to be skewed, as long as their differences are symmetrically distributed.
[Figure: skewed distributions after treatment (mean $\mu_2$) and before treatment (mean $\mu_1$), with a symmetric distribution for the difference X]
Hence, it is useless to check symmetry of the original observations.

Note that, in case of skewness, it is often difficult and/or not helpful to transform the observed differences $x_i$:
Since negative differences are often observed, several standard transformations such as $\ln(\cdot)$ or $\sqrt{\cdot}$ are not possible.
Even if a transformation such as, e.g., $y_i = \ln(x_i + 10)$ would yield symmetric observations $y_i$, it is not clear what null hypothesis should be tested. Obviously, one can no longer test whether the mean of Y is equal to zero.
In case of skewness, one therefore usually transforms the original data in such a way that the differences become symmetric. This has the advantage that:
Simple, standard transformations can often be used
One can still test for mean zero.

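For example, a potential transformation for the Captopril data would be:
$\mathrm{BP}_{\text{before}}, \mathrm{BP}_{\text{after}} \rightarrow \ln(\mathrm{BP}_{\text{before}}), \ln(\mathrm{BP}_{\text{after}}) \rightarrow X = \ln(\mathrm{BP}_{\text{before}}) - \ln(\mathrm{BP}_{\text{after}})$
instead of:
$\mathrm{BP}_{\text{before}}, \mathrm{BP}_{\text{after}} \rightarrow X = \mathrm{BP}_{\text{before}} - \mathrm{BP}_{\text{after}} \rightarrow Y = \ln(X + 5)$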

3.7 Example from the biomedical literature
Chen et al. [4], p. 76 and Tables 1 and 2:

Paired t-test to test for time trends (IAC versus AOD)

Unpaired t-test to test for group differences (SARS versus Control)

Chapter 4
The comparison of two proportions: Paired data
Example
McNemar test
Assumptions
Remark
McNemar versus chi-squared
Example from biomedical literature

4.1 Example
Consider the data on the prevalence of severe colds in 1319 children, measured at the ages of 12 and 14.
The response of interest is whether the child had severe colds during the last 12 months.
[Table: severe colds at 12 yrs. (Yes, No) by severe colds at 14 yrs. (Yes, No)]

Research question: Is the prevalence of severe colds different at the two ages?
At age 12, 356/1319 = 27% of the children reported severe colds. At age 14, this percentage equals 468/1319 = 35%.
These data suggest that the prevalence of severe colds increases with age.
It would be of interest to know how likely the observed change in prevalence is to occur by pure chance.
If this is very unlikely, the above data provide evidence that the prevalence indeed changes with age. Otherwise, the above data do not provide evidence for such a change.

Note that the data structure is similar to the one in the Captopril data, in the sense that subjects are measured twice at different time points.
Hence, we again have paired data.

4.2 McNemar test
Let $\pi_1$ and $\pi_2$ be the percentage of children in the total population with a severe cold at the ages 12 and 14, respectively.
Interest is in testing whether $\pi_1$ and $\pi_2$ are equal, which would reflect no change over time in the percentage of children with a severe cold.
The hypothesis of interest is $H_0: \pi_1 = \pi_2$ versus $H_A: \pi_1 \neq \pi_2$.
Note that a change over time in the percentage of severe colds can only occur if children change their status:
No severe cold at 12yrs → severe cold at 14yrs
Severe cold at 12yrs → no severe cold at 14yrs

Moreover, in order to have a change over time, more children should change in one direction than in the other.
Our test will therefore reject $H_0$ if the number of changers in one direction is much larger than the number of changers in the other direction.
In our example, we will reject $H_0$ if the difference between the two numbers of changers is too large.
Question: How large is too large?
Answer: If the observed difference is very unlikely to happen by pure chance.

We therefore calculate the probability p of observing, in a similar experiment, a difference between the numbers of changers at least equal to 112, if there were no change over time in the total population.
In our example, this probability equals p < .
So, if severe colds occurred equally frequently at both ages, it would be very unlikely to observe what has been observed in this particular experiment.
We therefore conclude that our data provide evidence that the probability of having a severe cold at the age of 12 is not the same as the probability of having a severe cold at the age of 14.

Conclusion: There is a significant difference (p < ) in the occurrence of severe colds between the ages 12 and 14.
The testing procedure needed for the comparison of proportions in paired data is called the McNemar test.

4.3 Assumptions
Similarly to the chi-squared test, the calculation of the p-value is based on the assumption of a large sample.
In case of small samples, the p-value can be calculated without approximations based on the CLT.
The exact calculation is similar to the Fisher Exact test for unpaired data.
Many statistical packages only support the large-sample calculations.

4.4 Remark
As discussed before, the McNemar test rejects $H_0$ if the off-diagonal elements are too different from each other, i.e., if there are many more changes in one direction than in the other direction.
This implies that the testing procedure is independent of the observed diagonal elements.
[Table: two example tables, each with the McNemar comparison and the resulting p-value]

4.5 McNemar versus chi-squared
There seems to be a lot of confusion about when the McNemar test and when the chi-squared test should be used.
As an example, consider the results from a survey in which 75 people were questioned about their intended vote in the US presidential elections, before and after a debate on national television:
                    After TV debate
Before TV debate    Reagan    Carter
Reagan              27        7
Carter              13        28

Depending on the research question, this table can be analysed in two different ways:

Chi-squared: test for a relation between the vote before and after the debate
McNemar: test for an equal proportion of Reagan voters before and after the debate

Hence, even when data are paired, the chi-squared test can be used. Note that, in the case of continuous data, there is no such choice:

Unpaired data: unpaired t-test
Paired data: paired t-test

4.5.1 McNemar test

Research question: Is the proportion of Reagan voters the same before and after the debate?

The observed proportions are 34/75 = 45.3% before and 40/75 = 53.3% after the debate.

The p-value obtained from the McNemar test equals p = 0.2636. Hence, a difference at least as large as the observed 45.3% versus 53.3% would happen in 26.36% of the cases, even if the percentage of voters for Reagan were the same before and after the debate.

Conclusion: The debate has not significantly changed the voting behaviour (p = 0.2636).
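
As a check, this p-value can be reproduced from the two discordant counts. The sketch below assumes the large-sample version of the test with continuity correction. The counts follow from the proportions reported in this section: 27 of the 34 people in favour of Reagan before the debate remained so (hence 7 switched to Carter), and 13 of the 41 people in favour of Carter before the debate switched to Reagan. The function name is illustrative.

```python
import math

def mcnemar_p(b, c):
    """Continuity-corrected large-sample McNemar p-value from the discordant counts."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of a chi-squared distribution with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2))

reagan_to_carter = 34 - 27   # in favour of Reagan before, Carter after
carter_to_reagan = 13        # in favour of Carter before, Reagan after

p = mcnemar_p(reagan_to_carter, carter_to_reagan)
print(f"McNemar p-value: {p:.4f}")
```

The result should agree closely with the 26.36% quoted above, which suggests that the continuity-corrected version was used.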

4.5.2 Chi-squared test

Research question: Is there a relation between voting behaviour before and after the debate?

Or equivalently: Is the proportion of Reagan voters after the debate the same amongst those who were in favour of Reagan before the debate as amongst those who were in favour of Carter before the debate?

The observed proportions are 27/34 = 79.4% and 13/41 = 31.7%.

Note that this comes down to comparing the proportion of Reagan voters after the debate between two separate groups: those who were in favour of Reagan before the debate, and those who were not. Hence, we now compare unpaired proportions.

The p-value obtained from the chi-squared test equals p < : the observed difference of 79.4% versus 31.7% is very unlikely to happen if there were no relation between the voting behaviour before and after the debate.

Conclusion: There is a significant relation between the voting behaviour before and after the debate (p < ).
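
The chi-squared analysis treats the same 75 people as two separate groups, defined by their preference before the debate. A minimal sketch, assuming the Pearson chi-squared statistic without continuity correction (the text does not say which variant was used) and counts derived from the reported proportions 27/34 and 13/41. The function name is illustrative.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-squared distribution with 1 degree of freedom
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Rows: preference before the debate; columns: preference after the debate
stat, p = chi2_2x2(27, 34 - 27, 13, 41 - 13)
print(f"chi-squared = {stat:.2f}, p = {p:.6f}")
```

The statistic is large and the p-value very small, in line with the conclusion that the votes before and after the debate are strongly related.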

4.5.3 General conclusion

The survey results can be analysed in two different ways, answering two different questions:

McNemar: There is no evidence that a TV debate would change the results of an election (p = 0.2636).
Chi-squared: There is a strong relation between voting behaviour before and after the debate (p < ).

Note that the proportion of Reagan voters before and after a TV debate could also be compared based on unpaired data. One would then question 75 people before the debate, and 75 other people after the debate.

The resulting 2 × 2 table would then contain 150 subjects:

              Preference
TV debate     Reagan    Carter    Total
Before            34        41       75
After             40        35       75

The chi-squared test would compare the observed proportions 34/75 = 45.3% and 40/75 = 53.3%, which are the same as those compared before with the McNemar test for the experiment with paired observations.

4.5.4 Some further examples

There is no relation between (non-)significance of the chi-squared test and (non-)significance of the McNemar test.

4.6 Example from biomedical literature

De Clercq et al. [5], Abstract: McNemar test to compare the presence of symptoms before and after surgery.

Chapter 5 Errors in statistics: Basic concepts

Introduction
Two types of errors
Power
Sample size calculation
Examples
Remarks
Example from the biomedical literature

5.1 Introduction

Reconsider the example on weight gain in rats, where interest is in the comparison between rats fed a high or a low protein diet.

Group-specific histograms:

Group-specific summary statistics:

On average, there is an observed difference of 19g between the rats on a high protein diet and those on a low protein diet. Based on the unpaired t-test, we found earlier that this observed difference is not sufficient evidence to believe that the weight gain is really different for the two diets (p = ).

Conclusion: There is no significant difference (p = ) in weight gain between rats on a high protein diet and rats on a low protein diet.

As indicated before, the result of a statistical test should be interpreted as evidence in favour of or against the null hypothesis, and should not be interpreted as formal proof. In our example, any difference in weight gain between a population treated with one diet and a population treated with the other diet is too small to be detected based on 12 and 7 animals, respectively.

Alternatively, if the t-test had led to p = 0.001, this would still not formally prove that there is a difference between both populations. After all, p = 0.001 would only indicate that a difference at least as large as the observed 19g occurs once every 1000 times, even if there is no difference at all between both populations. Maybe our sample was indeed the extreme one that happens once every thousand experiments.

Hence, whenever statistical tests are used, one has to be aware that errors in the conclusions can occur. It is therefore important to quantify the errors, and to keep them under control.

5.2 Two types of errors

                 Reality
Test result      H0 correct      H0 not correct
Accept H0        No error        Type II error
Reject H0        Type I error    No error

Type I error: H0 is incorrectly rejected
Type II error: H0 is incorrectly accepted

5.3 Type I error

A type I error occurs if H0 is correct but the test leads to a significant result.

Question: How likely is such an error to occur?

Suppose the test is performed at the α = 5% level of significance. If H0 is correct, then one will observe a significant result in 5% of the cases. Hence, in 5% of the cases, H0 would be incorrectly rejected.

The probability of making a type I error is therefore equal to the chosen level of significance α. In practice, the probability of making a type I error is kept under control by choosing α sufficiently small. In biomedical sciences, α = 5% is often used, thereby accepting a type I error in 5% of the cases in which H0 is correct.

                 Reality
Test result      H0 correct      H0 not correct
Accept H0        1 − α
Reject H0        α
Total            1

If H0 is correct, then the probability of making a type I error is α, while the probability of correctly accepting H0 is 1 − α.

5.4 Type II error

A type II error occurs if H0 is incorrect but the test has not detected this, i.e., a non-significant result is obtained.

Question: How likely is such an error to occur?

In contrast to the type I error, the probability of making a type II error is not easily controlled, and depends on various aspects of the sample(s) and population(s).

In analogy to the type I error, the type II error rate is denoted by β.

                 Reality
Test result      H0 correct      H0 not correct
Accept H0        1 − α           β
Reject H0        α               1 − β
Total            1               1

The power of a statistical test is 1 − β, the probability of correctly rejecting H0.

5.5 Power

In general, a specific testing procedure is acceptable only if:

the probability of making a type I error is sufficiently small
the power to detect deviations from H0 is sufficiently large

The first condition can be met by specifying α sufficiently small. The second condition is more difficult to meet, as the power depends on various aspects of the sample(s) and population(s). This will be illustrated in the context of the comparison of two groups (such as the weight gain experiment).

As before, let µ1 and µ2 represent the average weight gain in the total population, under high and low protein diets, respectively. The null and alternative hypotheses are given by

H0: µ1 = µ2 versus HA: µ1 ≠ µ2

The power is the probability of correctly rejecting H0. In that case, µ1 ≠ µ2, and we denote the true difference between both populations by Δ = µ1 − µ2.

The unpaired t-test assumes the data to be normally distributed in both populations, with equal variability σ².

Graphically: two normal distributions with equal variance σ², centred at µ2 (low protein) and µ1 (high protein).

5.5.1 Power as a function of α

The smaller α, the smaller the power.

Intuitively: Type I errors are less likely if the null hypothesis is rejected less often. However, in cases where H0 is truly wrong, it will then also be rejected less often.

An extreme case is obtained for α = 0: α = 0 implies that the null hypothesis is always accepted. So, in case the null hypothesis is wrong, it is still accepted, leading to power 0.

5.5.2 Power as a function of true difference Δ

The smaller Δ, the smaller the power.

Intuitively: Large deviations from the null hypothesis are easier to detect.

5.5.3 Power as a function of variability σ²

The smaller σ², the larger the power.

Intuitively: Homogeneous groups are more easily distinguished than heterogeneous groups.

5.5.4 Power as a function of sample size(s)

The more observations, the larger the power.

Intuitively: More observations yield more information about the population(s), and therefore more precision in the conclusions.

5.5.5 Conclusion

The power depends on various aspects:

Level of significance α
True difference Δ between the populations
Within-group variance σ²
Sample size(s)

Note that the sample size is the only aspect under control of the investigator. In practice, one can calculate the sample size needed to reach a sufficiently high power.
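
These four dependencies can be made concrete with a normal approximation to the power of a two-sided two-sample test. This is a sketch rather than the exact t-based calculation: it treats σ as known, and the function name and the example values (a difference of 20, σ = 20, 10 observations per group) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, sigma, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test (sigma assumed known)."""
    z = NormalDist()
    se = sigma * sqrt(1 / n1 + 1 / n2)      # standard error of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)       # two-sided critical value
    shift = delta / se                      # true difference in standard-error units
    # Probability of landing in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

base = approx_power(delta=20, sigma=20, n1=10, n2=10)
print("baseline:      ", round(base, 3))
print("smaller alpha: ", round(approx_power(20, 20, 10, 10, alpha=0.01), 3))
print("smaller delta: ", round(approx_power(10, 20, 10, 10), 3))
print("smaller sigma: ", round(approx_power(20, 10, 10, 10), 3))
print("larger samples:", round(approx_power(20, 20, 40, 40), 3))
```

Relative to the baseline, the power drops for a smaller α or a smaller true difference, and rises for a smaller σ or larger samples. With a true difference of 0, the function returns α itself.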

5.6 Sample size calculation

As indicated before, a testing procedure is only acceptable if it has sufficient power, i.e., if the probability of making a type II error is sufficiently small. Since the sample size is the only aspect influencing the power that is under the control of the investigator, it is important that experiments are sufficiently large in order for the power to be sufficiently large as well.

The level of significance α is chosen such that the probability of making a type I error is sufficiently small.

The within-group variance σ² is pre-specified based on earlier, similar experiments, relevant literature, or a pilot study.

To be on the safe side, usually an upper bound for σ² is used: in case the variability is smaller, the power is higher, hence still sufficiently high.

In practice, Δ is not known. Instead, the smallest Δ which would still be clinically relevant to detect is specified. If sufficient power is attained for the smallest meaningful Δ, we have that:

Any larger difference will be detected with even larger power.
We are not concerned about small powers for detecting smaller differences, as such differences are not relevant anyway.

One can then calculate the number(s) of observations needed to reach a desired level of power.

5.7 Example: Weight gain data

In the weight gain data, the observed difference of 19g was found not to be significant (p = ).

We can calculate the power, i.e., the probability that a real difference of 19g would be found significant if a new experiment were to be conducted, again with 12 and 7 observations in the high and low protein diet groups, respectively.

Group-specific summary statistics, from the current experiment:

Power calculations will be based on σ = 21 and α = 0.05.

The power to detect a difference of 19g equals 43.45%. Hence, with 12 and 7 observations respectively, there is only a 43.45% chance that a true difference of 19g would be detected.

If a difference of 19g is considered clinically relevant, then the weight gain experiment was clearly too small, since it is very likely that such a difference would remain undetected.

We can also calculate the power for other values of Δ.

Summary:

True difference    Power to detect the difference
0g                  5.00%  (equal to α)
10g                15.70%
19g                43.45%
30g                80.80%
40g                96.49%

For example, 12 and 7 observations would be sufficient to show a true difference of 40g with more than 96% chance.

Alternatively, one can also calculate how large the samples should be to detect a difference of, e.g., 20g with sufficiently high power.
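
The power values in this summary can be approximated by simulation: generate many experiments with a known true difference and count how often the unpaired t-test rejects H0. The sketch below assumes normally distributed data with σ = 21 in both groups and group sizes 12 and 7, as in the text. The critical value 2.110 is the standard two-sided 5% point of the t-distribution with 12 + 7 − 2 = 17 degrees of freedom, hard-coded to keep the example free of external libraries. The function names, the seed, and the number of simulated experiments are arbitrary choices, so the estimates will only roughly match the exact values above.

```python
import random
from math import sqrt

T_CRIT_17DF = 2.110  # two-sided 5% critical value, t-distribution with 17 df

def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v

def simulated_power(delta, sigma=21.0, n1=12, n2=7, n_sim=10000, seed=1):
    """Fraction of simulated experiments in which the unpaired t-test rejects H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        high = [rng.gauss(delta, sigma) for _ in range(n1)]
        low = [rng.gauss(0.0, sigma) for _ in range(n2)]
        m1, v1 = mean_and_variance(high)
        m2, v2 = mean_and_variance(low)
        # Pooled variance, as assumed by the unpaired t-test
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        t = (m1 - m2) / sqrt(pooled * (1 / n1 + 1 / n2))
        if abs(t) > T_CRIT_17DF:
            rejections += 1
    return rejections / n_sim

estimates = {delta: simulated_power(delta) for delta in (0, 10, 19, 30, 40)}
for delta, power in estimates.items():
    print(f"true difference {delta:2d}g: estimated power {power:.3f}")
```

With a true difference of 0g, the rejection rate estimates the type I error rate and should be close to α = 5%.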

If a power of 90% is required to detect true effects as small as Δ = 20g, at least 25 observations are needed in each group.

With 30 observations in each group, the probability of making a type II error, when the true effect is not smaller than 20g, is approximately 5%.
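
A rough version of this sample size calculation uses the normal approximation n = 2 (z_{1−α/2} + z_{1−β})² σ² / Δ² per group. This is an approximation and not necessarily the method behind the numbers above: it ignores that σ has to be estimated, so it tends to give slightly smaller sample sizes than a t-based calculation. The function name is illustrative; σ = 21 and Δ = 20g are taken from the text.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, power, alpha=0.05):
    """Approximate observations per group for a two-sided two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # quantile for the two-sided significance level
    z_beta = z.inv_cdf(power)            # quantile for the desired power 1 - beta
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

n90 = n_per_group(delta=20, sigma=21, power=0.90)
n95 = n_per_group(delta=20, sigma=21, power=0.95)
print("per group for 90% power:", n90)
print("per group for 95% power:", n95)
```

The approximation gives 24 per group for 90% power and 29 per group for 95% power, each one observation below the 25 and 30 quoted above.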

5.8 Example: Sickness absence

We reconsider the data on sickness absence, collected on 585 employees with a similar job:

           Sickness absence
Gender     No      Yes     Total
female     245     184     429
male        98      58     156

The observed difference between the absence rate of 42.9% in females and 37.2% in males was found not to be significant (chi-squared test, p = 0.215).

If the percentages of sickness absence were 42% in the total female population and 37% in the total male population, and a random sample of 429 females and 156 males were taken, there would be a 19.01% chance of obtaining a significant result. So, if the population proportions are indeed 42% and 37%, an experiment with 429 and 156 observations would detect this difference only 19 times out of 100 experiments.

If a difference of 5 percentage points is considered clinically relevant, then the current experiment was clearly too small, since it is very likely that such a difference would remain undetected.

We can calculate how large the samples should be in order to detect a difference between 42% and 37% with sufficiently high power.

For example, two samples of approximately 2500 observations are needed in order to show a difference between 37% and 42% with 95% probability.
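
Both numbers in this example can be approximated with large-sample formulas for the comparison of two proportions. The sketch below uses one common normal approximation (pooled variance under H0, unpooled variance under the alternative), which may differ from the method used by the software behind these results. The function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()

def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of the two-sided large-sample test for two proportions."""
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    se0 = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))      # standard error under H0
    se1 = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)      # standard error under HA
    crit = Z.inv_cdf(1 - alpha / 2) * se0                    # rejection threshold for |difference|
    diff = abs(p1 - p2)
    return (1 - Z.cdf((crit - diff) / se1)) + Z.cdf((-crit - diff) / se1)

def n_per_group(p1, p2, power, alpha=0.05):
    """Approximate equal group size needed to reach the requested power."""
    p_bar = (p1 + p2) / 2
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_beta = Z.inv_cdf(power)
    num = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil(num ** 2 / (p1 - p2) ** 2)

current_power = power_two_proportions(0.42, 0.37, n1=429, n2=156)
needed = n_per_group(0.42, 0.37, power=0.95)
print(f"power with 429 and 156 observations: {current_power:.4f}")
print(f"observations per group for 95% power: {needed}")
```

The first value should be close to the 19.01% reported earlier, and the second close to the approximately 2500 observations per group.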

5.9 Remarks

The earlier examples of power and/or sample size calculations were in the context of the unpaired t-test and the chi-squared test. Similar calculations can be done in any other statistical testing situation, e.g., Fisher exact test, paired t-test, McNemar test, ...

Strictly speaking, all experiments should be preceded by a realistic sample size calculation to avoid experiments with unacceptably high type II error rates, i.e., with almost no chance at all of showing clinically meaningful effects.

5.10 Example from the biomedical literature

Wong et al. [6], Methodology section, p.658:

Table 2 with results:

Discussion, p.664:

The difference on which the sample size calculation was based was much larger than what actually was observed in the experiment. Therefore, the power to reject equality of the groups was (much) lower than the expected 80%.

The current study cannot tell the difference between a 9% increase and a 3% decrease. If such differences are considered clinically important, then the current study was under-powered, due to the fact that the difference was overestimated at the time of the sample size calculation.

Chapter 6 Errors in statistics: Practical implications

Multiple testing
Bonferroni correction
Tests for baseline differences
Equivalence tests
Significance versus relevance
Examples from biomedical literature


More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

How To Check For Differences In The One Way Anova

How To Check For Differences In The One Way Anova MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Inference for two Population Means

Inference for two Population Means Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example

More information

Erik Parner 14 September 2016. Basic Biostatistics - Day 2-21 September, 2016 1

Erik Parner 14 September 2016. Basic Biostatistics - Day 2-21 September, 2016 1 PhD course in Basic Biostatistics Day Erik Parner, Department of Biostatistics, Aarhus University Log-transformation of continuous data Exercise.+.4+Standard- (Triglyceride) Logarithms and exponentials

More information

Fixed-Effect Versus Random-Effects Models

Fixed-Effect Versus Random-Effects Models CHAPTER 13 Fixed-Effect Versus Random-Effects Models Introduction Definition of a summary effect Estimating the summary effect Extreme effect size in a large study or a small study Confidence interval

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Mind on Statistics. Chapter 4

Mind on Statistics. Chapter 4 Mind on Statistics Chapter 4 Sections 4.1 Questions 1 to 4: The table below shows the counts by gender and highest degree attained for 498 respondents in the General Social Survey. Highest Degree Gender

More information

SOLUTIONS TO BIOSTATISTICS PRACTICE PROBLEMS

SOLUTIONS TO BIOSTATISTICS PRACTICE PROBLEMS SOLUTIONS TO BIOSTATISTICS PRACTICE PROBLEMS BIOSTATISTICS DESCRIBING DATA, THE NORMAL DISTRIBUTION SOLUTIONS 1. a. To calculate the mean, we just add up all 7 values, and divide by 7. In Xi i= 1 fancy

More information

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1. General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

Statistics for Sports Medicine

Statistics for Sports Medicine Statistics for Sports Medicine Suzanne Hecht, MD University of Minnesota (suzanne.hecht@gmail.com) Fellow s Research Conference July 2012: Philadelphia GOALS Try not to bore you to death!! Try to teach

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

More information

Introduction to Hypothesis Testing OPRE 6301

Introduction to Hypothesis Testing OPRE 6301 Introduction to Hypothesis Testing OPRE 6301 Motivation... The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about

More information

Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

More information

STAT 350 Practice Final Exam Solution (Spring 2015)

STAT 350 Practice Final Exam Solution (Spring 2015) PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

More information

Exact Nonparametric Tests for Comparing Means - A Personal Summary

Exact Nonparametric Tests for Comparing Means - A Personal Summary Exact Nonparametric Tests for Comparing Means - A Personal Summary Karl H. Schlag European University Institute 1 December 14, 2006 1 Economics Department, European University Institute. Via della Piazzuola

More information

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences. 1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

More information

UNIVERSITY OF NAIROBI

UNIVERSITY OF NAIROBI UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER

More information

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so: Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a

More information

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

More information

CHAPTER 14 NONPARAMETRIC TESTS

CHAPTER 14 NONPARAMETRIC TESTS CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences

More information

Confidence Intervals for One Standard Deviation Using Standard Deviation

Confidence Intervals for One Standard Deviation Using Standard Deviation Chapter 640 Confidence Intervals for One Standard Deviation Using Standard Deviation Introduction This routine calculates the sample size necessary to achieve a specified interval width or distance from

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples Statistics One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples February 3, 00 Jobayer Hossain, Ph.D. & Tim Bunnell, Ph.D. Nemours

More information

22. HYPOTHESIS TESTING

22. HYPOTHESIS TESTING 22. HYPOTHESIS TESTING Often, we need to make decisions based on incomplete information. Do the data support some belief ( hypothesis ) about the value of a population parameter? Is OJ Simpson guilty?

More information

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

More information

Factors affecting online sales

Factors affecting online sales Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

More information

LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

More information

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

More information

Introduction to Hypothesis Testing

Introduction to Hypothesis Testing I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013

STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013 STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico Fall 2013 CHAPTER 18 INFERENCE ABOUT A POPULATION MEAN. Conditions for Inference about mean

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

Skewed Data and Non-parametric Methods

Skewed Data and Non-parametric Methods 0 2 4 6 8 10 12 14 Skewed Data and Non-parametric Methods Comparing two groups: t-test assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

MEASURES OF LOCATION AND SPREAD

MEASURES OF LOCATION AND SPREAD Paper TU04 An Overview of Non-parametric Tests in SAS : When, Why, and How Paul A. Pappas and Venita DePuy Durham, North Carolina, USA ABSTRACT Most commonly used statistical procedures are based on the

More information