Unit 14: Nonparametric Statistical Methods


1 Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 7/26/2004 Unit 14 - Stat Ramón V. León 1

2 Introductory Remarks
- Most methods studied so far have been based on the assumption of normally distributed data.
  - Frequently this assumption is not valid.
  - The sample size may be too small to verify it.
- Sometimes the data are measured on an ordinal scale.
- Nonparametric or distribution-free statistical methods
  - make very few assumptions about the form of the population distribution from which the data are sampled;
  - are based on ranks, so they can be used on ordinal data.
- We will concentrate on hypothesis tests but will also mention confidence interval procedures.

3 Inference for a Single Sample
Consider a random sample $x_1, x_2, \ldots, x_n$ from a population with unknown median $\mu$. (Recall that for nonnormal, especially skewed, distributions the median is a better measure of the center than the mean.)
$H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$
Example: Test whether the median household income of a population exceeds \$50,000, based on a random sample of household incomes from that population.
For simplicity we sometimes present methods for one-sided tests only. Modifications for two-sided tests are straightforward and are given in the textbook. Some examples in these notes are two-sided tests.

4 Sign Test for a Single Sample
$H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$
Sign test:
1. Count the number of $x_i$'s that exceed $\mu_0$. Denote this number by $s_+$, called the number of plus signs. Let $s_- = n - s_+$, which is the number of minus signs.
2. Reject $H_0$ if $s_+$ is large or, equivalently, if $s_-$ is small.
Test idea: Under the null hypothesis $S_+$ has a binomial distribution, $\mathrm{Bin}(n, 1/2)$, so this test is simply the test for a binomial proportion.

5 Sign Test Example
A thermostat used in an electric device is to be checked for the accuracy of its design setting of 200°F. Ten thermostats were tested to determine their actual settings, resulting in the following data:
202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, …
$H_0: \mu = 200$ vs. $H_1: \mu \neq 200$
$s_+ = 8$ = number of data values > 200, so
$$P\text{-value} = 2 \sum_{i=8}^{10} \binom{10}{i} \left(\frac{1}{2}\right)^{10} = 2 \sum_{i=0}^{2} \binom{10}{i} \left(\frac{1}{2}\right)^{10} = \frac{112}{1024} \approx 0.109$$
(The t test based on the mean has P-value = …. However, recall that the t test assumes a normal population.)
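The exact binomial calculation behind the sign test can be checked in a few lines. This is a minimal sketch that uses only the counts stated on the slide ($n = 10$, $s_+ = 8$); the function name `binom_upper_tail` is an illustrative choice, not something from the course materials.

```python
from math import comb

def binom_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n, s_plus = 10, 8                        # counts from the thermostat example
one_sided = binom_upper_tail(s_plus, n)  # P(S+ >= 8) under H0
two_sided = min(1.0, 2 * one_sided)      # doubled tail for the two-sided test
print(one_sided, two_sided)
```

Doubling the smaller tail is the usual convention for a two-sided sign test because the null distribution Bin(n, 1/2) is symmetric.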

6 Normal Approximation to Test Statistic
If the sample size is large ($n \ge 20$), the common null distribution of $S_+$ and $S_-$ is approximated by a normal distribution with
$$E(S_+) = E(S_-) = np = \frac{n}{2}, \qquad \mathrm{Var}(S_+) = \mathrm{Var}(S_-) = np(1-p) = \frac{n}{4}.$$
Therefore one can perform a one-sided z-test with
$$z = \frac{s_+ - n/2}{\sqrt{n/4}}.$$
A continuity correction subtracts 1/2 from the numerator.

7 P-values for Sign Test Using JMP
Based on the normal approximation to the binomial ($\chi^2 = z^2$)

8 Treatment of Ties
- The theory of the test assumes that the distribution of the data is continuous, so in theory ties are impossible.
- In practice they do occur because of rounding.
- A simple solution is to ignore the ties and work only with the untied observations.
- This reduces the effective sample size of the test and hence its power, but the loss is not significant if there are only a few ties.

9 Confidence Interval for µ
Let $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ be the ordered data values. Then a $(1-\alpha)$-level CI for $\mu$ is given by
$$x_{(b+1)} \le \mu \le x_{(n-b)},$$
where $b = b_{n,1-\alpha/2}$ is the lower $\alpha/2$ critical point of the $\mathrm{Bin}(n, 1/2)$ distribution.
Note: Not all confidence levels are possible because of the discreteness of the binomial distribution.

10 Thermostat Setting: Sign Confidence Interval for the Median
From Table A.1 we see that for n = 10 and p = 0.5, the lower critical point of the binomial distribution is 1 and by symmetry the upper critical point is 9. Setting $\alpha/2 = 0.011$, which gives $1 - \alpha = 1 - 0.022 = 0.978$, we find that
$$x_{(2)} \le \mu \le x_{(9)}$$
is a 97.8% CI for $\mu$.
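The confidence level attached to the order-statistic interval can be recomputed directly from the binomial distribution. The sketch below assumes the interval form $[x_{(b+1)}, x_{(n-b)}]$ given earlier; the helper names `sign_ci_coverage` and `sign_ci` are hypothetical.

```python
from math import comb

def sign_ci_coverage(n, b):
    """Confidence level of [x_(b+1), x_(n-b)]: 1 - 2 * P(Bin(n, 1/2) <= b)."""
    lower_tail = sum(comb(n, i) for i in range(b + 1)) / 2**n
    return 1 - 2 * lower_tail

def sign_ci(data, b):
    """Return (x_(b+1), x_(n-b)) using 1-based order-statistic notation."""
    xs = sorted(data)
    return xs[b], xs[len(xs) - b - 1]

coverage = sign_ci_coverage(10, 1)  # n = 10, b = 1 as on the slide
print(round(coverage, 3))
```

With n = 10 and b = 1 the lower tail is 11/1024, which is where the 0.011 and 0.978 on the slide come from.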

11 Sign Test for Matched Pairs
Drop the 3 tied pairs. Then $s_+ = 20$ and $s_- = 3$.


13 Sign Test for Matched Pairs in JMP
Pearson's p-value is not the same as the book's two-sided P-value because the book uses the continuity correction in the normal approximation to the binomial distribution; i.e., the book uses z = … (page 567) rather than the z = … used by JMP. Note that $(\ldots)^2 = \ldots$
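The effect of the continuity correction can be seen by computing both versions of z from the matched-pairs counts $s_+ = 20$, $s_- = 3$ (so $n = 23$ untied pairs). This is a sketch under the assumption that the correction subtracts 1/2 in the upper tail; the function names are illustrative, and the numbers it prints are computed here rather than copied from the book or from JMP.

```python
from math import sqrt, erfc

def sign_test_z(s_plus, n, continuity=False):
    """Normal-approximation z for the sign test (upper-tail form)."""
    correction = 0.5 if continuity else 0.0
    return (s_plus - n / 2 - correction) / sqrt(n / 4)

def two_sided_p(z):
    """2 * (1 - Phi(|z|)) for a standard normal."""
    return erfc(abs(z) / sqrt(2))

s_plus, s_minus = 20, 3
n = s_plus + s_minus
z_plain = sign_test_z(s_plus, n)                  # no continuity correction
z_cc = sign_test_z(s_plus, n, continuity=True)    # with continuity correction
print(z_plain, two_sided_p(z_plain))
print(z_cc, two_sided_p(z_cc))
print(z_plain**2, (s_plus - s_minus) ** 2 / n)    # uncorrected z^2 equals (s+ - s-)^2 / n
```

The last line illustrates why a chi-square statistic with one degree of freedom and an uncorrected z-test give the same two-sided p-value.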

14 Wilcoxon Signed Rank Test
$H_0: \mu = \mu_0$ vs. $H_1: \mu \neq \mu_0$
More powerful than the sign test; however, it requires the assumption that the population distribution is symmetric.
Example 14.1 and 14.4: Thermostat setting is 200°F
1. Rank and order the differences in terms of their absolute values.
2. Calculate $w_+$ = sum of the ranks of the positive differences: $w_+ = \ldots$
3. Reject $H_0$ if $w_+$ is large or small.

15 Wilcoxon Signed Rank Test in JMP
This test finds a significant difference at α = 0.05, while the sign test did not even at α = 0.1.

16 Normal Approximation in the Wilcoxon Signed Rank Test
For large n, the null distribution of $W$ (either $W_+$ or $W_-$) can be well approximated by a normal distribution with mean and variance given by
$$E(W) = \frac{n(n+1)}{4} \quad\text{and}\quad \mathrm{Var}(W) = \frac{n(n+1)(2n+1)}{24}.$$
For large samples a one-sided (greater than median) z-test uses the statistic
$$z = \frac{w_+ - n(n+1)/4 - 1/2}{\sqrt{n(n+1)(2n+1)/24}}.$$

17 Importance of Symmetric Population Assumption
Here, even though $H_0$ is true, the long right-hand tail makes the positive differences tend to be larger in magnitude than the negative differences, resulting in higher ranks. This inflates $w_+$ and hence the test's type I error probability.

18 Null Distribution of the Wilcoxon Signed Rank Statistic

20 Wilcoxon Signed Rank Statistic: Treatment of Ties
There are two types of ties:
- Some of the data are equal to the median. Drop these observations.
- Some of the differences from the median may be tied in absolute value. Use the midrank, that is, the average rank.
For example, suppose $d_1 = -1$, $d_2 = +3$, $d_3 = -3$, $d_4 = +5$. Then
$$r_1 = 1, \quad r_2 = r_3 = \frac{2+3}{2} = 2.5, \quad r_4 = 4.$$
With ties, Table A.10 is only approximate.
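The midrank rule can be sketched as follows. The differences use the magnitudes 1, 3, 3, 5 from the slide; the signs of the first and third differences are an assumption (the transcript lost them), and the function names are illustrative.

```python
def midranks(values):
    """Ranks of `values` in ascending order, averaging ranks within tied groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_stat(diffs):
    """w+ = sum of midranks of |d| over positive differences; zeros are dropped."""
    d = [x for x in diffs if x != 0]
    r = midranks([abs(x) for x in d])
    return sum(rank for x, rank in zip(d, r) if x > 0)

diffs = [-1, 3, -3, 5]  # signs of d1 and d3 assumed negative
print(midranks([abs(x) for x in diffs]))
print(signed_rank_stat(diffs))
```

Dropping zero differences before ranking matches the first type of tie described above.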

21 Wilcoxon Signed Rank Test: Matched Pair Design
Example 14.5: Comparing Two Methods of Cardiac Output
- Notice that we drop the three zero differences.
- Notice that we average the tied ranks.
Two-sided P-values:
- Sign test: …
- Signed rank test: …
- t-test: … (page 284)
(Notice that these tests require progressively more stringent assumptions about the population of differences.)

22 JMP Calculation

23 Signed Rank Confidence Interval for the Median

24 Thermostat Setting: Wilcoxon Signed Rank Confidence Interval for Median
From Table A.10 we see that for n = 10, the upper 2.4% critical point is 47 and by symmetry the lower 2.4% critical point is
$$\frac{10(10+1)}{2} - 47 = 55 - 47 = 8.$$
Setting $\alpha/2 = 0.024$ and hence $1 - \alpha = 1 - 0.048 = 0.952$, we find that
$$\bar{x}_{(9)} \le \mu \le \bar{x}_{(47)}$$
is a 95.2% CI for $\mu$, where $\bar{x}_{(k)}$ denotes the k-th smallest of the pairwise averages $(x_i + x_j)/2$, $i \le j$.
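The tabled critical point can be reproduced by enumerating the exact null distribution of $W_+$: under $H_0$ each rank 1..n is included in $W_+$ independently with probability 1/2. The sketch below counts subsets of ranks by their sum; `signed_rank_null_counts` and `walsh_averages` are hypothetical helper names, and the data passed to `walsh_averages` are placeholders, not the thermostat readings.

```python
def signed_rank_null_counts(n):
    """counts[w] = number of subsets of {1, ..., n} whose elements sum to w."""
    counts = [1] + [0] * (n * (n + 1) // 2)
    for r in range(1, n + 1):
        # iterate downward so each rank is used at most once
        for w in range(len(counts) - 1, r - 1, -1):
            counts[w] += counts[w - r]
    return counts

def walsh_averages(data):
    """Sorted pairwise averages (x_i + x_j) / 2 for i <= j."""
    n = len(data)
    return sorted((data[i] + data[j]) / 2 for i in range(n) for j in range(i, n))

counts = signed_rank_null_counts(10)
upper_tail = sum(counts[47:]) / 2**10   # P(W+ >= 47) under H0
lower_crit = 10 * 11 // 2 - 47          # symmetric lower critical point
print(upper_tail, lower_crit)

placeholder = list(range(1, 11))        # hypothetical data, for shape only
w = walsh_averages(placeholder)
ci = (w[lower_crit], w[47 - 1])         # x-bar_(9) and x-bar_(47), 1-based
```

For n = 10 there are 55 pairwise averages, so the interval uses the 9th smallest and the 47th smallest.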

25 Inferences for Two Independent Samples
One wants to show that the observations from one population tend to be larger than those from another population, based on independent random samples $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$.
Examples:
- Treated patients tend to live longer than untreated patients.
- An equity fund tends to have a higher yield than a bond fund.

26 Wilcoxon-Mann-Whitney Test
Example: Time to Failure of Two Capacitor Groups
Reject for extreme values of $w_1$.

27 Stochastic Ordering of Populations
X is stochastically larger than Y ($X \succ Y$) if for all real numbers u,
$$P(X > u) \ge P(Y > u),$$
or equivalently,
$$P(X \le u) = F_1(u) \le F_2(u) = P(Y \le u),$$
with strict inequality for at least some u. (Denoted by $X \succ Y$ or equivalently by $F_1 < F_2$.)

28 Stochastic Ordering Special Case: Location Difference
$\theta$ is called a location parameter.
Notice that $X_1 \prec X_2$ iff $\theta_1 < \theta_2$.

29 Wilcoxon-Mann-Whitney Test
$H_0: F_1 = F_2$ ($X \sim Y$)
Alternatives:
- One-sided: $H_1: F_1 < F_2$ ($X \succ Y$)
- Two-sided: $H_1: F_1 < F_2$ or $F_2 < F_1$ ($X \succ Y$ or $Y \succ X$)
Notice that the alternative is not $H_1: F_1 \neq F_2$. (The Kolmogorov-Smirnov test can handle this alternative.)

30 Wilcoxon Version of the Test
$H_0: F_1 = F_2$ ($X \sim Y$) vs. $H_1: F_1 < F_2$ ($X \succ Y$)
1. Rank all $N = n_1 + n_2$ observations, $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$, in ascending order.
2. Sum the ranks of the x's and y's separately. Denote these sums by $w_1$ and $w_2$.
3. Reject $H_0$ if $w_1$ is large or, equivalently, if $w_2$ is small.

31 Mann-Whitney Test Version
The advantage of using the Mann-Whitney form of the test is that the same distribution applies whether we use $u_1$ or $u_2$:
$$P\text{-value} = P(U \ge u_1) = P(U \le u_2).$$

32 Null Distribution of the Wilcoxon-Mann-Whitney Test Statistic
Under the null hypothesis each of these 10 orderings has an equal chance of occurring, namely $1/\binom{5}{2} = 1/10$.

33 Null Distribution of the Wilcoxon-Mann-Whitney Test Statistic
$$P(W_1 \ge 8) = \frac{2}{10} = 0.2$$
(one-sided P-value for $w_1 = 8$; $H_1: X \succ Y$)
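The ten equally likely orderings can be enumerated directly. This sketch assumes sample sizes $n_1 = 2$ and $n_2 = 3$, which is what the count of 10 orderings and the tail probability 0.2 imply; `rank_sum_null` is an illustrative name.

```python
from itertools import combinations

def rank_sum_null(n1, n2):
    """All equally likely values of W1 under H0: rank sums of every n1-subset of 1..N."""
    N = n1 + n2
    return [sum(c) for c in combinations(range(1, N + 1), n1)]

sums = rank_sum_null(2, 3)
p_value = sum(s >= 8 for s in sums) / len(sums)  # P(W1 >= 8)
print(len(sums), p_value)
```

Only the rank sets {3, 5} and {4, 5} give a rank sum of at least 8, hence 2 of the 10 orderings.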

34 Normal Approximation of Mann-Whitney Statistic
For large $n_1$ and $n_2$, the null distribution of U can be well approximated by a normal distribution with mean and variance given by
$$E(U) = \frac{n_1 n_2}{2} \quad\text{and}\quad \mathrm{Var}(U) = \frac{n_1 n_2 (N+1)}{12}.$$
A large-sample one-sided z-test ($H_1: X \succ Y$) can be based on the statistic
$$z = \frac{u_1 - n_1 n_2/2}{\sqrt{n_1 n_2 (N+1)/12}}.$$
A continuity correction subtracts 1/2 from the numerator.

35 Treatment of Ties
- A tie occurs when some x equals a y.
- A contribution of ½ is counted towards both $u_1$ and $u_2$ for each tied pair.
- This is equivalent to using the midrank method in computing the Wilcoxon rank sum statistic.
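The equivalence between the ½-per-tie rule and midranks can be checked numerically through the identity $u_1 = w_1 - n_1(n_1+1)/2$. The data below are hypothetical values chosen to include a tie; the function names are illustrative.

```python
def mann_whitney_u(x, y):
    """u1 counts pairs with x > y, adding 1/2 for each tied pair; u2 = n1*n2 - u1."""
    u1 = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0 for xi in x for yj in y)
    return u1, len(x) * len(y) - u1

def rank_sum(x, y):
    """w1 = sum of midranks of the x's in the pooled sample."""
    pooled = sorted(x + y)

    def midrank(v):
        first = pooled.index(v) + 1          # 1-based rank of the first copy of v
        return first + (pooled.count(v) - 1) / 2

    return sum(midrank(v) for v in x)

x = [3, 5, 5]       # hypothetical sample 1
y = [1, 5, 7, 2]    # hypothetical sample 2 (one value tied with the x's)
u1, u2 = mann_whitney_u(x, y)
w1 = rank_sum(x, y)
print(u1, u2, w1, w1 - len(x) * (len(x) + 1) / 2)
```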

36 Wilcoxon-Mann-Whitney Confidence Interval
Example 14.8 shows that $[d_{(18)}, d_{(63)}] = [-1.1, 14.7]$ is a 95.6% CI for the difference of the two medians of the failure times of capacitors. This example is in the book errata since Table A.11 is not detailed enough.

37 Wilcoxon-Mann-Whitney Test in JMP
- $z^2 = \ldots$
- With continuity correction: used in the book, which gets a one-sided p-value of …
- Without continuity correction

38 Inference for Several Independent Samples: Kruskal-Wallis Test
Note that this is a completely randomized design.

39 Kruskal-Wallis Test
$H_0: F_1 = F_2 = \cdots = F_a$ vs. $H_1: F_i < F_j$ for some $i \neq j$
Reject $H_0$ if $kw > \chi^2_{a-1,\alpha}$.
(The statistic $kw$ measures the distance of the group average ranks from the overall average rank.)

40 Chi-Square Approximation
For large samples the distribution of KW under the null hypothesis can be approximated by the chi-square distribution with $a-1$ degrees of freedom. So reject $H_0$ if $kw > \chi^2_{a-1,\alpha}$.
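The Kruskal-Wallis statistic itself did not survive in the transcript, so the sketch below uses the standard textbook form $kw = \frac{12}{N(N+1)} \sum_i n_i (\bar{r}_i - \frac{N+1}{2})^2$ as an assumption. It also assumes no ties, and the three groups are hypothetical data.

```python
def kruskal_wallis(groups):
    """Kruskal-Wallis statistic for a list of samples (assumes no tied values)."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank in the pooled sample
    N = len(pooled)
    total = 0.0
    for g in groups:
        mean_rank = sum(rank[v] for v in g) / len(g)
        total += len(g) * (mean_rank - (N + 1) / 2) ** 2
    return 12 / (N * (N + 1)) * total

groups = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # hypothetical samples, a = 3
kw = kruskal_wallis(groups)
print(kw)  # compare with the chi-square critical value on a - 1 = 2 degrees of freedom
```

The result would then be compared with a tabled $\chi^2_{a-1,\alpha}$ value, as described above.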

41 Kruskal-Wallis Test Example
Reject if kw is large. $\chi^2_{3,.005} = \ldots$

42 Kruskal-Wallis Test in JMP

44 The Case method is different from the Unitary method. The Formula method is different from the Unitary method.

45 Pairwise Comparisons: Is Any Pair of Treatments Different?
One can use the Tukey method on the average ranks to make approximate pairwise comparisons. This is one of many approximate techniques where ranks are substituted for the observations in the normal theory methods.


48 Tukey's Test Applied to the Ranks Averaged
Lack of agreement with the more precise method of Example …
Here the Equation method also seems to be different from the Formula and Case methods.

49 Example of Friedman's Test
Ranking is done within blocks.
$\chi^2_{7,.025} = \ldots$
P-value = .0040 vs. … for the ANOVA table

50 Inference for Several Matched Samples
Randomized block design: $a \ge 2$ treatment groups, $b \ge 2$ blocks
- $y_{ij}$ = observation on the i-th treatment in the j-th block
- $F_{ij}$ = c.d.f. of the r.v. $Y_{ij}$ corresponding to the observed value $y_{ij}$
For simplicity assume $F_{ij}(y) = F(y - \theta_i - \beta_j)$, where
- $\theta_i$ is the "treatment effect"
- $\beta_j$ is the "block effect"
i.e., we assume that there is no treatment by block interaction.

51 Friedman Test
$H_0: \theta_1 = \theta_2 = \cdots = \theta_a$ vs. $H_1: \theta_i > \theta_j$ for some $i \neq j$
Reject $H_0$ if $fr > \chi^2_{a-1,\alpha}$.
(The statistic $fr$ measures the distance of the rank totals from their expected value when there is no agreement between the blocks.)
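The formula for $fr$ is not preserved in the transcript. The sketch below assumes the standard form $fr = \frac{12}{ab(a+1)} \sum_i r_i^2 - 3b(a+1)$, where $r_i$ is the rank sum of treatment i after ranking within each block, and it assumes no ties within a block. The input data are hypothetical.

```python
def friedman(blocks):
    """Friedman statistic; blocks[j][i] = observation on treatment i in block j."""
    b, a = len(blocks), len(blocks[0])
    rank_sums = [0] * a
    for row in blocks:
        order = sorted(range(a), key=lambda i: row[i])  # ranking is done within the block
        for r, i in enumerate(order, start=1):
            rank_sums[i] += r
    fr = 12 / (a * b * (a + 1)) * sum(r * r for r in rank_sums) - 3 * b * (a + 1)
    return fr, rank_sums

# hypothetical data: 4 blocks that all order the 3 treatments identically
blocks = [[1.0, 2.0, 3.0], [0.5, 0.9, 1.4], [10, 20, 30], [2, 3, 7]]
fr, rank_sums = friedman(blocks)
print(fr, rank_sums)
```

When every block agrees on the ordering, the statistic reaches its maximum value $b(a-1)$, which is 8 here.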

52 Pairwise Comparisons

53 Rank Correlation Methods
- The Pearson correlation coefficient
  - measures only the degree of linear association between two variables;
  - inferences use the assumption of bivariate normality of the two variables.
- We present two correlation coefficients that
  - take into account only the ranks of the observations;
  - measure the degree of monotonic (increasing or decreasing) association between two variables.

54 Motivating Example
$(x, y) = (1, e^1), (2, e^2), (3, e^3), (4, e^4), (5, e^5)$
Note that there is a perfect positive association between x and y, with $y = e^x$.
The Pearson correlation coefficient is only about 0.886 because the relationship is not linear.
The rank correlation coefficients we present yield a value of 1 for these data.
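The contrast between the two kinds of coefficient can be verified on the five points $(x, e^x)$, $x = 1, \ldots, 5$. In this sketch the Spearman coefficient is computed as the Pearson correlation of the ranks, which is one standard definition when there are no ties; the helper names are illustrative.

```python
from math import exp, sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(v):
    """1-based ranks in ascending order (assumes no ties)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r

x = [1, 2, 3, 4, 5]
y = [exp(t) for t in x]
r_pearson = pearson(x, y)                 # below 1: the relationship is not linear
r_spearman = pearson(ranks(x), ranks(y))  # exactly monotone, so the ranks agree
print(r_pearson, r_spearman)
```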

55 Spearman's Rank Correlation Coefficient
Ranges between −1 and +1, with $r_S = -1$ when there is a perfect negative association and $r_S = +1$ when there is a perfect positive association.

56 Example (Wine Consumption and Heart Disease Deaths per 100,000)

58 Calculation of Spearman's Rho

59 Test for Association Based on Spearman's Rank Correlation Coefficient

60 Hypothesis Testing Example
$H_0$: X = Wine Consumption and Y = Heart Disease Deaths are independent
vs. $H_1$: X and Y are (negatively or positively) associated
$z = r_S \sqrt{n-1} = \ldots$
Two-sided P-value = …
Evidence of negative association

61 JMP Calculations: Pearson Correlation
[Figure: scatterplot of Heart Disease Deaths versus Alcohol from Wine]
The plot is fairly linear.

62 JMP Calculations: Spearman Rank Correlation

63 Kendall's Rank Correlation Coefficient: Key Concept
Examples
- Concordant pairs:
  - (1,2), (4,9): (1−4)(2−9) > 0
  - (4,2), (3,1): (4−3)(2−1) > 0
- Discordant pairs:
  - (1,2), (9,1): (1−9)(2−1) < 0
  - (2,4), (3,1): (2−3)(4−1) < 0
- Tied pairs:
  - (1,3), (1,5): (1−1)(3−5) = 0
  - (1,4), (2,4): (1−2)(4−4) = 0
  - (1,2), (1,2): (1−1)(2−2) = 0
Kendall's idea is to compare the number of concordant pairs to the number of discordant pairs in bivariate data.

64 Kendall's Tau Example
(X, Y): (1, 2), (3, 4), (2, 1)
Number of pairwise comparisons: $N = \binom{n}{2} = \binom{3}{2} = 3$
Concordant pairs: (1,2) (3,4); (3,4) (2,1); $N_c = 2$
Discordant pairs: (1,2) (2,1); $N_d = 1$
$$\hat{\tau} = \frac{N_c - N_d}{N} = \frac{2 - 1}{3} = \frac{1}{3}$$
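The pair counting in this example can be reproduced with a short loop over all pairs of observations. This is a sketch for data without ties; `kendall_counts` is an illustrative name.

```python
from itertools import combinations

def kendall_counts(pairs):
    """Return (Nc, Nd): numbers of concordant and discordant pairs of observations."""
    nc = nd = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            nc += 1
        elif s < 0:
            nd += 1
    return nc, nd

data = [(1, 2), (3, 4), (2, 1)]
nc, nd = kendall_counts(data)
N = len(data) * (len(data) - 1) // 2
tau_hat = (nc - nd) / N
print(nc, nd, tau_hat)
```

Pairs with a product of zero are tied and are counted in neither total, which is why a tie correction is needed in the general formula.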

65 Kendall's Rank Correlation Coefficient: Population Version

66 Kendall's Rank Correlation Coefficient: Sample Estimate
Let $N_c$ = number of concordant pairs in the data.
Let $N_d$ = number of discordant pairs in the data.
Let $N = \binom{n}{2}$ be the number of pairwise comparisons among the observations $(x_i, y_i)$, $i = 1, 2, \ldots, n$. Then
$$\hat{\tau} = \frac{N_c - N_d}{N} \quad\text{and}\quad N_c + N_d = N \quad\text{if there are no ties},$$
$$\hat{\tau} = \frac{N_c - N_d}{\sqrt{(N - T_x)(N - T_y)}} \quad\text{if there are ties},$$
where $T_x$ and $T_y$ are corrections for the number of tied pairs.

67 Hypothesis of Independence Versus Positive Association
Wine data: …

68 JMP Calculations: Kendall's Rank Correlation Coefficient

69 Kendall's Coefficient of Concordance
- Measure of association between several matched samples
- Closely related to Friedman's test statistic
- Consider a candidates (treatments) and b judges (blocks), with each judge ranking the a candidates.
- If there is perfect agreement between the judges, then each candidate gets the same rank from every judge. Assuming the candidates are labeled in the order of their ranking, the rank sum for the i-th candidate would be $r_i = ib$.
- If the judges rank the candidates completely at random ("perfect disagreement"), then the expected rank of each candidate would be $[1 + 2 + \cdots + a]/a = [a(a+1)/2]/a = (a+1)/2$, and the expected value of each rank sum would equal $b(a+1)/2$.

70 Kendall's Coefficient of Concordance

71 Kendall's Coefficient of Concordance and Friedman's Test
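The formulas on these two slides are not preserved. The sketch below assumes the standard definition of the coefficient of concordance as the ratio of the observed spread of the rank sums around $b(a+1)/2$ to the spread under perfect agreement, and the standard relation $w = fr / (b(a-1))$ to the Friedman statistic. The rank sums used are hypothetical.

```python
def kendall_w(rank_sums, b):
    """Coefficient of concordance from the a rank sums of b judges (no ties)."""
    a = len(rank_sums)
    expected = b * (a + 1) / 2                       # rank sum under random ranking
    d = sum((r - expected) ** 2 for r in rank_sums)
    d_max = b * b * a * (a * a - 1) / 12             # value of d when r_i = i * b
    return d / d_max

def friedman_from_rank_sums(rank_sums, b):
    """Friedman statistic written in terms of the same rank sums."""
    a = len(rank_sums)
    return 12 / (a * b * (a + 1)) * sum(r * r for r in rank_sums) - 3 * b * (a + 1)

rank_sums, b = [5, 7, 12], 4                         # hypothetical: a = 3 candidates, 4 judges
w = kendall_w(rank_sums, b)
fr = friedman_from_rank_sums(rank_sums, b)
print(w, fr / (b * (len(rank_sums) - 1)))            # the two values coincide
```

A value of 1 corresponds to perfect agreement between the judges and a value of 0 to rank sums that all equal their expected value.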

72 $w = \ldots = \dfrac{\ldots}{\ldots\,(8-1)} = \ldots$

73 Do You Need to Know More?
Nonparametric Statistical Methods, Second Edition, by Myles Hollander and Douglas A. Wolfe (1999), Wiley-Interscience.

74 Resampling Methods
- Conventional methods are based on the sampling distribution of a statistic computed for the observed sample. The sampling distribution is derived by considering all possible samples of size n from the underlying population.
- Resampling methods generate the sampling distribution of the statistic by drawing repeated samples from the observed sample itself. This eliminates the need to assume a specific functional form for the population distribution (e.g., normal).

75 Challenger Shuttle O-Ring Data
- Do we have statistical evidence that cold temperature leads to more O-ring incidents?
- Notice that the assumptions of the two-sample t test do not hold.
- The original analysis omitted the zeros. Was this justified?
- What do we do?

76 Wrong t-test Analysis
Difference of Low mean to High mean
Notice that the assumptions of the independent-samples t-test do not hold, i.e., the data are not normal for each group.

77 Permutation Distribution of t Statistic
Also equal to the two-sided p-value
Equivalent to selecting all simple random samples without replacement of size 20 from the 24 data points, labeling these High and the rest Low.
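The idea of enumerating every relabeling can be sketched on a toy data set. Two simplifications are assumptions of this example: it uses the difference in means rather than the t statistic, and the five data values are made up (the Challenger counts are not reproduced in these notes). With the 24 real observations the same loop would visit all $\binom{24}{4} = 10626$ ways of choosing the Low group.

```python
from itertools import combinations

def permutation_p_value(low, high):
    """One-sided exact permutation P-value for mean(low) - mean(high)."""
    pooled = low + high
    n_low, n_high = len(low), len(high)
    observed = sum(low) / n_low - sum(high) / n_high
    total = sum(pooled)
    hits = count = 0
    for idx in combinations(range(len(pooled)), n_low):
        s_low = sum(pooled[i] for i in idx)            # relabel these values as Low
        diff = s_low / n_low - (total - s_low) / n_high
        hits += diff >= observed - 1e-12               # tolerance guards float round-off
        count += 1
    return hits / count

low = [3, 4]        # hypothetical values
high = [0, 1, 2]    # hypothetical values
p = permutation_p_value(low, high)
print(p)
```

Here only one of the 10 relabelings gives a difference as large as the observed one.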

78 Comments
- A randomization test is a permutation test applied to data from a randomized experiment. Randomization tests are the gold standard for establishing causality.
- A permutation test considers all possible simple random samples without replacement from the set of observed data values.
- The bootstrap method considers a large number of simple random samples with replacement from the set of observed data values.

79 Calculation of t Statistics from 10,000 Bootstrap Samples
Think of placing the 24 Challenger data values in a hat and randomly selecting 24 values with replacement from the hat, labeling the first 20 values High and the remaining 4 values Low. We repeat this process 10,000 times. For each of these 10,000 bootstrap samples we calculate the t-statistic.
35 of the 10,000 t-statistic values were greater than or equal to … (if $s_p = 0$, t is defined to be 0). This gives a bootstrap P-value of 35/10000 = 0.0035.
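The hat-drawing procedure can be sketched as follows. The 24 counts below are illustrative stand-ins chosen only to match the stated group sizes (20 High, 4 Low); they are not taken from these notes, and the P-value this code produces depends on the random seed, so it should not be expected to reproduce the 35/10000 reported above.

```python
import random
from math import sqrt

def pooled_t(low, high):
    """Two-sample pooled-variance t statistic; defined as 0 when s_p = 0."""
    n1, n2 = len(low), len(high)
    m1, m2 = sum(low) / n1, sum(high) / n2
    ss = sum((v - m1) ** 2 for v in low) + sum((v - m2) ** 2 for v in high)
    sp = sqrt(ss / (n1 + n2 - 2))
    if sp == 0:
        return 0.0
    return (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))

def bootstrap_p_value(low, high, n_boot=10_000, seed=0):
    """Share of bootstrap t statistics at least as large as the observed one."""
    rng = random.Random(seed)
    hat = high + low                                   # all values go into one hat
    t_obs = pooled_t(low, high)
    hits = 0
    for _ in range(n_boot):
        draw = rng.choices(hat, k=len(hat))            # sample with replacement
        boot_high, boot_low = draw[:len(high)], draw[len(high):]
        hits += pooled_t(boot_low, boot_high) >= t_obs
    return hits / n_boot

high = [0] * 17 + [1, 1, 2]   # illustrative stand-in values
low = [1, 1, 1, 3]            # illustrative stand-in values
print(bootstrap_p_value(low, high))
```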

80 Bootstrap Distribution of Difference Between the Means
67 of the 10,000 differences of the Low mean and the High mean were greater than or equal to 1.3. This gives a bootstrap P-value of 67/10000 = 0.0067.
Conclusion: Cold weather increases the chance of O-ring problems.

81 Bootstrap Final Remarks
- The JMP files that we used to generate the bootstrap samples and to calculate the statistics are available at the course web site.
- There are bootstrap procedures for most types of statistical problems. All are based on resampling from the data.
- These methods do not assume specific functional forms for the distribution of the data (e.g., normal).
- The accuracy of bootstrap procedures depends on the sample size and the number of bootstrap samples generated.

82 How Were the Bootstrap Samples Generated? (see next page)

87 Calculated Columns in JMP Samples File

95 Bootstrap Estimate of the Standard Error of the Mean
Summary: We calculate the standard deviation of the N bootstrap estimates of the mean.

96 BSE for Arbitrary Statistic
Example: The bootstrap standard error of the median is calculated by drawing a large number N (e.g., 10,000) of bootstrap samples from the data. For each bootstrap sample we calculate the sample median. Then we calculate the standard deviation of the N bootstrap medians.

97 Estimated Bootstrap Standard Error for t-statistics Using JMP
Note: N = 10,000

98 Bootstrap Standard Error Interpretation
- Many bootstrap statistics have an approximately normal distribution.
- Confidence interval interpretation:
  - 68% of the time the bootstrap estimate (the average of the bootstrap estimates) will be within one standard error of the true parameter value.
  - 95% of the time the bootstrap estimate (the average of the bootstrap estimates) will be within two standard errors of the true parameter value.

99 Bootstrap Confidence Intervals
Percentile method: median example
1. Draw N (= 10,000) bootstrap samples from the data and for each calculate the (bootstrap) sample median.
2. The 2.5th percentile of the N bootstrap sample medians will be the lower confidence limit (LCL) for a 95% confidence interval.
3. The 97.5th percentile of the N bootstrap sample medians will be the upper confidence limit (UCL) for a 95% confidence interval.
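The bootstrap standard error of the median and the percentile interval can both be obtained from the same set of bootstrap medians. The sketch below uses hypothetical data and a simple nearest-rank percentile rule, which is one of several conventions; the function names are illustrative.

```python
import random
from statistics import median, stdev

def bootstrap_medians(data, n_boot=10_000, seed=0):
    """Sorted medians of n_boot samples drawn with replacement from the data."""
    rng = random.Random(seed)
    return sorted(median(rng.choices(data, k=len(data))) for _ in range(n_boot))

def percentile(sorted_vals, q):
    """Nearest-rank percentile of an already sorted list, 0 <= q <= 1."""
    idx = round(q * (len(sorted_vals) - 1))
    return sorted_vals[idx]

data = [12.1, 9.8, 15.3, 11.0, 30.2, 10.4, 13.7, 9.1, 14.9, 12.6]  # hypothetical sample
meds = bootstrap_medians(data, n_boot=10_000, seed=1)
se = stdev(meds)                                               # bootstrap standard error
lcl, ucl = percentile(meds, 0.025), percentile(meds, 0.975)    # 95% percentile interval
print(se, lcl, ucl)
```

Fixing the seed makes the run reproducible; a different seed or a different N would change the reported limits slightly.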

100 Do You Need to Know More?
An Introduction to the Bootstrap by Bradley Efron and Robert J. Tibshirani (1993), Chapman & Hall/CRC.


More information
