Unit 14: Nonparametric Statistical Methods
- Elijah Greene
1 Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 7/26/2004 Unit 14 - Stat Ramón V. León 1
2 Introductory Remarks
Most methods studied so far have been based on the assumption of normally distributed data. Frequently this assumption is not valid, and the sample size may be too small to verify it. Sometimes the data are measured on an ordinal scale.
Nonparametric or distribution-free statistical methods make very few assumptions about the form of the population distribution from which the data are sampled. They are based on ranks, so they can be used on ordinal data.
We will concentrate on hypothesis tests but will also mention confidence interval procedures.
3 Inference for a Single Sample
Consider a random sample \(x_1, x_2, \ldots, x_n\) from a population with unknown median \(\mu\). (Recall that for nonnormal, especially skewed, distributions the median is a better measure of the center than the mean.)
\(H_0: \mu = \mu_0\) vs. \(H_1: \mu > \mu_0\)
Example: Test whether the median household income of a population exceeds $50,000 based on a random sample of household incomes from that population.
For simplicity we sometimes present methods for one-sided tests. Modifications for two-sided tests are straightforward and are given in the textbook. Some examples in these notes are two-sided tests.
4 Sign Test for a Single Sample
\(H_0: \mu = \mu_0\) vs. \(H_1: \mu > \mu_0\)
Sign test:
1. Count the number of \(x_i\)'s that exceed \(\mu_0\). Denote this number by \(s_+\), called the number of plus signs. Let \(s_- = n - s_+\), which is the number of minus signs.
2. Reject \(H_0\) if \(s_+\) is large or, equivalently, if \(s_-\) is small.
Test idea: Under the null hypothesis \(S_+\) has a binomial distribution, Bin(n, 1/2), so this test is simply the test for a binomial proportion.
5 Sign Test Example
A thermostat used in an electric device is to be checked for the accuracy of its design setting of 200ºF. Ten thermostats were tested to determine their actual settings, resulting in the following data: 202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, [tenth value missing]
\(H_0: \mu = 200\) vs. \(H_1: \mu \ne 200\)
\(s_+ = 8\) = number of data values > 200, so
P-value \(= 2\sum_{i=8}^{10} \binom{10}{i} \left(\tfrac{1}{2}\right)^{10} = 2\sum_{i=0}^{2} \binom{10}{i} \left(\tfrac{1}{2}\right)^{10} \approx 0.109\)
(The t test based on the mean can also be applied; however, recall that the t test assumes a normal population.)
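The binomial tail sum in the thermostat example can be checked numerically. The following is a minimal sketch that uses only the counts given on the slide (n = 10, s+ = 8); the function name `sign_test_p_value` is a hypothetical helper, not something from the course materials.

```python
from math import comb

def sign_test_p_value(n, s_plus, two_sided=True):
    """Exact sign-test P-value from the Bin(n, 1/2) null distribution."""
    # Upper tail: P(S+ >= s_plus); lower tail: P(S+ <= s_plus).
    upper_tail = sum(comb(n, i) for i in range(s_plus, n + 1)) / 2 ** n
    lower_tail = sum(comb(n, i) for i in range(0, s_plus + 1)) / 2 ** n
    if not two_sided:
        return upper_tail  # alternative: median > mu0
    # Two-sided: double the smaller tail, capped at 1.
    return min(1.0, 2 * min(upper_tail, lower_tail))

p_one_sided = sign_test_p_value(10, 8, two_sided=False)
p_two_sided = sign_test_p_value(10, 8)
print(p_one_sided, p_two_sided)
```

The one-sided value is 56/1024 and the two-sided value is twice that, which is why the sign test does not reject at the 10% level in the two-sided case.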
6 Normal Approximation to Test Statistic
If the sample size is large (\(n \ge 20\)), the common null distribution of \(S_+\) and \(S_-\) is approximated by a normal distribution with
\(E(S_+) = E(S_-) = np = \dfrac{n}{2}\) and \(\mathrm{Var}(S_+) = \mathrm{Var}(S_-) = np(1-p) = \dfrac{n}{4}\).
Therefore we can perform a one-sided z-test with
\(z = \dfrac{s_+ - n/2}{\sqrt{n/4}}\)
7 P-values for Sign Test Using JMP
Based on the normal approximation to the binomial (the reported statistic equals \(z^2\)).
8 Treatment of Ties
The theory of the test assumes that the distribution of the data is continuous, so in theory ties are impossible. In practice they do occur because of rounding.
A simple solution is to ignore the ties and work only with the untied observations. This does reduce the effective sample size of the test and hence its power, but the loss is small if there are only a few ties.
9 Confidence Interval for \(\mu\)
Let \(x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}\) be the ordered data values. Then a \((1-\alpha)\)-level CI for \(\mu\) is given by
\(x_{(b+1)} \le \mu \le x_{(n-b)}\)
where \(b = b_{n,1-\alpha/2}\) is the lower \(\alpha/2\) critical point of the Bin(n, 1/2) distribution.
Note: Not all confidence levels are possible because of the discreteness of the binomial distribution.
10 Thermostat Setting: Sign Confidence Interval for the Median
From Table A.1 we see that for n = 10 and p = 0.5, the lower critical point of the binomial distribution is 1 and by symmetry the upper critical point is 9. Setting \(\alpha/2 = 0.011\), which gives \(1-\alpha = 1 - 0.022 = 0.978\), we find that
\(x_{(2)} \le \mu \le x_{(9)}\)
is a 97.8% CI for \(\mu\).
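The table lookup for the sign confidence interval can be reproduced from the Bin(n, 1/2) distribution directly. This is a sketch under the convention stated on the slides (b is the lower critical point, and the interval runs from the (b+1)-th to the (n-b)-th ordered value); `sign_ci_ranks` is a hypothetical helper name.

```python
from math import comb

def sign_ci_ranks(n, alpha):
    """Find b, the largest k with P(Bin(n, 1/2) <= k) <= alpha / 2.

    Returns (b, lower_rank, upper_rank, achieved_level); the interval is
    [x_(b+1), x_(n-b)] in the sorted data.
    """
    cdf = 0.0
    b = None
    for k in range(n + 1):
        next_cdf = cdf + comb(n, k) / 2 ** n
        if next_cdf > alpha / 2:
            break
        cdf, b = next_cdf, k
    if b is None:
        raise ValueError("n is too small for the requested level")
    # Discreteness means the achieved level is usually above 1 - alpha.
    return b, b + 1, n - b, 1 - 2 * cdf

b, lower_rank, upper_rank, level = sign_ci_ranks(10, 0.05)
print(b, lower_rank, upper_rank, round(level, 3))
```

Asking for a nominal 95% interval with n = 10 yields b = 1 and an achieved level of about 97.8%, which illustrates the remark that not all confidence levels are attainable.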
11 Sign Test for Matched Pairs
Drop 3 tied pairs. Then \(s_+ = 20\); \(s_- = 3\).
12 Sign Test for Matched Pairs
13 Sign Test for Matched Pairs in JMP
Pearson's p-value is not the same as the book's two-sided P-value because the book uses the continuity correction in the normal approximation to the binomial distribution, i.e., the book uses \(z = (20 - 11.5 - 0.5)/\sqrt{23/4} \approx 3.34\) (page 567) rather than the \(z = (20 - 11.5)/\sqrt{23/4} \approx 3.54\) used by JMP. Note that the square of JMP's z gives the Pearson chi-square statistic.
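The difference between the two z values comes only from the continuity correction. A minimal sketch, using the matched-pairs counts from slide 11 (s+ = 20, s- = 3, so n = 23) and assuming the upper-tail form of the statistic; `sign_test_z` is a hypothetical helper name.

```python
from math import sqrt

def sign_test_z(s_plus, n, continuity=False):
    """Normal-approximation z for the sign test (upper-tail form)."""
    correction = 0.5 if continuity else 0.0
    return (s_plus - n / 2 - correction) / sqrt(n / 4)

n = 20 + 3  # untied pairs only
z_uncorrected = sign_test_z(20, n)                  # no continuity correction
z_corrected = sign_test_z(20, n, continuity=True)   # with continuity correction
print(z_uncorrected, z_corrected, z_uncorrected ** 2)
```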
14 Wilcoxon Signed Rank Test
\(H_0: \mu = \mu_0\) vs. \(H_1: \mu \ne \mu_0\)
More powerful than the sign test; however, it requires the assumption that the population distribution is symmetric.
Example 14.1 and 14.4: Thermostat setting is 200ºF.
1. Rank and order the differences \(x_i - \mu_0\) in terms of their absolute value.
2. Calculate \(w_+\) = sum of the ranks of the positive differences.
3. Reject \(H_0\) if \(w_+\) is large or small.
15 Wilcoxon Signed Rank Test in JMP
This test finds a significant difference at α = 0.05, while the sign test did not, even at α = 0.1.
16 Normal Approximation in the Wilcoxon Signed Rank Test
For large n, the null distribution of \(W\) (\(= W_+\) or \(W_-\)) can be well approximated by a normal distribution with mean and variance given by
\(E(W) = \dfrac{n(n+1)}{4}\) and \(\mathrm{Var}(W) = \dfrac{n(n+1)(2n+1)}{24}\)
For large samples a one-sided (greater than median) z-test uses the statistic
\(z = \dfrac{w_+ - n(n+1)/4 - 1/2}{\sqrt{n(n+1)(2n+1)/24}}\)
17 Importance of Symmetric Population Assumption
Here, even though \(H_0\) is true, the long right-hand tail makes the positive differences tend to be larger in magnitude than the negative differences, resulting in higher ranks. This inflates \(w_+\) and hence the test's type I error probability.
18 Null Distribution of the Wilcoxon Signed Rank Statistic
19 Null Distribution of the Wilcoxon Signed Rank Statistic
20 Wilcoxon Signed Rank Statistic: Treatment of Ties
There are two types of ties:
1. Some of the data are equal to the median. Drop these observations.
2. Some of the differences from the median may be tied. Use the midrank, that is, the average rank.
For example, suppose \(d_1 = -1,\ d_2 = +3,\ d_3 = -3,\ d_4 = +5\). Then
\(r_1 = 1,\quad r_2 = r_3 = \dfrac{2 + 3}{2} = 2.5,\quad r_4 = 4\)
With ties, Table A.10 is only approximate.
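The midrank rule and the resulting signed rank statistic can be sketched in a few lines. The example differences are the ones on the slide; the helper names `midranks` and `signed_rank_w_plus` are hypothetical.

```python
def midranks(values):
    """Rank values in ascending order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # average of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks

def signed_rank_w_plus(diffs):
    """Sum of the midranks of |d| over the positive differences (zeros dropped)."""
    nonzero = [d for d in diffs if d != 0]
    ranks = midranks([abs(d) for d in nonzero])
    return sum(r for d, r in zip(nonzero, ranks) if d > 0)

example_ranks = midranks([abs(d) for d in [-1, 3, -3, 5]])
w_plus = signed_rank_w_plus([-1, 3, -3, 5])
print(example_ranks, w_plus)
```

Here the two differences with absolute value 3 share rank 2.5, so w+ is the sum of the ranks attached to +3 and +5.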
21 Wilcoxon Signed Rank Test: Matched Pair Design
Example 14.5: Comparing Two Methods of Cardiac Output
Notice that we drop the three zero differences and that we average the tied ranks.
The slide reports two-sided P-values for the sign test, the signed rank test, and the t-test (page 284). (Notice that these tests require progressively more stringent assumptions about the population of differences.)
22 JMP Calculation
23 Signed Rank Confidence Interval for the Median
24 Thermostat Setting: Wilcoxon Signed Rank Confidence Interval for Median
From Table A.10 we see that for n = 10, the upper 2.4% critical point is 47 and by symmetry the lower 2.4% critical point is
\(\dfrac{10(10+1)}{2} - 47 = 55 - 47 = 8\).
Setting \(\alpha/2 = 0.024\) and hence \(1-\alpha = 1 - 0.048 = 0.952\), we find that
\(\bar{x}_{(9)} \le \mu \le \bar{x}_{(47)}\)
is a 95.2% CI for \(\mu\), where \(\bar{x}_{(k)}\) denotes the k-th ordered pairwise (Walsh) average.
25 Inferences for Two Independent Samples
One wants to show that the observations from one population tend to be larger than those from another population, based on independent random samples
\(x_1, x_2, \ldots, x_{n_1}\) and \(y_1, y_2, \ldots, y_{n_2}\)
Examples:
Treated patients tend to live longer than untreated patients.
An equity fund tends to have a higher yield than a bond fund.
26 Wilcoxon-Mann-Whitney Test
Example: Time to Failure of Two Capacitor Groups
Reject for extreme values of \(w_1\).
27 Stochastic Ordering of Populations
X is stochastically larger than Y (\(X \succ Y\)) if for all real numbers u,
\(P(X > u) \ge P(Y > u)\)
or equivalently,
\(P(X \le u) = F_1(u) \le F_2(u) = P(Y \le u)\)
with strict inequality for at least some u. (Denoted by \(X \succ Y\) or equivalently by \(F_1 < F_2\).)
28 Stochastic Ordering Special Case: Location Difference
\(\theta\) is called a location parameter.
Notice that \(X_1 \prec X_2\) iff \(\theta_1 < \theta_2\).
29 Wilcoxon-Mann-Whitney Test
\(H_0: F_1 = F_2\) (\(X \sim Y\))
Alternatives:
One-sided: \(H_1: F_1 < F_2\) (\(X \succ Y\))
Two-sided: \(H_1: F_1 < F_2\) or \(F_2 < F_1\) (\(X \succ Y\) or \(Y \succ X\))
Notice that the alternative is not \(H_1: F_1 \ne F_2\). (The Kolmogorov-Smirnov test can handle this alternative.)
30 Wilcoxon Version of the Test
\(H_0: F_1 = F_2\) (\(X \sim Y\)) vs. \(H_1: F_1 < F_2\) (\(X \succ Y\))
1. Rank all \(N = n_1 + n_2\) observations, \(x_1, x_2, \ldots, x_{n_1}\) and \(y_1, y_2, \ldots, y_{n_2}\), in ascending order.
2. Sum the ranks of the x's and y's separately. Denote these sums by \(w_1\) and \(w_2\).
3. Reject \(H_0\) if \(w_1\) is large or, equivalently, \(w_2\) is small.
31 Mann-Whitney Test Version
The advantage of using the Mann-Whitney form of the test is that the same distribution applies whether we use \(u_1\) or \(u_2\).
P-value \(= P(U \ge u_1) = P(U \le u_2)\)
32 Null Distribution of the Wilcoxon-Mann-Whitney Test Statistic
Under the null hypothesis each of these 10 orderings has an equal chance of occurring, namely 1/10.
33 Null Distribution of the Wilcoxon-Mann-Whitney Test Statistic
\(P(W_1 \ge 8) = 0.2\) (one-sided p-value for \(w_1 = 8\), \(H_1: X \succ Y\))
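The null distribution of the rank sum can be enumerated by listing every way of assigning ranks to the first sample. The sketch below assumes sample sizes n1 = 2 and n2 = 3, which is one configuration that gives the 10 equally likely orderings mentioned above (the slide text itself does not state the sizes); `rank_sum_upper_tail` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import combinations

def rank_sum_upper_tail(n1, n2, w1):
    """Exact P(W1 >= w1) under H0, with no ties.

    Under H0 every subset of n1 ranks out of 1..N is equally likely
    to be the set of ranks received by the x sample.
    """
    total_ranks = range(1, n1 + n2 + 1)
    subsets = list(combinations(total_ranks, n1))
    hits = sum(1 for s in subsets if sum(s) >= w1)
    return Fraction(hits, len(subsets))

p_value = rank_sum_upper_tail(2, 3, 8)
print(p_value)
```

Only the rank pairs (3, 5) and (4, 5) have a sum of at least 8, so two of the ten orderings count toward the tail.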
34 Normal Approximation of Mann-Whitney Statistic
For large \(n_1\) and \(n_2\), the null distribution of U can be well approximated by a normal distribution with mean and variance given by
\(E(U) = \dfrac{n_1 n_2}{2}\) and \(\mathrm{Var}(U) = \dfrac{n_1 n_2 (N+1)}{12}\)
A large-sample one-sided z-test (\(H_1: X \succ Y\)) can be based on the statistic
\(z = \dfrac{u_1 - n_1 n_2/2}{\sqrt{n_1 n_2 (N+1)/12}}\)
(a continuity correction of 1/2 may be subtracted in the numerator).
35 Treatment of Ties
A tie occurs when some x equals some y. A contribution of ½ is counted towards both \(u_1\) and \(u_2\) for each tied pair. This is equivalent to using the midrank method in computing the Wilcoxon rank sum statistic.
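The tie rule can be illustrated with a direct pair count. This sketch assumes the usual definition in which u1 counts the pairs with x greater than y (the defining slide did not survive transcription), and the data are made-up values chosen to include ties; `mann_whitney_u` is a hypothetical helper name.

```python
def mann_whitney_u(xs, ys):
    """Count pairs with x > y (1 each) and x == y (1/2 each) for u1; u2 is the rest."""
    u1 = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u1 += 1.0
            elif x == y:
                u1 += 0.5  # a tied pair contributes 1/2 to both u1 and u2
    u2 = len(xs) * len(ys) - u1
    return u1, u2

# Hypothetical data with two tied (x, y) pairs.
xs = [3, 5, 5]
ys = [1, 5, 6]
u1, u2 = mann_whitney_u(xs, ys)

# Midranks of the pooled data 1, 3, 5, 5, 5, 6 are 1, 2, 4, 4, 4, 6,
# so the x sample (3, 5, 5) has rank sum w1 = 2 + 4 + 4 = 10.
w1 = 10
print(u1, u2, w1 - len(xs) * (len(xs) + 1) / 2)
```

The last printed value equals u1, consistent with the statement that the ½ rule matches the midrank version of the rank sum statistic.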
36 Wilcoxon-Mann-Whitney Confidence Interval
Example 14.8 shows that \([d_{(18)}, d_{(63)}] = [-1.1, 14.7]\) is a 95.6% CI for the difference of the two medians of the failure times of capacitors. This example is in the book errata since Table A.11 is not detailed enough.
37 Wilcoxon-Mann-Whitney Test in JMP
JMP output is shown with and without the continuity correction. The continuity-corrected version is the one used in the book to obtain its one-sided p-value.
38 Inference for Several Independent Samples: Kruskal-Wallis Test
Note that this is a completely randomized design.
39 Kruskal-Wallis Test
\(H_0: F_1 = F_2 = \cdots = F_a\) vs. \(H_1: F_i < F_j\) for some \(i \ne j\)
Reject \(H_0\) if \(kw > \chi^2_{a-1,\alpha}\)
(kw is based on the distance of each group's average rank from the overall average rank.)
40 Chi-Square Approximation
For large samples the distribution of KW under the null hypothesis can be approximated by the chi-square distribution with \(a-1\) degrees of freedom. So reject \(H_0\) if \(kw > \chi^2_{a-1,\alpha}\).
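The formula slide for kw did not survive transcription, so the following sketch assumes the standard textbook form \(kw = \frac{12}{N(N+1)} \sum_i n_i \left(\bar{r}_i - \frac{N+1}{2}\right)^2\), which matches the "distance from the average rank" description. The data are made up and the sketch does not handle ties; `kruskal_wallis` is a hypothetical helper name.

```python
def kruskal_wallis(groups):
    """Kruskal-Wallis statistic for a list of samples (assumes no tied values)."""
    pooled = sorted(v for g in groups for v in g)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no tied observations")
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank in the pooled sample
    n_total = len(pooled)
    total = 0.0
    for g in groups:
        mean_rank = sum(rank[v] for v in g) / len(g)
        # Squared distance of the group's average rank from the overall average rank.
        total += len(g) * (mean_rank - (n_total + 1) / 2) ** 2
    return 12 / (n_total * (n_total + 1)) * total

# Hypothetical, perfectly separated groups: average ranks 2, 5 and 8.
groups = [[1.1, 2.3, 3.0], [4.2, 5.5, 6.1], [7.4, 8.8, 9.9]]
kw = kruskal_wallis(groups)
print(kw)
```

The resulting value would then be compared with the upper-α critical point of the chi-square distribution with a - 1 = 2 degrees of freedom.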
41 Kruskal-Wallis Test Example
Reject if kw is large.
\(\chi^2_{3,.005} = 12.838\)
42 Kruskal-Wallis Test in JMP
44 The Case method is different from the Unitary method. The Formula method is different from the Unitary method.
45 Pairwise Comparisons: Is Any Pair of Treatments Different?
One can use the Tukey method on the average ranks to make approximate pairwise comparisons. This is one of many approximate techniques where ranks are substituted for the observations in the normal theory methods.
48 Tukey's Test Applied to the Ranks Averaged
Lack of agreement with the more precise method of the textbook example. Here the Equation method also seems to be different from the Formula and Case methods.
49 Example of Friedman's Test
Ranking is done within blocks.
\(\chi^2_{7,.025} = 16.013\)
P-value = .0040 (compare with the P-value from the ANOVA table).
50 Inference for Several Matched Samples
Randomized block design: \(a \ge 2\) treatment groups, \(b \ge 2\) blocks.
\(y_{ij}\) = observation on the i-th treatment in the j-th block
\(F_{ij}\) = c.d.f. of the r.v. \(Y_{ij}\) corresponding to the observed value \(y_{ij}\)
For simplicity assume \(F_{ij}(y) = F(y - \theta_i - \beta_j)\), where \(\theta_i\) is the "treatment effect" and \(\beta_j\) is the "block effect"; i.e., we assume that there is no treatment by block interaction.
51 Friedman Test
\(H_0: \theta_1 = \theta_2 = \cdots = \theta_a\) vs. \(H_1: \theta_i > \theta_j\) for some \(i \ne j\)
Reject \(H_0\) if \(fr > \chi^2_{a-1,\alpha}\)
(fr is based on the distance of the rank totals from their expected value when there is no agreement between the blocks.)
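The formula for fr is not preserved in the transcription, so this sketch assumes the standard form \(fr = \frac{12}{ab(a+1)} \sum_i \left(r_i - \frac{b(a+1)}{2}\right)^2\), where \(r_i\) is the rank sum of treatment i and ranking is done within blocks. The data are hypothetical and contain no within-block ties; `friedman_statistic` is a hypothetical helper name.

```python
def friedman_statistic(blocks):
    """blocks[j] holds the a treatment responses observed in block j (no ties)."""
    b = len(blocks)
    a = len(blocks[0])
    rank_sums = [0] * a
    for block in blocks:
        # Rank the a treatments within this block only.
        order = sorted(range(a), key=lambda i: block[i])
        for rank, i in enumerate(order, start=1):
            rank_sums[i] += rank
    expected = b * (a + 1) / 2  # rank sum expected if blocks rank at random
    fr = 12 / (a * b * (a + 1)) * sum((r - expected) ** 2 for r in rank_sums)
    return rank_sums, fr

# Hypothetical data: 4 blocks (rows), 3 treatments (columns).
blocks = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 3.5],
    [2.0, 1.0, 3.0],
    [1.0, 3.0, 2.0],
]
rank_sums, fr = friedman_statistic(blocks)
print(rank_sums, fr)
```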
52 Pairwise Comparisons
53 Rank Correlation Methods
The Pearson correlation coefficient measures only the degree of linear association between two variables. Inferences use the assumption of bivariate normality of the two variables.
We present two correlation coefficients that take into account only the ranks of the observations and that measure the degree of monotonic (increasing or decreasing) association between two variables.
54 Motivating Example
\((x, y) = (1, e^1), (2, e^2), (3, e^3), (4, e^4), (5, e^5)\)
Note that there is a perfect positive association between x and y, with \(y = e^x\). The Pearson correlation coefficient is only about 0.886 because the relationship is not linear. The rank correlation coefficients we present yield a value of 1 for these data.
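The contrast between the two coefficients on these five points can be checked with a short computation. In this sketch the Spearman coefficient is obtained by applying the Pearson formula to the ranks, which is valid here because there are no ties; `pearson` and `simple_ranks` are hypothetical helper names.

```python
from math import exp, sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simple_ranks(values):
    """Ranks 1..n in ascending order (assumes all values are distinct)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

xs = [1, 2, 3, 4, 5]
ys = [exp(x) for x in xs]

r_pearson = pearson(xs, ys)
r_spearman = pearson(simple_ranks(xs), simple_ranks(ys))
print(r_pearson, r_spearman)
```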
55 Spearman's Rank Correlation Coefficient
Ranges between −1 and +1, with \(r_S = -1\) when there is a perfect negative association and \(r_S = +1\) when there is a perfect positive association.
56 Example: Wine Consumption and Heart Disease Deaths per 100,000
58 Calculation of Spearman's Rho
59 Test for Association Based on Spearman's Rank Correlation Coefficient
60 Hypothesis Testing Example
\(H_0\): X = Wine Consumption and Y = Heart Disease Deaths are independent vs. \(H_1\): X and Y are (negatively or positively) associated.
Test statistic: \(z = r_S \sqrt{n-1}\), with a two-sided P-value.
Conclusion: Evidence of negative association.
61 JMP Calculations: Pearson Correlation
Plot of Heart Disease Deaths against Alcohol from Wine; the plot is fairly linear. Pearson correlation.
62 JMP Calculations: Spearman Rank Correlation
63 Kendall's Rank Correlation Coefficient: Key Concept
Examples
Concordant pairs: (1,2), (4,9): (1−4)(2−9) > 0; (4,2), (3,1): (4−3)(2−1) > 0
Discordant pairs: (1,2), (9,1): (1−9)(2−1) < 0; (2,4), (3,1): (2−3)(4−1) < 0
Tied pairs: (1,3), (1,5): (1−1)(3−5) = 0; (1,4), (2,4): (1−2)(4−4) = 0; (1,2), (1,2): (1−1)(2−2) = 0
Kendall's idea is to compare the number of concordant pairs to the number of discordant pairs in bivariate data.
64 Kendall's Tau Example
(X, Y): (1, 2), (3, 4), (2, 1)
Number of pairwise comparisons: \(N = \binom{n}{2} = \binom{3}{2} = 3\)
Concordant pairs: (1,2) with (3,4); (3,4) with (2,1); so \(N_c = 2\)
Discordant pairs: (1,2) with (2,1); so \(N_d = 1\)
\(\hat{\tau} = \dfrac{N_c - N_d}{N} = \dfrac{2 - 1}{3} = \dfrac{1}{3}\)
65 Kendall's Rank Correlation Coefficient: Population Version
66 Kendall's Rank Correlation Coefficient: Sample Estimate
Let \(N_c\) = number of concordant pairs in the data.
Let \(N_d\) = number of discordant pairs in the data.
Let \(N = \binom{n}{2}\) be the number of pairwise comparisons among the observations \((x_i, y_i),\ i = 1, 2, \ldots, n\). Then
\(\hat{\tau} = \dfrac{N_c - N_d}{N}\), with \(N_c + N_d = N\), if there are no ties;
\(\hat{\tau} = \dfrac{N_c - N_d}{\sqrt{(N - T_x)(N - T_y)}}\) if there are ties,
where \(T_x\) and \(T_y\) are corrections for the number of tied pairs.
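The no-ties estimate can be computed by classifying every pair of observations. This sketch uses the three-point example from slide 64 and deliberately refuses tied pairs, since the tie corrections are not spelled out in the transcription; `kendall_tau_no_ties` is a hypothetical helper name.

```python
from itertools import combinations

def kendall_tau_no_ties(points):
    """Return (Nc, Nd, tau_hat) for bivariate data without tied pairs."""
    n_concordant = n_discordant = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            n_concordant += 1
        elif sign < 0:
            n_discordant += 1
        else:
            raise ValueError("tied pair found; this sketch covers the no-ties case only")
    n_pairs = n_concordant + n_discordant  # equals n choose 2 when there are no ties
    return n_concordant, n_discordant, (n_concordant - n_discordant) / n_pairs

nc, nd, tau_hat = kendall_tau_no_ties([(1, 2), (3, 4), (2, 1)])
print(nc, nd, tau_hat)
```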
67 Hypothesis of Independence Versus Positive Association (wine data)
68 JMP Calculations: Kendall's Rank Correlation Coefficient
69 Kendall's Coefficient of Concordance
A measure of association between several matched samples, closely related to Friedman's test statistic.
Consider a candidates (treatments) and b judges (blocks), with each judge ranking the a candidates.
If there is perfect agreement between the judges, then each candidate gets the same rank from every judge. Assuming the candidates are labeled in the order of their ranking, the rank sum for the i-th candidate would be \(r_i = ib\).
If the judges rank the candidates completely at random ("perfect disagreement"), then the expected rank of each candidate would be \([1 + 2 + \cdots + a]/a = [a(a+1)/2]/a = (a+1)/2\), and the expected value of each rank sum would equal \(b(a+1)/2\).
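The two extreme cases above can be checked numerically. The definition slide is not preserved in the transcription, so this sketch assumes the standard form \(w = \frac{12 \sum_i (r_i - b(a+1)/2)^2}{b^2 a (a^2 - 1)}\), which equals 1 under perfect agreement and 0 when all rank sums equal their expected value; `kendalls_w` is a hypothetical helper name.

```python
def kendalls_w(rank_rows):
    """rank_rows[j][i] = rank given by judge j to candidate i (no ties)."""
    b = len(rank_rows)       # number of judges (blocks)
    a = len(rank_rows[0])    # number of candidates (treatments)
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(a)]
    expected = b * (a + 1) / 2
    s = sum((r - expected) ** 2 for r in rank_sums)
    return 12 * s / (b ** 2 * a * (a ** 2 - 1))

# Perfect agreement: three judges give identical rankings, so r_i = i * b.
w_agree = kendalls_w([[1, 2, 3, 4]] * 3)

# Two judges with exactly opposite rankings: every rank sum equals b(a+1)/2.
w_opposed = kendalls_w([[1, 2, 3, 4], [4, 3, 2, 1]])
print(w_agree, w_opposed)
```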
70 Kendall's Coefficient of Concordance
71 Kendall's Coefficient of Concordance and Friedman's Test
72 \(w = \dfrac{fr}{b(a-1)}\), with \(a - 1 = 8 - 1\) in this example
73 Do You Need to Know More?
Nonparametric Statistical Methods, Second Edition, by Myles Hollander and Douglas A. Wolfe (1999), Wiley-Interscience.
74 Resampling Methods
Conventional methods are based on the sampling distribution of a statistic computed for the observed sample. The sampling distribution is derived by considering all possible samples of size n from the underlying population.
Resampling methods generate the sampling distribution of the statistic by drawing repeated samples from the observed sample itself. This eliminates the need to assume a specific functional form for the population distribution (e.g., normal).
75 Challenger Shuttle O-Ring Data
Do we have statistical evidence that cold temperature leads to more O-ring incidents? Notice that the assumptions of the two-sample t test do not hold. The original analysis omitted the zeros. Was this justified? What do we do?
76 Wrong t-test Analysis
Difference of Low mean to High mean.
Notice that the assumptions of the independent-sample t-test do not hold, i.e., the data are not normal for each group.
77 Permutation Distribution of t Statistic
The P-value shown is also equal to the two-sided p-value.
The permutation distribution is equivalent to selecting all simple random samples without replacement of size 20 from the 24 data points, labeling these High and the rest Low.
78 Comments
A randomization test is a permutation test applied to data from a randomized experiment. Randomization tests are the gold standard for establishing causality.
A permutation test considers all possible simple random samples without replacement from the set of observed data values.
The bootstrap method considers a large number of simple random samples with replacement from the set of observed data values.
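A permutation test of this kind can be sketched by enumerating every relabeling of the pooled data. The counts below are made-up illustrative values, not the Challenger data, and the statistic is the difference in group means rather than the t statistic used on the slides; `permutation_p_value` is a hypothetical helper name.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """One-sided exact permutation P-value for mean(a) - mean(b).

    Every subset of size len(group_a) drawn without replacement from the
    pooled data is treated as an equally likely relabeling under H0.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    count = hits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for float comparison
            hits += 1
    return hits / count

# Hypothetical incident counts for a small "Low" and "High" temperature group.
low = [1, 1, 1, 3]
high = [0, 0, 0, 0, 1, 2]
p_perm = permutation_p_value(low, high)
print(p_perm)
```

With 10 pooled values and a group of size 4 there are 210 relabelings, so full enumeration is cheap; for larger samples one would typically sample relabelings at random instead.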
79 Calculation of t Statistics from 10,000 Bootstrap Samples
Think of placing the 24 Challenger data values in a hat and randomly selecting 24 values with replacement from the hat, labeling the first 20 values High and the remaining 4 values Low. We repeat this process 10,000 times. For each of these 10,000 bootstrap samples we calculate the t-statistic.
35 of the 10,000 t-statistic values were greater than or equal to the observed value (if \(s_p = 0\), t is defined to be 0). This gives a bootstrap P-value of 35/10000 = 0.0035.
80 Bootstrap Distribution of Difference Between the Means
67 of the 10,000 differences of the Low mean and the High mean were greater than or equal to 1.3. This gives a bootstrap P-value of 67/10000 = 0.0067.
Conclusion: Cold weather increases the chance of O-ring problems.
81 Bootstrap Final Remarks
The JMP files that we used to generate the bootstrap samples and to calculate the statistics are available at the course web site.
There are bootstrap procedures for most types of statistical problems. All are based on resampling from the data. These methods do not assume specific functional forms for the distribution of the data (e.g., normal).
The accuracy of bootstrap procedures depends on the sample size and the number of bootstrap samples generated.
82 How Were the Bootstrap Samples Generated?
87 Calculated Columns in JMP Samples File
95 Bootstrap Estimate of the Standard Error of the Mean
Summary: We calculate the standard deviation of the N bootstrap estimates of the mean.
96 BSE for Arbitrary Statistic
Example: The bootstrap standard error of the median is calculated by drawing a large number N of bootstrap samples from the data. For each bootstrap sample we calculate the sample median. Then we calculate the standard deviation of the N bootstrap medians.
97 Estimated Bootstrap Standard Error for t-statistics Using JMP
Note N = 10,000.
98 Bootstrap Standard Error Interpretation
Many bootstrap statistics have an approximate normal distribution.
Confidence interval interpretation:
68% of the time the bootstrap estimate (the average of the bootstrap estimates) will be within one standard error of the true parameter value.
95% of the time the bootstrap estimate (the average of the bootstrap estimates) will be within two standard errors of the true parameter value.
99 Bootstrap Confidence Intervals
Percentile Method: Median Example
1. Draw N (= 10,000) bootstrap samples from the data and for each calculate the (bootstrap) sample median.
2. The 2.5th percentile of the N bootstrap sample medians will be the LCL for a 95% confidence interval.
3. The 97.5th percentile of the N bootstrap sample medians will be the UCL for a 95% confidence interval.
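The bootstrap standard error and the percentile interval for the median can be sketched with the standard library alone. The data values, the seed, and the nearest-rank percentile rule are all assumptions made for illustration (the course used JMP, whose percentile convention may differ); `bootstrap_medians` and `percentile` are hypothetical helper names.

```python
import random
from statistics import median, stdev

def bootstrap_medians(data, n_boot=10000, seed=0):
    """Medians of n_boot samples of size len(data) drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    return [median(rng.choices(data, k=n)) for _ in range(n_boot)]

def percentile(sorted_vals, q):
    """Simple nearest-rank percentile of an ascending list, q in [0, 100]."""
    k = int(round(q / 100 * (len(sorted_vals) - 1)))
    return sorted_vals[max(0, min(len(sorted_vals) - 1, k))]

# Hypothetical sample of 10 measurements.
data = [12.1, 9.8, 15.3, 11.0, 10.4, 13.7, 9.1, 14.2, 10.9, 12.6]

meds = sorted(bootstrap_medians(data))
bse = stdev(meds)             # bootstrap standard error of the median
lcl = percentile(meds, 2.5)   # lower limit of the 95% percentile interval
ucl = percentile(meds, 97.5)  # upper limit of the 95% percentile interval
print(bse, lcl, ucl)
```

Because each bootstrap median is a value (or an average of two values) from the observed sample, the interval endpoints always lie within the range of the data.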
100 Do You Need to Know More?
An Introduction to the Bootstrap by Bradley Efron and Robert J. Tibshirani (1993), Chapman & Hall/CRC.
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
More informationUsing Excel for inferential statistics
FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied
More informationTwo-Sample T-Tests Assuming Equal Variance (Enter Means)
Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationStandard Deviation Estimator
CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of
More informationPaired T-Test. Chapter 208. Introduction. Technical Details. Research Questions
Chapter 208 Introduction This procedure provides several reports for making inference about the difference between two population means based on a paired sample. These reports include confidence intervals
More informationHow To Check For Differences In The One Way Anova