Comparing Means in Two Populations


 Antony Carpenter
1 Comparing Means in Two Populations
2 Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we'll consider how to compare sample means from two populations. Towards the end of the course, we'll discuss comparing means from more than two populations. When we're comparing the means from two independent samples we usually ask: How does one sample mean compare with the other?
3 However, focusing just on comparing the means can be premature. It's safer to first consider the variability of each sample, the pattern of any outliers, and the shape of the distributions. Then it may be safe to assume a normal distribution, but not always. So we'll also discuss approaches to answering these questions when we're not comfortable with the assumption of normality, or when this assumption is just not defensible.
4 Two Sample Means Cities and counties: Returning to the Alabama SOL pass rates, is there a difference between cities and counties? Recall that last time we looked at a difference across years in the same population. Here, we want to look at the difference in one year between two populations: city high schools and county high schools.
5 Phase 1: State the Question 1. Evaluate and describe the data Begin by looking at the data. Where did the data come from? What are the observed statistics? The source of this data is the Alabama Department of Education. The first step in any data analysis is evaluating and describing the data. These first steps are also called preliminary analysis, to distinguish them from the definitive (or outcome) analysis.
6 Preliminary Analyses The goal of a preliminary analysis is to describe and inform: to give a description of the data. Keep in mind that the goal of a definitive analysis is decision making or hypothesis testing.
7 Preliminary Analysis What are the observed statistics? Use the Fit Y by X platform to look at a graphical and tabular summary of the data, as in the next figure.
8
9 Note: Previously, in Step 1, we used the Distribution of Y reports to identify and fix errors and to further understand the data. You could use the Distribution of Y platform, but you would have to run it twice; here you can see everything in one place. There are three components to this figure: the dot plots, the box plots, and the quantiles table. Let's look at each.
10 The Dot plot A dot plot shows the continuous Y-variable's (Algebra I 2000 pass rate) values along the vertical axis and the nominal X-variable's (City Yes or No) values along the horizontal axis. So we see the two groups along the horizontal axis: City = No and City = Yes. The width of each group is proportional to its sample size; there are more No (non-city) values, so that group is drawn wider. This follows the "your eye goes to ink" rule: groups with larger samples are more informative than groups with smaller samples, so the group with the larger n is drawn bigger.
11 Dot plots Along the vertical axis, we see the 10th grade, year 2000 Algebra I SOL pass rate, with one dot for each high school. Values range from 0% passing to 100% passing. The horizontal spreading of the values is done so that you can see each school's score better (called "Jittered Points"). The amount of horizontal spread is random, so don't try to interpret points farther to the right or left differently from points closer to the center (horizontally). Of course, the vertical values are interpretable; that is, a school at the top has a higher pass rate than a school at the bottom.
12 Box Plots These side-by-side box plots describe the shape of the distributions within each group. These plots do not assume normality, so we use them to begin to answer the question, "Is each group normally distributed?" In the box plots we can easily see whether the values are symmetric about the median. Look for these warning flags that the data is not normal:
13 Box Plots Is the distance between the median and the 75th percentile different from the distance between the median and the 25th percentile? Is the upper whisker bar (actually, the 90th percentile) more distant from the median than the lower whisker bar (the 10th percentile)? Are the high extreme tail values more distant from the median than the low extreme tail values? These informal, graphic assessments don't raise any warning flags for these data. The dotted horizontal line represents the mean value for all schools (not considering the group).
14 Quantiles Report If a more detailed comparison of values is needed, the numerical values plotted in the box plots are shown in the Quantiles Report.
15 Quantiles Report For instance, in the City = No group, the distance from the median to the 75th percentile (~49 vs 65, 16 points) is about the same as the distance between the median and the 25th percentile (~49 vs 34, about 15 points). But, as in the Distribution platform, the preferred way to answer the question, "Is each group normally distributed?" is with a normal quantile plot.
16 Normal Quantile Plots Actually, the more proper phrasing of the question is: "Within each group, is each group normally distributed?" That is, it may be that if we were to lump both groups together, the data would appear non-normal. We must take group membership into account when making this assessment.
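The symmetry check described above can be sketched in a few lines of Python. This is only an illustration: the function name `quartile_gaps` and the sample values are hypothetical (they are not the SOL data), and the quantile method used here may not match the one JMP uses, so the numbers can differ slightly from a JMP Quantiles Report.

```python
import statistics

def quartile_gaps(values):
    """Return (median - Q1, Q3 - median) for a list of numbers."""
    # "inclusive" treats the data as the whole population of interest;
    # other quantile definitions give slightly different cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return median - q1, q3 - median

# Hypothetical pass rates (percent) for one group, for illustration only.
pass_rates = [12, 25, 34, 41, 49, 52, 58, 65, 77, 90]
lower_gap, upper_gap = quartile_gaps(pass_rates)
print(f"median to 25th percentile: {lower_gap:.1f}")
print(f"median to 75th percentile: {upper_gap:.1f}")
```

Gaps of similar size are consistent with a symmetric distribution; a large imbalance is one of the warning flags listed for the box plots.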
17
18 Interpretation Follow the same interpretation of the normal quantile plot as we discussed with a single mean. Does each group of black dots fall along a straight line? In the SOL data, these two sets of points follow the lines fairly well, with some departure in the tails.
19 Normality So, we now have enough information to answer the question, "Within each group, is each group normally distributed?" If the answer is "Yes" or "Probably," then we can proceed with parametric tests to compare the means. The Central Limit Theorem can apply if the sample size is large; the rule of thumb is a total n of at least 30 ($n_1 + n_2 \ge 30$). If the answer to the normality question is "No" or "I doubt it," then we'll use nonparametric methods to answer our question.
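For readers without JMP, the coordinates of a normal quantile plot can be sketched as follows, one group at a time. The plotting positions (i - 0.5)/n are one common convention and are an assumption here; JMP may use a different one. The `straightness` helper (a plain correlation between the two axes) is a hypothetical numerical aid added for illustration, not something the slides use, and it does not replace looking at the plot.

```python
import math
from statistics import NormalDist, fmean

def normal_quantile_points(values):
    """Pair each sorted value with its expected standard-normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting position (i - 0.5) / n is an assumed convention.
    return [(std_normal.inv_cdf((i - 0.5) / n), y)
            for i, y in enumerate(ordered, start=1)]

def straightness(points):
    """Pearson correlation between theoretical quantiles and observed values."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: assess each group separately, never the pooled values.
groups = {"City = No": [34, 41, 49, 52, 58, 65, 77], "City = Yes": [18, 27, 36, 40, 51]}
for name, values in groups.items():
    r = straightness(normal_quantile_points(values))
    print(f"{name}: correlation with a straight line = {r:.3f}")
```

Values close to 1 suggest the points lie near a straight line; the slope of that line reflects the group's standard deviation, which matters later when comparing variability.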
20 Preliminary analysis, showing means If the data is normally distributed then means and SDs make sense. If these distributional assumptions are unwarranted, then we should consider nonparametric methods. Thus, the next thing to do in our preliminary analysis may be to get rid of the box plots and quantile plot and to show the means and standard deviations calculated within each group.
21
22 Means This figure shows the means, here connected with a line, and short dashed bars that mark one standard error above and below each mean. The long dashed lines above and below the means are one standard deviation away from their respective mean. The means, standard errors, and standard deviations can be shown by selecting Means and Std Dev from the main Options menu. From the Display Options submenu in the Options menu, select the options necessary.
23 Means and SDs Report We can use it to describe the following: the number of observations in each group, the mean of each group, and the standard deviation within each group. Recall that the SE is not a descriptive statistic for the data; it is used for inference about the mean. JMP includes the SE here because it is used to form confidence intervals about the mean.
24 Note: You can change the number of decimal places displayed in any JMP report: Double-click a number in the report. A dialog will appear. Change the number of decimal places.
25 Summary: Preliminary Analysis So far, what have we learned about the data? We have not found any errors in the data. We're comfortable with the assumption of normality within each group. We've obtained descriptive statistics for each of the groups we're comparing.
26 Preliminary Results Also, at this point, we can look at the two means and make a guess: is there a difference between cities and counties? City schools seem to be about 12 points below non-city schools, and with SEs < 3, this seems like a big difference. Recall that the t-statistic is the ratio of the difference to a standard error. The ratio of 12 to 3 is bigger than 2.
27 2. Review assumptions As always there are three questions to consider: Is the process used in this study likely to yield data that is representative of each of the two populations? Yes, it is the population. Is each observation in the two samples independent of the others? Yes. Is the sample size sufficient? Yes, both groups are large and we're comfortable with normality for both groups.
28 Bottom line We have to be comfortable that the first two assumptions are met before we can proceed at all. If we're comfortable with the normality assumption, then we proceed, as below. Later, we'll discuss what to do when normality cannot be safely assumed.
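As a rough stand-in for the Means and Std Deviations report, the sketch below computes n, mean, SD, and SE for each group. The function name `describe` and the data values are hypothetical and chosen only for illustration.

```python
import math
import statistics

def describe(values):
    """Return n, mean, sample SD, and standard error of the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)      # sample SD (divides by n - 1)
    se = sd / math.sqrt(n)             # SE of the mean, used for inference
    return {"n": n, "mean": mean, "sd": sd, "se": se}

# Hypothetical pass rates (percent) keyed by the value of the X-variable.
groups = {"No": [34, 41, 49, 52, 58, 65, 77], "Yes": [18, 27, 36, 40, 51]}
for level, values in groups.items():
    row = describe(values)
    print(f"City = {level}: n = {row['n']}, mean = {row['mean']:.1f}, "
          f"SD = {row['sd']:.1f}, SE = {row['se']:.2f}")
```

The SD describes the spread of the schools; the SE describes the uncertainty in the mean and shrinks as n grows.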
29 3. State the question in the form of hypotheses Let's refer to the two groups as 1 and 2 for notational purposes. Using these as subscripts, there are three possible null hypotheses: 1. The null hypothesis is $\mu_1 \ge \mu_2$, 2. The null hypothesis is $\mu_1 \le \mu_2$, or 3. The null hypothesis is a fixed value, $\mu_1 = \mu_2$. And the alternative hypothesis is the opposite of the null.
30 Phase 2: Decide How to Answer the Question 4. Decide on a summary statistic that reflects the question Recall the general test statistic: $$\text{test statistic} = \frac{\text{summary statistic} - \text{hypothesized parameter}}{\text{standard error of the summary statistic}}$$
31 Difference Score In this situation (as in comparing paired means), we are going to use the difference score as our summary statistic: The relevant statistic is $\bar{y}_1 - \bar{y}_2$, the observed difference of the two means.
32 The hypothesized parameter is easy: $\mu_1 - \mu_2 = 0$, since under any of the three null hypotheses a difference of 0 would result in failing to reject the null hypothesis.
33 Standard Error What about the standard error? There are two possibilities for the standard error of $\bar{y}_1 - \bar{y}_2$. The two possibilities depend upon the two standard deviations within each group. o Are they the same? o Or do the two groups have different standard deviations?
34 Same SD If the standard deviations (or variances) within the two populations are equal, then the standard error of the difference is easy. We just take a weighted average of the two estimated variances to obtain a pooled estimate. The variance of the mean difference is the sum of the variances (the squared standard errors) of the two means:
35 Same SD $$\operatorname{Var}(\bar{y}_1 - \bar{y}_2) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}$$
36 The t-statistic We'll use the t-test to compare the two sample means and, using a pooled estimate for the variance called $s_p^2$, we calculate: $$t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$
37 Estimating σ The pooled variance estimate is a weighted average of the two individual-group variances: $$s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}$$ Under the equal variance assumption, we calculate the p-value using $df = n_1 + n_2 - 2$.
38 Unequal SD If the variances are not equal, the calculation is more complicated: $$t' = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
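The pooled calculation can be checked from summary statistics alone. The sketch below uses the rounded values quoted in the write-up later in this document (means 48.8 and 35.9, SDs 22.1 and 19.9, group sizes 306 and 92). Treating the larger group (n = 306) as the county schools is an assumption based on the dot plot description; the function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t from summary statistics.

    Returns (t, df, standard error of the difference).
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two group variances.
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(sp2 / n1 + sp2 / n2)
    return (mean1 - mean2) / se, df, se

# Group 1 = county schools, group 2 = city schools (assumed assignment of n).
t, df, se = pooled_t(48.8, 22.1, 306, 35.9, 19.9, 92)
print(f"t = {t:.2f}, df = {df}, SE of difference = {se:.2f}")
```

With these rounded inputs the result should be close to the reported t = 5.0 with df = 396. Turning t into a p-value requires the t distribution, which is left to statistical software here.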
39 t-prime Note that the separate variance estimates are used in this t-prime statistic, not the pooled estimate of the variance. Further, the df is not a simple function of just $n_1$ and $n_2$. The details of these calculations are not important. What we need to know is how to proceed using JMP.
40 Deciding on the correct t-test Which test should we use? We may not need to choose; if the two sample sizes are equal ($n_1 = n_2$) the two methods give identical results. It's even pretty close if the n's are slightly different. If one n is more than 1.5 times the other (in the SOL case, $n_1 = 306$ and $n_2 = 92$, so one is over 3 times as large as the other), you'll have to decide which t-test to use.
41 Decision Decide whether the standard deviations are different. Use the equal variance t-test if they are the same, or use the unequal variance t-test if they are different. Or, you could decide not to decide: use the unequal variance t-test. It's more conservative.
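For the curious, here is a minimal sketch of the t-prime calculation from summary statistics. The degrees-of-freedom formula used is the standard Welch-Satterthwaite approximation; the slides do not give it, so treat it as an assumption about what the software computes. The function name `welch_t` is hypothetical, and the inputs are the rounded SOL summary values with n = 306 assumed to be the county group.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance t statistic and approximate df from summary statistics."""
    v1 = sd1**2 / n1                    # squared SE of mean 1
    v2 = sd2**2 / n2                    # squared SE of mean 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t_prime, df_prime = welch_t(48.8, 22.1, 306, 35.9, 19.9, 92)
print(f"t' = {t_prime:.2f}, df = {df_prime:.1f}")
```

When the two groups have equal n and equal SD, this df reduces to $n_1 + n_2 - 2$ and t' equals the pooled t, which is why the choice matters mainly when the sample sizes differ.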
42 Determining Equal SDs There are three ways to make this decision. 1. Inspect the two standard deviation estimates. 2. Use the normal quantile plot. 3. Test for equal standard deviations.
43 Inspect the SDs Refer to the Means and Std Deviations report. Look at the two standard deviations, in this case 22.1 and 19.9. Form the ratio of the largest to the smallest. If the ratio is larger than about 3, then the two SDs may be unequal (in our case, the ratio is about 1.1, well below that). For a better answer to the question, see the normal quantile plot.
44 Normal Quantile Plot, SDs If the two standard deviations are equal then the slopes for the two lines in the normal quantile plot will be the same (the lines will be parallel). In our case the lines have roughly the same slope. So, for the SOL data the assumption of equal variability seems safe.
45 Questionable Parallel? If the slopes of the two lines are in that gray area between clearly parallel and clearly not parallel, what do we do? There are four possibilities: 1. Ignore the problem and be risky: use the equal variance t-test. 2. Ignore the problem and be conservative: use the unequal variance t-test. 3. Make a formal test of unequal variability in the two groups. 4. Compare the means using nonparametric methods.
46 What if not Parallel? Here, the data appear to be reasonably normal (this isn't the question), but the lines are not parallel: they start out close and end far apart.
47 Here, not only do the variances appear to be unequal, but normality is also in question. We'll look at this case later.
48 Test for equal SDs In the case of the first figure, where we'll be using a t-test but we're not sure which one, JMP provides a way to test for equal variance in the main options menu.
49
50 Choosing between the tests of equal variance Of the five tests (O'Brien's, the Brown-Forsythe test, Levene's test, Bartlett's test, and the F-test), the last three are out of date and are not recommended. There's not much difference between O'Brien's and Brown-Forsythe. Brown-Forsythe is more robust (resistant to outlying observations), so we'll use this result. What are we testing?
51 Variance test The null hypothesis for each of these tests is that the variances are equal. So, if the Prob > F value for the Brown-Forsythe test is < 0.05, then you will reject the null hypothesis (universal decision rule) and conclude that the groups have unequal variances.
52 Reject Equal SDs? The report also shows the result for the t-test to compare the two means, allowing the standard deviations to be unequal. This is the unequal variance t-test. Here is a written summary of the results using this method: The two groups were compared using an unequal variance t-test and found to be significantly different (t = 7.1, df = 217.4, p-value < …). School districts in cities had lower scores.
53 Unequal test df? Notice the degrees of freedom: it isn't a whole number. That is because it is based on a weighted contribution of each sample (with unequal n's) to the standard error estimate. You can round the number, but ONLY to one decimal place; do not round to a whole number.
54 Step 5: Random Variation Recall the rough interpretation that t-values larger than 2 (in absolute value) are unlikely to be due to chance alone. For either type of t
55 6. State a decision rule The universal decision rule: Reject $H_0$ if p-value < α.
56 Phase 3: Answer the Question 7. Calculate the statistic There are three possible statistics that may be appropriate: 1. an equal variance t-test, 2. an unequal variance t-test, or 3. the nonparametric Wilcoxon rank-sum test.
57 Equal variance If the equal variance assumption is reasonable, then the standard t-test is appropriate. Note: When reporting a t-test it's assumed that, unless you specify otherwise, it's the equal-variance t-test. The next figure shows the means diamonds in the dot plot, the t-test report, and the means for a one-way ANOVA (Analysis Of VAriance) report. We'll cover one-way ANOVA later in the course. When there are only two groups, the t-test and ANOVA give identical results.
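To make the Brown-Forsythe idea concrete, here is a minimal sketch of its usual textbook definition: replace each value with its absolute deviation from its own group median, then compute a one-way ANOVA F statistic on those deviations. This is not taken from JMP's documentation, and the function name `brown_forsythe_f` and the data are hypothetical. The p-value (Prob > F) needs the F distribution, which the Python standard library does not provide, so only the statistic and its degrees of freedom are returned.

```python
import statistics

def brown_forsythe_f(*groups):
    """Return (F, df_between, df_within) for the Brown-Forsythe test."""
    # Step 1: absolute deviations from each group's own median.
    deviations = []
    for group in groups:
        center = statistics.median(group)
        deviations.append([abs(y - center) for y in group])

    # Step 2: one-way ANOVA F statistic on the deviations.
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = statistics.fmean([z for d in deviations for z in d])
    ss_between = 0.0
    ss_within = 0.0
    for d in deviations:
        group_mean = statistics.fmean(d)
        ss_between += len(d) * (group_mean - grand_mean) ** 2
        ss_within += sum((z - group_mean) ** 2 for z in d)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical groups: similar centers, visibly different spread.
narrow = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]
f_stat, df1, df2 = brown_forsythe_f(narrow, wide)
print(f"F = {f_stat:.2f} on ({df1}, {df2}) df")
```

A large F suggests unequal variability; the formal decision still follows the universal rule, comparing the software's Prob > F to 0.05.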
58
59 In JMP To compare the two means using an equal variance t-test in JMP: Choose Means/Anova/Pooled t from the main options menu. This adds the Oneway ANOVA report and means diamonds. The t-value, df, and p-value are shown in the t-test report. However, only the two-tailed p-value is reported (under Prob > |t|). If the null hypothesis specified that we were testing for equality, then this is the p-value we want.
60 One-tail p-values First, which group did JMP use for $\bar{y}_1$ and which for $\bar{y}_2$? JMP uses the order of the X-variable: If the X-variable is character, JMP alphabetically sorts the values and whichever comes first is $\bar{y}_1$. If the X-variable is numeric, JMP uses the smallest value of the X-variable as $\bar{y}_1$.
61 If your alternative was $H_A: \mu_1 > \mu_2$ and the t-test value is positive, then the one-tailed p-value is half the two-tailed Prob > |t| in the report. If the t-test value was negative, then you've observed a difference in the opposite direction from that expected. The p-value is one minus half the Prob > |t| in the report.
62 If your alternative was $H_A: \mu_1 < \mu_2$ and the t-test value is negative, then the one-tailed p-value is half the two-tailed Prob > |t| in the report. If the t-test value was positive, then you've observed a difference in the opposite direction from that expected. The p-value is one minus half the Prob > |t| in the report.
63 Unequal variance If the variances are not equal, or if we just want a more conservative test, then see the bottom portion of the Tests that the Variances are Equal report. The unequal-variances t-test is listed as the Welch Anova. Report the t-value, df, and p-value, as in the equal variance case.
64 Nonparametric comparison of the medians If normality isn't reasonable, then you can use a nonparametric test. You will use a test that compares the medians between the two groups. The nonparametric test is based solely on the ranks of the values of the Y-variable. In JMP, choose Nonparametric > Wilcoxon test.
65 Wilcoxon The Wilcoxon rank-sum test (also called the Mann-Whitney test) ranks all the Y-values (in both groups) and then compares the sum of the ranks in each group (the groups are specified by the X-variable). If the median of the first group is, in fact, equal to the median of the second group, then the sum of the ranks should be equal for equal sample sizes.
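The two rules above fit in one small helper. This is a sketch with a hypothetical function name, `one_tailed_p`; it assumes the t-value is computed as group 1 minus group 2, in the order JMP assigns the groups.

```python
def one_tailed_p(t_value, two_tailed_p, alternative):
    """Convert a two-tailed Prob > |t| into a one-tailed p-value.

    alternative: "greater" for H_A: mu1 > mu2, "less" for H_A: mu1 < mu2.
    """
    if alternative not in ("greater", "less"):
        raise ValueError('alternative must be "greater" or "less"')
    half = two_tailed_p / 2
    if alternative == "greater":
        in_expected_direction = t_value > 0
    else:
        in_expected_direction = t_value < 0
    # Observed difference in the expected direction: half the two-tailed value.
    # Opposite direction: one minus half the two-tailed value.
    return half if in_expected_direction else 1 - half

# Illustrative numbers only.
print(one_tailed_p(2.1, 0.04, "greater"))
print(one_tailed_p(-2.1, 0.04, "greater"))
```

Checking the sign first is the important step: halving the two-tailed p-value without looking at the direction of the difference can turn strong evidence against your alternative into apparent support for it.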
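The ranking step can be sketched as follows. Tied values receive the average of their ranks. The z value uses the plain large-sample normal approximation with no tie or continuity correction, which is a simplifying assumption; software such as JMP may apply corrections, so its z and p-values can differ slightly. The function names and data are hypothetical.

```python
import math

def midranks(values):
    """Rank values 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks

def rank_sum_z(group1, group2):
    """Return (rank sum of group 1, its expected value under H0, z)."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    w1 = sum(ranks[:n1])
    expected = n1 * (n1 + n2 + 1) / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12   # no tie correction (assumption)
    return w1, expected, (w1 - expected) / math.sqrt(variance)

# Hypothetical pass rates (percent).
city = [18, 27, 36, 40, 51]
county = [34, 41, 49, 52, 58, 65, 77]
w1, expected, z = rank_sum_z(city, county)
print(f"rank sum = {w1}, expected under H0 = {expected}, z = {z:.2f}")
```

Because only ranks enter the calculation, a single extreme pass rate cannot dominate the result the way it can with a mean.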
66
67 Reporting Wilcoxon When reporting the results of a nonparametric test, it's usual to only report the p-value. In the above report, there are two p-values, one for the z-test and one using a chi-square value. The p-values will rarely be different. For large samples, report the p-value from the normal approximation. For smaller samples, use the chi-square. For really small samples, you should probably consult a statistician to help you obtain exact p-values.
68 Steps 8 & 9 8. Make a statistical decision Using all three tests, the two groups are different. All p-values are < …. 9. State the substantive conclusion The schools in cities have significantly lower mean pass rates (35.9% vs 48.8%) and significantly different median pass rates (36.0% vs 49.2%).
69 Phase 4: Communicate the Answer to the Question 10. Document our understanding with text, tables, or figures The year 2000 Alabama SOL pass rates in 10th grade Algebra I were divided into two groups according to whether the school was a city or county high school. There were n = 92 schools within city school districts and n = 306 in county school districts. The observed average pass rate within city schools was 35.9% (SD = 19.9) and the average pass rate outside of cities was 48.8% (SD = 22.1). Using a two-tailed t-test, we conclude that the observed means are significantly different (t = 5.0, df = 396, p-value < …). From this we conclude that city schools have a significantly lower pass rate compared to county schools. The 95% confidence interval about the mean difference is between 7.8% and 17.9%.
70 Less text, replaced by information in a table Alternatively, it may be more straightforward to include many of the numbers in a table and state your results in text. So instead of the above paragraph, you could do this:
71 The summary results for the year 2000 Alabama SOL pass rate percentages in 10th grade Algebra I are shown in Table 1. Schools were classified as city schools if their district name contained "City" and were otherwise classified as belonging to a county school district. From these results we conclude that schools in cities had a pass rate that was significantly lower than that of county schools. The 95% confidence interval about the mean difference is between 7.8% and 17.9%.
72 Table 1. Alabama SOL Pass Rates in 10th Grade Algebra I for Schools in Cities and in Counties
Location | Number of Schools | Pass Rate (SD) | SE | 95% CI
City | … | … (19.90) | … | …
County | … | … (22.05) | … | …
Difference | | 12.9 * | … | …
* t = 5.0, df = 396, p-value < …
73 Less text, replaced by information in a figure As another alternative, it may be more informative to describe the results in a figure. Instead of the above, you could do this:
74 The summary results for the year 2000 Alabama SOL pass rate percentages in 10th grade Algebra I are shown in Figure 12. Schools were divided into city and county high schools. From these results we conclude that schools in cities had a pass rate that was significantly lower than that of county schools (t = 5.0, df = 396, p-value < …). The 95% confidence interval about the mean difference is between 7.8% and 17.9%.
75 [Figure: Alabama SOL Pass Rates in 10th Grade Algebra I for Schools in Cities and in Counties. Panels: County High Schools (mean = 48.8) and City High Schools (mean = 35.9).]
76 Summary: Two Independent Means Briefly, here is how to proceed when comparing the means obtained from two independent samples. Describe the two groups and the values in each group. What summary statistics are appropriate? Are there missing values? (why?) Assess the normality assumption. If normality is not warranted, then do a nonparametric test to compare the medians. If normality is warranted, then assess the equal variance assumption.
77 Summary (cont) Report confidence intervals on each of the means if normality is reasonable. Perform the appropriate statistical test: equal variance t-test, unequal variance t-test, or the Wilcoxon rank-sum test. Determine the p-value that corresponds to your hypothesis. Reject or fail to reject? State your substantive conclusion.
78 Summary (cont) Additional note: Say you conclude that the groups have different means. How do you describe what the different means are? If you've followed the above recipe, you are in one of three situations: 1. Normality is reasonable and the variances are equal 2. Normality is reasonable and the variances are unequal 3. Normality is not warranted
79 Summary (cont) If normality is reasonable and the variances are equal, use the equal variance t-test. The write-up reads: the means are significantly different (t = x.xx, df = xxx, p-value = 0.xxxx). Also give a table of means, SEs, and 95% CIs just like the Means for Oneway Anova report.
80 Summary (cont) If the variances are unequal, use the unequal variance t-test. The write-up reads: the means are significantly different (unequal variance t = x.xx, df = xxx, p-value = 0.xxxx). Also give a table of means, SEs, and 95% CIs just like the Means and Std. Deviations report. Note: the means are the same. The SEs and CIs are different.
81 Summary (cont) If normality is unreasonable, use Wilcoxon's test. The write-up reads: the medians are significantly different (by Wilcoxon's rank-sum test, p-value = 0.xxxx). Also give a table of medians and IQRs. There is a way to put 95% CIs on these estimates, but not using any easily available software.
82 Always Report a measure of the center and spread. For all three tests, report the p-value and make a decision based upon your hypothesis. Your final statement should summarize the results in terms of the experiment (no statistics).
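A rough sketch of the confidence interval step, using the rounded SOL summary values from the write-up. Two assumptions are worth flagging: the critical value comes from the normal distribution rather than the t distribution (a small difference with df near 400, but not for small samples), and n = 306 is taken to be the county group. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def normal_ci(estimate, se, level=0.95):
    """Approximate CI: estimate +/- z * se, with a normal critical value."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se

def pooled_se(sd1, n1, sd2, n2):
    """Standard error of the difference in means under equal variances."""
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 / n1 + sp2 / n2)

# Interval for the difference in means (county minus city).
low, high = normal_ci(48.8 - 35.9, pooled_se(22.1, 306, 19.9, 92))
# Intervals for each mean, using each group's own SD.
county_ci = normal_ci(48.8, 22.1 / math.sqrt(306))
city_ci = normal_ci(35.9, 19.9 / math.sqrt(92))

print(f"difference: {low:.1f} to {high:.1f}")
print(f"county mean: {county_ci[0]:.1f} to {county_ci[1]:.1f}")
print(f"city mean: {city_ci[0]:.1f} to {city_ci[1]:.1f}")
```

The interval for the difference should land close to the reported 7.8% to 17.9%; small discrepancies are expected from rounding and from the normal critical value.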
More informationNonParametric Tests (I)
Lecture 5: NonParametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of DistributionFree Tests (ii) Median Test for Two Independent
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table covariation least squares
More informationSupplement on the KruskalWallis test. So what do you do if you don t meet the assumptions of an ANOVA?
Supplement on the KruskalWallis test So what do you do if you don t meet the assumptions of an ANOVA? {There are other ways of dealing with things like unequal variances and nonnormal data, but we won
More informationNonParametric TwoSample Analysis: The MannWhitney U Test
NonParametric TwoSample Analysis: The MannWhitney U Test When samples do not meet the assumption of normality parametric tests should not be used. To overcome this problem, nonparametric tests can
More informationTwoSample TTest from Means and SD s
Chapter 07 TwoSample TTest from Means and SD s Introduction This procedure computes the twosample ttest and several other twosample tests directly from the mean, standard deviation, and sample size.
More informationLesson 1: Comparison of Population Means Part c: Comparison of Two Means
Lesson : Comparison of Population Means Part c: Comparison of Two Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
More informationStudy Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
More informationOnce saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.
1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis
More informationStatistical Inference and ttests
1 Statistical Inference and ttests Objectives Evaluate the difference between a sample mean and a target value using a onesample ttest. Evaluate the difference between a sample mean and a target value
More informationFor example, enter the following data in three COLUMNS in a new View window.
Statistics with Statview  18 Paired ttest A paired ttest compares two groups of measurements when the data in the two groups are in some way paired between the groups (e.g., before and after on the
More informationt Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon
ttests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com
More informationTesting Group Differences using Ttests, ANOVA, and Nonparametric Measures
Testing Group Differences using Ttests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 354870348 Phone:
More informationUCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates
UCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally
More informationGeneral Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test
Five types of statistical analysis General Procedure for Hypothesis Test Descriptive Inferential Differences Associative Predictive What are the characteristics of the respondents? What are the characteristics
More informationWe have already discussed hypothesis testing in study unit 13. In this
14 study unit fourteen hypothesis tests applied to means: two related samples We have already discussed hypothesis testing in study unit 13. In this study unit we shall test a hypothesis empirically in
More informationHypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam
Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationT adult = 96 T child = 114.
Homework Solutions Do all tests at the 5% level and quote pvalues when possible. When answering each question uses sentences and include the relevant JMP output and plots (do not include the data in your
More informationModule 9: Nonparametric Tests. The Applied Research Center
Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } OneSample ChiSquare Test
More informationChapter 8 Hypothesis Testing
Chapter 8 Hypothesis Testing Chapter problem: Does the MicroSort method of gender selection increase the likelihood that a baby will be girl? MicroSort: a genderselection method developed by Genetics
More informationNonparametric Test Procedures
Nonparametric Test Procedures 1 Introduction to Nonparametrics Nonparametric tests do not require that samples come from populations with normal distributions or any other specific distribution. Hence
More informationMINITAB ASSISTANT WHITE PAPER
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationOneSample ttest. Example 1: Mortgage Process Time. Problem. Data set. Data collection. Tools
OneSample ttest Example 1: Mortgage Process Time Problem A faster loan processing time produces higher productivity and greater customer satisfaction. A financial services institution wants to establish
More information1 Confidence intervals
Math 143 Inference for Means 1 Statistical inference is inferring information about the distribution of a population from information about a sample. We re generally talking about one of two things: 1.
More informationIntroduction. Hypothesis Testing. Hypothesis Testing. Significance Testing
Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters
More informationChapter 7 Part 2. Hypothesis testing Power
Chapter 7 Part 2 Hypothesis testing Power November 6, 2008 All of the normal curves in this handout are sampling distributions Goal: To understand the process of hypothesis testing and the relationship
More informationNonparametric tests, Bootstrapping
Nonparametric tests, Bootstrapping http://www.isrec.isbsib.ch/~darlene/embnet/ Hypothesis testing review 2 competing theories regarding a population parameter: NULL hypothesis H ( straw man ) ALTERNATIVEhypothesis
More informationThe Basics of a Hypothesis Test
Overview The Basics of a Test Dr Tom Ilvento Department of Food and Resource Economics Alternative way to make inferences from a sample to the Population is via a Test A hypothesis test is based upon A
More information13: Additional ANOVA Topics. Post hoc Comparisons
13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated KruskalWallis Test Post hoc Comparisons In the prior
More informationPaired vs. 2 sample comparisons. Comparing means. Paired comparisons allow us to account for a lot of extraneous variation.
Comparing means! Tests with one categorical and one numerical variable Paired vs. sample comparisons! Goal: to compare the mean of a numerical variable for different groups. Paired comparisons allow us
More informationPermutation Tests for Comparing Two Populations
Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. JaeWan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of
More informationThe Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
More informationNonparametric TwoSample Tests. Nonparametric Tests. Sign Test
Nonparametric TwoSample Tests Sign test MannWhitney Utest (a.k.a. Wilcoxon twosample test) KolmogorovSmirnov Test Wilcoxon SignedRank Test TukeyDuckworth Test 1 Nonparametric Tests Recall, nonparametric
More informationHow to choose a statistical test. Francisco J. Candido dos Reis DGOFMRP University of São Paulo
How to choose a statistical test Francisco J. Candido dos Reis DGOFMRP University of São Paulo Choosing the right test One of the most common queries in stats support is Which analysis should I use There
More informationStatistics 641  EXAM II  1999 through 2003
Statistics 641  EXAM II  1999 through 2003 December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the
More informationAP Statistics 2007 Scoring Guidelines
AP Statistics 2007 Scoring Guidelines The College Board: Connecting Students to College Success The College Board is a notforprofit membership association whose mission is to connect students to college
More informationStatistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
More informationOutline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics
Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public
More informationWater Quality Problem. Hypothesis Testing of Means. Water Quality Example. Water Quality Example. Water quality example. Water Quality Example
Water Quality Problem Hypothesis Testing of Means Dr. Tom Ilvento FREC 408 Suppose I am concerned about the quality of drinking water for people who use wells in a particular geographic area I will test
More informationHYPOTHESIS TESTING WITH SPSS:
HYPOTHESIS TESTING WITH SPSS: A NONSTATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER
More informationAP Statistics 2001 Solutions and Scoring Guidelines
AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for noncommercial use by AP teachers for course and exam preparation; permission for any other use
More informationChicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011
Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this
More informationPsychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!
Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Name: 1. The basic idea behind hypothesis testing: A. is important only if you want to compare two populations. B. depends on
More informationSPSS Explore procedure
SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stemandleaf plots and extensive descriptive statistics. To run the Explore procedure,
More informationTutorial 5: Hypothesis Testing
Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrclmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................
More informationChapter 16 Appendix. Nonparametric Tests with Excel, JMP, Minitab, SPSS, CrunchIt!, R, and TI83/84 Calculators
The Wilcoxon Rank Sum Test Chapter 16 Appendix Nonparametric Tests with Excel, JMP, Minitab, SPSS, CrunchIt!, R, and TI83/84 Calculators These nonparametric tests make no assumption about Normality.
More informationModule 5 Hypotheses Tests: Comparing Two Groups
Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this
More informationExtending Hypothesis Testing. pvalues & confidence intervals
Extending Hypothesis Testing pvalues & confidence intervals So far: how to state a question in the form of two hypotheses (null and alternative), how to assess the data, how to answer the question by
More informationIntroduction to Statistics for Computer Science Projects
Introduction Introduction to Statistics for Computer Science Projects Peter Coxhead Whole modules are devoted to statistics and related topics in many degree programmes, so in this short session all I
More informationPart 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217
Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing
More informationAnalysing Questionnaires using Minitab (for SPSS queries contact ) Graham.Currell@uwe.ac.uk
Analysing Questionnaires using Minitab (for SPSS queries contact ) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:
More informationUsing CrunchIt (http://bcs.whfreeman.com/crunchit/bps4e) or StatCrunch (www.calvin.edu/go/statcrunch)
Using CrunchIt (http://bcs.whfreeman.com/crunchit/bps4e) or StatCrunch (www.calvin.edu/go/statcrunch) 1. In general, this package is far easier to use than many statistical packages. Every so often, however,
More information6.4 Normal Distribution
Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under
More informationQUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NONPARAMETRIC TESTS
QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NONPARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.
More informationTesting for differences I exercises with SPSS
Testing for differences I exercises with SPSS Introduction The exercises presented here are all about the ttest and its nonparametric equivalents in their various forms. In SPSS, all these tests can
More informationChapter 7. Oneway ANOVA
Chapter 7 Oneway ANOVA Oneway ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The ttest of Chapter 6 looks
More informationPermutation & NonParametric Tests
Permutation & NonParametric Tests Statistical tests Gather data to assess some hypothesis (e.g., does this treatment have an effect on this outcome?) Form a test statistic for which large values indicate
More informationCase 110: Per Capita Income OneWay ANOVA and KruskalWallis. Dr. DeWayne Derryberry, Idaho State University Department of Mathematics
Case 110: Per Capita Income OneWay ANOVA and KruskalWallis Dr. DeWayne Derryberry, Idaho State University Department of Mathematics Per Capita Income: OneWay ANOVA and KruskalWallis Key Ideas: ANOVA,
More informationData Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) SS Analysis of Experiments  Introduction
Data Analysis Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) Prof. Dr. Dr. h.c. Dieter Rombach Dr. Andreas Jedlitschka SS 2014 Analysis of Experiments  Introduction
More informationHypothesis testing S2
Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to
More informationComparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples
Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The
More informationInferential Statistics. Probability. From Samples to Populations. Katie RommelEsham Education 504
Inferential Statistics Katie RommelEsham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice
More information