Let's look at some data provided by Crawley on ozone levels (in pphm) taken on 10 days from two market gardens B & C:


Intro to R

Tutorial 4: Two-sample tests, power analysis, and tabular data

Goal: To provide a more in-depth look at univariate statistics and to explore the quantitative and graphical methods necessary for determining the normality of sample data. We will also look at how to analyze grouped data.

Note: In the original handout, text in Arial is instruction or explanation and text in Courier is input or output from R. In this transcription, lines beginning with > are R input and the lines that follow them are R output.

Step-1: Comparing Variances

Before comparing two sample means, one customarily tests whether the variances are significantly different. The simplest form of this test that we learned in lecture is Fisher's F-max test, in which one divides the larger variance by the smaller variance. If the variances are the same, then F = 1. If the variances differ, F will be greater than 1. As in all of biostatistics, we must ask: how big does the value have to be before it is significantly greater than 1?

Let's look at some data provided by Crawley on ozone levels (in pphm) taken on 10 days from two market gardens, B and C:

> gardenb <- c(5,5,6,7,4,4,3,5,6,5)
> gardenc <- c(3,3,2,1,10,4,3,11,3,10)

Since R contains built-in tables of all the major statistics, we can query it at the outset for the critical value that we need to exceed with 9 df for each garden (10 - 1) and an alpha of 0.05. Because we are asking only whether one garden differs from the other, without specifying a direction, this is a two-tailed test. We therefore split the alpha into 0.025 per tail, and the upper-tail critical value is the 0.975 quantile:

> qf(0.975,9,9)
[1] 4.025994

Now calculate the variance of each sample (as we have done in previous univariate tutorials) and form the F ratio:

> var(gardenb)
[1] 1.333333
> var(gardenc)
[1] 14.22222
> F.ratio <- var(gardenc)/var(gardenb)
> F.ratio
[1] 10.66667

Since the calculated value of F is greater than the critical value, we reject the null hypothesis and conclude that the variances are significantly different. This result guides the choice of the two-sample test that follows.

NOTE: I assume here that you have already done all of the univariate analysis of these data and confirmed that the samples are normally distributed.
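The manual variance comparison described in Step-1 can be collected into one short script. This is only a sketch: the names f_ratio, f_crit, and p_two_tailed are illustrative and do not come from the handout, and the two-tailed p-value is obtained by doubling the upper-tail area, which is an assumption about how one would do this by hand.

```r
# Ozone data from the tutorial (gardens B and C)
gardenb <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)
gardenc <- c(3, 3, 2, 1, 10, 4, 3, 11, 3, 10)

# Put the larger variance in the numerator so that F >= 1
v <- c(b = var(gardenb), c = var(gardenc))
f_ratio <- max(v) / min(v)

# Upper-tail critical value for a two-tailed test at alpha = 0.05
# (both samples have 10 - 1 = 9 degrees of freedom)
f_crit <- qf(0.975, df1 = 9, df2 = 9)

# Two-tailed p-value: double the upper-tail area beyond the observed ratio
p_two_tailed <- 2 * pf(f_ratio, df1 = 9, df2 = 9, lower.tail = FALSE)

# TRUE means the variances differ significantly at the 5% level
f_ratio > f_crit
```

Because both samples have the same degrees of freedom here, it does not matter which sample supplies df1 and which supplies df2; with unequal sample sizes, df1 must belong to the sample in the numerator.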

This procedure, while instructive, can be sped up by directly applying the built-in function var.test:

> var.test(gardenb,gardenc)

        F test to compare two variances

data:  gardenb and gardenc
F = 0.0938, num df = 9, denom df = 9, p-value =
alternative hypothesis: true ratio of variances is not equal to 1
sample estimates:
ratio of variances
           0.09375

What's different about the two results? When we did it manually, the F ratio was approximately 10; when we used var.test, the F ratio was approximately 1/10. The difference is that var.test forms the ratio in the order in which the variables were entered (first over second) and does not put the larger variance in the numerator. The good news is that the p-value is still correct and you arrive at the same conclusion to reject the null hypothesis. You just need to be aware of this if you are reporting the F value.

Step-2: Two-sample t-test with equal variances

Let's add another garden to the mix and continue with an example:

> gardena <- c(3,4,4,3,2,3,1,3,5,2)

We can start by calculating the critical value of t for two samples of N = 10, which gives df = 20 - 2 = 18, again assuming alpha = 0.05 split into two tails:

> qt(0.975, 18)
[1] 2.100922

We can then use R to calculate the t-test long-hand (as we would on a calculator), or take the more direct approach of testing the variances first and then running the t-test:

> var.test(gardena, gardenb)

        F test to compare two variances

data:  gardena and gardenb
F = 1, num df = 9, denom df = 9, p-value = 1
alternative hypothesis: true ratio of variances is not equal to 1
sample estimates:
ratio of variances
                 1

So we are unable to reject the null hypothesis of no difference between the variances, and we treat the variances as homogeneous. Since the data are already known to be normal (not shown), we can proceed with a two-sample t-test. Note that t.test runs the Welch (unequal-variance) test by default; the equal-variance version is requested with var.equal=TRUE. Here the two samples have the same size and the same variance, so both versions give the same t and df = 18:

> t.test(gardena, gardenb)

        Welch Two Sample t-test

data:  gardena and gardenb
t = -3.873, df = 18, p-value =
alternative hypothesis: true difference in means is not equal to 0
sample estimates:
mean of x mean of y
        3         5

The conclusion is that the null hypothesis should be rejected: the two gardens differ significantly in mean ozone concentration.

Your statistics will then likely be followed by some sort of graphic (in a thesis or manuscript). For two-sample tests, side-by-side boxplots are the usual choice. A nice refinement is the notched boxplot, which draws a notch (an approximate 95% confidence interval) around each median. As a rule of thumb, if the two notches do not overlap, the medians are significantly different. Text can also be added to the plot. Try the following:

> ozone <- c(gardena,gardenb)
> label <- factor(c(rep("A",10),rep("B",10)))
> boxplot(ozone~label,notch=TRUE,xlab="Garden",ylab="Ozone")
> text(2,2,"t=-3.873")
> text(2,1.5,"P=0.001")
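The long-hand route mentioned in Step-2 is not written out in the handout. A minimal sketch of it, assuming the usual pooled-variance formula for the equal-variance t-test, might look like this; the names pooled_var, se_diff, t_manual, t_crit, and t_builtin are illustrative only.

```r
# Ozone data from the tutorial (gardens A and B)
gardena <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenb <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

n_a <- length(gardena)
n_b <- length(gardenb)

# Pooled variance: df-weighted average of the two sample variances
pooled_var <- ((n_a - 1) * var(gardena) + (n_b - 1) * var(gardenb)) /
  (n_a + n_b - 2)

# Standard error of the difference between the two means
se_diff <- sqrt(pooled_var * (1 / n_a + 1 / n_b))

# t statistic and the two-tailed critical value at alpha = 0.05
t_manual <- (mean(gardena) - mean(gardenb)) / se_diff
t_crit <- qt(0.975, df = n_a + n_b - 2)

# Built-in equal-variance test, for comparison with the manual value
t_builtin <- t.test(gardena, gardenb, var.equal = TRUE)

# TRUE means the difference in means is significant at the 5% level
abs(t_manual) > t_crit
```

Passing var.equal = TRUE is what makes t.test use the pooled variance; without it the function reports the Welch test, which happens to agree here because the sample sizes and variances are equal.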

[Figure: notched boxplots of Ozone (y-axis) by Garden (x-axis, A and B), annotated with the t statistic and P=0.001]

If the data were not normally distributed, there is a nonparametric alternative to the two-sample t-test: the Wilcoxon rank sum test. The function for this in R is wilcox.test:

> wilcox.test(gardena,gardenb)

        Wilcoxon rank sum test with continuity correction

data:  gardena and gardenb
W = 11, p-value =
alternative hypothesis: true location shift is not equal to 0

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(gardena, gardenb)

This function actually uses a normal (z) approximation for purposes of computation and hypothesis testing. We reject the null hypothesis because the p-value is much smaller than 0.05. The warning message at the end of the printout is not of particular concern. It simply draws attention to the fact that there are ties in the data, so an approximate value of p has been provided. This is not a real problem for most applications.

Step-3: Tests on paired samples

Recall that there are many biological situations in which the two samples are not independent in space or time. These call for paired-sample analyses. Crawley (2006) provides data in which one measurement (number of invertebrate species) was taken upstream of a sewage outfall and the other downstream (thus, two paired measurements for each stream):

> streams
   down up
(data rows not preserved in this transcription)

To run a paired t-test, simply specify the paired = TRUE option:

> t.test(down,up,paired=TRUE)

        Paired t-test

data:  down and up
t = , df = 15, p-value =
alternative hypothesis: true difference in means is not equal to 0
sample estimates:
mean of the differences
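Since the stream counts are not reproduced in this transcription, the following sketch uses made-up numbers (they are NOT the Crawley data) simply to show what the paired option does: a paired t-test is the same calculation as a one-sample t-test on the within-pair differences. The object names paired_fit and diff_fit are illustrative.

```r
# Hypothetical paired counts (NOT the Crawley stream data): one pair per site
up   <- c(12, 15, 9, 20, 14, 11, 17, 13)
down <- c(10, 14, 9, 16, 11, 10, 15, 12)

# Paired t-test on the two columns
paired_fit <- t.test(down, up, paired = TRUE)

# One-sample t-test on the within-pair differences, taken in the same order
d <- down - up
diff_fit <- t.test(d)

# The two approaches report the same t statistic (and the same df and p-value)
c(paired = unname(paired_fit$statistic),
  differences = unname(diff_fit$statistic))
```

Taking the differences in the opposite order (up - down) changes only the sign of t and of the mean difference; the df and the p-value are unaffected.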

In the paired t-test output for the stream data, notice the t statistic with df = 15 and its p-value: we reject the null hypothesis and conclude that there is a significant difference in species diversity above vs. below the outfall on each stream.

Note that there is another approach to this type of problem, which involves an analysis of the paired differences (d). This method is more consistent with your text and with what we discussed in lecture, and it yields identical results (apart from the sign of t, which depends on the order of subtraction):

> d <- up-down
> t.test(d)

        One Sample t-test

data:  d
t = , df = 15, p-value =
alternative hypothesis: true mean is not equal to 0
sample estimates:
mean of x

Step-4: The Sign Test

Last in our discussion of two-sample tests, and an extension of the paired t-test that we just did, is the sign test. This test is the nonparametric equivalent of the paired t-test. It is useful when the assumptions of the t-test cannot be met, or when you are working with something that can be scored rather than measured explicitly. Such tests are often useful in behavior studies in which the investigator scores something as a positive or negative response to a stimulus.

For example, suppose a dive team of 9 divers is evaluated for a new training regimen. Each diver is asked to dive once, and the dive is scored by observers. The divers then go on a 4-week training regimen, after which each dives again and is scored again. After the numbers are tallied, each diver is rated as either better or worse relative to the pre-training dive. Suppose 8 were judged to be better and 1 worse. How likely is it that 8 of 9 would be better by chance? This is best modeled with a binomial distribution, asking about the number of failures (1) relative to the total (9):

> binom.test(1,9)

        Exact binomial test

data:  1 and 9
number of successes = 1, number of trials = 9, p-value = 0.03906
alternative hypothesis: true probability of success is not equal to 0.5
sample estimates:
probability of success
             0.1111111

Thus, this is a significant result (p = 0.039) that is unlikely to occur by chance. We reject the null hypothesis of no effect of the training regimen and conclude that it improves diving.

The same framework can easily be used to compare two proportions. Suppose that in a company you observe that 196 men are promoted within a given year and only 4 women. Is this the blatant sexism that it appears to be? To answer the question, we must look at how many men and women there are in the company in total and then compare the two proportions. If there are 40 women and 3270 men, we can use the built-in two-proportion test in R, the function prop.test:

> prop.test(c(4,196),c(40,3270))

        2-sample test for equality of proportions with continuity correction

data:  c(4, 196) out of c(40, 3270)
X-squared = , df = 1, p-value =
alternative hypothesis: two.sided
sample estimates:
    prop 1     prop 2
0.10000000 0.05993884

Warning message:
Chi-squared approximation may be incorrect in: prop.test(c(4, 196), c(40, 3270))

There is no statistical evidence of discrimination here, with a p-value of about 0.47: a result like this will happen up to 47% of the time by chance alone.

Step-5: Power and sample size determination

A statistical test will not be able to detect a true difference if the sample size is too small relative to the magnitude of the difference. R has functions for power and sample size calculations for one- and two-sample t-tests and for comparing two proportions. Without going into an extensive review of the theory (see lecture notes and Zar for formulae) or manual calculations in R, let's look directly at the built-in functions for power analysis.

Let's consider an example where two groups are given different diets and their growth is measured. We wish to compute the sample size needed for a power of 90%, using a two-sided test at the 1% level, to detect a difference of 0.5 cm in a distribution with an SD of 2 cm. The built-in function for this calculation is power.t.test.
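As a rough cross-check on the diet example, the per-group sample size can be approximated by hand before calling the built-in function. The sketch below assumes the common normal-approximation formula n = 2 * ((z_alpha + z_power) * sigma / delta)^2, which is not given in this handout (it defers to the lecture notes and Zar); the variable names are illustrative.

```r
# Design values from the diet example
delta <- 0.5    # true difference to detect (cm)
sigma <- 2      # standard deviation (cm)
alpha <- 0.01   # two-sided significance level
power <- 0.90   # desired power

# Normal quantiles for the two-sided alpha and for the power
z_alpha <- qnorm(1 - alpha / 2)
z_power <- qnorm(power)

# Normal-approximation sample size per group (slightly optimistic,
# because it ignores the extra uncertainty of the t distribution)
n_approx <- 2 * ((z_alpha + z_power) * sigma / delta)^2

# Exact calculation from the built-in function, for comparison
n_exact <- power.t.test(delta = delta, sd = sigma,
                        sig.level = alpha, power = power)$n

# Round both up to whole subjects per group
ceiling(c(approximate = n_approx, exact = n_exact))
```

The approximation lands a little below the value from power.t.test, which is the behavior one would expect from replacing t quantiles with normal quantiles.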

> power.t.test(delta=0.5, sd=2, sig.level=0.01, power=0.9)

     Two-sample t test power calculation

              n =
          delta = 0.5
             sd = 2
      sig.level = 0.01
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

Note that delta stands for the true difference and sd is the standard deviation. This suggests that a sample size of 478 in each group would be needed to detect this difference.

Alternatively, one could substitute sample-size guesses into the power function and observe what happens to the calculated power. For example (using 250 as a starting guess):

> power.t.test(n=250, delta=0.5, sd=2, sig.level=0.01)

     Two-sample t test power calculation

              n = 250
          delta = 0.5
             sd = 2
      sig.level = 0.01
          power =
    alternative = two.sided

NOTE: n is number in *each* group

Important: note that there are 5 different parameters to the power test (n, delta, sd, sig.level, and power). They are all inter-related: given any 4, you can calculate the 5th. This is a wonderful way to explore the inter-relationships for any experimental design, and a very important set of analyses to do with pilot data prior to starting the full-size experiment.

One-sample problems are handled simply by adding type="one.sample" to the call to the power function. For paired tests, specify type="paired". For example, for a paired t-test:

> power.t.test(delta=10, sd=10, power=0.8,type="paired")

     Paired t test power calculation

              n =
          delta = 10
             sd = 10
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

Notice that a significance level of 0.05 was used automatically as the default.

Lastly, power.prop.test compares proportions and is analogous to what we just did for t-tests. The main difference is that delta and sd are replaced by the hypothesized probabilities in the two groups, p1 and p2. Suppose there are two groups of people, one given nicotine gum and the other given nothing. The binary outcome is cessation of smoking. The stipulated values are p1 = 0.15 and p2 = 0.30. We desire a power of at least 80% and the traditional 5% significance level. How many people do we need in each group to run this experiment within these parameters?

> power.prop.test(power=0.8,p1=.15,p2=.30)

     Two-sample comparison of proportions power calculation

              n =
             p1 = 0.15
             p2 = 0.3
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group

Step-6: Working with tabular data

Let's take a look at two forms of tabular data and the approach to their analysis. The first involves chi-square contingency tables, which rely on count data. Contingency tables examine whether one variable is contingent on another. A typical example might involve testing for a relationship between hair color and eye color. Suppose you sample 114 people walking down the street and score them on these two criteria. The results can be displayed as:

             Blue eyes   Brown eyes
Fair hair        38          11
Dark hair        14          51

These are our observed frequencies (or counts). The next step is to create a model that predicts the expected frequencies. There are a variety of ways to do this, but usually one uses the marginal (row and column) totals to derive these values. The observed values are then compared with the expected values, and a chi-square statistic is generated and evaluated for significance.

To solve this problem in R, the procedure is straightforward. First create a 2x2 matrix and then call the chi-square procedure:

> count<-matrix(c(38,14,11,51),nrow=2)
> count
     [,1] [,2]
[1,]   38   11
[2,]   14   51

Note that the data were entered into the matrix column-wise, not row-wise. Next, run the test:

> chisq.test(count)

        Pearson's Chi-squared test with Yates' continuity correction

data:  count
X-squared = 33.112, df = 1, p-value = 8.7e-09

Note the use of scientific notation and that the p-value is exceptionally small. Conclusion: there is a highly significant relationship between hair color and eye color for this group of people.

A variant of RxC contingency table analysis is Fisher's Exact Test. This is used when one or more of the expected frequencies is less than 5. Consider a small data set in which 8 ants' nests are found among 10 trees of each of two species (A and B):

             Tree-A   Tree-B
w/ ants         6        2
w/o ants        4        8

> x<-as.matrix(c(6,4,2,8))
> dim(x)<-c(2,2)
> x
     [,1] [,2]
[1,]    6    2
[2,]    4    8
> fisher.test(x)

        Fisher's Exact Test for Count Data

data:  x
p-value = 0.1698
alternative hypothesis: true odds ratio is not equal to 1
sample estimates:
odds ratio

The Fisher test can be used with matrices much bigger than 2 x 2. Alternatively, the function may be given two vectors of factor levels instead of a matrix. This saves the trouble of having to do all the tallying. Each observation is simply listed on a separate line:

> table
   tree nests
1     A  ants
2     B  ants
3     A  none
4     A  ants
5     B  none
6     A  none
7     A  ants
8     B  ants
9     B  none
10    A  none
11    A  none
12    B  none
13    B  none
14    A  ants
15    A  ants
16    B  none
17    A  ants
18    B  none
19    B  none
20    B  none
> attach(table)
> fisher.test(tree,nests)

        Fisher's Exact Test for Count Data

data:  tree and nests
p-value = 0.1698
alternative hypothesis: true odds ratio is not equal to 1
sample estimates:
odds ratio

This is the same answer we arrived at above.
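The expected-frequency step described for the hair and eye color table can be made explicit. The sketch below derives the expected counts from the marginal totals and rebuilds the Yates-corrected chi-square statistic by hand; the dimnames and the names expected, x2_yates, deg_free, and p_value are illustrative additions, not part of the handout.

```r
# Observed counts from the tutorial, entered column-wise
count <- matrix(c(38, 14, 11, 51), nrow = 2,
                dimnames = list(hair = c("fair", "dark"),
                                eyes = c("blue", "brown")))

# Expected counts under independence:
# (row total * column total) / grand total
expected <- outer(rowSums(count), colSums(count)) / sum(count)

# Pearson chi-square with Yates' continuity correction:
# each |observed - expected| is reduced by 0.5 before squaring
x2_yates <- sum((abs(count - expected) - 0.5)^2 / expected)

# Degrees of freedom for an R x C table: (R - 1) * (C - 1)
deg_free <- (nrow(count) - 1) * (ncol(count) - 1)

# Upper-tail p-value from the chi-square distribution
p_value <- pchisq(x2_yates, df = deg_free, lower.tail = FALSE)

round(expected, 2)
```

All four expected counts here are well above 5, which is why the chi-square test is appropriate for this table, whereas the ants table, with expected counts of 4, is handled with Fisher's Exact Test.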

Problem: Practice Problem 10 (p. 309, W&S)
Using R, solve problem 10 in your textbook. Provide explicit tests of variance and normality prior to doing the test. Provide a publication-grade paired, notched boxplot summarizing the differences.

Problem: Practice Problem 16 (p. 311, W&S)
Using R, solve problem 16 in your textbook. Provide a publication-grade figure to summarize your results.

Problem: Practice Problem 24 (p. 314, W&S)
Using R, solve problem 24 in your textbook. Provide a publication-grade figure to summarize your results.


NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem) NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions

More information

The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

More information

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences. 1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

More information

Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

More information

SPSS Explore procedure

SPSS Explore procedure SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

2 Sample t-test (unequal sample sizes and unequal variances)

2 Sample t-test (unequal sample sizes and unequal variances) Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

More information

Chapter 2 Probability Topics SPSS T tests

Chapter 2 Probability Topics SPSS T tests Chapter 2 Probability Topics SPSS T tests Data file used: gss.sav In the lecture about chapter 2, only the One-Sample T test has been explained. In this handout, we also give the SPSS methods to perform

More information

Inference for two Population Means

Inference for two Population Means Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example

More information

Linear Models in STATA and ANOVA

Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

More information

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

More information

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table ANALYSIS OF DISCRT VARIABLS / 5 CHAPTR FIV ANALYSIS OF DISCRT VARIABLS Discrete variables are those which can only assume certain fixed values. xamples include outcome variables with results such as live

More information

Difference of Means and ANOVA Problems

Difference of Means and ANOVA Problems Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

More information

Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

Come scegliere un test statistico

Come scegliere un test statistico Come scegliere un test statistico Estratto dal Capitolo 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc. (disponibile in Iinternet) Table

More information

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

More information

Non-Parametric Tests (I)

Non-Parametric Tests (I) Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

More information

Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

More information

13: Additional ANOVA Topics. Post hoc Comparisons

13: Additional ANOVA Topics. Post hoc Comparisons 13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Kruskal-Wallis Test Post hoc Comparisons In the prior

More information

Chapter 19 The Chi-Square Test

Chapter 19 The Chi-Square Test Tutorial for the integration of the software R with introductory statistics Copyright c Grethe Hystad Chapter 19 The Chi-Square Test In this chapter, we will discuss the following topics: We will plot

More information

WISE Power Tutorial All Exercises

WISE Power Tutorial All Exercises ame Date Class WISE Power Tutorial All Exercises Power: The B.E.A.. Mnemonic Four interrelated features of power can be summarized using BEA B Beta Error (Power = 1 Beta Error): Beta error (or Type II

More information

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other 1 Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC 2 Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

More information

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples Statistics One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples February 3, 00 Jobayer Hossain, Ph.D. & Tim Bunnell, Ph.D. Nemours

More information

EPS 625 INTERMEDIATE STATISTICS FRIEDMAN TEST

EPS 625 INTERMEDIATE STATISTICS FRIEDMAN TEST EPS 625 INTERMEDIATE STATISTICS The Friedman test is an extension of the Wilcoxon test. The Wilcoxon test can be applied to repeated-measures data if participants are assessed on two occasions or conditions

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Chi-square test Fisher s Exact test

Chi-square test Fisher s Exact test Lesson 1 Chi-square test Fisher s Exact test McNemar s Test Lesson 1 Overview Lesson 11 covered two inference methods for categorical data from groups Confidence Intervals for the difference of two proportions

More information

Statistics 2014 Scoring Guidelines

Statistics 2014 Scoring Guidelines AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

More information

Skewed Data and Non-parametric Methods

Skewed Data and Non-parametric Methods 0 2 4 6 8 10 12 14 Skewed Data and Non-parametric Methods Comparing two groups: t-test assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted

More information

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

More information

How To Check For Differences In The One Way Anova

How To Check For Differences In The One Way Anova MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

Point Biserial Correlation Tests

Point Biserial Correlation Tests Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

More information

Multiple samples: Pairwise comparisons and categorical outcomes

Multiple samples: Pairwise comparisons and categorical outcomes Multiple samples: Pairwise comparisons and categorical outcomes Patrick Breheny May 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/19 Introduction Pairwise comparisons In the previous lecture,

More information

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

More information

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters. Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample

More information

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

More information

Solutions to Homework 10 Statistics 302 Professor Larget

Solutions to Homework 10 Statistics 302 Professor Larget s to Homework 10 Statistics 302 Professor Larget Textbook Exercises 7.14 Rock-Paper-Scissors (Graded for Accurateness) In Data 6.1 on page 367 we see a table, reproduced in the table below that shows the

More information

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Name: 1. The basic idea behind hypothesis testing: A. is important only if you want to compare two populations. B. depends on

More information

Introduction to Hypothesis Testing

Introduction to Hypothesis Testing I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information