t-test Statistics Overview of Statistical Tests Assumptions
|
|
|
- Eunice Cannon
- 9 years ago
- Views:
Transcription
1 t-test Statistics Overview of Statistical Tests Assumption: Testing for Normality The Student s t-distribution Inference about one mean (one sample t-test) Inference about two means (two sample t-test) Assumption: F-test for Variance Student s t-test - For homogeneous variances - For heterogeneous variances Statistical Power 1 Overview of Statistical Tests During the design of your experiment you must specify what statistical procedures you will use. You require at least 3 pieces of info: Type of Variable Number of Variables Number of Samples Then refer to end-papers of Sokal and Rohlf (1995) -REVIEW- Assumptions Virtually every statistic, parametric or nonparametric, has assumptions which must be met prior to the application of the experimental design and subsequent statistical analysis. We will discuss specific assumptions associated with individual tests as they come up. Virtually all parametric statistics have an assumption that the data come from a population that follows a known distribution. Most of the tests we will evaluate in this module require a normal distribution. 3
2 Assumption: Testing for Normality Review and Commentary: D Agostino, R.B., A. Belanger, and R.B. D Agostino A suggestion for using powerful and informative tests of normality. The American Statistician 44: (See Course Web Page for PDF version.) Most major normality tests have corresponding R code available in either the base stats package or affiliated package. We will review the options as we proceed. 4 Normality There are 5 major tests used: Shapiro-Wilk W test Anderson-Darling test Martinez-Iglewicz test Kolmogorov-Smirnov test D Agostino Omnibus test NB: Power of all is weak if N < 10 5 Shapiro-Wilk W test Developed by Shapiro and Wilk (1965). One of the most powerful overall tests. It is the ratio of two estimates of variance (actual calculation is cumbersome; adversely affected by ties). Test statistic is W; roughly a measure of the straightness of the quantile-quantile plot. The closer W is to 1, the more normal the sample is. Available in R and most other major stats applications. 6
3 Normal Q-Q Plot Sample Quantiles Example in R (Tutorial-3) Theoretical Quantiles 7 Anderson-Darling test Developed by Anderson and Darling (1954). Very popular test. Based on EDF (empirical distribution function) percentile statistics. Almost as powerful as Shapiro-Wilk W test. 8 Martinez-Iglewicz Test Based on the median & robust estimator of dispersion. Very powerful test. Works well with small sample sizes. Particularly useful for symmetrically skewed samples. A value close to 1.0 indicates normality. Strongly recommended during EDA. Not available in R. 9
4 Kolmogorov-Smirnov Test Calculates expected normal distribution and compares it with the observed distribution. Uses cumulative distribution functions. Based on the max difference between two distributions. Poor for discrimination below N = 30. Power to detect differences is low. Historically popular. Available in R and most other stats applications. 10 D Agostino et al. Tests Based on coefficients of skewness ( b 1 ) and kurtosis (b ). If normal, b 1 =1 and b =3 (tests based on this). Provides separate tests for skew and kurt: - Skewness test requires N 8 - Kurtosis test best if N > 0 Provides combined Omnibus test of normality. Available in R. 11 The t-distribution t-distribution is similar to Z-distribution Z = y / N Note similarity: vs. t = y S / N The functional difference is between σ and S. Virtually identical when N > 30. Much like Z, the t-distribution can be used for inferences about μ. One would use the t-statistic when σ is not known and S is (the general case). 1
5 The t-distribution See Appendix Statistical Table C t, ν = 1 Standard Normal (Z) t, ν = 6 13 One Sample t-test - Assumptions - The data must be continuous. The data must follow the normal probability distribution. The sample is a simple random sample from its population. 14 One Sample t-test t = y S / N y t,df SE y y t,df SE y df s /,df df s 1 /, df 15
6 One Sample t-test - Example - Twelve (N = 1) rats are weighed before and after being subjected to a regimen of forced exercise. Each weight change (g) is the weight after exercise minus the weight before: 1.7, 0.7, -0.4, -1.8, 0., 0.9, -1., -0.9, -1.8, -1.4, -1.8, -.0 H 0 : μ = 0 H A : μ 0 16 One Sample t-test - Example - > W<-c(1.7, 0.7, -0.4, -1.8, 0., 0.9, -1., -0.9, -1.8, -1.4, -1.8, -.0) > summary(w) Min. 1st Qu. Median Mean 3rd Qu. Max > hist(w, col= red ) > shapiro.test(w) Shapiro-Wilk normality test data: W W = , p-value = One-sample t-test using R > W<-c(1.7, 0.7, -0.4, -1.8, 0., 0.9, -1., -0.9, , -1.4, -1.8, -.0) > W [1] > t.test(w, mu=0) One Sample t-test data: W t = , df = 11, p-value = alternative hypothesis: true mean is not equal to 0 95 percent confidence interval: sample estimates: mean of x
7 One Sample t-test For most statistical procedures, one will want to do a post-hoc test (particularly in the case of failing to reject H 0 ) of the required sample size necessary to test the hypothesis. For example, how large of a sample size would be needed to reject the null hypothesis of the onesample t-test we just did? Sample size questions and related error rates are best explored through a power analysis. 19 > power.t.test(n=15, delta=1.0, sd=1.53, sig.level=0.05, type="one.sample") One-sample t test power calculation n = 15 delta = 1 sd = 1.53 sig.level = 0.05 power = alternative = two.sided > power.t.test(n=0, delta=1.0, sd=1.53, sig.level=0.05, type="one.sample") One-sample t test power calculation n = 0 delta = 1 sd = 1.53 sig.level = 0.05 power = alternative = two.sided > power.t.test(n=5, delta=1.0, sd=1.53, sig.level=0.05, type="one.sample") One-sample t test power calculation n = 5 delta = 1 sd = 1.53 sig.level = 0.05 power = i i 0 Two Sample t-test - Assumptions - The data are continuous (not discrete). The data follow the normal probability distribution. The variances of the two populations are equal. (If not, the Aspin-Welch Unequal-Variance test is used.) The two samples are independent. There is no relationship between the individuals in one sample as compared to the other. Both samples are simple random samples from their respective populations. 1
8 Two Sample t-test Determination of which two-sample t-test to use is dependent upon first testing the variance assumption: Two Sample t-test for Homogeneous Variances Two-Sample t-test for Heterogeneous Variances Variance Ratio F-test - Variance Assumption - Must explicitly test for homogeneity of variance H o : S 1 = S H a : S 1 S Requires the use of F-test which relies on the F-distribution. F calc = S max / S min Get F table at N-1 df for each sample If F calc < F table then fail to reject H o. 3 Variance Ratio F-test - Example - Suppose you had the following sample data: Sample-A S a = 16.7 N = 1 Sample-B S b = N = 8 F calc = 16.7/13.98 = 1.16 F table = (df = 11,7) Decision: F calc < F table therefore fail to reject Ho. Conclusion: the variances are homogeneous. 4
9 Variance Ratio F-test - WARNING - Be careful! The F-test for variance requires that the two samples are drawn from normal populations (i.e., must test normality assumption first). If the two samples are not normally distributed, do not use Variance Ratio F-test! Use the Modified Levene Equal-Variance Test. 5 Modified Levene Equal-Variance Test z 1j = x j Med x z j = y j Med y First, redefine all of the variates as a function of the difference with their respective median. Then perform a twosample ANOVA to get F for redefined values. Stronger test of homogeneity of variance assumption. Not currently available in R, but code easily written and executed. 6 Two-sample t-test - for Homogeneous Variances - Begin by calculating the mean & variance for each of your two samples. Then determine pooled variance S p : S p = N 1 i=1 N y i y i y j y j j=1 N 1 1 N 1 = N 1 1 S 1 N 1 S N 1 N Theoretical formula Machine formula 7
10 Two-sample t-test - for Homogeneous Variances - Determine the test statistic t calc : t= y 1 y S p S p N 1 N df =N 1 N Go to t-table (Appendix) at the appropriate α and df to determine t table 8 Two-sample t-test - Example - Suppose a microscopist has just identified two potentially different types of cells based upon differential staining. She separates them out in to two groups (amber cells and blue cells). She suspects there may be a difference in cell wall thickness (cwt) so she wishes to test the hypothesis: Ho: AC cwt = BC cwt Ha: AC cwt BC cwt 9 Two-sample t-test - Example - Parameter AC-type BC-type Mean SS N Notes: She counts the number of cells in one randomly chosen field of view. SS is the sum of squares (theor. formula), or numerator of the variance equation. 30
11 Two-sample t-test - Example - S p = = 0.17 t calc = = df = =30 Ho: AC cwt = BC cwt Ha: AC cwt BC cwt At α = 0.05/, df = 30 t table =.04 t calc < t table Therefore Fail to reject Ho. Cell wall thickness is similar btw types. 31 Two-sample t-test using R > B<-c(8.8, 8.4,7.9,8.7,9.1,9.6) > G<-c(9.9,9.0,11.1,9.6,8.7,10.4,9.5) > var.test(b,g) F test to compare two variances data: B and G F = , num df = 5, denom df = 6, p-value = 0.47 alternative hypothesis: true ratio of variances is not equal to 1 95 percent confidence interval: sample estimates: ratio of variances Two-sample t-test using R > t.test(b,g, var.equal=true) Two Sample t-test data: B and G t = , df = 11, p-value = alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: sample estimates: mean of x mean of y
12 Two-sample t-test - for Heterogeneous Variances - Q. Suppose we were able to meet the normality assumption, but failed the homogeneity of variance test. Can we still perform a t-test? A. Yes, but we but must calculate an adjusted degrees of freedom (df). 34 Two-sample t-test - Adjusted df for Heterogeneous Variances - df S1 N N 1 S S 1 N N 1 S 1 1 N N 1 Performs the t-test in exactly the same fashion as for homogeneous variances; but, you must enter the table at a different df. Note that this can have a big effect on decision. 35 Two-sample t-test using R - Heterogeneous Variance - > Captive<-c(10,11,1,11,10,11,11) > Wild<-c(9,8,11,1,10,13,11,10,1) > var.test(captive,wild) F test to compare two variances data: Captive and Wild F = , num df = 6, denom df = 8, p-value = alternative hypothesis: true ratio of variances is not equal to 1 95 percent confidence interval: sample estimates: ratio of variances
13 Two-sample t-test using R - Heterogeneous Variance - > t.test(captive, Wild) Welch Two Sample t-test data: Captive and Wild t = 0.339, df = 11.48, p-value = alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: sample estimates: mean of x mean of y Matched-Pair t-test It is not uncommon in biology to conduct an experiment whereby each observation in a treatment sample has a matched pair in a control sample. Thus, we have violated the assumption of independence and can not do a standard t-test. The matched-pair t-test was developed to address this type of experimental design. 38 Matched-Pair t-test By definition, sample sizes must be equal. Such designs arise when: Same obs are exposed to treatments over time. Before and after experiments (temporally related). Side-by-side experiments (spatially related). Many early fertilizer studies used this design. One plot received fertilizer, an adjacent plot did not. Plots were replicated in a field and plant yield measured. 39
14 Matched-Pair t-test The approach to this type of analysis is a bit counter intuitive. Even though there are two samples, you will work with only one sample composed of: STANDARD DIFFERENCES and df = N ab Matched-Pair t-test - Assumptions - The data are continuous (not discrete). The data, i.e., the differences for the matchedpairs, follow a normal probability distribution. The sample of pairs is a simple random sample from its population. 41 Matched-Pair t-test y d = N y d d=1 N S d = N d=1 N y d y d /N d =1 N 1 t = y d S d / N 4
15 Matched-Pair t-test using R - Example - > New.Fert<-c(50,410,60,00,360,30,40,300,090) > Old.Fert<-c(190,00,060,1960,1960,140,1980,1940,1790) > hist(dif.fert) > Dif.Fert<-(New.Fert-Old.Fert) > shapiro.test(dif.fert) Shapiro-Wilk normality data: Dif.Fert W = , p-value = 0.60 Histogram of Dif.Fert Normal Q-Q Plot Frequency Sample Quantiles Dif.Fert Theoretical Quantiles 43 Matched-Pair t-test using R - One-tailed test - > t.test(new.fert, Old.Fert, alternative=c("greater"), mu=50, paired=true) Paired t-test data: New.Fert and Old.Fert t = , df = 8, p-value = alternative hypothesis: true difference in means is greater than percent confidence interval: Inf sample estimates: mean of the differences Statistical Power Q. What if I do a t-test on a pair of samples and fail to reject the null hypothesis--does this mean that there is no significant difference? A. Maybe yes, maybe no. Depends upon the POWER of your test and experiment. 45
16 Power Power is the probability of rejecting the hypothesis that the means are equal when they are in fact not equal. Power is one minus the probability of Type-II error (β). The power of the test depends upon the sample size, the magnitudes of the variances, the alpha level, and the actual difference between the two population means. 46 Power Usually you would only consider the power of a test when you failed to reject the null hypothesis. High power is desirable (0.7 to 1.0). High power means that there is a high probability of rejecting the null hypothesis when the null hypothesis is false. This is a critical measure of precision in hypothesis testing and needs to be considered with care. More on Power in the next lecture. 47 Stay tuned. Coming Soon: Responding to Assumption Violations, Power, Two-sample Nonparametric Tests 48
NCSS Statistical Software
Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the
Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test
Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.
Statistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)
Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption
NCSS Statistical Software
Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the
Study Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
Two-Sample T-Tests Assuming Equal Variance (Enter Means)
Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of
Tutorial 5: Hypothesis Testing
Tutorial 5: Hypothesis Testing Rob Nicholls [email protected] MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.
THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM
Comparing Means in Two Populations
Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we
Final Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
How To Test For Significance On A Data Set
Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.
Non-Inferiority Tests for Two Means using Differences
Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous
Section 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means
Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
Permutation Tests for Comparing Two Populations
Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
Descriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
12: Analysis of Variance. Introduction
1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider
1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
NCSS Statistical Software. One-Sample T-Test
Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,
Analysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
individualdifferences
1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,
Nonparametric Statistics
Nonparametric Statistics References Some good references for the topics in this course are 1. Higgins, James (2004), Introduction to Nonparametric Statistics 2. Hollander and Wolfe, (1999), Nonparametric
Two-sample inference: Continuous data
Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As
Data Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
Normality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. [email protected]
Point Biserial Correlation Tests
Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable
Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing
Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters
Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test
The t-test Outline Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test - Dependent (related) groups t-test - Independent (unrelated) groups t-test Comparing means Correlation
UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST
UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly
Chapter 7 Section 7.1: Inference for the Mean of a Population
Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used
2 Sample t-test (unequal sample sizes and unequal variances)
Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing
Lecture Notes Module 1
Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific
THE KRUSKAL WALLLIS TEST
THE KRUSKAL WALLLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23 rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON
Non-Parametric Tests (I)
Lecture 5: Non-Parametric Tests (I) KimHuat LIM [email protected] http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent
Testing for differences I exercises with SPSS
Testing for differences I exercises with SPSS Introduction The exercises presented here are all about the t-test and its non-parametric equivalents in their various forms. In SPSS, all these tests can
Guide to Microsoft Excel for calculations, statistics, and plotting data
Page 1/47 Guide to Microsoft Excel for calculations, statistics, and plotting data Topic Page A. Writing equations and text 2 1. Writing equations with mathematical operations 2 2. Writing equations with
Dongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
MEASURES OF LOCATION AND SPREAD
Paper TU04 An Overview of Non-parametric Tests in SAS : When, Why, and How Paul A. Pappas and Venita DePuy Durham, North Carolina, USA ABSTRACT Most commonly used statistical procedures are based on the
DATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
Pearson's Correlation Tests
Chapter 800 Pearson's Correlation Tests Introduction The correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. The correlation
Paired T-Test. Chapter 208. Introduction. Technical Details. Research Questions
Chapter 208 Introduction This procedure provides several reports for making inference about the difference between two population means based on a paired sample. These reports include confidence intervals
Principles of Hypothesis Testing for Public Health
Principles of Hypothesis Testing for Public Health Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine [email protected] Fall 2011 Answers to Questions
Difference of Means and ANOVA Problems
Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly
3.4 Statistical inference for 2 populations based on two samples
3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted
Error Type, Power, Assumptions. Parametric Tests. Parametric vs. Nonparametric Tests
Error Type, Power, Assumptions Parametric vs. Nonparametric tests Type-I & -II Error Power Revisited Meeting the Normality Assumption - Outliers, Winsorizing, Trimming - Data Transformation 1 Parametric
Chapter 7. One-way ANOVA
Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks
Tests for Two Proportions
Chapter 200 Tests for Two Proportions Introduction This module computes power and sample size for hypothesis tests of the difference, ratio, or odds ratio of two independent proportions. The test statistics
Two-sample hypothesis testing, II 9.07 3/16/2004
Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,
Randomized Block Analysis of Variance
Chapter 565 Randomized Block Analysis of Variance Introduction This module analyzes a randomized block analysis of variance with up to two treatment factors and their interaction. It provides tables of
Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples
Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The
Skewed Data and Non-parametric Methods
0 2 4 6 8 10 12 14 Skewed Data and Non-parametric Methods Comparing two groups: t-test assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted
t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon
t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. [email protected] www.excelmasterseries.com
Multiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
Projects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
Chapter 7 Section 1 Homework Set A
Chapter 7 Section 1 Homework Set A 7.15 Finding the critical value t *. What critical value t * from Table D (use software, go to the web and type t distribution applet) should be used to calculate the
II. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
HYPOTHESIS TESTING: POWER OF THE TEST
HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,
The Variability of P-Values. Summary
The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 [email protected] August 15, 2009 NC State Statistics Departement Tech Report
Inference for two Population Means
Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example
Analysis of Variance. MINITAB User s Guide 2 3-1
3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced
How To Check For Differences In The One Way Anova
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way
Introduction to Statistics with GraphPad Prism (5.01) Version 1.1
Babraham Bioinformatics Introduction to Statistics with GraphPad Prism (5.01) Version 1.1 Introduction to Statistics with GraphPad Prism 2 Licence This manual is 2010-11, Anne Segonds-Pichon. This manual
Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science [email protected]
Dept of Information Science [email protected] October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic
Introduction to Analysis of Variance (ANOVA) Limitations of the t-test
Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only
Nonparametric Statistics
Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics
Robust t Tests. James H. Steiger. Department of Psychology and Human Development Vanderbilt University
Robust t Tests James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 29 Robust t Tests 1 Introduction 2 Effect of Violations
The Wilcoxon Rank-Sum Test
1 The Wilcoxon Rank-Sum Test The Wilcoxon rank-sum test is a nonparametric alternative to the twosample t-test which is based solely on the order in which the observations from the two samples fall. We
Bivariate Statistics Session 2: Measuring Associations Chi-Square Test
Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution
Chapter 23 Inferences About Means
Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute
HYPOTHESIS TESTING WITH SPSS:
HYPOTHESIS TESTING WITH SPSS: A NON-STATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER
Confidence Intervals for the Difference Between Two Means
Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means
Chapter 3 RANDOM VARIATE GENERATION
Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.
Introduction to Quantitative Methods
Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................
Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics
Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This
ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media
ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media Abstract: The growth of social media is astounding and part of that success was
UNDERSTANDING THE TWO-WAY ANOVA
UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables
Chapter 5 Analysis of variance SPSS Analysis of variance
Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,
Difference tests (2): nonparametric
NST 1B Experimental Psychology Statistics practical 3 Difference tests (): nonparametric Rudolf Cardinal & Mike Aitken 10 / 11 February 005; Department of Experimental Psychology University of Cambridge
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
Recall this chart that showed how most of our course would be organized:
Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
Non-Inferiority Tests for Two Proportions
Chapter 0 Non-Inferiority Tests for Two Proportions Introduction This module provides power analysis and sample size calculation for non-inferiority and superiority tests in twosample designs in which
Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.
1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis
ELEMENTARY STATISTICS
ELEMENTARY STATISTICS Study Guide Dr. Shinemin Lin Table of Contents 1. Introduction to Statistics. Descriptive Statistics 3. Probabilities and Standard Normal Distribution 4. Estimates and Sample Sizes
Part 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
2. Filling Data Gaps, Data validation & Descriptive Statistics
2. Filling Data Gaps, Data validation & Descriptive Statistics Dr. Prasad Modak Background Data collected from field may suffer from these problems Data may contain gaps ( = no readings during this period)
KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.
General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n
Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:
Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a
Non-Inferiority Tests for One Mean
Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random
Rank-Based Non-Parametric Tests
Rank-Based Non-Parametric Tests Reminder: Student Instructional Rating Surveys You have until May 8 th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs
Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015
Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation
