How To Test For Significance On A Data Set



Similar documents
2 Sample t-test (unequal sample sizes and unequal variances)

Paired 2 Sample t-test

Data Transforms: Natural Logarithms and Square Roots

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

HYPOTHESIS TESTING: POWER OF THE TEST

Two Related Samples t Test

Difference of Means and ANOVA Problems

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Hypothesis testing - Steps

Estimation of σ 2, the variance of ɛ

Skewed Data and Non-parametric Methods

Descriptive Statistics

HYPOTHESIS TESTING WITH SPSS:

NCSS Statistical Software. One-Sample T-Test

1 Nonparametric Statistics

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Final Exam Practice Problem Answers

Stats for Strategy Fall 2012 First-Discussion Handout: Stats Using Calculators and MINITAB

Projects Involving Statistics (& SPSS)

Additional sources Compilation of sources:

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

Descriptive and Inferential Statistics

NCSS Statistical Software

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Paired T-Test. Chapter 208. Introduction. Technical Details. Research Questions

Quantitative Methods for Finance

Chapter 7 Section 1 Homework Set A

Statistics Review PSY379

Chapter 7 Section 7.1: Inference for the Mean of a Population

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

StatCrunch and Nonparametric Statistics

MBA 611 STATISTICS AND QUANTITATIVE METHODS

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

The Assumption(s) of Normality

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Descriptive Statistics

SPSS Explore procedure

Case Study Call Centre Hypothesis Testing

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Principles of Hypothesis Testing for Public Health

Hypothesis Testing. Reminder of Inferential Statistics. Hypothesis Testing: Introduction

Independent t- Test (Comparing Two Means)

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Introduction. Statistics Toolbox

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Stat 5102 Notes: Nonparametric Tests and. confidence interval

Dongfeng Li. Autumn 2010

Introduction to Quantitative Methods

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Lecture Notes Module 1

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

MEASURES OF LOCATION AND SPREAD

NCSS Statistical Software

12: Analysis of Variance. Introduction

ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media

Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

II. DISTRIBUTIONS distribution normal distribution. standard scores

Testing for differences I exercises with SPSS

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Data Analysis Tools. Tools for Summarizing Data

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

Descriptive Statistics

3.4 Statistical inference for 2 populations based on two samples

The Wilcoxon Rank-Sum Test

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

Normality Testing in Excel

1.5 Oneway Analysis of Variance

How To Check For Differences In The One Way Anova

Bill Burton Albert Einstein College of Medicine April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

START Selected Topics in Assurance

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Chapter 2 Probability Topics SPSS T tests

Predictor Coef StDev T P Constant X S = R-Sq = 0.0% R-Sq(adj) = 0.

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Permutation Tests for Comparing Two Populations

Chapter 2. Hypothesis testing in one population

How Does My TI-84 Do That

Week 3&4: Z tables and the Sampling Distribution of X

Comparing Means in Two Populations

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

Chapter 5 Analysis of variance SPSS Analysis of variance

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Recall this chart that showed how most of our course would be organized:

Study Guide for the Final Exam

t-test Statistics Overview of Statistical Tests Assumptions

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.

Using Excel for inferential statistics

Tutorial 5: Hypothesis Testing

DATA INTERPRETATION AND STATISTICS

Transcription:

Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming. This test makes no assumption about the shape of the population distribution, therefore this test can handle a data set that is non-symmetric, that is skewed either to the left or the right. THE POINT: The SIGN TEST simply computes a significance test of a hypothesized median value for a single data set. Like the 1 SAMPLE T-TEST you can choose whether you want to use a one-tailed or two-tailed distribution based on your hypothesis; basically, do you want to test whether the median value of the data set is equal to some hypothesized value (H : η = η o ), or do you want to test whether it is greater (or lesser) than that value (H : η > or < η o ). Likewise, you can choose to set confidence intervals around η at some α level and see whether η o falls within the range. This is equivalent to the significance test, just as in any T-TEST. In English, the kind of question you would be interested in asking is whether some observation is likely to belong to some data set you already have defined. That is to say, whether the observation of interest is a typical value of that data set. Formally, you are either testing to see whether the observation is a good estimate of the median, or whether it falls within confidence levels set around that median. THE WORKINGS: Very simple. Under the hypothesis that the sample median (η) is equal to some hypothesized value (η o, so H : η = η o ), then you would expect half the data set S of sample size n to be greater than the hypothesized value η o. If S >.5n then η > η o, and if S <.5n then η < η o. The SIGN TEST simply computes whether there is a significant deviation from this assumption, and gives you a p value based on a binomial distribution. If you are only interested in whether the hypothesized value is greater or lesser than the sample median (H : η > or < η o ), the test uses the corresponding upper or lower tail of the distribution. The workings for calculating the confidence intervals (CI) get complicated to do manually as you have to use a technique called non-linear interpolation that is about as fun to do as it sounds, therefore, let MINITAB do it for you. Basically, it is doing the same thing as in the parametric tests, setting a CI around a measure of central tendency, however, because were are now dealing with discrete data it isn t always possible to calculate exactly the CI corresponding to your assigned α level: Non-linear interpolation is simply a method of getting as close to that value as possible.

Non-Parametric Univariate Tests: 1 Sample Sign Test 2 MINITAB EAMPLE: Suppose we have a data set of twenty-six observations (N=26) and we want to see of a value of 3 is a reasonable estimate of the mean: 1 1 2 2 3 3 4 5 5 6 7 7 8 1 2 22 25 27 33 4 42 5 55 75 8 To test for normality we print out the descriptive statistics from MINITAB and run an Anderson- Darling normality test. For the descriptive stats the MINITAB procedure is: >BASIC STATISTICS >DESCRIPTIVE STATISTICS >Double click on the column your data is entered and HISTOGRAM WITH NORMAL CURVE >GRAPHS: Click on BOPLOT, GRAPHICAL SUMMARY, We get the three following graphics. Descriptive Statistics Variable: Anderson-Darling Normality Test A-Squared: P-Value: 1.748. 2 4 6 8 Mean StDev Variance Skewness Kurtosis N 21.32 23.5137 552.893 1.12317.12683 25 95% Confidence Interval for Mu 1 2 3 Minimum 1st Quartile Median 3rd Quartile Maximum 11.614 1. 3.5 8. 36.5 8. 95% Confidence Interval for Mu 31.26 95% Confidence Interval for Sigma 18.362 32.7111 95% Confidence Interval for Median 95% Confidence Interval for Median 5. 26.638

Non-Parametric Univariate Tests: 1 Sample Sign Test 3 Boxplot of 1 2 3 4 5 6 7 8 Histogram of, with Normal Curve 7 6 Frequency 5 4 3 2 1 1 2 3 4 5 6 7 8 These data do not look normal, so we run a formal normality test (Anderson-Darling). The MINITAB procedure is: >BASIC STATISTICS >NORMALITY TEST >Double click on the column your data is in >Choose ANDERSON-DARLING

Non-Parametric Univariate Tests: 1 Sample Sign Test 4 You will get the following output: Normal Probability Plot Probability.999.99.95.8.5.2.5.1.1 Average: 21.32 StDev: 23.5137 N: 25 1 2 3 4 5 6 7 8 Anderson-Darling Normality Test A-Squared: 1.748 P-Value:. You see that the p =., meaning that the data is highly non-normal; this is obviously verified by looking at the above three plots from the descriptive stats. From here we decide to perform a non-parametric Sign test to see if 3 is a reasonable estimate of the mean, that is to say if a value of 3 could belong to this distribution: >NON-PARAMETRICS >1 SAMPLE SIGN >Double click on the column your data is in: for the Confidence level click on CONFIDENCE INTERVAL and leave it at the default value (.95) >For a hypothesis test click on TEST MEDIAN, and enter the hypothesized value 3 and leave it on the default value (not equal) And we get the following output:

Non-Parametric Univariate Tests: 1 Sample Sign Test 5 Sign Confidence Interval Sign confidence interval for median Achieved N Median Confidence Confidence Interval Position 25 8..8922 (5., 25.) 9.95 (5., 26.6) NLI.9567 (5., 27.) 8 This output gives you three CIs. If you remember, we are interested in the NLI output (nonlinear interpolation). This says that our median is 8, with confidence intervals bounding it at 5 and 26.6. As our hypothesized value 3 lies outside this range we reject the null hypothesis in favor of the alternative and conclude that at the α =.5 level we reject the hypothesis that a value of 3 is a reasonable estimate of the median. For the hypothesis test: Sign Test for Median Sign test of median = 3. versus not = 3. N Below Equal Above P Median 25 18 7.433 8. This output states the hypothesis at the top and then states that there are 18 observations below the hypothesized median, and 7 above (remember it should be around 5/5). The p =.433, less than the stated α, therefore we conclude that at the α =.5 level, we reject the null hypothesis that 3 is a reasonable estimate of the mean and accept the alternative hypothesis. The results of the hypothesis test are obviously the same as the CI test. Which method you choose to use is dependent on personal preference and the exact type of question you are interested in answering.