# Non Parametric Statistics

## Transcription

1 Non Parametric Statistics. Interdepartmental Postgraduate Programme "Occupational and Environmental Health: Management and Economic Evaluation". Dimitris Fouskakis

2 Introduction. So far in the course we have assumed that the data come from some known distribution, e.g. the Normal, or that the Central Limit Theorem holds. Methods of estimation and hypothesis testing have been based on these assumptions. These procedures are usually called parametric statistical methods. If these assumptions are not met, nonparametric statistical methods must be used.

3 Revision. Inferential statistics: hypothesis testing versus confidence intervals; parametric versus nonparametric methods; quantitative data; categorical data; relation between two variables; relation between several variables.

4 What does inferential statistics do? It helps to quantify how certain we can be when we make inferences from a given sample. The three approaches: a) hypothesis testing, b) confidence intervals, c) both. "I know how to do a t-test, but I don't know when!"

5 Hypothesis Testing. $H_0: W = w_a$ versus $H_A: W \neq w_a$. α: the probability of a Type I error, or significance level of the test, usually set to a value like 5%. Power = (1 − β): the power of the test, with a common value of 80%. Power calculations: have I chosen a correct number of observations?

| Researcher's decision | $H_0$ really true: Yes | $H_0$ really true: No |
| --- | --- | --- |
| Reject $H_0$ | Type I error (α) | Correct decision (power) |
| Accept $H_0$ | Correct decision | Type II error (β) |

6 Statistical and clinical significance. Statistical significance (P value): the probability that this sample was drawn from a population with characteristics consistent with $H_0$ was low enough to reject $H_0$ (usual rule: reject $H_0$ if P value < 0.05; why 0.05 and not 0.04?). Clinical (practical) significance: an important finding with implications for your clinical practice.

7 Summary points for P values. P values, or significance levels, measure the strength of the evidence against the null hypothesis; the smaller the P value, the stronger the evidence. An arbitrary division of results into significant or not, according to the P value, was not the intention of the founders. A P value of 0.05 provides some, but not strong, evidence against the null hypothesis, whereas it is reasonable to say that a P value < 0.001 does provide strong evidence. Results of medical research should not be reported as significant or not, but should be interpreted in the context of the type of study and other available evidence.

8 Correct Definition of the P value. The P value is the chance of getting a test statistic as extreme as, or more extreme than, the observed one, assuming the null hypothesis is true. The P value is NOT the chance of the null hypothesis being right.

9 Confidence Intervals (C.I.). The wrong definition: there is a 95% (e.g.) chance that the parameter of interest will fall within the particular interval. The exact definition: if we take a series of samples from the same population and construct e.g. 95% confidence intervals around their estimates, then 95% of these confidence intervals will contain the true parameter. Use in hypothesis testing: check whether the interval includes $w_a$ in order to decide whether to reject the null hypothesis.
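
The repeated-sampling definition can be checked with a small simulation. The sketch below is only an illustration: the population (mean 7000, SD 1000), the number of repetitions and all function names are assumptions, and the critical value 2.228 is the tabulated t value for 10 df.

```python
import random
import statistics

T_CRIT_10DF = 2.228  # tabulated t value with 10 df and 0.025 in the upper tail


def t_interval(sample, t_crit):
    """Return the (lower, upper) confidence limits for the mean of `sample`."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    return mean - t_crit * se, mean + t_crit * se


def coverage(true_mean, true_sd, n, repeats, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = t_interval(sample, T_CRIT_10DF)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / repeats


# Hypothetical Normal population; samples of 11 observations each.
observed_coverage = coverage(true_mean=7000, true_sd=1000, n=11, repeats=5000)
print(f"Share of intervals containing the true mean: {observed_coverage:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains the true mean or does not, which is why the "wrong definition" is wrong.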

10 How to choose a statistical test. Consider: the type of data (continuous versus categorical); the distribution (parametric versus non-parametric); the sample size; the number of samples; the relation of samples to each other (paired versus unpaired); the number of variables (univariate versus multivariate).

11 Parametric versus Non-Parametric. Parametric methods: make distributional assumptions; usually assume a Normal distribution or rely on the Central Limit Theorem; require comparable standard deviations. Non-parametric methods: distribution-free; as a rule, P value (non-parametric) > P value (parametric); usually no confidence intervals accompany the non-parametric tests.

12 Statistical methods for continuous data. Univariate tests to compare means:

| Number of samples | Parametric | Non-parametric |
| --- | --- | --- |
| 1 | One-sample t-test | Wilcoxon signed rank sum test |
| 2, paired | Paired t-test | Wilcoxon matched pairs signed rank sum test |
| 2, unpaired | Two-sample t-test | Mann-Whitney U test |
| 3 or more, unpaired | One-way ANOVA | Kruskal-Wallis test |

13 One Sample. Table 1: Average daily energy intake (kJ) over 10 days of 11 healthy women. Columns: Subject; Average daily energy intake (kJ); with the Mean and SD in the final rows. What can we say about the energy intake of these women in relation to a recommended daily intake of 7725 kJ?

14 One Sample. To answer the question we can carry out a test of the null hypothesis that our data are a sample from a population with a specific hypothesized mean. The test is called the one-sample t-test:

$$t = \frac{\text{sample mean} - \text{hypothesized mean}}{\text{standard error of sample mean}} = \frac{\bar{x} - k}{s/\sqrt{n}},$$

where $k$ is the hypothesized mean (here 7725 kJ) and $n = 11$. The statistic is referred to the t distribution table with $n - 1 = 10$ df. If $t > t_{n-1,\alpha/2}$ or $t < -t_{n-1,\alpha/2}$, reject $H_0$. The P value is 2 × (area to the right of $|t|$ under the t distribution with 10 df). Here P value < 0.02, so we reject $H_0$.

15 One Sample. Alternatively we could calculate a 95% C.I. for the mean intake: $\bar{x} \pm t_{10,0.025}\, s/\sqrt{n}$, which gives (5986, 7521). This range does not include the recommended level of 7725 kJ. If we assume that the women are a representative sample, then we can infer that for all women of this age the average daily energy consumption is less than is recommended.
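
As a rough consistency check, the test statistic can be recovered from the interval alone. This sketch assumes the interval (5986, 7521) is symmetric about the sample mean and uses tabulated t values for 10 df; the mean and SD it prints are back-calculated from the interval, not read from Table 1, and the function name is illustrative.

```python
from math import sqrt

T_10_025 = 2.228  # t table, 10 df, 0.025 in each tail
T_10_010 = 2.764  # t table, 10 df, 0.01 in each tail


def t_statistic(sample_mean, sample_sd, n, hypothesized_mean):
    """One-sample t statistic: (mean - hypothesized mean) / (sd / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sample_sd / sqrt(n))


n = 11
lower, upper = 5986, 7521              # 95% C.I. quoted in the slide
mean = (lower + upper) / 2             # centre of the interval
se = (upper - lower) / 2 / T_10_025    # half-width = t * standard error
sd = se * sqrt(n)                      # because se = sd / sqrt(n)

t = t_statistic(mean, sd, n, hypothesized_mean=7725)
reject_at_5_percent = abs(t) > T_10_025
p_below_002 = abs(t) > T_10_010        # two-sided P value < 0.02
print(f"mean = {mean:.1f}, sd = {sd:.1f}, t = {t:.2f}")
```

The magnitude of t exceeds both critical values, which agrees with the conclusion "P value < 0.02, reject $H_0$".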

16 One Sample. Assumptions: the data come from a Normal distribution. If the sample size is > 30 then, because of the Central Limit Theorem, we can perform the test even if the data do not look very close to Normal. For small samples that are not Normally distributed we should use a non-parametric method such as the Sign Test or the Wilcoxon signed rank sum test.

17 One Sample: The Sign Test (or Binomial Test). If there were no difference on average between the sample values and the hypothesized value, we would expect an equal number of observations above and below that value. We can thus use the Binomial distribution, or the Normal approximation to it, to evaluate the probability of the observed frequencies when the true probability of exceeding the hypothesized intake is $p = 1/2$. In our dataset 2 women had daily intakes above 7725 kJ and 9 below. With $n = 11$ and $r$ the number of observations above (or below) the value, we calculate the test statistic

$$z = \frac{r - np}{\sqrt{np(1-p)}} = \frac{2 - 5.5}{\sqrt{2.75}} = -2.11 \quad\text{or}\quad z = \frac{9 - 5.5}{\sqrt{2.75}} = 2.11.$$

If $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$, reject $H_0$. The P value is 2 × (area to the right of $|z|$ under the N(0,1) distribution). From the Normal table, P value = 0.035, so we reject $H_0$.

18 One Sample: The Sign Test (or Binomial Test). If any of the observations is exactly the same as the hypothesized value then we ignore it in the calculation. Thus the sample size is the number of observations that differ from the hypothesized value. Because of the small sample size it would be better to use the continuity correction in the Normal approximation, i.e. subtract ½ from the absolute value of the numerator:

$$z = \frac{|r - np| - 1/2}{\sqrt{np(1-p)}} = \frac{3.5 - 0.5}{\sqrt{2.75}} = 1.81.$$

From the Normal table, P value = 0.07, so we do not reject $H_0$.
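
The sign test arithmetic for 2 intakes above and 9 below 7725 kJ can be reproduced as follows. This is a minimal sketch with illustrative function names. The exact binomial P value is an addition that the slides do not report, included only for comparison with the two Normal approximations.

```python
from math import comb, sqrt
from statistics import NormalDist


def sign_test_z(r, n, continuity=False):
    """Normal-approximation z for r observations on one side out of n, with p = 1/2."""
    p = 0.5
    numerator = r - n * p
    if continuity:
        # Subtract 1/2 from |r - np| (never below zero) and keep the sign.
        sign = 1 if numerator >= 0 else -1
        numerator = sign * max(abs(numerator) - 0.5, 0.0)
    return numerator / sqrt(n * p * (1 - p))


def two_sided_p_from_z(z):
    """2 x (area to the right of |z| under the N(0,1) distribution)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def exact_binomial_p(r, n):
    """Exact two-sided P value under Binomial(n, 1/2): twice the smaller tail."""
    tail = min(r, n - r)
    prob = sum(comb(n, k) for k in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * prob)


r, n = 2, 11  # 2 women above the recommended intake, 11 women in total
z_plain = sign_test_z(r, n)
z_corrected = sign_test_z(r, n, continuity=True)
p_plain = two_sided_p_from_z(z_plain)
p_corrected = two_sided_p_from_z(z_corrected)
p_exact = exact_binomial_p(r, n)
print(f"z = {z_plain:.2f} (P = {p_plain:.3f}); "
      f"corrected z = {z_corrected:.2f} (P = {p_corrected:.2f}); "
      f"exact P = {p_exact:.3f}")
```

With such a small sample the corrected approximation and the exact calculation both fall on the other side of 0.05 from the uncorrected one, which is why the correction matters here.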

19 One Sample: The Wilcoxon Signed Rank Test. A more powerful test than the sign test. Calculate the difference between each observation and the value of interest. Ignoring the signs of the differences, rank them in order of magnitude. Calculate the sum of the ranks of all the negative (or positive) differences and find the P value from the corresponding table.

20 One Sample: The Wilcoxon Signed Rank Test. The sum of the ranks of the positive differences is 3 + 5 = 8. From the Wilcoxon signed rank test table, P value < 0.05, so we reject $H_0$.
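
The ranking procedure can be sketched in a few lines. The intake values below are hypothetical, chosen only so that the two positive differences receive ranks 3 and 5 as in the slide, and they are not the data of Table 1. Instead of a printed table, the sketch obtains an exact two-sided P value by enumerating every assignment of signs to the ranks, which is feasible only for small samples.

```python
from itertools import product


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(sample, hypothesized):
    """Return (sum of positive ranks, sum of negative ranks, exact two-sided P)."""
    diffs = [x - hypothesized for x in sample if x != hypothesized]
    ranks = average_ranks([abs(d) for d in diffs])
    t_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    t_obs = min(t_pos, t_neg)
    total = sum(ranks)
    # Under H0 each rank is equally likely to carry a plus or a minus sign.
    as_extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        s = sum(r for r, positive in zip(ranks, signs) if positive)
        if min(s, total - s) <= t_obs:
            as_extreme += 1
    return t_pos, t_neg, as_extreme / 2 ** len(ranks)


# Hypothetical daily intakes (kJ) for 11 women: 9 below and 2 above 7725.
intakes = [7625, 7425, 7025, 6625, 6425, 6225, 6025, 5725, 5325, 8225, 8725]
t_pos, t_neg, p_value = wilcoxon_signed_rank(intakes, 7725)
print(f"T+ = {t_pos:g}, T- = {t_neg:g}, exact two-sided P = {p_value:.4f}")
```

For these illustrative values the positive rank sum is 8 and the exact P value lies below 0.05, in line with the table lookup in the slide.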

21 Two Groups of Paired Observations. Paired data arise when the same individuals are studied more than once, usually in different circumstances. They also arise when we have 2 different groups of subjects who have been individually matched, for example in a matched-pair case-control study. Paired data are very common in medical research. We are interested in the average difference between the observations for each individual and the variability of these differences.

22 Two Groups of Paired Observations. Table 2: Mean daily intake over 10 pre-menstrual and 10 post-menstrual days. Columns: Subject; dietary intake Pre-menstrual; Post-menstrual; Difference; with the Mean and SD in the final rows. We can use the one-sample t-test to calculate a P value for the comparison of the observed mean difference (in kJ) with the hypothetical value of zero, i.e. the null hypothesis is that pre- and post-menstrual dietary intake is the same:

$$t = \frac{\bar{d} - 0}{se(\bar{d})} = \frac{\bar{d}}{s_d/\sqrt{11}},$$

referred to the t distribution table with $n - 1 = 10$ df. The result is to reject $H_0$.

23 Two Groups of Paired Observations. Alternatively we could calculate a 95% C.I. for the mean difference: $\bar{d} \pm t_{10,0.025}\, s_d/\sqrt{n}$, which gives (1074.2, 1566.8). This range does not include the null value of 0 kJ. If we assume that the women are a representative sample, then we can infer that dietary intake is much lower in the post-menstrual period.

24 Two Groups of Paired Observations. The same assumptions as before hold for the difference data (thus we require Normality for the differences, not for each set of data). If these assumptions are not met then we can apply the same non-parametric techniques as before to the difference data. For example, all 11 differences have the same sign, so $r = 11$ and the test statistic of the sign test with the continuity correction is

$$z = \frac{|r - np| - 1/2}{\sqrt{np(1-p)}} = \frac{|11 - 5.5| - 0.5}{\sqrt{2.75}} = 3.02.$$

From the Normal table, P value = 0.003, so we reject $H_0$.
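
For paired data the sign test works on the within-pair differences. The following sketch uses made-up pre- and post-menstrual intakes (not Table 2) in which every woman's intake falls, so that all 11 differences share a sign as in the slide. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def paired_sign_test(first, second):
    """Sign test on paired differences, using the continuity correction."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # drop zero differences
    n = len(diffs)
    r = sum(1 for d in diffs if d > 0)
    # With p = 1/2: np = n/2 and np(1-p) = n/4.
    z = max(abs(r - n / 2) - 0.5, 0.0) / sqrt(n / 4)
    p_value = 2 * (1 - NormalDist().cdf(z))
    return r, n, z, p_value


# Hypothetical intakes (kJ) for 11 women; every post value is below its pre value.
pre = [6000 + 150 * i for i in range(11)]
post = [x - 1000 - 50 * i for i, x in enumerate(pre)]

r, n, z, p_value = paired_sign_test(pre, post)
print(f"r = {r} of n = {n}, z = {z:.2f}, P = {p_value:.3f}")
```

Only the signs of the differences enter the calculation, so any data set with 11 differences of the same sign gives the same z and P value.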

25 Two Independent Groups of Observations. The most common statistical analysis, e.g. in clinical trials or observational studies comparing different groups of subjects. Table: 24-hour total energy expenditure (MJ/day) in groups of lean (n = 13) and obese (n = 9) women, with the Mean and SD of each group. Is there a true difference in the 24-hour total energy expenditure between lean and obese women?

26 Two Independent Groups of Observations. To answer this question we can carry out a test of the null hypothesis that the two populations, obese and lean women, have the same mean total energy expenditure. The test is called the two-sample t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{se(\bar{x}_1 - \bar{x}_2)} = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}} = 3.95,$$

where $s_p$ is the pooled standard deviation given by

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}},$$

with $s_i^2$ the variance of the $i$-th group. If $t > t_{n_1+n_2-2,\alpha/2}$ or $t < -t_{n_1+n_2-2,\alpha/2}$, reject $H_0$. Here P value < 0.001 (t distribution with $n_1 + n_2 - 2 = 20$ df), so we reject $H_0$.

27 Two Independent Groups of Observations. Alternatively we could calculate a 95% C.I. for the mean difference:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\,0.025}\; s_p\sqrt{1/n_1 + 1/n_2},$$

which, with $\bar{x}_1 - \bar{x}_2 = 2.232$, gives (1.05, 3.41). This range does not include the value of 0 MJ/day. Thus the total energy expenditure in the obese women is greater than that of the lean women.
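
The reported numbers can be tied together with the pooled-variance formulas. In this sketch the standard error and pooled SD are back-calculated from the mean difference 2.232 and t = 3.95 quoted in the slides, because the group means and SDs themselves are not available here. The value 2.086 is the tabulated t value for 20 df, and the function names are illustrative.

```python
from math import sqrt

T_20_025 = 2.086  # t table, 20 df, 0.025 in each tail


def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation of two groups with SDs s1 and s2."""
    return sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, standard error of the difference) for the two-sample t-test."""
    se = pooled_sd(n1, s1, n2, s2) * sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se, se


n_lean, n_obese = 13, 9
diff, t_reported = 2.232, 3.95          # values quoted in the slides

se = diff / t_reported                  # since t = difference / se
s_p = se / sqrt(1 / n_lean + 1 / n_obese)
ci = (diff - T_20_025 * se, diff + T_20_025 * se)
print(f"se = {se:.3f}, pooled SD = {s_p:.3f}, 95% C.I. = ({ci[0]:.2f}, {ci[1]:.2f})")

# Round trip: feeding the pooled SD back in as both group SDs reproduces t.
t_check, _ = two_sample_t(diff, s_p, n_obese, 0.0, s_p, n_lean)
```

The interval recovered this way matches the (1.05, 3.41) given in the slide, so the test and the confidence interval tell the same story.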

28 Two Independent Groups of Observations. Assumptions: each set of observations is sampled from a population with a Normal distribution, and the variances of the two populations are the same. If the sample sizes of the two groups are > 30 then, because of the Central Limit Theorem, we can perform the test even if the data do not look very close to Normal in either or both groups. For small samples that are not Normally distributed, and/or for populations with unequal variances, we should use a non-parametric method, the Mann-Whitney test (or the Wilcoxon rank sum test).

29 Two Independent Groups of Observations: Mann-Whitney Test. The Mann-Whitney test requires all observations to be ranked as if they were from a single sample. Then T, the sum of the ranks in the smaller group (either group can be taken if they have equal size), is calculated and a P value is found from tables. In our case T = 150; from the Mann-Whitney table, P value < 0.01, so we reject $H_0$.
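
A sketch of the rank-sum calculation follows. The slide finds the P value from tables; the Normal approximation used here is a substitute technique, applied to the reported T = 150 with group sizes 9 and 13. The small demonstration data set and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_smaller_group(group_a, group_b):
    """Rank all observations together; return the rank sum T of the smaller group."""
    small, large = sorted((group_a, group_b), key=len)
    pooled = sorted(small + large)
    # Tied values share the mean of the positions they occupy.
    positions = {}
    for position, value in enumerate(pooled, start=1):
        positions.setdefault(value, []).append(position)
    rank_of = {value: sum(pos) / len(pos) for value, pos in positions.items()}
    return sum(rank_of[x] for x in small)


def rank_sum_z(t, n_small, n_large):
    """Normal approximation: (T - mean of T under H0) / (SD of T under H0)."""
    total = n_small + n_large
    mean_t = n_small * (total + 1) / 2
    sd_t = sqrt(n_small * n_large * (total + 1) / 12)
    return (t - mean_t) / sd_t


# Reported rank sum for the smaller group (9 obese women, 13 lean women).
z = rank_sum_z(150, n_small=9, n_large=13)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, approximate two-sided P = {p_value:.4f}")

# Tiny hypothetical example of the ranking step itself.
demo_t = rank_sum_smaller_group([1.2, 3.4], [2.0, 5.1, 6.3])
```

The approximate P value is below 0.01, consistent with the table lookup. For samples this small the tabulated exact values remain the safer reference.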

30 Two Independent Groups of Observations Mann-Whitney Test

31 Testing the Assumptions. How to test Normality? Most people just make a histogram of the data and check whether it has a bell shape. Remember, though, that the assumption is not that the sample has a Normal distribution but that it comes from a population which does. For large samples we expect to see a bell-shaped histogram if the population is Normal, but with small samples it is quite unlikely to get a symmetric distribution even if the population is Normally distributed. There are formal methods that test for Normality, such as the Shapiro-Wilk test or the Shapiro-Francia test, and you can find them in most statistical packages. You can also use common sense and ask whether it is reasonable to assume that the population of interest is Normally distributed. When the data are not Normally distributed and are skewed, it is better to try some transformations first, such as the logarithmic one, in order to make their shape symmetric, and then perform a parametric test on the transformed data, instead of going directly to a non-parametric test.
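
The effect of a logarithmic transformation on a skewed sample can be illustrated with the moment coefficient of skewness, which is near zero for symmetric data. The data below are invented for the illustration and the helper name is hypothetical. This is a descriptive check, not a formal Normality test.

```python
from math import log
from statistics import mean


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed measurements (a few very large values).
raw = [1, 2, 2, 3, 4, 6, 9, 15, 30, 80]
logged = [log(x) for x in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before log: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

If the transformed data look acceptably symmetric, the parametric test is carried out on the log scale and its results are interpreted on that scale.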

32 Testing the Assumptions. How to test equality of variances? Most people just look at how close the 2 sample variances are. Instead you can perform a hypothesis test with the null hypothesis that the two variances are equal; this test is called the F test.

33 Testing the Assumptions. Table: Serum thyroxine level (nmol/l) in 16 hypothyroid infants by severity of symptoms (Hulse et al., 1979): marked symptoms (n = 7) and slight or no symptoms (n = 9), with the Mean and SD of each group. We wish to compare thyroxine levels in the two groups defined by severity of symptoms, but the sample standard deviations are markedly different. The test statistic is

$$F = \frac{s_1^2}{s_2^2},$$

where $s_i$ is the standard deviation of the $i$-th group, referred to the F distribution with $n_1 - 1 = 6$ and $n_2 - 1 = 8$ df. If $F < F_{n_1-1,\,n_2-1,\,1-\alpha/2}$ or $F > F_{n_1-1,\,n_2-1,\,\alpha/2}$, reject $H_0$. Here P value < 0.01 (area to the right of F under the F distribution with 6, 8 df), so we reject $H_0$.

34 Testing the Assumptions. Alternatively we could calculate a 95% C.I. for the ratio of the variances:

$$\left(\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{n_1-1,\,n_2-1,\,0.025}},\; \frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{n_1-1,\,n_2-1,\,0.975}}\right) = (1.49, 38.61),$$

with $F_{\nu_1,\nu_2,a}$ the value that has area $a$ to its right. This range does not include the value of 1. Thus the variance in the marked symptoms group is larger than the one in the slight or no symptoms group, so we cannot use the t-test and have to use a non-parametric method.
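
The variance-ratio test and its interval can be sketched for the same degrees of freedom (6 and 8). The two standard deviations below are hypothetical, since the table values are not available here. The critical values are quoted from standard F tables and should be checked against whichever table or package is in use.

```python
F_UPPER_6_8 = 4.65  # F table: 6 and 8 df, 0.025 in the upper tail
F_UPPER_8_6 = 5.60  # F table: 8 and 6 df, 0.025 in the upper tail


def variance_ratio_test(s1, s2):
    """F test at the 5% level for SDs from samples of 7 and 9 observations."""
    f = s1 ** 2 / s2 ** 2
    # The lower 2.5% point of F(6, 8) is the reciprocal of the upper point of F(8, 6).
    lower_critical = 1 / F_UPPER_8_6
    reject = f > F_UPPER_6_8 or f < lower_critical
    # 95% C.I. for the variance ratio: (F / upper point, F / lower point).
    ci = (f / F_UPPER_6_8, f * F_UPPER_8_6)
    return f, reject, ci


# Hypothetical standard deviations for the two groups.
f, reject, ci = variance_ratio_test(s1=30.0, s2=10.0)
print(f"F = {f:.2f}, reject H0: {reject}, 95% C.I. = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The test rejects exactly when the interval excludes 1, so the two presentations always agree.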

35 Testing the Assumptions. The F test is not robust to a violation of Normality. Alternatively one can use Levene's test, available in statistical packages, which is not strongly dependent on the assumption of Normality of the two groups.

### Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Introduction to Statistics and Quantitative Research Methods

Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.

### Nonparametric tests these test hypotheses that are not statements about population parameters (e.g.,

CHAPTER 13 Nonparametric and Distribution-Free Statistics Nonparametric tests these test hypotheses that are not statements about population parameters (e.g., 2 tests for goodness of fit and independence).

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### MEASURES OF LOCATION AND SPREAD

Paper TU04 An Overview of Non-parametric Tests in SAS : When, Why, and How Paul A. Pappas and Venita DePuy Durham, North Carolina, USA ABSTRACT Most commonly used statistical procedures are based on the

### Quantitative Methods for Finance

Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### Parametric and Nonparametric: Demystifying the Terms

Parametric and Nonparametric: Demystifying the Terms By Tanya Hoskin, a statistician in the Mayo Clinic Department of Health Sciences Research who provides consultations through the Mayo Clinic CTSA BERD

### STAT 350 Practice Final Exam Solution (Spring 2015)

PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

### www.rmsolutions.net R&M Solutons

Ahmed Hassouna, MD Professor of cardiovascular surgery, Ain-Shams University, EGYPT. Diploma of medical statistics and clinical trial, Paris 6 university, Paris. 1A- Choose the best answer The duration

### Experimental Designs (revisited)

Introduction to ANOVA Copyright 2000, 2011, J. Toby Mordkoff Probably, the best way to start thinking about ANOVA is in terms of factors with levels. (I say this because this is how they are described

### Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

### Non-Inferiority Tests for One Mean

Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

### Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

### Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### Chapter 7. One-way ANOVA

Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

### Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model

Assumptions Assumptions of linear models Apply to response variable within each group if predictor categorical Apply to error terms from linear model check by analysing residuals Normality Homogeneity

### ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R.

ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R. 1. Motivation. Likert items are used to measure respondents attitudes to a particular question or statement. One must recall

### Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### Reporting Statistics in Psychology

This document contains general guidelines for the reporting of statistics in psychology research. The details of statistical reporting vary slightly among different areas of science and also among different

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

### Descriptive and Inferential Statistics

General Sir John Kotelawala Defence University Workshop on Descriptive and Inferential Statistics Faculty of Research and Development 14 th May 2013 1. Introduction to Statistics 1.1 What is Statistics?

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### 12: Analysis of Variance. Introduction

1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider

### Descriptive Analysis

Research Methods William G. Zikmund Basic Data Analysis: Descriptive Statistics Descriptive Analysis The transformation of raw data into a form that will make them easy to understand and interpret; rearranging,

### The InStat guide to choosing and interpreting statistical tests

Version 3.0 The InStat guide to choosing and interpreting statistical tests Harvey Motulsky 1990-2003, GraphPad Software, Inc. All rights reserved. Program design, manual and help screens: Programming:

### COMPARING DATA ANALYSIS TECHNIQUES FOR EVALUATION DESIGNS WITH NON -NORMAL POFULP_TIOKS Elaine S. Jeffers, University of Maryland, Eastern Shore*

COMPARING DATA ANALYSIS TECHNIQUES FOR EVALUATION DESIGNS WITH NON -NORMAL POFULP_TIOKS Elaine S. Jeffers, University of Maryland, Eastern Shore* The data collection phases for evaluation designs may involve

### Mind on Statistics. Chapter 12

Mind on Statistics Chapter 12 Sections 12.1 Questions 1 to 6: For each statement, determine if the statement is a typical null hypothesis (H 0 ) or alternative hypothesis (H a ). 1. There is no difference

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### The Statistics Tutor s Quick Guide to

statstutor community project encouraging academics to share statistics support resources All stcp resources are released under a Creative Commons licence The Statistics Tutor s Quick Guide to Stcp-marshallowen-7

### Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases:

Profile Analysis Introduction Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases: ) Comparing the same dependent variables

### Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

### A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

### STATISTICS FOR PSYCHOLOGISTS

STATISTICS FOR PSYCHOLOGISTS SECTION: STATISTICAL METHODS CHAPTER: REPORTING STATISTICS Abstract: This chapter describes basic rules for presenting statistical results in APA style. All rules come from

### Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory

LA-UR-12-24572 Approved for public release; distribution is unlimited Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory Alicia Garcia-Lopez Steven R. Booth September 2012

### DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University

DATA ANALYSIS QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University Quantitative Research What is Statistics? Statistics (as a subject) is the science

### The Assumption(s) of Normality

The Assumption(s) of Normality Copyright 2000, 2011, J. Toby Mordkoff This is very complicated, so I ll provide two versions. At a minimum, you should know the short one. It would be great if you knew

### Terminating Sequential Delphi Survey Data Collection

A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to

### Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

### Analyzing Research Data Using Excel

Analyzing Research Data Using Excel Fraser Health Authority, 2012 The Fraser Health Authority ( FH ) authorizes the use, reproduction and/or modification of this publication for purposes other than commercial

### TABLE OF CONTENTS. About Chi Squares... 1. What is a CHI SQUARE?... 1. Chi Squares... 1. Hypothesis Testing with Chi Squares... 2

About Chi Squares TABLE OF CONTENTS About Chi Squares... 1 What is a CHI SQUARE?... 1 Chi Squares... 1 Goodness of fit test (One-way χ 2 )... 1 Test of Independence (Two-way χ 2 )... 2 Hypothesis Testing

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### AP Statistics 2010 Scoring Guidelines

AP Statistics 2010 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Types of Data, Descriptive Statistics, and Statistical Tests for Nominal Data. Patrick F. Smith, Pharm.D. University at Buffalo Buffalo, New York

Types of Data, Descriptive Statistics, and Statistical Tests for Nominal Data Patrick F. Smith, Pharm.D. University at Buffalo Buffalo, New York . NONPARAMETRIC STATISTICS I. DEFINITIONS A. Parametric

### 2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

### Likert Scales. are the meaning of life: Dane Bertram

are the meaning of life: Note: A glossary is included near the end of this handout defining many of the terms used throughout this report. Likert Scale \lick urt\, n. Definition: Variations: A psychometric

### Introduction to Statistics with GraphPad Prism (5.01) Version 1.1

Babraham Bioinformatics Introduction to Statistics with GraphPad Prism (5.01) Version 1.1 Introduction to Statistics with GraphPad Prism 2 Licence This manual is 2010-11, Anne Segonds-Pichon. This manual

### Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

### Analysis of Data. Organizing Data Files in SPSS. Descriptive Statistics

Analysis of Data Claudia J. Stanny PSY 67 Research Design Organizing Data Files in SPSS All data for one subject entered on the same line Identification data Between-subjects manipulations: variable to

### Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

### Dongfeng Li. Autumn 2010

Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### Statistics in Medicine Research Lecture Series CSMC Fall 2014

Catherine Bresee, MS Senior Biostatistician Biostatistics & Bioinformatics Research Institute Statistics in Medicine Research Lecture Series CSMC Fall 2014 Overview Review concept of statistical power

### Statistical Functions in Excel

Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

### Unit 27: Comparing Two Means

Unit 27: Comparing Two Means Prerequisites Students should have experience with one-sample t-procedures before they begin this unit. That material is covered in Unit 26, Small Sample Inference for One

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

There are three kinds of people in the world: those who are good at math and those who are not. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 Positive Views The record of a month

### Online 12 - Sections 9.1 and 9.2-Doug Ensley

Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 12 - Sections 9.1 and 9.2 1. Does a P-value of 0.001 give strong evidence or not especially strong

### Analysis of Questionnaires and Qualitative Data Non-parametric Tests

Analysis of Questionnaires and Qualitative Data Non-parametric Tests JERZY STEFANOWSKI Instytut Informatyki Politechnika Poznańska Lecture SE 2013, Poznań Recalling Basics Measurement Scales Four scales

### Crash Course on Basic Statistics

Crash Course on Basic Statistics Marina Wahl, marina.w4hl@gmail.com University of New York at Stony Brook November 6, 2013 Contents 1 Basic Probability 1.1 Basic Definitions

### List of Examples. Examples 319

Examples 319 List of Examples DiMaggio and Mantle. 6 Weed seeds. 6, 23, 37, 38 Vole reproduction. 7, 24, 37 Wooly bear caterpillar cocoons. 7 Homophone confusion and Alzheimer's disease. 8 Gear tooth strength.

### Sample Size Planning, Calculation, and Justification

Sample Size Planning, Calculation, and Justification Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa

### AP Statistics 2005 Scoring Guidelines

AP Statistics 2005 Scoring Guidelines The College Board: Connecting Students to College Success The College Board is a not-for-profit membership association whose mission is to connect students to college

### An analysis method for a quantitative outcome and two categorical explanatory variables.

Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

### p ˆ (sample mean and sample

Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can't just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics

### P(every one of the seven intervals covers the true mean yield at its location) = 3.

1 Let = number of locations at which the computed confidence interval for that location hits the true value of the mean yield at its location has a binomial(7, 0.95) (a) P(every one of the seven intervals

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### Introduction to Hypothesis Testing

I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

### = \$96 = \$24. (b) The degrees of freedom are. s n. 7.3. For the mean monthly rent, the 95% confidence interval for µ is

Chapter 7 Solutions 7.1 (a) The standard error of the mean is s/√n = \$96/√16 = \$24. (b) The degrees of freedom are df = n − 1 = 15. 7.2 In each case, use df = n − 1; if that number is not in Table D, drop to the

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4

STATISTICS 8, FINAL EXAM NAME: KEY Seat Number: Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 Make sure you have 8 pages. You will be provided with a table as well, as a separate

### Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### The Variability of P-Values. Summary

The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Department Tech Report

### Mathematics. Probability and Statistics Curriculum Guide. Revised 2010

Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

### MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

STT315 Practice Ch 5-7 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The length of time a traffic signal stays green (nicknamed

### Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)

1 Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) I. Authors should report effect sizes in the manuscript and tables when reporting

### Prospects, Problems of Marketing Research and Data Mining in Turkey

Prospects, Problems of Marketing Research and Data Mining in Turkey Sema Kurtuluş and Kemal Kurtuluş Abstract The objective of this paper is to review and assess the methodological issues and problems in

### Organizing Your Approach to a Data Analysis

Biost/Stat 578 B: Data Analysis Emerson, September 29, 2003 Handout #1 Organizing Your Approach to a Data Analysis The general theme should be to maximize thinking about the data analysis and to minimize

### ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media

ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media Abstract: The growth of social media is astounding and part of that success was

### Over the past decade, the use of evidence-based. Interpretation and Use of Statistics in Nursing Research ABSTRACT

Volume 19, Number 2, pp. 211-222, 2008, AACN Interpretation and Use of Statistics in Nursing Research Karen K. Giuliano, PhD, RN, FAAN Michelle Polanowicz, MSN,

### Chi Square Tests. Chapter 10. 10.1 Introduction

Contents 10 Chi Square Tests 10.1 Introduction 10.2 The Chi Square Distribution 10.3 Goodness of Fit Test 10.4 Chi Square