Non Parametric Statistics


 Rudolf Sullivan
 2 years ago
1 Non Parametric Statistics. Interdepartmental Postgraduate Programme "Occupational and Environmental Health: Management and Economic Evaluation". Dimitris Fouskakis
2 Introduction So far in the course we have assumed that the data come from some known distribution, e.g. the Normal, or that the Central Limit Theorem holds. Methods of estimation and hypothesis testing have been based on these assumptions. These procedures are usually called parametric statistical methods. If these assumptions are not met, nonparametric statistical methods must be used.
3 Revision Inferential statistics: hypothesis testing versus confidence intervals; parametric versus nonparametric methods; quantitative data; categorical data; relation between two variables; relation between several variables.
4 What does inferential statistics do? It helps to quantify how certain we can be when we make inferences from a given sample. The three approaches: a) hypothesis testing, b) confidence intervals, c) both. "I know how to do a t-test, but I don't know when!"
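5 Hypothesis Testing $H_0: W = w_a$ versus $H_A: W \neq w_a$. α: the Type I error, or significance level of the test, is usually set to a value like 5%. Power = $1 - \beta$, the power of the test; a common value is 80%. Power calculations: have I chosen a correct number of observations? The researcher's decision against the truth: if $H_0$ is really true, rejecting $H_0$ is a Type I error (probability α) and accepting $H_0$ is a correct decision; if $H_0$ is really false, rejecting $H_0$ is a correct decision (probability $1 - \beta$, the power) and accepting $H_0$ is a Type II error (probability β).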
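As a rough illustration of the power-calculation question on this slide, the sketch below uses the standard normal-approximation formula for a two-sided one-sample z-test, $n = \left((z_{\alpha/2} + z_{\beta})\,\sigma/\delta\right)^2$. The function name `sample_size_one_mean` and the inputs `delta` and `sigma` are assumptions made for illustration; they do not come from the slides.

```python
import math
from statistics import NormalDist


def sample_size_one_mean(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n for a two-sided one-sample z-test (normal approximation).

    delta: smallest difference from the hypothesized mean worth detecting
    sigma: assumed (known) population standard deviation
    """
    z = NormalDist()                      # standard normal N(0, 1)
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the Type I error
    z_beta = z.inv_cdf(power)             # quantile matching power = 1 - beta
    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)                   # round up to a whole observation


# Hypothetical inputs: detect a shift of half a standard deviation.
n_needed = sample_size_one_mean(delta=0.5, sigma=1.0)
print(n_needed)
```

Halving `delta` roughly quadruples the required sample size, which is why the smallest difference of practical interest should be fixed before data collection.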
6 Statistical and clinical significance Statistical significance (P value): the probability that this sample was drawn from a population with characteristics consistent with $H_0$ was low enough to reject $H_0$ (usual rule: reject $H_0$ if P value < 0.05; why 0.05 and not 0.04?). Clinical (practical) significance: an important finding with implications for your clinical practice.
7 Summary points for P values P values, or significance levels, measure the strength of the evidence against the null hypothesis; the smaller the P value, the stronger the evidence. An arbitrary division of results into significant or not, according to the P value, was not the intention of the founders. A P value of 0.05 provides some, but not strong, evidence against the null hypothesis, but it is reasonable to say that a P value < 0.001 does. Results of medical research should not be reported as significant or not but should be interpreted in the context of the type of study and other available evidence.
8 Correct Definition of the P value The P value is the chance of getting a test statistic as extreme as, or more extreme than, the observed one when the null hypothesis is true. The P value is NOT the chance of the null hypothesis being right.
9 Confidence Intervals (C.I.) The wrong definition: there is a 95% (e.g.) chance that the parameter of interest will fall within the particular interval. The exact definition: if we take a series of samples from the same population and construct, e.g., a 95% confidence interval from each of them, then 95% of these confidence intervals will contain the true parameter. Application to hypothesis testing: check whether the interval includes $w_a$ in order to decide whether to reject the null hypothesis.
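The repeated-sampling definition on this slide can be checked by simulation. The sketch below is a minimal illustration under stated assumptions: a Normal population with a known standard deviation (so a z-interval is used), and hypothetical values for the true mean, the sample size and the number of repetitions. None of these numbers come from the slides.

```python
import random
from statistics import NormalDist, mean


def coverage(true_mean=50.0, sigma=10.0, n=25, reps=2000, level=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for 95%
    half_width = z * sigma / n ** 0.5               # sigma treated as known
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = mean(sample)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / reps


print(coverage())
```

The printed fraction should be close to 0.95. Each individual interval either contains the true mean or does not; the 95% describes the procedure over many samples.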
10 How to choose a statistical test... The type of data: continuous versus categorical. The distribution: parametric versus nonparametric. The sample size. The number of samples. The relation of samples to each other: paired versus unpaired. The number of variables: univariate versus multivariate.
11 Parametric versus Nonparametric Parametric methods: make distributional assumptions; usually assume a Normal distribution or rely on the Central Limit Theorem; assume comparable standard deviations. Nonparametric methods: distribution-free; usually P value (nonparametric) > P value (parametric), since nonparametric tests are generally less powerful when the parametric assumptions hold; usually no confidence intervals in the nonparametric tests.
12 Statistical methods for continuous data Univariate tests to compare means, by number of samples. Parametric: one sample, one-sample t-test; two paired samples, paired t-test; two unpaired samples, two-sample t-test; three or more samples, one-way ANOVA. Nonparametric: one sample, Wilcoxon signed rank sum test; two paired samples, Wilcoxon matched pairs signed rank sum test; two unpaired samples, Mann-Whitney U test; three or more samples, Kruskal-Wallis test.
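The table on this slide can be read as a simple lookup. The sketch below encodes it under the assumption that its columns are one sample, two paired samples, two unpaired samples, and three or more samples; the function name `choose_test` and its arguments are hypothetical.

```python
def choose_test(n_samples, paired=False, parametric=True):
    """Look up the univariate test for continuous data from the slide's table."""
    if n_samples == 1:
        return "One-sample t-test" if parametric else "Wilcoxon signed rank sum test"
    if n_samples == 2 and paired:
        return ("Paired t-test" if parametric
                else "Wilcoxon matched pairs signed rank sum test")
    if n_samples == 2:
        return "Two-sample t-test" if parametric else "Mann-Whitney U test"
    if n_samples >= 3:
        return "One-way ANOVA" if parametric else "Kruskal-Wallis test"
    raise ValueError("n_samples must be a positive integer")


print(choose_test(2, paired=False, parametric=False))
```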
13 One Sample Table 1: Average daily energy intake (kJ) over 10 days of 11 healthy women (columns: subject, average daily energy intake in kJ; summary rows: mean, SD). What can we say about the energy intake of these women in relation to a recommended daily intake of 7725 kJ?
14 One Sample To answer the question we can carry out a test of the null hypothesis that our data are a sample from a population with a specific hypothesized mean. The test is called the one-sample t-test: $t = \frac{\text{sample mean} - \text{hypothesized mean}}{\text{standard error of sample mean}} = \frac{\bar{x} - k}{s/\sqrt{n}}$, with $n = 11$, referred to the t distribution with $n - 1 = 10$ df (t table). If $t > t_{n-1,\alpha/2}$ or $t < -t_{n-1,\alpha/2}$, reject $H_0$. P value = 2 × (area to the right of $|t|$ under the t distribution with 10 df) < 0.02. Reject $H_0$.
15 One Sample Alternatively we could calculate a 95% C.I. for the mean intake: $\bar{x} \pm t_{10,0.025}\, s/\sqrt{n} = (5986, 7521)$. This range does not include the recommended level of 7725 kJ. If we assume that the women are a representative sample, then we can infer that for all women of this age the average daily energy consumption is less than is recommended.
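The interval and the t-test are two views of the same calculation, which the sketch below illustrates by working backwards from the interval (5986, 7521) quoted on this slide. The critical value 2.228 for 10 df is an assumption taken from a standard t table, and the recovered mean, SD and t statistic are only approximate because the interval is rounded.

```python
import math

# Values quoted on the slides.
ci_low, ci_high = 5986.0, 7521.0   # 95% C.I. for the mean intake (kJ)
k = 7725.0                         # recommended daily intake (kJ)
n = 11                             # number of women

# Assumption: two-sided 5% critical value of t with 10 df, from a t table.
t_crit = 2.228

x_bar = (ci_low + ci_high) / 2         # the interval is centred on the sample mean
half_width = (ci_high - ci_low) / 2    # equals t_crit * s / sqrt(n)
se = half_width / t_crit               # standard error s / sqrt(n)
s = se * math.sqrt(n)                  # implied sample standard deviation
t = (x_bar - k) / se                   # one-sample t statistic

print(f"mean ~ {x_bar:.1f}, SD ~ {s:.1f}, t ~ {t:.2f}")
print("reject H0 at the 5% level" if abs(t) > t_crit else "do not reject H0")
```

Because 7725 lies outside the interval, the absolute value of t necessarily exceeds the critical value, so both approaches lead to the same decision.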
16 One Sample Assumptions: the data come from a Normal distribution. If the sample size is > 30 then, because of the Central Limit Theorem, we can perform the test even if the data do not look very close to Normal. For small samples that are not Normally distributed we should use a nonparametric method like the Sign Test or the Wilcoxon signed rank sum test.
17 One Sample The Sign Test (or Binomial Test) If there were no difference on average between the sample values and the hypothesized value, we would expect an equal number of observations above and below that value. We can thus use the Binomial distribution, or the Normal approximation to it, to evaluate the probability of the observed frequencies when the true probability of exceeding the recommended intake is p = 1/2. In our dataset 2 women had daily intakes above 7725 kJ and 9 below. With $r$ the number of observations on one side of the hypothesized value and $n = 11$, we calculate the test statistic $z = \frac{r - np}{\sqrt{np(1-p)}}$: with $r = 2$, $z = -2.11$, or, counting the observations below instead ($r = 9$), $z = 2.11$. If $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$, reject $H_0$. From the Normal table, P value = 2 × (area to the right of $|z|$ under the N(0,1) distribution) = 0.035. Reject $H_0$.
18 One Sample The Sign Test (or Binomial Test) If any of the observations is exactly the same as the hypothesized value then we ignore it in the calculation. Thus the sample size is the number of observations that differ from the hypothesized value. Because of the small sample size it would be better to use the continuity correction in the Normal approximation, i.e. subtract 1/2 from the absolute value of the numerator: $z = \frac{|r - np| - 1/2}{\sqrt{np(1-p)}} = 1.81$. From the Normal table, P value = 0.07. Do not reject $H_0$.
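The sign test calculations on these two slides can be reproduced from the counts alone (r = 2 observations above the hypothesized value out of n = 11). The sketch below is a minimal illustration: the function names are hypothetical, and the exact Binomial version is an addition for comparison that the slides do not show.

```python
import math
from statistics import NormalDist


def sign_test_normal(r, n, continuity=False):
    """Two-sided sign test via the Normal approximation to Binomial(n, 1/2).

    r: number of observations above the hypothesized value
    n: number of observations that differ from the hypothesized value
    Returns (|z|, two-sided P value).
    """
    p = 0.5
    numerator = abs(r - n * p)
    if continuity:
        numerator -= 0.5                      # continuity correction
    z = numerator / math.sqrt(n * p * (1 - p))
    p_value = 2 * (1 - NormalDist().cdf(z))   # twice the upper-tail area
    return z, p_value


def sign_test_exact(r, n):
    """Two-sided exact P value: twice the smaller Binomial(n, 1/2) tail."""
    k = min(r, n - r)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(sign_test_normal(2, 11))                   # energy intake example
print(sign_test_normal(2, 11, continuity=True))  # with continuity correction
print(sign_test_exact(2, 11))
```

With so few observations the corrected and exact P values sit on the other side of 0.05 from the uncorrected one, which is the point the slide makes about small samples.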
19 One Sample The Wilcoxon Signed Rank Test A more powerful test than the sign test. Calculate the difference between each observation and the value of interest. Ignoring the signs of the differences, rank them in order of magnitude. Calculate the sum of the ranks of all the negative (or positive) differences and find the P value from the corresponding table.
20 One Sample The Wilcoxon Signed Rank Test The sum of the ranks of the positive differences is 3 + 5 = 8. From the Wilcoxon signed rank test table, P value < 0.05. Reject $H_0$.
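The ranking steps of the Wilcoxon signed rank test can be sketched as follows. The data are hypothetical (they are not the slide's Table 1), the function name is an assumption, and two conventions are assumed rather than taken from the slides: observations equal to the hypothesized value are dropped, and tied magnitudes share the average of their ranks. The P value would still be read from a table or a statistical package.

```python
def signed_rank_sums(values, hypothesized):
    """Return (sum of ranks of positive differences, sum for negative ones)."""
    # Differences from the hypothesized value; exact zeros are dropped.
    diffs = [v - hypothesized for v in values if v != hypothesized]
    ordered = sorted(diffs, key=abs)          # rank by magnitude, ignoring sign

    # Assign ranks 1..n, giving tied magnitudes the average of their ranks.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2
        for m in range(i, j + 1):
            ranks[m] = average_rank
        i = j + 1

    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return positive, negative


# Hypothetical data with a hypothesized value of 50.
sample = [41, 44, 46, 47, 48, 52, 55, 38, 43, 49]
print(signed_rank_sums(sample, 50))
```

The smaller of the two sums is the one usually compared against the table.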
21 Two Groups of Paired Observations Paired data arise when the same individuals are studied more than once, usually in different circumstances, and also when we have 2 different groups of subjects who have been individually matched, for example in a matched-pair case-control study. Very common in medical research. We are interested in the average difference between the observations for each individual and the variability of these differences.
22 Two Groups of Paired Observations Table 2: Mean daily dietary intake (kJ) over 10 premenstrual and 10 postmenstrual days (columns: subject, premenstrual, postmenstrual, difference; summary rows: mean, SD). We can use the one-sample t-test to calculate a P value for the comparison of the observed mean difference (in kJ) with the hypothesized value of zero, i.e. the null hypothesis is that pre- and postmenstrual dietary intake is the same: $t = \frac{\bar{d}}{se(\bar{d})} = \frac{\bar{d}}{s_d/\sqrt{11}}$, where $s_d$ is the standard deviation of the differences, referred to the t distribution with $n - 1 = 10$ df (t table). The P value is below the significance level. Reject $H_0$.
23 Two Groups of Paired Observations Alternatively we could calculate a 95% C.I. for the mean difference: $\bar{d} \pm t_{10,0.025}\, s_d/\sqrt{n} = (1074.2, 1566.8)$, where $s_d$ is the standard deviation of the differences. This range does not include the hypothesized value of 0 kJ. If we assume that the women are a representative sample, then we can infer that dietary intake is much lower in the postmenstrual period.
24 Two Groups of Paired Observations The same assumptions as before hold for the difference data (thus we require normality for the differences, not for each set of data). If these assumptions are not met then we can apply the same nonparametric techniques as before to the difference data. For example, we see that all 11 differences have the same sign, so the test statistic of the sign test with the continuity correction is $z = \frac{|r - np| - 1/2}{\sqrt{np(1-p)}} = 3.02$. From the Normal table, P value = 0.003. Reject $H_0$.
25 Two Independent Groups of Observations The most common statistical analysis, e.g. clinical trials or observational studies comparing different groups of subjects. Table: 24 hour total energy expenditure (MJ/day) in groups of lean (n = 13) and obese (n = 9) women, with the mean and SD of each group. Is there a true difference in the 24 hour total energy expenditure between lean and obese women?
26 Two Independent Groups of Observations To answer this question we can carry out a test of the null hypothesis that the two populations, obese and lean women, have the same mean total energy expenditure. The test is called the two-sample t-test: $t = \frac{\bar{x}_1 - \bar{x}_2}{se(\bar{x}_1 - \bar{x}_2)} = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}} = 3.95$, where $s_p$ is the pooled standard deviation given by $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$, with $s_i^2$ the variance of the $i$-th group. If $t > t_{n_1+n_2-2,\alpha/2}$ or $t < -t_{n_1+n_2-2,\alpha/2}$, reject $H_0$. P value < 0.001 (t distribution with $n_1 + n_2 - 2 = 20$ df). Reject $H_0$.
27 Two Independent Groups of Observations Alternatively we could calculate a 95% C.I. for the mean difference: $(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,0.025}\, s_p\sqrt{1/n_1 + 1/n_2}$, which is centred on the observed difference 2.232 and gives (1.05, 3.41). This range does not include the value of 0 MJ/day. Thus the total energy expenditure in the obese women is greater than that of the lean women.
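The pooled standard deviation and the interval above can be sketched in a few lines. The two helper functions implement the textbook formulas for $s_p$ and the standard error of the difference; the second half works backwards from the mean difference (2.232), the t statistic (3.95) and the group sizes quoted on the slides. The critical value 2.086 for 20 df is an assumption taken from a standard t table, and the implied standard error and pooled SD are approximate because the quoted values are rounded.

```python
import math


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation s_p from two sample SDs and sample sizes."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def se_difference(s_p, n1, n2):
    """Standard error of the difference between two sample means."""
    return s_p * math.sqrt(1 / n1 + 1 / n2)


# Values quoted on the slides.
diff, t_stat = 2.232, 3.95      # mean difference (MJ/day) and t statistic
n1, n2 = 13, 9                  # lean and obese group sizes

# Assumption: two-sided 5% critical value of t with 20 df, from a t table.
t_crit = 2.086

se = diff / t_stat                              # implied standard error
s_p = se / math.sqrt(1 / n1 + 1 / n2)           # implied pooled SD
ci = (diff - t_crit * se, diff + t_crit * se)   # 95% C.I. for the difference
print(f"s_p ~ {s_p:.2f}, 95% C.I. ~ ({ci[0]:.2f}, {ci[1]:.2f})")
```

The recovered interval agrees with the (1.05, 3.41) reported on the slide, which is a useful consistency check between the test and the interval.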
28 Two Independent Groups of Observations Assumptions: each set of observations is sampled from a population with a Normal distribution, and the variances of the two populations are the same. If the sample sizes of the two groups are > 30 then, because of the Central Limit Theorem, we can perform the test even if the data do not look very close to Normal in either or both groups. For small samples that are not Normally distributed, and/or for populations with unequal variances, we should use a nonparametric method, the Mann-Whitney test (or Wilcoxon rank sum test).
29 Two Independent Groups of Observations Mann-Whitney Test The Mann-Whitney test requires all observations to be ranked as if they were from a single sample. Then T, the sum of the ranks in the smaller group (either group can be taken if they have equal size), is calculated and a P value is found from the Mann-Whitney table. In our case T = 150, P value < 0.01. Reject $H_0$.
30 Two Independent Groups of Observations Mann-Whitney Test
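The rank sum T described for the Mann-Whitney test can be computed as in the sketch below. The data are hypothetical (the slide's energy expenditure values are not reproduced here), the function name is an assumption, and tied values are given the average of their ranks, a common convention that the slides do not spell out.

```python
def rank_sum_smaller_group(group_a, group_b):
    """Mann-Whitney / Wilcoxon rank sum statistic T for the smaller group."""
    # Pool the observations, remembering which group each one came from.
    pooled = sorted([(x, "a") for x in group_a] + [(x, "b") for x in group_b])

    # Rank 1..N as a single sample; tied values share the average rank.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for m in range(i, j + 1):
            ranks[m] = (i + j + 2) / 2
        i = j + 1

    smaller = "a" if len(group_a) <= len(group_b) else "b"
    return sum(r for (_, label), r in zip(pooled, ranks) if label == smaller)


# Hypothetical data, not the slide's energy expenditure table.
lean = [6.1, 7.0, 7.5, 7.5, 8.1, 8.8]
obese = [8.8, 9.2, 9.7, 10.9]
print(rank_sum_smaller_group(lean, obese))
```

T is then compared with tabulated critical values for the two group sizes, or handed to a statistical package for a P value.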
31 Testing the Assumptions How to test normality? Most people just make a histogram of the data and check whether it looks bell-shaped. Remember, though, that the assumption is not that the sample has the Normal distribution but that it comes from a population which does. For large samples we expect to see a bell-shaped histogram if the population is Normal, but with small samples it is quite unlikely that we get a symmetric distribution even if the population is Normally distributed. There are formal methods that test for normality, and you can find them in most statistical packages, like the Shapiro-Wilk test or the Shapiro-Francia test. You can also use common sense and ask whether it is reasonable to assume that the population of interest is Normally distributed. When the data are not Normally distributed and are skewed, it is better to try some transformations first, like the logarithmic one, in order to make their shape symmetric, and then perform a parametric test on the transformed data, instead of going directly to a nonparametric test.
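The effect of a logarithmic transformation on skewed data can be illustrated with a simple moment-based skewness measure. This is only a rough sketch with hypothetical data; the `skewness` helper is an assumption for illustration and is not a substitute for a formal test such as Shapiro-Wilk, which a statistical package would provide.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Moment-based sample skewness: about 0 for symmetric data."""
    m, sd = mean(values), pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)


# Hypothetical right-skewed measurements (all positive, so the log exists).
raw = [1.2, 1.5, 1.7, 2.0, 2.3, 2.9, 3.8, 5.5, 9.0, 21.0]
logged = [math.log(v) for v in raw]

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after log: {skewness(logged):.2f}")
```

If the transformed data look roughly symmetric, the parametric test is carried out on the log scale and the results are interpreted on that scale.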
32 Testing the Assumptions How to test equality of variances? Most people just look at how close the 2 sample variances are. Instead you can perform a hypothesis test with the null hypothesis that the two variances are equal; this test is called the F test.
33 Testing the Assumptions Table: Serum thyroxine level (nmol/l) in 16 hypothyroid infants by severity of symptoms (Hulse et al., 1979): marked symptoms (n = 7) and slight or no symptoms (n = 9), with the mean and SD of each group. We wish to compare thyroxine levels in the two groups defined by severity of symptoms, but the sample standard deviations are markedly different. The test statistic is $F = s_1^2/s_2^2$, where $s_i$ is the standard deviation of the $i$-th group, referred to the F distribution with $n_1 - 1 = 6$ and $n_2 - 1 = 8$ df. If $F < F_{n_1-1,n_2-1,1-\alpha/2}$ or $F > F_{n_1-1,n_2-1,\alpha/2}$, reject $H_0$. P value (from the area to the right of F under the F distribution with 6, 8 df) < 0.01. Reject $H_0$.
34 Testing the Assumptions Alternatively we could calculate a 95% C.I. for the ratio of the variances: $\left(\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{n_1-1,n_2-1,0.975}},\ \frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{n_1-1,n_2-1,0.025}}\right) = (1.49, 38.61)$, where $F_{\cdot,\cdot,q}$ here denotes the $q$ quantile of the F distribution. This range does not include the value of 1. Thus the variance in the marked symptoms group is larger than the one in the slight or no symptoms group, so we cannot use the t-test and we have to use a nonparametric method.
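The variance-ratio interval can be sketched as a small helper. Everything in the example call is hypothetical: the standard deviations and the two critical values are illustrative numbers, not the thyroxine data or real table entries. In practice the critical values for $(n_1 - 1, n_2 - 1)$ df would be looked up in an F table or obtained from a statistical package.

```python
def variance_ratio_ci(s1, s2, f_upper, f_lower):
    """Variance ratio F = s1^2 / s2^2 and its confidence interval.

    f_upper, f_lower: upper and lower critical values of the F distribution
    with (n1 - 1, n2 - 1) df, looked up in a table or a statistical package.
    """
    f_ratio = s1 ** 2 / s2 ** 2
    return f_ratio, (f_ratio / f_upper, f_ratio / f_lower)


# Hypothetical SDs and critical values, for illustration only.
f_ratio, (low, high) = variance_ratio_ci(s1=36.0, s2=15.0, f_upper=4.0, f_lower=0.2)
print(f_ratio, low, high)
print("variances differ" if not (low <= 1 <= high) else "no evidence of a difference")
```

Dividing by the larger critical value gives the lower limit, and dividing by the smaller one gives the upper limit; an interval that excludes 1 corresponds to rejecting equality of variances.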
35 Testing the Assumptions The F test is not robust to violations of Normality. Alternatively one can use Levene's test, available in statistical packages, which does not depend strongly on the assumption of Normality in the two groups.
More informationThere are three kinds of people in the world those who are good at math and those who are not. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 Positive Views The record of a month
More informationOnline 12  Sections 9.1 and 9.2Doug Ensley
Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics  Ensley Assignment: Online 12  Sections 9.1 and 9.2 1. Does a Pvalue of 0.001 give strong evidence or not especially strong
More informationAnalysis of Questionnaires and Qualitative Data Nonparametric Tests
Analysis of Questionnaires and Qualitative Data Nonparametric Tests JERZY STEFANOWSKI Instytut Informatyki Politechnika Poznańska Lecture SE 2013, Poznań Recalling Basics Measurment Scales Four scales
More informationCrash Course on Basic Statistics
Crash Course on Basic Statistics Marina Wahl, marina.w4hl@gmail.com University of New York at Stony Brook November 6, 2013 2 Contents 1 Basic Probability 5 1.1 Basic Definitions...........................................
More informationList of Examples. Examples 319
Examples 319 List of Examples DiMaggio and Mantle. 6 Weed seeds. 6, 23, 37, 38 Vole reproduction. 7, 24, 37 Wooly bear caterpillar cocoons. 7 Homophone confusion and Alzheimer s disease. 8 Gear tooth strength.
More informationSample Size Planning, Calculation, and Justification
Sample Size Planning, Calculation, and Justification Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa
More informationAP Statistics 2005 Scoring Guidelines
AP Statistics 2005 Scoring Guidelines The College Board: Connecting Students to College Success The College Board is a notforprofit membership association whose mission is to connect students to college
More informationAn analysis method for a quantitative outcome and two categorical explanatory variables.
Chapter 11 TwoWay ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that
More informationp ˆ (sample mean and sample
Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can t just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics
More informationP(every one of the seven intervals covers the true mean yield at its location) = 3.
1 Let = number of locations at which the computed confidence interval for that location hits the true value of the mean yield at its location has a binomial(7,095) (a) P(every one of the seven intervals
More informationPremaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
More informationCourse Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics
Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGrawHill/Irwin, 2010, ISBN: 9780077384470 [This
More informationIntroduction to Hypothesis Testing
I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters  they must be estimated. However, we do have hypotheses about what the true
More information= $96 = $24. (b) The degrees of freedom are. s n. 7.3. For the mean monthly rent, the 95% confidence interval for µ is
Chapter 7 Solutions 71 (a) The standard error of the mean is df = n 1 = 15 s n = $96 = $24 (b) The degrees of freedom are 16 72 In each case, use df = n 1; if that number is not in Table D, drop to the
More informationMINITAB ASSISTANT WHITE PAPER
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationSTATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4
STATISTICS 8, FINAL EXAM NAME: KEY Seat Number: Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 Make sure you have 8 pages. You will be provided with a table as well, as a separate
More informationTwosample hypothesis testing, II 9.07 3/16/2004
Twosample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For twosample tests of the difference in mean, things get a little confusing, here,
More informationKSTAT MINIMANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINIMANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
More informationThe Variability of PValues. Summary
The Variability of PValues Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 276958203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report
More informationMathematics. Probability and Statistics Curriculum Guide. Revised 2010
Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
STT315 Practice Ch 57 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The length of time a traffic signal stays green (nicknamed
More informationCalculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)
1 Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) I. Authors should report effect sizes in the manuscript and tables when reporting
More informationProspects, Problems of Marketing Research and Data Mining in Turkey
Prospects, Problems of Marketing Research and Data Mining in Turkey Sema Kurtulu, and Kemal Kurtulu Abstract The objective of this paper is to review and assess the methodological issues and problems in
More informationOrganizing Your Approach to a Data Analysis
Biost/Stat 578 B: Data Analysis Emerson, September 29, 2003 Handout #1 Organizing Your Approach to a Data Analysis The general theme should be to maximize thinking about the data analysis and to minimize
More informationISyE 2028 Basic Statistical Methods  Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media
ISyE 2028 Basic Statistical Methods  Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media Abstract: The growth of social media is astounding and part of that success was
More informationOver the past decade, the use of evidencebased. Interpretation and Use of Statistics in Nursing Research ABSTRACT
AACN19_2_211 222 4/14/08 5:44 PM Page 211 Volume 19, Number 2, pp.211 222 2008, AACN Interpretation and Use of Statistics in Nursing Research Karen K. Giuliano, PhD, RN, FAAN Michelle Polanowicz, MSN,
More informationChi Square Tests. Chapter 10. 10.1 Introduction
Contents 10 Chi Square Tests 703 10.1 Introduction............................ 703 10.2 The Chi Square Distribution.................. 704 10.3 Goodness of Fit Test....................... 709 10.4 Chi Square
More informationBasic Concepts in Research and Data Analysis
Basic Concepts in Research and Data Analysis Introduction: A Common Language for Researchers...2 Steps to Follow When Conducting Research...3 The Research Question... 3 The Hypothesis... 4 Defining the
More informationConfidence Intervals on Effect Size David C. Howell University of Vermont
Confidence Intervals on Effect Size David C. Howell University of Vermont Recent years have seen a large increase in the use of confidence intervals and effect size measures such as Cohen s d in reporting
More information