Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions


 Charity Horn
 2 years ago
BMS 617 Statistical Techniques for the Biomedical Sciences
Lecture 11: Chi-Squared and Fisher's Exact Tests

Chi-Squared and Fisher's Exact Tests

This lecture presents two similarly structured tests, the chi-squared test and Fisher's exact test.
Fisher's exact test is a computationally intensive test.
The chi-squared test provides a good approximation in many cases.
Both tests are for purely categorical data, i.e. they look at the count of values in given categories.
There are two distinct uses for these tests:
  Testing if data match an expected distribution
  Comparing proportions among different groups

Observed vs Expected Distributions

The chi-squared test can be used to test whether a discrete distribution of results follows a predicted or expected distribution.
Example (from Motulsky):
A 2007 study investigated heart disease among firefighters.
It hypothesized that the risk of death from heart disease was related to the duty the firefighter was performing at the time.
The null hypothesis is that the risk of death from heart disease is independent of the duty performed.

Kales et al. study

[Table: for each duty (Fire suppression, Alarm response, Physical training, Other duties, Total), the number of heart attacks observed and the proportion of time spent in that duty.]

Under the null hypothesis, the number of heart attack deaths occurring during each duty would be proportional to the time spent in that duty.
So, for example, the number of deaths expected during fire suppression would be 2.0% of 449 = 8.98.
Observed and Expected values

[Table: for each duty (Fire suppression, Alarm response, Physical training, Other duties, Total), the number of heart attacks observed and the expected number.]

The number of deaths from heart attack while on active duty, and particularly while actively working on fire suppression, is far higher than the expected number.

The Chi-squared goodness of fit test

The chi-squared goodness of fit test computes a p-value for the following question: If the data were distributed as defined in the null hypothesis, what is the probability of seeing a discrepancy between the observed and expected values this large solely by random sample selection?
In our example, the distribution for the null hypothesis is simply the one in which deaths from heart disease are distributed in proportion to the time spent in each duty.
The chi-squared statistic is defined as follows:
For each category:
  Subtract the expected value from the observed value and square the difference
  Divide the square by the expected value
Sum these up. This is the chi-squared (χ²) value.

The chi-squared distribution

The chi-squared value under the null hypothesis follows a known distribution that depends on the number of degrees of freedom.
The number of degrees of freedom in a goodness of fit test is the number of categories minus one.
In the firefighter example there are three degrees of freedom.
The p-value is obtained by comparing the χ² value, at the given number of degrees of freedom, to a table of χ² values, or by using statistical software.
For our example, χ² = 2245, d.f. = 3, and p < 10^-6.

Chi-squared goodness of fit test: cautions

When using the chi-squared goodness of fit test, always use actual count data for the observed numbers. Do not use percentages or normalized values.
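The steps above (expected counts from the null proportions, then the sum of squared differences divided by the expected values) can be sketched in a few lines of Python. This is only an illustration: the observed counts and proportions below are hypothetical, not the firefighter data, and the function names are made up for this sketch. The p-value helper uses a closed form that holds only for three degrees of freedom; statistical software would normally be used instead.

```python
from math import erfc, exp, pi, sqrt


def expected_counts(total, proportions):
    """Expected count per category under the null: total * proportion."""
    return [total * p for p in proportions]


def chi_squared_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_upper_tail_3df(x):
    """Upper-tail probability of a chi-squared variable with exactly 3 d.f.

    Closed form valid for 3 degrees of freedom only (an assumption of this sketch).
    """
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


# Hypothetical raw counts (not percentages) in four categories.
observed = [30, 20, 25, 25]
# Hypothetical null hypothesis: all four categories equally likely.
proportions = [0.25, 0.25, 0.25, 0.25]

expected = expected_counts(sum(observed), proportions)
chi2 = chi_squared_statistic(observed, expected)
df = len(observed) - 1          # number of categories minus one
p_value = chi2_upper_tail_3df(chi2)

print(f"chi-squared = {chi2:.2f}, d.f. = {df}, p = {p_value:.4f}")
```

The same `expected_counts` helper gives the fire suppression figure from the lecture: `expected_counts(449, [0.02])` yields an expected count of about 8.98.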
The observed values should always be integers.
If the expected number is less than 5 in any category, or less than 10 if there are only two categories, the results may be invalid. For two categories, a binomial test may be used instead, or the sample size can be increased to achieve the required expected values.
Do not confuse this test (the goodness of fit test) with the chi-squared test for comparing proportions (the test of independence).

Proportion comparison studies

Many studies, particularly clinical studies, answer a question of the type "Does exposure to a risk factor (or a specific treatment, etc.) change the rate of disease?"
Note that in these studies, both the dependent variable (disease status) and the independent variable (treated vs untreated, etc.) are categorical.

Types of study

Some jargon:
Incidence: rate of new cases of disease.
Prevalence: proportion of a sample which has the disease.
Cross-sectional study: a sample is chosen without control as to how many are affected with the disease or as to how many were exposed to the risk factor. The subjects are divided according to exposure to the risk factor and disease prevalence is compared.
Prospective or longitudinal study: two groups are selected, one exposed to a risk factor (or treatment), one not. They are followed over the natural timeline of the disease, and disease incidence is compared.
Experimental study: a single sample is chosen and randomly divided into two groups. One group is treated (or exposed to a risk factor), one is not. Incidence is compared between the two groups.
Case-control or retrospective study: two groups are selected, one with the disease, one without. The number exposed to the risk factor (or treated, etc.) is compared between the two groups.
Contingency Tables

Data from all these types of study may be summarized in a contingency table.
Rows in the table represent exposure to the risk factor or treatment status.
Columns in the table represent disease status.
Cells in the table are counts of the number of subjects in that category.

                          Disease   No disease   Total
Exposed/Treated           A         B            A+B
Not Exposed/Not Treated   C         D            C+D
Total                     A+C       B+D          A+B+C+D

Example experimental study

Example study we saw previously: CABG vs PTCA in coronary artery disease (CAD) patients.
Sample of 1829 patients with CAD: 914 randomly assigned to bypass (CABG), 915 to angioplasty (PTCA).
Focus on five-year survival rates.

[Table: Survived 5 years, Did not survive 5 years, and Total, for CABG, PTCA, and Total.]

[...]% of those receiving CABG died within five years, and 41.3% of those receiving PTCA died within five years.
Clearly there is little difference.

Diabetes and CABG/PTCA

The study also looked at survival rates for CABG vs PTCA among patients treated for diabetes.
Note this is still an experimental study, as the treatments were assigned and the outcome was measured subsequently, even though the diabetes diagnosis was retrieved retrospectively.
Five-year survival among diabetic patients:

[Table: Survived 5 years, Did not survive 5 years, and Total, for CABG, PTCA, and Total.]

Mortality rates were 48.3% for CABG-treated patients and 60.1% for PTCA-treated patients.

Diabetes and CABG/PTCA: Data Analysis

The aim is to know the extent to which these data generalize to the general population of CAD patients with diabetes.
One way is to compute confidence intervals for these proportions. We saw how to do this in lecture 2.
The 95% CI for CABG is 41% to 56% and for PTCA is 53% to 67%.
The CIs overlap, but this does not mean the difference is not statistically significant.

Attributable Risk and NNT

The difference between the two proportions is called the attributable risk: the amount of risk which can be attributed to the treatment or to exposure to the risk factor.
For our example the attributable risk is 60.1% - 48.3% = 11.8%.
This means that for diabetic patients there is an additional 11.8% risk of mortality in five years associated with choosing PTCA over CABG.
The reciprocal of the attributable risk is called the number needed to treat (NNT). For this example, NNT = 1/0.118 = 8.5.
This means that if we choose to give CAD patients with diabetes CABG instead of PTCA, then for every 8.5 patients treated, one will survive five years who would not have done so under PTCA.

Relative Risk

The relative risk is the ratio of the risks. In this example it is 48.3%/60.1% = 0.80.
This means that CAD patients with diabetes treated with CABG have 80% of the mortality risk of PTCA-treated patients.
The confidence interval of the relative risk can also be computed. For this example the 95% confidence interval is 66% to 98%.
Be careful with the percentages:
  For attributable risk, the percentages represent percentages of subjects.
  For relative risk, the proportion 0.80 (80%) represents a relative probability; it is the proportion of risk assumed by one group relative to the other group.

Attributable Risk or Relative Risk?
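Both kinds of measure can be reproduced from the two mortality rates quoted for the diabetic patients. The following Python sketch is illustrative only; the function name and argument names are assumptions of this example, not part of the lecture.

```python
def risk_measures(risk_comparison, risk_treatment):
    """Return attributable risk, NNT and relative risk for two risks.

    risk_comparison: risk (as a proportion) in the comparison group, here PTCA.
    risk_treatment:  risk (as a proportion) in the treatment group, here CABG.
    """
    attributable_risk = risk_comparison - risk_treatment
    if attributable_risk == 0:
        raise ValueError("NNT is undefined when the two risks are equal")
    return {
        "attributable_risk": attributable_risk,       # difference of proportions
        "nnt": 1.0 / attributable_risk,               # reciprocal of the difference
        "relative_risk": risk_treatment / risk_comparison,  # ratio of proportions
    }


# Five-year mortality among diabetic patients: 60.1% (PTCA) and 48.3% (CABG).
measures = risk_measures(risk_comparison=0.601, risk_treatment=0.483)

print(f"attributable risk = {measures['attributable_risk']:.3f}")
print(f"NNT               = {measures['nnt']:.1f}")
print(f"relative risk     = {measures['relative_risk']:.2f}")
```

Rounded as in the lecture, this gives an attributable risk of 0.118, an NNT of 8.5 and a relative risk of 0.80.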
Whether attributable risk (or NNT) or relative risk is more useful depends on the context.
When making a choice between treatments, the relative risk is often more intuitive: risk under CABG is 80% of the risk under PTCA.
But on a population level, the relative risk can be misleading.
Imagine a vaccine which reduces the occurrence of a disease by 60%. The relative risk of taking the vaccine is 40% compared to no vaccine.
The utility of the vaccine depends on the prevalence of the disease in the general population.
If the disease is very rare, say a prevalence of 1 in 1 million, then administering the vaccine will save 6 in 10 million people.
If the disease is common, say a prevalence of 1%, then the vaccine will save 6 in 1000 people.

Fisher's Exact Test

The data analysis can be supplemented with a p-value.
Remember that the p-value cannot be interpreted without a null hypothesis.
The null hypothesis is that the categories corresponding to the rows in the contingency table are independent of the categories corresponding to the columns.
In this example, the null hypothesis is that the mortality rate does not depend on which treatment was used.
The p-value is the probability of seeing mortality rates this different under the assumption of the null hypothesis.
The best way to compute a p-value for contingency table data is with Fisher's exact test.
For very large sample sizes (in the 100,000s and above), this test is computationally unwieldy, and a chi-squared test can be used as an approximation.
For the example data, p = [...] by Fisher's exact test.

Summary

Chi-squared and Fisher's exact tests can be used for two scenarios:
  Testing if categorical data match an expected distribution
  Testing for independence of two categorical variables, i.e. testing if proportions are equivalent among two or more sets of counted data
The chi-squared goodness of fit test is used to test if categorical data match an expected distribution.
  It is an approximate test.
  It is good when expected values are at least 5 in each category (at least 10 if there are only two categories).
  It must use actual counts, not proportions or normalized values.
Contingency tables tabulate subject counts in different categories.
  Usually the rows in the table are used for the independent (predictor) variable and the columns for the dependent (outcome) variable.

Summary (continued)

Attributable risk is the difference in proportions between treatment categories.
Number needed to treat (NNT) is the reciprocal of attributable risk.
Relative risk is the ratio of proportions between treatment categories.
Fisher's exact test can be used to compute a p-value for the null hypothesis that there is no relationship between the dependent and independent variable (i.e. the variables are independent of each other).
  It is computationally prohibitive for very large data sets.
  The chi-squared test for independence can be used instead (but this is never needed in practice).
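To make the idea of Fisher's exact test concrete, here is a minimal Python sketch for a 2x2 contingency table laid out as A, B, C, D in the table shown earlier. It assumes one common two-sided convention (summing the probabilities of all tables with the same margins that are no more likely than the observed one); the function name is hypothetical and the counts in the example are made up, since the lecture's cell counts are not reproduced here. In practice a statistics package would be used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution under the null hypothesis of independence.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_ways = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k, given the margins.
        return comb(row1, k) * comb(row2, col1 - k) / total_ways

    lowest = max(0, col1 - row2)    # smallest feasible top-left count
    highest = min(row1, col1)       # largest feasible top-left count
    p_observed = prob(a)

    # Sum probabilities of tables at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: 3 of 4 exposed subjects diseased, 1 of 4 unexposed.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p:.4f}")
```

The loop enumerates every table compatible with the margins, which is why the exact test becomes slow for very large counts.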
More informationAppendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP. Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study.
Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study Prepared by: Centers for Disease Control and Prevention National
More informationLAB : THE CHISQUARE TEST. Probability, Random Chance, and Genetics
Period Date LAB : THE CHISQUARE TEST Probability, Random Chance, and Genetics Why do we study random chance and probability at the beginning of a unit on genetics? Genetics is the study of inheritance,
More informationMultiple samples: Pairwise comparisons and categorical outcomes
Multiple samples: Pairwise comparisons and categorical outcomes Patrick Breheny May 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/19 Introduction Pairwise comparisons In the previous lecture,
More informationChapter 14: 16, 9, 12; Chapter 15: 8 Solutions When is it appropriate to use the normal approximation to the binomial distribution?
Chapter 14: 16, 9, 1; Chapter 15: 8 Solutions 141 When is it appropriate to use the normal approximation to the binomial distribution? The usual recommendation is that the approximation is good if np
More informationNONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)
NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of pvalues classical significance testing depend on assumptions
More informationThe GoodnessofFit Test
on the Lecture 49 Section 14.3 HampdenSydney College Tue, Apr 21, 2009 Outline 1 on the 2 3 on the 4 5 Hypotheses on the (Steps 1 and 2) (1) H 0 : H 1 : H 0 is false. (2) α = 0.05. p 1 = 0.24 p 2 = 0.20
More informationA POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment
More informationMTH 140 Statistics Videos
MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative
More informationHomework 5 Solutions
Math 130 Assignment Chapter 18: 6, 10, 38 Chapter 19: 4, 6, 8, 10, 14, 16, 40 Chapter 20: 2, 4, 9 Chapter 18 Homework 5 Solutions 18.6] M&M s. The candy company claims that 10% of the M&M s it produces
More informationRecall this chart that showed how most of our course would be organized:
Chapter 4 OneWay ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
More informationAn analysis method for a quantitative outcome and two categorical explanatory variables.
Chapter 11 TwoWay ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that
More informationStatistical Rules of Thumb
Statistical Rules of Thumb Second Edition Gerald van Belle University of Washington Department of Biostatistics and Department of Environmental and Occupational Health Sciences Seattle, WA WILEY AJOHN
More informationCalculating PValues. Parkland College. Isela Guerra Parkland College. Recommended Citation
Parkland College A with Honors Projects Honors Program 2014 Calculating PValues Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating PValues" (2014). A with Honors Projects.
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationElementary Statistics
lementary Statistics Chap10 Dr. Ghamsary Page 1 lementary Statistics M. Ghamsary, Ph.D. Chapter 10 Chisquare Test for Goodness of fit and Contingency tables lementary Statistics Chap10 Dr. Ghamsary Page
More informationPRACTICE PROBLEMS FOR BIOSTATISTICS
PRACTICE PROBLEMS FOR BIOSTATISTICS BIOSTATISTICS DESCRIBING DATA, THE NORMAL DISTRIBUTION 1. The duration of time from first exposure to HIV infection to AIDS diagnosis is called the incubation period.
More informationUsing SPSS to perform ChiSquare tests:
Using SPSS to perform ChiSquare tests: Graham Hole, January 2006: page 1: Using SPSS to perform ChiSquare tests: This handout explains how to perform the two types of ChiSquare test that were discussed
More informationTwosample inference: Continuous data
Twosample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with twosample inference for continuous data As
More information