Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions
- Charity Horn
- 10 years ago
Transcription
BMS 617 Statistical Techniques for the Biomedical Sciences
Lecture 11: Chi-Squared and Fisher's Exact Tests

Chi-Squared and Fisher's Exact Tests
- This lecture presents two similarly structured tests: the chi-squared test and Fisher's exact test
  - Fisher's exact test is computationally intensive
  - The chi-squared test provides a good approximation in many cases
- Both tests are for purely categorical data, i.e. they look at the count of values in given categories
- There are two distinct uses for these tests:
  - Testing whether data match an expected distribution
  - Comparing proportions among different groups

Observed vs Expected Distributions
- The chi-squared test can be used to test whether a discrete distribution of results follows a predicted or expected distribution.
- Example (from Motulsky):
  - A 2007 study (Kales et al.) investigated heart disease among firefighters
  - It hypothesized that the risk of death from heart disease was related to the duty the firefighter was performing at the time
  - The null hypothesis is that the risk of death from heart disease is independent of the duty performed

Kales et al. study
Table (numeric values were not preserved in this transcription):
  Columns: Duty; Number of heart attacks observed; Proportion of time spent in duty
  Rows: Fire suppression; Alarm response; Physical training; Other duties; Total

- Under the null hypothesis, the number of heart attack deaths occurring during each duty would be proportional to the time spent in that duty
- So, for example, the number of deaths expected during fire suppression would be 2.0% of 449 = 8.98
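The expected-count step above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is hypothetical, and the only numbers taken from the lecture are the 2.0% share for fire suppression and the total of 449 deaths.

```python
def expected_counts(total, proportions):
    """Expected count per category under the null hypothesis: total * proportion."""
    return {category: total * p for category, p in proportions.items()}

# Only the fire-suppression share (2.0%) and the total (449) appear in the lecture;
# the other duties would be added to this dictionary in the same way.
expected = expected_counts(449, {"Fire suppression": 0.020})
print(round(expected["Fire suppression"], 2))  # 8.98
```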
Observed and Expected values
Table (numeric values were not preserved in this transcription):
  Columns: Duty; Number of heart attacks observed; Expected number
  Rows: Fire suppression; Alarm response; Physical training; Other duties; Total

- The number of deaths from heart attack while on active duty, and particularly during fire suppression, is far higher than the expected number

The Chi-squared goodness of fit test
- The chi-squared goodness of fit test computes a p-value for the following question: if the data were distributed as defined in the null hypothesis, what is the probability of seeing a discrepancy between the observed and expected values at least this large solely through random sampling?
- In our example, the distribution under the null hypothesis is simply the one in which deaths from heart disease are distributed in proportion to the time spent in each duty
- The chi-squared statistic is defined as follows:
  - For each category: subtract the expected value from the observed value and square the difference, then divide the square by the expected value
  - Sum these up. This is the chi-squared (χ²) value: χ² = Σ (observed − expected)² / expected

The chi-squared distribution
- Under the null hypothesis, the chi-squared value follows a known distribution that depends on the number of degrees of freedom
- The number of degrees of freedom in a goodness of fit test is the number of categories minus one
  - In the firefighter example there are three degrees of freedom
- The p-value is obtained by looking up the χ² value, for the given number of degrees of freedom, in a table of χ² values, or by using statistical software.
- For our example, χ² = 2245, d.f. = 3, and p < 10^-6

Chi-squared goodness of fit test: cautions
When using the chi-squared goodness of fit test:
- Always use actual count data for the observed numbers
  - Do not use percentages or normalized values
  - The observed values should always be integers
- If the expected number is less than 5 in any category, or less than 10 if there are only two categories, the results may be invalid.
  - For two categories, a binomial test may be used instead
  - Or the sample size can be increased to achieve the required expected values
- Do not confuse this test (the goodness of fit test) with the chi-squared test for comparing proportions (the test of independence).

Proportion comparison studies
- Many studies, particularly clinical studies, answer a question of the type "Does exposure to a risk factor (or a specific treatment, etc.) change the rate of disease?"
- Note that in these studies, both the dependent variable (disease status) and the independent variable (treated vs untreated, etc.) are categorical

Types of study
Some jargon:
- Incidence: rate of new cases of disease
- Prevalence: proportion of a sample that has the disease
- Cross-sectional study: a sample is chosen without control over how many are affected by the disease or how many were exposed to the risk factor. The subjects are divided by exposure to the risk factor, and disease prevalence is compared.
- Prospective or longitudinal study: two groups are selected, one exposed to a risk factor (or treatment), one not. They are followed over the natural timeline of the disease, and disease incidence is compared.
- Experimental study: a single sample is chosen and randomly divided into two groups. One group is treated (or exposed to a risk factor), one is not. Incidence is compared between the two groups.
- Case-control or retrospective study: two groups are selected, one with the disease, one without. The number exposed to the risk factor (or treated, etc.) is compared between the two groups.
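Before moving on to contingency tables, the goodness of fit calculation described earlier (sum of (observed − expected)² / expected, with categories − 1 degrees of freedom) can be sketched in Python. This is a minimal, dependency-free illustration, not the lecture's own code: the function names and the counts are hypothetical, and the p-value uses a standard recurrence for the chi-squared upper tail at integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with integer df."""
    y = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))  # base case df = 1
    else:
        a, q = 1.0, math.exp(-y)             # base case df = 2
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def goodness_of_fit(observed, proportions):
    """Return (chi-squared statistic, degrees of freedom, p-value).

    observed    -- raw integer counts per category (never percentages)
    proportions -- expected share of each category under the null hypothesis
    """
    total = sum(observed)
    expected = [total * p for p in proportions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
stat, df, p = goodness_of_fit([30, 20, 50], [0.25, 0.25, 0.50])
print(stat, df, round(p, 3))  # 2.0 2 0.368
```

In practice a statistics package would normally be used instead; SciPy's `scipy.stats.chisquare` takes observed and expected counts and returns the same statistic and p-value.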
Contingency Tables
- Data from all these types of study may be summarized in a contingency table
- Rows in the table represent exposure to the risk factor or treatment status
- Columns in the table represent disease status
- Cells in the table are counts of the number of subjects in that category

                           Disease   No disease   Total
  Exposed/Treated          A         B            A+B
  Not Exposed/Not Treated  C         D            C+D
  Total                    A+C       B+D          A+B+C+D

Example experimental study
- Example study we saw previously: CABG vs PTCA in coronary artery disease (CAD) patients
- Sample of 1829 patients with CAD: 914 randomly assigned to bypass (CABG), 915 to angioplasty (PTCA)
- Focus on five-year survival rates

Table (numeric values were not preserved in this transcription):
  Columns: Survived 5 years; Did not survive 5 years; Total
  Rows: CABG; PTCA; Total

- [value missing]% of those receiving CABG died within five years, and 41.3% of those receiving PTCA died within five years
- Clearly little difference

Diabetes and CABG/PTCA
- The study also looked at survival rates for CABG vs PTCA among patients treated for diabetes
- Note this is still an experimental study, as the treatments were assigned and the outcome was measured subsequently, even though the diabetes diagnosis was retrieved retrospectively
- Five-year survival among diabetic patients:

Table (numeric values were not preserved in this transcription):
  Columns: Survived 5 years; Did not survive 5 years; Total
  Rows: CABG; PTCA; Total

- Mortality rates were 48.3% for CABG-treated patients and 60.1% for PTCA-treated patients

Diabetes and CABG/PTCA: Data Analysis
- The aim is to know the extent to which these data generalize to the general population of CAD patients with diabetes
- One way is to compute confidence intervals for these proportions
  - We saw how to do this in lecture 2.
  - The 95% CI for CABG is 41% to 56% and for PTCA is 53% to 67%
  - The CIs overlap, but this does not mean the difference is not statistically significant

Attributable Risk and NNT
- The difference between the two proportions is called the attributable risk
  - It is the amount of risk that can be attributed to the treatment or to exposure to the risk factor
- For our example the attributable risk is 60.1% − 48.3% = 11.8%.
  - This means that for diabetic patients, choosing PTCA over CABG is associated with an additional 11.8 percentage points of five-year mortality risk.
- The reciprocal of the attributable risk is called the number needed to treat (NNT).
  - For this example, NNT = 1/0.118 = 8.5.
  - This means that if we give CAD patients with diabetes CABG instead of PTCA, then for every 8.5 patients treated, one will survive five years who would not have survived under PTCA.

Relative Risk
- The relative risk is the ratio of the risks.
  - In this example it is 48.3%/60.1% = 0.80.
  - This means that CAD patients with diabetes treated with CABG have 80% of the mortality risk of PTCA-treated patients.
- The confidence interval of the relative risk can also be computed.
  - For this example the 95% confidence interval is 66% to 98%.
- Be careful with the percentages:
  - For attributable risk, the percentages represent percentages of subjects
  - For relative risk, the proportion 0.80 (80%) represents a relative probability; it is the proportion of risk assumed by one group relative to the other group

Attributable Risk or Relative Risk?
- Whether attributable risk (or NNT) or relative risk is more useful depends on the context
- When making a choice between treatments, the relative risk is often more intuitive
  - Risk under CABG is 80% of the risk under PTCA
- But on a population level, the relative risk can be misleading
  - Imagine a vaccine which reduces the occurrence of a disease by 60%
  - The relative risk with the vaccine is 40% compared to no vaccine
  - The utility of the vaccine depends on the prevalence of the disease in the general population
  - If the disease is very rare, say a prevalence of 1 in 1 million, then administering the vaccine will prevent 6 cases per 10 million people (10 cases fall to 4)
  - If the disease is common, say a prevalence of 1%, then the vaccine will prevent 6 cases per 1000 people (10 cases fall to 4)

Fisher's Exact Test
- The data analysis can be supplemented with a p-value
- Remember that the p-value cannot be interpreted without a null hypothesis
  - The null hypothesis is that the categories corresponding to the rows in the contingency table are independent of the categories corresponding to the columns.
  - In this example, the null hypothesis is that the mortality rate does not depend on which treatment was used
- The p-value is the probability of seeing mortality rates at least this different under the assumption of the null hypothesis.
- The best way to compute a p-value for contingency table data is with Fisher's Exact Test.
  - For very large sample sizes (in the 100,000s and above), this test is computationally unwieldy, and a chi-squared test can be used as an approximation
- For the example data, p = [value missing] by Fisher's Exact Test

Summary
- Chi-squared and Fisher's Exact Tests can be used for two scenarios:
  - Testing whether categorical data match an expected distribution
  - Testing for independence of two categorical variables, i.e. testing whether proportions are equivalent among two or more sets of counted data
- The chi-squared goodness of fit test is used to test whether categorical data match an expected distribution
  - Approximate test
  - Good when expected values are at least 5 in each category (at least 10 if there are only two categories)
  - Must use actual counts, not proportions or normalized values
- Contingency tables tabulate subject counts in different categories
  - Usually use rows in the table for the independent (predictor) variable
  - Columns for the dependent (outcome) variable

Summary (continued)
- Attributable risk is the difference in proportions between treatment categories
- Number needed to treat (NNT) is the reciprocal of attributable risk
- Relative risk is the ratio of proportions between treatment categories
- Fisher's exact test can be used to compute a p-value for the null hypothesis that there is no relationship between the dependent and independent variables (i.e. the variables are independent of each other)
  - Computationally prohibitive for very large data sets
  - Can use the chi-squared test for independence instead (but never needed in practice)
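The 2x2 quantities in the summary (attributable risk, NNT, relative risk, and a Fisher's exact p-value) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the lecture's own computation: the function names and the counts are hypothetical (the lecture's table counts are not available), the table is laid out with events in the first column, and the two-sided p-value sums the probabilities of all tables that are no more likely than the observed one, which is one common convention. `math.comb` requires Python 3.8 or later.

```python
from math import comb

def risk_summary(a, b, c, d):
    """2x2 table: row 1 = (a events, b non-events), row 2 = (c events, d non-events).

    Returns (attributable risk, number needed to treat, relative risk of row 1 vs row 2).
    """
    risk1 = a / (a + b)
    risk2 = c / (c + d)
    ar = abs(risk1 - risk2)
    nnt = 1 / ar if ar > 0 else float("inf")
    rr = risk1 / risk2 if risk2 > 0 else float("inf")
    return ar, nnt, rr

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value with row and column totals held fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k events in row 1
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    k_min, k_max = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum every table at most as probable as the observed one (small tolerance for ties)
    return sum(prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical counts: 10 of 50 die under treatment 1, 20 of 50 under treatment 2.
ar, nnt, rr = risk_summary(10, 40, 20, 30)
p_value = fisher_exact_two_sided(10, 40, 20, 30)
print(ar, nnt, rr, p_value)
```

SciPy's `scipy.stats.fisher_exact` offers the same test for 2x2 tables and would usually be preferred over a hand-written version.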
SOLUTIONS TO BIOSTATISTICS PRACTICE PROBLEMS
SOLUTIONS TO BIOSTATISTICS PRACTICE PROBLEMS BIOSTATISTICS DESCRIBING DATA, THE NORMAL DISTRIBUTION SOLUTIONS 1. a. To calculate the mean, we just add up all 7 values, and divide by 7. In Xi i= 1 fancy
VI. Introduction to Logistic Regression
VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models
Study Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program
Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program Department of Mathematics and Statistics Degree Level Expectations, Learning Outcomes, Indicators of
Elementary Statistics
lementary Statistics Chap10 Dr. Ghamsary Page 1 lementary Statistics M. Ghamsary, Ph.D. Chapter 10 Chi-square Test for Goodness of fit and Contingency tables lementary Statistics Chap10 Dr. Ghamsary Page
1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material
Final Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
Measurement in ediscovery
Measurement in ediscovery A Technical White Paper Herbert Roitblat, Ph.D. CTO, Chief Scientist Measurement in ediscovery From an information-science perspective, ediscovery is about separating the responsive
Using Excel for inferential statistics
FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied
Nonparametric Statistics
Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.
Categorical Data Analysis
Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods
