# Simple Sensitivity Analyses for Matched Samples. CRSP 500 Spring 2008 Thomas E. Love, Ph. D., Instructor. Goal of a Formal Sensitivity Analysis

Save this PDF as:

Size: px
Start display at page:

Download "Simple Sensitivity Analyses for Matched Samples. CRSP 500 Spring 2008 Thomas E. Love, Ph. D., Instructor. Goal of a Formal Sensitivity Analysis"

## Transcription

1 Goal of a Formal Sensitivity Analysis To replace a general qualitative statement that applies in all observational studies the association we observe between treatment and outcome does not imply causation hidden biases can explain observed associations with a quantitative statement that is specific to what is observed in a particular study to explain the association seen in a particular study, one would need a hidden bias of a particular magnitude. If the association is strong, the hidden bias needed to explain it would be large. If a study is free of hidden bias (main example: a carefully randomized trial), this means that any two units (patients, subjects, whatever) that appear similar in terms of their observed covariates actually have the same chance of assignment to treatment. There is hidden bias if two units with the same observed covariates have different chances of receiving the treatment. A sensitivity analysis asks: How would inferences about treatment effects be altered by hidden biases of various magnitudes? How large would these differences have to be to alter the qualitative conclusions of the study? The Sensitivity Parameter, Γ Suppose we have two units (subjects, patients), say, j and k, with the same observed covariate values x but different probabilities p of treatment assignment (possibly due to some unobserved covariate), so that x [j] = x [k] but that possibly p [j] p [k]. Units j and k might be matched to form a matched pair in our attempt to control overt bias due to the covariates x. The odds that units j and k receive the treatment are, respectively, p[ j] p[ k] and, and the odds ratio is the ratio of these odds. 1 p 1 p [ j] [ k] Imagine that we knew that this odds ratio for units with the same x was at most some number Γ, so that Γ 1. That is, 1 p[ ]( 1 p j [ k] ) Γ Γ p 1 p ( ) [ k] [ j] We call Γ the sensitivity parameter, and it forms the basis for our sensitivity analyses. If Γ = 1, then p [j] = p [k] whenever x [j] = x [k], so the study would be free of hidden bias, and standard statistical methods designed for randomized trials would apply. Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 1

2 If Γ =, then two units who appear similar, who have the same set of observed covariates x, could differ in their odds of receiving the treatment by as much as a factor of, so that one could be twice as likely as the other to receive the treatment. In other words, Γ is a measure of the degree of departure from a study that is free of hidden bias. A sensitivity analysis will consider possible values of Γ and show how the inference might change. A study is sensitive if values of Γ close to 1 could lead to inferences that are very different from those obtained assuming the study is free of hidden bias. A study is insensitive if extreme values of Γ are required to alter the inference. Scenario 1. A Binary Outcome A Sensitivity Analysis for McNemar s Test Exposure: Heavy Smoker vs. Nonsmoker; Outcome: Death due to Lung Cancer (no censoring) Suppose we paired 1000 heavy smokers to 1000 nonsmokers on the basis of a series of baseline characteristics (without using propensity methods, but that doesn t matter here). Totally fake data follow: Heavy Smoker Dies from Lung Cancer Heavy Smoker Doesn t Die from Lung Cancer Total Nonsmoker Dies from Lung Cancer Nonsmoker Doesn t Die from Lung Cancer Total Of the S = 1000 matched pairs, suppose that there were 1 pairs in which exactly one person died of lung cancer. Of these, there were 1 pairs in which the nonsmoker died of lung cancer, and 110 pairs in which the heavy smoker died of lung cancer. In other words, 1 of the pairs are discordant for death from lung cancer. If the study were a randomized experiment, or if it was an observational study free of any hidden bias (neither of which are true), then we d use McNemar s test to compare the 110 lung cancer deaths among smokers to a binomial distribution with 1 trials and probability of success ½ to yield a significance level of p < In R, we d have > Temp <- matrix(c(175,1,110,703), nrow=) > Temp [,1] [,] [1,] [,] > mcnemar.test(temp, correct=f) McNemar's Chi-squared test data: Temp McNemar's chi-squared = , df = 1, p-value <.e-16 Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page

3 In the absence of hidden bias, there would be strong evidence that smoking causes lung cancer. How much hidden bias would need to be present to alter this conclusion? Let T = # of the 1 discordant pairs in which the heavy smoker died of lung cancer. Here T = 110. Here are the key formulae for this scenario: Any particular value of the sensitivity parameter Γ determines the values for p and p, specifically, p = Γ / (1 Γ) and p = 1 / (1 Γ). We then use these p and p values to determine the bounds on the p value for the McNemar statistic for varying values of Γ. The formulae are identical for the two bounds, except for the p and p substitution 1 a Upper Bound for p value is ( p ) ( 1 p ) a= a 1 a 1 a Lower Bound for p value is ( p ) ( 1 p ) a= 110 1a 1a While these formulae look troublingly complicated at first, they are just the binomial probabilities of obtaining a value of T = 110 or higher assuming a binomial distribution with n = 1 trials and common probability p = p (for upper bound) or p = p (for lower bound). Happily, Excel, R, SAS (or any other useful statistical software) can find these two probabilities. Below, I m using an Excel sheet to demonstrate (which is a little imprecise as is the case for most of Excel s statistical functions, but lands you in the right ballpark and is easy to use.) The first part of the sheet (I ll show you the whole thing shortly) is shown below In this case, I ve already typed in the correct values for the by table based on outcomes and exposures in the matched pairs. Remember that you should have one observation in this x table for each pair of subjects. Here, we have 1000 pairs. Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 3

4 We have 1 discordant pairs (i.e. pairs in which exactly one of the two pair members has outcome = Yes) note that the discordant pairs are just the offdiagonal cells here. The test statistic, which is the number of pairs in which the Treated s outcome is Yes but the Control s outcome is No, is 110. The remainder of the sheet does the actual set of sensitivity analysis calculations. Here s the whole thing In the Sensitivity Analysis section, the spreadsheet automatically completes the calculations for gamma values between 1.0 (i.e. no hidden bias) and 6.0, stepping by 0.5. We see, for instance, the p value assuming no hidden bias is 0, to four decimal places (this is obtained from the gamma = 1.0 row). The sensitivity analysis tips over significance at the two-tailed α = 0.05 level somewhere between Γ = 5.0 and Γ = 5.5. To isolate the correct value of the sensitivity parameter to greater detail, you can use the insert gamma value below cell (row 9) and twiddle the value of gamma until the upper bound for the p value is just under This case implies a Γ threshold of about Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 4

5 The conclusion here (stating specifically that Γ > 5 but < 6) would be something like: To attribute the higher rate of death from lung cancer to an unobserved covariate rather than to an effect of smoking, that unobserved covariate would need to produce more than a fivefold increase in the odds of smoking, and it would need to be a near perfect predictor of lung cancer. As we shall see by comparison in later examples, this is a high degree of insensitivity to hidden bias. In many other studies, biases smaller than Γ = 5 could explain the association between treatment and outcome. Now suppose that there were 0 pairs (instead of 110) where the treated patient died but the control patient did not. What is the impact of this change? Scenario. A Continuous Outcome A Sensitivity Analysis for the Wilcoxon Signed Rank Test Exposure: Parent working (or not) in a factory where lead was used to make batteries; Outcome: level of lead found in the child s blood (in μg/dl of whole blood) Morton et al. (198) studied lead in the blood of 33 kids (from different families) whose parents worked in a factory where lead was used in making batteries. The covariate x was two-dimensional, recording age and neighborhood of residence. They matched each exposed child to one control child of the same age and neighborhood whose parents were employed in other industries not using lead. Pair Exposed Control Difference Rank Lead in Children s Blood (μg/dl) Pair Exposed Control Difference Rank The Wilcoxon signed rank statistic for S matched pairs is computed by ranking the absolute value of the differences within each pair from 1 to S, and then summing the ranks of the pairs where the exposed unit had a higher response than the matched control. Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 5

6 Pair Exposed Control Difference Rank Pair Exposed Control Difference Rank In this case 1, we sum up the ranks associated with the shaded pairs above (i.e ) and conclude that the Wilcoxon signed rank statistic for the differences in children s lead levels is 57. It turns out that if there are no ties among the absolute differences and no zero differences and no hidden bias, then the expectation and variance of this test statistic is known, and we appeal to a standard normal distribution to evaluate our test statistic. S( S 1) 33( 34) Here, E( T) = = = 80.5 and 4 4 S( S 1)( S 1) 33( 34)( 67) var ( T ) = = = 313.5, so that SD ( T ) = = T E( T) We then calculate a Z statistic, as Z = = = 4.40, and we can SD ( T ) compare this to a standard normal table to get a two-tailed p value < Alternatively, we could simply let R, or SAS, or some other package complete the calculations. At any rate, if the study was free of hidden bias, this would constitute strong evidence (p < ) of an effect of parental exposure to lead on children s lead levels. 1 Note that the easier thing to do here would be to subtract off the total of the ranks associated with the unshaded pairs (those in which the exposed child DID NOT have more lead in their blood), and then subtract this from the total sum of the integers from 1 to S, which is always just S(S1)/. Note that = 34, and 33*34/ = 561, so that we again get 57 for the Wilcoxon statistic. To do this in Excel, the formula you need is =*(1-NORMSDIST(Z)) where Z is the value of interest. Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 6

7 Now, of course, there were some tied differences (in which average ranks were used) and there was one zero difference. Thus, the null expectation and variance expressions above are a little bit off, but that difference doesn t affect the conclusion here at all. We ve seen that if the study were free of hidden bias, that is, if Γ = 1, then there would be strong evidence that parents occupational exposures to lead increased the level of lead in their children s blood. The sensitivity analysis we ll conduct now asks how this conclusion might be changed by hidden biases of various magnitudes. To establish this sensitivity analysis, we need the following key formulae: ( 1) ps S E T ( ) = and var ( T ) p ( 1 p ) ( 1)( S 1) S S =, where 6 p Γ = Γ 1 With 1 p = Γ 1 in place of p, we get the same expectation and variance for T ( 1) ps S E T ( ) = and var ( T ) p ( 1 p ) ( 1)( S 1) S S =, where 6 1 p = Γ1 1 Note that if Γ = 1, then p = p =, and the expressions revert to the standard formula for the signed rank test shown above. As an example, suppose Γ =, so that matched children might differ in their odds of exposure to lead by a factor of due to hidden bias. 1 1 We have p 1 = =, ( ) 3 ( 33)( 33 1) E T = = 187 and Γ ( 33 1)( 66 1) var ( T ) = 1 = and p Γ = =, so ( ) 3 ( 33)( 33 1) E T = = 374, and Γ ( 33 1)( 66 1) var ( T ) = 1 = SD T ( ) = SD ( T ), so that SD ( T ) = 784. = 5.77, so that SD ( T ) = 784. = 5.77, as will be true for any particular Γ, so we can calculate either alone. Now, the standard normal deviates (Z-scores) are: T E( T ) T E( T ) Z = = = 6.44 and Z = = =.90 SD ( T ) 5.77 SD ( T ) 5.77 And the relevant p value range is therefore from less than (associated with 6.44) to (associated with.90). A hidden bias of size Γ = is insufficient to explain the observed difference between exposed and control children. Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 7

8 Yes, of course, I have a spreadsheet to do the sensitivity analysis calculation here The tipping point for the sensitivity parameter is a little over 4.5. To explain away the observed association between parental exposure to lead and child s lead level, a hidden bias or unobserved covariate would need to increase the odds of exposure by more than a factor of Γ = 4.5. The association cannot be attributed to small hidden biases, but it is somewhat more sensitive to bias than the study of heavy smokers in Scenario 1. Note that all you need to insert here is the number of matched pairs and the Wilcoxon test statistic. If you have a big data set, getting the Wilcoxon statistic will be the most timeconsuming issue Scenario 3. A Censored Survival Outcome A Sensitivity Analysis Exposure: Either Normal or Low Serum Potassium Level in the DIG trial (HF Pts) Outcome: All cause mortality during the follow-up period (i.e. there is censoring) In this case, we re doing a secondary data analysis of the DIG trial, to study heart failure patients with either normal or low serum potassium levels. We identified 1187 matched pairs of a normal potassium and a low potassium HF patient with similar baseline characteristics at admission. The main complicating issue here is that many of the patients were censored (they dropped out of the study, or they survived to the end of the follow-up period). I ll spare you the formulae this time. The tricky part is conceptually simple, but computationally Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 8

9 time-consuming. You need to count the number of pairs with a clear winner (subject who survives longer) and then determine the winner for each of those pairs. Exposed Subject Control Subject Survival Survival Clear Pair Time Censored? Time Censored? Winner? 1 00 No 100 No Exposed 00 Yes 100 No Exposed 3 00 No 100 Yes Unknown 4 00 Yes 100 Yes Unknown No 00 No Control Yes 00 No Unknown No 00 Yes Control Yes 00 Yes Unknown Either 100 Either Unknown In this set of 9 pairs, we have four clear winners, Exposed and Control. The inputs for the spreadsheet below are: The Number of Pairs in which the Winner (longer survival time) can be conclusively determined. # of pairs with a clear winner in which the patient with normal potassium outlived the patient with low potassium In the specific scenario, there were 440 pairs with a clear winner in terms of all-cause mortality, and in 335 of these pairs the winner was the patient with normal potassium. The tipping point for the sensitivity parameter is a little over Γ =.5. Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 9

10 Exercises for Discussion in Class (1 & from Rosenbaum Chapter 4) 1. Many drugs used to treat cancer are quite harsh, and there is the possibility that these drugs can harm hospital workers who are exposed by accident. Kevekordes, Gebel, Hellwig, Dames and Dunkelberg (1998) studied this possibility when a malfunction of a safety hood result[ed] in air flowing from the hood along the arms of the person preparing infusions of antineoplastic drugs. They studied 10 nurses who may have experienced substantial exposures, matching each nurse to a control based on gender, age, and intensity of smoking. They measured genetic damage using the cytokinesis block micronucleus test, reporting mean micronuclei/10 3 binucleate lymphocytes (mm/10 3 ), as follows. Perform an appropriate sensitivity analysis. Pair Ages Smoking Exposed Control 1 37/37 NonS /5 S /3 S /30 NonS /3 NonS /9 NonS /4 NonS /40 S /3 NonS /34 S Starting with 10,87 death certificates with the diagnosis of sporadic motor neuron disease (MND), Graham, Macdonald and Hawkes (1997) examined their birth certificates and found that 70 of these cases with MND had a living twin free of MND. These 70 twin pairs formed the basis for a case-referent study. Because little is known about the causes of MND, they examined many variables as potential causes in their explanatory study. The strongest association they found was with carrying out car or vehicle maintenance. There were 16 twin pairs discordant for this variable, and 14/16 had an exposed case, while /16 had a referred referent, yielding an estimated odds ratio of 14/ = 7 in the absence of hidden bias. Do a sensitivity analysis for the significance level using an appropriate test. 3. In the study described previously as part of Scenario 3, Ahmed, Love et al. studied the impact of low vs. normal serum potassium on two additional outcomes, specifically cardiovascular mortality, and heart failure mortality. Of the pairs with clear winners in terms of cardiovascular mortality, 74 had normal potassium winners, while 77 had low potassium winners. In terms of heart failure mortality, the pairs with winners broke down as 135 for normal potassium and 7 for low potassium. What conclusions can be drawn? Source: Rosenbaum PR Observational Studies, nd Ed. (00), NY: Springer. Chapter 4. Page 10

### Chapter 7: Effect Modification

A short introduction to epidemiology Chapter 7: Effect Modification Neil Pearce Centre for Public Health Research Massey University Wellington, New Zealand Chapter 8 Effect modification Concepts of interaction

### Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD

Tips for surviving the analysis of survival data Philip Twumasi-Ankrah, PhD Big picture In medical research and many other areas of research, we often confront continuous, ordinal or dichotomous outcomes

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

### Canadian Individual Critical Illness Insurance Morbidity Experience

Morbidity Study Canadian Individual Critical Illness Insurance Morbidity Experience Between Policy Anniversaries in 2002 and 2007 Using Expected CIA Incidence Tables from July 2012 Individual Living Benefits

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

### A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

### Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

### STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4

STATISTICS 8, FINAL EXAM NAME: KEY Seat Number: Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 Make sure you have 8 pages. You will be provided with a table as well, as a separate

### Non-Inferiority Tests for One Mean

Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### Introduction to Statistics and Quantitative Research Methods

Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.

### An analysis method for a quantitative outcome and two categorical explanatory variables.

Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### Results from the 2014 AP Statistics Exam. Jessica Utts, University of California, Irvine Chief Reader, AP Statistics jutts@uci.edu

Results from the 2014 AP Statistics Exam Jessica Utts, University of California, Irvine Chief Reader, AP Statistics jutts@uci.edu The six free-response questions Question #1: Extracurricular activities

### Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

### Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

### Chi Square Distribution

17. Chi Square A. Chi Square Distribution B. One-Way Tables C. Contingency Tables D. Exercises Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes

### Confidence Intervals in Public Health

Confidence Intervals in Public Health When public health practitioners use health statistics, sometimes they are interested in the actual number of health events, but more often they use the statistics

### SECOND M.B. AND SECOND VETERINARY M.B. EXAMINATIONS INTRODUCTION TO THE SCIENTIFIC BASIS OF MEDICINE EXAMINATION. Friday 14 March 2008 9.00-9.

SECOND M.B. AND SECOND VETERINARY M.B. EXAMINATIONS INTRODUCTION TO THE SCIENTIFIC BASIS OF MEDICINE EXAMINATION Friday 14 March 2008 9.00-9.45 am Attempt all ten questions. For each question, choose the

### Statistical Functions in Excel

Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### Terminal illness pricing considerations. Steve Varney

pricing considerations Presenters Brad Louis Steve Varney AGENDA 1. Current Status 2. Estimating TI Cost 3. Data and Critical Assumptions 4. Estimated Costs 5. Conclusions 1: CURRENT STATUS Terminal Illness

### Basic Probability Theory II

RECAP Basic Probability heory II Dr. om Ilvento FREC 408 We said the approach to establishing probabilities for events is to Define the experiment List the sample points Assign probabilities to the sample

### The Normal Approximation to Probability Histograms. Dice: Throw a single die twice. The Probability Histogram: Area = Probability. Where are we going?

The Normal Approximation to Probability Histograms Where are we going? Probability histograms The normal approximation to binomial histograms The normal approximation to probability histograms of sums

### Binary Diagnostic Tests Two Independent Samples

Chapter 537 Binary Diagnostic Tests Two Independent Samples Introduction An important task in diagnostic medicine is to measure the accuracy of two diagnostic tests. This can be done by comparing summary

### The American Cancer Society Cancer Prevention Study I: 12-Year Followup

Chapter 3 The American Cancer Society Cancer Prevention Study I: 12-Year Followup of 1 Million Men and Women David M. Burns, Thomas G. Shanks, Won Choi, Michael J. Thun, Clark W. Heath, Jr., and Lawrence

### Simple Random Sampling

Source: Frerichs, R.R. Rapid Surveys (unpublished), 2008. NOT FOR COMMERCIAL DISTRIBUTION 3 Simple Random Sampling 3.1 INTRODUCTION Everyone mentions simple random sampling, but few use this method for

### If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question

### With Big Data Comes Big Responsibility

With Big Data Comes Big Responsibility Using health care data to emulate randomized trials when randomized trials are not available Miguel A. Hernán Departments of Epidemiology and Biostatistics Harvard

### Lecture 25. December 19, 2007. Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

### Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP. Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study.

Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study Prepared by: Centers for Disease Control and Prevention National

### Pennies and Blood. Mike Bomar

Pennies and Blood Mike Bomar In partial fulfillment of the requirements for the Master of Arts in Teaching with a Specialization in the Teaching of Middle Level Mathematics in the Department of Mathematics.

### Parametric and Nonparametric: Demystifying the Terms

Parametric and Nonparametric: Demystifying the Terms By Tanya Hoskin, a statistician in the Mayo Clinic Department of Health Sciences Research who provides consultations through the Mayo Clinic CTSA BERD

### One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

### Randomized trials versus observational studies

Randomized trials versus observational studies The case of postmenopausal hormone therapy and heart disease Miguel Hernán Harvard School of Public Health www.hsph.harvard.edu/causal Joint work with James

### Solution. Let us write s for the policy year. Then the mortality rate during year s is q 30+s 1. q 30+s 1

Solutions to the May 213 Course MLC Examination by Krzysztof Ostaszewski, http://wwwkrzysionet, krzysio@krzysionet Copyright 213 by Krzysztof Ostaszewski All rights reserved No reproduction in any form

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### Models Where the Fate of Every Individual is Known

Models Where the Fate of Every Individual is Known This class of models is important because they provide a theory for estimation of survival probability and other parameters from radio-tagged animals.

### Biostatistics and Epidemiology within the Paradigm of Public Health. Sukon Kanchanaraksa, PhD Marie Diener-West, PhD Johns Hopkins University

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

### The Variability of P-Values. Summary

The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report

### IP Subnetting: Practical Subnet Design and Address Determination Example

IP Subnetting: Practical Subnet Design and Address Determination Example When educators ask students what they consider to be the most confusing aspect in learning about networking, many say that it is

### Advanced Statistical Analysis of Mortality. Rhodes, Thomas E. and Freitas, Stephen A. MIB, Inc. 160 University Avenue. Westwood, MA 02090

Advanced Statistical Analysis of Mortality Rhodes, Thomas E. and Freitas, Stephen A. MIB, Inc 160 University Avenue Westwood, MA 02090 001-(781)-751-6356 fax 001-(781)-329-3379 trhodes@mib.com Abstract

### Chapter 2: Systems of Linear Equations and Matrices:

At the end of the lesson, you should be able to: Chapter 2: Systems of Linear Equations and Matrices: 2.1: Solutions of Linear Systems by the Echelon Method Define linear systems, unique solution, inconsistent,

### ANNEX 2: Assessment of the 7 points agreed by WATCH as meriting attention (cover paper, paragraph 9, bullet points) by Andy Darnton, HSE

ANNEX 2: Assessment of the 7 points agreed by WATCH as meriting attention (cover paper, paragraph 9, bullet points) by Andy Darnton, HSE The 7 issues to be addressed outlined in paragraph 9 of the cover

### Risk of alcohol. Peter Anderson MD, MPH, PhD, FRCP Professor, Alcohol and Health, Maastricht University Netherlands. Zurich, 4 May 2011

Risk of alcohol Peter Anderson MD, MPH, PhD, FRCP Professor, Alcohol and Health, Maastricht University Netherlands Zurich, 4 May 2011 Lifetime risk of an alcoholrelated death (1/100) 16.0 14.0 12.0 10.0

### Session 8 Probability

Key Terms for This Session Session 8 Probability Previously Introduced frequency New in This Session binomial experiment binomial probability model experimental probability mathematical probability outcome

### Analyzing Research Data Using Excel

Analyzing Research Data Using Excel Fraser Health Authority, 2012 The Fraser Health Authority ( FH ) authorizes the use, reproduction and/or modification of this publication for purposes other than commercial

### People like to clump things into categories. Virtually every research

05-Elliott-4987.qxd 7/18/2006 5:26 PM Page 113 5 Analysis of Categorical Data People like to clump things into categories. Virtually every research project categorizes some of its observations into neat,

### Likelihood of Cancer

Suggested Grade Levels: 9 and up Likelihood of Cancer Possible Subject Area(s): Social Studies, Health, and Science Math Skills: reading and interpreting pie charts; calculating and understanding percentages

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Basic Concepts in Research and Data Analysis

Basic Concepts in Research and Data Analysis Introduction: A Common Language for Researchers...2 Steps to Follow When Conducting Research...3 The Research Question... 3 The Hypothesis... 4 Defining the

### www.rmsolutions.net R&M Solutons

Ahmed Hassouna, MD Professor of cardiovascular surgery, Ain-Shams University, EGYPT. Diploma of medical statistics and clinical trial, Paris 6 university, Paris. 1A- Choose the best answer The duration

### The Effects of Moderate Aerobic Exercise on Memory Retention and Recall

The Effects of Moderate Aerobic Exercise on Memory Retention and Recall Lab 603 Group 1 Kailey Fritz, Emily Drakas, Naureen Rashid, Terry Schmitt, Graham King Medical Sciences Center University of Wisconsin-Madison

### An Update on Lung Cancer Diagnosis

An Update on Lung Cancer Diagnosis Dr Michael Fanning MBBS FRACGP FRACP RESPIRATORY AND SLEEP PHYSICIAN Mater Medical Centre Outline Risk factors for lung cancer Screening for lung cancer Radiologic follow-up

### WISE Power Tutorial All Exercises

ame Date Class WISE Power Tutorial All Exercises Power: The B.E.A.. Mnemonic Four interrelated features of power can be summarized using BEA B Beta Error (Power = 1 Beta Error): Beta error (or Type II

### Introduction to StatsDirect, 11/05/2012 1

INTRODUCTION TO STATSDIRECT PART 1... 2 INTRODUCTION... 2 Why Use StatsDirect... 2 ACCESSING STATSDIRECT FOR WINDOWS XP... 4 DATA ENTRY... 5 Missing Data... 6 Opening an Excel Workbook... 6 Moving around

### Violent crime total. Problem Set 1

Problem Set 1 Note: this problem set is primarily intended to get you used to manipulating and presenting data using a spreadsheet program. While subsequent problem sets will be useful indicators of the

### Guide to Biostatistics

MedPage Tools Guide to Biostatistics Study Designs Here is a compilation of important epidemiologic and common biostatistical terms used in medical research. You can use it as a reference guide when reading

### restricted to certain centers and certain patients, preferably in some sort of experimental trial format.

Managing Pancreatic Cancer, Part 4: Pancreatic Cancer Surgery, Complications, & the Importance of Surgical Volume Dr. Matthew Katz, Surgeon, MD Anderson Cancer Center, Houston, TX I m going to talk a little

### Butler Memorial Hospital Community Health Needs Assessment 2013

Butler Memorial Hospital Community Health Needs Assessment 2013 Butler County best represents the community that Butler Memorial Hospital serves. Butler Memorial Hospital (BMH) has conducted community

### Descriptive and Inferential Statistics

General Sir John Kotelawala Defence University Workshop on Descriptive and Inferential Statistics Faculty of Research and Development 14 th May 2013 1. Introduction to Statistics 1.1 What is Statistics?

### Algebra 1 Course Information

Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through

### SPSS/Excel Workshop 2 Semester One, 2010

SPSS/Excel Workshop 2 Semester One, 2010 In Assignment 2 of STATS 10x you may want to use Excel or SPSS to perform some calculations, that is, finding Normal probabilities and Inverse Normal values in

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### ECE302 Spring 2006 HW1 Solutions January 16, 2006 1

ECE302 Spring 2006 HW1 Solutions January 16, 2006 1 Solutions to HW1 Note: These solutions were generated by R. D. Yates and D. J. Goodman, the authors of our textbook. I have added comments in italics

### Excel Modeling Practice. The Svelte Glove Problem Step-by-Step With Instructions

Excel Modeling Practice The Svelte Glove Problem Step-by-Step With Instructions EXCEL REVIEW 2001-2002 Contents Page Number Overview...1 Features...1 The Svelte Glove Problem...1 Outputs...2 Approaching

### Automobiles - Tips & Traps By Nelson F. Deslippe

Newsletter Title Page 1 Office Address: Morgan Creek Corporate Centre Suite 306-15252 32 nd Avenue Check our website: www.integral-financial.com August/Summer 2011 I N T H I S I S S U E Automobiles Tips

### Big data size isn t enough! Irene Petersen, PhD Primary Care & Population Health

Big data size isn t enough! Irene Petersen, PhD Primary Care & Population Health Introduction Reader (Statistics and Epidemiology) Research team epidemiologists/statisticians/phd students Primary care

### TABLE OF CONTENTS. About Chi Squares... 1. What is a CHI SQUARE?... 1. Chi Squares... 1. Hypothesis Testing with Chi Squares... 2

About Chi Squares TABLE OF CONTENTS About Chi Squares... 1 What is a CHI SQUARE?... 1 Chi Squares... 1 Goodness of fit test (One-way χ 2 )... 1 Test of Independence (Two-way χ 2 )... 2 Hypothesis Testing

### In the past, the increase in the price of gasoline could be attributed to major national or global

Chapter 7 Testing Hypotheses Chapter Learning Objectives Understanding the assumptions of statistical hypothesis testing Defining and applying the components in hypothesis testing: the research and null

### An Application of the G-formula to Asbestos and Lung Cancer. Stephen R. Cole. Epidemiology, UNC Chapel Hill. Slides: www.unc.

An Application of the G-formula to Asbestos and Lung Cancer Stephen R. Cole Epidemiology, UNC Chapel Hill Slides: www.unc.edu/~colesr/ 1 Acknowledgements Collaboration with David B. Richardson, Haitao

### P(every one of the seven intervals covers the true mean yield at its location) = 3.

1 Let = number of locations at which the computed confidence interval for that location hits the true value of the mean yield at its location has a binomial(7,095) (a) P(every one of the seven intervals

### Income Distribution Database (http://oe.cd/idd)

Income Distribution Database (http://oe.cd/idd) TERMS OF REFERENCE OECD PROJECT ON THE DISTRIBUTION OF HOUSEHOLD INCOMES 2014/15 COLLECTION October 2014 The OECD income distribution questionnaire aims

### Math 370, Actuarial Problemsolving Spring 2008 A.J. Hildebrand. Problem Set 1 (with solutions)

Math 370, Actuarial Problemsolving Spring 2008 A.J. Hildebrand Problem Set 1 (with solutions) About this problem set: These are problems from Course 1/P actuarial exams that I have collected over the years,

### Cell Phone Impairment?

Cell Phone Impairment? Overview of Lesson This lesson is based upon data collected by researchers at the University of Utah (Strayer and Johnston, 2001). The researchers asked student volunteers (subjects)

### ( ) is proportional to ( 10 + x)!2. Calculate the

PRACTICE EXAMINATION NUMBER 6. An insurance company eamines its pool of auto insurance customers and gathers the following information: i) All customers insure at least one car. ii) 64 of the customers

### If several different trials are mentioned in one publication, the data of each should be extracted in a separate data extraction form.

General Remarks This template of a data extraction form is intended to help you to start developing your own data extraction form, it certainly has to be adapted to your specific question. Delete unnecessary

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Systematic Reviews and Meta-analyses

Systematic Reviews and Meta-analyses Introduction A systematic review (also called an overview) attempts to summarize the scientific evidence related to treatment, causation, diagnosis, or prognosis of

### Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

### Understanding and Quantifying EFFECT SIZES

Understanding and Quantifying EFFECT SIZES Karabi Nandy, Ph.d. Assistant Adjunct Professor Translational Sciences Section, School of Nursing Department of Biostatistics, School of Public Health, University

### Examining a Fitted Logistic Model

STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic

### Smoking Cessation Program

Smoking Cessation Program UHN Information for people who are ready to quit smoking Read this information to learn: why you should quit smoking how the Smoking Cessation Program works treatments to help

### An Innocent Investigation

An Innocent Investigation D. Joyce, Clark University January 2006 The beginning. Have you ever wondered why every number is either even or odd? I don t mean to ask if you ever wondered whether every number

### STAT 35A HW2 Solutions

STAT 35A HW2 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/09/spring/stat35.dir 1. A computer consulting firm presently has bids out on three projects. Let A i = { awarded project i },

### Chapter 5. Discrete Probability Distributions

Chapter 5. Discrete Probability Distributions Chapter Problem: Did Mendel s result from plant hybridization experiments contradicts his theory? 1. Mendel s theory says that when there are two inheritable

### AP Statistics 7!3! 6!

Lesson 6-4 Introduction to Binomial Distributions Factorials 3!= Definition: n! = n( n 1)( n 2)...(3)(2)(1), n 0 Note: 0! = 1 (by definition) Ex. #1 Evaluate: a) 5! b) 3!(4!) c) 7!3! 6! d) 22! 21! 20!

### ACMS 10140 Section 02 Elements of Statistics October 28, 2010 Midterm Examination II Answers

ACMS 10140 Section 02 Elements of Statistics October 28, 2010 Midterm Examination II Answers Name DO NOT remove this answer page. DO turn in the entire exam. Make sure that you have all ten (10) pages

### The F distribution and the basic principle behind ANOVAs. Situating ANOVAs in the world of statistical tests

Tutorial The F distribution and the basic principle behind ANOVAs Bodo Winter 1 Updates: September 21, 2011; January 23, 2014; April 24, 2014; March 2, 2015 This tutorial focuses on understanding rather

### Math 370, Spring 2008 Prof. A.J. Hildebrand. Practice Test 2 Solutions

Math 370, Spring 008 Prof. A.J. Hildebrand Practice Test Solutions About this test. This is a practice test made up of a random collection of 5 problems from past Course /P actuarial exams. Most of the

### Terminating Sequential Delphi Survey Data Collection

A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### EDUCATION AND EXAMINATION COMMITTEE SOCIETY OF ACTUARIES RISK AND INSURANCE. Copyright 2005 by the Society of Actuaries

EDUCATION AND EXAMINATION COMMITTEE OF THE SOCIET OF ACTUARIES RISK AND INSURANCE by Judy Feldman Anderson, FSA and Robert L. Brown, FSA Copyright 25 by the Society of Actuaries The Education and Examination

### Analyzing and interpreting data Evaluation resources from Wilder Research

Wilder Research Analyzing and interpreting data Evaluation resources from Wilder Research Once data are collected, the next step is to analyze the data. A plan for analyzing your data should be developed

### ( ) = 1 x. ! 2x = 2. The region where that joint density is positive is indicated with dotted lines in the graph below. y = x

Errata for the ASM Study Manual for Exam P, Eleventh Edition By Dr. Krzysztof M. Ostaszewski, FSA, CERA, FSAS, CFA, MAAA Web site: http://www.krzysio.net E-mail: krzysio@krzysio.net Posted September 21,