Effect Size Calculation for experimental & quasiexperimental



Similar documents
Calculating Effect-Sizes

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

Study Guide for the Final Exam

11. Analysis of Case-control Studies Logistic Regression

CLUSTER SAMPLE SIZE CALCULATOR USER MANUAL

Descriptive Statistics

Data Analysis, Research Study Design and the IRB

A Basic Introduction to Missing Data

IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD

Simple Linear Regression Inference

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Simple Linear Regression

Ordinal Regression. Chapter

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Missing data in randomized controlled trials (RCTs) can

Additional sources Compilation of sources:

Guided Reading 9 th Edition. informed consent, protection from harm, deception, confidentiality, and anonymity.

Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén Table Of Contents

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

Fixed-Effect Versus Random-Effects Models

Understanding and Quantifying EFFECT SIZES

Statistical Rules of Thumb

individualdifferences

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Simple linear regression

Illustration (and the use of HLM)

Chapter Eight: Quantitative Methods

Specifications for this HLM2 run

Overview Classes Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Lesson Outline Outline

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

Multinomial and Ordinal Logistic Regression

Introduction to Statistics and Quantitative Research Methods

When to Use a Particular Statistical Test

II. DISTRIBUTIONS distribution normal distribution. standard scores

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

Two Correlated Proportions (McNemar Test)

Trial Sequential Analysis (TSA)

WHAT IS A JOURNAL CLUB?

Tests for Two Proportions

Multiple Choice: 2 points each

Multivariate Logistic Regression

IPDET Module 6: Descriptive, Normative, and Impact Evaluation Designs

Probability Calculator

Simple Regression Theory II 2010 Samuel L. Baker

DATA COLLECTION AND ANALYSIS

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

SUMAN DUVVURU STAT 567 PROJECT REPORT

Introduction to mixed model and missing data issues in longitudinal studies

Non-Inferiority Tests for Two Proportions

LOGIT AND PROBIT ANALYSIS

2. Simple Linear Regression

Chapter 13 Introduction to Linear Regression and Correlation Analysis

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Multiple Regression: What Is It?

Module 3: Correlation and Covariance

Calculating the Probability of Returning a Loan with Binary Probability Models

Regression with a Binary Dependent Variable

CALCULATIONS & STATISTICS

Univariate Regression

Longitudinal Meta-analysis

Week 4: Standard Error and Confidence Intervals

Crash Course on Basic Statistics

SAMPLE SIZE CONSIDERATIONS

Approaches for Analyzing Survey Data: a Discussion

Module 5: Multiple Regression Analysis

10. Analysis of Longitudinal Studies Repeat-measures analysis

Interpretation of Somers D under four simple models

The Impact of Retail Payment Innovations on Cash Usage

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there

Supplementary PROCESS Documentation

Chapter 7: Simple linear regression Learning Objectives

Principles of Hypothesis Testing for Public Health

Planning sample size for randomized evaluations

Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses

Standard errors of marginal effects in the heteroskedastic probit model

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Multiple Imputation for Missing Data: A Cautionary Tale

Introduction to Fixed Effects Methods

Methods for Meta-analysis in Medical Research

Statistical Functions in Excel

Basic research methods. Basic research methods. Question: BRM.2. Question: BRM.1

Introduction to Longitudinal Data Analysis

Introduction to Regression and Data Analysis

Logit and Probit. Brad Jones 1. April 21, University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science

Statistical modelling with missing data using multiple imputation. Session 4: Sensitivity Analysis after Multiple Imputation

Meta-Analytic Synthesis of Studies Conducted at Marzano Research Laboratory on Instructional Strategies

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

An Application of the G-formula to Asbestos and Lung Cancer. Stephen R. Cole. Epidemiology, UNC Chapel Hill. Slides:

Analysing Questionnaires using Minitab (for SPSS queries contact -)

Binary Diagnostic Tests Two Independent Samples

Structural Equation Modelling (SEM)

Transcription:

Effect Size Calculation for experimental & quasiexperimental methods Jorge Garcia Hombrados & Hugh Waddington

Our goal is to get from here... Source: Baird et al. 2011 Worms at work

... to here Our goal is to get from here Source: Petrosino et al. (2012) Interventions in Developing Nations for Improving Primary and Secondary School Enrollment of Children: A Systematic Review Campbell Systematic Reviews

Working in small groups, you will learn: To calculate effect sizes: odds ratios, risk ratios, response ratios, standardized mean differences (SMDs) Effect sizes for different types of data based on data reported as group comparisons from raw and adjusted data based on data reported from statistical inference (eg regression analysis) For different types of rigorous evaluation design: randomized control trials (RCTs), cross-section design with Propensity Score Matching (PSM), longitudinal design with Difference-in-Differences (DID) analysis

Concepts we will use liberally Mean Standard deviation Standard error Confidence interval Average treatment effect on the treated (ATET)

What is an effect size? Impact evaluation tells us what difference an intervention makes to outcomes, measured by the treatment effect (eg regression coefficient on treatment variable) The effect size scales the treatment effect in units which tell us the magnitude of this difference and its statistical significance (as indicated by 95% confidence interval) There are different units and measures in which we can quantify this difference but it needs to be consistent and comparable across studies. Effect sizes are the unit of analysis in meta-analysis but they are also important to indicate policy relevance of findings

Example: Height-for-age Mean Z-score Percentage below z=-2 2004 Bangla-DHS -2.0 51% 2011 Bangla-DHS -1.7 41%

Examples of different effect sizes Standardized mean difference Group contrast Continuous outcome variable e.g. test scores Odds ratio/ risk ratio Group contrast Dichotomous outcome variable e.g. mortality Correlation coefficient (pearson s r) Association between 2 variables e.g. value of micro-credit loan and income Proportion Disease prevalence rates e.g. proportion suffering diarrhoea

Effect sizes measures For continuous outcomes: Standardised Mean Differences (SMD) Response Ratios (RR) d-based Regression coefficients, t-stat effect sizes R-based effect sizes (correlation coefficients) For dichotomous outcomes: Odds ratios Risk ratios

Standardized mean difference Uses the pooled standard deviation (some cases use control group standard deviation)

Exercise: calculate absolute and SMD Income per capita Treatment mean Comparis on mean Treatment SD Comparis on SD Sample size t Sample size c Bangladesh 770 500 300 200 50 50 Nepal 550 280 200 150 50 50

Odds ratio and risk ratio Frequencies Success Failure Treatment Group a b Control Group c d OR a / b ad Ratio of odds of success in the c / d bc of success in the comparison treatment group relative to odds RR a c /( a b) /( c d) Ratio of probability of success in the treatment group relative to probability of success in the comparison group

Exercise: calculate and interpret OR and RR Poor Non-Poor Treatment 315 685 Comparison 400 600

Selecting the effect size measure in practice We need to estimate the same effect size measure across all the included studies pooled in the metaanalysis However, very often not all the studies report enough information to compute all the effect size measures and contacting authors does not always work The selection of the effect size measure in practice takes into account: Nature of outcome (dichotomous vs continuous) Minimising the number of studies lost

Option 1: SMD Measures the impact of the programme in standard deviations of the outcome variable Can be computed consistently for both experimental and non-experimental studies Is the less problematic methodology However, its interpretation is not straightforward and the data required for its computation is not always available

Option 2: Response Ratio Measures the impact of the programme in percentage change Based on the Risk Ratio effect size used for dichotomous outcomes Data required for its computation is minimum Can be computed consistently for both experimental and non-experimental studies It is appropriate for both continuous and dichotomous outcomes (risk ratio) providing the outcome measure has a natural scale unit and natural zero points (but is not likely to equal zero) Synthesis uses logarithmic scales for both RR and SE(RR)

Other measures are problematic T-statistics are noisy and when applied to regression has some important shortcomings (Becker & Wu, 2007) Regression coefficients: data on covariance matrix (hardly reported) are required for appropriate synthesis R-based effect sizes do not seem to work properly for multivariate regression

Computation of effect sizes: SMD Standardised Mean Difference (SMD): How to estimate SMD from different study designs? An easy way of dealing with it is thinking separately in the numerator and denominator.

Computation of effect size: SMD The numerator Y t -Y c represents the causal raw impact of the programme in the outcome: In a regression analysis is the coefficient of interest (Beta). In a matching-based study this is the causal impact (ATT) or the difference in outcomes between groups after matching.

Computation of effect size: SMD The denominator S p is a measure of the standard deviation of the outcome. It makes the effect size comparable across studies In regression studies, we can use the standard deviation of the regression errors. Alternatively, we can use the sample standard deviation or the treatment and control standard deviation to calculate S p in matched-based studies or approximate it in regression based studies:

Computation of effect size: SMD Diff-in-Diff models In Diff-in-Diff model, the dependent variable is the change in the dependent variable of reference, or in semi log models, the growth rate of the variable of reference. In these cases, Sp should measure the pool standard deviation of the change (or the growth rate) in the dependent variable of reference. Unfortunately, this information is not reported very often and most of the times the calculation of SMDs requires assumptions.

Computation of effect size: SMD Tobit, Logit and Probit models Where for: Probit models Δ= Φ (x i β+γ) - Φ(x i β) Censored Tobit models Δ=((Φ ((x i β+γ)/σ))*(( x i β+γ)+ σ φ(x i β+γ))) - ((Φ ((x i β)/σ))*(( x i β)+ σ φ(x i β)) Logit models

SMD correction for small sample bias When the sample size is small, a correction in the effect size and its variance is needed. Although the correction is going to be almost imperceptible, we recommend to apply to all SMD calculations. For regression studies, there is a more efficient alternative (see Keef and Roberts, 2004)

Computation of effect size: Response Ratios For matched-based studies (e.g., PSM), Y c = Y t ATT where Y t is the outcome level in the treatment, ATT is the average treatment effect on the treated and Y c is the outcome level in the control group after matching. For regression-based studies, Y t = Y c + β where Y c is the outcome level in the total sample and Y t is the ceteris paribus average predicted outcome if the sample received the treatment.

Computation of Response Ratio Standard Error When the standard deviation of the dependent variable or the necessary information to calculate SD is not reported, we can approximate the SE for response ratios using the t statistics/p-value of the regression coefficient or of the results of the t test for equality of means between groups after matching:

Computation of effect size: Response Ratios Semi-log difference-in-differences (DID): RR= e β If Sp is computed to calculate the SE (RR), Sp should measure the pooled standard deviation of the change (or the growth rate) in the variable of reference.

Computation of effect size: Response Ratios Logit, Probit and Tobit model Where for: Probit models Δ= Φ (x i β+γ) - Φ(x i β) Censored Tobit models Δ=((Φ ((x i β+γ)/σ))*(( x i β+γ)+ σ φ(x i β+γ))) - ((Φ ((x i β)/σ))*(( x i β)+ σ φ(x i β)) Logit models

Computation of effect size: Response Ratios For logit model we can also estimate RR as:

Unit of analysis error correction Unit of analysis error (UoA) arises in impact evaluation studies in which programme placement and analysis are conducted at a different unit level and the researcher does not account for this within-cluster dependency. E.g. Programme placement at cluster level and outcomes analysed at household level. The consequences of UoA are false smaller variances and false narrower confidence intervals. If the study conduct analysis and programme placement at different levels and do not use cluster robust standard errors, we need to apply a correction to the standard errors to avoid potential Type II error: Where m is the cluster size and ICC is the intra-cluster correlation coefficient.

Formulae Table for ES computation Effect size measure Formulae for matched-based studies: Information needed to be reported in matched-based studies: Formulae for regression-based studies: Information needed to be reported in regression-based studies: Standardizes Mean Differences (SMD) SE SMD = S p = S p = SMD = Y t Y c S p n t + n c n t n c + OR SE SMD = SMD t SMD 2 2 (n t + n c ) n t 1 S t 2 + n c 1 S c 2 OR n t + n c 2 SD y 2 n t + n c 1 β2 n t n c n t + n c n t + n c SMD -Sample mean outcome for the treated and control group after matching OR -Sample mean outcome for the treatment group AND Average Treatment Effect on the Treated. -Sample standard deviation for treatment and control group AND sample size for treatment and control group OR -Sample standard deviation for the total sample AND sample mean outcome for treatment and control group OR sample mean outcome for treatment group and ATT AND sample size for treatment and control group. SE (SMD) -Sample size of the treated and control group OR t statistics of the treatment effect. S p = SMD = β S p SD y 2 (n t + n c 1) β2 (n t n c ) n t + n c n t + n c OR S p = SD of te regression residuals. SE SMD = SE SMD = SMD t OR SMD 2 v 2 v t 2 + v c(v) 2 v + 2 SMD -Regression coefficient -Sample standard deviation of the dependent variable AND sample size for treatment and control group OR -Standard deviation of the error term in the regression. SE(SMD) -t statistics for the regression coefficient OR t statistics for the regression coefficient AND Number of covariates AND sample size for the total sample.

Formulae Table for ES computation Effect size measure Formulae for matched-based studies: Information needed to be reported in matched-based studies: Formulae for regressionbased studies: Information needed to be reported in regressionbased studies: Response Ratio (RR) S p = RR = Y t Y c SE RR = Exp Ln(RR) t SE RR = S p 2 S p = OR 1 2 n t Y + 1 t n c Y 2 c n t 1 S t 2 + n c 1 S c 2 n t + n c 2 OR SD y2 n t + n c 1 n t + n c β 2 n t n c n t + n c RR -Sample mean outcome for the treated and control group after matching OR -Sample mean outcome for the treatment group AND Average Treatment Effect on the Treated. SE(RR) -t statistics of the treatment effec t. OR -Sample mean outcome for the treated and control group after matching OR Sample mean outcome for the treatment group AND Average Treatment Effect on the Treated AND Sample size for the treatment and control group AND sample standard deviation for the treatment and control group. OR sample standard deviation for all the sample. RR = Y s + β Y s SE RR = Exp Ln(RR ) t RR -Mean outcome for the total sample. -Beta of the regression coefficient. SE(RR) -t statistics of the regression coefficient.

Definitions S p is the pool standard deviation β is the coefficient or impact effect of interest. t is the t statistics of the regression coefficient or of the relevant treatment impact (t-test for equality of means). Exp is the exponential function (e f(x) ) Y t, Y c, Y s, n t, n c and n s are the mean outcome in the treatment group, control group and total sample and the sample size for the treatment group, control group and total sample. SD t, SD c and SD Y are the standard deviation for the treatment group, control group and total sample. v is the degrees of freedom of the regression equation. ATT Average treatment effect on the treated. m is the cluster size ICC is the intra-cluster correlation coefficient, an estimate of the relative variability within clusters.

Definitions In a logit, probit or tobit regression: Δ is the impact effect xiβ is the mean predicted outcomes for without participating in the programme. γ is the coefficient of interest. Ф is the cumulative distribution function. φ is the probability distribution function.

Exercises Compute effect sizes for the following studies: For Banerjee et al. 2009: estimate RR and SMD of the impact of microfinance access on per capita expenditure Estimate RR of the impact of microfinance access on non-food decision making by women For Chen et al. 2012, estimate RR and SMD of the impact of tuition relief on the change in test scores (use the Diff-in-Diff multivariate regression specification)

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 1419.229 SD t SD c 978.3 Beta= 37.375 SE(Beta)= 978.299 n s = 6821; we assumed n c = 3410 and n t = 3411 RCT with regression analysis.

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 1419.229 SD t SD c 978.3 Beta= 37.375 SE(Beta)= 46.221 n s = 6821; we assumed n c = 3410 and n t = 3411 RCT with regression analysis Estimate Standardised Mean Difference (SMD): 1. Compute t: 2. Compute SD y: : 3. Compute S p :

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 1419.229 SD t SD c 978.3 Beta= 37.375 SE(Beta)= 46.221 n s = 6821; we assumed n c = 3410 and n t = 3411 t=0.809 SD y =928.1 S p =927.8 RCT with regression analysis Estimate Standardised Mean Difference (SMD): 4. Compute SMD: 5. Compute SE(SMD): 6. Interpret the results 7. For yourself: correct for sample bias

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 1419.229 SD t SD c 978.3 Beta= 37.375 SE(Beta)= 46.221 n s = 6821; we assumed n c = 3410 and n t = 3411 t=0.809 SD y =928.1 S p =927.8 RCT with regression analysis Estimate Response Ratio (RR): 1. Compute RR: 2. Compute SE(RR): 3. Interpret the results

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 0.901 SD t SD c 0.299 Beta= -0.001 SE(Beta)= 0.014 n s = 6821; we assumed n c = 3410 and n t = 3411 t=-0.071 RCT with regression analysis

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 0.901 SD t SD c 0.299 Beta= -0.001 SE(Beta)= 0.014 n s = 6821; we assumed n c = 3410 and n t = 3411 t=-0.071 RCT with regression analysis Estimate Standardised Mean Difference (SMD): 1. Compute t statistics: 2. Compute SD y : 3. Compute S p :

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 0.901 SD t SD c 0.299 Beta= -0.001 SE(Beta)= 0.014 n s = 6821; we assumed n c = 3410 and n t = 3411 t=-0.071 SD y =0.284 S p =0.284 RCT with regression analysis Estimate Standardised Mean Difference (SMD): 4. Compute SMD: 5. Compute SE(SMD): 6. Interpret the results:

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 0.901 SD t SD c 0.299 Beta= -0.001 SE(Beta)= 0.014 n s = 6821; we assumed n c = 3410 and n t = 3411 t=-0.071 SMD=-0.004 SE(SMD)=0.207 RCT with regression analysis Estimate Odds ratio (OR): 1. Compute OR: 2. Compute SE(OR): 3. Interpret the results:

Banerjee et al. 2009 Information coded: In an RCT, at the baseline and assuming n t =n c : Y s Y t Y c 0.901 SD t SD c 0.299 Beta= -0.001 SE(Beta)= 0.014 n s = 6821; we assumed n c = 3410 and n t = 3411 t=-0.071 RCT with regression analysis Estimate Risk Ratio (RR): 1. Compute RR: 2. Compute SE(RR): 3. Interpret the results

Chen et al. 2012 Information coded: Y s = 16.02 Beta=2.85 t= 1.82 n t = 555 n c = 1709 SD y 2010 =16.61 SD y 2009 =16.85 Diff-in Diff regression study

Chen et al. 2012 Information coded: Y s = 16.02 Beta=2.85 t= 1.82 n t = 555 n c = 1709 SD y 2010 =16.61 SD y 2009 =16.85 Diff-in Diff regression study Estimate Standardised Mean Difference (SMD): 1. Compute SD y: on the change in the outcome: 3. Compute S p :

Chen et al. 2012 Information coded: Y s = 16.02 Beta=2.85 SE(Beta)= 1.82 n t = 555 n c = 1709 SD y 2010 =16.61 SD y 2009 =16.85 SD y =10.58 t=1.82 S p =10.51 Diff-in Diff regression study Estimate Standardised Mean Difference (SMD): 4. Compute SMD: 5. Compute SE(SMD): 6. Interpret the results 7. For yourself: correct for sample bias

Chen et al. 2012 Information coded: Y s = 16.02 Beta=2.85 SE(Beta)= 1.82 n t = 555 n c = 1709 SD y 2010 =16.61 SD y 2009 =16.85 SD y =10.58 t=1.82 S p =10.51 Diff-in Diff regression study Estimate Response Ratios (RR): 1. Compute RR: 2. Compute SE(RR): 3. Interpret the results

Practical tips David Wilson effect size calculator available here: http://gunston.gmu.edu/cebcp/effectsizecalculator /index.html but remember to apply corrections for sample bias! Code all the relevant information in a spreadsheet for all relevant studies before starting ES calculations. Decision on the selection of ES measure would depend on which ES measure would lead to the smaller study loss.

Main Bibliography Used Banerjee, A., Duflo, E., Glennerster, R., Kinnan, C. 2009, The Miracle of Microfinance? Evidence from a Randomized Evaluation, JPAL Becker, B. & Wu, M. 2007, The Synthesis of Regression Slopes in Meta-Analysis, Statistical Science, Vol. 22, No 3, 414-429. Borenstein, M., Hedges, L., Higgins, J., Rothstein,H. 2009, Introduction to Meta- Analysis. WILEY Chen, X., Shi, Y., Yi, H., Zhang, L., Mo, D., Chu, J., Rozelle, S. 2012, The Impact of a Senior High School Tuition Relief Program on Poor Junior High School Students in Rural China, REAP Working Paper 239, University of Stanford. Hansen, H., Klejntrup, N.R., Andersen, O.W. 2011, A comparison of Model-based and Design-based Impact Evaluations of Interventions in Developing Countries, FOI Working Paper n o.16, University of Copenhagen. Hasler, S. Forthcoming, The Effect of Multi-grade Teaching on Education Quality in Cameroon. A Propensity Score Matching Model, Journal of Development Studies. Hedges, L.V. 1981, Distribution Theory for Glass s Estimator of Effect Size and Related Estimators, Journal of Educational Statistics, vol. 6, pp107-128. Keef, S.P. & Roberts, L.A. 2004. The meta-analysis of partial effect sizes, British J. Math. Statist. Psych, Vol 57, pp 97-129. Lipsey, M.W. & Wilson, D.B. 2001, Practical Meta-Analysis, Applied Social Research Methods Series, Volume 49. SAGE Publications.

Thank you very much Visit: www.3ieimpact.org http://www.campbellcollaboration.org/interna tional_development/index.php

Annex: standard errors and 95% CI for OR / RR d c b a SE OR 1 1 1 1 95% confidence interval = exp[ln(or))±1.96*ln(se OR )] d c c b a a SE RR 1 1 1 1 95% confidence interval = exp[ln(rr))±1.96*ln(se RR )]