Publication bias for studies in international development
Jorge Hombrados, International Initiative for Impact Evaluation (3ie)
Acknowledgement: Emily Tanner-Smith, Vanderbilt University and Campbell Methods Group
What is publication bias?
Publication bias refers to bias that occurs when the research found in the published literature is systematically unrepresentative of the population of studies (Rothstein et al., 2005).
Also referred to as the file drawer problem: journals are filled with the 5% of studies that show Type I errors, while the file drawers back at the lab are filled with the 95% of studies that show non-significant (i.e., p > 0.05) results (Rosenthal, 1979).
On average, published studies have a larger mean effect size than unpublished studies, providing evidence of publication bias (Lipsey & Wilson, 1993).
Types of Reporting Biases
Publication bias: the publication or non-publication of research findings, depending on the nature and direction of results
Time lag bias: the rapid or delayed publication of research findings, depending on the nature and direction of results
Multiple publication bias: the multiple or singular publication of research findings, depending on the nature and direction of results
Location bias: the publication of research findings in journals with different ease of access or levels of indexing in standard databases, depending on the nature and direction of results
Citation bias: the citation or non-citation of research findings, depending on the nature and direction of results
Language bias: the publication of research findings in a particular language, depending on the nature and direction of results
Outcome reporting bias: the selective reporting of some outcomes but not others, depending on the nature and direction of results
Source: Sterne et al. (Eds.) (2008: 298)
How much of a problem is it in development research?
The exploratory research tradition in the social sciences suggests potentially severe file drawer effects.
This is partly mitigated by the tradition of publishing working papers in development economics and political science.
The problem is greater for observational studies (and small-sample intervention studies).
But we need more evidence, since very few systematic reviews have addressed this topic in international development.
Avoiding publication bias: grey literature searching
Sources of grey literature in development: (1) Multidisciplinary: Google, Google Scholar (2) International development specific: JOLIS, BLDS and ELDIS (Institute of Development Studies) (3) Good sources for impact evaluations: J-PAL/IPA databases, 3ie's database of impact evaluations, IDEAS/RePEc (4) Subject-specific, e.g. LILACS for Latin American health publications, ALNAP
An ounce of prevention is worth a pound of cure Conference proceedings Technical reports (research, governmental agencies) Organization websites Dissertations, theses Contact with primary researchers
Detecting publication bias
The only direct evidence of publication bias comes from comparing the results of published and unpublished studies.
But there are also graphical and statistical methods for assessing the likelihood of publication bias indirectly.
Figure: forest plot of farmer field school studies, grouped by publication status (weights from random-effects analysis; values above 1 favour the intervention).
Published in journal (17 estimates): subtotal ES 1.30 (95% CI 1.19, 1.43), I-squared = 95.4%, p = 0.000.
Not published in journal (14 estimates): subtotal ES 1.14 (95% CI 1.04, 1.24), I-squared = 84.2%, p = 0.000.
Assess file-drawer effects in each included study
Is there evidence that results have been reported selectively, for example outcomes for which data were collected (or which were indicated in the methods section, or listed in the study protocol if available) but not reported?
Have outcomes been constructed in an uncommon way that might suggest biased exploratory research?
File drawer effects in FFS studies
Detecting Publication Bias Methods for detecting publication bias assume: Large n studies are likely to get published regardless of results due to time and money investments Small n studies with the largest effects are most likely to be reported, many will never be published or will be difficult to locate Medium n studies will have some modest significant effects that are reported, others may never be published
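These assumptions can be illustrated with a small simulation (a sketch with hypothetical parameters, not data from any real review): estimates from a population of studies with a common true effect are "published" whenever they are statistically significant, and only occasionally otherwise.

```python
import numpy as np

rng = np.random.default_rng(42)
true_effect = 0.2
n_studies = 2000

# Hypothetical study sample sizes: a mix of small and large studies
n = rng.integers(20, 400, size=n_studies)
se = 1.0 / np.sqrt(n)                    # approximate standard error
est = rng.normal(true_effect, se)        # each study's effect estimate

# Selection rule: significant results (z > 1.96) are always published;
# non-significant results are published with probability 0.2
z = est / se
published = (z > 1.96) | (rng.random(n_studies) < 0.2)

mean_all = est.mean()                    # close to the true effect
mean_published = est[published].mean()   # inflated by selective publication
```

Because a small study needs a large estimate to reach significance, the published subset over-represents large effects from small studies, which is exactly the pattern that funnel-plot-based methods look for.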
Funnel Plots
Exploratory tool used to visually assess the possibility of publication bias in a meta-analysis
Scatter plot of effect size (x-axis) against some measure of study size (y-axis)
x-axis: use logged values of effect sizes for binary data, e.g., ln(OR), ln(RR)
y-axis: the standard error of the effect size is generally recommended (see Sterne et al., 2005 for a review of additional y-axis options)
Not meaningful in very small meta-analyses (e.g., fewer than 10 studies)
Funnel Plots Precision of estimates increases as the sample size of a study increases Estimates from small n studies (i.e., less precise, larger standard errors) will show more variability in the effect size estimates, thus a wider scatter on the plot Estimates from larger n studies will show less variability in effect size estimates, thus have a narrower scatter on the plot If publication bias is present, we would expect null or negative findings from small n studies to be suppressed (i.e., missing from the plot)
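The quantities plotted can be computed directly; a minimal sketch with hypothetical log risk ratios and standard errors (the plotting layer itself is omitted):

```python
import numpy as np

# Hypothetical study-level log risk ratios and standard errors
ln_rr = np.array([0.05, 0.10, 0.25, 0.30, -0.02, 0.40])
se = np.array([0.03, 0.05, 0.15, 0.20, 0.04, 0.25])

# Pooled fixed-effect estimate (inverse-variance weights)
w = 1.0 / se**2
pooled = np.sum(w * ln_rr) / np.sum(w)

# Pseudo 95% confidence limits: at each level of standard error, the
# region in which ~95% of studies should fall absent bias or heterogeneity
se_grid = np.linspace(se.min(), se.max(), 50)
lower = pooled - 1.96 * se_grid
upper = pooled + 1.96 * se_grid

# Scattering (ln_rr, se) with the y-axis inverted (precise studies at the
# top) and overlaying lower/upper against se_grid gives the funnel plot
```

With no bias, the points spread symmetrically inside the widening funnel as standard error grows; suppressed null or negative small studies leave the lower corner of the funnel empty.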
Figure: funnel plot with pseudo 95% confidence limits, farmer field school published studies only (x-axis: ln RR; y-axis: standard error; pooled estimate and confidence limits shown).
Figure: funnel plot with pseudo 95% confidence limits, all farmer field school studies, published and unpublished (x-axis: ln RR; y-axis: standard error; pooled estimate and confidence limits shown).
Tests for Funnel Plot Asymmetry
Several regression tests are available to test for funnel plot asymmetry
They attempt to overcome the subjectivity of visual funnel plot inspection
Framed as tests for small-study effects, i.e., the tendency for smaller-n studies to show greater effects than larger-n studies; such effects aren't necessarily the result of bias
Egger Test
Recommended test for mean difference effect sizes (d, g)
Weighted regression of the effect size on its standard error:
ES_i = β1 + β0 · se_i + ε_i, weighted by 1/se_i²; H0: β0 = 0
β0 = 0 indicates a symmetric funnel plot
β0 > 0 shows that less precise (i.e., smaller-n) studies yield bigger effects
Can be extended to include p predictors hypothesized to potentially explain funnel plot asymmetry (see Sterne et al., 2001)
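A minimal implementation of the test, using the algebraically equivalent unweighted regression of the standardized effect on precision (the intercept is the bias term; the data in the example are hypothetical):

```python
import numpy as np
from scipy import stats

def egger_test(es, se):
    """Egger regression: standardized effect on precision.

    The intercept estimates the bias (funnel plot asymmetry).
    Returns (intercept, t statistic, two-sided p-value).
    """
    es, se = np.asarray(es, float), np.asarray(se, float)
    z = es / se                     # standardized effects
    prec = 1.0 / se                 # precision
    X = np.column_stack([np.ones_like(prec), prec])
    beta, _, _, _ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    dof = len(z) - 2
    s2 = resid @ resid / dof        # residual variance
    se_b0 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])
    t = beta[0] / se_b0
    return beta[0], t, 2 * stats.t.sf(abs(t), df=dof)
```

When imprecise studies report systematically larger effects, as under publication bias, the estimated intercept comes out positive.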
Figure: Egger's publication bias plot for farmer field school studies (standardized effect vs. precision). Slope = -0.047 (t = -1.70, p = 0.100); bias = 3.085 (t = 4.14, p = 0.000).
Egger Test limitations Low power unless there is severe bias and large n Inflated Type I error with large treatment effects, rare event data, or equal sample sizes across studies Inflated Type I error with log odds ratio effect sizes
Other statistical tests
Peters test: a modified Egger test for use with log odds ratio effect sizes
Begg's test
Selection modeling (Hedges & Vevea, 2005)
Rosenthal's fail-safe N (not recommended; Becker, 2005)
Trim and fill analysis (Duval & Tweedie, 2000)
Iteratively trims (removes) the smaller studies causing asymmetry
Uses the trimmed plot to re-estimate the mean effect size
Fills (replaces) the omitted studies with their mirror images around the re-estimated mean
Provides an estimate of the number of missing (filled) studies and a new estimate of the mean effect size
Major limitations include: misinterpretation of results, the assumption of a symmetric funnel plot, and poor performance in the presence of heterogeneity
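The iteration can be sketched as follows: a simplified right-side-only implementation using the L0 estimator of the number of missing studies, under a fixed-effect model (an illustration with hypothetical data, not a replacement for metatrim or similar packages):

```python
import numpy as np
from scipy.stats import rankdata

def fe_mean(es, se):
    """Fixed-effect (inverse-variance weighted) mean."""
    w = 1.0 / np.asarray(se, float) ** 2
    return np.sum(w * es) / np.sum(w)

def trim_and_fill(es, se, max_iter=50):
    """Right-side trim and fill with the L0 estimator (Duval & Tweedie)."""
    es, se = np.asarray(es, float), np.asarray(se, float)
    order = np.argsort(es)                      # sort studies by effect size
    es, se = es[order], se[order]
    n, k0 = len(es), 0
    for _ in range(max_iter):
        mu = fe_mean(es[: n - k0], se[: n - k0])  # trim k0 largest effects
        d = es - mu
        ranks = rankdata(np.abs(d))             # average ranks for ties
        # Rank sum of positive deviations (near-zero treated as non-positive
        # to damp floating-point noise)
        t_n = ranks[d > 1e-12].sum()
        l0 = (4 * t_n - n * (n + 1)) / (2 * n - 1)
        k0_new = max(0, int(np.floor(l0 + 0.5)))
        if k0_new == k0:
            break
        k0 = k0_new
    # Fill: mirror the k0 trimmed studies around the trimmed mean
    filled_es = np.concatenate([es, 2 * mu - es[n - k0:]])
    filled_se = np.concatenate([se, se[n - k0:]])
    return k0, fe_mean(filled_es, filled_se)
```

For example, given a few precise studies near zero and a handful of imprecise studies with large positive effects, the routine trims the imprecise studies, mirrors them below the trimmed mean, and pulls the adjusted pooled estimate down; a symmetric funnel yields zero filled studies.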
Trim & fill from farmer field school studies
FFS: results of meta-trim and Egger test

                       Effect size   95% lower   95% upper   Num. studies
Meta-analysis             1.23         1.16        1.32          31
Filled meta-analysis      1.10         1.03        1.17          40

Egger's test: p = 0.000
Examples of other methods
Cumulative meta-analysis
Typically used to update the pooled effect size estimate with each new study cumulatively over time
Can alternatively be used to update the pooled effect size estimate with each study in order of largest to smallest sample size
If the pooled effect size does not shift with the addition of small-n studies, this provides some evidence against publication bias
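The large-to-small ordering can be sketched with fixed-effect pooling (hypothetical data; a full analysis would also report a confidence interval at each step, as metacum does):

```python
import numpy as np

def cumulative_meta(es, se, n):
    """Pooled fixed-effect estimates, adding studies largest sample first."""
    es, se, n = (np.asarray(x, float) for x in (es, se, n))
    order = np.argsort(-n)                 # largest sample size first
    es, se = es[order], se[order]
    w = 1.0 / se**2                        # inverse-variance weights
    return np.cumsum(w * es) / np.cumsum(w)
```

If the trajectory stays flat as the small studies enter, that is some evidence against publication bias; a drift in the pooled estimate as small studies are added is the signature of small-study effects.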
Figure: cumulative meta-analysis of farmer field school studies (44 estimates, added in order of decreasing sample size). The pooled estimate starts at 1.09 (95% CI 1.03, 1.15) and settles at 1.03 (95% CI 1.02, 1.04) once all studies are included.
Detecting Publication Bias in Stata
Several user-written commands are available that automate the most commonly used methods to detect publication bias:
Funnel plots: metafunnel
Contour-enhanced funnel plots: confunnel
Egger, Peters, Harbord tests: metabias
Cumulative meta-analysis: metacum
Trim and fill analysis: metatrim
Recommended Reading Duval, S. J., & Tweedie, R. L. (2000). A non-parametric trim and fill method of accounting for publication bias in meta-analysis. Journal of the American Statistical Association, 95, 89-98. Egger, M., Davey Smith, G., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple, graphical test. British Medical Journal, 315, 629-634. Hammerstrøm, K., Wade, A., Jørgensen, A. K. (2010). Searching for studies: A guide to information retrieval for Campbell systematic reviews. Campbell Systematic Review, Supplement 1. Harbord, R. M., Egger, M., & Sterne, J. A. C. (2006). A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints. Statistics in Medicine, 25, 3443-3457. Peters, J. L., Sutton, A. J., Jones, D. R., Abrams, K. R., & Rushton, L. (2008). Contour-enhanced meta-analysis funnel plots help distinguish publication bias from other causes of asymmetry. Journal of Clinical Epidemiology, 61, 991-996.
Recommended Reading Rosenthal, R. (1979). The file-drawer problem and tolerance for null results. Psychological Bulletin, 86, 638-641. Rothstein, H. R., Sutton, A. J., & Borenstein, M. L. (Eds.). (2005). Publication bias in meta-analysis: Prevention, assessment and adjustments. Hoboken, NJ: Wiley. Rücker, G., Schwarzer, G., & Carpenter, J. (2008). Arcsine test for publication bias in meta-analyses with binary outcomes. Statistics in Medicine, 27, 746-763. Sterne, J. A., & Egger, M. (2001). Funnel plots for detecting bias in meta-analysis: Guidelines on choice of axis. Journal of Clinical Epidemiology, 54, 1046-1055. Sterne, J. A. C., Egger, M., & Moher, D. (Eds.) (2008). Chapter 10: Addressing reporting biases. In J. P. T. Higgins & S. Green (Eds.), Cochrane handbook for systematic reviews of interventions (pp. 297-333). Chichester, UK: Wiley. Sterne, J. A. C., et al. (2011). Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ, 343, d4002. Waddington, H., White, H., Snilstveit, B., Hombrados, J., & Vojtkova, M. (2012). How to do a good systematic review of effects in international development: A tool-kit. Journal of Development Effectiveness, 4(3).