Introduction to Systematic Review and Meta-Analysis: A Health Care Perspective

Sally C. Morton, Department of Biostatistics, University of Pittsburgh
Methods for Research Synthesis: A Cross-Disciplinary Approach, October 2013

Cross-Disciplinary Communication From the Healthcare Systematic Review World
- Terminology
- Institute of Medicine (IOM) standards for systematic reviews
- Quality of individual studies
- Heterogeneity
- Strength of the body of evidence

How Did We Get Here? The Evidence Paradox (Sean Tunis):
- 18,000+ RCTs published each year
- Tens of thousands of other clinical studies
- Systematic reviews routinely conclude that the available evidence is of poor quality and therefore inadequate to inform decisions of the type we are interested in making

Terminology
Systematic Review (SR): A review of a clearly formulated question that uses systematic and explicit methods to identify, select, and critically appraise relevant research, and to collect and analyze data from the studies that are included in the review.
Meta-analysis (MA): The use of statistical techniques in an SR to integrate the results of the included studies and to conduct statistical inference.
(Adapted from the Cochrane Collaboration Glossary)

Key Points
1. MA should not be used as a synonym for SR.
2. An MA should be done in the context of an SR.
3. An MA should not be assumed to always be an appropriate step in an SR; the decision to conduct an MA is neither purely analytical nor statistical in nature.
(Slide figure: SRs and MAs)

Steps of a Systematic Review
1. Develop a focused research question
2. Define inclusion/exclusion criteria (see the sketch after this list)
3. Select the outcomes for your review
4. Find the studies
5. Abstract the data
6. Assess quality of the data
7. Explore the data (heterogeneity)
8. Synthesize the data descriptively, and inferentially via meta-analysis if appropriate
9. Summarize the findings
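Steps 1 and 2 are usually written down as a protocol before screening begins. Below is a minimal, illustrative sketch of how a focused (PICO-style) question and eligibility criteria might be represented and applied to one candidate study; the field names, thresholds, and the sample record are assumptions for illustration, not part of the slides.

```python
# Illustrative sketch only: a PICO-style question and a simple eligibility screen.
from dataclasses import dataclass, field

@dataclass
class ReviewProtocol:
    population: str
    intervention: str
    comparator: str
    outcome: str
    designs_included: set = field(default_factory=lambda: {"RCT"})
    min_followup_weeks: int = 0

    def is_eligible(self, study: dict) -> bool:
        """Apply the inclusion/exclusion criteria to one candidate study."""
        return (study["design"] in self.designs_included
                and study["followup_weeks"] >= self.min_followup_weeks)

protocol = ReviewProtocol(
    population="adults with rheumatoid arthritis",
    intervention="biologic DMARD",
    comparator="oral DMARD",
    outcome="clinical remission (DAS28 < 2.6)",
    designs_included={"RCT", "prospective cohort"},
    min_followup_weeks=24,
)

candidate = {"design": "RCT", "followup_weeks": 52}  # made-up record
print(protocol.is_eligible(candidate))  # True
```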

Systematic Review Standards
A standard is a process, action, or procedure for performing SRs that is deemed essential to producing scientifically valid, transparent, and reproducible results.

IOM Report on Standards for SRs (2011)
Committee charge:
- Recommend methodological standards for SRs of comparative effectiveness research (CER) on health and health care
- Assess potential methodological standards that would assure objective, transparent, and scientifically valid SRs of CER
- Recommend a set of methodological standards for developing and reporting such SRs

Committee Methodology
Inputs:
- Available research evidence
- Expert guidance from the Agency for Healthcare Research and Quality (AHRQ) Effective Health Care Program, the Centre for Reviews and Dissemination (CRD), the Cochrane Collaboration, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group, and the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA)
Committee's assessment criteria:
- Acceptability (credibility)
- Applicability (generalizability)
- Efficiency
- Patient-centeredness
- Scientific rigor
- Timeliness
- Transparency
Result: 21 standards and 82 elements recommended

IOM standards are categorized into four subgroups:
1. Initiating an SR
2. Finding and assessing individual studies
3. Synthesizing the body of evidence
4. Reporting SRs

Guidance on independent (dual) review:
- Buscemi et al., 2006: Single extraction was faster, but resulted in 21.7% more mistakes.
- AHRQ: Ensure a quality-control mechanism, usually through the use of independent researchers to assess studies for eligibility. Pilot testing is particularly important if there is no dual-review screening.
- CRD: It is good to have more than one researcher to help minimize bias and error at all stages of the review. Parallel independent assessments should be conducted to minimize the risk of errors.
- Cochrane: At least two people, working independently. The process must be transparent, and chosen to minimize biases and human error.

Collectively the standards and elements present a daunting task. Few, if any, members of the committee have participated in an SR that fully meets all of them. Yet the evidence and experience are strong enough that it is impossible to ignore these standards or hope that one can safely cut corners. The standards will be especially valuable for SRs of high-stakes clinical questions with broad population impact, where the use of public funds to get the right answer justifies careful attention to the rigor with which the SR is conducted. Individuals involved in SRs should be thoughtful about all of the standards and elements, using their best judgment if resources are inadequate to implement all of them, or if some seem inappropriate for the particular task or question at hand. Transparency in reporting the methods actually used and the reasoning behind the choices are among the most important of the standards recommended by the committee.

IOM Standards Regarding Study Quality and Heterogeneity
STANDARD 3.6: Critically appraise each study
  3.6.1 Systematically assess the risk of bias, using predefined criteria
  3.6.2 Assess the relevance of the study's populations, interventions, and outcome measures
  3.6.3 Assess the fidelity of the implementation of interventions
STANDARD 4.2: Conduct a qualitative synthesis
  4.2.4 Describe the relationships between the characteristics of the individual studies and their reported findings, and patterns across studies
STANDARD 4.4: If conducting a meta-analysis, then do the following:
  4.4.2 Address the heterogeneity among study effects

Why Assess the Quality of Individual Studies?
- Combining poor-quality studies may lead to biased, and therefore misleading, pooled estimates
- Assessment of quality can be controversial and lead to its own form of bias
- A variety of methods exist, including the Cochrane risk of bias tool
- Assessing the quality of observational studies is very difficult

Cochrane Risk of Bias Tool
For each domain, the review authors describe what was done and record a judgement.

1. Sequence generation
   Description: Describe the method used to generate the allocation sequence in sufficient detail to allow an assessment of whether it should produce comparable groups.
   Judgement: Was the allocation sequence adequately generated?

2. Allocation concealment
   Description: Describe the method used to conceal the allocation sequence in sufficient detail to determine whether intervention allocations could have been foreseen in advance of, or during, enrolment.
   Judgement: Was allocation adequately concealed?

3. Blinding of participants, personnel and outcome assessors
   Description: Describe all measures used, if any, to blind study participants and personnel from knowledge of which intervention a participant received. Provide any information relating to whether the intended blinding was effective.
   Judgement: Was knowledge of the allocated intervention adequately prevented during the study?

4. Incomplete outcome data
   Description: Describe the completeness of outcome data for each main outcome, including attrition and exclusions from the analysis. State whether attrition and exclusions were reported, the numbers in each intervention group (compared with total randomized participants), reasons for attrition/exclusions where reported, and any re-inclusions in analyses performed by the review authors.
   Judgement: Were incomplete outcome data adequately addressed?

5. Selective outcome reporting
   Description: State how the possibility of selective outcome reporting was examined by the review authors, and what was found.
   Judgement: Are reports of the study free of suggestion of selective outcome reporting?

6. Other sources of bias
   Description: State any important concerns about bias not addressed in the other domains in the tool. If particular questions/entries were pre-specified in the review's protocol, responses should be provided for each question/entry.
   Judgement: Was the study apparently free of other problems that could put it at a high risk of bias?
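In practice these judgements are recorded study by study and then summarized across the body of evidence. The sketch below shows one possible way to store and tally such judgements; the storage format and the two study records are illustrative assumptions, and only the domain names come from the tool.

```python
# Illustrative sketch: recording risk-of-bias judgements per study and tallying them.
from collections import Counter

DOMAINS = [
    "sequence generation",
    "allocation concealment",
    "blinding",
    "incomplete outcome data",
    "selective outcome reporting",
    "other sources of bias",
]

# judgements[study][domain] is "low", "high", or "unclear" risk of bias (made-up studies)
judgements = {
    "Study A (made up)": {d: "low" for d in DOMAINS},
    "Study B (made up)": {**{d: "low" for d in DOMAINS},
                          "allocation concealment": "unclear",
                          "blinding": "high"},
}

def tally(domain: str) -> Counter:
    """Count low/high/unclear judgements for one domain across all studies."""
    return Counter(study[domain] for study in judgements.values())

print(tally("blinding"))  # e.g. Counter({'low': 1, 'high': 1})
```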

Heterogeneity is Your Friend (J. Berlin)
- Fruit salad may, or may not, be tasty and interesting
- Which are the apples and which are the oranges, and how do they differ?

Definitions of Heterogeneity From a Health Care Perspective
Different types of heterogeneity:
- Clinical heterogeneity (diversity): variability in participants, interventions, and outcomes
- Methodological heterogeneity (diversity): variability in study design and risk of bias
- Statistical heterogeneity: variability in treatment effects, resulting from clinical and/or methodological diversity
Statistical heterogeneity is present if the observed treatment effects are more different from each other than would be expected due to chance alone.

Discuss Clinical/Methodological (Substantive) Heterogeneity Prior to Analysis
- Think first: Are the included studies similar with respect to the treatment effect? Study design, subjects, treatments, etc. may affect the results.
- Include in the protocol: the sources of heterogeneity on which you might stratify the analysis, or which you might include as independent variables in a meta-regression (see the sketch below).
- Do the statistics later: the Q statistic, to test the hypothesis that the true (population) treatment effect is equal in all studies, and/or the I-squared (I²) statistic.
- Remember: tests for heterogeneity have low statistical power.
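As a concrete illustration of the meta-regression option above, here is a minimal weighted least squares sketch in which a pre-specified study-level covariate (a made-up mean age per study) is regressed against the study effect estimates, weighting each study by its inverse variance. The data and the simplified fixed-effect style weighting are assumptions for illustration only, not the method of any particular review.

```python
# Illustrative meta-regression sketch: invented data, simplified inverse-variance weighting.
import numpy as np
import statsmodels.api as sm

effects = np.array([0.10, 0.25, 0.40, 0.55])     # study effect estimates T_i (made up)
variances = np.array([0.04, 0.05, 0.03, 0.06])   # estimated variances of T_i (made up)
mean_age = np.array([45.0, 52.0, 60.0, 67.0])    # pre-specified study-level covariate (made up)

X = sm.add_constant(mean_age)                      # intercept plus covariate
fit = sm.WLS(effects, X, weights=1.0 / variances).fit()
print(fit.params)  # the slope describes how the treatment effect varies with the covariate
```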

Evaluating The Strength of The Body of Evidence

IOM Standard Regarding Strength of Evidence

Use consistent language to summarize the conclusions of individual studies as well as the body of evidence:
- Presenting results alone is not sufficient
- Reviews are often very long
- Evidence on assessment methods is elusive
- The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach is becoming more popular
- There is anecdotal evidence that GRADE is difficult to apply
- GRADE is being modified for specific situations
- GRADE starts by downgrading observational studies

Reliability Testing of the AHRQ EPC Approach to Grading the Strength of Evidence in Comparative Effectiveness Reviews (Berkman et al., RTI/UNC EPC)
Berkman ND, Lohr KN, Morgan LC, Kuo T-M, Morton SC. Interrater reliability of grading strength of evidence varies with the complexity of the evidence in systematic reviews. Journal of Clinical Epidemiology 2013;66:1105-1117.

Study Design
- Inter-rater reliability was assessed for the 4 required domains (risk of bias, consistency, directness, and precision), for RCTs and observational studies separately
- 10 exercises were drawn from 2 published CER SRs, on depression and rheumatoid arthritis
- All exercises contained RCTs; 6 exercises included one or more observational studies
- Eleven pairs of reviewers (10 from 9 EPCs, 1 from AHRQ) participated in each exercise

Biologics vs. Oral DMARDs (columns: Study/Design; N; Comparison; Quality; Results)

TEMPO, 2005 (RCT); N = 451; ETN 25 mg twice/wk vs. MTX; Quality: Fair
  Remission at week 24: DAS < 1.6: 13.0% vs. 13.6% (P = NS); DAS28 < 2.6: 13.9% vs. 13.6% (P = NS)
  Remission at week 52: DAS < 1.6: 17.5% vs. 14% (P = NS); DAS28 < 2.6: 17.5% vs. 17.1% (P = NS)

PREMIER, 2006 (RCT); N = 531; ADA 40 mg biweekly vs. MTX 20 mg/wk; Quality: Fair
  Clinical remission (DAS28 < 2.6) at 1 year: 23% vs. 21% (P = 0.582)

Listing, 2006 (prospective cohort study); N = 1083; biologics vs. conventional DMARDs; Quality: Fair
  Odds of achieving remission (DAS28 < 2.6) at 12 months: adjusted* OR 1.95 (95% CI, 1.20-3.19); P = 0.006
  Matched-pairs analysis, DAS28 remission at 12 months: 24.9% vs. 12.4% (P = 0.004)
  *Adjusted for age, sex, number of previous DMARDs, DAS28, ESR, FFbH, osteoporosis, and previous treatment with cyclosporine A.

Study Results
Domain or Strength of Evidence (SOE): Independent Reviewer Agreement / Reconciled Reviewer Pair Agreement
- Bias, RCTs: 0.67 (0.61, 0.73) / 0.65 (0.56, 0.73)
- Bias, observational studies: 0.11 (0.05, 0.18) / 0.22 (0.13, 0.32)
- SOE, all studies: 0.20 (0.16, 0.25) / 0.24 (0.14, 0.34)
- SOE, RCTs only: 0.22 (0.17, 0.28) / 0.30 (0.17, 0.43)
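The agreement values above come from the Berkman et al. study, and the exact statistic used is described in the paper. Purely for context, the sketch below computes a standard two-rater Cohen's kappa on invented strength-of-evidence grades, to illustrate how agreement beyond chance is typically quantified.

```python
# Illustrative two-rater Cohen's kappa on made-up strength-of-evidence grades.
from collections import Counter

def cohen_kappa(rater1, rater2):
    """Chance-corrected agreement for two raters grading the same items."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    freq1, freq2 = Counter(rater1), Counter(rater2)
    expected = sum(freq1[c] * freq2[c] for c in set(rater1) | set(rater2)) / n ** 2
    return (observed - expected) / (1 - expected)

grades1 = ["high", "moderate", "low", "low", "insufficient", "moderate"]
grades2 = ["moderate", "moderate", "low", "insufficient", "insufficient", "moderate"]
print(round(cohen_kappa(grades1, grades2), 2))  # agreement beyond chance
```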

Study Conclusions
- Inter-rater reliability is low
- Complex evidence bases, particularly those with a mix of randomized and observational studies, can be extremely difficult to grade
- Dual review with adjudication of differences improves reliability
- Additional methodological guidance is needed for reviewers
- More research is needed on reliability, and on the advantages and disadvantages of concrete rules for determining strength of evidence

Final Words
"To do a meta-analysis is easy, to do one well is hard." - Ingram Olkin

Thank You
Sally C. Morton
Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh
scmorton@pitt.edu

Are RCTs Enough?
Austin Bradford Hill, Heberden Oration (1965): "...this leads directly to a related criticism of the present controlled trial: that it does not tell the doctor what he wants to know. It may be so constituted as to show without any doubt that treatment A is on the average better than treatment B. On the other hand, that result does not answer the practicing doctor's question: what is the most likely outcome when this drug is given to a particular patient?"

PRISMA Flowchart and Checklist
Endorsed (required) by many top clinical journals

The Q Statistic: The Math
Let $\theta_i$ be the true treatment effect for study $i$, $i = 1, \ldots, K$, and let $T_i$ be the estimate of $\theta_i$ from study $i$. The pooled treatment effect $\theta_P$ is estimated by
$$T_P = \frac{\sum_{i=1}^{K} w_i T_i}{\sum_{i=1}^{K} w_i},$$
where $w_i$ is the weight assigned to study $i$ (commonly the inverse of its estimated variance). The homogeneity hypothesis is
$$H_0: \theta_1 = \theta_2 = \cdots = \theta_K = \theta_P,$$
and the test statistic is
$$Q = \sum_{i=1}^{K} w_i (T_i - T_P)^2,$$
which under $H_0$ follows a $\chi^2$ distribution with $K - 1$ degrees of freedom.
I-squared ($I^2$) is a newer heterogeneity statistic that measures the percentage of variation across studies that cannot be explained by chance.
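The following sketch computes the pooled estimate, Q, its p-value, and I² for a handful of studies using inverse-variance weights; the effect estimates and variances are invented for illustration.

```python
# Minimal sketch: fixed-effect pooling with the Q and I^2 heterogeneity statistics.
import numpy as np
from scipy.stats import chi2

def pool_and_test(effects, variances):
    """Return the pooled estimate T_P, Q, its p-value, and I^2 (as a percentage)."""
    T = np.asarray(effects, dtype=float)            # study estimates T_i
    w = 1.0 / np.asarray(variances, dtype=float)    # inverse-variance weights w_i
    K = len(T)
    T_P = np.sum(w * T) / np.sum(w)                 # pooled (fixed-effect) estimate
    Q = np.sum(w * (T - T_P) ** 2)                  # Cochran's Q statistic
    p_value = chi2.sf(Q, df=K - 1)                  # Q ~ chi-square with K-1 df under H0
    I2 = max(0.0, (Q - (K - 1)) / Q) * 100 if Q > 0 else 0.0
    return T_P, Q, p_value, I2

# Invented log odds ratios and variances for four studies
print(pool_and_test([0.20, 0.35, -0.05, 0.60], [0.04, 0.06, 0.05, 0.03]))
```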

Assessment of Heterogeneity: Proceed with Meta-analysis, or Not?
First, check your data! Options:
- Do not do a meta-analysis
- Change the treatment effect measure
- Explore heterogeneity via stratification or meta-regression
- Account for heterogeneity via a random effects model (not advisable if heterogeneity is large)
- Exclude studies (a leave-one-out or jackknife analysis can be used to determine each individual study's effect on heterogeneity; see the sketch below)
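As a concrete illustration of the "exclude studies" option, the leave-one-out sketch below recomputes Q with each study removed in turn to see which study contributes most to the heterogeneity; the data are invented for illustration.

```python
# Illustrative leave-one-out (jackknife-style) check of each study's contribution to Q.
import numpy as np

def cochran_q(effects, variances):
    T = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    T_P = np.sum(w * T) / np.sum(w)                 # fixed-effect pooled estimate
    return np.sum(w * (T - T_P) ** 2)

def leave_one_out_q(effects, variances):
    """Q statistic after dropping each study in turn."""
    out = {}
    for i in range(len(effects)):
        eff = [e for j, e in enumerate(effects) if j != i]
        var = [v for j, v in enumerate(variances) if j != i]
        out[f"without study {i + 1}"] = round(cochran_q(eff, var), 2)
    return out

# Invented data: the fourth study is deliberately discrepant
print(leave_one_out_q([0.20, 0.25, 0.22, 0.90], [0.04, 0.05, 0.04, 0.03]))
```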

Example Strength of Evidence Table

Example SOE Table, continued