Bayesian Adaptive Methods for Clinical Trials
Pfizer Research Technology Center, Cambridge, MA
March 2, 2012
presented by Bradley P. Carlin, University of Minnesota
brad@biostat.umn.edu
Bayesian Adaptive Methods for Clinical Trials p. 1/100
Textbooks for this course

Recommended (BCLM): Bayesian Adaptive Methods for Clinical Trials (ISBN 978-1-4398-2548-8), by S.M. Berry, B.P. Carlin, J.J. Lee, and P. Müller. Boca Raton, FL: Chapman and Hall/CRC Press, 2010.

Other books of interest:
- Your favorite math stat and linear models books
- Bayesian Methods for Data Analysis, 3rd ed., by B.P. Carlin and T.A. Louis. Boca Raton, FL: Chapman and Hall/CRC Press, 2009.
- Bayesian Approaches to Clinical Trials and Health-Care Evaluation, by D.J. Spiegelhalter, K.R. Abrams, and J.P. Myles. Chichester: Wiley, 2004.
Software for Bayesian CTs I

Expensive but incredibly cool commercial software: FACTS (Fixed and Adaptive Clinical Trials Simulator)
- software for many Phase I and II trial designs
- permits dose finding in the presence of both safety and efficacy endpoints; stopping for success or futility
- supports continuous, dichotomous, or time-to-event (TITE) endpoints; also longitudinal data (e.g., biomarkers)
- joint venture between Berry Consultants (statistics, algorithms) and Tessella Technology and Consulting (software interface)
- Website: http://www.smarterclinicaltrials.com/what-we-offer/facts/
- probably best for large companies with ongoing commitments to Bayesian CT development
Software for Bayesian CTs II

Noncommercial but still very professional software, freely available from the M.D. Anderson Cancer Center Department of Biostatistics Software page, https://biostatistics.mdanderson.org/softwaredownload/
- all are stand-alone packages, free for download and local install
- easy to learn: menu- and dialog box-driven, nice graphics where appropriate
- accompanied by extensive tutorials, typically with guidelines, exercises, and solutions
- remarkably broad coverage of areas from all phases of the regulatory process...
Software for Bayesian CTs II

Partial list of MD Anderson Phase I software packages:
- CRMSimulator: simulates power and Type I error of Continual Reassessment Method (CRM) dose-finding designs, offering improvements in power and/or sample size over "3+3" designs; see BCLM Section 3.2.
- bcrm: handles bivariate dose-finding with two competing outcomes (say, toxicity and progression) and a single agent; see BCLM Section 3.4.2.
- EffTox: finds a best dose when efficacy must be traded off against toxicity, both assumed increasing in dose; see BCLM Section 3.3.
- ToxFinder: for combination therapy, i.e., two drugs being administered in combination, with only one outcome (usually toxicity); see BCLM Section 3.4.4.
Software for Bayesian CTs II

Partial list of MD Anderson Phase II software packages:
- Phase II PP Design: computes stopping boundaries for a single-arm Phase II predictive probability design with a binary endpoint; see BCLM Section 4.2.
- MultcLean: for monitoring toxicity and efficacy in single-arm Phase II clinical trials with binary data; see BCLM Section 4.3.2.
- Adaptive Randomization (AR): for designing and simulating outcome-adaptive randomized trials with up to 10 arms, using binary or time-to-event (TITE) outcomes; more patients are treated with the better treatment while retaining the benefits of randomization; see BCLM Section 4.4.
Software for Bayesian CTs III

Noncommercial and not-all-that-professional software, but with R and BUGS source code freely available:
- from the BCLM book's data and software page, http://www.biostat.umn.edu/~brad/data3.html
- organized by chapter in the book
- similar range of models/problems as the MDACC software
- continuing to grow
- mostly written in R, but those that require MCMC typically call OpenBUGS using the BRugs package, whose installation and exemplification are given here: http://www.biostat.umn.edu/~brad/software/brugs/
Software for Bayesian CTs III

Partial list of BCLM Phase I software packages:
- betabinhm.r: elementary BRugs meta-analysis program for a single success proportion; see BCLM Section 2.4.
- Power_BRugs.txt: slightly more advanced BRugs power and Type I error-simulating program, Weibull survival model; see BCLM Section 2.5.4.
- titecrm.r: a basic R implementation of the TITE-CRM method; see BCLM Section 3.2.3.
- 324.R: an R program for dose-finding based on toxicity intervals (rather than fixed target levels); see BCLM Section 3.2.4.
- 354.R: a basic R implementation of the 2-agent combination therapy dose-escalation method (similar to ToxFinder); see BCLM Section 3.4.4.
Software for Bayesian CTs III

Partial list of BCLM Phase II-III software packages:
- 431.R: an R program for binary stopping for futility, efficacy, or toxicity (similar to MultcLean); see BCLM Section 4.3.2.
- 443.R: R code for outcome-adaptive randomization with delayed survival response; see BCLM Section 4.4.4.
- adapt.r: R code to compute the simulated Type I error and other operating characteristics of a basic one-arm binary response confirmatory trial; see BCLM Section 5.2.1.
- example5.4.r: an R program to simulate operating characteristics of the basic confirmatory trial with delayed outcomes; see BCLM Section 5.2.3.
MCMC-based Bayesian design

Simulating the power or other operating characteristics (say, Type I error) in this setting works as follows:
- Sample true β values from an assumed true prior (skeptical, enthusiastic, or in between)
- Given these, sample fake survival times t_i (say, N from each study group) from the Weibull
- We may also wish to sample fake censoring times c_i from a particular distribution (e.g., a normal truncated below 0); for all i such that t_i > c_i, replace t_i by NA
- Compute (β_L, β_U) by calling BUGS from R using BRugs
- Determine the simulated trial's outcome based on the location of (β_L, β_U) relative to the indifference zone
- Repeat this process N_rep times; report empirical frequencies of the six possible outcomes
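The simulation loop above is easy to prototype without BUGS. Below is a rough Python sketch (all names are mine), which swaps in a simplified exponential survival model and a normal approximation to the posterior of the treatment effect β in place of the Weibull/BRugs machinery; the six-outcome classification follows the indifference-zone logic just described.

```python
import numpy as np

def classify(L, U, dL, dU):
    """Locate the posterior interval (L, U) for the treatment effect
    relative to the indifference zone (dL, dU): six possible outcomes."""
    if L > dU:
        return "accept treatment"
    if U < dL:
        return "accept control"
    if L > dL and U < dU:
        return "equivalence"
    if L > dL:
        return "reject control"
    if U < dU:
        return "reject treatment"
    return "no decision"

def simulate_oc(nrep=200, N=50, beta_true=np.log(1.5), dL=-0.1, dU=0.1,
                ctrl_median=36.0, cens_mean=80.0, cens_sd=20.0, seed=0):
    """Empirical frequencies of the six outcomes; here beta is the log
    hazard ratio (control vs. treatment), so beta_true = log(1.5)
    corresponds to a 50% improvement in median survival."""
    rng = np.random.default_rng(seed)
    lam0 = np.log(2) / ctrl_median              # control hazard (exponential)
    lam1 = lam0 / np.exp(beta_true)             # treatment hazard (lower = better)
    counts = {}
    for _ in range(nrep):
        t0 = rng.exponential(1 / lam0, N)       # fake survival times
        t1 = rng.exponential(1 / lam1, N)
        c = np.abs(rng.normal(cens_mean, cens_sd, 2 * N))  # censoring times
        d0 = (t0 <= c[:N]).sum()                # observed events, control
        d1 = (t1 <= c[N:]).sum()                # observed events, treatment
        y0 = np.minimum(t0, c[:N]).sum()        # total follow-up, control
        y1 = np.minimum(t1, c[N:]).sum()        # total follow-up, treatment
        bhat = np.log((d0 / y0) / (d1 / y1))    # estimated log hazard ratio
        se = np.sqrt(1 / max(d0, 1) + 1 / max(d1, 1))
        out = classify(bhat - 1.96 * se, bhat + 1.96 * se, dL, dU)
        counts[out] = counts.get(out, 0) + 1
    return {k: v / nrep for k, v in counts.items()}
```

Replacing the normal-approximation step with an actual BRugs/OpenBUGS call recovers the full Weibull version sketched on this slide.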
Results from Power.BRugs

Assuming:
- Weibull shape r = 2, and N = 50 in each group
- median survival of 36 days, with 50% improvement in the treatment group
- a N(80, 20) censoring distribution
- the enthusiastic prior as the truth

We obtain the following output from Nrep = 100 reps:

  Here are simulated outcome frequencies for N = 50
  accept control:   0
  reject treatment: 0.07
  equivalence:      0
  reject control:   0.87
  accept treatment: 0.06
  no decision:      0
  End of BRugs power simulation
Bayesian power calculation

We will likely wish to repeat the entire process for several sample sizes N and several priors. A Bayesian power calculation here might arise from using the enthusiastic prior as the truth.

For Nrep = 1000 (and using 100 burn-in and 1000 production MCMC iterations in each WinBUGS call), we obtained the following probabilities of rejecting the control when the enthusiastic prior is true:

  N     Skeptical  Reference  Enthusiastic
  25    .014       .207       .475
  50    .087       .352       .615
  75    .191       .378       .652
  100   .288       .472       .682

Power increases with N and/or prior enthusiasm!
Type I error rate calculation
- A Bayesian version of this calculation arises similarly to the method of the previous slide, but now assuming the skeptical prior is true
- A true frequentist Type I error calculation is also possible: simply fix β_1 = 0, and generate only the t_i and c_i for each of the Nrep iterations.
- Note that while Bayesians are free to look at their data at any time without affecting the inference, multiple looks will alter the frequentist Type I error behavior of the procedure. If this is of interest, the algorithm must be modified to explicitly include these multiple looks, checking for early stopping after each look.
- Early stopping for futility based on predictive distributions ("Bayesian stochastic curtailment") may also be of interest; see Berry and Berry (2004)!
Ch 3: Phase I studies
- The first application of a new drug to humans
- Typically small (20-50 patients)
- Main goal: to establish the safety of a proposed drug, often through determining an appropriate dosing schedule (dose-finding)
- For cytotoxic agents (cancer), we assume the drug benefit (as well as the severity of its toxicity and other side effects) increases with dose, and thus seek the maximum tolerated dose (MTD)
- Key elements:
  - a starting dose (often 1/10 of the LD10 in mice)
  - a definition of dose-limiting toxicity (DLT)
  - a target toxicity level (TTL) (say, 20-30%)
  - a dose escalation scheme
Sec 3.1: Rule-based MTD designs

Alter the dose based on the toxicity observed in the previous cohort. Most common: the 3+3 design:
1. Enter 3 patients at the lowest dose level
2. Observe the toxicity outcome:
   - if 0/3 DLT: treat the next 3 patients at the next higher dose
   - if 1/3 DLT: treat the next 3 patients at the same dose; then
     - if 0/3 DLT in the new cohort: treat the next 3 at the next higher dose
     - if 1/3 DLT: define this dose as the MTD
     - if 2/3 or 3/3 DLT: the dose exceeds the MTD
   - if 2/3 or 3/3 DLT: the dose exceeds the MTD
3. Repeat Step 2. If the last dose exceeds the MTD, define the previous dose as the MTD, provided 6 or more patients have been treated at that dose.
4. The MTD is thus defined as a dose with ≤ 2/6 DLT.
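For concreteness, here is a minimal Python sketch of one common 3+3 variant following the rules above (function names are my own; the de-escalation step requiring 6 patients at the declared MTD is omitted for brevity):

```python
import random

def three_plus_three(p_tox, rng=random):
    """Run one simulated 3+3 trial.
    p_tox[j] = true P(DLT) at dose level j (0-based).
    Returns the selected MTD level, or None if even the lowest
    dose exceeds the MTD."""
    j = 0
    while True:
        dlt = sum(rng.random() < p_tox[j] for _ in range(3))
        if dlt == 0:                       # 0/3 DLT: escalate
            if j + 1 < len(p_tox):
                j += 1
                continue
            return j                       # highest level tolerated
        if dlt == 1:                       # 1/3 DLT: treat 3 more at same dose
            dlt2 = sum(rng.random() < p_tox[j] for _ in range(3))
            if dlt2 == 0:                  # 1/6 total: escalate
                if j + 1 < len(p_tox):
                    j += 1
                    continue
                return j
            if dlt2 == 1:                  # 2/6 total: declare this dose the MTD
                return j
        # >= 2/3 DLT in a cohort: current dose exceeds the MTD
        return j - 1 if j > 0 else None
```

Averaging the returned MTD over many replicates with, say, p_tox = (.05, .15, .30, .45, .60) reproduces the kind of operating characteristics tabulated later in this session.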
3+3 Design: Example In Panel 1, dose level 3 is chosen as the MTD with estimated DLT rate 16.7% In Panel 2, the same dose level is chosen, but with estimated DLT rate 0% (note 33.3% is also possible) MTD choice is ad hoc and fairly imprecise Bayesian Adaptive Methods for Clinical Trials p. 74/100
Sec 3.2: Model-based MTD designs Here we assume there is a monotonic relationship between dose and P(DLT), typically in the form of a simple one- or two-parameter model. In the figure, for a target toxicity level (TTL) of 33%, dose level 4 is the MTD Bayesian Adaptive Methods for Clinical Trials p. 75/100
Continual Reassessment Method (CRM)

The first Bayesian model-based design introduced in the literature (O'Quigley et al., 1990). Often characterizes the dose-toxicity relationship via simple one-parameter parametric models. That is, letting p(d) = P(DLT | dose = d), 3 possible models are:
- Hyperbolic tangent: p(d) = [exp(d) / (exp(d) + exp(−d))]^a
- Logistic: p(d) = exp(3 + a d) / (1 + exp(3 + a d))
- Power: p(d) = d^exp(a)

The Bayesian posterior distribution of a induces a posterior for p(d), and hence for the MTD at any given TTL!
CRM algorithm
1. Assume a vague or fully non-informative prior for a.
2. Treat 1 patient at the level closest to the current estimate of the MTD, and observe the toxicity outcome.
3. Update the posterior distribution of a, proportional to the prior for a times the likelihood,
     L(a; d, y) ∝ ∏_{i=1}^{n} p(d_i)^{y_i} [1 − p(d_i)]^{1−y_i},
   where d_i is the dose level for patient i, and where y_i = 1 if a DLT is observed for i and y_i = 0 if not.
4. Treat the next patient at the level closest to the updated estimate of the MTD based on the posterior distribution of a.
5. Repeat these steps until a sufficiently precise estimate of a is achieved or the maximum sample size is reached.
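Steps 3-4 can be sketched numerically: under the power model p(d) = d^exp(a), a one-dimensional grid over a gives the posterior and the posterior mean toxicity at each dose. This is only an illustrative sketch; the normal prior on a, its SD, and the grid bounds are my own choices, not from the slides.

```python
import numpy as np

def crm_update(skeleton, y, d_idx, ttl, prior_sd=1.34):
    """One CRM update under the power model p(d) = d**exp(a),
    with prior a ~ N(0, prior_sd^2) evaluated on a grid.
    skeleton : prior guesses of P(DLT) at each dose level, in (0,1)
    y, d_idx : DLT indicators and (0-based) dose indices so far
    Returns (recommended dose index, posterior mean P(DLT) by dose)."""
    a = np.linspace(-4, 4, 801)
    prior = np.exp(-0.5 * (a / prior_sd) ** 2)
    p = np.asarray(skeleton)[None, :] ** np.exp(a)[:, None]   # grid x doses
    lik = np.ones_like(a)
    for yi, di in zip(y, d_idx):
        lik *= p[:, di] ** yi * (1.0 - p[:, di]) ** (1 - yi)  # Step 3
    post = prior * lik
    post /= post.sum()
    ptox = post @ p                       # posterior mean toxicity per dose
    return int(np.argmin(np.abs(ptox - ttl))), ptox           # Step 4
```

After each patient, call crm_update and treat the next patient at the returned level.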
Properties of the CRM

Advantages:
- Model-based method with a clearly defined objective
- Treats more patients close to the target MTD level, hence reduces the number of patients treated at low or ineffective dose levels
- Uses all the data to model the dose-toxicity curve

Disadvantages:
- The dose assignment may be too aggressive
- Success depends on a proper choice of the dose-toxicity curve and the prior distribution on a
- Need a special computer program to implement the design

In order to address some of these safety concerns...
Modified CRM

c.f. Faries (1994, J. Biopharm. Stat.); Korn et al. (1994, Statist. in Med.); Goodman et al. (1995, Statist. in Med.)

Use CRM but with the following modifications:
- Start at the lowest dose level, and do not skip doses
- Do not escalate to a new dose until all patients at the previous doses have completed treatment
- Use asymmetric metrics to determine the current MTD, e.g., the level closest to but no higher than the TTL
- Use a cohort size of 2 or 3
- Possibly use a more conservative stopping rule, e.g., requiring no more than 2 of 6 patients to develop a DLT at any given dose level
Software for Modified CRM

1. MD Anderson option: the CRMSimulator package from biostatistics.mdanderson.org/softwaredownload/ can do simulations for the power model...
2. BCLM book-related option: the phaseisim.r function from www.biostat.umn.edu/~brad/software/bclm_ch3.html

The following code corresponds to the example on the next 2 pages!

  p.tox0 <- c(.05,.15,.3,.45,.6)
  s.dose <- log(p.tox0/(1-p.tox0)) - 3
  phaseisim(nsim=10000, npat=30, sdose=s.dose, prob.tox=p.tox0,
            design="3+3", outfile="3plus3.txt")    # 3+3 results
  phaseisim(nsim=100, npat=30, sdose=s.dose, prob.tox=p.tox0,
            outfile="CRM1.txt")                    # CRM1 results
  phaseisim(nsim=100, npat=30, sdose=s.dose, prob.tox=p.tox0,
            crm.group.size=3, outfile="CRM3.txt")  # CRM3 results
Modified CRM outperforms 3+3

Suppose that in developing a new agent, we have five potential dose levels, and our target toxicity level is 30%. We wish to simulate and compare the operating characteristics of the 3+3 and two CRM designs using 10,000 simulated trials. Suppose the true probabilities of DLT at dose levels 1 to 5 are 0.05, 0.15, 0.30, 0.45, and 0.60, respectively, so that dose level 3 is the true MTD.

The table on the next page shows that the CRM design with a cohort size of 1 (CRM 1) treats more patients at the true MTD level, but also more patients at dose levels above the MTD. The overall percentages of DLT for the 3+3 and CRM 1 designs are 21.1% and 27.0%, respectively. By increasing the CRM cohort size from 1 to 3 (CRM 3), we treat fewer patients at levels above the MTD.
Modified CRM outperforms 3+3

  Scenario 1            Dose level                       Ave    %
                        1     2     3     4     5        N      DLT
  P(DLT):               0.05  0.15  0.30  0.45  0.60
  3+3    % patients     26.0  32.5  27.2  12.1  2.3      15.2   21.1
         % MTD          20.5  42.7  27.5   5.7  0
  CRM 1  % patients     15.6  24.1  34.7  19.0  6.7      18.5   27.0
         % MTD           1.0  21.4  52.4  23.0  2.2
  CRM 3  % patients     21.3  31.4  29.1  15.8  2.5      19.0   23.3
         % MTD           1.5  22.6  49.8  23.7  2.4

- CRM designs are much more likely to find the true MTD
- CRM 3 offers protection for just a bit higher Ave N
- CRM also beats 3+3 when the assigned doses are less and more toxic than anticipated, respectively.
Escalation with Overdose Control (EWOC)
- Same as CRM, except when choosing the next dose we use the α-th quantile of the MTD's posterior, instead of the mean
- The MTD γ is the dose x for which P(DLT | x = MTD) = θ, the TTL
- Start at the lowest dose level, i.e., set x_1 = d_1
- For any patient k, let π_k(γ) be the MTD posterior cdf, π_k(γ) = P(MTD ≤ γ | y_k), where y_k is the data currently available
- Ideally, select the next dose level x*_k such that π_k(x*_k) = α
- To restrict to our dose levels {d_1, ..., d_r}, take
    x_k = max{ d_i ∈ {d_1, ..., d_r} : d_i − x*_k ≤ T_1 and π_k(d_i) − α ≤ T_2 }
  for prespecified tolerances T_1, T_2 > 0
Bayesian EWOC
- The EWOC doses x_k minimize risk with respect to the asymmetric loss function
    L(x, γ) = α (γ − x)       for x ≤ γ (i.e., x is an underdose)
            = (1 − α)(x − γ)  for x > γ (i.e., x is an overdose)
- Choosing the feasibility bound α < 0.5 corresponds to placing a higher penalty on overdosing than on underdosing
- Choosing α = 0.5 implies a symmetric loss function, and produces the MTD posterior median as the new dose
- When α ≪ 0.5, the final dose recommended for Phase II study (e.g., the MTD posterior median) may be significantly larger than the dose any Phase I patient has received; use a varying feasibility bound?
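Given posterior draws of the MTD (e.g., from MCMC), EWOC dose selection is just a quantile plus a grid restriction. A hedged Python sketch (names mine); the expected-loss helper illustrates numerically that the α-quantile minimizes the asymmetric loss above.

```python
import numpy as np

def ewoc_next_dose(mtd_draws, alpha, dose_levels, T1, T2):
    """EWOC dose selection: the alpha-quantile x* of the MTD posterior,
    restricted to the permitted dose grid within tolerances T1, T2."""
    x_star = np.quantile(mtd_draws, alpha)
    feasible = [d for d in dose_levels
                if d - x_star <= T1                         # not too far above x*
                and np.mean(mtd_draws <= d) - alpha <= T2]  # overdose control
    return max(feasible) if feasible else min(dose_levels)

def expected_loss(x, mtd_draws, alpha):
    """Posterior expected EWOC loss of administering dose x."""
    g = np.asarray(mtd_draws)
    return np.mean(np.where(x <= g, alpha * (g - x), (1 - alpha) * (x - g)))
```

Minimizing expected_loss over a fine grid of x recovers the α-quantile of the draws, which is the decision-theoretic justification for the EWOC rule.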
EWOC Implementation

Consider EWOC under the logistic model,
  Prob(DLT | dose = x) ≡ p(x) = exp(β_0 + β_1 x) / (1 + exp(β_0 + β_1 x)).

To ease prior specification, reparametrize from (β_0, β_1) to (ρ_0, γ), where ρ_0 = p(X_min), the probability of DLT at the minimum dose X_min, and γ is the MTD. Then
  β_0 = [γ logit(ρ_0) − X_min logit(θ)] / (γ − X_min)
  β_1 = [logit(θ) − logit(ρ_0)] / (γ − X_min).

Note: We assume that γ ∈ (X_min, X_max) with probability 1; we would typically take the starting dose d_1 = X_min.
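The reparametrization is a two-line function (names mine); a quick check confirms that it returns coefficients satisfying p(X_min) = ρ_0 and p(γ) = θ:

```python
import math

def logit(p):
    """log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def ewoc_betas(rho0, gamma, theta, x_min):
    """Map the interpretable EWOC parameters (rho0 = P(DLT) at X_min,
    gamma = MTD) back to the logistic coefficients (beta0, beta1)."""
    beta0 = (gamma * logit(rho0) - x_min * logit(theta)) / (gamma - x_min)
    beta1 = (logit(theta) - logit(rho0)) / (gamma - x_min)
    return beta0, beta1
```

Priors placed on (ρ_0, γ), as on the next slide, then induce a prior on (β_0, β_1) through this map.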
EWOC Code and Extensions
- www.biostat.umn.edu/~brad/data3.html offers WinBUGS EWOC code, using the model above and adopting independent uniform priors on γ and ρ_0 over the ranges (X_min, X_max) and (0, θ), respectively.
- sisyphus.emory.edu/software_ewoc.php is "the" EWOC website, featuring papers and relevant programs
- Covariate adjustment = "individualized patient dosing"
- Example: a Phase I non-small-cell lung cancer trial of PNU-214936, a monoclonal antibody, with a covariate (anti-SEA) previously shown to have a neutralizing effect on PNU. A convenient dose-toxicity model:
    P(DLT | x, c) = exp[β_0 + β_1 log(x) + β_2 log(c)] / (1 + exp[β_0 + β_1 log(x) + β_2 log(c)])
  where c denotes the anti-SEA level.
EWOC Extensions
- Recent work by Zabor (2010) considers the case of two covariates, one categorical and one continuous
- Example: trial of 852A, an "agonist" that can enhance the tumor-inhibiting and immune response-boosting properties of other oncologic agents. Let z be 0 for male, 1 for female, and let c ∈ [39, 80] be the patient's age in years. Then our model for P(DLT) is
    P(DLT | x, c, z) = exp[β_0 + β_1 x + β_2 c + β_3 z] / (1 + exp[β_0 + β_1 x + β_2 c + β_3 z]).
- Reparametrize to γ_max = γ(c = 80, z = 0) and
    ρ_1 = P(DLT | X = X_min, C = 39, Z = 0)
    ρ_2 = P(DLT | X = X_min, C = 80, Z = 0)
    ρ_3 = P(DLT | X = X_min, C = 39, Z = 1)
EWOC Results

[Figure 1: Probability of DLT by covariate group and dose]
[Figure 2: The marginal posterior distribution of the maximum tolerated dose (MTD) for an 80-year-old male as data accumulate]

- The left plot shows the true dose-response curves by gender, grouping ages into 3 clinically meaningful groups
- Simulated trials used a feasibility bound of α = 0.25, entered up to 20 cohorts of size 3, and stopped early if the 95% Bayesian CI had width < 1.2
- BRugs results for the posterior of γ(80, 0) in one trial (right plot) reveal Bayesian learning as data accumulate!
Outline

Session II (cont.)
- TITE-CRM
- Efficacy-Toxicity Trade-off (EffTox)
- Combination Therapy

Session III (Phase II Studies)
- Standard Phase IIA designs
- Predictive probability-based methods
- Posterior probability-based methods
  - Stopping for futility and efficacy
  - Stopping for futility, efficacy, and toxicity
- Adaptive Randomization (AR)
  - Baseline AR
  - Response (outcome-based) AR
- Biomarker-based adaptive design (BATTLE)
TITE-CRM: CRM with Survival Endpoint

Cheung and Chappell (2000, Biometrics): a survival outcome with minimal model assumptions. At (calendar) time t, for patient i recruited at time e_i and observed (or censored) over an observation window of length T, let
  w_i = min{ (t − e_i) / T, 1 }
and replace p(d_i) by w_i p(d_i) in the CRM likelihood. Using the resulting weighted pseudo-likelihood, that's all! Justification: the linear weight w_i approximates the conditional probability that a toxicity occurring within the window would already have been observed by the current follow-up time.
TITE-CRM Algorithm

0. Initialization: cohort size k = 3
1. Initial dose escalation: escalate until the first toxicity is observed or the highest level d = 6 is reached
2. Compute the posterior of a based on the weighted pseudo-likelihood
3. Allocation: from the estimated toxicity probabilities, select the dose closest to the target
4. Next cohort: simulate (when evaluating operating characteristics) or recruit (when carrying out the trial) the next cohort of k patients, i = n+1, ..., n+k, allocated at the selected dose. Record the recruitment times; when simulating, generate and save the (future) response times. Increment n and advance the calendar time, t = t + 0.5.
5. Stopping: if the maximum sample size is reached, stop and report the posterior estimated toxicity probabilities (computed as in Step 2). Otherwise repeat from Step 2.
Compute the Posterior of a

Evaluate the posterior expectation as an average over a grid: for grid values a_1, ..., a_G, evaluate the pointwise unnormalized posterior h(a_g) = prior(a_g) × L(a_g; d, y), and compute
  E[f(a) | y] ≈ Σ_g f(a_g) h(a_g) / Σ_g h(a_g).
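Putting the TITE weighting (Step 2) and the grid-based posterior computation together, here is a rough Python sketch; the power-model skeleton, the normal prior on a, and the argument names are my illustrative assumptions, not the slides' exact setup.

```python
import numpy as np

def tite_weights(entry_times, t_now, T):
    """w_i = min(follow-up time / observation window T, 1) per patient."""
    u = np.clip(t_now - np.asarray(entry_times, dtype=float), 0.0, None)
    return np.minimum(u / T, 1.0)

def tite_ptox(skeleton, y, d_idx, w, prior_sd=1.34):
    """Posterior mean P(DLT) per dose under p(d) = d**exp(a), computed on
    a grid, with each patient's p(d_i) replaced by w_i * p(d_i) in the
    pseudo-likelihood (a patient with an observed DLT has w_i = 1)."""
    a = np.linspace(-4, 4, 801)
    prior = np.exp(-0.5 * (a / prior_sd) ** 2)
    p = np.asarray(skeleton)[None, :] ** np.exp(a)[:, None]
    lik = np.ones_like(a)
    for yi, di, wi in zip(y, d_idx, w):
        pi = wi * p[:, di]                     # weighted toxicity probability
        lik *= pi ** yi * (1.0 - pi) ** (1 - yi)
    post = prior * lik                         # pointwise unnormalized posterior
    post /= post.sum()                         # normalize by the grid sum
    return post @ p                            # grid average, as above
```

Partially followed patients thus contribute a down-weighted "no toxicity yet" term, and fully followed patients the usual CRM likelihood contribution.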
Software and Example
- SAS program: http://roadrunner.cancer.med.umich.edu/wiki/index.php/tite-crm
- R code: titecrm library

Example
- Simulation truth: p0 = 0.05, 0.1, 0.2, 0.3, 0.5, 0.7 for d = 0.05, 0.1, 0.2, 0.3, 0.5, 0.7, i.e., the CRM model with a = 1
- Target: p* = 20%
- Initial dose escalation: first two cohorts, t = 0 and 6
- Posterior updating: starting with t = 12
A Simulated Trial Realization

[Figure: (a) allocated dose d* by calendar time t; (b) estimated toxicity probabilities by t]
Dose-Finding Based On Efficacy-Toxicity Trade-Offs

Thall and Cook, Biometrics 60:684-693, 2004

Patient outcome = {Efficacy, Toxicity}. The physician must specify:
- a lower limit p_E* on π_E = Pr(Efficacy)
- an upper limit p_T* on π_T = Pr(Toxicity)
- three equally desirable (π_E, π_T) targets to construct an efficacy-toxicity trade-off contour

A dose x is acceptable if, e.g.,
  Pr{ π_E(x, θ) > p_E* | data } > .90 and Pr{ π_T(x, θ) < p_T* | data } > .90

Given the current data, compute the posterior probabilities of π_E and π_T to determine the dose level for the next patient.
Use the EffTox program from http://biostatistics.mdanderson.org/softwaredownload/
A Cohort-by-Cohort Illustration
- AML patients relapsed within 6 months of CR
- Rx = fixed-dose ara-C + one of 4 doses of a new anti-sense biological agent
- Res = alive and in CR at day 35
- Tox = grade 4 symptomatic toxicity within 35 days
- N_max = 36, cohort size = 3
- p_T* = .50 and p_E* = .20, for target pairs (.20, 0), (.60, .40), (1.00, .50)

(Courtesy of Dr. Peter Thall)
Dose-Finding with the Combination of Two Agents

Thall et al. (2003, Biometrics)
- y_i ∈ {0, 1}: binary toxicity response; x_i: bivariate dose for the two agents
- Model: a parametric probability-of-toxicity surface π(x, θ), increasing in each agent's dose
- Dose finding: climb up to reach the target toxicity; then adjust dose combinations, keeping toxicity unchanged.
  - Stage 1: dose escalation on a grid of combinations until the target toxicity is reached.
  - Stage 2: maintaining toxicity, adjust the dose combination to maximize cancer-cell killing and learning.
Stage 1
- Define a grid of dose combinations on a fixed line segment
- Start with the lowest dose
- Let the posterior mean toxicity surface after the n-th cohort be the current estimate
- Assign the (n+1)-st cohort at the grid dose whose posterior mean toxicity is closest to the target, subject to escalation control
- When the first toxicity is observed, refine the grid by introducing half-steps
- Stop after a prespecified number of patients.
Stage 2
- Let the working set be the set of dose combinations with posterior mean toxicity equal to the target (the estimated equi-toxicity contour)
- Alternate between doses in this set
- Each cohort, assign the dose in the set that maximizes expected learning and cell killing.
Example: Gemcitabine + Cyclophosphamide
- Stage 1: escalation on a grid over the line segment
- Stage 2: adjust the dose combination
As the Trial Progresses
Operating Characteristics

[Table: dose combinations selected for the Gem/CTX trial (SDs as subscripts): selected dose; estimated and true toxicity; V = posterior uncertainty]
Phase IIA Clinical Trial Design
- H_0: p ≤ p_0, where p_0 is an uninteresting response rate
- H_1: p ≥ p_1, where p_1 is a target response rate
- The observed responses: X ~ Binomial(n, p)

Simon's Minimax/Optimal 2-Stage Design
- Stage 1: enroll n_1 patients. If ≤ r_1 responses, stop the trial and reject H_1; otherwise, continue to Stage 2.
- Stage 2: enroll N_max − n_1 patients. If ≤ r total responses, reject H_1 (accept H_0); otherwise, reject H_0.
- PET(p_0) = Prob(early termination | p_0) = pbinom(r_1, n_1, p_0)
- E(N | p_0) = n_1 + [1 − PET(p_0)] (N_max − n_1)
- Minimax design: gives the smallest N_max
- Optimal design: yields the smallest E(N | p_0)
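The PET and E(N | p_0) formulas above are a one-liner; plugging in the minimax design parameters (r_1/n_1 = 3/19, N_max = 36, p_0 = 0.2) reproduces the PET(p_0) ≈ 0.46 and E(N | p_0) ≈ 28.26 reported later for that design (a small sketch, function names mine):

```python
from math import comb

def binom_cdf(r, n, p):
    """P(X <= r) for X ~ Binomial(n, p); this is pbinom(r, n, p) in R."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r + 1))

def simon_stage1(r1, n1, n_max, p0):
    """Early-termination probability and expected sample size under p0
    for a Simon two-stage design (stop stage 1 if <= r1 responses in n1)."""
    pet = binom_cdf(r1, n1, p0)                # PET(p0)
    en = n1 + (1 - pet) * (n_max - n1)         # E(N | p0)
    return pet, en
```

For example, simon_stage1(3, 19, 36, 0.2) gives PET ≈ 0.455 and E(N | p_0) ≈ 28.26.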
Predictive Probability (PP) Design
- PP is the probability of rejecting H_0 at the end of the study should the current trend continue, i.e., given the current data, the chance of a positive result at the end of the study.
- If PP is very large or very small, essentially we know the answer; we can stop the trial and draw a conclusion.
- Number of future patients after x responses in n patients: m = N_max − n
- Number of future responses: Y ~ beta-binomial(m, a_0 + x, b_0 + n − x)
- For each Y = i: f(p | x, Y = i) = beta(a_0 + x + i, b_0 + N_max − x − i)
- PP = Σ_i { Prob(Y = i | x) × I[ Prob(p > p_0 | x, Y = i) > θ_T ] }
- If PP < θ_L, stop the trial and accept H_0. Otherwise, continue to the next stage, up to N_max.
- At the end of the trial, reject H_0 if Prob(p > p_0 | x) > θ_T; otherwise, accept H_0.
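The PP computation is a finite sum over the beta-binomial predictive distribution of the future responses. A self-contained Python sketch (names mine; the Beta tail probability is computed by simple numerical integration to avoid external dependencies):

```python
from math import comb, lgamma, log, exp

def betaln(a, b):
    """log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_tail(p0, a, b, m=4000):
    """P(p > p0) for p ~ Beta(a, b), via midpoint-rule integration."""
    logc = -betaln(a, b)
    h = (1.0 - p0) / m
    xs = (p0 + (i + 0.5) * h for i in range(m))
    return sum(exp(logc + (a - 1) * log(x) + (b - 1) * log(1 - x))
               for x in xs) * h

def predictive_prob(x, n, n_max, p0, theta_T, a0=0.2, b0=0.8):
    """PP = sum_i P(Y = i | x) * 1{ P(p > p0 | x, Y = i) > theta_T },
    where Y ~ beta-binomial(m, a0 + x, b0 + n - x) and m = n_max - n."""
    m = n_max - n
    a, b = a0 + x, b0 + n - x
    pp = 0.0
    for i in range(m + 1):
        w = exp(log(comb(m, i)) + betaln(a + i, b + m - i) - betaln(a, b))
        if beta_tail(p0, a + i, b + m - i) > theta_T:
            pp += w
    return pp
```

Comparing predictive_prob against θ_L at each interim look yields the futility-stopping boundaries tabulated on the following slides.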
PP Designs

Goal: find θ_L, θ_T, and N_max to satisfy the Type I and Type II error rate constraints.

Properties:
1. The PP design can control the Type I and Type II error rates while allowing interim monitoring.
2. Under H_0, the PP design can yield a higher PET(p_0), and a smaller E(N | p_0) or N_max, than Simon's 2-stage design.
3. The PP design produces a flexible monitoring schedule with robust operating characteristics across a wide range of stages and cohort sizes.
4. Advantages of the PP design compared to standard multi-stage designs: more flexible, more efficient, more robust.
Operating characteristics of designs with Type I and Type II error rates ≤ 0.10; prior for p = beta(0.2, 0.8), p_0 = 0.2, p_1 = 0.4

Simon's Minimax/Optimal 2-Stage:
  Design    r1/n1  r/Nmax  PET(p_0)  E(N|p_0)  α      β
  Minimax   3/19   10/36   0.46      28.26     0.086  0.098
  Optimal   3/17   10/37   0.55      26.02     0.095  0.097

Predictive Probability:
  θ_L    θ_T              r/Nmax    PET(p_0)  E(N|p_0)  α      β
  ---    ---              na/35+    ---       ---       0.126  0.093
  ---    ---              na/35++   ---       ---       0.074  0.116
  0.001  [0.852,0.922]*   10/36     0.86      27.67     0.088  0.094
  0.011  [0.830,0.908]    10/37     0.85      25.13     0.099  0.084
  0.001  [0.876,0.935]    11/39     0.88      29.24     0.073  0.092
  0.001  [0.857,0.923]    11/40     0.86      30.23     0.086  0.075
  0.003  [0.837,0.910]    11/41     0.85      30.27     0.100  0.062
  0.043  [0.816,0.895]    11/42     0.86      23.56     0.099  0.083
  0.001  [0.880,0.935]    12/43     0.88      32.13     0.072  0.074
  0.001  [0.862,0.924]    12/44     0.87      33.71     0.085  0.059
  0.001  [0.844,0.912]    12/45     0.85      34.69     0.098  0.048
Stopping Boundaries for p_0 = 0.20, p_1 = 0.40, α = β = 0.10

        Simon's Optimal            PP
  n     Rej Region  PET(p_0)       Rej Region  PET(p_0)
  10                               0           0.1074
  17    3           0.55           1           0.0563
  21                               2           0.0663
  24                               3           0.0815
  27                               4           0.0843
  29                               5           0.1010
  31                               6           0.0996
  33                               7           0.0895
  34                               8           0.0946
  35                               9           0.0767
  36                               10
  Total             0.55                       0.86

PP design (prior for p = beta(0.2, 0.8)): α = 0.088, β = 0.094, E(N | p_0) = 27.67, PET(p_0) = 0.86
Simon's MiniMax: α = 0.086, β = 0.098, E(N | p_0) = 28.26, PET(p_0) = 0.46
Simon's Optimal: α = 0.095, β = 0.097, E(N | p_0) = 26.02, PET(p_0) = 0.55
Stopping Boundaries

[Figure: rejection region in number of responses (0-10) vs. number of patients (0-36), for Simon's MiniMax, Simon's Optimal, and the PP design]
Posterior Probability-Based Designs

Bayesian sequential monitoring designs for single-arm trials with multiple outcomes (Thall et al., Statistics in Medicine, 1995)
- Denote the experimental tx by E and the standard tx (historical data) by S
- Prob. of response is θ_R and prob. of toxicity is θ_T
- Stop the trial if
    Prob(θ_{R,S} + δ_R > θ_{R,E}) > π   (E has a lower response rate than S), or
    Prob(θ_{T,S} + δ_T < θ_{T,E}) > π   (E has a higher toxicity rate than S)
- δ_R (typically 0 ≤ δ_R ≤ 0.1) is an offset: the minimal response improvement of E over S
- δ_T (typically −0.1 ≤ δ_T ≤ 0) is an offset: the maximal toxicity allowance of E over S
- Otherwise, continue
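With independent Beta posteriors for the two rates (the S posterior built from historical data), the monitoring probabilities above are easy to estimate by Monte Carlo. A hedged sketch for the response criterion (names and flat default priors are mine; in practice the S prior would be informative):

```python
import numpy as np

def prob_inferior(xE, nE, xS, nS, delta, aE=1, bE=1, aS=1, bS=1,
                  ndraw=200_000, seed=0):
    """Monte Carlo estimate of P(theta_S + delta > theta_E | data) under
    independent Beta posteriors for the standard (S) and experimental (E)
    arms; stop for futility when this exceeds the cutoff pi."""
    rng = np.random.default_rng(seed)
    thE = rng.beta(aE + xE, bE + nE - xE, ndraw)   # posterior draws for E
    thS = rng.beta(aS + xS, bS + nS - xS, ndraw)   # posterior draws for S
    return float(np.mean(thS + delta > thE))
```

The toxicity criterion is the same computation with the inequality reversed and δ_T in place of δ_R.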
The following are less(greater)-than-or-equal boundaries: a pair (n, m) means to stop if the number of responses (toxicities) after treating m patients is less (greater) than or equal to n.

Response stopping boundaries (stop if ≤ n responses in m patients):
  (0, 1), (1, 5), (2, 8), (3, 11), (4, 14), (5, 17), (6, 20), (7, 23), (8, 26), (9, 29), (10, 30)

Toxicity stopping boundaries (stop if ≥ n toxicities in m patients):
  (3, 3), (3, 4), (4, 6), (5, 8), (5, 9), (6, 11), (7, 13), (7, 14), (8, 16), (8, 17), (9, 19), (9, 20), (10, 22), (11, 24), (11, 25), (12, 27), (12, 28), (13, 30)
Stopping for Futility, Efficacy and Toxicity

Define a set of elementary events, e.g.,
  A_1 = (CR, TOX), A_2 = (no CR, TOX), A_3 = (CR, no TOX), A_4 = (no CR, no TOX);
then:
- Probability model: each patient's outcome is multinomial over (A_1, ..., A_4) with cell probabilities p = (p_1, ..., p_4)
- Prior: p ~ Dirichlet(α_1, ..., α_4)
- Posterior: after observing cell counts n = (n_1, ..., n_4), p | data ~ Dirichlet(α_1 + n_1, ..., α_4 + n_4), by conjugacy
Decision Rule
- Choose thresholds on the posterior probabilities to correspond to clinically meaningful events
- Sequential stopping rule
- Computation: evaluating π_n(CR) and π_n(TOX) requires integration with respect to two independent Beta random variables
Example
- Study: Phase II trial of post-transplant prophylaxis for GVHD in bone marrow transplant (BMT)
- Endpoints:
  - Efficacy: GVHD within 100 days post transplant; G = no GVHD within 100 days ("CR")
  - Toxicity: transplant rejection; T = transplant rejection within 100 days ("TOX")
- Elementary outcomes: the four combinations of G and T
- Prior:
- Design:
Simulation
Adaptive Randomization (AR)

(Equal or adaptive) randomization is a good thing:
- results in comparable study groups wrt known/unknown prognostic factors
- removes investigator bias in the treatment allocation
- guarantees the validity of statistical tests

Baseline AR
- Ensures balance in prognostic factors among treatment groups
- Based on baseline covariates (static)
- Treatment allocation methods: biased coin, urn design; Pocock-Simon (1975), minimization

Response (outcome-based) AR
- Allocates more patients to superior treatments and fewer patients to inferior treatments, based on the observed data
- Play the winner: deterministic
- Model-based: probabilistic
Response Adaptive Randomization
- Play-the-winner (deterministic)
- Model-based (probabilistic): urn model, bandit problem, probability-based randomization

Goal: maximize the number of patients assigned to the superior arm (total number of successes)

Advantages:
- Treat more patients with the better treatments
- Obtain a more precise estimate for the better treatments

Disadvantages/Limitations:
- Imbalance results in loss of efficiency
- Requires the response to be measured quickly
Two-Arm Adaptive Randomization (AR)
- Consider two treatments, binary outcome
- The first n patients are equally randomized (ER) into TX1 and TX2
- After the ER phase, the next patient is assigned to TX1 with probability
    π_1 = P(p_1 > p_2 | data)^λ / [ P(p_1 > p_2 | data)^λ + P(p_2 > p_1 | data)^λ ]
- Note the tuning parameter λ:
  - λ = 0: ER
  - λ = ∞: play the winner
  - λ = 0.5 or 1: common fixed choices
  - λ = n / N_max (Thall and Wathen, Europ. J. Cancer, 2007)
- Continue until reaching the early stopping criteria or N_max
Equal Randomization (ER) vs. ER + Adaptive Randomization

[Figure: AR rate to TX 1; demo for more two-arm AR designs]
Randomized Two-Arm Trial

Frequentist approach:
- H_o: P_1 = P_2 vs. H_1: P_1 < P_2
- P_1 = 0.3, P_2 = 0.5, α = .025 (one-sided), 1 − β = .8
- N_1 = N_2 = 103, N = 206

Bayesian approach with adaptive randomization:
- Consider P_1 and P_2 as random; give a prior distribution; compute the posterior distribution after observing outcomes
- Randomize more patients proportionally into the arm with the higher response rate
- At the end of the trial:
  - Prob(P_1 > P_2) > 0.975: conclude Tx 1 is better
  - Prob(P_2 > P_1) > 0.975: conclude Tx 2 is better
- At interim:
  - Prob(P_1 > P_2) > 0.999: stop the trial early, conclude Tx 1 is better
  - Prob(P_2 > P_1) > 0.999: stop the trial early, conclude Tx 2 is better
AR Comparisons (4)

                       No AR        AR           AR w/ Early     AR w/ Early
                                                 Stopping        Stopping
                                                 (N_max = 200)   (N_max = 250)
                       H_o    H_1   H_o    H_1   H_o    H_1      H_o    H_1
  N_1                  100    100   100    46    98     42       122    46
  N_2                  100    100   100    154   97     125      121    150
  N                    200    200   200    200   195    167      243    196
  P(Tx1 better)        .02    0     .04    0     .05    0        .05    0
  P(Tx2 better)        .03    .83   .04    .75   .05    .76      .05    .85
  P(early stopping)                 .04    .34   .04    .34      .04    .44
  P(rand. in arm 2)    .50    .50   .50    .77   .50    .75      .50    .77
  Overall resp.        .30    .40   .30    .45   .30    .45      .30    .45
Demo for Multi-Arm AR Designs
Biomarker-Based Adaptive Designs

Identify prognostic and predictive markers for treatments:
- Prognostic markers: markers that associate with the disease outcome regardless of the treatment, e.g., stage, performance status
- Predictive markers: markers that predict differential treatment efficacy in different marker groups, e.g., in Marker(−) patients the tx does not work, but in Marker(+) patients it does

Test treatment efficacy:
- Control Type I and Type II error rates
- Maximize study power for testing efficacy between txs
- Group ethics

Provide better treatment to patients enrolled in the trial:
- Assign patients to the better treatments with higher probabilities
- Maximize the total number of successes in the trial
- Individual ethics
BATTLE (Biomarker-based Approaches of Targeted Therapy for Lung Cancer Elimination)
Patient population: stage IV recurrent non-small cell lung cancer (NSCLC)
Primary endpoint: 8-week disease control rate [DCR]
4 targeted treatments, 14 biomarkers, 200 patients
20% Type I error rate and 80% power
Zhou X, Liu S, Kim ES, Lee JJ. Bayesian adaptive design for targeted therapy development in lung cancer - a step toward personalized medicine (Clin Trials, 2008).
Biomarker Analysis in Core Biopsies for Targeted Therapy Response Prediction
Baseline biopsy, biomarker assessment:
1. EGFR mutation
2. EGFR overexpression/amplification
3. EGFR increased copy number
4. KRAS mutation
5. BRAF mutation
6. VEGF expression
7. VEGFR-2 expression
8. RXRα cytoplasm expression
9. RXRα nucleus expression
10. RXRβ cytoplasm expression
11. RXRβ nucleus expression
12. RXRγ expression
13. Cyclin D1 expression
14. Cyclin D1 amplification
Workflow: histology (H&E) sections; DNA extraction and mutation analysis (EGFR exon 19, K-RAS codon 12, B-RAF); FISH and immunohistochemistry (EGFR FISH, VEGF, VEGFR-2, RXR-α, RXR-β, RXR-γ)
Four Molecular Pathways Targeted in the BATTLE Program
Enrollment into BATTLE umbrella protocol; biomarker profile and adaptive randomization

MG   EGFR   K-ras and/or B-raf   VEGF and/or VEGFR   RXR and/or Cyclin D1   Frequency
1    +      x                    x                   x                      0.15
2    -      +                    x                   x                      0.20
3    -      -                    +                   x                      0.30
4    -      -                    -                   +                      0.25
5    -      -                    -                   -                      0.10

Treatments: Erlotinib; Vandetanib; Erlotinib + Bexarotene; Sorafenib
Endpoint: progression-free survival at 8 weeks, i.e., Disease Control Rate (DCR)
Bayesian Hierarchical Probit Model
Probit model with hyperprior (Albert and Chib, 1993)
Notation:
- i: subject index, i = 1, ..., n_jk
- j: treatment arm, j = 1, ..., 4
- k: marker group, k = 1, ..., 5
- y_ijk: 8-week progression-free survival status, 0 (no) vs. 1 (yes)
- z_ijk: latent variable (y_ijk = 1 iff z_ijk > 0)
- μ_jk: location parameter
- φ_j: hyperprior mean for the μ_jk
- γ_jk: disease control rate (DCR)
- σ², τ²: hyperparameters controlling borrowing across MGs within and between treatments
Computation of the Posterior Probability via Gibbs Sampling over the Full Conditional Distributions
The random variables are generated from their complete posterior conditional distributions as follows:
- The latent variable z_ijk is sampled from a truncated normal distribution centered at μ_jk.
- The full conditional distributions of μ_jk and φ_j are normal, precision-weighted combinations of the prior distribution and the sampling distribution.
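The two Gibbs updates just described can be sketched for a single (j, k) cell. This is an illustrative Albert-Chib step with unit latent variance, not the actual BATTLE code:

```python
import random
from statistics import NormalDist

_nd = NormalDist()  # standard normal helper

def sample_latent(y, mu, rng):
    """Draw z ~ N(mu, 1) truncated to (0, inf) if y = 1, or to
    (-inf, 0] if y = 0, via inverse-CDF sampling."""
    cut = _nd.cdf(-mu)  # P(z <= 0) for z ~ N(mu, 1)
    u = rng.uniform(cut, 1.0) if y == 1 else rng.uniform(0.0, cut)
    return mu + _nd.inv_cdf(u)

def update_mu(zbar, n, prior_mean, prior_var, rng):
    """Conjugate normal draw for mu_jk given n latent z's with mean zbar
    (unit sampling variance) and prior N(prior_mean, prior_var)."""
    post_var = 1.0 / (n + 1.0 / prior_var)
    post_mean = post_var * (n * zbar + prior_mean / prior_var)
    return rng.gauss(post_mean, post_var ** 0.5)
```

Alternating these draws (plus an analogous conjugate step for φ_j) gives the full Gibbs sampler.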
Equal Randomization (ER) Followed By Adaptive Randomization (AR)
ER is applied in the first stage for model development
AR is applied after enrolling at least one patient in each (Treatment x MG) subgroup
Adaptively assign the next patient to the treatment arms with Randomization Rate (RR) proportional to the marginal posterior disease control rates: RR for arm j in MG k is γ̂_jk / Σ_w γ̂_wk
Set a minimum RR of 10% to ensure a reasonable probability of randomizing pts to each arm
Suspend randomization of a treatment in a biomarker group if Prob(DCR > 0.5 | Data) < 10%
Declare a treatment effective in a biomarker group if Prob(DCR > 0.3 | Data) > 80%
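A minimal sketch of the randomization-rate computation follows; the one-pass enforcement of the 10% minimum is a hypothetical illustration, since the trial's exact renormalization scheme is not specified here:

```python
def battle_rand_probs(dcr, floor=0.10):
    """Randomization rates proportional to posterior DCRs gamma_jk,
    with a minimum rate for every open arm (one-pass renormalization)."""
    total = sum(dcr)
    probs = [g / total for g in dcr]
    low = [i for i, p in enumerate(probs) if p < floor]
    high = [i for i in range(len(probs)) if i not in low]
    spare = 1.0 - floor * len(low)          # mass left for the other arms
    high_mass = sum(probs[i] for i in high)
    out = [0.0] * len(probs)
    for i in low:
        out[i] = floor                      # raise low arms to the floor
    for i in high:
        out[i] = probs[i] * spare / high_mass
    return out
```

Arms suspended by the Prob(DCR > 0.5 | Data) < 10% rule would simply be dropped from the input list before calling this.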
Simulation Results, Scenario 1 (with early stopping)
Demo of Biomarker-Based Multi-Arm AR Designs
BATTLE Results: Disease Control in % (n)

Treatment \ Marker Group   EGFR       KRAS       VEGF       RXR/CycD1   None       Total
Erlotinib                  35% (17)   14% (7)    40% (25)   0% (1)      38% (8)    34% (58)
Vandetanib                 41% (27)   0% (3)     38% (16)   NA (0)      0% (6)     33% (52)
Erlotinib + Bexarotene     55% (20)   33% (3)    0% (3)     100% (1)    56% (9)    50% (36)
Sorafenib                  39% (23)   79% (14)   64% (39)   25% (4)     61% (18)   58% (98)
Total                      43% (87)   48% (27)   49% (83)   33% (6)     46% (41)   46% (244)

Presentation: http://app2.capitalreach.com/esp1204/servlet/tc?cn=aacr&c=10165&s=20435&e=12587&&m=1&br=80&audio=false
Lessons Learned from BATTLE
Biomarker-based adaptive design is doable! It is well received by clinicians and patients.
Prospective tissue collection & biomarker analysis provide a wealth of information.
Treatment effects and predictive markers are efficiently assessed.
Pre-selecting markers is not a good idea: we don't know the best predictive markers at the get-go.
Bundling markers into groups, although it can reduce the total number of marker patterns, is not the best way to use the marker information either.
AR should kick in early & needs to be closely monitored.
AR works well only when we have good drugs and good predictive markers.
Section 4.4: Adaptive Randomization
Some Bayesians don't need/like randomization:
- it plays no role in calculating posterior probabilities (whereas it is crucial for frequentist inference)
- we can control for prognosis-related covariates anyway
- it is ethically difficult for physicians
- patients willing to be randomized are inherently different
Our take: randomization IS still essential:
- it ensures pt prognosis is uncorrelated with trt assigned
- it balances trt assignment within patient subgroups
- we can't control for unknown/unmeasured covariates!
BUT: note that we needn't randomize patients with equal probabilities to all arms...
Principles of Adaptive Randomization
By adaptive, we mean a procedure that alters something based on the results of the trial so far, with implications for Type I error!
Here we focus on outcome-adaptive designs, not covariate-adaptive designs that seek to balance covariates across treatments
Basic idea: treatment arms A_k have response probabilities θ_k, k = 1, ..., m. Given data y, randomize to treatment A_k with probability
  r_k(y) ∝ {p(θ_k = max_j θ_j | y)}^c for some c ≥ 0   (1)
c = 0 gives equal randomization; might take c = n/(2N), where n is the number of currently enrolled pts and N is maximum enrollment
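Equation (1) can be sketched by Monte Carlo. The code below assumes independent Beta(1, 1) priors (an assumption for illustration) and estimates p(θ_k = max_j θ_j | y) by posterior sampling before applying the power c:

```python
import random

def ar_probs(successes, totals, c, draws=20000, seed=0):
    """Monte Carlo version of equation (1): r_k(y) proportional to
    P(theta_k = max_j theta_j | y)^c under independent Beta(1,1) priors."""
    rng = random.Random(seed)
    m = len(successes)
    best = [0] * m
    for _ in range(draws):
        th = [rng.betavariate(1 + s, 1 + n - s)
              for s, n in zip(successes, totals)]
        best[th.index(max(th))] += 1
    w = [(b / draws) ** c for b in best]   # c = 0 gives equal weights
    tot = sum(w)
    return [x / tot for x in w]
```

With c = 0 this recovers equal randomization, and larger c pushes allocation toward the currently best arm.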
MDACC software package: AR
Windows application for trials having up to 10 arms
outcomes may be either binary or time-to-event (TITE), though the latter case is currently restricted to exponential survival with a conjugate prior
easy-to-read user's guide; 583 registered downloads between 2005 and 2009
Binary case: assign Beta(α_k, β_k) priors to the θ_k by choosing the (α_k, β_k) pairs directly, or by specifying either two quantiles or the mean and the variance; AR can then back out (α_k, β_k)
Assuming x_k positive and n_k - x_k negative (independent) responses ⇒ θ_k | y ~ Beta(x_k + α_k, n_k - x_k + β_k) as usual ⇒ used to define a variety of stopping rules...
Algorithm used by AR
Step 1. Early loser: If for some prespecified probability p_L,
  P(θ_k > θ_j, j ≠ k | y) < p_L,
then arm k is declared a loser and is suspended.
- Normally p_L is fairly small (say, 0.10 or less)
- AR permits an arm to return later
Step 2. Early winner: If for some prespecified probability p_U,
  P(θ_k > θ_j, j ≠ k | y) > p_U,
then arm k is declared the winner and the trial is stopped early.
- Normally p_U is fairly large (say, 1 - p_L for a two-arm trial)
Algorithm used by AR (cont'd)
Step 3. Final winner: If, after all patients have been evaluated, for some prespecified probability p_U′,
  P(θ_k > θ_j, j ≠ k | y) > p_U′,
then arm k is declared the winner. If no treatment arm can meet this criterion, AR does not make a final selection.
- Normally p_U′ < p_U (say, between 0.70 and 0.90)
Step 4. Futility: If for some θ_min and some prespecified p_L′,
  P(θ_k > θ_min | y) < p_L′,
then arm k is declared futile and its accrual is stopped.
- Reactivation of a futile arm is not permitted
- Normally p_L′ is quite small (say, 0.10 or less)
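The four rules can be collected into a single classification function. This is a sketch of the logic only: the threshold names follow the slides, but the exact order in which the AR package checks the rules is an assumption.

```python
def ar_decision(post_best, post_gt_min, p_L=0.10, p_U=0.975,
                p_U2=0.90, p_L2=0.05, final=False):
    """Classify one arm by the four AR rules (sketch), where
    post_best   = P(theta_k > theta_j, all j != k | y) and
    post_gt_min = P(theta_k > theta_min | y)."""
    if post_gt_min < p_L2:
        return "futile"          # Step 4: accrual stopped, no reactivation
    if final:
        return "final winner" if post_best > p_U2 else "no selection"  # Step 3
    if post_best < p_L:
        return "early loser"     # Step 1: suspended, may return later
    if post_best > p_U:
        return "early winner"    # Step 2: trial stops early
    return "continue"
```

The two posterior probabilities would come from the Beta posteriors of the previous slide, e.g., by Monte Carlo.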
Example: Sensitizer Trial
Goal: evaluate the ability of a sensitizer (given concurrently with another chemotherapeutic agent) to produce complete remission (CR) at 28 days post-treatment
A classical two-arm comparison of drug-plus-sensitizer vs. drug alone needs a sample size near 100: too large for our accrual rate of just 30/year!
Instead, use some prior information with a Bayesian AR design having
- maximum patient accrual = 60
- minimum randomization probability = 0.10
- first 14 patients randomized fairly (7 to each arm) before AR begins
- AR tuning parameter c = 1 (i.e., modest deviation from equal randomization after the first 14 patients)
Priors for the Sensitizer Trial
[density plots of the priors on θ for Arm 1 (control) and Arm 2 (sensitizer)]
Left: "standard" priors; right: "conservative" priors (both sd's doubled)
Both Arm 1 priors have mean 0.55; both Arm 2 priors have mean 0.75
Control Parameters for Sensitizer Trial
Begin with a "standard" rule that sets
- early loser selection probability p_L = 0.025
- early winner selection probability p_U = 0.975
- final winner selection probability p_U′ = 0.90
- futility parameters θ_min = 0.50 and p_L′ = 0.05
Use AR to compare results from three different scenarios:
- Scenario 1: true response rates of .55 in both groups (the "null" scenario),
- Scenario 2: true response rates of .55 control, .70 sensitizer (the "most likely" scenario), and
- Scenario 3: true response rates of .55 control, .80 sensitizer (the "optimistic" scenario).
100 Simulated Trials, Standard Prior

Scenario 1 (Average Trial Length: 22.5 months)
       True Pr     Pr        Pr(select  Pr(stop  # Patients
Arm    (success)   (select)  early)     early)   (2.5%, 97.5%)
Arm1   0.55        0.01      0          0.11     19.6 (5, 38)
Arm2   0.55        0.16      0.11       0        35.6 (8, 53)

Scenario 2 (Average Trial Length: 16.4 months)
Arm1   0.55        0         0          0.55     10.1 (4, 22)
Arm2   0.7         0.74      0.55       0        30.8 (4, 51)

Scenario 3 (Average Trial Length: 10.8 months)
Arm1   0.55        0         0          0.89     7.01 (4, 16)
Arm2   0.8         0.96      0.89       0        20.1 (4, 51)

Good Type I error (17% total) and power (74%, 96%)
But fairly high sample sizes & fairly long trial lengths
100 Simulated Trials, Conservative Prior

Scenario 1 (Average Trial Length: 21.0 months)
       True Pr     Pr        Pr(select  Pr(stop  # Patients
Arm    (success)   (select)  early)     early)   (2.5%, 97.5%)
Arm1   0.55        0.19      0.05       0.15     26.2 (5, 45)
Arm2   0.55        0.16      0.13       0.09     25.4 (4, 47)

Scenario 2 (Average Trial Length: 18.1 months)
Arm1   0.55        0.04      0.02       0.39     15.2 (4, 44)
Arm2   0.7         0.52      0.39       0.02     29.4 (4, 49)

Scenario 3 (Average Trial Length: 14.3 months)
Arm1   0.55        0         0          0.63     10.6 (2, 26)
Arm2   0.8         0.80      0.63       0        25.6 (3, 50)

The conservative prior is indeed more conservative: higher Type I error, lower power
Still fairly low average sample sizes (36.2, 44.6)
Summary
MD Anderson software available at https://biostatistics.mdanderson.org/softwaredownload/
BCLM text-related software available at http://www.biostat.umn.edu/~brad/data3.html
BRugs package installation and further examples: http://www.biostat.umn.edu/~brad/software/brugs/
Related design site for binary and Cox PH models: www.biostat.umn.edu/~brianho/papers/2007/jbs/prac_bayes_design.html
Thanks for your attention!
Intro Power Priors Alternatives Metaanalysis GLMs Rando Discussion Hierarchical Commensurate Prior Models for Adaptive Incorporation of Historical Information in Clinical Trials Brian P. Hobbs 1, Bradley P. Carlin 2, Sumithra Mandrekar, 3 and Daniel Sargent 3 1 Department of Biostatistics, University of Texas M.D. Anderson Cancer Center, Houston, TX 2 Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 3 Mayo Clinic, Rochester, MN University of Chicago, Department of Health Studies, December 7, 2011
Background: Using Historical Data
Email 8 Sep 2008 from Dr. Telba Irony, FDA: "When we try to borrow strength from only one historical study (be it a control group or a treatment group)... [the results] become VERY sensitive to the hyperprior [on the variance parameters that control the amount of borrowing]."
Borrowing from historical data offers advantages: reduced sample size (at least in the control group), hence lower cost and ethical hazard, plus higher power
but also disadvantages: higher Type I error, plus a possibly lengthier trial if the informative prior turns out to be wrong
Thus what is needed is a recipe for how much strength to borrow from the historical data
One possibility: back out this amount based on Type I error and power considerations. This is often done, but tends to defeat the historical data's original purpose!
Proposed Solution: Power Priors
Introduced by Ibrahim and Chen (2000, Statistical Science)
Let D_0 = (n_0, x̄_0) denote historical data, suppose θ is the parameter of interest, and let L(D_0 | θ) denote the general likelihood
Suppose π_0(θ) is the prior distribution on θ before D_0 is observed, the "initial prior"
The conditional power prior on θ for the current study is the historical likelihood, L(D_0 | θ), raised to the power α_0, where α_0 ∈ [0, 1], multiplied by the initial prior:
  π(θ | D_0, α_0) ∝ L(D_0 | θ)^{α_0} π_0(θ)
α_0 is the power parameter that controls the degree of borrowing from D_0
Power Priors (cont'd)
The power parameter α_0 can be interpreted as a relative precision parameter for the historical data (IC, 2000, p. 48)
Certainly apparent for normal data, x_0i ~iid N(θ, σ_0²), since under a flat initial prior,
  π_0(θ | D_0, α_0) = N(x̄_0, σ_0²/(α_0 n_0))
Think of α_0 n_0 as the effective historical sample size
Given current data, D = (n, x̄), the conditional posterior is
  q(θ | D, D_0, α_0) ∝ L(D_0 | θ)^{α_0} L(D | θ) π_0(θ)
α_0 → 1: q(θ | D, D_0, α_0) approaches full borrowing from D_0
α_0 → 0: q(θ | D, D_0, α_0) approaches no borrowing from D_0
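The normal case above is fully conjugate: the historical data enter with precision α_0 n_0 / σ_0², so the posterior is a precision-weighted average. A sketch assuming known variances and a flat initial prior:

```python
def power_prior_posterior(xbar0, n0, var0, xbar, n, var, a0):
    """Posterior mean and variance of theta for normal data with known
    variances, flat initial prior, and power parameter a0: the historical
    data contribute precision a0 * n0 / var0 (effective sample size a0*n0)."""
    w0 = a0 * n0 / var0    # historical precision, discounted by a0
    w = n / var            # current-data precision
    post_var = 1.0 / (w0 + w)
    post_mean = post_var * (w0 * xbar0 + w * xbar)
    return post_mean, post_var
```

Setting a0 = 0 ignores the historical data entirely; a0 = 1 pools them at full weight.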
Power Priors (cont'd)
We could fix α_0 ∈ [0, 1] and assume consistency among D_0 and D is known, but if this is not the case, models with poor frequentist operating characteristics may result
Choosing a hyperprior, π(α_0), for α_0 enables the data to help determine probable values for α_0
Ibrahim-Chen (2000) propose joint power priors of the form
  π(θ, α_0 | D_0) ∝ L(D_0 | θ)^{α_0} π_0(θ) π(α_0)
Duan et al. (2006), Neuenschwander et al. (2009), and Pericchi (2009) propose modified joint power priors (MPP) which respect the Likelihood Principle:
  π(θ, α_0 | D_0) ∝ [ L(D_0 | θ)^{α_0} π_0(θ) / ∫ L(D_0 | θ)^{α_0} π_0(θ) dθ ] π(α_0)
Location Commensurate Power Priors (LCPP)
We propose an adaptive modification of the MPP
Let x_0i ~iid Normal(µ_0, σ_0²) and x_i ~iid Normal(µ, σ²)
Different parameters in historical and current group, µ_0 and µ
Extend the hierarchical model to include a parameter, τ, that directly measures the similarity of µ and µ_0
Construct a prior for µ dependent upon µ_0 and τ
τ parametrizes commensurability (precision)
Use information in τ to guide the prior on α_0:
  π_LCPP(µ, α_0, τ | x_0) ∝ ∫ [ N(x_0 | µ_0, σ̂_0²)^{α_0} / ∫ N(x_0 | µ_0, σ̂_0²)^{α_0} dµ_0 ] N(µ | µ_0, 1/τ) dµ_0 × Beta(α_0 | τ^a, 1) p(τ)
LCPP for Single Arm Trial
Formalize "commensurate" as "µ_0 near µ" by adopting
- a Normal prior on µ with mean µ_0 and precision τ
- a Beta(τ^a, 1) prior on α_0 for some a > 0
τ close to 0 corresponds to very low commensurability, while very large τ implies the two datasets may arise from similar populations
τ → ∞ gives a point-mass prior on α_0 at 1; τ → 0 discourages incorporation of historical information
Requires a fixed, known historical sampling variance σ̂_0² (MLE)
LCPP for Single Arm Trial (cont'd)
Both α_0 and τ inflate the prior variance:
  π_LCPP(µ, α_0, τ | x_0) ∝ N(µ | x̄_0, 1/τ + σ̂_0²/(α_0 n_0)) Beta(α_0 | τ^a, 1) p(τ)
Prior on τ: a mixture of two gammas with mixing probability ω = 1/2 and with hyperparameter a = 1/2 provides sufficient flexibility:
  p(τ) ∝ ω Gamma(τ | 1, 10) + (1 - ω) Gamma(τ | 3/2, 1/1000)
Posterior obtained after multiplying by the current likelihood N(x | µ, σ²) and a vague (say reference) prior on σ²:
  q(σ² | x) ~ IG( n/2, (n/2)[ s² + (x̄ - µ)² ] )
Extension to Linear Models
Formulate a linear model to borrow adaptively through the identical covariates
Suppose that both trials identically measure p - 1 covariates
Let X_0 and X be n_0 × p and n × p design matrices
Suppose y_0 ~ N_{n_0}(X_0 β_0, σ² I) and y ~ N_n(X β + Z λ, σ² I), where Z is an n × r design matrix containing variables relevant only to the current trial, including an indicator for new treatment
Let D_0 = (y_0, X_0, n_0, p) and D = (y, X, Z, n, p, r)
Assume a flat prior for λ
Let β̂_0 = (X_0ᵀ X_0)⁻¹ X_0ᵀ y_0
  π_LCPP(β, λ, σ², α_0, τ | D_0) ∝ ∫ N_{n_0}(y_0 | X_0 β_0, σ̂_0² I_{n_0})^{α_0} N_p(β | β_0, (1/τ) I_p) dβ_0 × Beta(α_0 | τ^a, 1) (1/σ²) p(τ)
Randomized Controlled Colorectal Cancer Trials
Two successive randomized controlled trials on subjects with previously untreated metastatic colorectal cancer:
Saltz et al. (2000) trial randomized N_0 = 683 between May 1996 and May 1998:
1. Irinotecan alone (arm A)
2. Irinotecan and bolus Fluorouracil plus Leucovorin (arm B; IFL): significantly longer progression-free survival
3. Fluorouracil and Leucovorin (arm C; 5FU/LV): standard therapy
Goldberg et al. (2004) trial randomized N = 795 between May 1999 and April 2001:
1. Irinotecan and bolus Fluorouracil plus Leucovorin (IFL): regulatory standard in March 2000
2. Oxaliplatin and infused Fluorouracil plus Leucovorin (FOLFOX): new regimen
3. Irinotecan and Oxaliplatin (IROX): new regimen
Randomized Controlled Colorectal Cancer Trials (cont'd)
Longest diameter (ld) in cm of 1 to 9 tumors measured every 6 weeks for the first 42 weeks or until a response
Compare FOLFOX to IFL for average reduction in ld sum from BL
Covariate adjustments for baseline ld sum, age, and AST in units/L
Historical: arm B (IFL) from the Saltz trial, n_0 = 171
Current: IFL, n = 129, and FOLFOX, n = 141, in the Goldberg trial

Ordinary linear regression fits to colorectal cancer data:

               Historical data               Current data
               estimate   95% CI             estimate   95% CI
Intercept      0.880      (-1.977, 3.738)    -0.467     (-2.275, 1.341)
BL Tumor Sum   -0.232     (-0.310, -0.154)   -0.397     (-0.453, -0.340)
Age            -0.022     (-0.067, 0.022)    0.014      (-0.014, 0.041)
AST            -0.001     (-0.017, 0.015)    0.005      (-0.007, 0.017)
FOLFOX         --         --                 -0.413     (-1.017, 0.190)
[Histograms of average change in ld tumor sum from baseline: historical IFL (left), concurrent FOLFOX (right); note the FOLFOX results are slightly better (more negative).]
Randomized Controlled Colorectal Cancer Trial (cont'd)
LCPP fit to colorectal cancer data:

                    estimate   95% BCI
β (Intercept)       0.180      (-1.11, 1.42)
β (BL Tumor Sum)    -0.39      (-0.44, -0.33)
β (Age)             0          (-0.02, 0.02)
β (AST)             0          (-0.01, 0.01)
λ (FOLFOX)          -0.46      (-0.82, -0.10)
α_0                 0.86       (0.44, 1.00)

High posterior estimate for α_0: the LCPP analysis incorporates virtually all of the historical data
Conclude FOLFOX resulted in a significant reduction in average ld sum when compared to IFL
Consistent with the results of Goldberg et al. (2004), who determined FOLFOX to have better times to progression and response rates
Competing Non-Power Prior Approaches
Let x_0i ~iid Normal(µ_0, σ_0²) and x_i ~iid Normal(µ, σ²)
1. Cauchy prior on µ centered at the historical sample mean x̄_0:
  π_cau ∝ Cauchy(µ | median = x̄_0, scale = γ)
2. Location Commensurate Prior (LCP): adaptive mechanism based solely on the commensurability parameter τ:
  π_LCP(µ, σ², τ | x_0) ∝ N(µ | x̄_0, 1/τ + σ̂_0²/n_0) (1/σ²) p(τ)
We use a mixture prior on τ, often with ω = 1/2:
  p(τ) ∝ ω Gamma(τ | 1, 10) + (1 - ω) Gamma(τ | 3/2, 1/1000)
Non-Power Prior Approaches (cont'd)
3. Location-Scale Commensurate Mixture Prior (LSCMP): borrowing depends upon commensurability of both the location and scale parameters
Historical posterior: q_0(µ_0, σ_0² | x_0) ∝ Normal(x_0 | µ_0, σ_0²) (1/σ_0²)
Commensurability priors: N(µ | µ_0, 1/ν) and IG(σ² | A = γσ_0⁴ + 2, B = σ_0²(γσ_0⁴ + 1))
  π_LSCMP(µ, σ², σ_0² | x_0, ν, γ) ∝ ∫ q_0(µ_0, σ_0² | x_0) N(µ | µ_0, 1/ν) IG(σ² | A(σ_0, γ), B(σ_0, γ)) dµ_0
  ∝ N(µ | x̄_0, (n_0 + νσ_0²)/(ν n_0)) IG(σ² | A, B) IG(σ_0² | (n_0 - 1)/2, n_0 σ̂_0²/2)
Non-Power Prior Approaches (cont'd)
3. Location-Scale Commensurate Mixture Prior (cont'd)
Fix values of ν = (ν_1, ν_0) and γ = (γ_1, γ_0) corresponding to high and low precision:
  ν = (10¹², (10σ̂_0²)⁻¹),  γ = (10², 5⁻¹)
Formulate a mixture prior with fixed mixing probability θ such that π*(µ, σ | x_0, ν, γ, θ) is proportional to
  θ π(µ, σ², σ_0² | x_0, ν_1, γ_1) + (1 - θ) π(µ, σ², σ_0² | x_0, ν_0, γ_0),
where fixing θ = 0.5 seems to work well.
[Figure series: 95% BCIs for µ under each method (Historical Data, Pooled Data, Modified Power Prior, Cauchy Prior, Location Commensurate Power Prior, Location Commensurate Prior, Location Scale Commensurate Mixture Prior, Current Data) given n = 10, n_0 = 30; s² = s_0² = 1; x̄_0 = 1, as x̄ ranges over 0, 0.2, 0.4, ..., 2]
[Figure series: the same 95% BCI comparison with unequal sampling variances, n = 10, n_0 = 30; s² = 3, s_0² = 1; x̄_0 = 1, as x̄ ranges over 0, 0.2, 0.4, ..., 2]
Conventional Random-Effects Meta-Analytic Approach
Random-effects meta-analysis (Spiegelhalter et al., 2004) assumes exchangeability: µ_0,1, ..., µ_0,H, µ ~iid N(ξ, η²)
- between-study heterogeneity and within-study variability
- ξ and η² characterize the population mean and between-study variance
- shrinkage parameter B = σ²/(σ² + η²): the weight placed on the prior mean ξ in the posterior mean for µ
- denote unknown bias by ∆_h = µ - µ_0,h, the parameter vector by θ = (λ, ∆, σ², σ_0,1², ..., σ_0,H²), and let Y = (y, y_0,1, ..., y_0,H) denote the collection of response data
- the estimate of ξ is a weighted average of the observed historical and current study effects, with weights 1/(σ_0,1²/n_0,1 + η²), ..., 1/(σ_0,H²/n_0,H + η²), 1/(σ²/n + η²)
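The weighted-average estimate of ξ can be sketched directly from those weights; here hist_vars holds the within-study variances σ_0,h²/n_0,h (a simple illustration, not the fitting procedure used in the talk, which would also estimate η²):

```python
def meta_xi(hist_means, hist_vars, cur_mean, cur_var, eta2):
    """Weighted estimate of the population mean xi; hist_vars and cur_var
    are the within-study variances sigma_0,h^2/n_0,h and sigma^2/n."""
    means = hist_means + [cur_mean]
    weights = [1.0 / (v + eta2) for v in hist_vars + [cur_var]]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)
```

As eta2 grows, the weights equalize and the estimate approaches the simple average of the study means; eta2 = 0 recovers inverse-variance pooling.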
Common noninformative and weakly-informative priors for η²:

prior                     form
uniform variance^a        p(η²) = U(0, a), a = 100
inverse gamma^a           p(η²) = Γ⁻¹(ε, ε), ε = 0.001
uniform standard dev^a    p(η) = U(0, a)
half-Cauchy^b             p(η) ∝ (η² + b)⁻¹, b = 25
uniform shrinkage^c       p(η²) ∝ σ²/{(σ² + η²)²}, σ_0,h² = σ²

a = Spiegelhalter et al., 2004; b = Gelman, 2006; c = Daniels, 1999
[Figure: prior and posterior densities of log η² under each prior, for H = 1 and H = 3 historical studies; parenthesized values are standard deviations on the scale of η². Prior: unif. var (v=30), inv. gamma (404890), unif. sd (21), half-Cauchy (209), unif. shrink (8). Posterior, H = 1: unif. var (4.19), inv. gamma (0.11), unif. sd (1.30), half-Cauchy (1.37), unif. shrink (0.41).]
Prior & posteriors for log(η²) under full homogeneity: ∆_tr = 0, for n = 180, n_0 = 60, and σ² = σ_0,h² = 1
  ∫_Y { ∫_θ η² p(η², θ) L(Y | θ) dθ } L(Y | θ_tr) dY
Commensurate Prior: One Historical Study
- commensurate prior for µ (HCMS, 2011): µ ~ N(µ_0, 1/τ)
- µ is a non-systematically biased representation of µ_0
- the initial prior, p(µ_0), characterizes information before observing the historical data
- one-to-one relationship between τ and η²: τ = 1/(2η²)
- joint posterior:
  q(θ | τ, y, y_0) ∝ N(µ | µ_0, 1/τ) p(µ_0) p(σ, σ_0) ∏_{j=1}^{n_0} N(y_{0j} | µ_0, σ²_0) ∏_{i=1}^{n} N(y_i | µ + d_i λ, σ²)
- Pocock (1976) repeated the analysis under several fixed values of 1/τ
- HCMS (2011) consider fully Bayesian approaches as well as power prior formulations
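For the simplest Gaussian case with known sampling variances and a fixed τ (the Pocock-style fixed-1/τ analysis mentioned above), the posterior for µ can be sketched in a few lines. This is a minimal sketch, not the slides' full MCMC: the function name is ours, and a flat initial prior on µ_0 is assumed so that µ_0 integrates out analytically.

```python
import math

def commensurate_posterior_mu(ybar, n, sigma2, ybar0, n0, sigma02, tau):
    """Posterior mean and sd of the current control mean mu under a fixed
    commensurability parameter tau.  With a flat initial prior on mu_0,
    integrating mu_0 out gives the effective prior
    mu ~ N(ybar0, sigma02/n0 + 1/tau), which is then updated by the
    current-data likelihood ybar ~ N(mu, sigma2/n)."""
    prior_var = sigma02 / n0 + 1.0 / tau   # effective historical prior variance
    lik_var = sigma2 / n                   # current-data sampling variance
    post_var = 1.0 / (1.0 / prior_var + 1.0 / lik_var)
    post_mean = post_var * (ybar0 / prior_var + ybar / lik_var)
    return post_mean, math.sqrt(post_var)

# Same inputs as the earlier figure: n = 10, n0 = 30, s2 = 3, s02 = 1,
# xbar = 2, xbar0 = 1.  Large tau -> near-full borrowing from the
# historical mean; tiny tau -> essentially the current data alone.
m_hi, _ = commensurate_posterior_mu(2.0, 10, 3.0, 1.0, 30, 1.0, tau=1e6)
m_lo, _ = commensurate_posterior_mu(2.0, 10, 3.0, 1.0, 30, 1.0, tau=1e-6)
```

As τ grows, the posterior mean is pulled from the current sample mean toward the historical one, which is the borrowing behavior the commensurate prior is designed to produce.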
Commensurate Prior: One Historical Study (cont'd)
- HSC (2011) proceed with parametric empirical Bayesian inference by replacing the hyperparameter τ with its marginal MLE,
  τ̂ = argmax_{τ ∈ [l_τ, u_τ]} m(y | τ),
  restricted to the effective range of borrowing of strength; for Gaussian data, τ ∈ [0.5/10², 0.5/0.05²], or η ∈ [0.05, 10]
- Our proposed empirical Bayes (EB) procedure:
  - underestimates variability in θ, since posterior uncertainty in τ is unacknowledged
  - offers an alternative (perhaps more desirable) trade-off between borrowing of strength and bias
  - reduces dimensionality in the numerical marginalization over θ via a normal approximation for non-Gaussian data
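A minimal sketch of the EB step for Gaussian data: maximize the marginal likelihood m(ȳ | τ) over a grid on the restricted range quoted above. The function names and the flat initial prior on µ_0 are our assumptions, not the slides' exact implementation.

```python
import math

def marginal_loglik(tau, ybar, n, sigma2, ybar0, n0, sigma02):
    """log m(ybar | tau) for Gaussian data: with mu and mu_0 integrated
    out (flat initial prior on mu_0), ybar ~ N(ybar0, v) where
    v = sigma2/n + sigma02/n0 + 1/tau."""
    v = sigma2 / n + sigma02 / n0 + 1.0 / tau
    return -0.5 * (math.log(2.0 * math.pi * v) + (ybar - ybar0) ** 2 / v)

def eb_tau_hat(ybar, n, sigma2, ybar0, n0, sigma02,
               lo=0.5 / 10**2, hi=0.5 / 0.05**2, ngrid=2000):
    """Grid-based argmax of m(ybar | tau) over the restricted range
    [0.5/10^2, 0.5/0.05^2] given on the slide (log-spaced grid)."""
    grid = [lo * (hi / lo) ** (i / (ngrid - 1)) for i in range(ngrid)]
    return max(grid, key=lambda t: marginal_loglik(t, ybar, n, sigma2,
                                                   ybar0, n0, sigma02))

t_agree = eb_tau_hat(1.0, 10, 3.0, 1.0, 30, 1.0)  # concordant means
t_clash = eb_tau_hat(5.0, 10, 3.0, 1.0, 30, 1.0)  # strong prior-data conflict
```

Concordant historical and current means drive τ̂ to the upper bound (maximal borrowing); a large conflict shrinks τ̂, inflating the commensurate prior variance 1/τ̂ and hence reducing borrowing.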
Commensurate Prior: Multiple Historical Studies
- assume homogeneity among the historical studies (or a fixed degree of heterogeneity)
- commensurate prior for µ, conditional on the historical population mean: N(µ | ξ_0, 1/τ)
- the relationship between τ and η² is more complicated: denote v_{0,h} = σ²_{0,h}/n_{0,h}, and let v = σ²/(n − Σ_i d_i)
- τ⁻¹ characterizes the meta-analytic between-study variability, plus the difference between the variance of the estimated population mean when heterogeneity η² is estimated versus when full homogeneity is assumed:
  τ⁻¹ = η² + { 1/(v + η²) + Σ_{h=1}^{H} 1/(v_{0,h} + η²) }⁻¹ − { 1/v + Σ_{h=1}^{H} 1/v_{0,h} }⁻¹
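The identity τ⁻¹ = η² + {1/(v + η²) + Σ_h 1/(v_{0,h} + η²)}⁻¹ − {1/v + Σ_h 1/v_{0,h}}⁻¹ is easy to evaluate numerically; here is a direct transcription (function name ours):

```python
def tau_inverse(eta2, v, v0):
    """Evaluate tau^{-1} = eta^2
         + {1/(v + eta^2) + sum_h 1/(v0h + eta^2)}^{-1}
         - {1/v + sum_h 1/v0h}^{-1},
    where v = sigma^2/(number of current controls) and v0 is the list of
    historical sampling variances v_{0,h} = sigma_{0,h}^2 / n_{0,h}."""
    het = 1.0 / (1.0 / (v + eta2) + sum(1.0 / (v0h + eta2) for v0h in v0))
    hom = 1.0 / (1.0 / v + sum(1.0 / v0h for v0h in v0))
    return eta2 + het - hom

# Under exact homogeneity (eta^2 = 0) the two bracketed terms coincide,
# so tau^{-1} = 0, i.e. tau = infinity: full borrowing.
no_het = tau_inverse(0.0, 0.1, [0.05, 0.05])
with_het = tau_inverse(0.5, 0.1, [0.05, 0.05])
```

The η² = 0 limit is a useful sanity check: it recovers the full-borrowing case, while any between-study heterogeneity pushes τ⁻¹ above η².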
Ratios of the posterior variances of λ versus no borrowing

              Δ_tr = 0              Δ_tr = 0.5            Δ_tr = 1
              H=1   H=3   H=5      H=1   H=3   H=5      H=1   H=3   H=5
unif. var.    1.00  1.05  1.15    0.99  0.99  0.96    0.99  0.99  0.97
unif. shrink  1.01  1.07  1.14    0.99  0.98  0.98    0.99  0.98  0.98
unif. sd      1.00  1.16  1.28    0.97  0.95  0.95    0.98  0.97  0.98
half-Cauchy   1.03  1.17  1.28    0.97  0.97  0.97    0.97  0.98  1.00
inv. gamma    1.03  1.24  1.33    0.92  0.94  0.95    0.95  0.98  0.99
commens.      1.23  1.37  1.44    1.14  1.09  1.06    1.03  1.01  1.00
homog.        1.25  1.49  1.60    1.18  1.39  1.49    1.00  1.14  1.21

Absolute bias: |E(λ̂ | Δ_tr) − λ_tr|
unif. var.    0.00  0.00  0.00    0.00  0.01  0.05    0.01  0.01  0.02
unif. shrink  0.00  0.00  0.00    0.00  0.03  0.06    0.01  0.02  0.03
unif. sd      0.00  0.00  0.00    0.00  0.04  0.08    0.01  0.02  0.03
half-Cauchy   0.00  0.00  0.00    0.00  0.04  0.08    0.01  0.03  0.03
inv. gamma    0.00  0.00  0.00    0.02  0.06  0.10    0.01  0.03  0.04
commens.      0.00  0.00  0.00    0.09  0.13  0.11    0.03  0.04  0.02
homog.        0.00  0.00  0.00    0.18  0.33  0.38    0.32  0.69  0.81
Logistic Generalized Linear Time-to-Event Mixed Simulations
Extension to Binary Regression
- Formulate a commensurate power prior for binary outcomes
- Historical data: y_0i ~ Ber[π_0(X_0i)]; current data: y_i ~ Ber[π(X_i, d_i)], where d_i is the treatment indicator
- The location commensurate prior for binary regression follows, proportional to
  ∏_{i=1}^{n_0} Ber(y_0i | π_0(X_0i)) N(β | β_0, 1/τ) p(τ),
  where we use p(τ) to bound τ away from 0 and ∞
- Can't integrate out β_0 analytically, so instead just multiply by the current data likelihood and normalize via MCMC...
- Can choose the probit link, π_0(X_0) = Φ(X_0 β_0) and π(X, d) = Φ(X β + dλ) -- closed-form full conditionals!
- Can instead pick the logit link, π_0(X_0) = (1 + e^{−X_0 β_0})⁻¹ -- requires Metropolis-Hastings
- Example: adaptively incorporating nonrandomized HIV trial arms!
Extension to Generalized Linear Models
- Assume an initial flat prior on β_0, and use the Bayesian Central Limit Theorem with the historical likelihood to obtain β_0 ~ N(β̂_0, Σ_0), where β̂_0 is the historical MLE for β_0, and Σ_0 is the inverse of the historical observed Fisher information matrix
- Specifying a flat prior for λ, a N(β_0, (1/τ) I_p) prior for β, and integrating out β_0 again leads to a joint location commensurate prior of
  p(β, λ, τ | y_0) ∝ N_p(V⁻¹ M, (1/τ) V⁻¹) p(τ)
- The joint posterior follows by multiplying by the current data likelihood, which yields intractable full conditionals. Here we use Metropolis-Hastings (NOT the BCLT again, which would require fixed β and λ in the Fisher information matrix)
- Primary target application: formulation for survival outcomes...
Time-to-Event Response
- Historical and concurrent data are triples (t_0j, δ_0j, X_0j) for j = 1, ..., n_0 and (t_i, δ_i, X_i) for i = 1, ..., n, where the t's are the observed (possibly censored) failure times, the δ's are noncensoring indicators, and X_0j and X_i are row vectors of p covariates associated with historical subject j and concurrent subject i
- Let f(t_0) and f(t) denote survival time densities, with survivor functions F̄(t_0) and F̄(t)
- Adopt a log-linear model: y_0 = log(t_0) = X_0 β_0 + σ_0 e_0 and y = log(t) = X β + dλ + σe, where e_0 = (y_0 − X_0 β_0)/σ_0 and e = (y − Xβ − dλ)/σ
- The likelihoods follow as
  L_0(β_0, σ_0 | y_0) ∝ ∏_{j=1}^{n_0} [(1/σ_0) f(e_0j)]^{δ_0j} F̄(e_0j)^{1−δ_0j}
  and
  L(β, λ, σ | y) ∝ ∏_{i=1}^{n} [(1/σ) f(e_i)]^{δ_i} F̄(e_i)^{1−δ_i}
Time-to-Event Response (cont'd)
- Weibull regression occurs when e_0 and e follow the extreme value distribution, f(u) = exp[u − exp(u)]
- We consider a location-scale commensurate approach, so borrowing depends upon commensurability between σ_0 and σ, as well as between β_0 and β
- The LSCP follows from the general theory (2 slides back) as
  p(θ, λ, τ | y_0) ∝ N_{p+1}(Λ⁻¹ Q, (1/τ) Λ⁻¹) p(τ),
  where Q = (Ψ̂_0 + τ I_{p+1})⁻¹ Ψ̂_0 θ̂_0, Λ = I_{p+1} − τ (Ψ̂_0 + τ I_{p+1})⁻¹, and Ψ̂_0 = Ψ̂_0(θ̂_0) is the observed Fisher information matrix
- The posterior is then proportional to the product of this LSCP and the concurrent data likelihood, as usual
- Note that the exponential model is a special case where σ = 1
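A quick one-dimensional sanity check on these moments (our reduction, with scalar θ̂_0 and observed information ψ̂_0): the prior mean Λ⁻¹Q collapses to θ̂_0, and the prior variance (1/τ)Λ⁻¹ to 1/τ + 1/ψ̂_0, i.e. the commensurate variance plus the historical estimation variance.

```python
def lscp_scalar(theta0_hat, psi0_hat, tau):
    """One-dimensional reduction of the LSCP moments:
    Q = (psi0 + tau)^{-1} psi0 theta0_hat,
    Lambda = 1 - tau/(psi0 + tau).
    Returns (prior mean Lambda^{-1} Q, prior variance (1/tau) Lambda^{-1});
    these simplify to theta0_hat and 1/tau + 1/psi0 respectively."""
    Q = psi0_hat * theta0_hat / (psi0_hat + tau)
    Lam = 1.0 - tau / (psi0_hat + tau)
    return Q / Lam, 1.0 / (tau * Lam)

mean, var = lscp_scalar(theta0_hat=0.7, psi0_hat=4.0, tau=2.0)
```

So the LSCP is centered at the historical MLE, with variance that grows both as the historical information ψ̂_0 shrinks and as the commensurability τ shrinks — exactly the borrowing behavior one wants.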
Extension to General/Generalized Mixed Models
- Consider the Gaussian case for now (non-Gaussian also feasible)
- historical data: y_0ij, i = 1, ..., n_0j, j = 1, ..., m_0; arrange in a vector y_0 and model as y_0 = X_0 β_0 + Z_0 u_0 + ε_0, where E(u_0, ε_0)' = (0, 0)' and Cov(u_0, ε_0)' = blockdiag(G_0, R_0)
- current data: similar notation: y = X β + dλ + Zu + ε, where d is an N × 1 indicator of treatment
- The location commensurate prior emerges as proportional to
  N_p(β | V⁻¹ M, (1/τ) V⁻¹) N_m(u | 0, σ²_u I_m) p(λ, σ_ε, σ_u, τ),
  where M = (X_0' Σ̂_0⁻¹ X_0 + τ I_p)⁻¹ X_0' Σ̂_0⁻¹ X_0 β̃_0, V = I_p − τ (X_0' Σ̂_0⁻¹ X_0 + τ I_p)⁻¹, and β̃_0 is the BLUE
- As usual, multiply by the current data likelihood and normalize; full conditionals are normal and inverse Wishart; the non-Gaussian case requires Metropolis-Hastings
Regression Model Simulations
- Check frequentist OCs of the linear, exponential, & Weibull models: E[y_0] = µ_0 and E[y] = µ + dλ, where d is the treatment indicator and µ_0 & µ are the intercepts for controls
- Let Δ = µ_0 − µ, and suppose λ > 0 indicates a positive treatment effect
  - Δ > 0 implies historical controls performed better than current controls
  - Δ < 0 implies historical controls performed worse than current controls
- Type I error (λ = 0), power, & 95% posterior CI coverage were computed by sampling y_0 and y for true fixed values of Δ and λ
- Use the 95% posterior CI to test the null hypothesis H_0: λ = 0
- Below, the Weibull sims compare commensurability among shape parameters, through the ratio ω = σ_0/σ, when we set Δ = 0
- Compare commensurate priors to models that ignore (no borrowing) & fully incorporate (full borrowing) the historical data
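The type I error / power computation above can be sketched for the no-borrowing Gaussian case, with a plain z-test standing in for the Bayesian 95% CI (under a flat prior and known σ the two intervals coincide). All names and the per-arm split are our illustrative assumptions:

```python
import random, statistics

def reject_rate(delta, lam, n=50, reps=2000, seed=0):
    """Monte Carlo rejection rate for the no-borrowing Gaussian model:
    current controls ~ N(mu0 - delta, 1) and treated ~ N(mu0 - delta + lam, 1)
    with mu0 = 1 and n/2 patients per arm; reject H0: lambda = 0 when the
    95% interval for the mean difference (known sigma = 1) excludes 0."""
    rng = random.Random(seed)
    per_arm = n // 2
    se = (2.0 / per_arm) ** 0.5          # sd of the difference in arm means
    rejections = 0
    for _ in range(reps):
        ctrl = [rng.gauss(1.0 - delta, 1.0) for _ in range(per_arm)]
        trt = [rng.gauss(1.0 - delta + lam, 1.0) for _ in range(per_arm)]
        diff = statistics.fmean(trt) - statistics.fmean(ctrl)
        if abs(diff) > 1.96 * se:
            rejections += 1
    return rejections / reps

alpha = reject_rate(delta=0.0, lam=0.0)  # type I error, near 0.05
power = reject_rate(delta=0.0, lam=0.8)  # power at lambda = 0.8
```

Repeating this with the commensurate-prior posterior in place of the z-interval is exactly how the slides' frequentist operating characteristics are obtained.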
[Figure: power curves (rejection rate vs. λ ∈ [0, 0.6]) for the full borrowing, commensurate prior, and no borrowing models, based on the 95% posterior CIs for λ, shown for Δ = −1, 0, and 1. Top row: Gaussian linear model with µ_0 = 1, σ²_0 = σ² = 1, n_0 = 100, and n = 50; bottom row: exponential model with µ_0 = 2, n_0 = 200, and n = 100.]
[Figure: simulation results for the full borrowing, commensurate prior, and no borrowing models, in Gaussian, exponential, and Weibull columns. Top row: average 95% posterior coverage probabilities for λ. Bottom row: average lengths of 95% posterior CIs for λ. X-axes correspond to the difference in intercepts, Δ = µ_0 − µ (left & center), or the scale ratio, ω = σ_0/σ with Δ = 0 (right).]
Adaptive Randomization
- Friedman, Furberg, and DeMets (1998, p. 69) broadly refer to randomization procedures that adjust the allocation ratio as the study progresses as "adaptive"
- Commonly used to balance prognostic factors among the intervention arms, or to assign patients to the better-performing regimen in early phase trials
- Commensurate prior models for controlled trials naturally suggest a randomization scheme that is optimal with respect to the amount of strength borrowed from the historical controls:
  - use more of the new patients to learn about the efficacy and safety profile of the new intervention
  - adjust the probability of allocating the next patient to the new intervention according to the commensurability of the historical and current controls
  - evaluate sequentially using repeated model fitting
Optimal-Balanced Randomization
- Define the allocation probability balance as a function of the effective sample size of the historical controls
- Morita, Thall, and Müller (2008) define the prior effective sample size (ESS) of a parametric prior for non-adaptive models
- We extend their method to compute the prior effective sample size of the intercepts & identically measured covariates (previously β), referred to as the effective number of historical controls, EHC
- First find the posterior distribution of the commensurability parameter τ; then EHC is essentially the ESS computed for
  p(β, λ | τ̂, y_0) ∝ N_p(V⁻¹ M, (1/τ̂) V⁻¹) p(τ̂),
  with τ replaced by its posterior median τ̂
Optimal-Balanced Randomization (cont'd)
- The new treatment allocation probability, or balance function, for the (j+1)st new patient follows as
  δ_j = (C_j + EHC_j) / (T_j + C_j + EHC_j),
  where T_j and C_j denote the numbers of subjects randomized to the new treatment and control following the jth enrollment
- This imposes information balance by encouraging optimal use of new patients relative to the amount of incorporated prior information
- We suggest adjusting the allocation probability in blocks, after an initial period where δ_j is fixed at 1/2
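The balance function above is a one-liner; a small sketch (function name ours) makes its behavior concrete:

```python
def allocation_prob(T_j, C_j, EHC_j):
    """Balance function: probability that the (j+1)st patient is assigned
    to the NEW treatment, delta_j = (C_j + EHC_j) / (T_j + C_j + EHC_j)."""
    return (C_j + EHC_j) / (T_j + C_j + EHC_j)

# With no effective historical controls the scheme stays 1:1; with many,
# new patients are steered toward the new treatment arm.
even = allocation_prob(T_j=40, C_j=40, EHC_j=0)
skewed = allocation_prob(T_j=40, C_j=40, EHC_j=237)
```

The more the historical controls count (larger EHC_j), the fewer new patients are "spent" on the control arm, which is the information-balance idea in the second bullet.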
Adaptive Randomization Example
- Flexible Initial Retrovirus Suppressive Therapies (FIRST) trial: a large, long-term, randomized, prospective comparison of three different antiretroviral strategies in highly active antiretroviral therapy-naive, HIV-1-infected persons (MacArthur et al., 2001)
- Patients within all three strategies were also assigned one or two nucleoside reverse transcriptase inhibitors (NRTIs)
- The three strategies for initial treatment were compared for long-term virological and immunological durability and safety, for the development of drug resistance, and for clinical disease progression
- Before randomization, patients within the two NNRTI-containing arms were given the option of preselecting the NNRTI drug, either nevirapine (NVP) or efavirenz (EFV), or allowing an additional randomization to NVP or EFV
Randomization Design
Randomization to FIRST:
- PI Strategy (PI + NRTIs) -- data not used
- NNRTI Strategy (NNRTI + NRTIs), N = 409
  - Prescribed NVP (N = 100); Prescribed EFV (N = 211)
  - NNRTI substudy randomization: NVP (N = 50), EFV (N = 48)
- 3-Class Strategy (PI + NNRTI + NRTIs), N = 403
  - Prescribed NVP (N = 137); Prescribed EFV (N = 161)
  - NNRTI substudy randomization: NVP (N = 51), EFV (N = 54)
Outline of FIRST design and randomization for eligible subjects (Berg-Wolf et al., 2006).
Adaptive Randomization Example (cont'd)
- The current trial consists of data from the randomized arms of FIRST:
  1. EFV (n = 99) is the novel intervention, and
  2. NVP (n = 104) the control therapy
- Historical controls consist of the NVP observational arm (n = 237)
- Compare the probability of virological suppression (HIV RNA < 50 copies/mL) under EFV and NVP at 32 weeks
- Use the commensurate prior probit regression model
- Technical details: Φ⁻¹[π_0] = µ_0 and Φ⁻¹[π(d)] = µ + dλ, with
  Ŵ_0jj = φ(X_0 β̂_0)²_jj / [ Φ(X_0 β̂_0)_jj (1 − Φ(X_0 β̂_0)_jj) ],
  where Φ(·) is the standard normal c.d.f. & φ(·) the standard normal p.d.f.
Adaptive Randomization Example (cont'd)
Simulating the adaptive randomization method proceeds iteratively:
- assign the (j+1)st new patient to NVP or EFV with probability δ_j
- generate responses for several fixed values of Δ = µ_0 − µ
- λ is fixed at 0.23 (the MLE obtained from fitting the real current data from FIRST)
- equal allocation to NVP & EFV for the first 80 assignments
- after randomizations 80, 100, 120, and 160, EHC is computed and δ_j is updated
- record EHC and δ_j after each randomization block, & the totals assigned to each group after all n = 203 patients are randomized
- Δ = 0.221 is the MLE obtained from fitting the real current data from FIRST
Optimal-Balanced Randomization Simulation
Simulation averages for adaptive randomization to EFV, with n = 203, n_0 = 237; Ratio = δ/(1 − δ) : 1

Rz     Δ = 2.25       Δ = 1.5        Δ = 0.75       Δ = 0.221      Δ = 0
       EHC, Ratio     EHC, Ratio     EHC, Ratio     EHC, Ratio     EHC, Ratio
80     1, 1.03:1      18, 1.42:1     177, 5.43:1    235, 6.86:1    237, 6.92:1
100    1, 1.02:1      15, 1.23:1     168, 3.74:1    235, 4.82:1    237, 4.85:1
120    1, 1.01:1      11, 1.11:1     164, 2.95:1    235, 3.80:1    237, 3.82:1
160    1, 1.01:1      5, 1.01:1      144, 2.02:1    235, 2.74:1    237, 2.75:1
       NVP, EFV       NVP, EFV       NVP, EFV       NVP, EFV       NVP, EFV
203    101, 102       99, 104        77, 126        66, 137        65, 138
Conclusion
Is all of this legal?
- X-L Meng: NO: Agreement between the historical and concurrent data does not necessarily imply they are studying the same thing! External validation/information is always required!
- (Fairly hardcore) frequentist referee: Not yet: One needs to evaluate theoretical properties, such as Bernstein-von Mises type results [showing asymptotic equivalence between Bayes and frequentist results], to provide some kind of universal assurances and/or insight beyond simulations into when gains can be expected.
- BP Carlin: Yes: Point taken, but our goal here was never to match frequentist results; moreover, the frequentist simulations we have carried out speak for themselves and are Bayesianly justified in the sense of Rubin (1984)!
Conclusion
- Adaptive borrowing is consistent with recent arguments on behalf of the ease and desirability of adaptivity in Bayesian clinical trials generally; cf. the new Chapman and Hall textbook by Berry, Carlin, Lee, and Müller (2010)!
- Papers related to this work:
  Hobbs, B.P., Carlin, B.P., Mandrekar, S., and Sargent, D.J. (2011). Hierarchical commensurate and power prior models for adaptive incorporation of historical information in clinical trials. Biometrics, 67, 1047-1056.
  Hobbs, B.P., Sargent, D.J., and Carlin, B.P. (2010). Commensurate priors for incorporating historical information in clinical trials using general and generalized linear models. Under revision for Bayesian Analysis.
  Hobbs, B.P. and Carlin, B.P. (2012). Optimal-balanced randomization for controlled clinical trials that incorporate historical controls. Manuscript in process.
Bayesian Hierarchical Modeling for Detecting Safety Signals in Clinical Trials Amy Xia, Amgen, Inc. Haijun Ma, Amgen, Inc. Brad Carlin, University of Minnesota Medtronic Statistics Conference Minneapolis, MN, October 4, 2011
Outline Introduction Why Are Bayesian Models Helpful? Three Stage Bayesian Hierarchical Models An Example Simulation Study Closing Remarks
Three-Tier System for Analyzing Adverse Events in Clinical Trials
- Tier 1 AEs -- events for which a hypothesis has been defined
- Tier 2 AEs -- events that are not pre-specified and common
- Tier 3 AEs -- events that are not pre-specified and infrequent
Gould 2002; Mehrotra and Heyse 2004
SPERT White Paper -- Crowe et al 2009
Multiplicity Issue in Detecting Signals Is Challenging
- Detection of safety signals from routinely collected, not pre-specified AE data in clinical trials is a critical task in drug development
- The multiplicity issue in such a setting is a challenging statistical problem:
  - without multiplicity considerations, there is a potential for an excess of false positive signals
  - traditional ways of adjusting for multiplicity, such as Bonferroni, may lead to an excessive rate of false negatives
- The challenge is to develop a procedure for flagging safety signals that provides a proper balance between no adjustment and too much adjustment
Bayesian Work in Signal Detection
- Spontaneous adverse drug reaction reports:
  - Gamma Poisson Shrinker (GPS) on the FDA AERS database (DuMouchel, 1999)
  - Bayesian Confidence Propagation Neural Network (BCPNN) on the WHO database (Bate et al., 1998)
- Clinical trial safety (AE) data:
  - Bayesian hierarchical mixture modeling (Berry and Berry, 2004)
"Safety assessment is one area where frequentist strategies have been less applicable. Perhaps Bayesian approaches in this area have more promise."
-- Chi, Hung, and O'Neill; Pharmaceutical Report, 2002
Proposed Bayesian Approach
1. A three-level hierarchical mixture model for binary responses was constructed following the work by Berry & Berry (2004)
2. Further extended to a hierarchical Poisson mixture model, to account for different exposure/follow-up times between patients
3. Provided guidance on how to choose a signal detection threshold to achieve acceptable false discovery rate (FDR) and power for meaningful risk sizes
4. Extended the work to the setting of meta-analysis and to incorporating individual patient-level data
5. Implemented the above models with available software: WinBUGS for model implementation; S-Plus graphics for inference
Considerations Regarding Whether to Flag an Event
- Actual significance levels
- Total number of types of AEs
- Rates for those AEs not considered for flagging
- Biologic relationships among various AEs
The first two are standard considerations in the frequentist approach; the second two are not, but are relevant in the Bayesian approach.
-- Berry and Berry, 2004
Example of the Medical Dictionary for Regulatory Activities (MedDRA) Hierarchy
- System Organ Class (SOC): Infections and Infestations
- High Level Group Term (HLGT): Viral infectious disorders
- High Level Term (HLT): Influenza Viral Infections
- Preferred Term (PT): Influenza
- Lowest Level Term (LLT): Flu Syndrome
- Reported AE verbatim: Flu Syndrome
Why Are Bayesian Models Helpful?
- Bayesian hierarchical models allow for explicitly modeling AEs with the existing coding structure (e.g., SOC and PT in MedDRA)
- AEs in the same SOC are more likely to be similar than AEs across SOCs; the model allows for this possibility but does not impose it, depending on the actual data
- Flexible in modeling SOC/PT, HLT/PT, or even the full hierarchy
- In fact, clinical and safety people already (informally) consider the similarity of the AEs, say, within SOCs when they review AE tables
  - for example, if differences in several CV events were observed, then each would be more likely to be causal than if differences came from medically unrelated areas (e.g., skin, neurological, thrombosis, cancer)
- Bayesian hierarchical modeling allows a scientific, explicit, and more formal way to take this into consideration
An Example
- Pooled data from four double-blind placebo-controlled studies on drug X
- Sample size: N_t = 1245 and N_c = 720
- Reported AEs are coded to 465 PTs under 24 SOCs
- Notation: SOC b = 1, ..., B and PT j = 1, ..., k_b
- Data for AE_bj: treatment group, Y_bj incident events in N_t patients; control group, X_bj incident events in N_c patients
Data from the Example

b   j   SOC                                                    Preferred Term    Ctrl n  Trt n  Ctrl r(%)  Trt r(%)  Ctrl n/e  Trt n/e
1   1   Blood and Lymphatic System Disorders                   Lymphadenopathy     6      10     0.83       0.80      0.04      0.03
8   8   General Disorders and Administration Site Conditions   Fatigue*            9      37     1.25       2.97      0.06      0.13
8   9                                                          Feeling Cold        1       0     0.14       0.00      0.01      0.00
8   10                                                         Feeling Hot         1       3     0.14       0.24      0.01      0.01
11  34  Infections and Infestations                            Herpes Simplex*     3      19     0.42       1.53      0.02      0.07
11  64                                                         Sinusitis*         12      46     1.67       3.69      0.08      0.16
12  12  Injury, Poisoning and Procedural Complications         Excoriation*        0       8     0.00       0.64      0.00      0.03
22  15  Skin and Subcutaneous Tissue Disorders                 Ecchymosis*         0      12     0.00       0.96      0.00      0.04

r = subject incidence (%); n/e = exposure-adjusted event rate (per subject-year)
* Fisher's 2-sided exact test unadjusted p-value <= 0.05, with higher risk on the treatment arm
Logistic Regression Models for Subject Incidence (Berry & Berry 2004)
- Y_bj ~ Binomial(N_t, t_bj); X_bj ~ Binomial(N_c, c_bj), where t_bj and c_bj are event rates for AE_bj in the treatment and control groups, respectively
- logit(c_bj) = λ_bj; logit(t_bj) = λ_bj + θ_bj
- Note that θ_bj = log(OR_bj)
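The link structure above is easy to verify numerically: on the logit scale, θ is exactly the log odds ratio. A tiny sketch (function names ours):

```python
import math

def inv_logit(x):
    """Inverse of the logit function: maps the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def rates_from_params(lam, theta):
    """Event rates implied by the model for one AE: logit(c) = lambda and
    logit(t) = lambda + theta, so exp(theta) is exactly the odds ratio."""
    c = inv_logit(lam)
    t = inv_logit(lam + theta)
    odds_ratio = (t / (1.0 - t)) / (c / (1.0 - c))
    return c, t, odds_ratio

# A rare AE (control rate about 1.8%) with theta = log 2 gives OR = 2.
c, t, orr = rates_from_params(lam=-4.0, theta=math.log(2.0))
```

This is why the mixture prior in Model 1b places its point mass at θ = 0: that value corresponds to OR = 1, i.e. no treatment effect on that AE.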
Model 1a: Logistic Regression Model with Normal Prior on log-OR
- Stage 1 priors: λ_bj ~ N(µ_λb, σ²_λb); θ_bj ~ N(µ_θb, σ²_θb)
- Stage 2 priors: µ_λb ~ N(µ_λ0, σ²_λ0); σ²_λb ~ IG(α_λ, β_λ); µ_θb ~ N(µ_θ0, σ²_θ0); σ²_θb ~ IG(α_θ, β_θ)
- Stage 3 priors: µ_λ0 ~ N(µ_λ00, σ²_λ00); σ²_λ0 ~ IG(α_λ00, β_λ00); µ_θ0 ~ N(µ_θ00, σ²_θ00); σ²_θ0 ~ IG(α_θ00, β_θ00)
- Hyperparameters µ_λ00, σ²_λ00, α_λ00, β_λ00, µ_θ00, σ²_θ00, α_θ00, β_θ00, α_λ, β_λ, α_θ, β_θ are fixed constants
Model 1b: Logistic Regression Model with Mixture Prior on log-OR
- Same likelihood and mean structure; same priors for the control group parameters
- Mixture prior for log(OR): point mass at 0 + normal distribution
  - Stage 1 prior: θ_bj ~ p_b δ(0) + (1 − p_b) N(µ_θb, σ²_θb)
  - Stage 2 prior: p_b ~ Beta(α_pb, β_pb)
  - Stage 3 priors: α_pb ~ Exp(ξ_α) I(α_pb > 1); β_pb ~ Exp(ξ_β) I(β_pb > 1)
- This mixture allows a point mass on equality of the treatment and control rates, because many AEs may be completely unaffected by treatment
Model 1c: Non-hierarchical One-stage Logistic Regression Model with Mixture Prior on log-OR
- Simply the first stage of the three-level hierarchical mixture model; same likelihood and mean structure
- No information is borrowed across AEs within the same SOC, i.e., the PTs are treated independently: λ_bj ~ N(0, 10²)
- Mixture prior for log(OR), point mass at 0 + normal distribution: θ_bj ~ 0.5 δ(0) + 0.5 N(0, 10²)
- With vague prior information, this model delivers similar results as frequentist approaches
Model Selection -- DIC (Spiegelhalter 2002)
Deviance Information Criterion: DIC = Dbar (fit) + pD (effective # of parameters)

            Dbar      pD       DIC
Model 1a    2604.86   382.09   2986.95
Model 1b    2644.86   324.03   2968.89
Model 1c    2574.86   602.38   3177.24

Hierarchical models with mixture priors are preferred based on their smaller DIC values
Inference
AE_bj is flagged if
- Pr(OR_bj > c | Data) > p, where OR_bj = exp(θ_bj) in the binomial models, or
- Pr((t_bj − c_bj) > d | Data) > p, where t_bj − c_bj is the risk difference (RD),
with c, d, and p all prespecified constants.
Examples:
- c = 1 and p = 0.8 means that the given AE type is flagged if the posterior exceedance probability of the true OR being > 1 is more than 80%
- d = 0.02 and p = 0.8 means that the given AE type is flagged if the posterior exceedance probability of the true RD being > 2% is more than 80%
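Given posterior draws of θ = log(OR) from any of the models above, the flagging rule is a one-liner over the sample. A minimal sketch (the MCMC sampler that produces the draws is not shown; the toy normal draws below merely stand in for it):

```python
import math, random

def flag_ae(log_or_draws, c=1.0, p=0.8):
    """Flag an AE if Pr(OR > c | Data) > p, with the exceedance
    probability estimated from posterior draws of theta = log(OR)."""
    exceed = sum(d > math.log(c) for d in log_or_draws) / len(log_or_draws)
    return exceed > p, exceed

random.seed(1)
draws = [random.gauss(0.8, 0.4) for _ in range(5000)]  # toy posterior
flagged, prob = flag_ae(draws, c=1.0, p=0.8)
```

The same function applies unchanged to risk-difference draws (compare to d instead of log c), which is the flexibility noted in the closing remarks.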
Inferences of Binomial Hierarchical Model with Mixture Prior (Model 1b)

Posterior exceedance probability for:
SOC                                             PT               Unadj. p  OR>1.0  OR>1.2  OR>2  RD>2%  RD>5%
General Disorders & Admin. Site Conditions      Fatigue          .019      .56     .55    .32   .10    .00
Infections and Infestations                     Herpes Simplex   .025      .53     .51    .36   .00    .00
Infections and Infestations                     Sinusitis        .012      .70     .69    .42   .28    .00
Injury, Poisoning & Procedural Complications    Excoriation      .030      .30     .28    .18   .00    .00
Skin & Subcutaneous Tissue Disorders            Ecchymosis       .005      .54     .52    .44   .00    .00

OR = odds ratio (drug:placebo), RD = risk difference (drug − placebo)
[Figure: −log10(Fisher's exact 2-sided p-value) plotted against the raw risk difference (%), treatment − placebo, for all PTs grouped by SOC; Ecchymosis, Sinusitis, Fatigue, Excoriation, and Herpes Simplex stand out.]
[Figure: P(OR > 1 | Data) versus posterior mean OR for every PT under the binomial hierarchical model with mixture prior; Sinusitis, Fatigue, Herpes Simplex, and Ecchymosis show the highest posterior exceedance probabilities, with the remaining hundreds of PTs shrunk toward the null.]
Poisson Regression for Exposure-adjusted Incidence
- Poisson models: adjust for different exposures in treatment and control; assume a constant hazard over time
- Subject exposure: time to AE first onset or end of treatment, whichever comes first
- Y_bj ~ Pois(t_bj T_bj); X_bj ~ Pois(c_bj C_bj), where t_bj and c_bj are event rates, and T_bj and C_bj are the total exposure times for AE_bj in the treatment and control groups, respectively
- log(c_bj) = λ_bj; log(t_bj) = λ_bj + θ_bj
- Note that θ_bj = log(RR_bj)
Poisson Regression for Exposure-adjusted Incidence
- Model 2a: Poisson regression with normal prior on log-RR; same priors as used in Model 1a
- Model 2b: Poisson regression with mixture prior on log-RR; Poisson likelihood and log-linear mean structure, same as Model 2a; mixture prior on log-RR: point mass at 0 + normal
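For intuition on what the Poisson model estimates, here are the crude per-AE MLEs (no hierarchy, no prior): λ̂ is the log control rate and θ̂ the log rate ratio. The exposure totals below are back-derived from the example table's n/e rates (0.13 and 0.06 per subject-year for Fatigue) and are illustrative only; function names are ours.

```python
import math

def loglinear_mles(y, T, x, C):
    """Crude per-AE MLEs for the Poisson model:
    lambda_hat = log(x / C) (log control rate),
    theta_hat  = log((y / T) / (x / C)) (log rate ratio)."""
    lam_hat = math.log(x / C)
    theta_hat = math.log((y / T) / (x / C))
    return lam_hat, theta_hat

# Fatigue: 37 treatment events in ~285 subject-years vs 9 control events
# in ~150 subject-years (rates about 0.13 and 0.06 per subject-year).
lam_hat, theta_hat = loglinear_mles(y=37, T=285.0, x=9, C=150.0)
rate_ratio = math.exp(theta_hat)
```

The hierarchical versions (Models 2a/2b) shrink these raw per-AE estimates toward their SOC-level means, which is what tames the extremes for rare events.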
CIs by Model
[Figure: 95% CIs for the odds ratios of Fatigue, Herpes Simplex, Sinusitis, Excoriation, and Ecchymosis (open circles = observed values, plus signs = posterior means) under four approaches:
- exact CI using the frequentist approach (no multiplicity adjustment)
- Bayesian non-hierarchical one-stage binomial model with mixture prior on log-OR (Model 1c)
- Bayesian binomial hierarchical model with normal prior on log-OR (Model 1a)
- Bayesian binomial hierarchical model with mixture prior on log-OR (Model 1b)]
Simulation Study Questions to answer: Is modeling the SOC/PT hierarchy helpful? What operating characteristics should be chosen to achieve acceptable false discovery rate (FDR) and power for meaningful risk sizes? Simulation schema: Null scenario: randomly assign subjects to treatment arms; adverse events within each subject remain unchanged to maintain the SOC/PT hierarchy. Scenarios with elevated risks: choose 2 SOCs, with 3 and 21 PTs each; OR = 2 and 5; placebo risk = 1%, 5%, and 10%; log-OR ~ N(log 2 or log 5, 0.25^2) within the same SOC. Other SOCs remain the same as in the null datasets.
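The null-scenario construction above (permute arm labels, keep each subject's AE profile intact) can be sketched as follows; the function and data-structure names are ours:

```python
import random

def null_dataset(subject_aes, n_treatment, seed=0):
    """Build one null-scenario dataset, as in the simulation schema above.

    Arm labels are randomly reassigned across subjects, while each
    subject's list of AEs is left unchanged, preserving the within-subject
    SOC/PT structure. Illustrative sketch only.

    subject_aes : list of per-subject AE lists, e.g. [["Nausea"], [], ...]
    n_treatment : number of subjects to assign to the treatment arm
    Returns (arm_labels, subject_aes) as parallel lists.
    """
    rng = random.Random(seed)
    labels = ["T"] * n_treatment + ["C"] * (len(subject_aes) - n_treatment)
    rng.shuffle(labels)               # random reassignment to arms
    return labels, subject_aes        # within-subject AEs unchanged

labels, aes = null_dataset([["Nausea"], [], ["Rash", "Cough"], []], n_treatment=2)
```

Because the labels are exchangeable under the null, any signal flagged in these datasets is by construction a false positive, which is what makes them suitable for estimating FDR/FWER.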
Simulation Study (cont.)

                                        NULL          OR=2, placebo 1%    OR=2, placebo 5%
                                      FDR/FWER (%)    FDR (%)  Power (%)  FDR (%)  Power (%)
Non-adjusted Fisher's exact test
  (2-sided, p-value <= 0.05)              100            35       38         19       84
Non-hierarchical Bayes model (Model 1c)*
  c=1, p=0.6                              100            50       23         25       68
  c=1, p=0.7                               96            38       19         15       64
  c=1, p=0.8                               68            24       15          7       60
Bayesian hierarchical model (Model 1b)*
  c=1, p=0.6                               17            12       66          9       95
  c=1, p=0.7                               12             9       61          5       94
  c=1, p=0.8                                8             7       56          3       91

* Flagging rule: Pr(OR_bj >= c | Data) >= p
FDR = false discovery rate = # of falsely flagged signals / # of flagged signals
Power = # of correctly flagged signals / # of signals that should be flagged
FWER = family-wise error rate
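The FDR and power figures in the table follow directly from the definitions in its footnotes. A minimal helper computing them from a set of flagged PTs and the set of PTs that should be flagged (names and example data are ours):

```python
def operating_characteristics(flagged, true_signals):
    """FDR and power per the definitions above. Illustrative helper.

    flagged      : iterable of PT names flagged by a method
    true_signals : iterable of PT names that should be flagged
    """
    flagged, true_signals = set(flagged), set(true_signals)
    false_pos = flagged - true_signals
    # FDR = falsely flagged / flagged; power = correctly flagged / true signals
    fdr = len(false_pos) / len(flagged) if flagged else 0.0
    power = len(flagged & true_signals) / len(true_signals) if true_signals else 0.0
    return fdr, power

# One of two flags is false (FDR 0.5); one of two true signals found (power 0.5)
fdr, power = operating_characteristics({"Nausea", "Rash"}, {"Rash", "Cough"})
```

Averaging these quantities over many simulated datasets, at each candidate (c, p) threshold, yields the operating characteristics tabulated above.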
Closing Remarks The traditional approach of flagging routinely collected AEs based on unadjusted p-values or CIs can result in excessive false positive signals. As a result, it can cause undue concern for approval, labeling, or post-marketing commitments. Bayesian hierarchical mixture modeling provides a useful tool to address multiplicity. It allows explicit modeling of AEs within the existing MedDRA coding structure, so that AEs within a SOC or across SOCs can borrow strength from each other.
Closing Remarks (Cont.) Appealing in the rare-event setting: the model modulates the extremes. Inferences are based on the exact full posterior distributions, relaxing the assumption of normality of the outcome. Models the entire AE dataset, so the distinction between Tier 2 and Tier 3 AEs may not be necessary; makes efficient use of all the data. Straightforward and flexible to assess clinically important differences on different scales (risk difference, OR, or RR). Avoids detecting medically unimportant signals (an AE could have high Pr(OR or RR > 1, or RD > 0 | Data) yet be medically unimportant).
Closing Remarks (Cont.) A simulation study can guide selection of a signal detection threshold. Graphics are effective in displaying flagged signals when analyzing hundreds or thousands of types of AEs. The field of clinical trial signal detection is still in its infancy; more research and practice are needed. Further development awaits statisticians working closely with clinicians and safety scientists to advance this field.
Thank You!
References
Bate A, Lindquist M, Edwards IR, Olsson S, Orre R, Lansner A, and De Freitas RM (1998). A Bayesian neural network method for adverse drug reaction signal detection. Eur J Clin Pharmacol 54:315-321.
Berry S and Berry D (2004). Accounting for multiplicities in assessing drug safety: a three-level hierarchical mixture model. Biometrics 60:418-426.
Chi G, Hung HMJ, and O'Neill R (2002). Some comments on "Adaptive Trials and Bayesian Statistics in Drug Development" by Don Berry. Pharmaceutical Report 9:1-11.
Crowe B, Xia A, Berlin J, Watson D, Shi H, Lin S, Kuebler J, et al. (2009). Recommendations for safety planning, data collection, evaluation and reporting during drug, biologic and vaccine development: a report of the PhRMA Safety Planning, Evaluation and Reporting Team (SPERT). To appear in Clinical Trials.
DuMouchel W (1999). Bayesian data mining in large frequency tables, with an application to the FDA Spontaneous Reporting System (with discussion). The American Statistician 53:177-202.
Gould AL (2002). Drug safety evaluation in and after clinical trials. Presented at the Deming Conference, Atlantic City, 3 December 2002.
Mehrotra DV and Heyse JF (2004). Multiplicity considerations in clinical safety analysis. Statistical Methods in Medical Research 13:227-238.
O'Connell M (2006). Statistical graphics for the design and analysis of clinical development studies. Insightful Webcast: http://www.insightful.com/news_events/webcasts/2006/07clinical/default.asp
Spiegelhalter DJ, Best NG, Carlin BP, and van der Linde A (2002). Bayesian measures of model complexity and fit (with discussion). J Roy Statist Soc B 64:583-640.