Bayesian Hidden Markov Models for Alcoholism Treatment Trial Data




Bayesian Hidden Markov Models for Alcoholism Treatment Trial Data May 12, 2008

Co-Authors: Dylan Small, Statistics Department, UPenn; Kevin Lynch, Treatment Research Center, UPenn; Steve Maisto, Psychology Department, Syracuse University; Dave Oslin, Treatment Research Center, UPenn

The Problem. N subjects measured on T days (daily drink counts); we want to estimate the average treatment effect on the outcome.

Subject |  Day 1   2   3   4  ...  166  167  168
   1    |      1   1   2   2  ...    2    1    1
   2    |      1   1   1      ...    3    1    1
   3    |      3   3   3   1  ...    1    1    3
  ...   |
  238   |      1   3          ...    2    3    3
  239   |      1   1   1      ...    1    1    1
  240   |      1   1   2      ...    1    2    2

Sample Time Series: daily drink counts for Subjects 61, 108, 142, and 183. [Figure: four panels, one per subject; x-axis: Day (0-168), y-axis: Drinks (1-3)]

The Goal of Treatment. The main goal is to reduce alcohol consumption. 1. Does the treatment reduce the frequency of all drinking events, or only certain types of drinking events? Is moderate drinking an acceptable outcome? How does the treatment affect different complex drinking patterns and behaviors? 2. Does the treatment reduce the frequency and/or duration of relapses? What is a relapse? Everybody agrees on the notion of a relapse, but there is no consensus on an operational definition of relapse.

What is the Outcome? It's complicated. The subjects are recovering alcoholics, whose drinking behaviors are complex processes that evolve and change through time. Simple models lack the structure to adequately describe these processes (Wang et al., 2002).

Simple Models:
Time until first drink/relapse (ignores all behavior after the first drink)
Percentage of days drinking (ignores the amount of alcohol consumed)
Multiple failure time models (requires a definition of relapse)
[Figure: example trajectory of drinks per day, Y_it, over a 30-day window]

HMM Motivation. A well-known theory of relapse, the cognitive-behavioral model of relapse (McKay et al., 2006; Marlatt and Gordon, 1985), suggests that the cause of a relapse is two-fold: 1. First, the subject must be in a mental and/or physical condition in which he or she is vulnerable to drinking; that is, if presented with an opportunity to drink, the subject would not be able to mount a coping response. 2. Second, the subject must actually encounter such a high-risk drinking situation.

HMM structure: Y_{it} is the observation for subject i at time t, and H_{it} is the hidden state for subject i at time t. [Diagram: the standard HMM graphical model, with a Markov chain of hidden states H_{i1} -> H_{i2} -> ... -> H_{iT} and each observation Y_{it} depending only on H_{it}]

A Simple HMM with no covariates. The complete-data likelihood for an HMM factors into three parts:

$$p(Y, H \mid \theta) \;=\; \underbrace{\prod_{i=1}^{N} p(H_{i1} \mid \theta)}_{(1)} \;\; \underbrace{\prod_{i=1}^{N} \prod_{t=2}^{T} p(H_{it} \mid H_{i,t-1}, \theta)}_{(2)} \;\; \underbrace{\prod_{i=1}^{N} \prod_{t=1}^{T} p(Y_{it} \mid H_{it}, \theta)}_{(3)},$$

where Y and H denote observations and hidden states, and parts (1), (2), and (3) are the initial state distribution, the hidden state transitions, and the observation distributions, respectively.
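As a concrete illustration (not from the talk), here is a minimal numpy sketch of this factorization for a single subject, assuming 0-based codes for observations and hidden states and hypothetical names pi, Q, P for the three parameter blocks:

```python
import numpy as np

def complete_data_loglik(y, h, pi, Q, P):
    """Complete-data log-likelihood for one subject's sequence.

    y  : observed drink levels coded 0..2 (original codes 1..3), length T
    h  : hidden states coded 0..S-1, length T
    pi : initial state distribution, shape (S,)
    Q  : hidden state transition matrix, shape (S, S)
    P  : observation distributions given the hidden state, shape (S, 3)
    """
    y, h = np.asarray(y), np.asarray(h)
    ll = np.log(pi[h[0]])                    # part (1): initial state
    ll += np.sum(np.log(Q[h[:-1], h[1:]]))   # part (2): transitions, t = 2..T
    ll += np.sum(np.log(P[h, y]))            # part (3): observations, t = 1..T
    return ll
```

Summing this quantity over subjects gives log p(Y, H | θ) for the pooled model.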

Simple HMM Fit: S = 5. Fit multinomial distributions for the hidden state transitions and for the observations conditional on the hidden states, with the data pooled across individuals:

$$\hat{\pi} = (.79, .11, .01, .07, .01)$$

$$\hat{Q} = \begin{pmatrix} .99 & .00 & .00 & .00 & .01 \\ .01 & .98 & .01 & .00 & .00 \\ .01 & .00 & .95 & .00 & .04 \\ .01 & .00 & .00 & .98 & .01 \\ .05 & .00 & .02 & .02 & .91 \end{pmatrix}, \qquad \hat{P} = \begin{pmatrix} .99 & .01 & .00 \\ .71 & .26 & .03 \\ .08 & .86 & .06 \\ .65 & .06 & .29 \\ .03 & .01 & .96 \end{pmatrix},$$

where $\hat{Q}$ is the hidden state transition matrix and the rows of $\hat{P}$ are the observation distributions.

Interpretation of hidden states for S = 5: 1. The large probabilities on the diagonal of $\hat{Q}$ mean the hidden states are persistent. 2. The observation distributions are clinically interpretable:

                                   Y_it = 1   Y_it = 2   Y_it = 3
A  (Abstinence)                       .99        .01        .00
IM (Intermittent Moderate Drinking)   .71        .26        .03
SM (Steady Moderate Drinking)         .08        .86        .06
IH (Intermittent Heavy Drinking)      .65        .06        .29
SH (Steady Heavy Drinking)            .03        .01        .96

Fitting additional latent states (S = 6, 7) yielded no additional interpretable drinking behaviors.

Choosing the number of Hidden States: use 10-fold cross-validation to make out-of-sample predictions, and measure the deviance

$$D = -2 \sum_{i=1}^{N} \sum_{t=11}^{T} \log \hat{P}(Y_{it} = y_{it}).$$

[Figure: out-of-sample deviance (roughly 0.65-0.80 per observation) for the HMM with 3-7 hidden states, for Markov models of order 1-3, and for MTD models with 1-10 lags]
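One way to obtain the predicted probabilities $\hat{P}(Y_{it} = y_{it})$ is a one-step-ahead forward recursion; the following is a rough sketch under that assumption (variable names hypothetical, parameters taken as fixed at their training-fold estimates):

```python
import numpy as np

def onestep_predictive_probs(y, pi, Q, P):
    """One-step-ahead predictive probabilities P(Y_t = y_t | y_1..y_{t-1})
    under an HMM with parameters (pi, Q, P); y is coded 0..2."""
    alpha = pi.copy()                        # P(H_t | y_1, ..., y_{t-1}); starts at pi
    preds = np.empty(len(y))
    for t, yt in enumerate(y):
        p_y = alpha @ P[:, yt]               # predictive probability of the observed value
        preds[t] = p_y
        alpha = (alpha * P[:, yt]) / p_y     # condition on y_t
        alpha = alpha @ Q                    # propagate to time t + 1
    return preds

# Deviance contribution for one subject, summing over days 11..T as in the talk:
# D_i = -2 * np.sum(np.log(onestep_predictive_probs(y, pi, Q, P)[10:]))
```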

Question 1: Is Moderate Drinking OK? Question: if the hidden states are persistent, can a subject drink moderately and not resort to heavy drinking soon after? Define states 4 and 5 as relapse states. [Figure: probability of avoiding relapse as a function of time (days 0-175), for initial state 1 (A), initial state 2 (IM), and initial state 3 (SM)]
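A small sketch of how such avoidance curves can be computed from an estimated transition matrix, assuming the S = 5 fit above and treating any exit from states 1-3 as absorption (hypothetical names; a generic calculation, not necessarily the exact one used in the talk):

```python
import numpy as np

def prob_avoid_relapse(Q, init_state, n_days, relapse_states=(3, 4)):
    """P(no visit to a relapse state within n_days), starting in init_state (0-based).

    Uses the sub-matrix of Q restricted to the non-relapse states, so probability
    mass that leaves the safe states is treated as absorbed (relapsed).
    """
    safe = [s for s in range(Q.shape[0]) if s not in relapse_states]
    Q_safe = Q[np.ix_(safe, safe)]           # transitions among safe states only
    v = np.zeros(len(safe))
    v[safe.index(init_state)] = 1.0
    probs = []
    for _ in range(n_days):
        v = v @ Q_safe                       # remain within the safe states one more day
        probs.append(v.sum())                # survival probability up to this day
    return np.array(probs)
```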

Question 2: What is a Relapse? Currently, there is no universally agreed-upon operational definition of relapse, and different definitions can affect estimates of treatment effects (Maisto et al., 2003). Common definitions include: any drink of alcohol; a day of heavy drinking; four consecutive drinking days (any amount of alcohol); any drink of alcohol that follows at least 4 days of abstinence. The HMM offers a new data-based definition: any time point at which a subject has a high probability of being in hidden state 4 or 5 (Intermittent Heavy Drinking or Steady Heavy Drinking). Estimate the most likely hidden state sequence for each subject using the Viterbi algorithm.
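A minimal log-space Viterbi sketch, assuming the pooled S = 5 estimates and ignoring missing days (variable names hypothetical):

```python
import numpy as np

def viterbi(y, pi, Q, P):
    """Most likely hidden state sequence for observations y (coded 0..2)."""
    T, S = len(y), len(pi)
    log_pi, log_Q, log_P = np.log(pi), np.log(Q), np.log(P)
    delta = np.empty((T, S))                 # best log-prob of a path ending in state s at time t
    back = np.zeros((T, S), dtype=int)
    delta[0] = log_pi + log_P[:, y[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_Q   # scores[r, s]: best path entering s via r
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_P[:, y[t]]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):           # backtrack through the stored pointers
        path[t] = back[t + 1, path[t + 1]]
    return path
```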

Most Likely Sequence 1: Subject 34. [Figure: observed drink counts Y_it (1-3) and the decoded latent state sequence, labeled 1(A) through 5(SH), over days 0-168]

Most Likely Sequence 2: Subject 126. [Figure: observed drink counts Y_it and the decoded latent state sequence over days 0-168]

A More Complex HMM: incorporate covariates (possibly time-varying), random effects, and missing data (assumed MAR).

The Model. For the hidden state transition probabilities, use a multinomial logit model:

$$P(H_{it} = s \mid H_{i,t-1} = r, X_{it}, \beta) = \frac{\exp(X_{it}\beta_{rs})}{\sum_{k} \exp(X_{it}\beta_{rk})}, \qquad \beta_{rs0} \sim N(\mu_{rs}, \sigma_{rs}).$$

For the observation probabilities, use an ordinal probit model with state-specific cutpoints $\gamma_{s1} < \gamma_{s2}$, so that, with latent mean $x_{it}\eta_s$,

$$P(Y_{it} = 1) = \Phi(\gamma_{s1} - x_{it}\eta_s), \quad P(Y_{it} = 2) = \Phi(\gamma_{s2} - x_{it}\eta_s) - \Phi(\gamma_{s1} - x_{it}\eta_s), \quad P(Y_{it} = 3) = 1 - \Phi(\gamma_{s2} - x_{it}\eta_s).$$

[Figure: a normal density centered at the latent mean, partitioned by the cutpoints $\gamma_{s1}$ and $\gamma_{s2}$ into the three category probabilities]
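To make the two pieces concrete, here is a short sketch (hypothetical names; scipy's normal CDF stands in for Φ) of the transition and emission probabilities implied by these formulas:

```python
import numpy as np
from scipy.stats import norm

def transition_probs(x, beta_r):
    """Multinomial-logit transition probabilities out of state r.

    x      : covariate vector for subject i at time t (includes an intercept)
    beta_r : array of shape (S, p); row s holds beta_{rs}, with the reference
             row fixed at zero so the probabilities are identified.
    """
    scores = beta_r @ x
    scores -= scores.max()                   # numerical stability
    exps = np.exp(scores)
    return exps / exps.sum()

def emission_probs(x, eta_s, gamma_s):
    """Ordinal-probit probabilities of Y in {1, 2, 3} given hidden state s.

    gamma_s = (gamma_s1, gamma_s2) are the state's cutpoints.
    """
    mu = x @ eta_s
    p1 = norm.cdf(gamma_s[0] - mu)
    p2 = norm.cdf(gamma_s[1] - mu) - p1
    return np.array([p1, p2, 1.0 - p1 - p2])
```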

The Hidden State Transition Matrix. The hidden state transition matrix parameters are organized as follows (for S = 3 hidden states):

        s = 1               s = 2                              s = 3
r = 1   {0}, (0, 0, ...)    {β_120i}, (β_121, β_122, ...)      {β_130i}, (β_131, β_132, ...)
r = 2   {0}, (0, 0, ...)    {β_220i}, (β_221, β_222, ...)      {β_230i}, (β_231, β_232, ...)
r = 3   {0}, (0, 0, ...)    {β_320i}, (β_321, β_322, ...)      {β_330i}, (β_331, β_332, ...)

where braces {β} denote a set of random effects, and the rest are fixed effects.

The Data. The outcome (N = 240 subjects and T = 168 days) is distributed as follows:

Y         %
1         68
2          7
3          8
Missing   17
Total    100

Covariates. This clinical trial, conducted at UPenn's Treatment Research Center, had 6 arms: treatment/control for Naltrexone, and two therapies vs. control.

In the hidden state transition matrix, we include:
1. Treatment (Naltrexone)
2. Therapy 1
3. Therapy 2
4. Female
5. Time

In the observation model, we include:
1. Weekend indicator
2. Past drinking behavior

The Gibbs Sampler
1. Initialize the parameters θ = (β, η, γ, π, µ, σ).
2. Draw H | Y_obs, θ from its full conditional distribution by evaluating the likelihood with the forward recursion and then running a stochastic backward recursion, for all subjects i = 1, 2, ..., N (Scott, 2002); see the sketch after this list.
3. Draw β | H from its posterior using Scott's DAFE algorithm (2007), which uses augmented variables and a Metropolis-Hastings step, or using a random-walk Metropolis step.
4. Draw µ | β, σ from its full conditional distribution, assuming flat or weakly informative priors (Gelman, forthcoming).
5. Draw σ | β, µ from its full conditional distribution.
6. Draw Y_mis | H, η, γ, assuming the data are missing at random (MAR), using the current batch of parameters.
7. Draw γ | H, Y_obs, Y_mis, η using Cowles's (1996) random-walk Metropolis-Hastings step.
8. Draw η | H, Y_obs, Y_mis, γ in the standard data-augmentation way (Albert and Chib, 1993).
9. Draw π | H from its full conditional Dirichlet distribution.
10. Repeat steps 2-9 for g = 2, ..., G.
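Step 2 is the forward-backward draw of the hidden states; below is a compact sketch of forward-filtering backward-sampling for one subject, assuming a time-homogeneous transition matrix and precomputed emission probabilities (names hypothetical):

```python
import numpy as np

def ffbs(pi, Q, obs_probs, rng):
    """Forward-filtering backward-sampling draw of a hidden state path.

    pi        : initial state distribution, shape (S,)
    Q         : transition matrix, shape (S, S)
    obs_probs : obs_probs[t, s] = p(y_t | H_t = s); a row of ones marginalizes
                out a day with a missing observation
    rng       : np.random.Generator
    """
    T, S = obs_probs.shape
    alpha = np.empty((T, S))
    a = pi * obs_probs[0]
    alpha[0] = a / a.sum()
    for t in range(1, T):                    # forward pass: filtered state distributions
        a = (alpha[t - 1] @ Q) * obs_probs[t]
        alpha[t] = a / a.sum()
    h = np.empty(T, dtype=int)
    h[-1] = rng.choice(S, p=alpha[-1])
    for t in range(T - 2, -1, -1):           # backward pass: sample states given the future
        w = alpha[t] * Q[:, h[t + 1]]
        h[t] = rng.choice(S, p=w / w.sum())
    return h

# Example usage: h = ffbs(pi, Q, obs_probs, np.random.default_rng(0))
```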

Characterizing the Fit: S = 3

$$\hat{\pi} = (.94, .04, .02), \qquad \hat{Q} = \begin{pmatrix} .98 & .01 & .01 \\ .69 & .28 & .03 \\ .37 & .02 & .61 \end{pmatrix}, \qquad \hat{P} = \begin{pmatrix} .99 & .01 & .00 \\ .24 & .73 & .03 \\ .03 & .00 & .97 \end{pmatrix}$$

The Treatment Effect (Treatment = red, Control = black). [Figure: a 3x3 grid of posterior densities, one for each entry Q(r, s) of the hidden state transition matrix, comparing the treatment and control groups on the (0, 1) scale]

Missing Data, Subject 183: [Figure: observed drink counts (1-3) for days 60-100, with gaps on missing days]

Hidden States Posterior Distribution, Subject 183: [Figure: posterior probability of each hidden state by day, days 60-100]

Missing Data Posterior Distribution, Subject 183: [Figure: posterior distribution of the imputed drink counts on missing days, days 60-100]

Missing Data, Subject 61: [Figure: observed drink counts (1-3) for days 0-168, with gaps on missing days]

Hidden States Posterior Distribution, Subject 61: [Figure: posterior probability of each hidden state by day, days 0-168]

Missing Data Posterior Distribution, Subject 61: [Figure: posterior distribution of the imputed drink counts on missing days, days 0-168]

An HMM is a model with a rich structure that can capture complex drinking behaviors as they evolve through time. It corresponds to a well-known theoretical model for relapse, the cognitive-behavioral model of relapse. We can (1) assess the danger of moderate drinking, and (2) define relapse in a data-based way. We can measure treatment effects. We can fit the model to subjects with incomplete data, and we can incorporate random effects.

Thanks! www.stat.columbia.edu/~shirley