How Effective Is Alcoholism Treatment in the United States?*

WILLIAM R. MILLER, PH.D., SCOTT T. WALTERS, M.A., AND MELANIE E. BENNETT, PH.D.

Department of Psychology, University of New Mexico, Albuquerque, New Mexico

ABSTRACT. Objective: Following in the footsteps of several prior attempts, this review seeks a meaningful and data-based answer to the common question of how people fare, on average, after being treated for alcoholism (broadly defined as alcohol use disorders). Method: Findings from seven large multisite studies were combined to derive estimates of the average effectiveness of alcoholism treatment. To provide common outcome measures, conversion equations were used to compute variables not reported in the original studies. Results: During the year after treatment, 1 in 4 clients remained continuously abstinent on average, and an additional 1 in 10 used alcohol moderately and without problems. During this period, mortality averaged less than 2%. The remaining clients, as a group, showed substantial improvement, abstaining on 3 days out of 4 and reducing their overall alcohol consumption by 87%, on average. Alcohol-related problems also decreased by 60%. Conclusions: About one third of clients remain asymptomatic during the year following a single treatment event. The remaining two thirds show, on average, large and significant decreases in drinking and related problems. This substantial level of improvement in "unremitted" clients tends to be overlooked when outcomes are dichotomized as successful or relapsed. (J. Stud. Alcohol 62: , 2001)

How effective is alcoholism treatment? This is a simple pragmatic question that is asked by legislators, reporters, funding sources, concerned families and clients themselves. It is also a difficult question to answer in a straightforward manner. The questioner often expects a simple response (e.g., a percentage rate of success).
Two decades ago there was an informal word-of-mouth "industry standard" of sorts in the United States, in that it was common for programs to claim success rates of 80% or higher (Miller and Hester, 1986). The introduction to the "big book" of Alcoholics Anonymous (1976) implies a similarly high rate of success. Those who conduct alcohol treatment outcome research know that claims of a greater than 80% success rate in any single treatment event do not converge well with carefully observed reality. There are, to be sure, many ways to inflate success statistics. Miller and Sanchez-Craig (1996) provided a tongue-in-cheek list of program evaluation strategies for improving success rates (e.g., exclude poor prognosis clients, disregard lost cases and dropouts, and keep follow-up very short). Outcomes look less rosy when all cases are included in analyses (e.g., intent to treat), follow-up extends for at least a year after treatment, a high proportion of cases is interviewed, there is a careful and detailed reconstruction of alcohol and other drug use, and self-report is confirmed against collateral interviews, biochemical measures or records (Miller et al., a).

Researchers also know that a myriad of complexities forestalls any simple answer to the question of the general effectiveness of alcoholism treatment.
* There is no general consensus as to what defines a good outcome, or the period of remission required in order to declare a success.
* There are widely varying conceptions of "alcoholism," and people with alcohol-related problems are quite heterogeneous (Institute of Medicine, 1990). (For purposes of this review, we have chosen a broad conception of the term similar to Jellinek's [1960], as encompassing a wide range of severity of alcohol problems and alcohol dependence.)
* Although aggregate outcomes remain reasonably stable after 6-12 months, individual cases continue to shift quite a bit in outcome status over time.
* It is unclear how much imperfection constitutes a "relapse" (Miller, a) and how much deviation from perfect abstinence defines a treatment failure.
* What is described as "treatment" is highly variable and no blanket endorsement of all forms of treatment can be given (Institute of Medicine, 1990).
* Even within the same treatment approach or program, there are often wide differences in therapists' effectiveness, so that it also matters who is doing the treating (Najavits and Weiss, 1994; Project MATCH Research Group, 1998).

Received: June 12, Revision: October 23,
*This research was supported in part by National Institute on Alcohol Abuse and Alcoholism grants T32-AA07465 and K05-AAO
JOURNAL OF STUDIES ON ALCOHOL / MARCH 2001

Notwithstanding these complexities, however, there are reasons why the question of alcohol treatment's effectiveness deserves consideration and an answer that is based on more than guesswork. Legislators, reporters and third-party payers are understandably dissatisfied with a litany of reasons why the question cannot be answered.

We are not the first to attempt an answer to this question based on the treatment outcome literature. Although a comprehensive review of either efficacy or effectiveness research is well beyond the scope of this article, two classic reviews are illustrative. Emrick (1974), reviewing 265 uncontrolled and controlled outcome studies, concluded that about one third of clients abstain and another one third show substantial improvement after treatment, at least over short periods of follow-up. A less optimistic conclusion was reached by Costello et al. (1977), who confined themselves to 80 studies with at least 12 months of follow-up and counted as failures any cases lost to follow-up, arriving at an average alcoholism treatment success rate of 26%.

The present review is a further attempt to provide a fair and comprehensible response to the question, "How effective is alcoholism treatment?" We offer the following principles that guided our work:
* It is a reasonable question, despite the complexities. People with a life-threatening diagnosis often want to know their chances for survival and recovery.
* What is being asked for is an average, a representative sense of treatment outcomes, rather than the best possible scenario. Though there may be substantial variability across populations, clients, programs, therapists and time, the question asks for a reasonable estimate of typical outcomes.
* Answers should therefore be based on outcome data for a broad spectrum of populations and treatment approaches. That makes this a very different question from the effectiveness of a particular treatment or program, or the prognosis for a particular person.
* The question is not primarily concerned with efficacy (the causal attribution of outcomes to specific aspects of treatment), but rather asks about people's average expected course after treatment, "all else being equal."
* Answering the question in a careful, data-based manner can be of scientific as well as practical utility. In his widely cited review, Emrick (1974) sought to establish average alcoholism treatment outcome rates, with which specific observed outcomes might be compared. Such comparisons are not straightforward, because outcomes can vary for many different reasons. Nevertheless, we believe that Emrick's quest was worthwhile, to give some reasonably objective basis for an answer to this question.
* Common metrics are needed in order to combine and compare outcome data sources. The lack of consensus outcome measures has been a persistent problem in this field, and a principal source of frustration for meta-analysts.
* Because of the complexity of outcomes, and the fact that different dependent variables may portray outcomes quite differently (Miller, a; Westerberg et al., 1998), no single metric is sufficient. A fair answer is one that characterizes outcomes in several different ways.

Method

Outcome measures

If there is any point on which most parties seem to agree with regard to outcomes, it is that alcohol consumption is of central concern in judging alcoholism treatment effectiveness. Drinking, however, is not the only relevant aspect to consider in studying recovery. It is widely recognized in Alcoholics Anonymous, for example, that sobriety involves far more than abstinence, encompassing mental, emotional, physical and spiritual dimensions of outcome. Yet, it is fair to say that if alcoholism treatment does not change drinking behavior, it has not succeeded. For this reason, nearly every outcome study reports, in some form, changes in drinking behavior.
It has also become reasonably clear that one should consider at least two dimensions of drinking outcomes: frequency and intensity (Project MATCH Research Group, 1993). Although more emphasis is often given to frequency (e.g., percent days abstinent), how often a person drinks is not the whole story, even with alcohol dependent clients. These two different aspects of alcohol use were evident in Cahalan's pioneering work on the quantity/frequency measures that are now widely used in survey research (Cahalan, 1970; Cahalan et al., 1969).

Abstinence. Frequency of drinking is usually discussed as its inverse, abstinence. One of the crudest measures of outcome is the proportion of cases maintaining perfect continuous abstinence from alcohol during a specified period. An obvious shortcoming of this metric is that recurrence of addictive behaviors is exceedingly common (Brownell et al., 1986; Hunt et al., 1971), even among those who ultimately maintain stable abstinence. Definitions of abstinence differ, allowing for various levels of slippage before incurring a judgment of relapse or treatment failure. The definition, method and care taken in ascertaining abstinence are sometimes left unspecified. Over the past decade, with the widespread use of timeline follow-back interviewing (Sobell and Sobell, 1995), abstinence has come to be more carefully quantified in terms of continuous variables: the percentage of days abstinent versus drinking, time to first drink or heavy drinking day, or longest duration of abstinence. Different metrics yield different answers.

Intensity. The intensity of drinking (when it does occur) has been assessed in a wide variety of ways. Some have quantified the total amount of alcohol consumed within a certain assessment window of time, using such outcome variables as average number of drinks per day or per week (e.g., Miller et al., 1992).
Some have designated a threshold for "heavy drinking days" and have counted the number of these that occur (e.g., Project MATCH Research Group, 1993, 1997). Most studies now convert alcohol consumption into a standard drink unit, the size of which varies considerably from one study or nation to another (Miller et al., 1991). Following Cahalan's lead, the total amount of alcohol consumed during a given month is often determined by multiplying the frequency of drinking (number of drinking days) by the average amount consumed on drinking days. Some investigators have used as their outcome metric "drinks per drinking day" (DDD; e.g., Project MATCH Research Group, 1997). A serious problem with DDD, however, is what to do with abstainers, who had no drinking days. DDD makes little sense for abstainers, in that a true zero value is logically impossible. Yet in order to have a numeric entry for each case, researchers often assign a zero DDD value to abstainers (e.g., Miller et al., 1996; Project MATCH Research Group, 1997). This creates a misleading impression of the intensity of alcohol use by drinkers. In a sample with 75% abstainers, for example, an average of 5 DDD would mean that the drinkers (excluding the abstainers) were actually consuming 20 drinks on a typical drinking day. As will be discussed shortly, however, intensity measures can be interchanged through relatively simple calculation procedures.

Mortality. In cancer treatment, the rate of survival versus mortality during various lengths of time is a common outcome measure. Death rates are also often reported in alcoholism treatment research, and represent another way to characterize outcomes.

Problems. Alcohol-related problems have been defined and measured in many ways, complicating comparisons across studies. Whatever the method, however, continuous measures of alcohol-related problems or dependence symptoms can be used to compute percent reductions at specific intervals after treatment.

Other dimensions. There are many other ways in which outcomes can be characterized. For example, studies focused on alcoholism treatment may or may not reveal its impact on the use of other psychoactive substances.
Changes in concomitant psychological or family problems are sometimes reported. There has been commendable attention, more recently, to general quality of life (the rest of sobriety). Yet there has been so little consistency in the measures used to report such ancillary outcomes, that summaries across studies are nearly impossible at present.

Converting outcome measures

As discussed above, a problem well known to any meta-analyst in the alcohol field is the inconsistency of outcome measures. In preparing this review, we assembled a spreadsheet of multisite studies, showing the outcome measures reported in each. The result looked rather like a low-budget chocolate chip cookie. Clusters of values (the "chocolate chips") dotted a grid of mostly empty space. Part of our challenge was to increase the density of chocolate chips, filling in missing measures wherever possible by estimation from other available data.

Measures of frequency of drinking. In order to yield common outcome indices, drinking measures can often be converted into other metrics so that variables not reported in a treatment study can, in some cases, be computed or estimated from values that are reported. For example, the proportion of drinking days (PDD) is simply 1 - PDA (proportion days abstinent). Thus, if one knows the mean proportion of abstinent days for an entire sample (PDAt), the sample size (N), and the number of individuals who were totally abstinent during an assessment period ($n_{\text{abstainers}}$), then one can estimate PDAd, the mean proportion of abstinent days among drinkers (who did not abstain; $n_{\text{drinkers}}$). PDA for total abstainers is by definition 1.0 (100%), therefore:

$$\mathrm{PDA}_d = \frac{(\mathrm{PDA}_t \cdot \mathrm{Days} \cdot N) - (n_{\text{abstainers}} \cdot \mathrm{Days})}{n_{\text{drinkers}} \cdot \mathrm{Days}}$$

where Days is the number of days in the assessment period. This reduces to the simpler formula:

$$\mathrm{PDA}_d = \frac{(\mathrm{PDA}_t \cdot N) - n_{\text{abstainers}}}{n_{\text{drinkers}}}$$

which can also be solved for PDAt:

$$\mathrm{PDA}_t = \frac{(\mathrm{PDA}_d \cdot n_{\text{drinkers}}) + n_{\text{abstainers}}}{N}$$

For the proportion of drinking days (PDD), the formulae for conversion are still simpler:

$$\mathrm{PDD}_d = \frac{\mathrm{PDD}_t \cdot N}{n_{\text{drinkers}}} = \frac{\mathrm{PDD}_t}{\text{proportion of drinkers}}$$

and

$$\mathrm{PDD}_t = \frac{\mathrm{PDD}_d \cdot n_{\text{drinkers}}}{N} = \mathrm{PDD}_d \cdot \text{proportion of drinkers}$$

and of course

$$\mathrm{PDD}_t = 1 - \mathrm{PDA}_t \quad \text{and} \quad \mathrm{PDA}_d = 1 - \mathrm{PDD}_d$$

Measures of quantity of drinking. All of the above measures pertain to the frequency of drinking. The other traditional outcome dimension has to do with the quantity or intensity of drinking. Beginning with drinks per drinking day (DDD), it is possible to move back and forth between this metric for the total sample (DDDt) and for drinkers only (DDDd). For example:

$$\mathrm{DDD}_d = \frac{\mathrm{DDD}_t \cdot \mathrm{Days} \cdot \mathrm{PDD}_t \cdot N}{n_{\text{drinkers}} \cdot \mathrm{Days} \cdot \mathrm{PDD}_t}$$

which reduces to:

$$\mathrm{DDD}_d = \frac{\mathrm{DDD}_t \cdot N}{n_{\text{drinkers}}} = \frac{\mathrm{DDD}_t}{\text{proportion of drinkers}}$$
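The conversions between whole-sample and drinkers-only values described above can be sketched in Python. This is a minimal illustration rather than the authors' own procedure: the function names are hypothetical, the 100-client sample in the first example is invented for illustration, and the second example reuses the figures quoted earlier in the text (75% abstainers and a whole-sample mean of 5 DDD).

```python
def proportion_of_drinkers(n_total, n_abstainers):
    """Fraction of the sample that drank at all during the assessment period."""
    if n_total <= 0 or not 0 <= n_abstainers < n_total:
        raise ValueError("need n_total > 0 and 0 <= n_abstainers < n_total")
    return (n_total - n_abstainers) / n_total


def pda_among_drinkers(pda_whole_sample, n_total, n_abstainers):
    """Drinkers-only PDA; each total abstainer contributes a PDA of 1.0."""
    n_drinkers = n_total - n_abstainers
    if n_drinkers <= 0:
        raise ValueError("sample contains no drinkers")
    return (pda_whole_sample * n_total - n_abstainers) / n_drinkers


def pda_for_whole_sample(pda_drinkers, n_total, n_abstainers):
    """Inverse conversion: whole-sample PDA from the drinkers-only PDA."""
    n_drinkers = n_total - n_abstainers
    return (pda_drinkers * n_drinkers + n_abstainers) / n_total


def ddd_among_drinkers(ddd_whole_sample, prop_drinkers):
    """Drinkers-only DDD when abstainers were entered with a DDD of zero."""
    return ddd_whole_sample / prop_drinkers


# Hypothetical sample: 100 clients, 25 total abstainers, whole-sample PDA of 0.81.
print(round(pda_among_drinkers(0.81, n_total=100, n_abstainers=25), 3))  # 0.747

# Figures from the text: 75% abstainers and a whole-sample mean of 5 DDD.
print(ddd_among_drinkers(5, proportion_of_drinkers(100, 75)))  # 20.0
```

Proportions are expressed on a 0 to 1 scale throughout; percentages would need to be divided by 100 first.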

Solving for DDDt, this becomes:

$$\mathrm{DDD}_t = \mathrm{DDD}_d \cdot \text{proportion of drinkers}$$

and

$$\mathrm{DDD}_d = \frac{\mathrm{DDD}_t}{\text{proportion of drinkers}}$$

As noted above, however, DDDt is an odd metric because it has no true zero value. We therefore recommend reporting DDD for drinkers only (DDDd). If mean quantity of consumption is to be represented for an entire sample, it is more meaningful to report an average amount of drinking per unit time, standard drinks per day (DPDt), which can also be computed just for drinkers (DPDd). When only DDD has been reported, DPD can be estimated. Some formulae for moving back and forth between DDD and DPD values are:

$$\mathrm{DPD}_t = \mathrm{DDD}_t \cdot \mathrm{PDD}_t$$

and

$$\mathrm{DPD}_d = \frac{\mathrm{DPD}_t}{\text{proportion of drinkers}}$$

and

$$\mathrm{DPD}_d = \frac{\mathrm{DDD}_t \cdot \mathrm{PDD}_t}{\text{proportion of drinkers}}$$

It is noteworthy that these estimates closely approximate but may not equal the same variables as computed directly from timeline data. For example, when PDAd was computed from Project MATCH outpatient data, the actual value over 12 months was 69.25%, whereas when estimating from PDAt, we computed 68.0%. Whenever raw data are available, it is preferable to compute outcome variables directly. When only summary data are reported, however, the above procedures can be used to fill in missing metrics with quite reasonable estimates.

Estimates from seven multisite studies

The availability of a number of multisite trials provided an opportunity to develop estimates of the average outcomes of treatment. Following the principles described above, we identified longitudinal studies of alcoholism treatment that (1) were multisite, including treatment programs in at least three different geographic locations; (2) evaluated a treatment (not brief intervention) for alcohol use disorders; (3) reported some quantified measure of drinking outcomes; (4) followed clients for at least 12 months from intake; and (5) obtained follow-up data for at least 60% of the sample at 12 months.
The seven studies that met these criteria were the Veterans Affairs (VA) cooperative trials of lithium (Dorus et al., 1989) and disulfiram (Fuller et al., 1986), the Relapse Replication and Extension Project (RREP; Lowman et al., 1996), the Project MATCH outpatient (opt) and aftercare (aft) studies (Project MATCH Research Group, 1993, 1997), the VA study of treatment for substance abuse (VAST; Ouimette et al., 1997) and the Rand report (Polich et al., 1981). All seven were conducted in the United States. Study and sample characteristics for the seven studies are summarized in Table 1.

How representative are these seven studies of alcoholism treatment in the United States? Generalizability was one reason for limiting our analyses to multisite rather than single-site studies. Four of the studies were randomized clinical trials (LITHIUM, DISULFIRAM and the two MATCH studies) and three were uncontrolled studies of treatment-as-usual (RREP, VAST and RAND). Both the lithium and disulfiram trials, however, involved treatment-as-usual delivered to all participants, with only the medication controlled (which in both studies exerted no overall effect on outcomes). The MATCH treatments were carefully controlled, and in the aftercare study were delivered immediately after participants had completed intensive treatment-as-usual.

TABLE 1. Sample characteristics for the seven multisite studies
Columns: Study; Length of follow-up (months)a; Gender (% male); Treatment setting; Minority (%); Alcohol dependent (%)
LITHIUM: treatment setting IP+OP
DISULFIRAM: treatment setting IP or OP; minority 46%; alcohol dependent NR
RREP: treatment setting IP or OP; minority 33%; alcohol dependent NR
MATCH opt: treatment setting OP
MATCH aft: treatment setting IP/IO + OP
VAST: treatment setting IP+OP
RAND: treatment setting IP or OP; minority 25%; alcohol dependent NR
Notes: IP = inpatient; OP = outpatient; IO = intensive outpatient treatment; NR = not reported; opt = outpatient; aft = aftercare. aLength of follow-up in months after intake.

The VA collaborative trial of lithium (LITHIUM). Dorus et al. (1989) assessed treatment with lithium carbonate for 457 male alcoholics from seven VA medical center inpatient programs. In addition to detoxification and unspecified inpatient treatment of at least 30 days, all were offered weekly outpatient visits for 13 weeks and biweekly visits for the remainder of a year, and were encouraged to seek additional treatment for alcoholism and to attend Alcoholics Anonymous. Lithium (vs placebo) was found to exert no specific effect on outcomes. At 12-month follow-up, 280 participants (63%) were reassessed, completed breath tests and had collateral interviews. Several drinking outcome variables were assessed, including the number of clients who were abstinent, the number of drinking days in the preceding 4-week period, and days until first drink. Abstinence was defined as no drinking, based on reports from both the participant and the collateral, as well as no positive results on a breath test. No definition of moderation was included. Alcohol problem severity was evaluated via the Addiction Severity Index (ASI; McLellan et al., 1992) and dependence was measured using the Diagnostic Interview Schedule (DIS; Robins et al., 1981).

The VA collaborative study of disulfiram (DISULFIRAM). As in the lithium study, participants in this trial received standard (primarily group) alcoholism treatment in nine VA alcoholism programs (seven inpatient and two outpatient) and, in addition, were randomized to receive disulfiram, placebo or no medication (Fuller et al., 1986). Relative to placebo, disulfiram was found to exert no overall effect on treatment outcomes, with some difference observed when analyses were limited to cases showing high medication compliance. Weekly aftercare was encouraged for 6 months and biweekly visits for an additional 6 months. Follow-up interviews (including blood and urine samples) were completed every 8 weeks throughout the 12 months following intake, and collateral interviews were also conducted. Outcome data were obtained for 90% of the sample at 12 months. Outcomes were judged from three drinking measures (complete abstinence, time to first drink and percent drinking days) and two psychosocial measures (employment status and social stability). Any indication of drinking (from self-report, collateral report, or blood or urine tests) excluded a client from the complete abstinence outcome status.
Moderate drinking was not evaluated in this study.

The Relapse Replication and Extension Project (RREP). The RREP study (Lowman et al., 1996) followed 563 clients receiving treatment-as-usual at three sites. The study included multiple follow-up points and high follow-up rates: 544 participants (97%) were interviewed at 2 months, 539 (96%) at 4 months, 518 (92%) at 6 months, 514 (91%) at 8 months, 507 (90%) at 10 months and 469 (83%) at 12 months posttreatment. Breath tests were conducted at all assessments. In addition, blood and urine tests were performed at the 6- and 12-month intervals. Quantity and frequency were measured via the Form-90 interview (Miller, 1996b). Drinking-related dependent variables included time to first drink, time to first heavy drinking day, percent days abstinent and drinks per drinking day. Abstinence was defined as no alcohol use during each follow-up interval, but no definition of moderation was specified. The Alcohol Dependence Scale (ADS; Skinner and Allen, 1982) and DIS were completed to assess symptoms of dependence.

Project MATCH (MATCH). Project MATCH (1997) included outpatient (opt) and aftercare (aft) samples that were defined and analyzed as two separate studies: 952 individuals participating in outpatient treatment for alcohol problems at five outpatient treatment centers, and 774 individuals who had completed inpatient or intensive outpatient treatment in community programs just prior to entering additional MATCH treatment at one of five aftercare sites. Consistent with the approach of the Project MATCH Research Group, we treated the two study arms as separate studies because (1) all aftercare clients had just received intensive treatment-as-usual, whereas outpatient clients received only the MATCH treatments, (2) the aftercare sample reported substantially more severe problems on many dimensions and (3) overall outcomes and matching effects were different for the two samples.
MATCH clients were assessed at multiple time points, with the major outcome assessment coming at 12 months after treatment termination (15 months from intake) for both samples. At that assessment, 92% of the outpatient sample and 93% of the aftercare sample were reassessed. Both arms of the study included collateral interviews and objective breath, blood and urine tests that were completed by over 75% of each sample. Quantity and frequency of use were measured in both arms of the study using Form 90, the Alcohol Use Inventory (AUI; Wanberg et al., 1977) and the ASI. Two drinking-related dependent variables were assessed: percent days abstinent and drinks per drinking day. Abstinence was defined as no use of alcohol, and a discrete outcome variable provided a composite measure of outcome with the following options: (1) no drinking, (2) moderate drinking without problems, (3) heavy drinking or recurrent problems and (4) both heavy drinking and recurrent problems. The Drinker Inventory of Consequences (DrInC; Miller et al., 1995b) was used to assess alcohol-related problems and the Structured Clinical Interview for DSM-III-R (SCID; Spitzer et al., 1990) to assess alcohol dependence.

VA study of treatment for substance abuse (VAST). Ouimette et al. (1997) compared the effectiveness of 12-step and cognitive-behavioral treatment for 3,698 clients from 15 substance abuse treatment programs at United States VA medical centers. At the 12-month follow-up interval, 3,018 (81.6%) were reassessed and a subsample (230) provided breath, blood and urine tests. Drinking-related dependent variables were absence of alcohol dependence syndrome, absence of substance-related problems, percent improved, percent deteriorated, percent abstinent and average ounces of alcohol consumed per day. A conservative definition of abstinence was used, which was no alcohol consumption, no illicit drug use and no problems resulting from alcohol or drug use.
A definition of moderation was also included: consumption of 3 ounces or less on a typical drinking day, no illicit drug use and no problems from alcohol or drug use. Alcohol-related problems were assessed using 18 items sampling several domains including health, financial, occupational, intra- and interpersonal and residential difficulties. Symptoms of dependence were measured using nine items reflecting DSM-III-R (American Psychiatric Association, 1987) criteria.

The RAND reports (RAND). One of the earliest studies of alcohol treatment outcomes was reported by the Rand Corporation (Armor et al., 1978; Polich et al., 1981). Although these reports stirred public controversy over "controlled drinking," the study was large and well conducted, encompassing 1,340 clients from eight programs. Follow-up assessments through 4 years included collateral interviews and breath tests for some clients. Quantity-frequency measures included ounces of alcohol consumed daily; number of days of use in the last month of beer, wine and distilled spirits; and the amount of each beverage consumed on a typical drinking day. The drinking outcome variables in the study were the average ounces of alcohol consumed per day, the number of days of use in the past month, the percentage of clients abstinent at each follow-up interval and time to first drink. The study included an assessment of behavioral impairment, defined as the frequency of experiencing 12 alcohol-related problems on a 0-3 scale. These problems were used to create a behavioral impairment index, changes in which were examined at each follow-up interval. The study provided detailed operational definitions of both abstinence and moderation.

Other multisite studies were considered but did not meet review criteria with regard to collection of follow-up data. In a multisite Midwestern-state study (Hartmann and Wolk, 1996), for example, only 45% of clients were retained at 12-month follow-up. A European multisite trial of acamprosate similarly retained only 44% of cases at 12 months (Paille et al., 1995).
A Schick study (Smith et al., 1991) reported an 83% follow-up rate at 12 months for Schick clients and 82% follow-up with a multisite comparison sample. The number actually interviewed, however, comprised only 27% and 2% of the source samples from which they were drawn. Even when the number of clients contacted at 6-month follow-up is used as the denominator, only 45% and 3% were completed at 12 months.

Review procedure

For each of the seven multisite studies we began with the dependent variables reported by the authors. When sufficient information was provided, we applied the above transformation formulas to estimate outcome variables that were not directly reported. For the MATCH and RREP samples, access to the original data sets allowed direct computation of some variables not initially reported. Whenever possible, direct computation was preferred to estimation via transformation formulas. When different treatment approaches or programs were compared, we pooled them. Then we averaged across studies the available values for each variable, giving equal weight to all studies (rather than weighting by sample size). Weighted averages differed little from simple averages, and in no case would significantly alter conclusions. In two cases, one study yielded findings markedly different from all others, and these values are noted below as outliers (in Table 4) and excluded to avoid misleadingly skewing mean values.

Results

The seven studies together comprise 8,389 clients seeking treatment for alcoholism, a truly broad clinical sample. The percentage of clients with known outcomes (including deceased) at 1-year follow-up ranged from 61% to 94%, with a mean (83%) high enough to provide reasonable confidence in the representativeness of follow-up data.

Abstinence

Table 2 reports outcome data from the seven multisite studies.
The first of these is the percentage of cases with continuous abstention for at least 12 months at follow-up (%Abs). The only significant complexity was posed by the RAND study, in which follow-up was conducted at 18 months but not at 12 months. For this study, we report the percentage of cases with at least 12 months of continuous abstinence during the 18-month follow-up period. Total abstinence rates range from 17% (RAND study) to a high of 35% (MATCH aft). Across all seven studies, an average of 24% of clients maintained continuous abstinence for 12 months or more. This is consistent with the classic report of Hunt et al. (1971), who found a steep decline in abstinence during the first 3 months, and a gradual leveling off to a 12-month abstinence rate near 20%.

TABLE 2. 12-month drinking outcomes in multisite treatment trials
Columns: Study; N; %FU; %Abs; %Moda; PDAt; PDAd; DDDt; DDDd; DPDt; DPDd
Rows: LITHIUM; DISULFIRAM; RREP; MATCH opt; MATCH aft; VAST; RAND; Total (N = 8,389); Mean
Notes: Figures with asterisks (*) were computed from other values available in this table. All other values were obtained directly from published reports or from study authors. N = total number of clients; %FU = percentage of cases completed at 12-month follow-up, including documented deaths in numerator; %Abs = percentage of cases continuously abstinent for 12 months; %Mod = percentage of cases classified as moderate drinkers at 12 months; PDA = percent days abstinent; DDD = drinks per drinking day; DPD = drinks per day; t = total sample; d = drinkers only; NR = not reported; opt = outpatient; aft = aftercare. aModerate drinking with no adverse consequences; bAt least 12 months of continuous abstinence during 18-month follow-up period.

Moderate asymptomatic drinking

Less often reported is the percentage of cases maintaining moderate asymptomatic drinking (without negative consequences or dependence). In two outpatient multisite trials, the figures were similar: 12.4% (MATCH opt) and 11.6% (RAND). A lower rate of 7.3% was reported in the MATCH aftercare sample, which had just completed intensive treatment prior to entering the study. The average rate of about 10% corresponds well with reports from other samples (e.g., Miller et al., 1992; Vaillant, 1995). It is noteworthy that using only one criterion, either moderate drinking (with or without symptoms) or asymptomatic drinking (regardless of amount), will usually result in a higher reported proportion of cases than when both criteria are required.

Frequency of drinking

Frequency data (PDA, the mean percentage of abstinent days) were available for five of the trials. The MATCH outpatient study yielded the lowest estimate (74%) and RREP the highest (85%) when the total sample (PDAt) was used (including abstainers) to calculate PDA, with an average of 81%. A slightly different picture emerges when PDA is calculated only for those who had consumed alcohol during the 1-year follow-up period (PDAd).
Given that for total abstainers the value is always 100%, PDA will be lower when calculated with abstainers excluded. For drinkers only, PDA ranged from 69% to 81%, averaging 75%. The percentage of drinking days, of course, can be calculated by subtracting these PDA values from 100%; even those who continued to drink after treatment drank on only 1 day in 4 (25%), on average. Baseline PDA values for studies are reported in Table 3. With this additional information it is possible to compute the improvement in PDA for the total sample, as well as for those who continued to drink. For the total sample, PDA increased by 145%, to 245% of baseline level. Even those who drank alcohol during the follow-up year increased their days of abstinence by 128%. That is, drinking days dropped from 63% before treatment, to 18% in the total sample, and to 25% among continuing drinkers.

TABLE 3. Percentage of change from baseline levels in percent days abstinent (PDA) and drinks per drinking day (DDD)

Columns: Study; baseline PDA (%); increase in PDA, Total (%) and Drinkers (%); baseline DDD (no.); reduction in DDD, Total (%) and Drinkers (%).
Rows: LITHIUM, DISULFIRAM, RREP, MATCH opt, MATCH aft, VAST, RAND, Mean. Apart from NR entries, the cell values are not legible in this transcription.

Notes: Baseline PDA = percent days abstinent for total study sample at baseline; baseline DDD = drinks per drinking day for total study sample at baseline; Total = percentage of change in the total study sample; Drinkers = percentage of change among those who continued to drink at follow-up; NR = not reported; opt = outpatient; aft = aftercare.

Quantity of alcohol consumption

Frequency of drinking tells only part of the story. Additional data regarding quantity of consumption (as DDD, drinks per drinking day) were available from five of the seven trials. We used a standard drink unit of 0.5 ounces (15 ml) absolute alcohol for these calculations (Miller et al., 1991).
When DDD is computed for the entire sample (DDDt), findings consistently suggest an average of 5-6 standard drinks per drinking day (Table 2). As discussed above, however, the DDD metric is meaningless for total abstainers who, by definition, have no drinking days. Removing abstainers from analyses, therefore, provides a clearer picture of how much clients are drinking on days when they do drink (DDDd). This value ranged from 6 to 10, averaging about 7.5 drinks per drinking day, a value that is about one third higher than when abstainers are included. Clients who did drink during the follow-up year nevertheless showed an average 57% reduction in DDD (Table 3). Another way to characterize quantity of drinking is by the total volume consumed per unit time (e.g., average DPD [drinks per day]). Whenever PDA is greater than zero (i.e., when there have been some abstinent days), DDD will be higher than DPD. The larger PDA is, the greater the discrepancy. At 12-month follow-up, DPD for the total sample (DPDt) averaged about one standard drink per day (Table 2). Removing total abstainers from this calculation (DPDd) increased the average to 1.4 DPD, as compared with 11 DPD before treatment. This is equivalent to 77 standard drinks per week before treatment, and 10 per week for drinkers afterward, an 87% reduction.
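The relationships among the frequency and quantity measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the review's own conversion equations (which are not reproduced in this section). All function names are hypothetical, and scoring abstainers as zero on the quantity measures is an assumption made here because it is consistent with the reported averages.

```python
"""Illustrative arithmetic linking PDA, DDD and DPD (hypothetical helper names)."""


def pda_total(pct_abstinent: float, pda_drinkers: float) -> float:
    """PDA for the whole sample, given PDA among drinkers only.

    Continuous abstainers have 100% abstinent days, so the total-sample
    value is a weighted average of 100 and the drinkers-only value.
    """
    share_abstinent = pct_abstinent / 100.0
    return share_abstinent * 100.0 + (1.0 - share_abstinent) * pda_drinkers


def total_from_drinkers(value_drinkers: float, pct_abstinent: float) -> float:
    """Total-sample mean of DDD or DPD, assuming abstainers are scored as zero."""
    return value_drinkers * (1.0 - pct_abstinent / 100.0)


def drinks_per_day(ddd: float, pda: float) -> float:
    """DPD for one person: DDD scaled by the share of days with drinking."""
    return ddd * (100.0 - pda) / 100.0


def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from a baseline value to a follow-up value."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Figures taken from the text: 24% abstinent, PDAd = 75%, DDDd = 7.5,
    # DPDd = 1.4 at follow-up versus 11 DPD before treatment.
    print("PDA, total sample:", round(pda_total(24, 75), 1))
    print("DDD, total sample:", round(total_from_drinkers(7.5, 24), 1))
    print("DPD, total sample:", round(total_from_drinkers(1.4, 24), 2))
    print("Drinks per week before/after:", 7 * 11, round(7 * 1.4, 1))
    print("Reduction in DPD (%):", round(percent_reduction(11, 1.4), 1))
```

With the figures quoted in the text, these helpers give a total-sample PDA of about 81%, a total-sample DDD between 5 and 6, a total-sample DPD of about one drink per day, and a reduction of roughly 87%, in line with the values reported above. The `drinks_per_day` identity holds for an individual; applied to averages drawn from different subsets of studies it will only be approximate.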

TABLE 4. Other outcome measures at 12 months

Columns: Study, Survival (%), Problems (% reduction), Problem measure.
LITHIUM: Addiction Severity Index (a)
DISULFIRAM: Unemployment (problem value marked *)
RREP: Survival NR; Problems NR; Drinker Inventory of Consequences (b)
MATCH opt: Drinker Inventory of Consequences
MATCH aft: Drinker Inventory of Consequences
VAST: Unemployment (problem value marked *)
RAND: Survival 93.2*; Problems 63.3; % with consequences or symptoms
Means are given for all studies and with outlier values removed. The remaining cell values are not legible in this transcription.

Notes: *Outlier values excluded from averages; NR = not reported; opt = outpatient; aft = aftercare. (a) McLellan et al., 1992; (b) Miller et al., 1995b.

Mortality

What percentage of people treated for alcohol use disorders survived to the 1-year follow-up? Table 4 reflects 12-month mortality rates ranging from 1.0% to 6.8%. With the exception of the oldest (RAND) study, which was an outlier at 93.2%, 1-year survival rates clustered tightly around a mean of 98.5%.

Alcohol-related problems

Drinking and mortality data provide only part of the outcome picture. To what extent do alcohol-related problems decrease after treatment? Here there is substantial variation in measures, posing analytic challenges. Two studies (DISULFIRAM, VAST) reported only a single consequence, unemployment; the other five used scales that assess the extent of alcohol-related problems (see Table 4). After removing the two studies reporting only unemployment, drinking-related problems were reduced by almost 60% during the year after initiation of treatment. This finding mirrors the frequent report that improvement after treatment is not limited to reduction in drinking, but is evident on many other dimensions of health and social functioning.

Discussion

Drawing together the outcomes for more than 8,000 people treated for alcoholism, we offer the following narrative summary. After a single treatment episode, roughly one client in four will abstain from alcohol throughout the first year, which is the period of highest risk for return to drinking.
In addition, about 1 in 10 will moderate the quantity and frequency of their drinking to remain free of alcohol-related problems in this same 1-year period. In combination, these unambiguously positive outcomes account for about one third of treated cases. Mortality during this first peak-risk year is about 1.5%. The remaining two thirds of treated clients continue to have at least some periods of heavy drinking in the first year, but outcome data reflect substantial improvement, a fact often overlooked. After treatment, even those who do drink are abstinent on 3 days out of 4. Stated another way, they go from drinking on 2 days out of 3 on average before treatment, to 1 day out of 4 afterward. On days when they do drink, the average amount of alcohol they consume is less than half what it was before treatment, albeit still heavy. The combined effect of these reductions in frequency and quantity is substantial. Even for those who continue to drink, alcohol consumption drops by more than 87% on average in the year after treatment (from an average of 77 standard drinks per week to 10). Clearly, this is enough to result in substantial reduction of health and social problems related to drinking. Alcohol-related problems are also reduced by 60% after treatment. This substantial improvement in clients who do not maintain perfect abstinence or moderation is obscured by any simplistic classification of cases as "successful" versus "relapsed." For almost any chronic health problem, these would be excellent outcomes: more than 98% alive, of whom one third are totally symptom-free and the rest improved by an average of 57-87%, depending upon the metric. It is worth noting, too, that these are outcomes after a single treatment event. On average, two thirds of clients are not symptom-free after a single treatment episode; therefore, it is common for clients to seek further help.
The number of prior treatment events has been found to be unrelated to the outcome of a subsequent course of treatment (i.e., the likelihood of benefiting from treatment may not be substantially altered by having had prior treatments; e.g., Miller et al., 1996).

We intentionally sought average outcome statistics, without regard to differences among various treatment methods. While many outcome studies have found no substantial differences among compared treatment approaches (e.g., Project MATCH, 1997), it is also the case that about half of all such studies have reported significant differences (Miller et al., 1998). The averages presented here should not be interpreted as representing the outcomes of any and all treatment methods and programs. By virtue of the law of averages, some approaches will have better results, and some worse, than those summarized in this article. We believe it is nevertheless helpful to have these averages against which to compare specific program or study outcomes. Some have claimed, for example, that the clinical outcomes in Project MATCH were extraordinarily positive, and have therefore searched for reasons why MATCH participants fared so uncommonly well. In fact, Tables 2 and 4 suggest that client outcomes in Project MATCH were not far from the averages for multisite studies of alcoholism treatment, with some measures above and some below the respective means. For fair comparisons to be made, however, outcomes should be documented with the same degree of methodological care taken in these multisite trials. In these studies, drinking data were collected via carefully structured protocols, usually in person (although VAST relied primarily on mailed questionnaires). Six of the seven studies checked the validity of self-report via biochemical and/or collateral interview verification. The average rate of completed follow-up was 83%, with five of seven studies over 80%. Follow-up extended for at least 12 months.

The outcomes presented here are also averaged across all people entering treatment in these multisite studies. Most of the trials comprised clients with a broad range of problem severity and other personal characteristics. Our use of the term "alcoholism" in this article is intended to mirror Jellinek's (1960), who thereby referred to the full range of alcohol-related problems and not merely the severe end of the continuum. The commonsense idea that greater problem severity equals poorer prognosis was not upheld in Project MATCH (1997), and the relationship of severity to outcome is complex (Miller et al., 1992; Polich et al., 1981). The average outcomes presented here represent a best guess, all else being equal, but prognosis is influenced by attributes of the particular client, treatment and clinician.

We also emphasize that all seven studies meeting inclusion criteria were evaluations of American treatment programs, and findings cannot be assumed to generalize to other nations. In the multisite European acamprosate trial (Paille et al., 1995), for example, the aggregate outcomes lay outside and below the range observed in the seven U.S. studies, as summarized in Table 2. The percentage of continuously abstinent cases reported by Paille and colleagues was 11.3 (U.S. range: 16.7 to 35.0, with a mean of 24.1). PDAt was 54.2%, as compared with a U.S. range of 74.0 to 84.6% (U.S. mean = 81.4%).
Some of the computations to convert study findings into the standard outcome variables used in this review require that both frequency and quantity of drinking be reported. We recommend that future alcoholism treatment outcome studies always include at least the following indices: (1) the frequency of drinking (PDA or PDD); (2) the quantity of drinking (DDD or DPD); and (3) the percentage of clients who remained completely and continuously abstinent. When at least these three measures are provided, all of the other outcome indices in this review can be computed even if they are not directly reported. More recently, broad adoption of timeline follow-back methodology has increased the consistency of how consumption outcome data are collected, yielding a degree of detail that allows for more sophisticated measures of drinking behavior.

The measurement of alcohol-related problems is less well developed and standardized. This required us to lump together some diverse measures in order to estimate impact on alcohol problems. As harm reduction strategies are implemented and tested, it becomes all the more important to assess alcohol-related harm apart from consumption measures, with which it is at best moderately correlated.

We caution that one cannot confidently assert from these averages that treatment caused the observed outcomes. Causal inference is usually drawn from controlled trials, of which there are hundreds in the alcohol field (Miller et al., 1998). We have sought only to characterize typical client outcomes after treatment, and we regard the picture presented by these large studies to be a hopeful one indeed. The vast majority of treated clients showed major reductions in their drinking. Certainly the data warrant more optimism than the 1977 estimate of successful outcomes in only one out of four cases at 1 year after alcoholism treatment (Costello et al., 1977).
Our conclusions lie closer to those of Emrick (1974), who relied mostly on studies with shorter follow-up intervals, and who judged that one third abstain and another one third are improved. Our findings do support that one third remain free of alcohol problems throughout the peak-risk period of the 12 months after treatment. The remaining two thirds, as a group, clearly show major reductions in alcohol use and problems, albeit with large individual variability. To arrive at any single figure of "success," one must dichotomize outcomes by drawing a line at what is regarded to be enough improvement, and outcomes in the management of chronic illnesses are seldom judged in such a simplistic manner. If pressed, however, we believe that the data justify, as a conservative estimate, the "rule of thirds": that a year after a single treatment event, one third, on average, remain in full remission and at least another third evidence substantial improvement. The latter assertion is supported by the magnitude of average improvement shown by those who continued to drink after treatment. If all drinkers reduced their alcohol consumption by 87%, on average, it seems a reasonable assumption that at least half of them are substantially improved. To yield a mathematical average reduction of 87%, in fact, no fewer than three fourths of drinkers must have reduced their alcohol consumption by more than 50% and, conversely, over half must have reduced their drinking by at least 75%.

Last, we caution against misuse of these data in formulating treatment policy decisions. Using the narrowest of definitions, the argument could be made that the chances of successful outcome (total abstinence) are only 25% and therefore treatment expenditures are unwarranted. At the opposite extreme, a success rate of two thirds or higher might be claimed, to argue for the expansion of treatment services.
Neither is an appropriate application of these average outcome estimates for people who entered treatment. There are other avenues besides treatment whereby drinking and related problems can be reduced, and the relative merits of treatment versus other approaches cannot be judged from the analyses presented in this article.
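The claim above that an 87% average reduction implies that most continuing drinkers improved substantially can be checked with a simple averaging bound. The sketch below is an illustration under stated assumptions, not the authors' own calculation: it treats 87% as the mean of individual proportional reductions and assumes that no individual can reduce consumption by more than 100%. The function name is hypothetical.

```python
def min_share_at_or_above(mean_reduction: float, threshold: float,
                          max_reduction: float = 1.0) -> float:
    """Smallest share of cases whose reduction must reach `threshold`.

    If a share p reaches the threshold (each case at most max_reduction)
    and everyone else falls below it, the mean cannot exceed
    p * max_reduction + (1 - p) * threshold. Rearranging gives
    p >= (mean - threshold) / (max_reduction - threshold).
    """
    if not threshold < max_reduction:
        raise ValueError("threshold must be below max_reduction")
    bound = (mean_reduction - threshold) / (max_reduction - threshold)
    # A share is a proportion, so clip the bound to the range 0..1.
    return min(1.0, max(0.0, bound))


if __name__ == "__main__":
    for cutoff in (0.50, 0.75):
        share = min_share_at_or_above(0.87, cutoff)
        print(f"reduction of at least {cutoff:.0%}: at least {share:.0%} of drinkers")
```

Under these assumptions the bound works out to about 74% of drinkers for a 50% reduction and about 48% for a 75% reduction, close to the "three fourths" and roughly "half" cited in the text.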

References

ALCOHOLICS ANONYMOUS. Alcoholics Anonymous: The Story of How Many Thousands of Men and Women Have Recovered from Alcoholism, 3rd Edition, New York: Alcoholics Anonymous World Services,
AMERICAN PSYCHIATRIC ASSOCIATION. Diagnostic and Statistical Manual of Mental Disorders (DSM-III-R), Washington, DC,
ARMOR, D.J., POLICH, J.M. AND STAMBUL, H.B. Alcoholism and Treatment, New York: John Wiley & Sons,
BROWNELL, K.D., MARLATT, G.A., LICHTENSTEIN, E. AND WILSON, G.T. Understanding and preventing relapse. Amer. Psychol. 41: ,
CAHALAN, D. Problem Drinkers: A National Survey, San Francisco, CA: Jossey-Bass,
CAHALAN, D., CISIN, I.H. AND CROSSLEY, H.M. American Drinking Practices: A National Study of Drinking Behavior and Attitudes, Rutgers Center of Alcohol Studies Monograph No. 6, New Brunswick, NJ,
COSTELLO, R.M., BIEVER, P. AND BAILLARGEON, J.G. Alcoholism treatment programming: Historical trends and modern approaches. Alcsm Clin. Exp. Res. 1: , 1977.
DORUS, W., OSTROW, D.G., ANTON, R., CUSHMAN, P., COLLINS, J.F., SCHAEFER, M., CHARLES, H.L., DESAI, P., HAYASHIDA, M., MALKERNEKER, U., WILLENBRING, M., FISCELLA, R. AND SATHER, M.R. Lithium treatment of depressed and nondepressed alcoholics. JAMA 262: , 1989.
EMRICK, C.D. A review of psychologically oriented treatment of alcoholism: I. The use and interrelationships of outcome criteria and drinking behavior following treatment. Q. J. Stud. Alcohol 35: ,
FULLER, R.K., BRANCHEY, L., BRIGHTWELL, D.R., DERMAN, R.M., EMRICK, C.D., IBER, F.L., JAMES, K.E., LACOURSIERE, R.B., LEE, K.K., LOWENSTAM, I., MAANY, I., NEIDERHISER, D., NOCKS, J.J. AND SHAW, S. Disulfiram treatment of alcoholism: A Veterans Administration cooperative study. JAMA 256: ,
HARTMANN, D.J. AND WOLK, J.L. Assessing multisite alcohol and other drug dependency treatment programs. Alcsm Treat. Q. 14 (4): 1-32,
HUNT, W.A., BARNETT, L.W. AND BRANCH, L.G. Relapse rates in addiction programs. J. Clin. Psychol. 27: ,
INSTITUTE OF MEDICINE. Broadening the Base of Treatment for Alcohol Problems, Washington, DC: National Academy Press, 1990.
JELLINEK, E.M. The Disease Concept of Alcoholism, New Brunswick, NJ: Hillhouse Press (distributed by Rutgers Center of Alcohol Studies, New Brunswick, NJ),
LOWMAN, C., ALLEN, J. AND MILLER, W.R. Perspectives on Precipitants of Relapse. Addiction 91 (Suppl.),
McLELLAN, A.T., KUSHNER, H., METZGER, D., PETERS, R., SMITH, I., GRISSOM, G., PETTINATI, H. AND ARGERIOU, M. The fifth edition of the Addiction Severity Index. J. Subst. Abuse Treat. 9: ,
MILLER, W.R. What is a relapse? Fifty ways to leave the wagon. Addiction 91 (Suppl.): S15-S27, 1996a.
MILLER, W.R. Form 90: A Structured Assessment Interview for Drinking and Related Behaviors: Test Manual. NIAAA Project MATCH Monograph Series, Vol. 5, NIH Publication No , Rockville, MD: Department of Health and Human Services, 1996b.
MILLER, W.R., ANDREWS, N.R., WILBOURNE, P. AND BENNETT, M.E. A wealth of alternatives: Effective treatments for alcohol problems. In: MILLER, W.R. AND HEATHER, N. (Eds.) Treating Addictive Behaviors, 2nd Edition, New York: Plenum Press, 1998, pp
MILLER, W.R., BROWN, J.M., SIMPSON, T.L., HANDMAKER, N.S., BIEN, T.H., LUCKIE, L.F., MONTGOMERY, H.A., HESTER, R.K. AND TONIGAN, J.S. What works? A methodological analysis of the alcohol treatment outcome literature. In: HESTER, R.K. AND MILLER, W.R. (Eds.) Handbook of Alcoholism Treatment Approaches: Effective Alternatives, 2nd Edition, Needham Heights, MA: Allyn & Bacon, 1995a, pp. 12-44.
MILLER, W.R., HEATHER, N. AND HALL, W. Calculating standard drink units: International comparisons. Brit. J. Addict. 86: 43-47,
MILLER, W.R. AND HESTER, R.K. Inpatient alcoholism treatment: Who benefits? Amer. Psychol. 41: ,
MILLER, W.R., LECKMAN, A.L., DELANEY, H.D. AND TINKCOM, M. Long-term follow-up of behavioral self-control training. J. Stud. Alcohol 53: ,
MILLER, W.R. AND SANCHEZ-CRAIG, M. How to have a high success rate in treatment: Advice for evaluators of alcoholism programs. Addiction 91: ,
MILLER, W.R., TONIGAN, J.S. AND LONGABAUGH, R. The Drinker Inventory of Consequences (DrInC): An Instrument for Assessing Adverse Consequences of Alcohol Abuse (Test Manual). NIAAA Project MATCH Monograph Series, Vol. 4, NIH Publication No , Rockville, MD: Department of Health and Human Services, 1995b.
MILLER, W.R., WESTERBERG, V.S., HARRIS, R.J. AND TONIGAN, J.S. What predicts relapse? Prospective testing of antecedent models. Addiction 91 (Suppl.): S155-S172,
NAJAVITS, L.M. AND WEISS, R.D. Variations in therapist effectiveness in the treatment of patients with substance use disorders: An empirical review. Addiction 89: ,
OUIMETTE, P.C., FINNEY, J.W. AND MOOS, R.H. Twelve-step and cognitive-behavioral treatment for substance abuse: A comparison of treatment effectiveness. J. Cons. Clin. Psychol. 65: ,
PAILLE, F.M., GUELFI, J.D., PERKINS, A.C., ROYER, R.J., STERU, L. AND PAROT, P. Double-blind randomized multicentre trial of acamprosate in maintaining abstinence from alcohol. Alcohol Alcsm 30: ,
POLICH, J.M., ARMOR, D.J. AND BRAIKER, H.B. The Course of Alcoholism: Four Years after Treatment, New York: John Wiley & Sons,
PROJECT MATCH RESEARCH GROUP. Project MATCH: Rationale and methods for a multisite clinical trial matching patients to alcoholism treatment. Alcsm Clin. Exp. Res. 17: ,
PROJECT MATCH RESEARCH GROUP. Matching treatments to client heterogeneity: Project MATCH posttreatment drinking outcomes. J. Stud. Alcohol 58: 7-29,
PROJECT MATCH RESEARCH GROUP. Therapist effects in three treatments for alcohol problems. Psychother. Res. 8: ,
ROBINS, L.N., HELZER, J.E., CROUGHAN, J.L. AND RATCLIFF, K.S. National Institute of Mental Health Diagnostic Interview Schedule: Its history, characteristics, and validity. Arch. Gen. Psychiat. 38: ,
SKINNER, H.A. AND ALLEN, B.A. Alcohol dependence syndrome: Measurement and validation. J. Abnorm. Psychol. 91: ,
SMITH, J.W., FRAWLEY, P.J. AND POLISSAR, L. Six- and twelve-month abstinence rates in inpatient alcoholics treated with aversion therapy compared with matched inpatients from a treatment registry. Alcsm Clin. Exp. Res. 15: ,
SOBELL, L.C. AND SOBELL, M.B. Alcohol consumption measures. In: ALLEN, J.P. AND COLUMBUS, M. (Eds.) Assessing Alcohol Problems: A Guide for Clinicians and Researchers. National Institute on Alcohol Abuse and Alcoholism Treatment Handbook Series No. 4, NIH Publication No , Washington: Government Printing Office, 1995, pp
SPITZER, R.L., WILLIAMS, J.B.W., GIBBON, M. AND FIRST, M.B. User's Guide for the Structured Clinical Interview for DSM-III-R: SCID, Washington, DC: American Psychiatric Press,
VAILLANT, G.E. The Natural History of Alcoholism Revisited, Cambridge, MA: Harvard Univ. Press,
WANBERG, K.W., HORN, J.L. AND FOSTER, F.M. A differential assessment model for alcoholism: The scales of the Alcohol Use Inventory. J. Stud. Alcohol 38: ,
WESTERBERG, V.S., MILLER, W.R., HARRIS, R.J. AND TONIGAN, J.S. The topography of relapse in clinical samples. Addict. Behav. 23: , 1998.

COPYRIGHT INFORMATION
TITLE: How effective is alcoholism treatment in the United States?
SOURCE: Journal of Studies on Alcohol 62 no2 Mr 2001
WN:
The magazine publisher is the copyright holder of this article and it is reproduced with permission. Further reproduction of this article in violation of the copyright is prohibited. Copyright The H.W. Wilson Company. All rights reserved.