British Journal of Haematology, 2003, 122, 441-450

A new staging system for multiple myeloma patients based on the Southwest Oncology Group (SWOG) experience

Joth L. Jacobson,1 Mohamad A. Hussein,2 Bart Barlogie,3 Brian G. M. Durie4 and John J. Crowley1

1 Southwest Oncology Group Statistical Center, Seattle, WA, 2 Cleveland Clinic Cancer Center, Cleveland, OH, 3 Myeloma Institute for Research and Therapy, University of Arkansas for Medical Sciences, Little Rock, AR, and 4 Cedars-Sinai Comprehensive Cancer Center, Los Angeles, CA, USA

Received 23 December 2002; accepted for publication 25 March 2003

Summary. We aimed to develop and evaluate a staging system for multiple myeloma (MM) based on easily obtained laboratory measures. The Durie-Salmon stage is most commonly used and is an effective system of patient stratification for clinical trial research. However, the criteria are complex and many laboratory parameters are required to properly stage patients. In this analysis, we focused on two common measures with prognostic importance in MM: serum β2-microglobulin (β2m) and serum albumin. Pre-study data on 1555 previously untreated MM patients enrolled on four recent Southwest Oncology Group (SWOG) phase III trials were used in the analysis. Staging models were developed and validated using regression tree methods for survival outcomes. SWOG stages were defined as: stage 1, β2m < 2.5 mg/l (14% of patients, median overall survival of 55 months); stage 2, 2.5 ≤ β2m < 5.5 mg/l (43% of patients, median overall survival of 40 months); stage 3, β2m ≥ 5.5 mg/l and albumin ≥ 30 g/l (32% of patients, median overall survival of 24 months); and stage 4, β2m ≥ 5.5 mg/l and albumin < 30 g/l (11% of patients, median overall survival of 16 months). This staging scheme was also predictive of event-free survival, first-year mortality and long-term (≥ 5 years) event-free survival.
We conclude that although the SWOG stage does not represent a new prognostic marker for MM (cytogenetics, FISH), it could provide a simple alternative to the Durie-Salmon stage for patients with previously untreated MM. Additional evaluation in other MM patient populations is needed to confirm results.

Keywords: β2-microglobulin, multiple myeloma, staging.

Correspondence: Dr J.L. Jacobson, Southwest Oncology Group Statistical Center, 1730 Minor Avenue, Suite 1900, Seattle, WA, 98101-1468, USA. E-mail: jothj@crab.org

Durie-Salmon (DS) stage (Durie & Salmon, 1975) is the most commonly used staging system for patients with multiple myeloma. Developed in the mid-1970s, it has proven to be an effective system of patient stratification for clinical trial research. As the survival rate for patients with multiple myeloma varies greatly, a staging system such as DS is important in the evaluation of trial results, allowing researchers to prospectively identify patients with survival characteristics at either end of the spectrum.

However, the DS staging system is complex and requires knowledge of myeloma cancer biology to properly stage patients. Some of the criteria are difficult and inconvenient to evaluate on a routine basis. For instance, for evaluating the number of lesions to be noted on the bone survey, the system does not specify if all the lesions should be in different skeletal organs or if presence in one organ suffices. Staging a patient under the DS system requires results from a bone marrow biopsy, bone survey, serum electrophoresis, and values for haemoglobin, haematocrit and serum calcium.

Since the development of the DS staging system, new prognostic factors have been identified as having importance in the pretreatment evaluation of patients with multiple myeloma. Among these factors are routine laboratory measures, such as serum β2-microglobulin (β2m) and serum albumin.
A staging system based on a combination of these laboratory variables could prove to be a useful tool with the classification power of the DS system but based on simpler criteria and more easily obtained tests.

© 2003 Blackwell Publishing Ltd

In this analysis, we developed and evaluated a staging system based on two laboratory measures with prognostic importance in multiple myeloma: serum β2m and serum albumin. β2m has been one of the few factors in univariate and multivariate analysis found to have independent prognostic importance for survival. Studies correlating β2m levels with myeloma stage, disease status and survival suggest that β2m may be a product of myeloma cells and can be used as a tumour marker to predict the course of the disease (Berggarrd & Beam, 1968; Peterson et al, 1972; Cassuto et al, 1978; Kin et al, 1979; Norfolk et al, 1980; Bataille
et al, 1982; Child et al, 1983; Peest et al, 1986). Its predictability for prognosis and its involvement in tumour cell growth make it a reasonable candidate as a main factor for patient staging (Mori et al, 1999; Facon et al, 2001).

Serum albumin is an indirect indicator of interleukin 6 (IL-6) levels, liver function and the nutritional status of the patient. It is an easily obtainable, standardized test. IL-6 is a pro-inflammatory cytokine that normally fluctuates within a narrow range and is expressed at low levels, except during infection, trauma or other stress-related situations. IL-6 is a potent mediator of inflammatory processes, and it has been proposed that the age-associated increase in IL-6 accounts for certain phenotypic changes associated with advanced age, particularly those that resemble chronic inflammatory disease, such as decreased lean body mass, osteopenia, low-grade anaemia, decreased serum albumin and cholesterol, and increased inflammatory proteins such as C-reactive protein (CRP) and serum amyloid A. Furthermore, the age-associated rise in IL-6 has been linked to osteoporosis, lymphoproliferative disorders and multiple myeloma (Ershler & Keller, 2000). IL-6 is a potent myeloma cell growth factor and serum levels of IL-6 reflect disease severity in both myeloma and related disorders (Bataille et al, 1989). It is of particular interest that serum IL-6 levels are inversely proportional to the serum albumin levels. Moreover, the serum albumin level is a significant indicator of the patient's nutritional status. Serum albumin level inversely correlates with dietary well-being (Mazolewski et al, 1999). Therefore, low serum albumin correlates with both rapid myeloma growth and the patient's overall performance status.

The development of a staging system involves decisions on not only which clinical factors to include, but also which statistical methods are most appropriate.
The choice of a statistical modelling method is important as the results often depend on the method itself. The most commonly used method for identifying prognostic groups based on survival data is to fit a Cox proportional hazards model to the group of potential factors and then to use the resulting regression equation to divide the patients into groups. This technique often results in prognostic groups that can be difficult to characterize and define. In the development of our staging system, we employed statistical techniques that were designed to identify and classify data based on certain outcomes of interest and that have recently been adapted to the analysis of survival data.

PATIENTS AND METHODS

Patient population. Pre-study data from four recent Southwest Oncology Group (SWOG) multiple myeloma phase III trials were used in the analysis (Table I). Patients eligible for these studies had untreated, newly diagnosed multiple myeloma of any stage (≥ DS stage I).

SWOG S8229 (Salmon et al, 1990) evaluated VMCP (vincristine, melphalan, cyclophosphamide and prednisone) and VBAP [vincristine, carmustine (BCNU), adriamycin and prednisone] for remission induction therapy, followed by VMCP versus sequential half-body radiotherapy + vincristine-prednisone in patients who achieved remission status with chemotherapy, or sequential half-body radiotherapy + vincristine-prednisone in patients who failed to achieve remission.

SWOG S8624 (Salmon et al, 1994) was a comparison of VMCP/VBAP with either VAD (vincristine, adriamycin, dexamethasone) or VMCPP/VBAPP (VMCP/VBAP with prednisone between cycles) for induction, followed by alpha-2b interferon or no therapy for maintenance, or alpha-2b interferon + dexamethasone for incomplete or non-responders.

SWOG S9028 (Salmon et al, 1998) compared VAD with VAD/verapamil/quinine for induction, followed by alpha-2b interferon or alpha-2b interferon plus prednisone for remission maintenance.
The VAD + verapamil + quinine arm of the S9028 study was closed early as a result of the excessive mortality related to quinine toxicity. The arm was not included in the analysis.

SWOG S9210 (Berenson et al, 2002) compared VAD + prednisone (VAD-P) with VAD-P/quinine for induction, followed by a randomization of prednisone dose intensity for remission maintenance.

The primary outcome of interest in this analysis was overall survival (OS), measured as the time from initial study registration to time of death from any cause or last contact. The prognostic variables of interest in this analysis were common laboratory measurements collected prior to treatment on all patients registered on the SWOG multiple myeloma trials that were identified as variables with potential prognostic value, including albumin, calcium, creatinine, haemoglobin, platelet count and β2m. Other patient characteristics such as age, sex, race and disease type were examined as covariates.

Table I. Patients enrolled on SWOG myeloma protocols.

SWOG study   Description                                             Total registered   Total eligible
S8229        VMCP + VBAP/VMCP + Lev vs vincristine + prednisone      621                614
S8624        VMCP + VBAP vs VAD vs VMCPP + VBAPP/interferon          522                509
S9028        VAD vs VAD + Ve + Q/interferon                          233                182*
S9210        VADP vs VADP + Q/prednisone                             262                250
Total                                                                1638               1555

*The VAD + verapamil + quinine arm of S9028 was closed early as a result of excessive mortality related to quinine toxicity. The arm was not included in the analysis.

To evaluate and further validate the new staging scheme in a population of non-SWOG patients, data were examined
from 231 patients enrolled in the University of Arkansas Total Therapy program for newly diagnosed multiple myeloma patients (Barlogie et al, 1999). Patients were enrolled between 1990 and 1995, and treated with an induction regimen of VAD, followed by tandem transplant and a maintenance regimen of interferon alpha-2b.

Statistical methods. OS was calculated as the time from study registration to death from any cause or last contact. Event-free survival (EFS) was calculated as the time from study registration to either progression of disease or death from any cause or last contact. Survival curves were estimated by the product-limit method (Kaplan & Meier, 1958) and compared using the log-rank test (Mantel, 1966). Cox proportional hazards regression (Cox, 1972) was used to assess the influence of prognostic factors on survival outcome. Tree regression models were based on regression tree methods adapted to survival data, as described in Crowley et al (1997). Tree pruning was based on methods described by LeBlanc & Crowley (1993). Permutation and bootstrap methods were used to calculate a bias-adjusted splitting statistic and P-value.

RESULTS

Patient characteristics

A total of 1555 protocol-eligible patients were available for analysis (Table I). The median age was 62 years (range 26-87 years), 61% were men and 19% were African American. Laboratory characteristics are presented in Table II, and disease characteristics, including DS stage, in Table III. Laboratory variable distributions did not differ between the four studies, with a few notable exceptions. β2m was higher in older studies (65% ≥ 4 mg/l in S8229 vs 57% ≥ 4 mg/l in S9210), as was haemoglobin (64% ≥ 10 g/dl in S8229 vs 46% ≥ 10 g/dl in S9210). While a similar number of patients had more than three lytic lesions in the four studies, a greater percentage of patients had no bone lesions in the more recent studies (25% in S9210) than in the older studies (16% in S8229).
Disease characteristics [immunoglobulin (Ig) isotype, light chain isotype] were similar across studies. Median OS in the combined data set was 33 months. OS varied slightly between the four studies (Fig 1). OS by DS stage in this patient population is presented in Fig 2. Median OS by DS stage ranged from 57 months in stage I patients to 21 months in stage IIIB patients. The distribution of DS stage differed slightly between studies, with a smaller percentage of patients identified as either stage I-II or stage IIIB in the more recent studies (32% stage I-II in S8229 vs 28% stage I-II in S9210; 24% stage IIIB in S8229 vs 18% stage IIIB in S9210).

Table II. Patient and laboratory characteristics. All patients (n = 1555).

Characteristic   n      Median (min, max)    Category           Patients
Age              1555   62 (26, 87)          ≥ 65 years         39%
Albumin          1534   3.6 (0.3, 7.2)       < 30 g/l           20%
Calcium          1545   2.4 (0.6, 4.7)       ≥ 2.5 mmol/l       34%
Creatinine       1549   115 (27, 1947)       ≥ 177 µmol/l       22%
Haemoglobin      1551   10.3 (3.7, 20.0)     < 10 g/dl          42%
Platelets        1533   235 (14, 1161)       < 200 × 10^9/l     35%
Serum β2m        1415   4.8 (0.0, 63.7)      ≥ 4 mg/l           61%

Bone lesions (n = 1513)   Patients   %
None                      285        19%
Osteoporosis              108        7%
≤ 3 lytic lesions         207        18%
> 3 lytic lesions         850        56%

n, number of patients tested.

Table III. Disease characteristics. All patients (n = 1555).

Characteristic                     Patients   %
Serum light chain (n = 1464)
  None                             167        11%
  Kappa                            809        55%
  Lambda                           485        33%
Serum heavy chain (n = 1539)
  None                             305        20%
  IgG                              894        58%
  IgA                              325        21%
  IgM/IgD/IgE                      15         1%
Renal stage (n = 1549)
  A                                1147       74%
  B                                402        26%
DS stage (n = 1549)
  I-II                             420        27%
  IIIA                             789        51%
  IIIB                             240        22%

n, number of patients tested.

Prognostic factors

DS stage is a statistically significant (P < 0.001) predictor of OS in the overall population, although it proved to be less predictive in some subgroups (women, IgA isotype). Univariate Cox regression analysis showed that pre-study albumin, calcium, creatinine, haemoglobin, platelets, β2m and the number of bone lesions had prognostic significance
at P < 0.05, both as continuous variables and as dichotomous indicators split at clinically accepted values (Table IV). No statistically significant survival differences were found for either sex or race. Elevated calcium, platelet count and β2m, decreased albumin and haemoglobin, and more than three lytic lesions were identified as having independent prognostic effects in the multivariate (forward) stepwise Cox model analysis (Table IV). β2m was the first variable entered into the stepwise multivariate model for both the continuous and dichotomous factor models. An interesting result of the multivariate dichotomous factor models is the similarity of hazard ratios among all of the factors.

Fig 1. OS by SWOG multiple myeloma protocol. Median OS ranged from 31 months to 38 months (P = 0.007). The percentage of surviving patients is shown on the Y-axis and years after registration on the X-axis.

Study    N     Events   Median OS (months)
S9028    182   152      38
S8624    509   471      36
S9210    250   193      32
S8229    614   600      31

Tree regression models

In order to validate the predictive potential of the new staging scheme developed in this analysis, the full data set was split to create training and validation data sets. A random sample of approximately two-thirds of the data (1000 patients) was taken from the full data set to create the training data set; the remaining one-third comprised the validation data set (n = 555 patients). The random sampling did not lead to important differences in the distributions of the lab variables, patient and disease characteristics, or outcome characteristics between the training and validation data sets.

The recursive partitioning process behind tree regression modelling proceeds by first splitting the predictor space into two regions or nodes, based on a specified rule.
As the proportional hazards model proposed by Cox (Cox, 1972) is the most commonly used analysis tool for survival data, the Cox model log-rank statistic was used as the splitting rule. The log-rank statistic was calculated for each potential split point for each variable of interest. The maximum log-rank value from this group of log-rank values was chosen as the first split. In this training data set, the first split was chosen to be a β2m of 5.5 mg/l.

As a diagnostic tool, it is often helpful to look at a plot of the log-rank values for the possible split points of a given variable. Figure 3 plots the log-rank values for potential split points for β2m. This plot shows that the overall log-rank maximum is achieved at a β2m around 5.5 mg/l, but this maximum does not necessarily distinguish itself from many other potential split points. Also included on this plot are lines representing significance thresholds for α = 0.05 and α = 0.001 level tests. These values have been adjusted upwards for all multiple comparisons on the variable, but it is still clear that all split points would be considered statistically significant for β2m based on these thresholds. The most likely explanation is that very low values of β2m (typically < 2.5 mg/l) represent the favourable prognosis group, and for values above that level only the choice of best separation between two prognostic groups remains.

The rule was then applied recursively to the resulting nodes until the space had been split into a large number of nodes with a few observations each. As this large tree represents an overfit to the data, an algorithm for pruning back the branches of the tree to choose the best subtree is then applied. This pruning method keeps only those splits deemed to be statistically significant (P < 0.05) after correcting for the potential bias that could be introduced by fitting such a large number of Cox models. Figure 4 presents the pruned tree for the β2m and albumin model.
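The split search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the standard two-sample log-rank chi-square statistic as the splitting criterion (closely related to the score test of a Cox model with a single binary covariate) and scans every observed value of one variable as a candidate cut-off. The function names `logrank_statistic` and `best_split` are hypothetical, and the permutation and bootstrap bias adjustment and the pruning step are not shown.

```python
from typing import Optional, Sequence, Tuple


def logrank_statistic(times: Sequence[float], events: Sequence[bool],
                      in_high: Sequence[bool]) -> float:
    """Two-sample log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    # Visit each distinct time at which at least one death was observed.
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        deaths = [i for i in at_risk if times[i] == t and events[i]]
        d = len(deaths)
        d_high = sum(1 for i in deaths if in_high[i])
        # Observed minus expected deaths in the "high" group at time t.
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:  # hypergeometric variance; undefined for a single subject
            p = n_high / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_split(values: Sequence[float], times: Sequence[float],
               events: Sequence[bool]) -> Tuple[Optional[float], float]:
    """Return (cut-off, statistic) maximising the log-rank statistic.

    A subject falls in the "high" group when its value is >= the cut-off.
    """
    best_cut, best_stat = None, 0.0
    # Skip the smallest value: using it would leave the "low" group empty.
    for cut in sorted(set(values))[1:]:
        in_high = [v >= cut for v in values]
        stat = logrank_statistic(times, events, in_high)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat
```

Applied to each resulting node in turn, a search of this kind would grow the large initial tree that is subsequently pruned.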
The terminal nodes of the tree identified four distinct prognostic groups based on β2m and albumin values. Good prognosis patients are those with a β2m < 2.5 mg/l. The middle prognosis patients are those with a β2m between 2.5 and 5.5 mg/l (HR = 1.48), or β2m ≥ 5.5 mg/l and albumin ≥ 30 g/l (HR = 2.27). Poor prognosis patients are those with a β2m ≥ 5.5 mg/l, with the worst being those with an albumin < 30 g/l (HR = 3.32). OS by these four tree-defined groups is presented in Fig 2. This model seems to define four distinct groups.

Fig 2. Kaplan-Meier survival curves comparing DS stage and SWOG stage in the model data set, validation data set, and overall population for OS, EFS and Total Therapy I patients. The percentage of surviving patients is shown on the Y-axis and years since registration on the X-axis. Durie-Salmon stages I, II, IIIA and IIIB, and SWOG stages 1, 2, 3 and 4, are shown as separate curves.
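The four tree-defined groups can be written as a simple decision rule. The sketch below is illustrative only: the function name `swog_stage` is hypothetical, and it assumes that β2m is supplied in mg/l and albumin in g/l, with values lying exactly on a cut-off assigned as in the stage definitions given in the Summary (2.5 ≤ β2m < 5.5 for stage 2, albumin ≥ 30 g/l for stage 3).

```python
def swog_stage(b2m_mg_per_l: float, albumin_g_per_l: float) -> int:
    """Map serum beta2-microglobulin and albumin to a stage from 1 to 4.

    Units are an assumption of this sketch: b2m in mg/l, albumin in g/l.
    """
    if b2m_mg_per_l < 0 or albumin_g_per_l < 0:
        raise ValueError("laboratory values must be non-negative")
    if b2m_mg_per_l < 2.5:
        return 1  # best prognosis group; albumin is not used
    if b2m_mg_per_l < 5.5:
        return 2  # 2.5 <= b2m < 5.5; albumin is not used
    # b2m >= 5.5: albumin separates the two poorest prognosis groups
    return 3 if albumin_g_per_l >= 30 else 4
```

For example, `swog_stage(6.2, 28)` returns 4, while the same β2m with an albumin of 35 g/l gives stage 3. Albumin only matters once β2m is at or above 5.5 mg/l, which mirrors the hierarchical structure of the pruned tree.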
Table IV. Cox regression results for OS.

                                 Continuous             Dichotomous
Univariate               n       HR      P-value        Split                HR      P-value
Age                      1555    1.02    < 0.001        ≥ 65 years           1.32    < 0.001
Sex                      1555                           Women                0.98    0.072
Race                     1555                           African American     1.0     0.99
Albumin (g/l)            1534    0.76    < 0.001        < 30 g/l             1.63    < 0.001
Calcium (mmol/l)         1545    1.11    < 0.001        ≥ 2.5 mmol/l         1.52    < 0.001
Creatinine* (µmol/l)     1549    1.51    < 0.001        ≥ 177 µmol/l         1.61    < 0.001
Haemoglobin (g/dl)       1551    0.90    < 0.001        < 10 g/dl            1.47    < 0.001
Platelets (× 10^9/l)     1533    0.99    < 0.001        < 150 × 10^9/l       1.82    < 0.001
Serum β2m* (mg/l)        1415    1.48    < 0.001        ≥ 4 mg/l             1.71    < 0.001
Bone lesions (num)       1513    1.12    < 0.001        > 3 lesions          1.36    < 0.001

                         Continuous                     Dichotomous
Multivariate             Step    HR      P-value        Step    HR      P-value
Albumin                  2       0.82    < 0.001        4       1.41    < 0.001
Calcium                  6       1.05    0.011          5       1.33    < 0.001
Haemoglobin              5       0.96    0.010          6       1.19    0.006
Platelets                4       0.99    < 0.001        2       1.59    < 0.001
Serum β2m*               1       1.30    < 0.001        1       1.37    < 0.001
Bone lesions             3       1.13    < 0.001        3       1.34    < 0.001

*Log( ) of variable used. HR, hazard ratio; n, number of patients; num, number of bone lesions.

Fig 3. Plot of Cox model log-rank statistic for potential split points in β2m based on OS model. The log-rank maximum is at β2m = 5.5 mg/l, but this plot shows that most cut-off points identify distinct groups, as the log-rank statistic is consistently above the P = 0.001 reference line.

To validate the model developed in the training data set of patients, the tree-defined groups were applied to patients in the validation data set (Fig 2). The prognostic groups are well defined, with clear separation between the survival curves estimated on the test data.

Using the results of this tree regression model, a new staging system was defined using β2m split at 2.5 mg/l and 5.5 mg/l, and albumin split at 30 g/l. Patients with β2m < 2.5 mg/l were defined as the best prognosis group (stage 1) with a median OS of 55 months, including 16% of the patients (Fig 2).
Patients with 2.5 mg/l ≤ β2m < 5.5 mg/l had the next best prognosis (stage 2), with a median OS of 42 months, including 37% of the patients. In the group of patients with β2m ≥ 5.5 mg/l, albumin split at 30 g/l identifies two distinct prognosis groups. Patients with β2m ≥ 5.5 mg/l and albumin ≥ 30 g/l formed a poor prognosis group (stage 3) with a median OS of 25 months, including 35% of the patients. Patients with β2m ≥ 5.5 mg/l and albumin < 30 g/l formed the worst prognosis group (stage 4), with a hazard ratio of > 3.0 when compared with the β2m < 2.5 mg/l group, and a median OS of 18 months, including 12% of the patients. For no other reason than that SWOG data were used to create this new staging scheme, it is subsequently referred to as the SWOG stage.

Fig 4. Pruned regression tree based on β2m and albumin in model data set. HR refers to hazard ratio comparing each group with the best prognosis group (β2m < 2.5 mg/l).

To see how the SWOG staging scheme would change if all variables found to have independent prognostic effect in the multivariate Cox models were included, a regression tree was created and pruned using the variables from Table IV. This all-inclusive tree had a very similar structure to the tree created using only β2m and albumin, adding only bone lesions and calcium to the final tree (see Fig 5). This result appeared to indicate that β2m and albumin represent the prognostic information contained in the larger set of lab measures well, and that if an additional lab characteristic were to be added to the list of variables considered, a measure of bone disease might be the best choice, as both the number of bone lesions and calcium appear in this more inclusive tree model.

Fig 5. Pruned regression tree considering all variables from multivariate regression model. HR refers to hazard ratio comparing each group with the best prognosis group (β2m < 2.5 mg/l).

A logical question for this new staging scheme is how well it did (or did not) approximate the DS staging system. While there was a statistically significant correlation between SWOG stage and DS stage (P < 0.001), Table V shows that patients classified by SWOG stage are fully distributed among the DS stages, and vice versa.

Progression-free survival

Progression-free survival (PFS) is an outcome commonly examined in multiple myeloma, usually in the context of evaluating treatment effects. Identifying groups with a poor PFS allows researchers to exclude the patient populations that have little chance of benefiting from treatment and may dilute any potential treatment benefit in comparisons with standard therapies. On the other hand, this would identify a group of patients that might need a different
treatment strategy for their aggressive disease. As can be seen in Fig 2, the DS stage for the most part identified distinct prognostic groups for PFS, with stage IIIB patients remaining progression free and alive for a median of 13 months. Likewise, the SWOG stage identified distinct prognostic groups for PFS, with stage 4 patients having a median PFS of only 10 months (Fig 2).

Table V. Comparison of DS stage with SWOG stage.

                                DS stage
                                I      II     IIIA   IIIB
Percentage of all patients      6%     21%    51%    22%
In patients with
  β2m < 2.5 mg/l                10%    29%    55%    6%
  β2m ≥ 5.5 mg/l                4%     16%    42%    39%
  Albumin < 30 g/l              3%     13%    55%    29%
  SWOG stage 1                  10%    29%    55%    6%
  SWOG stage 2                  8%     25%    60%    7%
  SWOG stage 3                  4%     17%    38%    42%
  SWOG stage 4                  2%     7%     46%    45%

                                SWOG stage
                                1      2      3      4
Percentage of all patients      14%    43%    32%    11%
In patients with
  DS stage I                    23%    55%    19%    3%
  DS stage II                   19%    52%    25%    4%
  DS stage IIIA                 15%    51%    24%    10%
  DS stage IIIB                 3%     14%    61%    22%

Total Therapy patients

As external validation, the new SWOG staging scheme was applied to patients on the University of Arkansas Total Therapy program (Barlogie et al, 1999). Patients on this Total Therapy program had a better OS than SWOG patients (median OS 5.7 years) and, in general, presented with an earlier stage of disease (43% DS stage I-II). Figure 2 shows the OS by DS stage for this group of patients. Aside from stage IIIB patients, the DS staging system did not define distinct prognostic groups (in terms of OS). Figure 2 also shows the OS by SWOG stage in this group of Total Therapy patients. While we did not see the distinct separation found in the SWOG patient data, the SWOG stage does appear to have better prognostic value in this group of patients. Patients with a β2m ≥ 5.5 mg/l had a distinctly worse survival than those with a β2m < 5.5 mg/l. Further classification by β2m < 2.5 mg/l or albumin < 30 g/l was not important.
However, a smaller percentage of patients in this population had a β2m ≥ 5.5 mg/l, and these small numbers make evaluating the prognostic potential of albumin in patients with an elevated β2m more difficult.

Secondary outcomes

The ability of the DS staging system and the SWOG staging system to identify distinct groups based on the secondary response and survival outcomes of interest is examined in Table VI. An association between a lower stage and a higher complete remission (CR) rate was not observed with either staging system; both were marginally better at predicting durable CR (defined as CR lasting at least 4 years). Both DS stage IIIB and SWOG stage 4 identified groups with a comparatively high 1-year mortality (35% and 40% respectively) and low long-term (≥ 5 years) EFS (8% and 4% respectively). SWOG stage 1 identified a group of patients with a low 1-year mortality (8%), while DS stage showed little difference in 1-year mortality rates between stages I and IIIA. Both DS stage I and SWOG stage 1 identified a group of patients with a superior long-term EFS (29% and 24% respectively).

Table VI. Comparison of complete response rate and survival outcomes.

                Percentage     Complete response        Survival
                of patients    Rate     Durable CR      1-year mortality   Long-term EFS
DS stage
  I             6%             28%      33%             16%                29%
  II            21%            38%      23%             15%                21%
  IIIA          51%            46%      15%             18%                11%
  IIIB          22%            37%      18%             35%                8%
SWOG stage
  1             14%            42%      19%             8%                 24%
  2             43%            44%      18%             13%                15%
  3             32%            37%      17%             27%                11%
  4             11%            40%      8%              40%                4%

DISCUSSION

The objective of this analysis was to develop a system for staging patients with previously untreated multiple myeloma using only widely available laboratory tests. DS stage has proven to be an effective and informative prognostic tool, but the overall complexity of the criteria used to stage patients makes its implementation less than straightforward at times. The staging system proposed here is simple and would be applicable to all patients. Serum β2m and albumin are known markers of disease in multiple myeloma, with β2m being a tumour marker (Durie et al, 1990) and
albumin being a marker of rapid disease growth (Bataille et al, 1989). Serum albumin is also a marker of a patient's nutritional status and was found to have a high degree of negative correlation with SWOG performance status (P < 0.0001).

A staging system for multiple myeloma patients based on β2m and albumin levels is not a new idea; a system very similar to the one proposed here was explored in 1986 by Bataille (Bataille et al, 1986). The staging system proposed by this group differed somewhat and used a slightly higher cut-off point for β2m. They proposed a low-risk stage as a β2m < 6 mg/l and albumin > 30 g/l, an intermediate-risk stage as a β2m ≥ 6 mg/l and albumin > 30 g/l, and a high-risk stage as albumin ≤ 30 g/l. In comparison with a number of staging systems for multiple myeloma (DS, Medical Research Council and Merlini-Waldenström-Jayakar), they found that the β2m and albumin combination was the most predictive. This prognostic model was again proposed in an analysis of SWOG 8229 data by Durie (Durie et al, 1990).

The development of the SWOG staging system differs in that adaptive statistical methods were used to select cut-off points for β2m and albumin, essentially letting the data dictate how the most prognostically significant groups should be defined. Most prior analyses that have been used to develop new prognostic groups used predefined cut-off points on the variables of interest and then went about the process of selecting variables through multivariate regression modelling. Groups based on regression equation tertiles or on a sum-of-bad-independent-prognostic-factors approach may be easy to implement, but can result in groups that are more difficult to describe. Tree regression modelling results in groups that are easy to explain and interpret. In addition, tree regression introduces a hierarchical structure to the model that addresses and accounts for interactions in the data.
The prognostic abilities of the SWOG staging system could potentially be improved by including other factors of prognostic importance, as we saw in the regression tree based on a more complete set of prognostic factors (Fig 5). From those results, it appears that measures of bone disease may add prognostic information to the staging system. However, the added information is small and adds variables to the criteria. Recent studies have shown the dramatic prognostic importance of additional measures such as plasma cell labelling index (PCLI) and cytogenetics (chromosome 13 deletion). Once again, these measures are more difficult to obtain and would reduce the simplicity of the proposed SWOG staging system.

As would be expected of a useful staging system, the SWOG stage had prognostic importance when evaluating outcomes in addition to OS in the data used for analysis. For EFS, SWOG stage 4 identified a group of patients with a very poor progression-free duration (median 10 months) and also a comparatively high 1-year mortality rate. SWOG stage was not as efficient as DS stage in identifying patients with an improved chance of durable CR and, in addition, DS stage I patients had the worst CR rate among the stages; both are signs that stage may not be a good predictor of response. However, numerous analyses have shown little association between CR and OS in myeloma patients.

The SWOG stage could be a simple alternative to DS stage for patients with previously untreated multiple myeloma. It has an advantage in that it is based on common laboratory measurements that are widely available and can be reliably determined and reproduced. Improved simplicity can be helpful when a measure of disease stage is used to prospectively identify and stratify patients as part of a registration process for clinical trials.
For this reason, the SWOG stage is currently used to stratify patients to SWOG front-line myeloma protocols. This staging system should appeal to clinicians as a tool to identify a patient's likely prognosis when evaluating potential therapies. More complex systems (e.g. molecular/genetic) may only be necessary if selective targeted therapy is proposed, for example, therapies for patients with deletion 13.

This analysis of a large database of previously untreated myeloma patients confirms previous analyses of smaller groups of patients that also identified β2m and albumin as the most important prognostic factors in multivariate analysis. Additional evaluation in other myeloma data sets is needed to confirm this proposed staging system. In data presented at the 2002 ASH meeting, Dimopoulos et al (2002) showed a high utility for SWOG stage in 397 myeloma patients from the Greek Myeloma Study Group (GMSG). The data that were used to develop the SWOG staging system are now part of an International Prognostic Index project coordinated by the International Myeloma Foundation (IMF), in which myeloma data from all over the world will be used to identify and define prognostic staging systems with applications similar to that of SWOG stage.

REFERENCES

Barlogie, B., Jagannath, S., Desikan, K.R., Mattox, S., Vesole, D., Siegel, D., Tricot, G., Munshi, N., Fassas, A., Singhal, S., Mehta, J., Anaissie, E., Dhodapkar, D., Naucke, S., Cromer, J., Sawyer, J., Epstein, J., Spoon, D., Ayers, D., Cheson, B. & Crowley, J. (1999) Total therapy with tandem transplants for newly diagnosed multiple myeloma. Blood, 93, 55–65.

Bataille, R., Magub, M., Grenier, J., Donnadio, O. & Sany, J. (1982) Serum β2 microglobulin in multiple myeloma: relation to presenting features and clinical status. European Journal of Cancer and Clinical Oncology, 18, 59–66.

Bataille, R., Durie, B.G.M., Grenier, J. & Sany, J. (1986) Prognostic factors and staging in multiple myeloma: a reappraisal. Journal of Clinical Oncology, 4, 80–87.

Bataille, R., Jourdan, M., Zhang, X.G. & Klein, B. (1989) Serum levels of interleukin 6, a potent myeloma cell growth factor, as a reflection of disease severity in plasma cell dyscrasias. Journal of Clinical Investigation, 84, 2008–2011.

Berenson, R., Crowley, J.J., Grogan, T.M., Zangari, M., Briggs, A.D., Mills, G.M., Barlogie, B. & Salmon, S.E. (2002) Maintenance therapy with alternate-day prednisone improves survival in multiple myeloma patients. Blood, 99, 3163–3168.

Berggård, I. & Bearn, A.G. (1968) Isolation and properties of a low molecular weight β2 microglobulin occurring in human biological fluids. Journal of Biological Chemistry, 243, 4095–4103.
Cassuto, J.P., Krebs, B.P., Viot, G., Dujardin, P. & Masseyeff, R. (1978) β2 microglobulin: a tumour marker of lymphoproliferative disorders. Lancet, 2, 950.

Child, J.A., Crawford, S.M., Norfolk, D.R., O'Quigley, J., Scarffe, J.H. & Struthers, L.P.L. (1983) Evaluation of serum β2-microglobulin as a prognostic indicator in myelomatosis. British Journal of Cancer, 47, 111–114.

Cox, D.R. (1972) Regression models and life-tables (with discussion). Journal of the Royal Statistical Society, Series B, 34, 187–220.

Crowley, J.J., LeBlanc, M., Jacobson, J.L. & Salmon, S.E. (1997) Some exploratory tools for survival analysis. Lecture notes in statistics. Proceedings of the First Seattle Symposium in Biostatistics, 199–229.

Dimopoulos, M.A., Zervas, K., Pouli, A., Hamilos, G., Mitsouli, C., Repousis, P., Symeonidis, A., Gika, D., Stamatelou, M., Anagnostopoulos, N. & Maniatis, A. for the Greek Myeloma Study Group (GMSG) (2002) Independent validation of the Southwest Oncology Group (SWOG) staging system: a clinically useful staging system for multiple myeloma. Blood, 100, 2361.

Durie, B.G.M. & Salmon, S.E. (1975) A clinical staging system for multiple myeloma. Cancer, 36, 842–854.

Durie, B.G.M., Stock-Novack, D., Salmon, S.E., Finley, P., Beckford, J., Crowley, J. & Coltman, C.A. (1990) Prognostic value of pretreatment serum β2 microglobulin in myeloma: a Southwest Oncology Group study. Blood, 75, 823–830.

Ershler, W.B. & Keller, E.T. (2000) Age-associated increased interleukin-6 gene expression, late-life diseases, and frailty. Annual Review of Medicine, 51, 245–270.

Facon, T., Avet-Loiseau, H., Guillerm, G., Moreau, P., Genevieve, F., Zandecki, M., Laï, J., Leleu, X., Jouet, J.P., Bauters, F., Harousseau, J.L., Bataille, R. & Mary, J.Y. (2001) Chromosome 13 abnormalities identified by FISH analysis and serum β2-microglobulin produce a powerful myeloma staging system for patients receiving high-dose therapy. Blood, 97, 1566–1571.

Kaplan, E.L. & Meier, P. (1958) Nonparametric estimation for incomplete observations. Journal of the American Statistical Association, 53, 457–481.

Kin, K., Kasahara, T., Itoh, Y., Sakurabayashi, I., Kawai, T. & Morita, M. (1979) β2-microglobulin production by highly purified human T and B lymphocytes in cell culture stimulated with various mitogens. Immunology, 36, 47–54.

LeBlanc, M. & Crowley, J.J. (1993) Survival trees by goodness of split. Journal of the American Statistical Association, 88, 457–467.

Mantel, N. (1966) Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports, 50, 163–170.

Mazolewski, P., Turner, J.F., Baker, M., Kurtz, T. & Little, A.G. (1999) The impact of nutritional status on the outcome of lung volume reduction surgery: a prospective study. Chest, 116, 693–696.

Mori, M., Terui, Y., Ikeda, M., Tomizuka, H., Uwai, M., Kasahara, T., Kubota, N., Itoh, T., Mishima, Y., Douzono-Tanaka, M., Yamada, M., Shimamura, S., Kikuchi, J., Furukawa, Y., Ishizaka, Y., Ikeda, K., Mano, H., Ozawa, K. & Hatake, K. (1999) β2-microglobulin identified as an apoptosis-inducing factor and its characterization. Blood, 94, 2744–2753.

Norfolk, D., Child, J.A., Cooper, E.H., Kerruish, G. & Ward, A.M. (1980) Serum β2 microglobulin in myelomatosis: potential value in stratification and monitoring. British Journal of Cancer, 42, 5–10.

Peest, D., Bartels, B., Dallmann, I., Schedel, I. & Deicher, H. (1986) Cytostatic drug sensitivity test for human multiple myeloma, measuring monoclonal immunoglobulin produced by bone marrow cells in vitro. Cancer Chemotherapy and Pharmacology, 17, 69–74.

Peterson, P.A., Cunningham, B.A., Berggård, I. & Edelman, G.M. (1972) β2-microglobulin: a free immunoglobulin domain. Proceedings of the National Academy of Sciences of the United States of America, 69, 1697–1701.
Salmon, S.E., Tesh, D., Crowley, J.J., Saeed, S., Finley, P., Milder, M.S., Hutchins, L.F., Coltman, Jr, C.A., Bonnet, J.D., Cheson, B., Knost, J.A., Samhouri, A., Beckford, J. & Stock-Novack, D. (1990) Chemotherapy is superior to sequential hemibody radiation for remission consolidation in multiple myeloma: a Southwest Oncology Group study. Journal of Clinical Oncology, 8, 1575–1584.

Salmon, S.E., Crowley, J.J., Grogan, T.M., Finley, P., Pugh, R.P. & Barlogie, B. (1994) Combination chemotherapy, glucocorticoids, and alpha interferon in the treatment of multiple myeloma: a Southwest Oncology Group study. Journal of Clinical Oncology, 12, 2405–2414.

Salmon, S.E., Crowley, J.J., Balcerzak, S.P., Roach, R.W., Taylor, S.A., Rivkin, S.E. & Samlowski, W. (1998) Interferon versus interferon plus prednisone remission maintenance therapy for multiple myeloma: a Southwest Oncology Group study. Journal of Clinical Oncology, 16, 589–592.