
A disease risk index for patients undergoing allogeneic stem cell transplantation

Philippe Armand, Christopher J. Gibson, Corey Cutler, Vincent T. Ho, John Koreth, Edwin P. Alyea, Jerome Ritz, Mohamed L. Sorror, Stephanie J. Lee, H. Joachim Deeg, Barry E. Storer, Frederick R. Appelbaum, Joseph H. Antin, Robert J. Soiffer and Haesook T. Kim

Abstract

The outcome of allogeneic HSCT varies considerably by the disease and remission status at the time of transplantation. Any retrospective or prospective HSCT study that enrolls patients across disease types must account for this heterogeneity; yet, current methods are neither standardized nor validated. We conducted a retrospective study of 1539 patients who underwent transplantation at Dana-Farber Cancer Institute/Brigham and Women's Hospital from 2000 to 2009. Using multivariable models for overall survival, we created a disease risk index. This tool uses readily available information about disease and disease status to categorize patients into 4 risk groups with significantly different overall survival and progression-free survival, primarily on the basis of differences in relapse risk. This scheme applies regardless of conditioning intensity, is independent of comorbidity index, and was validated in an independent cohort of 672 patients from the Fred Hutchinson Cancer Research Center. This simple and validated scheme could be used to risk-stratify patients in both retrospective and prospective HSCT studies, to calibrate HSCT outcomes across studies and centers, and to promote the design of HSCT clinical trials that enroll patients across diseases and disease states, increasing our ability to study nondisease-specific outcomes in HSCT.

Introduction

Allogeneic HSCT can be a curative option for a large number of hematologic malignancies, including acute and chronic leukemias as well as indolent and aggressive lymphoid neoplasms. However, the success of HSCT is heavily dependent on the disease and disease status at the time of transplantation.1-6 This creates a particular problem in studying and reporting HSCT outcomes because of the need to account for the heterogeneity of disease and disease status in a study cohort. This problem is akin to that of having to account for heterogeneity in comorbidities or donor HLA match, which both influence HSCT outcome. In those 2 cases, accepted grouping schemes already exist,7,8 but in the case of disease and disease status, there is at present no uniformly accepted scheme, even though the variation in HSCT outcome on the basis of disease/status is very large. For example, the survival for patients undergoing HSCT for chronic myelogenous leukemia in the chronic phase is more than 70%,9 whereas that for patients undergoing HSCT for adverse karyotype AML not in complete remission is less than 20%.10

Therefore, it is essential to account for this heterogeneity in any retrospective or prospective HSCT study that enrolls patients across diseases and disease stages. Furthermore, given the major prognostic importance of cytogenetics for AML and MDS,11,12 it is critical that this information be included in a comprehensive disease/status grouping scheme. At present, most investigators stratify patients among disease/status groups by including a binary classification of low- or high-risk disease in multivariable models; the particular classifications vary across studies, usually reflecting the general experience of that transplantation center,13-15 and often ignore cytogenetics. An alternative strategy is to decrease the heterogeneity of the study population by restricting the analysis or the clinical trial to a limited number of diseases and remission states, but this can substantially diminish the available sample size.

A validated scheme for stratifying patients by disease and disease status (disease/status) could find important uses in the design and interpretation of both retrospective and prospective studies. In the present study, we developed and validated such a disease/status scheme for adult HSCT patients on the basis of a retrospective analysis of 2 large single-institution cohorts.

Methods

Patients

The training set comprised 1539 consecutive adult patients who underwent their first HSCT with myeloablative conditioning (MAC) or reduced-intensity conditioning (RIC) at Dana-Farber Cancer Institute/Brigham and Women's Hospital within the 10-year period of 2000-2009. Patients undergoing transplantation for benign hematologic conditions were excluded. We also excluded 16 patients with very rare diseases (natural killer or large granular lymphocyte leukemia, mast cell leukemia, Burkitt, or lymphoblastic lymphoma) or with more than 1 hematologic malignancy. For all patients, we collected pre-HSCT information and HSCT outcomes from our transplantation database, with independent confirmation of all disease and status information via a review of electronic medical records. We collected cytogenetics for acute myeloid leukemia (AML), acute lymphoblastic leukemia (ALL), myelodysplastic syndrome (MDS), and chronic lymphocytic leukemia (CLL) when such information was available in the medical records. For patients who underwent transplantation between 2005 and 2009, we collected (when available) data on comorbidities necessary to calculate the hematopoietic cell transplantation comorbidity index (HCT-CI).8 Comorbidity information was extracted retrospectively for 394 patients who underwent HSCT between 2005 and 2007 and prospectively collected for 324 patients who underwent HSCT after 2007. Institutional research board approval was obtained from the Dana-Farber Cancer Institute/Brigham and Women's Hospital Office for Human Research Studies.

To validate the scheme, we used an independent external cohort of 672 consecutive adult patients who underwent HSCT at the Fred Hutchinson Cancer Research Center between 2000 and 2006 with MAC or RIC.

Transplantation

Patients in the training cohort underwent transplantation under a variety of treatment plans and investigational protocols. MAC regimens consisted mostly of cyclophosphamide (3600 mg/m2 or 120 mg/kg) plus total body irradiation (1400 cGy in 7 fractions) or busulfan (12.8 mg/kg intravenously) plus cyclophosphamide (3600 mg/m2). RIC regimens consisted of fludarabine (120 mg/m2) plus intravenous low-dose busulfan (3.2-6.4 mg/kg) with or without antithymocyte globulin. Patients received BM or peripheral blood stem cells from HLA-matched or mismatched, related or unrelated donors, or double umbilical cord blood units. GVHD prophylaxis consisted mostly of a calcineurin inhibitor (cyclosporine or tacrolimus) combined with methotrexate, with or without sirolimus, or cyclosporine with mycophenolate mofetil. Supportive care for all patients followed institutional standards.

Definitions

For AML and MDS, we classified cytogenetics according to HSCT-specific schemes described previously.11,12 Patients whose cytogenetics were unavailable (6% of patients in the training cohort and 4% of those in the validation cohort) were assigned to the intermediate-risk category (because their outcomes were very similar). Of note, AML and MDS cytogenetics in the validation cohort were classified according to the Southwest Oncology Group/Eastern Cooperative Oncology Group scheme16 (and could not be reclassified because the primary data were not available for many of the patients). For CLL, we considered del(17p), del(11q), and complex as adverse; for ALL, we considered t(9;22), t(4;11), and complex as adverse. However, cytogenetics did not affect HSCT outcome for CLL or ALL; the hazard ratio (HR) for mortality associated with adverse cytogenetics (compared with intermediate) was 1.1 (P = .9) for CLL and 0.9 (P = .5) for ALL in the multivariable model. Therefore, all cytogenetics categories were grouped together in the final models for those 2 diseases. We subclassified non-Hodgkin lymphoma as described in Table 1.16 Complete remission (CR) and partial remission (PR) were documented as reported in the medical record, with the latter term only applying to lymphomas. For patients with recurrent or residual disease, we defined relapse as the recurrence of disease after a documented CR or PR and induction failure as persistent disease without first achieving remission of any type. In the case of chronic myelogenous leukemia, accelerated and blast phase were considered together, given the small number of patients in each category.

Statistical analysis

Patient baseline characteristics were reported descriptively. Overall survival (OS) and progression-free survival (PFS) were calculated with the Kaplan-Meier method. OS was defined as the time from stem cell infusion to death from any cause. Patients who were alive or lost to follow-up were censored at the time last seen alive. PFS was defined as the time from stem cell infusion to disease relapse or progression or death from any cause, whichever occurred first. Patients who were alive without disease relapse or progression were censored at the time last seen alive and progression-free. The log-rank test was used for comparisons of Kaplan-Meier curves. Cumulative incidence curves for nonrelapse mortality (NRM) and relapse with or without death were constructed reflecting time to relapse and time to NRM, respectively, as competing risks. Time to relapse and time to NRM were measured from the date of stem cell infusion.
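The OS and PFS definitions above can be illustrated with a minimal sketch of the Kaplan-Meier product-limit estimator. This is not the SAS or R code used in the study; the function names (`kaplan_meier`, `pfs_endpoint`) and the follow-up data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def pfs_endpoint(relapse_time, death_time, last_seen):
    """PFS endpoint: first of relapse/progression or death (event = 1);
    otherwise censor at the time last seen alive and progression-free."""
    candidates = [t for t in (relapse_time, death_time) if t is not None]
    if candidates:
        return min(candidates), 1
    return last_seen, 0


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up measured from stem cell infusion
    events : 1 if the endpoint occurred at that time, 0 if censored
    """
    n_events = Counter(t for t, e in zip(times, events) if e == 1)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = n_events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ends at t leaves the risk set
        at_risk -= sum(1 for u in times if u == t)
    return curve


# Hypothetical follow-up in months for 6 patients (1 = death, 0 = censored)
times = [2, 5, 5, 8, 12, 12]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: estimated OS = {s:.3f}")
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimate changes only at observed event times.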

The difference between cumulative incidence curves in the presence of a competing risk was tested with the Gray method.17 Potential prognostic factors for OS, PFS, relapse, and NRM were examined in the proportional hazards model as well as in the competing risks regression model.18 The variables considered are detailed in Table 4. The proportional hazards assumption for each variable of interest was tested, and interaction terms were examined. The linearity assumption for continuous variables was examined by the use of restricted cubic spline estimates of the relationship between the continuous variable and log relative hazard,19 and the cutoff points of these variables were determined by the change of the log relative hazards. All P values are 2-sided with a significance level of .05. All calculations were performed with SAS 9.3 (SAS Institute), and R Version 2.13.2 (the Comprehensive R Archive Network project).
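For readers unfamiliar with competing-risks estimation, the following is a minimal sketch of the standard nonparametric cumulative incidence estimator, in which relapse and NRM compete. It is illustrative only: it does not implement the Gray test or competing risks regression, and the function name, the cause coding (0 = censored, 1 = relapse, 2 = NRM), and the data are assumptions made for this example.

```python
def cumulative_incidence(times, causes, cause):
    """Cumulative incidence of `cause`, treating other causes as competing.

    times  : follow-up measured from stem cell infusion
    causes : 0 = censored, 1 = relapse, 2 = NRM (hypothetical coding)
    Returns [(t, CIF(t))] at each time where an event of `cause` occurs.
    """
    at_risk = len(times)
    event_free = 1.0  # probability of no event of any type just before t
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        here = [c for u, c in zip(times, causes) if u == t]
        d_cause = sum(1 for c in here if c == cause)
        d_any = sum(1 for c in here if c != 0)
        if d_cause:
            # Increment = P(event-free until t) * cause-specific hazard at t
            cif += event_free * d_cause / at_risk
            curve.append((t, cif))
        # Any event type removes a patient from the event-free state
        event_free *= 1 - d_any / at_risk
        at_risk -= len(here)
    return curve


# Hypothetical data for 6 patients
times = [1, 2, 3, 4, 5, 6]
causes = [1, 2, 0, 1, 2, 0]
relapse = cumulative_incidence(times, causes, cause=1)
nrm = cumulative_incidence(times, causes, cause=2)
print("relapse:", relapse)
print("NRM:", nrm)
```

Unlike one minus a Kaplan-Meier estimate that censors the competing event, these two curves plus the event-free probability sum to 1 at every time point.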

Results

Patient characteristics in the training set

The baseline characteristics of the 1539 patients in the training set are shown in Table 1. Eight hundred twelve patients (53%) underwent MAC and 47% underwent RIC. The median age was 49 years (range, 18-73 years). AML was the most common disease (37% of patients). Twenty-nine percent of patients were in first CR (CR1) at the time of transplantation; the frequencies of other diseases and stages are shown in Table 1. Forty percent of patients underwent transplantation from HLA-matched, related donors, whereas 45% underwent transplantation from HLA-matched, unrelated donors, and 15% received HLA-mismatched transplantation (including 6% of patients who received double umbilical cord blood products). Eighty percent of patients received stem cells harvested from peripheral blood, and the majority (96%) received GVHD prevention regimens that included a calcineurin inhibitor. Median follow-up for survivors was 4 years.

Table 1

Baseline patient characteristics

Derivation of the disease/status groups

We constructed Cox proportional hazards models for OS that included disease and status together with the following variables: age, sex of donor and recipient, donor type and HLA match, graft source, CMV serostatus of donor and recipient, GVHD prophylaxis regimen, therapy-related or transformed disease, Flt3-ITD status for AML (when available), year of transplantation, and whether treatment was performed on or off a clinical trial. Initially, separate models were built for the MAC and RIC patients. Because graft source and year of HSCT violated the proportional hazards assumption, the models were stratified according to both variables. Diseases that were underrepresented in 1 of the 2 groups (multiple myeloma, myeloproliferative neoplasms, mantle cell lymphoma, T-cell lymphomas, Hodgkin lymphoma) were excluded in this step. On the basis of the HRs for mortality associated with each disease and with each status, with cutoffs of 0.67 and 1.5, a disease grouping scheme and a status grouping scheme were created within each conditioning group. Those HR cutoffs were chosen mainly because, in our models, they seemed to correspond roughly to the threshold of statistical significance, with almost all differences above that level significant and none below it significant. Remarkably, the disease grouping schemes were identical for MAC and RIC groups. The status schemes were almost identical, with the exception of second or subsequent PR (PR2+), which fell into the intermediate-risk group for RIC and into the high-risk group for MAC. On the basis of this finding, the RIC and MAC cohorts were combined to assign the rarer diseases to the appropriate risk groups.
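The cutoff rule described above can be sketched as a simple classification of each hazard ratio into a risk tier. This is an illustration rather than the authors' code: the function name, the handling of values exactly at a cutoff, and the example HRs are assumptions, and the disease labels are placeholders rather than results from the study.

```python
LOW_CUTOFF = 0.67   # HR cutoff reported in the text
HIGH_CUTOFF = 1.5   # HR cutoff reported in the text


def risk_tier(hazard_ratio):
    """Assign a tier from an HR for mortality relative to the reference group.

    Assumption: values exactly equal to a cutoff are treated as intermediate;
    the text does not specify boundary handling.
    """
    if hazard_ratio < LOW_CUTOFF:
        return "low"
    if hazard_ratio > HIGH_CUTOFF:
        return "high"
    return "intermediate"


# Hypothetical HRs, for illustration only (not values from the study)
example_hrs = {"disease A": 0.5, "disease B": 1.0, "disease C": 2.1}
tiers = {name: risk_tier(hr) for name, hr in example_hrs.items()}
print(tiers)
```

The two cutoffs are approximately reciprocal (1/1.5 is about 0.67), so the intermediate band is roughly symmetric around an HR of 1 on the log scale.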

The preceding steps yielded a 3-group disease risk scheme and a 2-group status risk scheme for MAC patients, generating 6 possible combinations of disease and status (Table 2). Of note, there was no significant interaction between disease and status risk in the multivariable models. The OS of patients in each of these 6 MAC disease/status groups is shown in Figure 1A. As can be seen from the figure and confirmed in multivariable models, the 6 combinations could be collapsed into 4 distinct groups (Table 3). Similarly, the RIC patients could also be assigned to 1 of 3 disease and 1 of 2 status risk groups (Table 2), again generating 6 possible combinations, whose OS is plotted in Figure 1B. As with the MAC groups, the 6 RIC groups could easily be collapsed into 4 groups (Table 3). In the last step, the MAC and RIC cohorts were recombined, and patients were assigned to the appropriate disease/status group based on the groups defined in the previous steps (Tables 2 and 3). Figure 1C shows the OS of the patients within each group and for each conditioning intensity. It is apparent from the figure, and again confirmed in multivariable models, that the MAC and RIC risk groups had nearly identical outcomes, justifying the assignment of all patients, regardless of conditioning intensity, to 1 of 4 disease/status risk groups.

Table 2

Summary of disease and stage risk groups

Figure 1

OS after HSCT stratified by disease/status group and conditioning intensity. (A) MAC patients, (B) RIC patients, and (C) overall disease/status risk groups, all patients.

Table 3

Summary of overall risk groups

Performance of the disease/status grouping scheme

As illustrated in Table 3 and Figure 2A, this scheme stratified patients into 4 groups with very different rates of OS (4-year OS 64% in the low-risk group [which comprised 15% of patients], 46% in the intermediate group [55% of patients], 26% in the high-risk group [27% of patients], and 6% in the very-high-risk group [3% of patients]; P < .0001). In the multivariable model (Table 4), compared with the intermediate group, the HR for mortality associated with low-risk disease/status was 0.6 (P < .0001), high-risk 1.8 (P < .0001), and very-high-risk 3.1 (P < .0001). The groups were also very different in terms of PFS (Figure 2B), with 4-year PFS ranging from 56% in the low-risk group to 6% in the very-high-risk group (P < .0001), confirmed by multivariable analysis (Table 4). The difference in outcome was driven entirely by a difference in the cumulative incidence of relapse, with no significant difference in NRM (Tables 3-4, Figure 2C-D).

Figure 2

Outcomes of HSCT stratified by overall disease/status risk group. (A) OS. (B) PFS. (C) Cumulative incidence of relapse. (D) Cumulative incidence of NRM.

Table 4

Multivariable analyses in the training cohort

Impact of comorbidities

We were able to calculate an HCT-CI8 for 718 of the patients in our cohort (83% of the patients who underwent transplantation after January 2005). The median score for those patients was 1 (range, 0-9). The median HCT-CI of the patients whose data were collected retrospectively was 1 (range, 0-9) versus 0 (range, 0-6) for those whose data were collected prospectively (P < .0001). Patients in the high- or very-high-risk groups were more likely to have a high HCT-CI (31% of patients had an HCT-CI > 2) than those in the low-risk group, where only 19% had an HCT-CI > 2 (P = .03). We built a multivariable model for OS for those 718 patients. In this model, the disease/status group retained its significance (HR for low risk 0.4, P = .0001; for high risk 1.6, P = .0003; for very high risk 3.8, P < .0001, all compared with intermediate risk). The HR for mortality for HCT-CI of 1-2 (compared with HCT-CI = 0) was 1.2 (P = .14), and for HCT-CI 3+ was 1.7 (P = .0002). Therefore, the disease/status score and a high comorbidity score (HCT-CI > 2) were independently predictive of increased mortality.

Validation of the disease/status scheme

To validate our scheme, we used an independent cohort of 672 patients who underwent transplantation at Fred Hutchinson Cancer Research Center between 2000 and 2006. Their baseline characteristics are also shown in Table 1. Patients in the validation cohort were on average younger, much more likely to receive MAC, and more likely to receive stem cells from an HLA-matched related donor. We classified the 672 patients according to our disease/status grouping scheme. As shown in Figure 3A and B, the scheme stratified the patients successfully for both OS and PFS (P < .001 for both).

Figure 3

Outcomes of patients in the validation cohort stratified by disease/status risk group. (A) OS. (B) PFS.

Discussion

We propose a new tool for stratifying HSCT patients by disease and disease status at the time of transplantation, which we term the disease risk index (DRI). The DRI uses a combination of a ternary breakdown for disease type and a binary breakdown for remission status to assign patients to 1 of 4 risk categories that differ very significantly (statistically and clinically) with respect to OS and PFS. The DRI incorporates variables that have been shown to be strongly prognostic in the HSCT population, specifically the histologic subtype in lymphomas and cytogenetics for AML and MDS. Patients with non-Hodgkin lymphoma often are lumped into a single group for risk-stratification purposes, yet in recent series investigators repeatedly have shown that indolent B-cell lymphomas and CLL have a better prognosis after HSCT than do other lymphomas,4,20-25 a finding supported (without any a priori assumption) by our analysis.

Similarly, it is clear that cytogenetics in AML and MDS are a very strong determinant of HSCT outcome,11,16,26,27 which our results also confirm, and which makes cytogenetics an essential part of any risk stratification. A disease stratification system by post-HSCT relapse risk has been previously proposed28 but was limited to patients undergoing non-MAC and did not successfully stratify the patients for OS. The DRI is exclusively determined on the basis of disease/status and should be used alongside other prognostic variables, unlike global indices such as the European Group for Blood and Marrow Transplantation score29 or the pretransplantation assessment of mortality score.13

The DRI was by far the most important determinant of HSCT outcome in multivariable modeling, which was driven primarily by differences in the risk of relapse between the groups. The stratification ability of the DRI was validated in an independent cohort from a separate institution. The outcomes in the validation cohort within each group were slightly different from those in the training cohort, which is likely explained by differences in the baseline characteristics of the patients, as well as differences in the classification scheme used for cytogenetics. In fact, this result supports the robust applicability of the DRI across different cytogenetics grouping schemes used for AML and MDS. We also recognize that the validation cohort contained few patients undergoing RIC HSCT and few patients with CLL, multiple myeloma, or lymphoma, which weakens our ability to validate those disease assignments. Further validation studies will be important to strengthen or refine this index, especially for rarer diseases.

A striking result in this analysis was the similarity of outcomes between MAC and RIC groups. We started with the assumption that the optimal disease/status grouping schemes would be different for MAC and RIC patients. In fact, although the MAC and RIC groups were completely independently derived, they turned out to be nearly identical. Moreover, when patients were stratified by DRI, the outcomes in the MAC and RIC groups were remarkably similar (Figure 1C). This finding offers a very general validation of the growing number of disease-specific studies suggesting that conditioning intensity is not a strong determinant of HSCT outcome.30-32

It is noteworthy that a high HCT-CI (which in general reflects a greater risk of NRM) remained prognostic independently of the DRI, implying that the 2 scores may be used simultaneously. Interestingly, the very high-risk group appeared to have a greater risk of NRM than the other groups (Figure 2D). This finding is consistent with other reports showing that patients with more advanced disease may have a greater NRM.18,19 However, this assessment was not supported in multivariable analyses, where the HR for NRM in the very-high-risk group (compared with intermediate) was only 1.2 (P = .5). It may therefore be that patients with more advanced disease go into transplantation in a worse state and, therefore, with a greater risk of NRM. In support of this interpretation, patients in the high- or very-high-risk groups were more likely to have a high HCT-CI than those in the low-risk group, as has been previously described.33 This pattern would be consistent with advanced disease patients receiving more intensive pretransplantation chemotherapy, although our data were not sufficient to test this hypothesis.

We have attempted to minimize the possible biases in this study by studying a large patient cohort, by manually reconfirming all the disease and status information obtained from our transplantation database, and by validating the scheme in a large independent external cohort. One limitation of the DRI is the broadness of its categories. Undoubtedly, distinct subgroups with different prognoses exist within each broad disease/status group. However, this simplicity may also be this scheme's strength. It has only 4 groups and is readily obtainable for any patient, using information that requires minimal interpretation, is routinely collected (eg, in the current reporting forms of the Center for International Blood and Marrow Transplant Research), and should be available in all centers' databases. A DRI assignment should therefore be easily made for patients in any retrospective study, as well as in past, ongoing, and future clinical trials. Prognostic schemes are by nature fluid, not static, and as new prognostic factors (eg, molecular markers) emerge and are confirmed, the DRI could easily be revised to accommodate the new information.

Transplantation physicians will find few surprises in our risk groups, which is in fact reassuring, because it implies that our cohort was large enough to eliminate the influence of unique populations. We chose OS as our primary end point because it is ultimately the most relevant clinical outcome. Not surprisingly, given the close relationship between PFS and OS after HSCT, the DRI had high stratification ability for PFS as well. In fact, we also developed a grouping scheme for PFS (not shown), which turned out to be nearly identical to the DRI.

In summary, we propose the DRI as a simple system for risk-stratifying heterogeneous populations of HSCT patients into 4 risk groups solely on the basis of disease and remission status at the time of transplantation. These 4 risk groups have significantly different risks of relapse, OS, and PFS; they apply regardless of conditioning regimen intensity, retain their prognostic relevance after stratifying by comorbidity index, and were validated in an external independent cohort. The DRI could be useful in a variety of settings: first, it could improve the quality of any HSCT study that examines the prognostic role of variables other than disease/status; second, it would facilitate the interpretation of single-arm studies when such studies are performed across a variety of disease/status categories (eg, a phase 2 study of new conditioning or GVHD prevention regimens); third, it could be used to calibrate HSCT outcome across transplantation centers, as is done for the Stem Cell Therapeutic Outcomes Database; fourth, it would provide a reliable way to prospectively stratify patients entering HSCT clinical trials that are not disease-specific (eg, comparative trials of GVHD prevention regimens), which is critical given the importance of disease/status for outcome; fifth, and perhaps even more importantly, it could promote the design of HSCT trials that are not disease-specific by removing the obstacle of outcome variability introduced by disease/status heterogeneity, which could significantly increase the power and generalizability of the trials.

Authorship

Contribution: P.A. designed and performed the research, analyzed the data, and wrote the paper; C.J.G. performed the research, analyzed the data, and wrote the paper; C.C., V.T.H., J.K., E.P.A., and J.R. collected data and edited the paper; M.L.S. collected data, analyzed the data, and edited the paper; S.J.L. and H.J.D. collected data and edited the paper; B.E.S. analyzed the data and edited the paper; F.R.A., J.H.A., and R.J.S. collected data and edited the paper; and H.T.K. designed the research, analyzed the data, and edited the paper.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Philippe Armand, MD, PhD, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215; e-mail: parmand@partners.org.

Acknowledgments

P.A. is supported by an American Society of Hematology Scholar Award and by an ASCO/Conquer Cancer Foundation Career Development Award. M.L.S. is supported by a National Institutes of Health Pathway to Independence Award (HL088021). This work also was supported by National Institute of Allergy and Infectious Diseases U19 AI29530, National Heart, Lung, and Blood Institute PO1 HL070149, and National Cancer Institute PO1 CA18029.

Footnotes

  • * P.A. and C.J.G. contributed equally to this work.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted March 17, 2012.
  • Accepted June 6, 2012.
