Blood Journal

Quantitative analysis of minimal residual disease predicts relapse in children with B-lineage acute lymphoblastic leukemia in DFCI ALL Consortium Protocol 95-01

Jianbiao Zhou (1), Meredith A. Goldwasser (2), Aihong Li (1), Suzanne E. Dahlberg (2), Donna Neuberg (2), Hongjun Wang (1), Virginia Dalton (3), Kathryn D. McBride (3), Stephen E. Sallan (3,4), Lewis B. Silverman (3,4), and John G. Gribben (1), for the Dana-Farber Cancer Institute ALL Consortium

(1) Department of Medical Oncology, (2) Biostatistics and Computational Biology, and (3) Pediatric Oncology, Dana-Farber Cancer Institute, and (4) Children's Hospital, Harvard Medical School, Boston, MA

Abstract

In a prospective trial in 284 children with B-lineage acute lymphoblastic leukemia (ALL), we assessed the clinical utility of real-time quantitative polymerase chain reaction analysis of antigen receptor gene rearrangements for detection of minimal residual disease (MRD) to identify children at high risk of relapse. At the end of induction therapy, the 5-year risk of relapse was 5% in 176 children with no detectable MRD and 44% in 108 children with detectable MRD (P < .001), with a linear association of the level of MRD and subsequent relapse. Recursive partitioning and clinical characteristics identified that the optimal cutoff level of MRD to predict outcome was 10⁻³. The 5-year risk of relapse was 12% for children with MRD less than one leukemia cell per 10³ normal cells (low MRD) but 72% for children with MRD at or above this level (high MRD) (P < .001), and children with high MRD had a 10.5-fold greater risk of relapse. Based upon these results, we have altered our treatment regimen for children with B-lineage ALL, and children with MRD levels greater than or equal to 10⁻³ at the end of 4 weeks of multiagent induction chemotherapy now receive intensified treatment to attempt to decrease their risk of subsequent relapse.

Introduction

Modern multiagent anticancer drug regimens and risk-stratified treatment have made childhood acute lymphoblastic leukemia (ALL) the most successful example of a curable cancer.1-4 In 4 consecutive clinical trials conducted by the Dana-Farber Cancer Institute (DFCI) ALL Consortium between 1981 and 1995, the 5-year event-free survival improved from 74% (± 3%) to 83% (± 2%).5,6 (A complete list of the members of the DFCI ALL Consortium is provided in Document S1, available on the Blood website; see the Supplemental Materials link at the top of the online article.) Identification of clinical and biologic features associated with poor outcome allowed introduction of a risk-adjusted strategy.7 It remains important to evaluate novel risk factors to predict outcome so that therapy can be changed for children at high risk of relapse and, potentially, so that toxicity can be decreased for children in whom less intensive therapy might be administered.

Children in complete remission (CR) can have up to 10¹⁰ leukemic cells,8-10 and with one exception,11 studies in childhood ALL have demonstrated the prognostic significance of detection and quantification of minimal residual disease (MRD).12-19 MRD assessment from as early as 2 weeks after starting therapy predicts outcome,20,21 with additional information obtained by analyses at multiple time points.14,22 Three methods are widely used for monitoring MRD: multiparameter flow cytometric analysis,8,10 polymerase chain reaction (PCR) amplification of fusion transcripts,23 and PCR amplification of the antigen receptor rearrangements for immunoglobulin (Ig) or the T-cell receptors (TCR).14,15 PCR amplification of Ig and TCR rearrangements is limited by oligoclonality and clonal evolution,24-26 but is widely applicable and has high sensitivity. Difficulties in reproducibly quantifying MRD can be overcome by real-time quantitative (RQ)-PCR, using consensus probes for framework27,28 or joining regions.29 The clinical utility of RQ-PCR has been reported to date in only small numbers of children.25,30-34

In this prospective study, we report on the clinical utility of RQ-PCR analysis of MRD in a subset of 284 children with B-lineage ALL on DFCI ALL Consortium Protocol 95-01. Detection of MRD at levels greater than or equal to one leukemia cell in 10³ normal cells at the end of induction therapy was associated with a 10.5-fold greater risk of relapse compared with those with MRD below this level, after adjusting for risk and treatment group. Based upon these findings, we have changed our treatment and intensify therapy for children with B-lineage ALL with high MRD levels at the end of remission induction therapy.

Patients, materials, and methods

Patients and samples

From 1996 to 2000, 498 children with ALL were enrolled consecutively at 8 participating consortium institutions in DFCI ALL Consortium Protocol 95-01.35 Institutional review board approval was received for treatment and procurement of samples in all cases. Informed consent was obtained in accordance with the Declaration of Helsinki. Children were classified at diagnosis as standard risk (SR) or high risk (HR) based upon age, white blood cell (WBC) count, immunophenotype, presence or absence of central nervous system leukemia, and presence or absence of anterior mediastinal mass, and therapy was adjusted for risk group.36 All patients received a 4-week remission induction regimen, including vincristine, prednisone, doxorubicin, high-dose methotrexate, and intrathecal therapy. Postremission consolidation included weekly high-dose asparaginase for all patients, with a randomization to either E coli or Erwinia asparaginase; high risk patients also received doxorubicin up to a cumulative dose of 300 mg/m². Total duration of therapy was 25 months. Bone marrow (BM) and/or peripheral blood (PB) samples were obtained at diagnosis and at the end of induction (day 30). Of the 498 enrolled children, 491 were eligible for evaluation; 52 had T-cell ALL and 1 had missing immunophenotype, leaving 438 eligible children with B-lineage ALL. Criteria for inclusion in the present study were B-lineage ALL achieving complete clinical remission (CR) at the end of induction therapy, the presence of at least one MRD marker with a sensitivity of at least 10⁻³, and a day-30 BM sample available for analysis. Of the 438 eligible children, 154 were excluded from the present analysis because of induction death (4), induction failure (4), unavailable diagnostic samples (15), unavailable day-30 BM sample (53), no informative molecular marker (57), or failure to reach the required sensitivity of 10⁻³ by RQ-PCR (21).
Therefore, 284 children (65%) were included in the final MRD analysis. Date of analysis was March 2007.
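The cohort accounting above can be retraced with simple arithmetic. The following is a minimal sketch that uses only the counts stated in the text; the variable names and dictionary labels are illustrative and not part of the study.

```python
# Cohort accounting for the MRD analysis, using only counts stated in the text.
enrolled = 498
eligible = 491
t_cell = 52
missing_immunophenotype = 1

# Reasons for exclusion from the MRD analysis (labels are illustrative).
exclusions = {
    "induction death": 4,
    "induction failure": 4,
    "unavailable diagnostic sample": 15,
    "unavailable day-30 BM sample": 53,
    "no informative molecular marker": 57,
    "sensitivity of 10^-3 not reached": 21,
}

# Eligible children with B-lineage ALL.
b_lineage_eligible = eligible - t_cell - missing_immunophenotype
total_excluded = sum(exclusions.values())
analyzed = b_lineage_eligible - total_excluded
fraction_analyzed = analyzed / b_lineage_eligible

print(b_lineage_eligible, total_excluded, analyzed, f"{fraction_analyzed:.0%}")
```

The totals reproduce the 438 eligible children with B-lineage ALL, the 154 exclusions, and the 284 children (65%) in the final analysis.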

Molecular target identification

DNA and RNA were extracted and purified from mononuclear cells, and IgH, TCRγ, and TCRδ products were PCR amplified and both strands sequenced.25,37 MRD quantification by RQ-PCR was performed for IgH and TCR rearrangements, with results reported as the mean of triplicate copy numbers of the target gene divided by the mean of triplicate copy numbers of glyceraldehyde-3-phosphate dehydrogenase (GAPDH).27 Unlike chromosomal translocations, for which primers can be designed that produce reproducible levels of sensitivity, the primers and probes that can be used for Ig and TCR rearrangements are constrained by the specific variable and joining genes used and by the length of the complementarity determining region III (CDRIII).27,28 In cases where the CDRIII region is short, there is competition between the rearrangement in the leukemic cells and in healthy lymphocytes. Assays that did not reproducibly reach a sensitivity of at least 10⁻³ were excluded from analysis; this occurred for 21 children.
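The normalization described above (mean of triplicate target copy numbers divided by mean of triplicate GAPDH copy numbers) can be sketched as follows. The function name and the example copy numbers are hypothetical and are not data from the study; how copy numbers are derived from standard curves is not modeled here.

```python
from statistics import mean


def normalized_mrd(target_copies, gapdh_copies):
    """Mean target copy number divided by mean GAPDH copy number."""
    if len(target_copies) != 3 or len(gapdh_copies) != 3:
        raise ValueError("expected triplicate measurements")
    reference = mean(gapdh_copies)
    if reference <= 0:
        raise ValueError("GAPDH copy number must be positive")
    return mean(target_copies) / reference


# Hypothetical triplicates, not data from the study.
example = normalized_mrd([12.0, 9.0, 9.0], [10000.0, 10000.0, 10000.0])
print(example)
```

With these illustrative inputs the ratio is about one target copy per 10³ GAPDH copies, which corresponds to the 10⁻³ level used throughout the paper.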

Statistical analysis

Differences in presenting characteristics were compared using chi-square tests. All freedom from relapse (FFR) analyses at the end of induction used the maximum MRD value of any BM sample obtained at day 30 (± 7-day window) of treatment for any molecular markers identified. MRD was categorized as undetectable or into 5 ordered groups defined by a 1 log difference in detectable MRD level (Table 1). FFR was defined as the time between the date of CR and the date of relapse, censored at the date of last contact or remission death. The Kaplan-Meier method38 was used to estimate the distribution of FFR, and univariate associations between MRD groups were tested using log-rank tests. Multivariable regression analysis of FFR was conducted using Cox proportional hazards models,39 controlling for treatment group, risk group, and potential interactions of these and other covariates through a stepwise selection approach. Recursive partitioning was used to explore the best cut-point for classifying MRD groups using the rpart function in R.40 All tests conducted were 2-sided at the .05 significance level. There were no corrections for multiple comparisons.
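To make the FFR definition concrete, the following is a minimal sketch of the textbook Kaplan-Meier (product-limit) estimator with right censoring. It is not the authors' analysis code; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated FFR)] at each distinct relapse time.

    times  -- follow-up time per patient (from CR to relapse or censoring)
    events -- 1 if relapse was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            leaving += 1
            i += 1
        if relapses:
            # Product-limit step: multiply by the fraction relapse-free at t.
            survival *= 1 - relapses / at_risk
            curve.append((t, survival))
        # Relapsed and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in years: 1 = relapse, 0 = censored.
curve = kaplan_meier([1, 2, 2, 3, 4, 5], [1, 0, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored children (last contact or remission death) contribute to the risk set until their censoring time but never produce a step in the curve.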

Table 1

Distribution of MRD values at day 30 and freedom from relapse (FFR) outcome

Results

Study population

The primary goal was to assess the clinical significance of MRD quantification at the end of induction therapy. Two hundred eighty-four children with B-lineage ALL fulfilled criteria for inclusion in this study. The median follow-up was 5.6 years and 5-year overall FFR was 0.80 (± 0.02 SE). Two children died in remission without relapse and were censored at date of death, and 56 had relapsed by the time of analysis. No significant differences were observed between those included (n = 284) and not included (n = 146) in the end of induction MRD analysis for risk group (P = .36), WBC count group (P = .14), sex (P = .35), treatment group (P = .59), and age group (P = .0547). Since the P value for age group was close to the level of significance, we also examined the association with age after combining the HR age categories and comparing them with the SR ages; the P value was then .46. Those included had lower 5-year FFR (80%) than those not included (88%) (P = .054). No significant differences were observed between the 81 patients with no informative marker and those for whom a marker was identified for risk group (P = .50), WBC count (P = .16), age group (P = .19), sex (P = .21), treatment group (P = .99), and FFR (P = .08).

MRD value at end of induction predicts relapse

We analyzed the impact of the level of MRD at the end of induction (day 30) in the 284 children, all of whom had at least one BM sample. For each child, the maximum MRD value for any marker identified25,37 among the day-30 samples was used. Potential sensitivity of detection of the rearrangement was assessed from the standard curves of the cloned PCR product27 and was 10⁻⁶ in 189 (66.6%), 10⁻⁵ in 49 (17.3%), 10⁻⁴ in 26 (9.2%), and 10⁻³ in 20 (7.0%) cases. MRD was undetectable in 176 (62.0%), 11 of whom relapsed, with 5-year FFR of 0.95 (± 0.02). Eight of these 11 cases had evidence of clonal evolution, with a different sequence at relapse compared with that observed at presentation.25 MRD was detectable in 108 (38.0%), 45 of whom relapsed, with 5-year FFR of 0.56 (± 0.05; P < .001). When MRD was stratified into the ordinal categories shown in Table 1, the log-rank trend test indicated a significant linear association of MRD level with risk of relapse (P < .001). A Cox regression model of FFR estimated a 2-fold (95% CI, 1.8-2.4) increase in the risk of relapse for each 1 log increase in the level of MRD (Figure 1). Moreover, results from fitting a Cox regression model of FFR with MRD detected or not and an interaction term between MRD detection and the actual MRD value included as predictors suggested that among those with detectable MRD, the actual MRD value was associated with increased risk of relapse (P = .001) after accounting for MRD detection overall (P < .001).
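The per-log estimate can be extended to larger MRD differences under the usual Cox assumption that the log hazard is linear in log10 MRD. The sketch below is illustrative only: the function name is hypothetical, the point estimate and interval are the ones quoted in the text, and the interval scaling assumes that all uncertainty comes from the single per-log coefficient.

```python
def hazard_ratio_for_logs(per_log_hr, logs):
    """Hazard ratio for an MRD difference of `logs` log10 units."""
    return per_log_hr ** logs


PER_LOG_HR = 2.0         # point estimate quoted in the text
PER_LOG_CI = (1.8, 2.4)  # 95% CI quoted in the text

for logs in (1, 2, 3):
    hr = hazard_ratio_for_logs(PER_LOG_HR, logs)
    low, high = (hazard_ratio_for_logs(bound, logs) for bound in PER_LOG_CI)
    print(f"{logs} log: HR {hr:.1f} (95% CI {low:.1f}-{high:.1f})")
```

Under this assumption, a child whose MRD is 3 logs higher than another's would have roughly an 8-fold higher hazard of relapse.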

Figure 1

Freedom from relapse by MRD level at day 30. Patients are grouped based on 6 categories of MRD: undetectable or, if detectable, defined by a 1 log difference in MRD value as shown in Table 1.

Recursive partitioning analysis was used to explore the best MRD cut-point based on the FFR outcome with the MRD groups categorized as in Table 1 and treated as nonordinal categories (results based on ordinal categories and the actual MRD values did not differ substantially). One primary split and 2 secondary splits were observed, resulting in 4 distinct MRD groups: undetectable, MRD of 10⁻⁶ to < 10⁻⁴, 10⁻⁴ to < 10⁻³, and 10⁻³ or higher.
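The idea of searching for a cut-point can be illustrated with a single-split search that scores each candidate threshold by the two-group log-rank statistic. This is a simplified stand-in, not the survival-tree criterion implemented in rpart and not the authors' code; the function names, category codes, and follow-up data are all hypothetical.

```python
def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed = expected = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = n_high = d = d_high = 0
        for ti, ei, hi in zip(times, events, in_high):
            if ti >= t:  # still at risk just before time t
                n += 1
                n_high += hi
                if ti == t and ei:
                    d += 1
                    d_high += hi
        observed += d_high
        expected += d * n_high / n
        if n > 1:
            variance += n_high * (n - n_high) * d * (n - d) / (n * n * (n - 1))
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def best_cut(times, events, categories):
    """Score every 'category >= cut' split and return the best one."""
    scores = {}
    for cut in sorted(set(categories))[1:]:
        in_high = [c >= cut for c in categories]
        scores[cut] = logrank_chi2(times, events, in_high)
    return max(scores, key=scores.get), scores


# Hypothetical data: ordered MRD categories 0 (undetectable) to 3 (highest).
times = [5, 5, 5, 5, 5, 5, 5, 1, 2]
events = [0, 0, 0, 0, 0, 0, 0, 1, 1]
categories = [0, 0, 0, 1, 1, 2, 2, 3, 3]

cut, scores = best_cut(times, events, categories)
print(cut, {k: round(v, 2) for k, v in scores.items()})
```

A full recursive partitioning analysis would repeat such a search within each resulting subgroup, which is how secondary splits arise.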

Based upon these recursive partitioning analyses, clinical considerations, the difficulty of accurately quantifying MRD levels below a level of 10⁻⁴, and the finding that every assay had a sensitivity of detection of 10⁻³ or better, the optimal cutoff level of MRD to predict relapse was chosen as 10⁻³. As shown in Figure 2, among the 38 children (13%) with MRD greater than or equal to 10⁻³ (high MRD), 27 relapsed and 5-year FFR was 0.28 (± 0.08), compared with 0.88 (± 0.02) for the 246 (87%) children with MRD less than 10⁻³ (low MRD) (P < .001). Of note, low MRD includes those with undetectable levels. The median FFR was 34 months in the high MRD group but, because of the small number of relapses, was undefined in the low MRD group.

Figure 2

Freedom from relapse by MRD level based on high MRD (≥ 10⁻³) versus low MRD (< 10⁻³).
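The high versus low grouping can be expressed as a small classification rule. The sketch below assumes that undetectable MRD is represented as None and that, as in the text, the maximum day-30 value across markers is used; the function names and the example values are hypothetical.

```python
HIGH_MRD_CUTOFF = 1e-3  # one leukemia cell per 10^3 normal cells


def max_day30_mrd(marker_values):
    """Maximum detectable MRD across markers; None if all are undetectable."""
    detected = [v for v in marker_values if v is not None]
    return max(detected) if detected else None


def mrd_group(mrd_value):
    """Classify as 'high' (>= 10^-3) or 'low' (< 10^-3 or undetectable)."""
    if mrd_value is None or mrd_value < HIGH_MRD_CUTOFF:
        return "low"
    return "high"


# Hypothetical child with three markers, one of them undetectable.
patient_markers = [None, 4e-5, 2e-3]
group = mrd_group(max_day30_mrd(patient_markers))
print(group)
```

Taking the maximum across markers means a single marker at or above the cutoff is enough to place a child in the high MRD group.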

Effect of high versus low MRD within risk subgroups

Children with high MRD at end of induction were more likely to be high risk and to have a WBC count of 50 × 10⁹/L or higher, and were less likely to be age 1 to younger than 10 years or to have been randomized to E coli asparaginase (Table 2). We therefore further explored the effect of MRD level on FFR within risk and treatment groups. Among the subset of standard risk patients, 5-year FFR was 0.88 (± 0.03) among the 160 with low MRD versus 0.43 (± 0.13) among the 14 with high MRD (P < .001). Among the subset of high risk patients, 5-year FFR was 0.88 (± 0.04) among the 86 with low MRD versus 0.19 (± 0.08) among the 24 with high MRD (P < .001). Similarly, we did not identify clinically meaningful interactions between MRD level and asparaginase treatment group (Table 2). Therefore, the effect of MRD on FFR does not appear to differ by risk or treatment group. Although we now routinely use fluorescent in situ hybridization (FISH) to screen for known relevant translocations, this assay was not used routinely until 2000, after the period of diagnosis of the patients examined here, so information regarding specific translocations is missing for too many patients in the present study for reporting to be meaningful. However, Tel/AML1 expression36 was not associated with level of MRD (P = .97). Nevertheless, the effect of high MRD versus low MRD may be greatest among those who expressed Tel/AML1, in whom the 5-year FFR was 0.96 (± 0.03; 2 relapses in 56 children) with low MRD versus 0.13 (± 0.12; 7 relapses in 8 children) with high MRD. Among those who were Tel/AML1 negative, 5-year FFR was 0.85 (± 0.03) with low MRD (24 relapses in 160 children) versus 0.36 (± 0.10) with high MRD (16 relapses in 25 children).

Table 2

Comparison of known risk factors, treatment groups, and clinical outcome of patients with high versus low MRD at day 30

Multivariate modeling of FFR with MRD controlling for other factors

Multivariable analysis of the prognostic value of high versus low MRD detection for the risk of relapse was performed using Cox proportional hazards regression models, with predictors including MRD (high versus low), risk group (high versus standard risk), asparaginase treatment group (using 2 treatment indicators: randomized Erwinia versus randomized E coli asparaginase, and directly assigned versus randomized E coli asparaginase), sex, and each of their 2-way interactions added separately. Likelihood ratio tests were used to assess statistical significance, and interactions with the treatment group variables were tested jointly. None of the 2-way interactions was significant at the 5% level of significance, and these were not included in further models. Sex was also not significant and was dropped from the modeling. In all models containing risk group and asparaginase treatment group, MRD (high versus low) remained the only significant independent prognostic factor (P < .001). When MRD was included in the main effects model, both risk group and treatment group were no longer independent predictors of relapse (Table 3). Controlling for risk and treatment group, children with high MRD following induction had a 10.5-fold (95% CI, 6.1-18.6) greater risk of relapse than those with low MRD.

Table 3

Estimates from the final Cox model of FFR with high versus low MRD at end of induction, risk group, and treatment group included as covariates
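Confidence intervals for hazard ratios from a Cox model are conventionally symmetric on the log scale. Assuming that convention applies here (the text does not state how the interval was computed), the reported interval of 6.1 to 18.6 can be converted back into an approximate log-scale midpoint and standard error. The function name below is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.959964):
    """Midpoint (as a hazard ratio) and log-scale SE implied by a 95% CI."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    center = math.exp((log_lower + log_upper) / 2)
    se = (log_upper - log_lower) / (2 * z)
    return center, se


center, se = log_scale_summary(6.1, 18.6)
print(round(center, 2), round(se, 3))
```

The implied midpoint is close to the 10.5-fold estimate given in the abstract; the small difference is what would be expected from rounding of the published interval bounds.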

Since presenting WBC count and age were collinear with risk group, they were not included in this modeling. Substituting these variables for risk group into the main effects model demonstrates that although MRD remained highly significant, the WBC count at presentation was also an independent predictor of relapse after adjusting for other variables in the model (P = .07). When an interaction term between MRD and WBC count group was added to this model it was not significant (P = .41), suggesting that the effect of MRD does not differ by WBC count group. This was further explored by examining the effect of MRD within WBC count subgroups. Among the 238 children with WBC count lower than 50 × 10⁹/L, 211 had low MRD, of whom 24 relapsed, with 5-year FFR of 0.89 (± 0.02), and 27 had high MRD, of whom 18 relapsed, with 5-year FFR of 0.35 (± 0.09; P < .001). Among the 46 patients with WBC count of 50 × 10⁹/L or higher, 35 had low MRD, of whom 5 relapsed, with 5-year FFR of 0.85 (± 0.06), and 11 had high MRD, of whom 9 relapsed, with 5-year FFR of 0.11 (± 0.10; P < .001). The effect of level of MRD did not differ by WBC count group. Another informative way to look at the subgroups was to compare relapse rates between WBC count groups within high or low MRD groups. FFR was not significantly different between WBC count groups among those with high MRD (P = .13) or among those with low MRD (P = .59), although in both subgroups there was a trend that having WBC count lower than 50 × 10⁹/L was associated with better FFR. Having WBC count of 50 × 10⁹/L or higher did not explain the relapses among the low MRD group, since 5 of 35 patients with WBC count of 50 × 10⁹/L or higher relapsed (5-year FFR of 0.85 ± 0.06) and 24 of 211 with WBC count lower than 50 × 10⁹/L relapsed (5-year FFR of 0.89 ± 0.02) among the low MRD group.
These exploratory subgroup analyses further suggested no interaction between MRD group and WBC count group, that is, the effect of MRD did not differ between WBC count groups. Although a WBC count of 50 × 10⁹/L or higher leads to poorer outcome in general, this by itself did not explain the relapses that occurred in the low MRD group.
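The subgroup counts quoted above can be laid out as a 2 × 2 table and checked against the cohort totals. The sketch below uses only counts stated in the text; the dictionary labels are illustrative. The crude relapse fractions it prints ignore follow-up time and censoring, so they are not the Kaplan-Meier 5-year FFR estimates reported in the paper.

```python
# (children, relapses) per WBC count group and MRD group, as quoted in the text.
subgroups = {
    ("lower WBC", "low MRD"): (211, 24),
    ("lower WBC", "high MRD"): (27, 18),
    ("higher WBC", "low MRD"): (35, 5),
    ("higher WBC", "high MRD"): (11, 9),
}

total_children = sum(n for n, _ in subgroups.values())
total_relapses = sum(r for _, r in subgroups.values())

# Crude fraction relapsed in each cell (no adjustment for censoring).
crude_relapse = {key: r / n for key, (n, r) in subgroups.items()}

print(total_children, total_relapses)
for key, fraction in crude_relapse.items():
    print(key, f"{fraction:.0%}")
```

The cells sum to the 284 analyzed children and the 56 relapses, and within each WBC count group the crude relapse fraction is several times higher with high MRD than with low MRD.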

Discussion

Previous studies have shown that early response to therapy in childhood ALL is an important indicator of treatment outcome and that quantitative assessment of MRD by either flow cytometric or PCR-based analyses can be used to assess the risk of subsequent relapse.12-19 Here, we report the use of RQ-PCR to determine the prognostic significance of the detection and quantification of MRD at the end of induction therapy in B-lineage ALL. We demonstrate that quantitative assessment of MRD in children with ALL is the most important prognostic factor determining subsequent relapse. Children with high MRD at the end of induction had a shorter time to relapse and a 10.5-fold greater risk of relapse than those with low MRD levels after controlling for risk and treatment group. The value of this approach is further illustrated by the finding of a linear relationship of MRD with relapse, as well as by the importance of the actual quantitative MRD value after controlling for MRD detection.

The difficulty in reproducibly quantifying PCR products can largely be overcome by the use of RQ-PCR. Although this approach is being incorporated in ongoing clinical trials, results have been reported to date in only relatively small numbers of patients.25,30-34 The sensitivity of flow cytometric detection of MRD depends on the specificity of the immunophenotype and on the number of cells available for study. Although the estimated levels of MRD may vary, flow cytometry and PCR-based detection of MRD in ALL yield concordant results in the vast majority of cases, leading some groups to suggest that both types of analyses should be performed in combination.34,41

Previous studies have demonstrated that antigen receptor rearrangements are not always stable and that the risk of change increases with time.24,42 We have demonstrated in this protocol that up to 10% of relapses occurred with a different antigen receptor rearrangement than at presentation,25 and of note, the majority of children with no detectable MRD at day 30 who relapsed did so with disease bearing a novel rearrangement compared with that observed at diagnosis. Ideally, more than one target should be followed for MRD assessment in children with ALL, and flow cytometric and PCR-based analyses should be used in combination.34,41 However, consideration must also be given to the cost of multiple types of analyses for multiple markers in the assessment of MRD. In the present report, assessment of the level of MRD at day 30 alone, even when only one MRD target is assessed, was sufficient to identify patients at sufficiently high risk of relapse to merit change in therapy. More importantly, assessment of MRD at day 30 is sufficiently early to allow subsequent change of therapy to attempt to alter the poor prognosis of these patients. In this study, first pass direct sequencing alone was used to identify suitable markers, since it was an aim of the study to identify in what proportion of children this approach would be successful. On this basis, 102 children were excluded from analysis in whom either no informative rearrangement was detected or the marker could not be quantified at sufficient sensitivity. This proportion is unacceptably high if MRD levels are to be used for treatment decisions. In ongoing studies where clinical decisions are being made based upon the MRD levels, additional steps are being taken to identify suitable markers for study in the vast majority of patients.
Irrespective of these limitations, quantification of MRD by RQ-PCR at day 30 by itself identified patients at sufficiently high risk of relapse and early enough in their disease course to change therapy to attempt to alter prognosis. Based upon these results, we have changed our treatment approach and children with B-lineage ALL and high MRD at the end of induction therapy now receive treatment intensification to attempt to decrease their risk of subsequent relapse.

Supplementary PDF file available online.

Authorship

Contribution: J.G.G. designed the study and supervised the analyses; J.Z., A.L., and H.W. performed laboratory analysis; M.A.G., S.E.D., and D.N. performed statistical analysis; L.B.S. and S.E.S. designed the clinical study; V.D. and K.D.M. collected and verified samples and clinical data; J.G.G., M.A.G., and J.Z. wrote the paper; all authors agree with the final paper.

A complete list of the members of the Dana-Farber Cancer Institute ALL Consortium is provided in Document S1.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: John G. Gribben, CRUK Medical Oncology Department, Barts and The London School of Medicine, Charterhouse Square, London EC1M 6BQ; e-mail: john.gribben@cancer.org.uk.

Acknowledgments

This work was supported by NIH grant CA68484 (J.G.G. and S.E.S.).

Footnotes

  • The online version of this manuscript contains a data supplement.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted September 2, 2006.
  • Accepted April 16, 2007.
