Residual disease detected by multidimensional flow cytometry signifies high relapse risk in patients with de novo acute myeloid leukemia: a report from Children's Oncology Group

Michael R. Loken, Todd A. Alonzo, Laura Pardo, Robert B. Gerbing, Susana C. Raimondi, Betsy A. Hirsch, Phoenix A. Ho, Janet Franklin, Todd M. Cooper, Alan S. Gamis and Soheil Meshinchi


Early response to induction chemotherapy is a predictor of outcome in acute myeloid leukemia (AML). We determined the prevalence and significance of postinduction residual disease (RD) by multidimensional flow cytometry (MDF) in children treated on Children's Oncology Group AML protocol AAML03P1. Postinduction marrow specimens at the end of induction (EOI) 1 or 2 or at the end of therapy from 249 patients were prospectively evaluated by MDF for RD, and presence of RD was correlated with disease characteristics and clinical outcome. Of the 188 patients in morphologic complete remission at EOI1, 46 (24%) had MDF-detectable disease. Those with and without RD at the EOI1 had a 3-year relapse risk of 60% and 29%, respectively (P < .001); the corresponding relapse-free survival was 30% and 65% (P < .001). Presence of RD at the EOI2 and end of therapy was similarly predictive of poor outcome. RD was detected in 28% of standard-risk patients in complete remission and was highly associated with poor relapse-free survival (P = .008). In a multivariate analysis, including cytogenetic and molecular risk factors, RD was an independent predictor of relapse (P < .001). MDF identifies patients at risk of relapse and poor outcome and can be incorporated into clinical trials for risk-based therapy allocation. This study was registered as NCT00070174.


Acute myeloid leukemia (AML) is a heterogeneous and molecularly complex group of diseases with variable hematologic phenotypes. Its genomic complexity, which contributes to its variable response to therapy, is considered the underlying reason for suboptimal outcome in patients with AML.1 Although a subset of patients with AML can be assigned to a specific risk class on the basis of their disease's molecular characteristics (ie, cytogenetics, mutations, etc), most patients lack risk-associated molecular markers. Although response to therapy is a powerful predictor of outcome in leukemias, morphologic assessment of response has low sensitivity and poor specificity for accurate determination of disease status.

Multidimensional flow cytometry (MDF) uses aberrant expression of surface antigens on the malignant cells to identify residual cells not detectable by standard morphologic assessment.2,3 The use of MDF to predict relapse was suggested by Wormann et al,4 whereby 67% of patients in morphologic complete remission (CR) had evidence of disease by MDF, which was associated with worse survival. Further validation of clinical significance of RD by MDF assessment of remission marrow was performed in 53 adult patients with AML who achieved CR,5,6 whereby patients with minimal residual disease (MRD) had a nearly 70% chance of relapse compared with 20% of patients without MRD. Evaluation of MDF in patients treated on Children's Cancer Group 2941/2961 studies that used the “different-from-normal” approach reported an MRD prevalence of 16% and a highly elevated risk of relapse and death in patients with MRD.7-9 Additional studies that have validated potential clinical utility of flow cytometry in predicting impending relapse were summarized in a recent review.10-13 One pediatric AML study reported that, although the presence of MRD was predictive of relapse, its significance was lost when cytogenetic risk factors were included in multivariate analysis, questioning its utility in standard risk patients.14,15 The St Jude AML 02 trial used the status of MRD after remission (> 0.1%) to alter patients' treatment, whereby patients with evidence of MRD at the end of induction (EOI) were allocated to receive a stem cell transplant from the most suitable allogeneic donor.16

In this study, we used a standardized panel of antibodies for MDF evaluation of diagnostic and postremission marrow specimens from a large cohort of pediatric patients and found that detection of RD by MDF highly correlates with disease outcome, especially in patients with no other known risk factors.


Patient eligibility and study protocol

Patients younger than 21 years with newly diagnosed de novo AML who were enrolled on Children's Oncology Group (COG) AAML03P1 were eligible for this study. The COG AAML03P1 protocol has been previously described17,18 and included the administration of 2 courses of induction with cytarabine, daunorubicin, and etoposide, with the addition of gemtuzumab ozogamicin in the first course. The remaining 3 courses of chemotherapy included cytarabine/etoposide, mitoxantrone/cytarabine, and the addition of gemtuzumab ozogamicin to the fourth chemotherapy course, cytarabine/asparaginase (Capizzi II). Patients with HLA-suitable sibling donors were nonrandomly assigned to bone marrow transplantation after intensification 1.

Specimen collection

Bone marrow aspirates were collected at diagnosis, at the EOI1 and EOI2, and at the end of therapy (EOT) and submitted via priority overnight delivery in heparin-containing tubes for MDF assessment. Nucleic acids were extracted from a fraction of the diagnostic specimens for mutation profiling. Per protocol specifications, diagnostic and after induction morphologic marrow assessments were performed at local institutions and were not centrally reviewed. Patients who did not achieve morphologic remission were recommended to have a repeat marrow evaluation to confirm disease status before the next course of chemotherapy. RD assessments were limited to the marrows obtained immediately before the start of the next chemotherapy.

Flow cytometric analysis

Specimens were processed as previously described.8 Briefly, 100 μL of bone marrow (or peripheral blood) was reacted with antibody cocktails of pretitered antibodies for 20 minutes at room temperature in the dark. Red blood cells were lysed with the use of 3.5 mL of buffered NH4Cl (0.83%) at 37°C for 5 minutes, followed by centrifugation at 300g. The cells were washed with 3 mL of phosphate-buffered saline containing 2% fetal calf serum and resuspended to 0.5 mL in 1% paraformaldehyde for analysis on a FACSCalibur flow cytometer (Becton Dickinson Biosciences). A total of 200 000 events were collected for each tube. The flow cytometers were standardized and calibrated with the use of RCP-5 and RFP-5 beads (Spherotech), with spectral compensation performed with the use of cells labeled with CD4 (SK3; BD) conjugated to fluorescein isothiocyanate, phycoerythrin, peridinin chlorophyll protein, or allophycocyanin. Eight combinations of reagents were used to assess the subsets of CD34+ progenitor cells observed in normal reactive bone marrow (Table 1).

Table 1

Combinations of reagents to assess CD34+ progenitor cells in normal reactive bone marrow cells

Data analysis was performed with WinList software (Verity Software House). Boolean gating was used to focus on the precise relations between antigens on cells expressing CD34 and those undergoing transition and thus losing CD34 as they developed. Two analysts (M.R.L. and L.P.) independently assessed each specimen and compared results to reach agreement. Analysts were blinded to all aspects of the clinical data except for date of specimen collection.

Evaluating RD with the use of MDF

RD was detected in this study with the use of a different-from-normal approach, which is based on the comparison of gene product (ie, antigen) expression during maturation from hematopoietic stem cell to mature leukocyte. CD34, CD45, and side scatter (SSC) were used to identify the immature cells of each lineage for comparison of intensity relations between cells of various lineages. All analyses were performed by 2 independent analysts, without access to the clinical information, and agreement between the 2 analyses was a requisite for each “RD” call. Abnormal populations were detected by identifying a cluster of events having the same dispersion as a unique population of cells ≥ 0.5 decade separation from the position of corresponding normal cells. In most instances, the abnormal population was observable in multiple antibody combinations. The proportion of abnormal leukemic cells was calculated per total nonerythroid nucleated cells by limiting the denominator to CD45+ cells, thereby excluding erythroid and platelet precursors.19 Once the analysis was complete, the dataset was locked and submitted to the COG statistical office for integration with the clinical findings.
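The quantification rule described above can be illustrated with a short sketch. It is not the authors' analysis pipeline, which relied on WinList with Boolean gating and review by 2 analysts; the function names, the use of a single summary intensity per population to measure separation, and the example event counts are all assumptions introduced here for illustration.

```python
import math


def decades_apart(abnormal_intensity, normal_intensity):
    """Separation of two populations on a log10 fluorescence scale.

    Assumption: each population is summarized by one positive intensity
    value (for example, a median); the text does not state which summary
    statistic was used.
    """
    return abs(math.log10(abnormal_intensity) - math.log10(normal_intensity))


def is_abnormal(abnormal_intensity, normal_intensity, min_decades=0.5):
    """Apply the >= 0.5 decade separation criterion described in the text."""
    return decades_apart(abnormal_intensity, normal_intensity) >= min_decades


def rd_percent(abnormal_events, cd45_positive_events):
    """RD as a percentage of CD45+ (nonerythroid nucleated) events."""
    if cd45_positive_events <= 0:
        raise ValueError("CD45+ denominator must be positive")
    return 100.0 * abnormal_events / cd45_positive_events


# Hypothetical tube: 900 aberrant events among 150 000 CD45+ events.
print(rd_percent(900, 150_000))
# Hypothetical intensities: a 5-fold shift is about 0.7 decade.
print(is_abnormal(1000.0, 200.0))
```

Restricting the denominator to CD45+ events, as in `rd_percent`, keeps erythroid and platelet precursors from diluting the reported percentage.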

Mutation screening

Genomic DNA was extracted from the diagnostic marrow specimens with the use of the Puregene protocol (Gentra Systems Inc). Screening for FLT3-ITD (internal tandem duplication), NPM, and CEBPA mutations was performed as previously described.17,20-22

Statistical methods

Correlation of RD with clinical outcome, including overall survival (OS) and relapse-free survival (RFS), was defined according to international criteria.23 The Kaplan-Meier method was used to estimate OS and RFS. OS was defined as the time from the end of course 1 for patients in CR (defined as bone marrow aspirate containing < 5% blasts by morphology and no evidence of extramedullary disease) until death. RFS was defined as the time from the end of course 1 for patients in CR until relapse or death. Estimates of relapse risk (RR) were obtained by the method of cumulative incidence that accounts for competing events. RR was defined as the time from the end of course 1 for patients in CR until either relapse or death because of progressive disease, whereby deaths from nonprogressive disease were considered to be competing events. The significance of predictor variables was tested with the log-rank statistic for OS and RFS and with the Gray statistic for RR. Children who also received a stem cell transplant while on study were censored at the time of transplantation for all analyses unless otherwise indicated. Children lost to follow-up were censored at their date of last known contact or 6 months before March 31, 2011. The significance of observed differences in proportions was tested with the chi-square test and Fisher exact test when data were sparse. The Mann-Whitney test was used to determine the significance between differences in medians. Cox proportional hazard models were used to estimate hazard ratios (HRs) for univariate and multivariate analyses of OS and RFS.
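The two estimators named above can be sketched in plain Python to show how they differ. This is a minimal illustration under stated assumptions, not the code used by the COG statistical office: the function names and the toy follow-up data are hypothetical, and the sketch omits standard errors, the log-rank test, and the Gray test. The Kaplan-Meier estimate multiplies conditional survival probabilities at each event time, whereas the cumulative incidence estimate adds, at each time, the probability of still being event free multiplied by the hazard of the event of interest, so that competing events reduce the population still at risk of relapse.

```python
def _grouped(times, codes):
    """Yield (time, list_of_codes_at_that_time) in increasing time order."""
    by_time = {}
    for t, c in zip(times, codes):
        by_time.setdefault(t, []).append(c)
    for t in sorted(by_time):
        yield t, by_time[t]


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    events: 1 = event observed, 0 = censored.
    Returns [(time, survival)] at each time with at least one event.
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t, codes in _grouped(times, events):
        d = sum(1 for c in codes if c == 1)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= len(codes)  # events and censored patients leave the risk set
    return curve


def cumulative_incidence(times, causes, cause=1):
    """Cumulative incidence of one cause in the presence of competing events.

    causes: 0 = censored, `cause` = event of interest, any other value =
    competing event. Returns [(time, cumulative_incidence)].
    """
    at_risk = len(times)
    event_free = 1.0  # probability of no event of any cause so far
    cif = 0.0
    curve = []
    for t, codes in _grouped(times, causes):
        d_cause = sum(1 for c in codes if c == cause)
        d_any = sum(1 for c in codes if c != 0)
        if d_cause:
            cif += event_free * d_cause / at_risk
            curve.append((t, cif))
        if d_any:
            event_free *= 1.0 - d_any / at_risk
        at_risk -= len(codes)
    return curve


# Hypothetical follow-up times (months) for 5 patients.
print(kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]))
# Hypothetical causes: 1 = relapse, 2 = death without relapse, 0 = censored.
print(cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1]))
```

Treating competing deaths as censored in a Kaplan-Meier estimate of relapse would overstate relapse risk, which is why a cumulative incidence estimator is the usual choice for RR.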


Patients and study population

The COG AAML03P1 study included 340 children and young adults with de novo AML, 311 of whom enrolled on the accompanying biology study to evaluate the role of postinduction RD in patients with AML. Specimens from EOI1 were submitted from 219 of these 311 patients for RD evaluation. At the EOI1, 188 of the 219 patients (86%) were in morphologic CR (ie, < 5% blast by morphology), and 15 (8%) had partial response (PR; ie, 5%-20% blasts); 15 others either had refractory disease (ie, > 20% blast; n = 12), had persistent central nervous system (CNS) disease (n = 1), or experienced a CNS relapse (n = 2); and 1 patient's disease was not evaluable (Table 2). Specimens from EOI2 were submitted from 190 of the 271 patients who completed the second induction. Of these 190 patients, 180 (95%) were in morphologic CR; 5 (3%) had refractory or progressive disease, and 5 were unevaluable. End-of-therapy specimens were submitted for 90 patients. Specimens were available from only a small subset of patients after intensification 1 (n = 22) or 2 (n = 11).

Table 2

Correlation of residual disease with morphologic remission status at the end of induction 1

Prevalence of RD

At the EOI1, disease was detected in 67 of 219 evaluable patients (31%) regardless of morphologic remission status at levels ranging from 0.02% to 85% (median, 2.0%). Three patients had RD levels < 0.1%, and 5 patients reported to be in morphologic CR had > 5% leukemia detected by MDF. Of the 188 patients who achieved a morphologic CR, 46 (24%) had measurable disease by MDF, ranging from < 0.1% to > 5%. Two patients had RD levels < 0.1% (0.02% and 0.03%), and 5 patients reported to be in morphologic CR had > 5% aberrant cells detected by MDF. Twenty-seven patients did not achieve morphologic CR, with 15 having disease in morphologic PR (ie, 5%-20% blasts) and 12 having refractory disease (> 20% blasts). Of these 27 patients, 6 (40%) with PR and 1 (8%) with refractory disease had no evidence of aberrant cells as assessed by MDF. Thus, MDF identified RD in patients in morphologic CR as well as distinguishing patients with reported morphologic disease (mainly those in PR) who did not have immunophenotypic evidence of disease.

We compared the demographic, laboratory, and clinical characteristics of patients with morphologic CR with or without RD at the EOI1 (Table 3). Patients with and without RD had similar diagnostic white blood cell counts and diagnostic marrow blast results; age and sex were also similar between the groups. RD prevalence in patients with favorable-, intermediate-, and high-risk cytogenetics was 11%, 29%, and 50%, respectively (P = .007). The prevalence of RD at the EOI1 was 25% in patients with FLT3-ITD, 44% in patients with CEBPA mutations, and 0% in patients with NPM1 mutations (Figure 1).

Table 3

Characteristics of patients in CR with and without residual disease at the end of induction 1

Figure 1

Prevalence of residual disease in specific cytogenetic, risk, molecular, and response groups in patients in morphologic complete remission (CR) after one course of chemotherapy. EOI indicates end of induction; ITD, internal tandem duplication; and RD, residual disease.

Clinical outcome

We assessed the clinical implications of RD at EOI1 for patients in morphologic CR. Presence of RD by MDF was correlated with RR, RFS, and OS from EOI1. Presence of RD in patients in CR (n = 188) was associated with a RR of 60% ± 16% at 3 years compared with that of 29% ± 8% in patients without RD (P < .001). Corresponding RFS was 30% ± 15% versus 65% ± 9% in patients with and without RD (P < .001; Figure 2) with an OS of 56% ± 16% versus 80% ± 8% (P = .002). In addition, we evaluated the ability of MDF to define outcome in patients who failed to achieve morphologic CR (> 5% blasts by morphology). At the EOI1, 42 patients had failed to achieve a CR according to the morphologic examination of the marrow. MDF data were available on 27 of these patients, of whom 20 (74%) were RD positive and 7 (26%) were RD negative. All patients who failed to achieve morphologic CR without evidence of RD are long-term survivors compared with a 3-year OS of 35% ± 21% for patients with RD (P = .005; Figure 3).

Figure 2

Relapse risk and relapse-free survival from end of induction 1. Relapse risk (A) and relapse-free survival (B) from end of induction 1 in patients with morphologic response to induction chemotherapy. RD indicates residual disease.

Figure 3

Overall survival of patients with morphologic induction failure diagnosed on the basis of disease detection by multidimensional flow cytometry. EOI indicates end of induction; and RD, residual disease.

We further evaluated whether the presence of RD beyond induction 1 carries clinical significance. Of 180 patients in CR at EOI2 with MDF data, 34 patients (19%) had evidence of RD (median, 0.6% RD). Cumulative RR at 3 years from EOI2 in patients with RD was 67% ± 18% compared with 30% ± 8% in patients without RD (P < .001) with a corresponding RFS of 29% ± 17% and 65% ± 9%, respectively (P < .001; Figure 4A-B). At the EOT, 6 of 90 patients (7%) were RD positive. Patients with RD at the end of therapy had a RR of 83% ± 30% compared with that of 36% ± 11% for the RD-negative patients (P < .001) with a corresponding RFS of 17% ± 30% and 62% ± 11%, respectively (P < .001; Figure 4C-D).

Figure 4

Relapse risk and relapse-free survival by residual disease (RD) status at the end of induction 2. Relapse risk and relapse-free survival by residual disease status at the end of induction 2 (A-B) and at the end of therapy (C-D).

Clearance of RD

We inquired whether clearance of initial RD correlates with improved outcome. Of the 84 RD-negative patients at the EOT, 23 patients had a previously documented RD at EOI1 or EOI2. We evaluated the clinical outcome from EOT for patients who were RD negative at EOT with or without a previously documented RD. For these RD-negative patients, RR at 3 years from EOT was 26% ± 11% for patients with no history of prior RD compared with that of 65% ± 20% for patients with a previously documented RD (P < .001; Figure 5A). Corresponding RFS from EOT was 75% ± 11% and 26% ± 18% for patients without and with prior RD, respectively (P < .001; Figure 5B).

Figure 5

Relapse risk and relapse-free survival from end of therapy. Relapse risk (A) and relapse-free survival (B) from end of therapy for patients with no documented residual disease (RD) during therapy (no RD), with RD at the end of therapy (RD positive), and without RD at the end of therapy with previously documented RD.

RD threshold

We inquired whether various levels of RD correlate with different clinical outcomes. Disease burden by MDF at the EOI1 for patients in morphologic CR varied widely, with a median RD of 1.1%: 2 of 46 patients (4.3%) had RD < 0.1%, 21 (46%) had RD levels of ≥ 0.1% to 1%, 18 (39%) had RD levels of > 1% to ≤ 5%, and the remaining 5 (10.9%) had RD levels > 5%. RR was determined for RD thresholds of either > 0% to < 1% or ≥ 1% and compared with that of patients without RD. Patients with RD > 0% to < 1% had a RR of 65% ± 23% versus a RR of 55% ± 22% in patients with RD ≥ 1% (P = .637; Figure 6); both were significantly higher than the RR in patients without RD. Corresponding RFS from EOI1 for patients with different RD thresholds was 24% ± 21%, 36% ± 22%, and 65% ± 9% for patients with < 1%, ≥ 1%, and no RD, respectively (P = .0006).
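The RD bands used in this section can be written as a small lookup. This is only a sketch: the function name, the band labels, and the sample values are assumptions, and treating an RD value of exactly 0 as "no RD" is a convention chosen here rather than one stated in the text.

```python
from collections import Counter


def rd_band(rd_percent):
    """Assign an RD level (percent of CD45+ cells) to the bands in the text."""
    if rd_percent < 0:
        raise ValueError("RD percentage cannot be negative")
    if rd_percent == 0:
        return "no RD"          # assumption: 0 means no aberrant population
    if rd_percent < 0.1:
        return "< 0.1%"
    if rd_percent <= 1:
        return "0.1% to 1%"
    if rd_percent <= 5:
        return "> 1% to 5%"
    return "> 5%"


# Hypothetical RD values for a handful of patients (not study data,
# apart from 0.02 and 0.03, the two sub-0.1% levels the article reports).
sample = [0.0, 0.02, 0.03, 0.6, 1.1, 4.0, 6.0, 20.0]
print(Counter(rd_band(v) for v in sample))
```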

Figure 6

Relapse risk on the basis of residual disease (RD) threshold of 1%.

Prognostic factors

We used Cox regression analysis to evaluate RD status, cytogenetic/molecular favorable risk (ie, CBF AML and NPM1 and CEBPA mutations) or unfavorable risk [ie, −7, −5/del(5q), high AR FLT3-ITD], and diagnostic white blood cell count as predictors of OS and RFS in a univariate model (Table 4) for patients in CR. The presence of RD was a significant prognostic factor for lower OS (HR = 2.46; P = .003) and RFS (HR = 2.46; P < .001). In a separate univariate model, compared with patients with standard-risk AML, patients with molecular high-risk disease had a significantly worse RFS rate (HR = 2.72; P = .008), and patients with favorable-risk AML had a better RFS rate, although the difference was not significant (HR = 0.83; P = .484). In a multivariate model that included the above-mentioned prognostic factors, the presence of RD remained an independent prognostic factor for lower RFS (HR = 2.38; P < .001). In this multivariate model, patients with RD had a HR for death of 1.87 (P = .06).

Table 4

Prognostic significance of presence of RD in patients in morphologic CR by univariate Cox analyses

Implications of RD in specific risk groups in AML

We assessed the clinical implications of RD in specific clinical risk groups. Of 188 patients in CR at the EOI1 with RD results, 177 patients had complete cytogenetic and molecular data available, 84 of whom had favorable (CBF AML, NPM and CEBPA mutations; n = 69) or unfavorable (high-risk FLT3/ITD, −7, −5 or del5q; n = 15) features. The remaining 93 patients without favorable or unfavorable features were regarded as standard risk, of whom 26 (28%) had evidence of RD by MDF (median, 1.5%). Of these 26 standard-risk patients in morphologic CR, 24 had RD < 5%, 1 patient had 6% RD by MDF, and 1 patient had 20% aberrant cells detected by MDF. Standard-risk patients in CR with RD had a RFS at 3 years from EOI1 of 29% ± 20% versus 65% ± 13% for the RD-negative patients (P = .008; Figure 7A). Of the 69 patients with favorable-risk features, 10 had RD (14%) at the EOI1. In this favorable-risk cohort, RFS at 3 years from CR for patients with and without RD was 59% ± 37% versus 67% ± 13% (P = .470; Figure 7B). Fifteen patients were deemed high risk on the basis of their cytogenetic or molecular features, of whom 5 had RD (33%). RFS was 0% for patients with RD versus 45% ± 38% for the RD-negative patients (P = .047; Figure 7C).
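The grouping used in this section can be expressed as a simple rule, sketched below. The feature labels and the function name are hypothetical, and the precedence given to unfavorable features when a patient carries both kinds of marker is an assumption made here for illustration; the text does not state how such cases were handled. The final lines check the group counts reported above.

```python
# Feature labels are hypothetical strings chosen for this sketch.
FAVORABLE = {"CBF", "NPM", "CEBPA"}
UNFAVORABLE = {"-7", "-5/del(5q)", "FLT3-ITD high AR"}


def risk_group(features):
    """Classify a patient from a set of cytogenetic/molecular feature labels.

    Assumption: an unfavorable feature takes precedence over a favorable one.
    """
    features = set(features)
    if features & UNFAVORABLE:
        return "high"
    if features & FAVORABLE:
        return "favorable"
    return "standard"


print(risk_group({"CEBPA"}))
print(risk_group(set()))

# Counts reported in the text for patients in CR with complete data.
favorable, high, standard = 69, 15, 93
print(favorable + high + standard)          # total with complete data
print(round(100 * 26 / standard))           # percent of standard risk with RD
```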

Figure 7

Relapse-free survival on the basis of the presence or absence of residual disease (RD). Relapse-free survival on the basis of the presence or absence of RD in patients with standard-risk (A), favorable-risk (B) or high-risk (C) acute myeloid leukemia.


Risk-adapted therapy allows more appropriate therapy allocation in leukemias, whereby patients at high risk of relapse are allocated to alternate therapy (more intensive or targeted) in attempts to improve outcome, and patients at lower risk of relapse are spared the more aggressive managements. Despite efforts to identify specific risk groups in AML, most patients are without specific molecular risk markers. In this prospective study we evaluated the utility of MDF to identify patients with RD, who may be at elevated risk of relapse and poor outcome. As expected, RD was most prevalent in patients with high-risk disease and was less common in favorable-risk patients after one course of chemotherapy.

Although the definition of remission remains based on morphologic assessment of response to therapy, we found that in this cohort of uniformly treated patients, RD was detected in approximately one-quarter (24%) of patients with no morphologic evidence of disease after initial induction chemotherapy. The presence of postinduction RD in patients in morphologic CR was highly correlated with relapse and was an independent predictor of outcome. We also provide data that approximately one-quarter (26%) of patients classified as having disease on the basis of pathologic review had no evidence of disease by MDF, a finding that highly correlated with survival whereby patients with > 5% marrow blasts, but with normal immunophenotype, were long-term survivors. This finding was most notable in patients with 5%-20% blasts, in whom distinguishing elevated blasts caused by postinduction marrow recovery from those caused by disease may be difficult. Similar findings have been observed in other COG studies.24 These data indicate that MDF provides a more sensitive and more specific tool for response assessment compared with morphology, both in patients with induction failure and in patients who have achieved a morphologic remission.

In close evaluation of the effect of RD in specific risk groups, we found that RD was most predictive of eventual relapse in patients without known risk features, making MDF a feasible method to be combined with other cytogenetic and molecular risk markers in risk assessment and risk-based therapy allocation. We further observe lack of statistical significance of RD in the favorable-risk cohort. This may in part be because of the small number of favorable-risk patients with RD, thus limiting power to detect differences. In addition, because favorable-risk patients have a longer time to relapse, lack of observed difference may be because of the follow-up period. In addition, it is feasible that at least a subset of patients with favorable risk features have a slower regression of the leukemic clone, similar to what is observed in acute promyelocytic leukemia25,26; thus, early RD assessment may not be informative in a subset of patients. This may be the case in patients with CEBPA mutations, who have a higher rate of RD (44%) without a subsequent relapse.

In most patients without genomic predictors of outcome, MDF provides a discriminatory measure of risk of long-term outcome. This study confirmed that low levels of leukemia can be detected on the basis of surface antigen expression and that detection of these cells early in treatment predicts eventual outcome. A key question about the significance of RD is whether patients with evidence of RD early in disease who clear their disease with subsequent chemotherapy have an improved outcome. In this study we found that patients with no RD at the end of therapy, but with a previously documented RD, remain at high risk of relapse and poor outcome, suggesting that intervention beyond clearance of RD is required for improved outcome. By combining the 3 technologies of molecular biology, karyotyping, and MDF, all patients can be stratified into more appropriate risk groups for risk-based therapy allocation, whereby patients at high risk of relapse would receive additional or modified therapy, and patients in the low-risk group would receive therapy whose intensity, and resultant toxicity, might be reduced. This study found that MDF-based RD assessment can be a powerful tool in response and risk assessment in AML and a means of identifying risk groups in patients with otherwise no prognostic markers. MDF can be combined with cytogenetic/molecular risk factors to create a robust risk allocation system for all patients with AML. These data helped shape the current risk-based therapy allocation in the COG AAML1031 clinical trial, in which patients are allocated to 2 risk groups on the basis of the cytogenetic, molecular, and MDF profiles.

Despite its ability to define relative risk of relapse on the basis of response in patients who undergo induction therapy, MDF's ability to define absolute risk remains limited because nearly a quarter of patients without any measurable RD have an eventual relapse and an additional cohort of patients who have a documented RD remain long-term relapse-free survivors.7,14,16 Such outcome heterogeneity is probably because of a combination of technical (assay sensitivity) and biologic (clonal heterogeneity) reasons. Refinement of techniques, inclusion of additional sensitive disease detection modalities, and merging of flow and molecular genetic assays may provide more accurate risk assessment in managing patients with AML.


Contribution: M.R.L. and S.M. designed and performed research, analyzed data, and wrote the manuscript; T.A.A. and R.B.G. performed statistical analyses and edited the manuscript; S.C.R., B.A.H., P.A.H., L.P., J.F., and T.M.C. performed research and edited the manuscript; and A.S.G. designed research, analyzed data, and edited the manuscript.

Conflict-of-interest disclosure: M.R.L. is president and laboratory director of Hematologics Inc. The remaining authors declare no competing financial interests.

Correspondence: Soheil Meshinchi, Clinical Research Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, D5-380, Seattle, WA 98109-1024; e-mail: smeshinc{at}


The authors thank Dr Eric Sievers for his contributions to early-stage evaluations of minimal residual disease in childhood AML and thank the patients and families for participating in this study.

This work was supported by grants U10-CA98543 (Chair's Grant), U10-CA98413 (Statistics and Data Center Grant), U24-CA114766 (Children's Oncology Group), NCI R01-CA114563 (S.M.), and NCI R21 CA 104964-02 (S.M.).

A complete listing of grant support for research conducted by Children's Cancer Group and POG before initiation of the COG grant in 2003 is available at


  • * M.R.L. and T.A.A. contributed equally to this study.

  • There is an Inside Blood commentary on this article in this issue.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted January 31, 2012.
  • Accepted May 19, 2012.

