
Minimal/measurable residual disease in AML: a consensus document from the European LeukemiaNet MRD Working Party

Gerrit J. Schuurhuis, Michael Heuser, Sylvie Freeman, Marie-Christine Béné, Francesco Buccisano, Jacqueline Cloos, David Grimwade, Torsten Haferlach, Robert K. Hills, Christopher S. Hourigan, Jeffrey L. Jorgensen, Wolfgang Kern, Francis Lacombe, Luca Maurillo, Claude Preudhomme, Bert A. van der Reijden, Christian Thiede, Adriano Venditti, Paresh Vyas, Brent L. Wood, Roland B. Walter, Konstanze Döhner, Gail J. Roboz and Gert J. Ossenkoppele

Publisher's Note: There is a Blood Commentary on this article in this issue.

Abstract

Measurable residual disease (MRD; previously termed minimal residual disease) is an independent, postdiagnosis, prognostic indicator in acute myeloid leukemia (AML) that is important for risk stratification and treatment planning, in conjunction with other well-established clinical, cytogenetic, and molecular data assessed at diagnosis. MRD can be evaluated using a variety of multiparameter flow cytometry and molecular protocols, but, to date, these approaches have not been qualitatively or quantitatively standardized, making their use in clinical practice challenging. The objective of this work was to identify key clinical and scientific issues in the measurement and application of MRD in AML, to achieve consensus on these issues, and to provide guidelines for the current and future use of MRD in clinical practice. The work was accomplished over 2 years, during 4 meetings by a specially designated MRD Working Party of the European LeukemiaNet. The group included 24 faculty with expertise in AML hematopathology, molecular diagnostics, clinical trials, and clinical medicine, from 19 institutions in Europe and the United States.

Introduction

A myriad of factors present at diagnosis in acute myeloid leukemia (AML), including cytogenetics, molecular genetics, and age, have been associated with prognosis but still fall short in accurately predicting outcomes.1-3 Increasing evidence now indicates that the ability to identify residual disease far below the morphology-based 5% blast threshold is an important tool for refining our approach to risk classification. Minimal or, more appropriately, measurable residual disease (MRD) denotes the presence of leukemia cells down to levels of 1:10^4 to 1:10^6 white blood cells (WBCs), compared with 1:20 in morphology-based assessments. There are several reasons to apply MRD detection in AML: (1) to provide an objective methodology to establish a deeper remission status, (2) to refine outcome prediction and inform postremission treatment, (3) to identify impending relapse and enable early intervention, (4) to allow more robust posttransplant surveillance, and (5) to use as a surrogate end point to accelerate drug testing and approval.

Numerous studies have investigated the value of MRD in AML and have consistently shown that MRD negativity, as defined by specified cutoff values, is highly prognostic for outcome (see eg, Table 1 for flow cytometric MRD). Reflecting the molecular diversity of AML, different MRD platforms are available for detecting MRD. Two methods are currently widely applied (ie, multiparameter flow cytometry [MFC] and real-time quantitative polymerase chain reaction [qPCR]), and newer technologies, including digital PCR and next-generation sequencing (NGS), are emerging. Each methodology differs in the proportion of patients to whom it can be applied and in its sensitivity to detect MRD. It is expected that integration of baseline factors and assessment of MRD will improve risk assessment.4 MRD assessments are performed in an increasing number of laboratories worldwide and used in various clinical settings. However, no guidelines or recommendations are available on how and when to apply MRD assessments and how to translate the results to clinical practice. An international group of experts addressed these issues on behalf of the European LeukemiaNet (ELN) and reports here on its conclusions.

Table 1.

Key studies on the prognostic value of MRD by MFC

Methods

An international panel of 24 experts, including 19 from European countries and 5 from the United States, met 4 times during 2016 and 2017, with numerous e-mail exchanges during this time. The panel included members with recognized technical, clinical, and translational knowledge of MRD in AML, including specific expertise on MFC MRD, molecular MRD, NGS, and clinical issues. For the clinical section, only MRD publications including at least 50 patients were reviewed (Table 1). Unpublished technical details from individual laboratory directors were also discussed and used. In several areas, the data were inadequate to draw firm conclusions.

The final ELN MRD recommendations are subdivided into 3 parts: MFC, molecular, and clinical. The paper presents a summary of consensus and nonconsensus issues, with extended views presented in the supplemental Data (available on the Blood Web site) under headings corresponding to those in the main document.

Flow cytometric (MFC) MRD

Approaches for MFC MRD assessment (LAIP vs different from normal)

For the detection of MRD, a comprehensive panel characterized by early marker(s) like CD34 and CD117, myeloid-lineage associated markers, and differentiation antigens like CD2, CD7, CD19, or CD56, must track aberrant AML blast cells.

Two separate approaches have been used for assessing MFC MRD: (1) the leukemia-associated immunophenotype (LAIP) approach, which defines LAIPs at diagnosis and tracks these in subsequent samples; and (2) the different-from-normal (DfN) approach, which is based on the identification of aberrant differentiation/maturation profiles at follow-up. The DfN approach can be applied if information from diagnosis is not available, and also to detect new aberrancies, together with disappearance of diagnosis aberrancies, referred to in earlier literature as “immunophenotype shifts.”5-7 These may emerge from leukemia evolution or clonal selection.8-10 In essence, LAIPs are DfN abnormalities in the vast majority of cases, and the difference between these 2 approaches is likely to disappear if an adapted, sufficiently large panel of antibodies (preferably ≥8 colors) is used.

We recommend that the advantages of both approaches be combined to best define MFC MRD burden, allowing detection of new aberrancies emerging at follow-up, and monitoring patients when there is an absence of diagnostic information.

The ELN MRD Working Party suggests the term “LAIP-based DfN approach” for this combined strategy. To be more specific, aberrancies may be referred to as LAIPs or DfN-LAIP, whichever is the more appropriate term. LAIPs and DfN-LAIPs can be further categorized as (1) diagnostic, (2) follow-up (based on diagnosis information), (3) follow-up (no diagnostic information), and (4) changed (ie, new aberrancy compared with diagnosis LAIPs or previous follow-up LAIPs).

Suggestion for further improvements.

We recommend using the integrated LAIP-based DfN approach to separately validate the largely unknown prognostic impact of emerging aberrancies.

Markers for MRD assessment

Marker panel content.

Many different panels of markers have been used to assess MRD (for the panels currently used by the ELN Working Group members, see supplemental Table 1).

Based on the collective experiences of the working group, a 2-step consensus recommendation is proposed, which includes gating on CD45, sideward scatter, forward scatter, a primitive marker (CD34, CD117), and abnormal expression of marker(s) or abnormal combination(s) of marker expression. In addition, a monocytic combination, including CD64, CD11b, and CD4 (see legends of supplemental Table 1), is proposed to assess MRD in monocytic or myelomonocytic AML.11,12

Other interesting markers are in supplemental Table 1 and include CD133, CD38, and CD123, which allow us to define more primitive progenitor and/or leukemia stem cell populations.13,14

Number and nature of fluorochromes.

We recommend using a minimum of 8 colors. Although not formally proved, this may allow more specific assessment of aberrancies than is feasible with fewer colors.

Rather than recommending suitable clones and fluorochromes, the panelists suggest taking advantage of extensive validation studies as done, for example, by the Euroflow consortium15 and the French Groupe d'Étude Immunologique des Leucémies (consensus document in revision). Specific attention should be given to the staining index of fluorochromes (see supplemental Data).

Using the same tubes (with the same antibody-fluorochrome combinations) at diagnosis and at follow-up is considered a prerequisite for the LAIP-based DfN approach of tracking of both LAIPs established at diagnosis and emerging aberrancies.

Suggestions for further improvements.

  1. To minimize the number of different panels used, we strongly recommend the design and validation of a single common panel assay, preferably as an ELN initiative, for all MRD studies.16

  2. We recommend exploration of the value of a separate (single tube) leukemia stem cell (LSC) panel (see supplemental Figure 1) in which the total LSC load can be assessed at any time from diagnosis to relapse.17 Validation of such a panel has been initiated among different ELN and non-ELN members.

Technical requirements

Bone marrow (BM) sampling.

Sampling for MFC MRD is usually done in anticoagulants such as EDTA or heparin, with no significant difference between these. A recurrent concern is that MFC MRD in peripheral blood (PB) is characterized by a lower frequency than in BM (up to ∼1 log).18,19 The use of PB at present cannot be recommended.20 To maximize assay sensitivity, it is mandatory to avoid hemodilution of BM samples. We therefore strongly recommend submitting the first BM pull for MRD analysis, at least for follow-up BM samples intended for MFC MRD, preferably using the same volume across time points and patients. We also recommend estimating possible contamination with PB; the presence of >90% mature neutrophils in a BM sample indicates significant hemodilution.20-24 Sampling time points and volumes for MFC (and molecular) assays are outlined in supplemental Table 2.

BM transport.

In the multicenter setting, we recommend transport at controlled room temperature. Up to 3 days storage is allowed, without the need for a viability marker, provided BM is stored undiluted.

Flow cytometers.

Basic principles of flow cytometric settings have been described for many purposes including MRD.25-27 Harmonization of instrument settings is of high value for interlaboratory comparison of results. One robust, simple way to assess this harmonization has been described by the Harmonemia study.27 The Euroflow consortium also provided standard operating procedures for their panels.25

Preparation of samples.

There are 2 major approaches for preparing BM samples for MFC: (1) stain/lyse/wash (or no wash) has the advantage of reducing cell losses; and (2) bulk lysis followed by washing and staining (and washing) has the advantage of having all tubes prepared in a similar way for the different staining steps. Both approaches are in use for AML MRD assays. Incubation typically should be performed in the dark to preserve the quality of fluorochromes. The greater skill, with no consensus at the moment, resides in the analysis step, typically using a series of linked gatings aimed at best identifying the MRD population. Comparison with the diagnosis pattern is the safest, seeking residual cells of the same population as that seen at diagnosis. However, in some instances, a clearly focused population, differing from the initial one, can be seen. It may represent a shift of the initial clone/population or the emergence of a chemotherapy-resistant subpopulation. Whether this will lead to relapse is impossible to determine, but closer surveillance of the patient is recommended in such instances.

How to calculate MRD burden and minimal requirements

Several strategies have been used to quantify the MRD burden. To harmonize reporting we recommend the following:

  1. Use LAIPs that clearly occupy an empty space, that is, aberrancies not found at the same MFC location in control BM, at diagnosis and follow-up. In cases where only part of a population is occupying an empty space, inclusion of additional cells outside the empty space is allowed provided they define 1 single clustered population together with the cells from the empty space.

  2. Use the best (most specific and/or highest frequency) LAIP for assessing MRD frequency; in case of multiple, nonoverlapping LAIPs, frequencies of individual LAIPs should be added up.

  3. Relate LAIP events to the leukocyte population of CD45+ cells (excluding CD45-negative erythroblasts).

  4. Use the diagnosis LAIPs if diagnosis sample and diagnosis LAIPs are available to optimally inform MRD gating for these LAIPs.

  5. Use the DfN approach to identify any new LAIPs. Such new LAIPs can be used for quantitation.

It is also recommended to acquire between 500 000 and 1 million events (excluding all CD45-negative cells and debris) unless the cluster of MRD becomes obvious during acquisition and is recognized by a trained operator.
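The arithmetic behind these recommendations can be sketched as follows. This is a minimal illustration, not a validated analysis tool: the gate names, event counts, and the `LaipGate`, `mfc_mrd_percent`, and `enough_events` helpers are hypothetical, and the summation assumes the LAIPs passed in are nonoverlapping, as required above.

```python
from dataclasses import dataclass


@dataclass
class LaipGate:
    """Event count for one LAIP gate (gate names here are hypothetical)."""
    name: str
    events: int


def mfc_mrd_percent(laips, cd45_pos_events):
    """Sum nonoverlapping LAIP events and express them as % of CD45+ cells."""
    if cd45_pos_events <= 0:
        raise ValueError("CD45+ denominator must be positive")
    total_laip = sum(gate.events for gate in laips)
    return 100.0 * total_laip / cd45_pos_events


def enough_events(cd45_pos_events, minimum=500_000):
    """True when acquisition reaches the lower bound of 500 000 events."""
    return cd45_pos_events >= minimum


# Illustrative numbers only: two nonoverlapping LAIPs in one follow-up sample.
gates = [LaipGate("CD34+CD7+", 420), LaipGate("CD117+CD56+", 180)]
mrd = mfc_mrd_percent(gates, cd45_pos_events=750_000)
print(f"MRD = {mrd:.3f}% of CD45+ cells; enough events: {enough_events(750_000)}")
```

With these made-up counts, 600 LAIP events among 750 000 CD45+ events correspond to an MRD burden of 0.08%.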

Suggestion for further improvements.

In order to minimize subjectivity in data interpretation/analysis, it is recommended to evaluate the possibilities for improved discrimination of LAIPs achieved by multiparameter displays, such as principal component analysis in commercially available programs (eg, in the APS system of the Kaluza or Infinicyte programs). Several initiatives are ongoing to develop and/or apply more sophisticated analysis programs.

Thresholds and time points for MRD assessment during treatment

The present concept is to use MRD for risk analysis at an early time point prior to consolidation therapy. With the large number of aberrancies that can be defined (up to 100)28 and their inherent differences in specificity, cutoff levels that capture MRD positivity applicable to all LAIPs have to be relatively high (ie, 0.035% to 0.2%) (Table 1; and Table 2 in Ossenkoppele and Schuurhuis29). A cutoff of 0.1% was included and found relevant in most published studies to date, and, thus, we recommend using 0.1% as the threshold to distinguish MRD-positive from MRD-“negative” patients. However, it should be noted that MRD quantified below 0.1% may still be consistent with residual leukemia, and several studies have shown prognostic significance of MRD levels below 0.1%.12,29-32 Thus, cutoff levels below 0.1% (eg, <0.01%) may define patients with particularly good outcome.

Table 2.

Prognostic thresholds for molecular MRD markers in AML patients who are in complete morphological remission

Suggestion for further improvements.

We recommend performing retrospective analyses comparing patients with MRD burden <0.1% but >0% vs those with MRD burden ≥0.1%.

Thresholds and time points for MRD assessment during follow-up/definition of relapse

In general, the definition of MRD positivity after consolidation therapy is similar to the postinduction definition.33 Not much is known about the optimal time intervals for clinically relevant sequential measurements of MRD.34 More information on such time intervals is reported in molecular MRD studies (see “Tissue sampling and time points for MRD assessment during treatment” and “Tissue sampling and time points for MRD assessment during follow-up and definition of complete molecular remission, molecular persistence at low copy number, molecular progression, and molecular relapse”).

Suggestion for further improvements.

With the emergence of potential novel remission treatment options in AML, there is urgent clinical need to establish the optimal intervals needed to define progression/impending relapse. Unpublished data from several institutions exist on sequential MRD measurements and may be informative.

Design of MRD studies: multicenter vs single center approaches

To facilitate and optimize data from MRD studies, we recommend that for multicenter studies samples may be processed by different centers applying the same MRD panels, according to the recommendations offered in the present article. Where experience in MRD analysis is insufficient, the final interpretation should be performed at a central institute or in a group workshop. Alternatively, samples may be sent under carefully controlled conditions (see “Technical requirements”) to a central institute for workup and analysis. The advantages of such a centralized approach need to be weighed against the disadvantages, for example, delays in processing and/or in establishing a final report for clinical decision making. For single center studies in institutions with relevant experience, we recommend following the procedures described in this article. Single center studies without relevant experience are strongly discouraged. The present local policies of the ELN Working Group members are outlined in supplemental Table 3.

Suggestion for further improvements.

With the increasing number of centers embarking on MRD studies, it is strongly recommended to establish working relationships with experienced centers. Meanwhile, we hope that community practices and commercial laboratories will seek opportunities to design common panels and procedures, and we will support such efforts.

How to report MRD

In general, the minimum number of cells needed for accurate reporting of MRD is 500 000 to 1 million, excluding all CD45-negative cells and debris, although lower cell numbers may still suffice if the level of MRD is relatively high, and notably to merely assess a positive/negative status based on the 0.1% (10^-3) threshold. The high numbers enable us to assess possible MRD below the level of 0.1% (see point 4).

Reports on MRD status should be constructed to allow clinicians to draw clear conclusions about how to interpret the report. Elements in an MRD report should contain the following parameters (see also Figure 1A-B):

  1. (a) Absolute numbers of LAIP cells and WBCs, and LAIP cells as percentage of WBCs; (b) for diagnostic LAIPs, the percentage coverage of blast cells at diagnosis; and (c) clinicians and laboratory staff should collaborate to decide if the final report will contain a statement “MRD-positive” or “MRD-negative” (ie, MRD ≥0.1% or <0.1%). In cases with complete absence of aberrancies, the term “no MFC MRD identified” can be added to the report of “MRD-negative.”

  2. Detection sensitivity threshold for the aberrancy used with details: all aberrancies have the 0.1% threshold level, but additional information about the particular nature (sensitivity/specificity) of an aberrancy may be important, for example nature of myeloid, primitive, aberrant, and exclusion marker, especially in cases of newly defined LAIPs not present at diagnosis.

  3. Comments on quality of the sample, for example viability, insufficient regeneration, and PB contamination (Figure 1B). For suboptimal samples with detectable MRD, numbers of LAIP+ cells need to be communicated.

  4. It is up to the clinician (or clinical study group) how to deal with information for MRD <0.1%: the report could contain “MRD detectable but <0.1%, may be consistent with residual leukemia” but also the statement “this level has not been clinically validated” when applicable for the laboratory involved. Alternatively, MRD <0.1% may be reported as “MFC MRD detectable and quantifiable, but with uncertain significance.” Leaving out such information may have medico-legal consequences.
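As an illustration of how the reporting elements above might be turned into a status line, the sketch below applies the 0.1% threshold and distinguishes a complete absence of aberrant events from MRD that is detectable below the threshold. The function name, its parameters, and the exact choice of wording for each case are assumptions made for this example; the actual wording remains a decision for clinicians and laboratory staff.

```python
def mfc_mrd_report_status(mrd_percent, aberrant_events, threshold=0.1):
    """Return a report status string for one MFC MRD measurement.

    mrd_percent     -- LAIP cells as % of WBCs (CD45+ cells)
    aberrant_events -- absolute number of LAIP events detected
    threshold       -- positivity cutoff in %, 0.1 as recommended in the text
    """
    if mrd_percent < 0 or aberrant_events < 0:
        raise ValueError("inputs must be non-negative")
    if mrd_percent >= threshold:
        return "MRD-positive"
    if aberrant_events == 0:
        return "MRD-negative; no MFC MRD identified"
    # Below-threshold but detectable MRD is reported rather than left out.
    return ("MRD-negative; MRD detectable but <0.1%, "
            "may be consistent with residual leukemia")


for pct, events in [(0.25, 1900), (0.04, 300), (0.0, 0)]:
    print(f"{pct:.2f}% ({events} events): {mfc_mrd_report_status(pct, events)}")
```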

An example of a report form is shown in Figure 1B.

Figure 1.

MRD scenarios and reporting. (A) MFC MRD scenarios. (B) Example of MFC MRD report template.

Suggestion for further improvements.

As outlined earlier, very low levels of MRD (<0.01%) differentiate patients with a particularly good prognosis in some studies. Meta-analysis of prognostic models from other study groups, as well as independent validation of these very low threshold levels, may be of clinical importance.

Future directions

Retrospective analyses of databases to establish the value of the DfN vs LAIP approach in terms of prognostic impact, further exploration of the value of LSC detection in prognosis, and the urgent need for testing automated data analysis programs are of great importance in future studies of MRD in AML.

Optimizing the use of PB for MRD analysis, if feasible, would reduce the need for painful, time-consuming, and expensive BM testing.18,19 For the moment, PB MRD may offer a “first indication,” but BM MRD should always be assessed to define the MRD status of the patient (“positive” or “negative”).

As a final area of investigation, in contrast to molecular MRD, nothing is known about the possible relationship between preleukemic populations and immunophenotypic aberrancies. Investigation of this potential relationship may become important in the future.

Molecular MRD

Approaches for molecular MRD assessment

There are 2 general approaches to molecular MRD assessment: real-time PCR-based approaches and sequencing approaches wherein sequences from individual DNA/complementary DNA (cDNA) molecules are generated.

The PCR approach includes classical real-time qPCR using fluorescent probes, digital PCR, and molecular chimerism analysis.35 This approach is usually of high sensitivity and therefore currently considered the gold standard. However, its applicability is limited to the ∼40% of AML patients that harbor 1 or more suitable abnormalities.

NGS for MRD assessment can, theoretically, be applied to all leukemia-specific genetic aberrations. With improved experimental and bioinformatics approaches, we expect this approach to become applicable for another 40% to 50% of AML patients.

In general, we suggest that an MRD platform should be able to detect leukemic cells to a level of 0.1% (1 in 1000 mutated cells). We recommend the use of real-time qPCR platforms for MRD assessment because of their established high sensitivity. In the future, it is likely that NGS and digital PCR platforms will be used after careful validation. Genescan-based fragment analysis (eg, for FLT3 aberrations) has a low priority as an MRD platform because of limited sensitivity.

Markers for molecular MRD assessment

The persistent presence of NPM1 mutations and the fusion genes RUNX1-RUNX1T1, CBFB-MYH11, and PML-RARA following therapy is a strong predictor of relapse. Thus, patients with these abnormalities should have molecular assessment of residual disease using qPCR (sensitivity 10^-4 to 10^-6) at informative clinical time points (see “Tissue sampling and time points for MRD assessment during treatment” and “Tissue sampling and time points for MRD assessment during follow-up and definition of complete molecular remission, molecular persistence at low copy number, molecular progression, and molecular relapse”).

Preleukemic founder clones (and associated mutations; typical examples are those observed for DNMT3A, ASXL1, and TET2 genes) may persist at significant levels, even upon achievement of complete morphological remission,36-38 but the detection of these may not reliably represent the presence of AML MRD and may not be of prognostic significance. Mutations in these genes also occur in healthy individuals with increasing frequency as they age.39-41 This is referred to as age-related clonal hematopoiesis or clonal hematopoiesis of indeterminate potential (CHIP).42 In AML, such mutations often occur very early in the process of malignant transformation.31,36,38,43,44 For many other acquired mutations (that may occur later during disease development), it is unknown whether they represent reliable AML MRD markers.

Germ line mutations in several genes, such as RUNX1, GATA2, CEBPA, DDX41, and ANKRD26, are associated with a risk of AML development.44 Naturally, such germ line mutations will not correlate with disease burden and, remaining at a variant allele frequency of 50%, will not be useful for MRD assessment. If potential somatic mutations in these genes are nevertheless used as MRD markers, we recommend excluding germ line origin by DNA sequencing from germ line tissue (skin biopsy, hair follicle, or buccal swab). Germ line origin or CHIP should be suspected and excluded if the mutation level is unchanged compared with diagnosis, despite decreased blast count.

WT1 expression45,46 (Table 2) should not be used as an MRD marker, because of low sensitivity and specificity, unless no other MRD markers, including flow cytometric ones, are available in the patient. If nevertheless WT1 is used, it should follow the validated WT1 MRD assay45 developed by ELN researchers, and preferably in PB.

In patients undergoing allogeneic hematopoietic stem cell transplantation (allo-HSCT) the analysis of donor/recipient chimerism in PB and/or BM has been suggested as an MRD marker. The conventional detection method using fragment analysis of short tandem repeats has limited sensitivity and therefore is not recommended for MRD.47 Modern techniques may allow higher sensitivity.35 In addition, variant allele-specific qPCR detecting small DNA insertions or deletions may be used as a sensitive method (10^-3) to detect autologous cells.48,49

Because of frequent losses or gains of certain mutations at relapse, we also recommend against the use of mutations in FLT3-ITD, FLT3-TKD, NRAS, KRAS, IDH1, IDH2, MLL-PTD, and expression levels of EVI1 as single markers of MRD. However, several of these nonrecommended markers may have more prognostic significance when used in combination with a second MRD marker.

Suggestions for further improvements.

  1. The combination of several markers for MRD assessment can overcome limitations of MRD assessment that arise from subclonal heterogeneity of AML and CHIP. Such combination analysis will become increasingly feasible with advances in NGS MRD. For example, a patient may present with mutations in TP53, ASXL1, and PTPN11. In complete remission, the ASXL1 mutation may persist at a high variant allele frequency because of clonal hematopoiesis and cannot further be used for MRD assessment. The PTPN11 mutated clone may be successfully eradicated by chemotherapy. However, the TP53 mutated clone may persist and be part of the relapse-inducing clone. Thus, analysis of several MRD markers in 1 patient may increase the likelihood of identifying molecular relapse.

  2. In allo-HSCT patients, germ line variants in genes associated with hematopoietic malignancy and mutations associated with CHIP should be evaluated as markers of recipient hematopoiesis to monitor MRD in the future.

Technical requirements for molecular MRD assessment

For reasons of sensitivity for qPCR, we recommend the use of cDNA over DNA for genes that are well expressed in AML cells (for technical details, see supplemental Data, “Technical requirements”). For new MRD markers, the expression level in AML cells should be evaluated. Detailed recommendations for MRD assays detecting RUNX1-RUNX1T1, CBFB-MYH11, and PML-RARA have been published by the Europe Against Cancer initiative, including appropriate housekeeping genes.50,51

Each MRD analysis by PCR should be run in triplicate. Amplification in at least 2 of 3 replicates with Ct values ≤40 (at a cycling threshold [CT] of 0.1) is required to define a result as PCR positive according to Europe Against Cancer criteria.50 As controls, we recommend including a wild-type sample (normal control), at least 2 positive controls that cover the desired sensitivity range, and a nontarget control (water control). If the positive controls are generated from plasmids, the stability of the plasmids should be monitored regularly.
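The replicate rule above lends itself to a short check. The sketch below is illustrative only: the function name and the convention of passing `None` for a replicate without amplification are assumptions of this example.

```python
def pcr_positive(ct_values, ct_cutoff=40.0, min_positive=2):
    """Apply the triplicate rule: positive if >= 2 of 3 replicates amplify
    with a Ct value at or below the cutoff.

    ct_values -- three Ct values; use None for a replicate with no amplification
    """
    if len(ct_values) != 3:
        raise ValueError("each MRD analysis should be run in triplicate")
    amplified = [ct for ct in ct_values if ct is not None and ct <= ct_cutoff]
    return len(amplified) >= min_positive


print(pcr_positive([36.2, 38.9, None]))   # two replicates within the cutoff
print(pcr_positive([39.5, None, None]))   # only one replicate amplified
print(pcr_positive([41.0, 40.5, 38.0]))   # two replicates exceed Ct 40
```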

After conversion of MRD from negative to positive, we recommend 2 specific measures to control for assay variability in the repeat samples: first, the initial sample in which molecular relapse was suspected should be included during the measurement of the repeat sample. Second, if the MRD assay is a real-time qPCR assay, standards should be included that cover the CT range of the patient samples to ensure linearity of the assay at the measured MRD level. If a negative MRD measurement is obtained, it is essential to know the sensitivity level at which it was determined. The following formula has been suggested to calculate the sensitivity of an individual real-time qPCR measurement, which can be used for absolute quantification using an external plasmid calibrator to estimate numbers of target molecules, as well as for relative quantification16,52: [equation image not recovered] (ABL, housekeeping gene ABL; diagnosis, MRD analysis at diagnosis; FU, MRD analysis during follow-up; slope, slope of the standard curve, for an assay with 100% efficiency = −3.32; target, target gene for MRD analysis)

We recommend reporting the individual assay sensitivity in patients with complete molecular remission.
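As a rough illustration of how a per-sample sensitivity can be reasoned about in relative quantification, the sketch below uses a formula that is an assumption of this example and not necessarily the published one. It assumes (1) a positivity cutoff of Ct 40, (2) equal amplification efficiency for target and ABL, described by a single standard-curve slope, and (3) that the target-to-ABL ratio at diagnosis defines 100% disease. Under those assumptions, the lowest detectable fraction is the ratio detectable at Ct 40 in the follow-up sample divided by the ratio observed at diagnosis. All function and parameter names are hypothetical.

```python
def assay_sensitivity(ct_abl_fu, ct_target_dx, ct_abl_dx,
                      slope=-3.32, ct_cutoff=40.0):
    """Estimate the lowest detectable MRD level as a fraction of diagnosis.

    ASSUMED formula (illustrative, not taken from the source):
        10 ** (((ct_cutoff - ct_abl_fu) - (ct_target_dx - ct_abl_dx)) / slope)
    """
    if slope >= 0:
        raise ValueError("standard-curve slope must be negative")
    cycles_available = ct_cutoff - ct_abl_fu   # headroom in the follow-up sample
    delta_ct_dx = ct_target_dx - ct_abl_dx     # target relative to ABL at diagnosis
    return 10 ** ((cycles_available - delta_ct_dx) / slope)


# Illustrative Ct values only.
s = assay_sensitivity(ct_abl_fu=25.0, ct_target_dx=24.0, ct_abl_dx=25.0)
print(f"estimated sensitivity: {s:.1e} (about 1 in {1 / s:,.0f})")
```

The sketch reproduces one qualitative behavior that holds regardless of the exact formula: a follow-up sample with poorer ABL amplification (higher Ct) leaves fewer cycles of headroom and therefore supports a less sensitive negative result.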

Tissue sampling and time points for MRD assessment during treatment

The details of sampling time points and corresponding tissue source are outlined in supplemental Table 2 and supplemental Data (under “Tissue sampling for MRD assessment”). During the treatment phase, we recommend molecular MRD assessment at minimum at diagnosis, after 2 cycles of standard induction/consolidation chemotherapy and after the end of treatment in PB and BM, as MRD in PB may provide better prognostic stratification. For patients undergoing allo-HSCT, MRD should be assessed in PB and BM after the last conventional chemotherapy, but not earlier than 4 weeks before conditioning treatment. The recommended thresholds for MRD positivity are discussed in the clinical section. The risk of relapse and overall survival probabilities for different MRD thresholds and constellations in prior studies are shown in Table 2.

Tissue sampling and time points for MRD assessment during follow-up and definition of complete molecular remission, molecular persistence at low copy number, molecular progression, and molecular relapse

In general, for patients with PML-RARA, RUNX1-RUNX1T1, CBFB-MYH11, mutated NPM1, and other molecular markers, we recommend molecular MRD assessment every 3 months for 24 months after the end of treatment in BM and in PB. Monitoring beyond 2 years of follow-up should be based on the relapse risk of the patient and decided individually. The prognostic impact of different MRD levels in follow-up is summarized in Table 2.

In this section and in supplemental Table 4, we specify outcome criteria of molecular MRD based on the depth of remission at the end of the treatment phase. Patients with complete morphological remission after treatment may be in complete molecular remission (CRMRD) or may have molecular persistence at low copy numbers. Patients in CRMRD may develop molecular relapse and patients with molecular persistence may develop molecular progression. It is not known yet whether molecular relapse and molecular progression have similar clinical characteristics or outcomes. Therefore, we currently recommend distinguishing between molecular progression and molecular relapse. Below, we briefly define these terms; the recommended frequencies of monitoring and preferable tissue source are outlined in supplemental Data (“Time points for MRD assessment…”) and supplemental Table 4.

Complete molecular remission (CRMRD).

To determine complete molecular remission (CRMRD) a patient must be in complete morphological remission (CR). We define CRMRD as 2 successive MRD negative samples obtained within an interval of ≥4 weeks at a sensitivity level of at least 1 in 1000. Negative MRD in the presence of blasts suggests molecular loss of the particular marker.

Molecular persistence at low copy numbers.

Molecular MRD may persist at low copy numbers, which is associated with a low risk of relapse. To label these patients, we suggest the definition of molecular persistence at low copy numbers, which we define as MRD with low copy numbers in patients with morphological CR (<100-200 copies/10^4 ABL copies corresponding to <1% to 2% of target to reference gene or allele burden)53,54 and a copy number or relative increase <1 log between any 2 positive samples collected after the end of treatment.

Molecular progression.

We define molecular progression in patients with molecular persistence at low copy number as an increase of MRD copy numbers ≥1 log10 between any 2 positive samples.

Molecular relapse.

Patients in complete morphological remission who achieve molecular remission may convert to positive MRD. We define molecular relapse as an increase of the MRD level of ≥1 log10 between 2 positive samples in a patient who previously tested negative in technically adequate samples.
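The four definitions above can be read as a small decision procedure. The following is a minimal sketch under several assumptions that are not part of the recommendations: the patient is taken to be in morphological CR, all samples are taken to be technically adequate at a sensitivity of at least 1 in 1000, the lower bound of the 100 to 200 copy range is used as the low copy cutoff, the 2 negative samples for CRMRD are taken to be at least 4 weeks apart, and any patient who never tested negative and shows a ≥1 log rise is labeled as progression. The names `Sample`, `max_log_rise`, and `classify` are hypothetical.

```python
import math
from dataclasses import dataclass

CR_MRD = "complete molecular remission (CRMRD)"
NEGATIVE_UNCONFIRMED = "MRD negative, confirmation pending"
PERSISTENCE = "molecular persistence at low copy numbers"
PROGRESSION = "molecular progression"
RELAPSE = "molecular relapse"
UNCLASSIFIED = "not classified by these definitions"


@dataclass
class Sample:
    week: int      # weeks after the end of treatment
    copies: float  # target copies per 10^4 ABL copies; 0 means MRD negative


def max_log_rise(positives):
    """Largest log10 rise from any earlier positive sample to any later one."""
    best, lowest = 0.0, None
    for s in positives:
        if lowest is not None:
            best = max(best, math.log10(s.copies / lowest))
        lowest = s.copies if lowest is None else min(lowest, s.copies)
    return best


def classify(samples, low_copy_cutoff=100.0):
    """Label a follow-up series from a patient assumed to be in morphological CR."""
    if not samples:
        raise ValueError("at least one sample is required")
    series = sorted(samples, key=lambda s: s.week)
    negatives = [i for i, s in enumerate(series) if s.copies == 0]
    if negatives:
        after = series[negatives[-1] + 1:]  # positive samples since the last negative
        if after:
            return RELAPSE if max_log_rise(after) >= 1.0 else UNCLASSIFIED
        if (len(series) >= 2 and series[-2].copies == 0
                and series[-1].week - series[-2].week >= 4):
            return CR_MRD
        return NEGATIVE_UNCONFIRMED
    if max_log_rise(series) >= 1.0:
        return PROGRESSION
    if all(s.copies < low_copy_cutoff for s in series):
        return PERSISTENCE
    return UNCLASSIFIED


if __name__ == "__main__":
    print(classify([Sample(0, 0), Sample(4, 0)]))
    print(classify([Sample(0, 50), Sample(12, 80)]))
    print(classify([Sample(0, 5), Sample(12, 60)]))
    print(classify([Sample(0, 0), Sample(4, 0), Sample(16, 3), Sample(20, 40)]))
```

Series that fit none of the definitions (for example, a single positive sample after a negative one) are deliberately left unclassified rather than forced into a category.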

How to report molecular MRD results

The recommended parameters that should be included in a report of molecular MRD assessments are listed in supplemental Table 5. We recommend reporting absolute copy numbers for reverse transcription polymerase chain reaction (RT-PCR) results, in addition to the fold increase, to enable clinicians to make their own judgments.

Future directions

As discussed previously, the predictive power of several mutations is low or needs to be clarified. For frequently occurring point mutations, this is challenging because, with current routine NGS approaches, the sensitivity of detecting these is ∼1%. A higher sensitivity of detecting point mutations can be obtained with digital droplet PCR (details in supplemental Data).55 A disadvantage of digital droplet PCR is that a specific assay needs to be developed for each mutation. Because this is time consuming and costly, this assay is especially suitable for sensitive detection of recurrent mutations, such as those in IDH1 and IDH2. Recent developments, including error-corrected NGS, also allow for highly sensitive point mutation detection (details in supplemental Data).56-58 A significant advantage of this NGS approach is that multiple mutations can be analyzed in a single patient sample. However, this approach does require more bioinformatic processing of data. Ultimately, this approach should provide greater sensitivity and, if adopted on BM and PB, may be able to identify low level mutations in terminally mature myeloid and lymphoid cells in PB; mutations of this nature are typically associated with clonal hematopoiesis and not leukemia.

Clinical discussion of MRD

MRD in clinical AML studies

During the last 20 years, numerous single institution studies in adult and pediatric patients have established that, regardless of the detection technique (MFC, RT-PCR, or NGS) and irrespective of hematopoietic cell transplantation, presence of MRD is associated with increased relapse risk and shorter survival in AML.4,16,59 Using a cutoff at a specified MRD detection threshold, the 2 resulting patient groups are referred to as “MRD positive” and “MRD negative,” although the latter is an oversimplification because improved outcomes do not necessarily require undetectable levels of MRD, while, inversely, a minority of MRD-negative patients will relapse as well.4,16

Two large prospective, multicenter studies (details in supplemental Data) have identified flow cytometry–based MRD as an independent prognostic indicator in adults with AML.28,30 In both studies MRD-positive patients had poorer outcome in multivariate analyses.28,30 In contrast to MFC, molecular assays enable MRD tracking in only a subset of patients.4 Currently, validated molecular MRD targets in AML include the PML-RARA translocation in APL, core-binding factor (CBF) translocations, and mutations in NPM1.4,16,60 As an example, NPM1-based MRD presented as the only independent prognostic factor for death in multivariate analysis.60 Details are in the supplemental Data.

Measurements of MRD using NGS techniques are under development but are not ready for routine application outside of clinical trials.56-58 Therefore, the current gold-standard measurements of MRD use complementary molecular and MFC-based techniques. Based on that, the following guidelines were constructed to facilitate the routine evaluation of MRD for AML patients in clinical practice, as well as for those participating in investigational trials.

General principles for clinical practice

In AML, morphology-based assessments of CR can be meaningfully refined with additional information about MRD.61,62 This is reflected in the 2017 ELN AML recommendations, which now include MRD as a new response criterion (CR with/without MRD).63 MRD monitoring should be considered part of the standard of care for AML patients. For molecular MRD this is limited to APL, CBF AML, and NPM1-mutated AML. For other AML patients, MRD should be assessed using MFC.4 This recommendation may change over time with emerging data for other molecular subgroups. Failure to achieve an MRD-negative CR or rising MRD levels during or after therapy are associated with disease relapse and inferior outcomes and should prompt consideration of changes in therapy, preferably in the setting of a controlled clinical trial.60,64 Although this is a rather rare event, it will have to be decided how to manage patients who are not in morphological CR but are in CR based on MRD assessment.

There are 2 concerns as to the clinical application of MRD. First, the use of cutoff levels in chemotherapy-based therapies generally reveals that different cutoff levels have different meanings in different risk groups in terms of patient outcome. Second, knowledge on the significance of MRD for patients treated with nonintensive therapies, for example DNA methyltransferase inhibitors (“hypomethylating agents”), is currently limited.65 We nevertheless suggest that such patients should be monitored for MRD, with the caveat that there are few data to guide interpretation of MRD results.

APL

In APL, the most important MRD end point is achievement of PCR negativity for PML-RARA at the end of consolidation treatment, either with ATRA + chemotherapy-based or ATRA + arsenic trioxide–based therapies. PCR negativity at the end of consolidation is associated with a low risk of relapse and a high chance of long-term survival (see Table 2).66,67 Detectable levels of PML-RARA by PCR during active treatment of APL should not change the treatment plan for an individual patient, and it is controversial whether serial PCR measurements of PML-RARA during treatment are of value outside of clinical trials.64,68

At the completion of therapy, a change in status of PML-RARA by PCR from undetectable to detectable, as measured in either BM or PB and confirmed by a repeat sample, heralds imminent disease relapse in APL.63,66

For patients with low- and intermediate-risk disease (by Sanz score69), who are treated with an ATRA and anthracycline-based regimen, monitoring in BM at completion of induction therapy and in BM or PB every 3 months for the first 2 years after remission is recommended. For patients with low/intermediate-risk Sanz score who are treated with ATO and ATRA, MRD analysis should be continued until the patient is in CRMRD in BM and then should be terminated.66 For patients with high-risk APL, BM or PB monitoring is recommended every 3 months after completion of therapy for at least 2 years. Early identification of molecular relapse could quicken clinical action (eg, reducing bleeding complications), but impact of early detection on clinical outcome has not been shown.64 Finally, the presence of a FLT3 mutation should neither change clinical management nor demand serial monitoring.
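The monitoring recommendations in the preceding paragraph can be summarized as a lookup keyed on Sanz risk group and regimen. The sketch below only restates the text; the function name `apl_monitoring` and the regimen labels `"ATRA+anthracycline"` and `"ATRA+ATO"` are hypothetical, and combinations the text does not address raise an error rather than return a guess.

```python
def apl_monitoring(sanz_risk: str, regimen: str) -> str:
    """Look up the monitoring approach described in the text for an APL patient.

    sanz_risk: "low", "intermediate", or "high"
    regimen:   "ATRA+anthracycline" or "ATRA+ATO" (illustrative labels)
    """
    risk = sanz_risk.strip().lower()
    if risk == "high":
        # The text gives one recommendation for high-risk APL, whatever the regimen.
        return "BM or PB every 3 months after completion of therapy for at least 2 years"
    if risk in ("low", "intermediate"):
        if regimen == "ATRA+anthracycline":
            return ("BM at completion of induction, then BM or PB every 3 months "
                    "for the first 2 years after remission")
        if regimen == "ATRA+ATO":
            return "continue MRD analysis until CRMRD in BM, then stop"
    raise ValueError(f"no rule for risk={sanz_risk!r}, regimen={regimen!r}")


if __name__ == "__main__":
    print(apl_monitoring("intermediate", "ATRA+ATO"))
    print(apl_monitoring("high", "ATRA+anthracycline"))
```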

CBF AML

CBFB-MYH11 [Inv(16)].

Despite the prognostic value of MRD in CBFB-MYH11 AML in terms of relapse rate (Table 2), no effect was noted on overall survival in multivariate analysis, probably because of the relatively high response rates of inv(16) AML to salvage treatment,70 and therefore no recommendation is made for a change in therapy (for more details, see supplemental Data).

MRD monitoring after 2 cycles of chemotherapy and after the end of therapy should be performed as described in the molecular paragraph (see also supplemental Data). It should be noted that low, stable levels of transcripts may be detectable by PCR for years after initial diagnosis without evidence of disease relapse.71

RUNX1-RUNX1T1 [t(8;21)].

As with CBFB-MYH11 positive AML, MRD assessment during the treatment phase of patients with RUNX1-RUNX1T1 positive AML is valuable for establishment of baseline transcript levels, but, with the controversies in prognostic impact of achieving MRD negativity either in PB or in BM (Table 2)72,73 (details in supplemental Data), there is no time point or MRD threshold during the active treatment phase that should trigger a recommendation to change therapy in patients with RUNX1-RUNX1T1 positive AML. MRD negativity at earlier time points was not prognostically relevant in patients with RUNX1-RUNX1T1 fusion.72,73 A >3 log reduction in BM between diagnosis and the end of induction 163 or consolidation73 was associated with significantly different relapse rates and a trend for longer OS in multivariate analysis. Patients who do not achieve >3 log reduction in transcripts have poor outcomes, but it is unclear whether this can be improved with allogeneic stem cell transplantation.

AML with NPM1 mutation, with or without other, concomitant mutations

MRD for NPM1 can be assessed by quantitative RT-PCR. The presence of measurable NPM1 transcripts in PB after at least 2 cycles of cytotoxic chemotherapy is associated with a high risk of relapse (>80%, Table 2).60 We recommend monitoring of NPM1 transcripts in BM and PB, if possible.60 If NPM1 MRD remains negative in PB but positive in BM after the end of treatment, transcripts should be closely monitored in PB and BM every 4 weeks for at least 3 months.60 If an upward trajectory of MRD, as defined by a log increase in either BM or PB, is detected, consideration should be given to salvage treatment.16,53,54 If a rising MRD titer is not confirmed or MRD becomes undetectable, then retesting may be performed at 3-month intervals for at least the first 2 years after the end of treatment.53,54,60
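The sequence of steps in this paragraph can be sketched as a simple rule function. This is an illustrative reading of the text, not a clinical algorithm: the name `npm1_next_step`, its boolean inputs, and the `close_monitoring_done` flag are assumptions introduced here, and situations the paragraph does not address are returned as not covered.

```python
CONSIDER_SALVAGE = "consider salvage treatment"
CLOSE_MONITORING = "monitor PB and BM every 4 weeks for at least 3 months"
ROUTINE_RETEST = "retest every 3 months for at least the first 2 years after end of treatment"
NOT_COVERED = "not covered by the rules sketched here"


def npm1_next_step(pb_positive: bool,
                   bm_positive: bool,
                   log_rise_detected: bool,
                   close_monitoring_done: bool = False) -> str:
    """Map NPM1 MRD findings after the end of treatment to the step named in the text."""
    if log_rise_detected:
        # An upward trajectory (a log increase in BM or PB) has been detected.
        return CONSIDER_SALVAGE
    if bm_positive and not pb_positive and not close_monitoring_done:
        # Negative in PB but positive in BM: enter the close monitoring phase.
        return CLOSE_MONITORING
    if close_monitoring_done or not (pb_positive or bm_positive):
        # Rising titer not confirmed, or MRD undetectable.
        return ROUTINE_RETEST
    return NOT_COVERED


if __name__ == "__main__":
    print(npm1_next_step(pb_positive=False, bm_positive=True, log_rise_detected=False))
    print(npm1_next_step(pb_positive=False, bm_positive=True, log_rise_detected=True))
```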

AML with BCR-ABL1

BCR-ABL positive AML was included as a provisional entity in the 2016 World Health Organization classification.74 Nearly half of the patients present with the p190 transcript, which is rarely found in chronic myeloid leukemia patients.74 The prognostic value of BCR-ABL MRD in AML is largely unknown, and therefore, no specific recommendations on clinical cutoffs and their prognostic impact in AML patients can be given.

Other molecular MRD markers

MRD thresholds and time points for other molecular MRD markers have not been defined sufficiently to provide recommendations.16 Based on current experience with fusion genes, we recommend that future MRD studies report achievement of MRD negativity in PB and BM at the time points after 2 cycles of chemotherapy and after the end of treatment.

AML subgroups not including APL, CBF AML, and AML with NPM1 mutation

MRD for patients not included in the molecularly defined subgroups APL, CBF AML, AML with NPM1 mutation, and AML with BCR-ABL1 should be measured using MFC. Having undetectable levels of MRD using MFC is associated with significantly better outcomes than having measurable disease,4,16,59 even in the setting of allogeneic stem cell transplantation.28,30

Pretransplant MRD

Evidence is accumulating that the presence of MRD assessed by MFC immediately prior to allo-HSCT is a strong, independent predictor of posttransplant outcomes in AML.75 In a recent update, Walter et al showed that MRD status had strong predictive value in both the ablative and nonmyeloablative transplant settings, with MRD-defined depth of response prior to transplant being the most important predictor of transplant outcome.3,76 Unfortunately, conversion from MRD positivity pretransplant to MRD negativity after myeloablative conditioning does not substantially improve relapse rate or OS.77

On the other hand, in NPM1-mutated patients, MRD had prognostic impact,78 and allo-HSCT resulted in improved overall survival only in patients who achieved a suboptimal reduction (<4 log10) of NPM1 levels after chemotherapy. However, no prospective studies using MRD to guide postremission therapy are available at the time of this publication.

Recommendations for MRD monitoring in clinical trials

CRMRD+ patients have inferior outcomes even in the setting of allo-HSCT, representing an unmet medical need, and should be considered for enrollment in controlled clinical trials. Assessing whether eradication or reduction of MRD using either existing or experimental therapies can (a) be accomplished or (b) result in improved outcomes should be a goal of clinical trials.

All clinical trials should require molecular and/or MFC MRD at all times of evaluation of response, using the technical guidelines in this manuscript.4,29,33

Use of MRD as a surrogate end point for survival to accelerate drug approval

Clearly, MRD is used in clinical practice to guide the care of individual patients, but more data are required to establish the use of MRD as a surrogate end point for clinical trials in AML.4 If MRD negativity is established as a surrogate end point for survival, it is likely to be helpful for the evaluation of new drugs, possibly accelerating drug approval or stopping development of suboptimal drugs or treatment strategies. Currently, 2 studies strongly suggest that MRD can be used as a surrogate for overall survival end points. In CBF-AML, better clinical outcomes with higher dosage of daunorubicin were found to be associated with MRD level,79 while in another study, improved overall survival with the addition of gemtuzumab ozogamicin to standard induction therapy correlated with MRD status.80

Concluding remark

Recommendations for the MFC, molecular, and clinical aspects are summarized in Table 3.

Table 3.

ELN recommendations for MRD assessment

Acknowledgments

The authors gratefully acknowledge Johnson & Johnson for supporting a meeting of the ELN MRD Working Party and Rudiger Hehlmann for his continuous generous support of these recommendations on behalf of the European LeukemiaNet. K.D. was supported in part by grants from the Else Kröner-Fresenius-Stiftung Germany (project 2014_A298) and the Sonderforschungsbereich (SFB) 1074 funded by the Deutsche Forschungsgemeinschaft (SFB 1074, project B3). M.H. was appointed to the Heisenberg chair of the Deutsche Forschungsgemeinschaft (DFG grant HE 5240 / 6-1). C.S.H. was supported in part by the Intramural Research Program of the National Heart, Lung, and Blood Institute, National Institutes of Health (grant 1ZIAHL006163). G.J.R. acknowledges Leukemia Fighters for support, and C.T. acknowledges support by Bundesministerium für Bildung und Forschung (grant BMBF 031A24). P.V. was supported by the Oxford Biomedical Research Centre funded by the National Institute for Health Research UK and the Medical Research Council (grants MC_UU_12009/11 and G1000729 ID: 94931 and MR/L008963/1). R.B.W. is a Leukemia & Lymphoma Society Scholar in Clinical Research.

Authorship

Contribution: All authors gathered and/or reviewed specific literature; G.J.S., M.H., S.F., K.D., G.J.R., and G.J.O. wrote first drafts of specific sections; all authors reviewed specific sections; G.J.S. and G.J.O. put together the specific sections; and all authors, except D.G., who died at an earlier stage, reviewed and approved the final draft.

Conflict-of-interest disclosure: M.-C.B. received research support (Harmonemia project) from Beckman Coulter. J.C. received research funding from Helsinn Healthcare, Janssen Pharmaceuticals, Merus, and Takeda. S.F. received support from National Institute for Health Research, CRUK, and Bloodwise. T.H. and W.K. are both part owners of Munich Leukemie Laboratory. C.S.H. received research funding from Merck and Sellas. G.J.O. provided consultancy services to Janssen and Sunesis; served on the advisory board for Novartis, Pfizer, BMS, Janssen, Sunesis, Celgene, Karyopharm, Amgen, and Seattle Genetics; and received research funding from Novartis, Janssen, Celgene, Immunogen, and Becton Dickinson. G.J.R. provided consultancy services to AbbVie, Amgen, Amphivena Therapeutics, Astex Pharmaceuticals, Array BioPharma Inc., Celgene, Clovis Oncology, CTI BioPharma, Genoptix, Immune Pharmaceuticals, Janssen Pharmaceutica, Jazz Pharmaceuticals, Juno Therapeutics, MedImmune, Novartis, Onconova Therapeutics, Orsenix, Pfizer, Roche/Genentech, and Sunesis Pharmaceuticals and received research support from Cellectis. G.J.S. received research funding from Novartis, Janssen, Immunogen, and Becton Dickinson. C.T. is part Chief Research Officer and Chief Executive Officer and owner of AgenDix GmbH, a company performing molecular diagnostics. The remaining authors declare no competing financial interests.

David Grimwade, a pioneer in the field of MRD in AML and an active participant in the present work, died on 16 October 2016.

Correspondence: Gerrit J. Schuurhuis, Department of Hematology, VU University Medical Center, De Boelelaan 1117, 1081HV Amsterdam, The Netherlands; e-mail: gj.schuurhuis{at}vumc.nl.

Footnotes

  • * M.H. and S.F. contributed equally to this study.

  • K.D. and G.J.R. contributed equally to this study.

  • The online version of this article contains a data supplement.

  • Submitted September 5, 2017.
  • Accepted January 3, 2018.
