Gene-expression profiling identifies distinct subclasses of core binding factor acute myeloid leukemia

Lars Bullinger, Frank G. Rücker, Stephan Kurz, Juan Du, Claudia Scholl, Sandrine Sander, Andrea Corbacioglu, Claudio Lottaz, Jürgen Krauter, Stefan Fröhling, Arnold Ganser, Richard F. Schlenk, Konstanze Döhner, Jonathan R. Pollack and Hartmut Döhner


Core binding factor (CBF) leukemias, characterized by either inv(16)/t(16;16) or t(8;21), constitute acute myeloid leukemia (AML) subgroups with favorable prognosis. However, there exists substantial biologic and clinical heterogeneity within these cytogenetic groups that is not fully reflected by the current classification system. To improve the molecular characterization, we profiled gene expression in a large series (n = 93) of AML patients with CBF leukemia [inv(16), n = 55; t(8;21), n = 38]. By unsupervised hierarchical clustering we were able to define a subgroup of CBF cases (n = 35) characterized by shorter overall survival times (P = .03). While there was no obvious correlation with fusion gene transcript levels or with FLT3 tyrosine kinase domain, KIT, and NRAS mutations, the newly defined inv(16)/t(8;21) subgroup was associated with elevated white blood cell counts and FLT3 internal tandem duplications (P = .011 and P = .026, respectively). Supervised analyses of gene expression suggested alternative cooperating pathways leading to transformation. In the “favorable” CBF leukemias, antiapoptotic mechanisms and deregulated mTOR signaling might play a role, whereas in the newly defined “unfavorable” subgroup, aberrant MAPK signaling and chemotherapy-resistance mechanisms might be involved. While the leukemogenic relevance of these signatures remains to be validated, their existence nevertheless supports a prognostically relevant biologic basis for the heterogeneity observed in CBF leukemia.


Characterized by either t(8;21)(q22;q22) and its variants [abbreviated t(8;21)] or inv(16)(p13q22)/t(16;16)(p13;q22) [abbreviated inv(16)], core binding factor (CBF) acute myeloid leukemias (AMLs) have been shown to constitute AML subgroups with favorable prognosis.1-3 However, the current World Health Organization classification4,5 does not fully reflect the biologic and clinical heterogeneity within these cytogenetically defined subgroups.

At the molecular level, t(8;21) and inv(16) result in the fusion genes RUNX1/CBFA2T1 and CBFB/MYH11, respectively,6-8 that lead to the disruption of the CBF complex, a transcription factor complex involved in the regulation of hematopoiesis.9 The CBF complex consists of a heterodimer of the RUNX1 (formerly AML1) and the CBFB protein and normally activates a number of genes critical for normal myeloid development. In CBF AML, the fusion proteins act as dominant negative forms of the CBF, thereby impairing hematopoietic differentiation and predisposing to leukemic transformation.10 However, knock-in mouse models have demonstrated that t(8;21) and inv(16) by themselves are not sufficient to cause a leukemic phenotype11,12 and that additional aberrations are essential for the development of AML.11,13,14

These findings suggest a multistep nature of leukemogenesis, which may explain why patients differ with respect to several biologic and clinical features and why almost one third of patients relapse within the first year following intensive chemotherapy.15,16 Secondary chromosome abnormalities, such as the commonly observed loss of a sex chromosome (-Y or -X) and/or deletions of the long arm of chromosome 9 in t(8;21), and trisomies of chromosomes 22, 8, and 21 in inv(16),15,16 likely contribute to the heterogeneity of CBF leukemias. Recent molecular analyses have also provided important insights into the pathogenesis of myeloid disorders, and the commonly detected mutations of the KIT and NRAS genes as well as deregulated CEBPA expression have been identified as likely candidates for cooperating events in CBF AML.17 Furthermore, an alternatively spliced isoform of the RUNX1/CBFA2T1 transcript has been identified that may be involved in the development of t(8;21)-positive leukemias.18 However, despite this progress the molecular biology underlying CBF AML is still not fully understood.

Recently, DNA microarray technology-based gene-expression profiling (GEP) studies have shown the power of genomewide analysis to capture the molecular heterogeneity of AML.19-22 Interestingly, gene-expression analyses also captured the molecular variation within CBF leukemia, showing that t(8;21) or inv(16) cases are not each tightly correlated, with each class, t(8;21) and inv(16), being separated into mainly 2 groups.19 In agreement, Valk and colleagues also observed a molecular variation within their “homogenously grouped” CBF cases using less stringent clustering criteria.20 Distinct patterns of gene expression within each of the subgroups might reflect alternative cooperating mutations and/or deregulated pathways leading to transformation, because the t(8;21) and inv(16) themselves are not sufficient for leukemogenesis.

Hence, we analyzed a large series (n = 93) of AML patients with CBF leukemia [inv(16), n = 55; t(8;21), n = 38] using DNA microarray technology and correlated findings with known collaborating aberrations in CBF AML like additional cytogenetic and molecular genetic aberrations in order to (1) provide a refined molecular characterization of CBF leukemia and to (2) get new insights into the biology of CBF leukemia. Here we report the results leading to the identification of clinically relevant subclasses, highlighting genes and pathways of potential pathogenic relevance that provide a basis for novel molecular targeted therapeutic approaches in CBF AML.

Patients, materials, and methods


The 93 samples (49 peripheral blood [PB] and 44 bone marrow [BM] specimens) from adult AML patients were provided by the German and Austrian AML Study Group (AMLSG), with patient informed consent obtained in accordance with the Declaration of Helsinki and institutional review board approval from all participating centers. Patients were entered into 1 of 3 AMLSG treatment protocols (AML HD93 and AML HD98-A for younger adults [age less than 60 years] and AML HD98-B for elderly patients [age 60 years and older], enrolled between November 1994 and March 2004). Patients less than 60 years of age (n = 78) received intensive induction and consolidation therapy, whereas elderly patients 60 years and older (n = 15) were treated less intensely (for protocol details, see Schlenk et al16,23). Patient age at the time of diagnosis ranged from 19 to 73 years (median, 47 years). Clinical characteristics at the time of diagnosis were available for almost all cases as detailed in Table 1. Estimated median follow-up time for the patients with survival information (n = 89) was 52.5 months (95% confidence interval [CI], 47 to 59 months).

Table 1

Distribution of factors by type of CBF AML at diagnosis

Cytogenetic and molecular genetic analyses

Conventional chromosome banding, fluorescence in situ hybridization (FISH), and FLT3 mutational analysis (screening for internal tandem duplications [ITDs] and tyrosine kinase domain [TKD] mutations) were performed as previously described24,25 at the central reference laboratory of the German and Austrian AMLSG at our institution.

RUNX1/CBFA2T1 and CBFB/MYH11 fusion gene transcript levels at the time of diagnosis were evaluated by quantitative reverse transcriptase-polymerase chain reaction (RT-PCR) as previously reported26,27 by using the following primers and probes: RUNX1 primer 5′-AATCACAGTGGATGGGCCC-3′; CBFA2T1 primer 5′-TGCGTCTTCACATCCACAGG-3′; RUNX1/CBFA2T1 probe 5′-FAM-CTGAGAAGCACTCCACAATGCCAGACT-TAMRA-3′; CBFB primer 5′-AGGTCTCATCGGGAGGAAATG-3′; MYH11 primer 5′-TCTTCATCTCCTCCATCTGGGT-3′; CBFB/MYH11 probe 5′-FAM-CCATGAGCTGGAGAAGTCCAAGCG-TAMRA-3′.

Exemplary technical validation of microarray-based gene-expression findings was performed accordingly using the following primers and probes: FOXO1A forward 5′-CTCATGGATGGAGATACATTGGATTT-3′; FOXO1A reverse 5′-GGTGAAAGACATCTTTGGACTGCTT-3′; FOXO1A probe 5′-FAM-CTAACCCTCAGCCTGACACCCAGCTAT-TAMRA-3′; MLL5 forward 5′-GGGTTGATACAGCAGAGACGTCA-3′; MLL5 reverse 5′-GGATTTCTCAACTACCACAGGGC-3′; MLL5 probe 5′-FAM-TGGCTGCAGGTTCAGAACCAGAATCC-TAMRA-3′; ETS1 forward 5′-CCGTACGTCCCCCACTCCT-3′; ETS1 reverse 5′-TGGAATGTGCAGATGTCCCA-3′; and ETS1 probe 5′-FAM-CGTCGATCTCAAGCCGACTCTCACCAT-TAMRA-3′.

Screening for mutations in KIT and NRAS was performed by a denaturing high-performance liquid chromatography (dHPLC)–based method using a WAVE dHPLC-system (Transgenomic, Omaha, NE) as previously reported.28 In brief, DNA isolation was performed as described,24,29 and 50 ng DNA was used for all PCR amplifications with the previously published primers for KIT (exons 8 and 17) and NRAS (exon 1)30 as well as NRAS (exon 2).31 Cycling conditions for mutation detection were as follows: 1 cycle, 2 minutes at 95°C; 35 cycles, 30 seconds at 94°C, 1 minute at 56°C, and 1 minute at 72°C; and 1 cycle, 10 minutes at 72°C. Heteroduplexes were then generated by means of a thermal cycler as follows: 95°C for 5 minutes; annealing/stabilization (starting at 94°C for 2 minutes with a −1°C touchdown until 45°C). Then, 5 to 10 μL of heteroduplexed PCR products were subsequently subjected to dHPLC. The elution temperature varied according to the analyzed products: KIT exon 8: 56°C; KIT exon 17: 56.2°C; NRAS exon 1: 59.8°C; NRAS exon 2: 58.2°C. The exact mutant sequence was confirmed for all samples showing an abnormal dHPLC profile. PCR products were purified followed by direct sequencing with the reverse primers using an ABI-PRISM310 genetic analyzer (Applied Biosystems, Foster City, CA).

Gene-expression profiling

Gene-expression profiling (GEP) was performed essentially as reported in all 93 samples using the previously described cDNA microarray platform (26 of these cases have already been published, whereas 67 cases were newly analyzed).19 The percentage of blasts for PB and BM samples prior to enrichment for leukemic cells by Ficoll-density gradient centrifugation ranged from 25% to 97% (median 53%) and 25% to 91% (median, 68%), respectively. Following enrichment all samples contained at least 80% leukemic cells. Fluorescence ratios were normalized by mean-centering genes for each array and then by mean-centering each gene across all arrays within a large AML data set (n = 260). For subsequent analyses, we only included well-measured genes whose expression varied as determined by signal intensity over background greater than 2-fold in either test or reference channel in at least 75% of samples and 4-fold ratio variation from the mean in at least 2 samples; 8556 genes met these criteria. The complete gene-expression microarray data set is available at the Stanford Microarray Database32 and the filtered data set is provided as Table S1 (available on the Blood website; see the Supplemental Materials link at the top of the online article). For hierarchical clustering, we applied average-linkage hierarchical clustering and visualized results using TreeView.33
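The normalization and filtering steps described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: it assumes a small genes-by-arrays matrix of log2 fluorescence ratios, represents poorly measured spots as None, and treats "4-fold ratio variation from the mean" as an absolute centered log2 ratio of at least 2. The function names and the toy matrix are hypothetical.

```python
import math
from statistics import mean

def center_arrays(matrix):
    """Subtract each array's (column's) mean log2 ratio, skipping missing values (None)."""
    n_arrays = len(matrix[0])
    col_means = []
    for j in range(n_arrays):
        vals = [row[j] for row in matrix if row[j] is not None]
        col_means.append(mean(vals) if vals else 0.0)
    return [[None if v is None else v - col_means[j] for j, v in enumerate(row)]
            for row in matrix]

def center_genes(matrix):
    """Subtract each gene's (row's) mean across all arrays, skipping missing values."""
    out = []
    for row in matrix:
        vals = [v for v in row if v is not None]
        mu = mean(vals) if vals else 0.0
        out.append([None if v is None else v - mu for v in row])
    return out

def passes_filter(row, min_present=0.75, fold=4.0, min_samples=2):
    """Keep a gene that is well measured in >= 75% of samples and deviates
    >= 4-fold from its mean (|centered log2 ratio| >= 2) in >= 2 samples."""
    present = [v for v in row if v is not None]
    if len(present) < min_present * len(row):
        return False
    threshold = math.log2(fold)
    return sum(abs(v) >= threshold for v in present) >= min_samples

# Toy genes-by-arrays matrix of log2 ratios (hypothetical values).
log_ratios = [
    [3.0, -3.0, 0.0, 0.0],
    [0.1, 0.0, -0.1, 0.0],
    [2.5, None, None, -2.5],
]
normalized = center_genes(center_arrays(log_ratios))
kept = [row for row in normalized if passes_filter(row)]
```

In the study the gene-wise centering was done within a larger AML data set (n = 260), so the gene means would come from that set rather than from the CBF cases alone.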

Array CGH

For a subset of 58 cases [t(8;21), n = 29; inv(16), n = 29] array comparative genomic hybridization (CGH) experiments were performed as previously described using an 8k BAC/PAC microarray platform.34 Fluorescence ratios were normalized using the median of the fluorescence ratios computed as log2 values from the DNA control fragments spanning the whole genome. For each individual experiment, the cutoff level was determined from an individual set of balanced clones, which was used to calculate the mean and standard deviation. We then defined the cutoff level as the mean ± 3 times the standard deviation. Frequently affected regions recently detected as copy number polymorphisms were excluded from data analysis.35,36 Table S2 provides the entire normalized array CGH data set. Parallel analysis of gene-expression and array CGH data was performed as previously described.34,37 Map positions for arrayed cDNA clones were assigned using the National Center for Biotechnology Information (NCBI) May 2005 genome assembly accessed through the University of California Santa Cruz (UCSC) genome browser. In this way, approximately 35 000 arrayed clones representing 18 000 unique genes could be assigned map positions.
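The per-experiment cutoff rule (mean ± 3 standard deviations of a set of balanced clones) can be illustrated as follows. This is a sketch under assumptions: the text does not say whether the sample or population standard deviation was used (the sample form is used here), and the function names and ratios are hypothetical.

```python
from statistics import mean, stdev

def cgh_cutoffs(balanced_log2_ratios, k=3.0):
    """Return (lower, upper) cutoffs as mean -/+ k standard deviations
    of the log2 ratios of clones assumed to be balanced."""
    mu = mean(balanced_log2_ratios)
    sd = stdev(balanced_log2_ratios)  # sample SD; an assumption
    return mu - k * sd, mu + k * sd

def call_clone(log2_ratio, lower, upper):
    """Classify a clone relative to the experiment-specific cutoffs."""
    if log2_ratio > upper:
        return "gain"
    if log2_ratio < lower:
        return "loss"
    return "balanced"

# Hypothetical log2 ratios of balanced control clones from one experiment.
balanced = [-0.05, 0.0, 0.05, 0.02, -0.02]
lower, upper = cgh_cutoffs(balanced)
calls = [call_clone(r, lower, upper) for r in (-0.5, 0.03, 0.4)]
```

Because the cutoffs are recomputed for every hybridization, a noisier experiment automatically gets wider thresholds.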

Data analysis

To evaluate the robustness of our hierarchical clusters, we used the R (reproducibility) measure38 based on perturbing the expression data with gaussian noise, reclustering, and measuring the similarity of the new clusters to the original clusters. The perturbation and reclustering was done 100 times. For each pair of samples in a cluster of the original data, the R measure is the proportion of the time they stay in the same cluster after perturbation and reclustering. The R measure is expressed for each original cluster as an average over all pairs of samples and all perturbations and reclustering.
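The perturb-and-recluster procedure behind the R measure can be sketched as below. This is an illustration under assumptions, not the implementation used in the study: it uses a naive average-linkage clustering with Euclidean distance cut at 2 clusters, and the noise level sigma, the function names, and the toy samples are hypothetical.

```python
import math
import random
from itertools import combinations

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(samples, k=2):
    """Naive agglomerative average-linkage clustering; returns one label per sample."""
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(euclid(samples[i], samples[j])
                        for i in clusters[a] for j in clusters[b])
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]  # merge the closest pair (a < b)
        del clusters[b]
    labels = [0] * len(samples)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def r_measure(samples, sigma=0.5, n_perturb=100, k=2, seed=0):
    """For each original cluster, average over sample pairs the proportion of
    perturbed reclusterings in which the pair stays together."""
    rng = random.Random(seed)
    original = average_linkage(samples, k)
    pairs = [(i, j) for i, j in combinations(range(len(samples)), 2)
             if original[i] == original[j]]
    together = {p: 0 for p in pairs}
    for _ in range(n_perturb):
        noisy = [[x + rng.gauss(0.0, sigma) for x in s] for s in samples]
        labels = average_linkage(noisy, k)
        for i, j in pairs:
            if labels[i] == labels[j]:
                together[(i, j)] += 1
    result = {}
    for c in sorted(set(original)):
        vals = [together[p] / n_perturb for p in pairs if original[p[0]] == c]
        result[c] = sum(vals) / len(vals) if vals else float("nan")
    return result

# Two well-separated toy groups of samples (hypothetical expression vectors).
toy = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
r_by_cluster = r_measure(toy, sigma=0.1, n_perturb=20)
```

An R value near 1 means that sample pairs almost always recluster together, which is how the per-cluster values reported in the Results would be read.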

For 2-class supervised analyses, we identified genes that were differentially expressed among the 2 classes by using the significance analysis of microarrays (SAM) method,39 which uses a modified t test statistic, with sample-label permutations to evaluate statistical significance. For class prediction we performed the prediction analysis for microarrays (PAM) method40 based on nearest shrunken centroids to define a cross-validated gene-expression predictor for the cluster-defined CBF classes.
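To make the "modified t test statistic, with sample-label permutations" concrete, the following sketch computes a SAM-style statistic for a single gene and a permutation-based P value. It is a simplified illustration: the actual SAM method chooses the fudge constant s0 from the data and estimates a false discovery rate across all genes, whereas here s0 is fixed arbitrarily and only one gene is tested. Function names and values are hypothetical.

```python
import random
from statistics import mean

def sam_d(x1, x2, s0=0.1):
    """SAM-style relative difference: (mean1 - mean2) / (s + s0),
    where s is the pooled standard error and s0 a small positive constant."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    ss = sum((v - m1) ** 2 for v in x1) + sum((v - m2) ** 2 for v in x2)
    s = ((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / (s + s0)

def permutation_p(x1, x2, n_perm=1000, s0=0.1, seed=0):
    """Two-sided P value from random reassignment of the sample labels."""
    rng = random.Random(seed)
    observed = abs(sam_d(x1, x2, s0))
    pooled = list(x1) + list(x2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(sam_d(pooled[:len(x1)], pooled[len(x1):], s0)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical expression values of one gene in two classes.
d = sam_d([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
p = permutation_p([10.0, 11.0, 12.0, 13.0, 14.0], [1.0, 2.0, 3.0, 4.0, 5.0])
```

The constant s0 keeps genes with very small variance from producing extreme statistics, which is the main difference from an ordinary t test.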

We identified gene ontology (GO) groups of genes whose expression was differentially regulated among the classes by computing the number of genes represented on the microarray in the respective GO group and the statistical significance P value for each gene in the group. These P values reflect differential expression among classes and were computed based on random variance t tests.41 For each GO group, 2 statistics were computed that summarize the P values for genes in the group: the Fisher (LS) statistic and the Kolmogorov-Smirnov (KS) statistic.42 We considered a GO category significantly differentially regulated if the significance level was less than .005. All GO categories with between 5 and 100 genes represented on the array were considered with some of the categories showing an overlap. The same computational algorithm was used to identify groups of genes belonging to distinct BioCarta or KEGG pathways whose expression was differentially regulated among the classes.
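The two gene-set summary statistics can be sketched as follows. The definitions used here are assumptions based on a common formulation (LS as the mean negative natural logarithm of the per-gene P values, KS as the largest difference between i/N and the i-th smallest P value); the text above does not spell them out. The resampling helper, which compares a gene set with random sets of the same size, is likewise a hypothetical illustration.

```python
import math
import random

def ls_statistic(p_values):
    """Mean negative natural log of the per-gene P values (all must be > 0)."""
    return sum(-math.log(p) for p in p_values) / len(p_values)

def ks_statistic(p_values):
    """Largest difference between i/N and the i-th smallest P value."""
    ordered = sorted(p_values)
    n = len(ordered)
    return max((i + 1) / n - p for i, p in enumerate(ordered))

def gene_set_p(set_p, all_p, n_resample=1000, seed=0):
    """Share of random same-size gene sets whose LS statistic is at least
    as large as that of the gene set of interest."""
    rng = random.Random(seed)
    observed = ls_statistic(set_p)
    hits = sum(ls_statistic(rng.sample(all_p, len(set_p))) >= observed
               for _ in range(n_resample))
    return (hits + 1) / (n_resample + 1)

# Hypothetical per-gene P values for one GO category.
ls = ls_statistic([math.exp(-1), math.exp(-3)])
ks = ks_statistic([0.01, 0.02, 0.03, 0.04])
```

Both statistics grow when a category contains many small P values, so a category can be flagged even if no single gene in it is individually striking.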

Survival times and censored waiting times measured from the date of diagnosis were plotted with Kaplan-Meier estimates. Cumulative incidence of relapse (CIR) and cumulative incidence of death (CID), their SE, and differences between groups were estimated according to Gray.43 The median duration of follow-up was calculated according to the method of Korn.44 Groupwise comparisons of the distributions of clinical and laboratory variables were performed using Fisher exact test and the Cochran-Armitage test. All tests were 2-sided. An effect was considered significant if the (adjusted) P value was .05 or less. The analyses were performed using BRB-Array Tools Version 3.3.0 Beta_3 developed by Dr Richard Simon and Amy Peng Lam and using R, version 2.2.1.
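For readers unfamiliar with the Kaplan-Meier estimate used for the survival curves, a minimal product-limit sketch is shown below. The follow-up times and event flags are hypothetical, and the function is an illustration rather than the R routine used in the study.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.
    events[i] is True for a death and False for a censored observation."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        removed = sum(1 for tt, _ in data if tt == t)
        if deaths:
            surv *= 1 - deaths / at_risk  # product-limit step
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored cases leave the risk set
        i += removed
    return curve

# Hypothetical follow-up in months; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
# curve == [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Censored patients lower the number at risk without producing a step in the curve, which is why they are usually marked separately in the plots.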


Gene-expression–based CBF subclass discovery

We profiled gene expression in 93 diagnostic peripheral blood and bone marrow samples using cDNA microarrays to survey the molecular variation of CBF AML. To explore the relationship among samples as well as the underlying patterns of gene expression, we performed an unsupervised 2-way hierarchical cluster analysis33 using the 8556 genes whose expression varied most across samples (Figure 1A). In accordance with the previously observed heterogeneity within the CBF subgroups,19 the t(8;21) and inv(16) cases were not tightly correlated, with each cytogenetic class being segregated by the 2 main clusters defined by gene-expression profiling. Interestingly, within the larger cluster (group II) t(8;21) and inv(16) cases grouped mainly into 2 homogeneous t(8;21) and inv(16) classes, while in the other cluster (group I) t(8;21) and inv(16) cases were interspersed throughout, thereby forming one “mixed” t(8;21)/inv(16) CBF subgroup (Figure 1A).

Figure 1

Unsupervised hierarchical cluster analyses. (A) Thumbnail overview of an unsupervised 2-way hierarchical cluster analysis of 93 CBF AML cases (columns) and 8556 variably expressed genes (rows). Mean-centered gene-expression ratios are depicted by a log2 pseudocolor scale (indicated). Gray denotes poorly measured data. Samples are color-coded according to the cytogenetic groups t(8;21) and inv(16). The sample dendrogram shows that CBF samples separated into 2 major subgroups, as indicated. Gene clusters characterizing the respective groups as well as t(8;21) and inv(16) are highlighted by colored bars. (B) Kaplan-Meier estimates of overall survival in the 2 CBF subgroups; the difference between groups I and II was significant (P = .029, log-rank test). The “x” symbols indicate censored data. (C,D) Kaplan-Meier estimates of overall survival of CBF subgroups based on unsupervised hierarchical cluster analysis in t(8;21) (C) and inv(16) cases only (D).

To evaluate the robustness of our hierarchical clusters we reclustered our samples 100 times using hierarchical clustering and measured the proportion of the time each sample stayed in the same cluster (Figure 2A). The consensus index of samples within consensus cluster no. 1 (group I) was r = 0.836, and samples in consensus cluster no. 2 (group II) were also tightly correlated (r = 0.815). We only observed that 2 “borderline cases” of group II were more often assigned to group I by reclustering the samples 100 times. Thus, these data suggest that group I and II represent robust classes.

Figure 2

Consensus clustering. (A) For each pair of samples in the unsupervised hierarchical cluster analysis, the R measure (the proportion of the time sample pairs stay in the same cluster during consensus clustering) is indicated for each original cluster as an average over all pairs of samples. The R measures for individual pairs of samples are color-coded with white indicating that a given sample pair clustered 100 times in the same group and red denoting no coclustering. The white diagonal line displays the intraindividual comparison of results for a patient with AML (ie, 100× coclustering). (B) Kaplan-Meier estimates of overall survival in the 2 CBF consensus clusters; the difference between cluster no. 1 and no. 2 was significant (P = .046, log-rank test).

Technical validation of microarray-based gene-expression findings was performed for selected genes, FOXO1A, MLL5, and ETS1. In accordance with findings from previous studies, we found a high correlation of our microarray and quantitative RT-PCR data with correlations of 0.84, 0.81, and 0.86 for FOXO1A, MLL5, and ETS1, respectively (data not shown).

Correlation with clinical and genomic findings

To gain further insight into the significance of the new subtypes, we examined the distribution of relevant clinical and molecular genetic parameters among samples. Sex, age, and percentage of bone marrow blasts were evenly distributed between the CBF subclasses group I and II (Table 2). For French-American-British (FAB) subtypes there was a trend toward a correlation of FAB M4 with group I cases (P = .085, Fisher exact test), which is likely due to the higher frequency of inv(16) cases in this group (P = .035, Fisher exact test). Furthermore, cases in group I displayed higher white blood cell counts (WBCs) (P = .011, Fisher exact test); this effect could be mainly attributed to inv(16) cases, which had a higher WBC in group I compared with group II (Table 2). While there was a significant association with loss of one X chromosome copy in females in group II (P = .019, Fisher exact test), there was no correlation with secondary chromosome aberrations known to be prognostically relevant in CBF AML, such as loss of the Y chromosome in male t(8;21) cases or trisomy 22 in inv(16) leukemias.15,16 There was no correlation with fusion gene transcript levels at diagnosis. Regarding the distribution of molecular aberrations, while FLT3-ITDs were more prevalent in group I (P = .026, Fisher exact test), there was no significant association of the newly defined groups with FLT3-TKD, KIT, or NRAS mutations (Table 2). Notably, the significant association of group I with FLT3-ITD was based on only 4 FLT3-ITD-positive cases. Thus, the difference between group I and II cannot be fully explained by the presence or absence of FLT3-ITDs because these were only found in 14% of group I cases.

Table 2

Distribution of factors between hierarchical cluster-defined CBF subgroups

Correlation with outcome

Kaplan-Meier analysis identified a statistically significant difference in overall survival (OS) between the 2 subclasses (P = .03, stratified for type of CBF, log-rank test; Figure 1B), and a similar difference was observed for the consensus cluster-defined groups (P = .046, log-rank test; Figure 2B). Unsupervised hierarchical cluster analyses performed only within the t(8;21) or the inv(16) group revealed t(8;21) and inv(16) clusters corresponding to the subgroups found in the analysis of the combined CBF data set (data not shown). In accordance, the respective subgroups corresponding to group I showed a trend toward inferior outcome (P = .17 and P = .12, respectively, log-rank test; Figure 1C,D). Comparison of the hierarchical cluster-defined subgroups revealed no statistically significant difference regarding the CIR between groups I and II (P = .28, log-rank test; Figure S1A), although there were more relapses in group I (15 of 28 in group I versus 18 of 50 in group II, P = .16, Fisher exact test).
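The comparison of relapse counts between the groups (15 of 28 versus 18 of 50) is a 2 × 2 Fisher exact test, which can be sketched with the standard library alone. The two-sided rule used here (summing all tables that are no more probable than the observed one) is a common convention and an assumption about how the reported value was obtained; the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at most as probable as the observed one (with float tolerance).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Relapse versus no relapse: group I (15 of 28) and group II (18 of 50).
p_relapse = fisher_exact_two_sided(15, 13, 18, 32)
```

The result can be compared with the P = .16 reported in the text; small differences could arise from a different two-sided convention.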

To test whether the difference in OS was independent of type of CBF leukemia and presenting WBC, we performed a multivariate proportional hazards analysis. While there was a trend for the hierarchical cluster-derived group II toward better OS (hazard ratio, 0.51; CI, 0.24 to 1.07; P = .08), WBC and CBF subtype did not seem to contribute to the differences in outcome (P = .68 and P = .25, respectively). In addition, we repeated the Kaplan-Meier estimates of overall survival in the 2 hierarchical cluster-defined CBF subgroups excluding the 4 FLT3-ITD-positive cases. With the exclusion of these 4 samples, we still observed a trend toward poorer outcome in the remaining group I cases compared with group II (P = .12, log-rank test).

The number of younger adults with inv(16) AML who had received an autologous or allogeneic stem cell transplantation as postremission therapy in first complete remission (CR) was equally distributed between group I (n = 9) and II (n = 12) cases. Nevertheless, we investigated the CIR and CID in the subset of group I and II patients who had received chemotherapy as postremission therapy. Although the difference between the hierarchical cluster-defined groups was not significant, we observed a trend (P = .11 and P = .14 for CIR and CID, respectively, log-rank test; Figure S1B), which is in accordance with our results in the entire cohort.

Characterization of CBF cases by array CGH

To further investigate whether unsupervised hierarchical clustering of our CBF AML cases was driven by as yet unknown secondary aberrations, we performed array CGH experiments for a random subset of 58 samples. High-resolution screening for unbalanced genomic aberrations did not reveal a substantial number of additional recurrent aberrations but allowed us to further define with higher resolution the boundaries of a 9q deletion that had been previously identified by conventional cytogenetic banding analysis. We identified an approximately 9.3 Mb-sized deleted fragment, a del(9)(q21q21) spanning a previously described 2.4 Mb-sized commonly deleted fragment in 9q21 (Figure S2).45

CBF leukemia subgroups—biologic insights

Distinct gene-expression signatures underlying the cytogenetic CBF subgroups as well as the novel unsupervised hierarchical cluster-defined subgroups (Figure 1A) were identified using a supervised analytical approach. By using the SAM method we identified more than 1000 genes that significantly correlated with the CBF groups t(8;21) and inv(16) (false discovery rate, less than 0.0001; Table S3). These cytogenetic group-specific signatures basically reflected previous findings,19,20 because genes defining the t(8;21) signature included, for example, POU4F1, CAV1, HSPG2, and TRH, and in cases with inv(16) we found high-level expression of, for example, NT5E, PTPRM, CLIPR-59, and SPARC. In accordance, gene set enrichment analyses (GSEA) for the cytogenetic signatures showed a significant enrichment of genes associated with many GO groups mainly involved in humoral immune response, immune cell activation, as well as receptor-mediated endocytosis and phagocytosis (Table S4).

Using SAM we also identified more than 1000 genes displaying a significant (false discovery rate, less than 0.0001) differential expression among the newly defined CBF subtypes (Figure 3; Table S5). Group I was, for example, characterized by high-level expression of BRCA1, RAD51, and CHEK2, genes involved in the response to DNA damage and in DNA repair.46 In accordance, GSEA identified a significant enrichment of genes belonging to the GO category “damaged DNA binding” (Table 3). Furthermore, group I cases were associated with elevated expression of genes belonging to the GO category “nucleoside metabolism” and the pathway “pyrimidine metabolism” (Table 4). In addition, high-level expression of JUN and FOS among group I cases (Figure 3) suggested a potential role of aberrant MAPK signaling and Jun N-terminal kinase (JNK) signaling pathways, which regulate biologic processes, such as cell differentiation, proliferation, and transformation. In agreement, GSEA identified a significant association with the pathway “MAPK signaling pathway” (Table 4), and many GO categories were associated with proliferation like “cell division” and “M phase of mitotic cell cycle” (Table 3).

Figure 3

Hierarchical cluster-defined CBF subgroups. Subset of the top SAM genes (rows; ordered by SAM score) characterizing the hierarchical cluster-defined CBF subgroups. Mean-centered imputed gene-expression ratios are depicted by a log2 pseudocolor scale (indicated). The 93 CBF AML cases (columns) have been ordered according to the dendrogram of the unsupervised 2-way hierarchical cluster analysis (Figure 1). Owing to space limitations, only selected genes are indicated.

Table 3

GO categories discriminating among hierarchical cluster-defined CBF subgroups

Table 4

Pathways discriminating among hierarchical cluster-defined CBF subgroups

Group II CBF cases were characterized by a prominent gene-expression feature with the top SAM gene being the Rapamycin-insensitive companion of mTOR, RICTOR (Figure 3), encoding a component of the TOR protein complex.47 The group II-defining pattern was also characterized by members of pathways downstream of TOR such as EIF4EBP1 (eukaryotic translation initiation factor 4E [eIF4E]-binding protein 1) and PDPK1 (phosphoinositide-dependent protein kinase 1)47 as well as upstream-regulators of TOR like AKT1 (Figure 3). In addition, we observed expression of AKT1 target genes like FOXO1A; higher expression levels of antiapoptotic genes like BIRC3 and BIRC6 (Figure 3); and significantly lower expression of PTEN in group II cases compared with group I (mean of ratios, 1.318 versus 0.805, respectively; P < .001). Notably, GSEA identified a significant enrichment of genes belonging to the GO categories “ATP-dependent helicase activity,” “RNA helicase activity,” “mRNA metabolism,” “mRNA processing,” “nuclear mRNA splicing, via spliceosome,” “RNA splicing,” and “RNA polymerase II transcription mediator activity,” suggesting increased translation initiation in this CBF subgroup (Table 3). Furthermore, there was a significant association with the “TNFR1 signaling pathway” and pathways involved in apoptosis and GO categories associated with ubiquitination (Tables 3,4).

CBF leukemia subgroup prediction

Recently, it has been shown by several groups that the cytogenetic AML subgroups with t(8;21) and inv(16) can be predicted at high accuracy based on their characteristic gene-expression signatures.20,21 Similarly, in our data 98% of samples were correctly classified using PAM with a sensitivity of 100% and 95.2%, and a specificity of 95.2% and 100% for inv(16) and t(8;21), respectively.

For class prediction of the newly defined, clinically relevant CBF subgroups the PAM algorithm also provided a high cross-validation performance with a sensitivity of 97.1% and 93.1% and a specificity of 93.1% and 100% for group I and group II cases, respectively. The positive predictive value in this cohort was 89.5%. Together, these results indicate that the identified signatures might be used to accurately predict the underlying tumor subclasses.
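The relationship between the reported performance figures can be illustrated with a small confusion-matrix calculation. The counts below are not taken from the source: they are hypothetical values chosen because, assuming 35 group I and 58 group II cases, they reproduce the group I sensitivity (97.1%), specificity (93.1%), and positive predictive value (89.5%) quoted above.

```python
def metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and positive predictive value for one class."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

# Hypothetical cross-validation counts with group I as the "positive" class:
# 34 of 35 group I cases predicted as group I, 4 of 58 group II cases misassigned.
group1 = metrics(tp=34, fn=1, fp=4, tn=54)
percent = {name: round(100 * value, 1) for name, value in group1.items()}
# percent == {"sensitivity": 97.1, "specificity": 93.1, "ppv": 89.5}
```

In a 2-class problem the sensitivity for one class equals the specificity for the other, which is a useful check when reading such figures.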


In previous analyses, we had observed that samples with t(8;21) and inv(16) each separated into different subgroups based on unsupervised hierarchical clustering.19 Because the primary translocation events themselves are not sufficient for leukemogenesis,17 distinct patterns of gene expression found within each of these cytogenetic groups may suggest alternative cooperating mutations and dysregulated pathways leading to transformation. Thus, a major objective of this study was to survey the molecular variation of CBF leukemia in a large set of samples to gain new insight into the underlying biology of these cytogenetic AML groups but, more importantly, to also better understand the clinical heterogeneity we observe in CBF patients. In agreement with previous findings, using unsupervised hierarchical clustering we have discovered that CBF samples stratify into 2 robust subgroups based on distinct patterns of gene expression.

A significant correlation with WBCs and the unequal distribution of FLT3-ITD mutations support distinct biologic behaviors, and the difference in overall survival supports distinct clinical behavior between the newly defined subgroups. Because the FLT3-ITD cases accounted for only 4 of the 28 AMLs in group I with available FLT3 mutational status, and because we did not observe a correlation with secondary chromosome aberrations, fusion gene transcript levels, FLT3-TKD, KIT (exon 8 and 17), or NRAS mutations, our findings suggest that the hierarchical cluster-based inv(16)/t(8;21) subgroups reflect yet unknown prognostically relevant pathogenic mechanisms. For group I this mechanism might be similar to the one initiated by FLT3-ITD, thereby resulting in increased proliferation that is in part reflected by the elevated WBC in this group. Interestingly, the elevation in the WBC was not just the reflection of an imbalance of inv(16) cases between group I and II but was mainly attributed to higher WBC in inv(16) cases in group I compared with group II. Thus, the identification of these new subgroups suggests an improved molecular classification of AML based on gene-expression profiling.

Importantly, the genes differentially expressed between the subgroups also provide insight into distinct pathways for the molecular pathogenesis of CBF AML. For example, the newly defined CBF group I was defined by elevated expression of JUN and FOS. These genes encode proteins that can form the dimeric AP-1 transcription factor complex that is involved in several “hallmarks” of cancer like tumor cell proliferation and survival of tumor cells.48 Thus, overexpression of the proto-oncogene JUN and constitutive activation of the JNK and MAPK signaling pathway might be implicated in the leukemic transformation in this CBF subgroup. Interestingly, elevated JUN expression has previously been found in primary AML bone marrow cells of patients with a t(8;21) and inv(16).49 Similar to the leukemic effects of FLT3-ITD, this mechanism might lead to an increased proliferation. In accordance, CBF group I showed a significant correlation with GO categories associated with proliferation as well as with genes sensing DNA damage and activating DNA repair. In agreement, for example, enhanced BRCA1 expression has previously been linked to the c-Jun N-terminal kinase (JNK) pathway.50,51 BRCA1 and RAD51 play a crucial role in DNA repair.46 Cells that are defective for BRCA1 are hypersensitive to agents that produce breaks in double-stranded DNA, and it is thought that the chromosomal instability resulting from loss of BRCA1 function is a crucial feature of tumorigenesis. On the other hand, elevated BRCA1 expression has been demonstrated to lead to an increased resistance against agents or radiation producing DNA double-strand breaks.52-54 Thus, increased DNA repair mechanisms might contribute to poorer outcome in group I CBF cases due to “resistance” to DNA double-strand break-inducing agents like idarubicin or etoposide, which also were used for induction chemotherapy in our patients.

PTEN function is attenuated in many tumors through deletion, silencing, or mutation, leading to constitutive activation of AKT55 and up-regulation of TOR-dependent pathways.47 This might represent a mechanism in group II CBF cases, which are characterized by elevated expression of AKT and mTOR-signaling pathway members. While PTEN down-regulates the activity of PDPK1 and AKT, thereby slowing cell growth and inducing accumulation of cells in G1, it recently has been shown that RICTOR-mTOR may be an intriguing target in tumors with impaired expression of PTEN, a tumor suppressor opposing AKT activation.56,57 Notably, a significant enrichment of genes belonging to the GO categories “ATP-dependent helicase activity,” “RNA helicase activity,” “mRNA metabolism,” “mRNA processing,” “nuclear mRNA splicing, via spliceosome,” “RNA splicing,” and “RNA polymerase II transcription mediator activity” supports increased translation initiation in this CBF subgroup. This might be induced by the TOR protein complex, which is a central regulator of both cell growth and proliferation by regulation of translation initiation.47
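
For readers less familiar with how enrichment of a GO category among differentially expressed genes is commonly assessed, the following is a minimal sketch of a one-sided hypergeometric over-representation test. It is not the procedure used in this study, which this section does not specify; the function name `hypergeom_enrichment_p` and all counts are hypothetical and serve only as illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    population: number of genes measured on the array (hypothetical here)
    annotated:  genes in the population carrying the GO term
    selected:   differentially expressed genes drawn from the population
    overlap:    selected genes that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the hypergeometric probabilities for overlaps at least as large
    # as the one observed; comb() returns 0 for impossible draws.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only (not taken from the study).
p = hypergeom_enrichment_p(population=1000, annotated=50, selected=100, overlap=12)
print(f"one-sided enrichment p-value: {p:.4g}")
```

In practice such p-values would additionally need correction for the number of GO categories tested.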

Additional targets for molecular therapies in the newly defined CBF group II include deregulated apoptotic pathways, in part reflected by significantly higher expression levels of anti-apoptotic genes such as BIRC3 and BIRC6, 2 members of a protein family that inhibits apoptosis by binding to tumor necrosis factor receptor-associated factors.58 In agreement, GSEA identified a significant association with the “TNFR1 signaling pathway” and pathways involved in apoptosis in this CBF subgroup. Today, an increasing basic understanding of the inhibitor of apoptosis proteins (IAPs) such as BIRC3 and BIRC6, which inhibit and modulate cell division, cell cycle progression, and signal transduction pathways,59 is being translated into clinically useful applications in the treatment of malignancy. IAPs are attractive therapeutic targets because they are preferentially expressed in malignant cells, and efforts are currently under way to develop antisense and chemical IAP inhibitors.60 In the future, these agents might also be useful for the treatment of this CBF AML subgroup.

In conclusion, while the biologic impact of these signatures in leukemogenesis remains to be validated before novel treatment approaches can be implemented, our findings nevertheless support a clinically useful refined CBF leukemia classification based on gene-expression profiling. Ultimately, the refined molecular characterization of CBF subgroups might provide the means for individualized patient management, guiding an effective combination of conventional and molecular therapies.

Supplementary PDF file available online.


Contribution: L.B. and F.G.R. designed and performed research, analyzed and interpreted data, and wrote the paper; S.K., J.D., C.S., S.S., and A.C. performed research and analyzed data; C.L. analyzed and interpreted data and wrote the paper; J.K. and A.G. contributed vital reagents or analytical tools, collected data, and analyzed data; S.F. performed research, analyzed and interpreted data, and wrote the paper; R.F.S. analyzed and interpreted data and wrote the paper; K.D. designed research, analyzed and interpreted data, and wrote the paper; J.R.P. designed research, contributed analytical tools, analyzed and interpreted data, and wrote the paper; and H.D. designed research, contributed vital reagents, analyzed and interpreted data, and wrote the paper.

L.B. and F.G.R. contributed equally to this work.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Lars Bullinger, Department of Internal Medicine III, University of Ulm, Robert-Koch-Str 8, 89081 Ulm, Germany; e-mail: lars.bullinger{at}


This study was supported in part by the Deutsche Forschungsgemeinschaft (BU 1339/2-1), the Deutsche José Carreras Stiftung e.V. (DJCLS R 05/22), and the Leukemia and Lymphoma Society (6151-06).

The authors are indebted to the staff of the Stanford Functional Genomic Facility (SFGF) for providing high-quality cDNA microarrays and to the staff of the Stanford Microarray Database (SMD) group for providing outstanding database support. For excellent technical assistance we thank Martina Bonenberger, Ursula Botzenhardt, Karina Eiwen, and Marianne Habdank. We also thank all the members of the German-Austrian AML Study Group (AMLSG) for their continuous support of the treatment protocols.


  • The online version of this article contains a data supplement.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted October 2, 2006.
  • Accepted April 22, 2007.

