Clonal hematopoiesis, with and without candidate driver mutations, is common in the elderly

Florian Zink, Simon N. Stacey, Gudmundur L. Norddahl, Michael L. Frigge, Olafur T. Magnusson, Ingileif Jonsdottir, Thorgeir E. Thorgeirsson, Asgeir Sigurdsson, Sigurjon A. Gudjonsson, Julius Gudmundsson, Jon G. Jonasson, Laufey Tryggvadottir, Thorvaldur Jonsson, Agnar Helgason, Arnaldur Gylfason, Patrick Sulem, Thorunn Rafnar, Unnur Thorsteinsdottir, Daniel F. Gudbjartsson, Gisli Masson, Augustine Kong and Kari Stefansson

Key Points

  • Whole-genome sequencing of 11 262 Icelanders reveals that clonal hematopoiesis is very common in the elderly.

  • Somatic mutation of some genes is strongly associated with clonal hematopoiesis, but in most cases, no driver mutations were evident.

Publisher's Note: There is an Inside Blood Commentary on this article in this issue.


Clonal hematopoiesis (CH) arises when a substantial proportion of mature blood cells is derived from a single dominant hematopoietic stem cell lineage. Somatic mutations in candidate driver (CD) genes are thought to be responsible for at least some cases of CH. Using whole-genome sequencing of 11 262 Icelanders, we found 1403 cases of CH by using barcodes of mosaic somatic mutations in peripheral blood, whether or not they have a mutation in a CD gene. We find that CH is very common in the elderly, trending toward inevitability. We show that somatic mutations in TET2, DNMT3A, ASXL1, and PPM1D are associated with CH at high significance. However, known CD mutations were evident in only a fraction of CH cases. Nevertheless, the highly prevalent CH we detect associates with increased mortality rates, risk for hematological malignancy, smoking behavior, telomere length, Y-chromosome loss, and other phenotypic characteristics. Modeling suggests some CH cases could arise in the absence of CD mutations as a result of neutral drift acting on a small population of active hematopoietic stem cells. Finally, we find a germline deletion in intron 3 of the telomerase reverse transcriptase (TERT) gene that predisposes to CH (rs34002450; P = 7.4 × 10−12; odds ratio, 1.37).


Hematopoietic stem cells (HSC) are responsible for the generation of all mature blood cells throughout life. Clonal hematopoiesis (CH) arises when a single HSC clonal lineage contributes disproportionately to the population of mature blood cells. Early indications of this phenomenon came from observations that the ratio of maternal to paternal X-chromosome inactivation is skewed in the blood of some otherwise healthy individuals, especially among the elderly.1-4 Skewing can be seen in all hematopoietic lineages, consistent with an origin in HSCs, but is most easily seen in nucleated cells of the myeloid lineage because they are short lived and require continuous replenishment from HSCs.5,6 Age-related CH does not arise from a simple depletion of HSCs, as the abundance of HSCs in human bone marrow actually increases in older people.7

Skewed X-inactivation also occurs in myeloid neoplasias, including acute myelogenous leukemia (AML), myelodysplastic syndromes, and myeloproliferative disorders.8-11 In AML, genes encoding epigenetic regulators such as DNMT3A, ASXL1, IDH2, and TET2 tend to mutate early in the development of disease.12 Mutations of these so-called “early genes” can persist in HSCs of patients in remission, creating reservoirs of pre-leukemic clones that can engender a relapse.13-16 Early gene mutations are also detectable in mature blood cells of patients with AML and some subjects with skewed X-inactivation, but ostensibly normal hematopoiesis.13,16,17 The frequencies of the mutant alleles in these cases indicate the mutant cells must have undergone clonal expansions, despite retaining a capacity to differentiate normally. This suggests early myeloid neoplasia-associated mutations in HSCs can drive CH without an obvious phenotypic effect on mature blood cells. In this article, we refer to genes and mutations that are suspected of promoting clonal expansion in CH as candidate drivers (CDs). A provisional list of CD genes, composed primarily of genes showing the characteristics of the early genes mentioned here, has been compiled by Steensma et al.18 Whole-exome sequencing (WES) and candidate gene analyses have shown that low-variant allele fraction (VAF) CD mutations indicative of CH are associated with increased all-cause mortality rates and risks for subsequent hematological malignancy.19,20 This prompted some investigators to propose that the presence of a CD mutation with a VAF above 2% constitutes an at-risk clinical entity, and to consider how it might be managed appropriately.18

In most studies based on DNA sequencing, the detection of CH per se has been inextricably bound up with the detection of mutations in genes previously known to be involved in hematological malignancies.19-23 Until now, this has precluded formal tests of association between CD mutations and CH. In this study we use the observation that HSCs accumulate somatic mutations during their life history, most of which have no apparent effect on cellular phenotype.24 Each HSC and its clonal descendants are therefore “barcoded” with a unique spectrum of mutations. If a particular clone becomes a substantial contributor to hematopoiesis, its unique spectrum of mutations should be evident in the sequence of peripheral blood DNA. The proportion of mature blood cells derived from a particular barcoded HSC clone will (for heterozygous mutant loci) be equivalent to about twice the VAF. Using this method to detect CH does not rely on the detection of a CD mutation, but it does require an extensive amount of sequence data. Genovese et al used a method like this on whole-exome sequence data to show that CH can occur in elderly people without CD mutations being detected.19 Similarly, Holstege et al used this method to detect extreme CH without CD mutations in a 115-year-old woman.25 We have sequenced the whole genomes of a substantial number of Icelanders.26 Here we use the whole-genome sequence (WGS) data to search for CH events by counting the number of low-VAF mutations present in peripheral blood. We find that CH is far more common in the elderly than has previously been demonstrated. In the majority of CH cases, no CD mutation was evident.
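The relation between VAF and clone size stated above can be written out as a short worked equation. This is a sketch that assumes a diploid locus with no copy-number change and a mutation carried on one of the two alleles in every cell of the clone; the symbol f for the fraction of sequenced cells belonging to the clone is introduced here for illustration only.

```latex
\mathrm{VAF} \approx \frac{f \cdot 1}{2}
\quad\Longrightarrow\quad
f \approx 2 \times \mathrm{VAF}
```

Under these assumptions, a barcode mutation observed at a VAF of 0.15 would correspond to a clone contributing about 30% of the sequenced blood cells.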


Full details of the methods used are given in the supplemental Data, available on the Blood Web site.

Subject recruitment and phenotyping

The study is based on WGS data from whole blood samples from 11 262 Icelanders participating in various disease projects at deCODE genetics. The study was authorized by the Icelandic National Bioethics Committee and the Data Protection Authority. All individuals gave informed consent. Patients were excluded if they had a diagnosis of hematological malignancy in the Icelandic Cancer Registry before or within 6 months after blood draw.


DNA samples were isolated from whole blood and prepared for sequencing using TruSeq Nano or TruSeq PCR-free library kits (Illumina). Libraries were sequenced on Illumina GAIIx, HiSeq2000/2500 or HiSeq X instruments. Single nucleotide polymorphisms (SNPs) and indels were called using GATK and GenotypeGVCFs27 and then annotated using Ensembl Variant Effect Predictor.28

Identification of mosaic somatic mutations

To differentiate mosaic somatic mutations from germline ones, we first restricted analysis to mutations that occurred only once in individuals of proven Icelandic ancestry. Because of the large size of our sample and the population structure of Iceland, germline variants are most likely to be observed more than once in our sample. We then imposed VAF restrictions to identify mosaic somatic mutations. Germline mutations can by chance have an observed VAF considerably less than 0.5. To control the false-discovery rate stemming from germline mutations to less than 1%, we considered only singleton mutations with a VAF less than 0.2 as somatic and mosaic. The lower frequency limit of detected somatic mutations was 0.1, as determined by the sensitivity of the variant caller. At least 3 independent reads containing a variant allele were required to call a somatic mutation. The spectrum of mosaic somatic mutations called was similar to what has been observed previously in AML24,29 (supplemental Figure 1).
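The filtering rules in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the `Variant` record and its field names are hypothetical, and treating the lower VAF limit as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record for one variant call in one individual."""
    carriers: int     # number of sequenced individuals carrying the variant
    alt_reads: int    # independent reads supporting the variant allele
    total_reads: int  # total reads covering the site


def vaf(v: Variant) -> float:
    """Variant allele fraction: variant-supporting reads over total reads."""
    return v.alt_reads / v.total_reads


def is_mosaic_somatic(v: Variant, vaf_min: float = 0.1,
                      vaf_max: float = 0.2, min_alt_reads: int = 3) -> bool:
    """Apply the singleton, read-support, and VAF-window rules from the text."""
    if v.carriers != 1:              # must be a singleton in the sample
        return False
    if v.alt_reads < min_alt_reads:  # at least 3 independent variant reads
        return False
    # Assumption: lower bound inclusive, upper bound strict ("less than 0.2").
    return vaf_min <= vaf(v) < vaf_max
```

A call supported by 5 of 35 reads in a single individual passes these rules, whereas a variant near VAF 0.5, or one seen in two individuals, is treated as germline and excluded.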

Definition of clonal hematopoiesis cases

We counted the number of mosaic somatic mutations in each subject. The 99.5% quantile of the distribution of counts in subjects younger than 35 years was 20 mutations. Subjects with more than 20 mutations were defined as WGS-outliers. We took outlier status as evidence that the subject had CH.
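The outlier definition above can be sketched as follows. The quantile estimator is an assumption (the text does not specify one), and the function names and the `(age, mutation_count)` input format are hypothetical.

```python
import math


def quantile_cutoff(counts, q=0.995):
    """Smallest observed count c such that at least a fraction q of counts are <= c.

    Assumption: a simple order-statistic estimator; other quantile
    definitions could give a slightly different cutoff.
    """
    ordered = sorted(counts)
    idx = math.ceil(q * len(ordered)) - 1
    return ordered[max(idx, 0)]


def call_wgs_outliers(subjects, young_age=35, q=0.995):
    """subjects: iterable of (age, mutation_count) pairs.

    Returns the cutoff derived from subjects younger than `young_age`
    and one outlier flag per subject (count strictly above the cutoff).
    """
    subjects = list(subjects)
    young_counts = [n for age, n in subjects if age < young_age]
    cutoff = quantile_cutoff(young_counts, q)
    return cutoff, [n > cutoff for _, n in subjects]
```

In the study the corresponding cutoff was 20 mutations, so any subject with more than 20 mosaic somatic mutations was classified as a WGS-outlier.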

Criteria for detection of CD mutations

For 18 CD genes on a list from Steensma et al,18 we included all high-impact mutations and any missense mutation that had been reported in the Catalogue of Somatic Mutations in Cancer (COSMIC) for hematopoietic or lymphoid tissue. For the less stringent “COSMIC5” criterion, we extended this list to include any high- or moderate-impact mutation (in any gene) that had been reported 5 times or more in COSMIC for hematopoietic or lymphoid tissue. Germline mutations were excluded, but somatic mutations were unconstrained by their VAF and were allowed to occur in more than 1 individual. For genome-wide burden testing of RefSeq genes, similar criteria were used for identification of somatic mutations. Mutations were graded by Variant Effect Predictor impact and then binned by gene for association testing against WGS-outlier status.

Panel resequencing

We used an Illumina TruSight Myeloid panel for resequencing 54 genes known to be mutated in myeloid neoplasia (to >5000× on Illumina MiSeq instruments). This was supplemented with a custom assay for the last 2 exons of PPM1D, sequenced to more than 600×. We called somatic mosaic variants down to a VAF of 0.01. Modeling revealed that the probability for a given CD mutation to explain an observed WGS-barcode clone drops precipitously when the VAF of the CD mutation falls below 0.05. At a VAF of 0.01, the probability that a causal CD mutation could generate a clone with a detectable WGS-outlier barcode is only 2.1 × 10−12. We note in passing that under this model, none of the low-VAF CD mutations reported by Young et al23 could have generated a case that meets our criteria for CH.

WGS-based genome-wide association for inherited variants

Long-range haplotype phasing and genotype imputation was performed as described previously.30-32 This permitted us to reliably test for association with germline variants down to a minor allele frequency of about 0.05%.

Telomere length assay

Estimates of telomere length were obtained from WGS data using TelSeq software.33

Statistical analysis

Analyses were performed using R packages, including packages survival and ggplot2. For modeling of clonal hematopoiesis, the model of Dingli et al34 was adapted to our data. Details of all analyses are presented in the supplemental Data.


Clonal hematopoiesis is very common in the elderly

We generated WGS from 11 262 people to an average sequencing depth of 35.6 (median, 34.8; range, 20.4-119.2). We identified 3 300 768 singleton SNPs, of which 146 389 were classified as mosaic somatic mutations. The Icelandic genealogy was used to ensure none of the mosaic somatic mutations were transmitted in the germline (supplemental Data; supplemental Figure 2). The mean VAF of the mutations was 0.17, with a range of 0.11 to 0.20. The upper boundary was set to keep the false-discovery rate arising from constitutional (germline) mutations below 1%, whereas the lower boundary was determined by the sensitivity of the variant caller. As shown in Figure 1A, younger subjects showed a distribution with a median of around 3 mosaic somatic mutations. Above an age of 35 to 45 years, the number of people with a high count of mosaic somatic mutations climbed rapidly. We applied a cutoff of more than 20 mosaic somatic mutations (corresponding to the 99.5% quantile of the distribution for ages younger than 35 years) to classify individuals as WGS-outliers (supplemental Table 2). We took WGS-outlier status as evidence that the person had CH. By this criterion, 1403 of 11 262 participants had CH, a prevalence of 12.5% over all age groups. The frequency of WGS-outliers increased from 0.5% (by definition) in subjects younger than 35 years to more than 50% for subjects older than 85 years (Figure 1A-B). This peak prevalence is more than twice that of previous estimates made using WES data.19,20 Hence, it appears that CH is far more common among the elderly than has previously been shown. This confirms a prediction by McKerrell et al that CH would prove to be an almost inevitable consequence of advanced aging.21

Figure 1.

Age distribution of clonal hematopoiesis detected by WGS-outlier status. (A) Histograms showing the number of mosaic somatic mutations per person stratified by their age at blood sampling (adjusted as described in the supplemental Data). The vertical line shows the cutoff at 20 mosaic somatic mutations (corresponding to the 99.5% quantile of the distribution for ages younger than 35 years) that was used to classify individuals as WGS-outliers. (B) Prevalence of clonal hematopoiesis and CD mutations stratified by age class. Lavender bar, the fraction of samples classified as WGS-outliers; red bar, the fraction of samples with detected CD mutations from the 18-gene candidate list18; green bar, the fraction of samples detected as outliers, using exon-restricted analysis; blue bar, combined fraction of samples detected with CD mutation or exon-restricted analysis. Error bars indicate 95% confidence intervals.

To align our analysis with what might be detected by WES, we restricted the scope to mosaic somatic mutations that occurred in exons. Only 176 subjects were then detected as outliers (supplemental Data), 174 of whom had already been identified as WGS-outliers. The exome-restricted detection rate corresponded to 12.4% of the 1403 CH cases detected as WGS-outliers (Figure 1B). These 174 people had a 2.65-fold higher number of mosaic somatic mutations, as measured by WGS (mean number, 138 vs 52; P = 2.8 × 10−81), and were on average 7.8 years older than other WGS-outliers (P = 1.2 × 10−7). The age-specific prevalence of outliers detected by exon-restricted analysis was similar to published estimates using WES (Figure 1B).19,20 Thus, the WGS-outlier method offers much greater sensitivity for detection of CH than exome-restricted methods.

Association of driver mutations with clonal hematopoiesis

We investigated what proportion of our CH cases carries detectable mosaic somatic mutations in CD genes. As candidates, we used a list of 18 CD genes that have previously been seen mutated in myeloid neoplasia.18 The list also includes PPM1D, which was never implicated in hematological malignancy, but shows somatic mutation in blood in association with breast cancer, ovarian cancer, and prior chemotherapy.35-38 For detection of CD mutations, we used a different strategy from the one used to detect barcode mutations. We permitted somatic mutations that were present in more than 1 individual and at a wider range of VAFs (supplemental Data). We found 286 CD mutations in 16 of the genes in 246 of 11 262 people (supplemental Table 2). For 196 individuals, their CD mutation occurred in DNMT3A (n = 93), TET2 (n = 76), ASXL1 (n = 25), or PPM1D (n = 18). The probability of detection of a CD mutation was strongly dependent on age (Figure 1B). Subjects with CD mutations were on average 18 years older than those without (P = 2 × 10−46). The age-specific prevalence of CD mutations was in line with what has been observed previously using WES.19,20 However, mutations in the 18 CD genes were detected in only a small proportion of CH cases (12.6%, 177/1403; Figure 1B). Employing a less stringent criterion to define CD mutations (“COSMIC5”; supplemental Data) explained a further 9 CH cases (supplemental Table 2). Analysis of sample subsets for copy number variations (CNV) resulting in deletions of CD genes, recurrent AML fusion genes, or for mutations detected by high-depth WGS did not yield substantial numbers of additionally explained CH cases (supplemental Data; supplemental Figure 3; supplemental Table 4). We then employed deep resequencing of a panel of amplicons from 54 genes frequently mutated in myeloid malignancies, as well as PPM1D, on 76 CH cases selected by stratified random sampling. The method allowed us to detect mutations reliably down to a VAF of 0.01 (ie, approximately 10-fold lower than the lower limit of WGS-outlier barcode mutations). We found CD mutations in 30 (39.5%) of the 76 cases who were resequenced (supplemental Table 3). Adding the estimated 1.2% of cases who have a large deletion of a CD gene (supplemental Data; supplemental Figure 3), this corresponds to 40.7% of WGS-outliers who might plausibly be accounted for by a CD mutation detectable by our methods. This is likely to be an overestimate, however, because some of the observed CD mutations may not have a sufficiently high VAF to account for the clone generating the associated WGS-outlier barcode (supplemental Data). In any case, the majority of our CH cases remain in need of a satisfactory mechanistic explanation.
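The percentages in this paragraph follow directly from the reported counts, as the short check below shows. It uses only numbers stated in the text; the helper name `percent` and the variable names are illustrative.

```python
def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# CH cases with a CD mutation detected by WGS in the 18-gene list
wgs_detected = percent(177, 1403)

# Resequenced CH cases with a CD mutation found by the deep panel
resequenced = percent(30, 76)

# Estimated share of cases with a large deletion of a CD gene (from the text)
large_deletion = 1.2

# Upper estimate of WGS-outliers plausibly explained by a detectable CD mutation
upper_estimate = round(resequenced + large_deletion, 1)

print(wgs_detected, resequenced, upper_estimate)
```

The sum combines a fraction measured in the resequenced subsample with a separately estimated deletion rate, which is one reason the text describes the combined figure as a likely overestimate.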

Overall, 177 (72%) of the 246 subjects with WGS-detected CD mutations were also WGS-outliers. For younger individuals in particular, CD mutations sometimes occurred without WGS-outlier status being detected (Figure 2A; supplemental Tables 2 and 3). We speculated that some younger people who have CH with CD mutations might not have been detected as WGS-outliers because they had not accumulated enough mosaic somatic mutations to qualify. Indeed, the fraction of subjects with a CD mutation who were also WGS-outliers increased significantly with age (P = 1.9 × 10−12; Figure 2B). DNMT3A, TET2, and ASXL1 were the genes most commonly mutated in both WGS-outliers and nonoutliers (Figure 2C). So, despite its high positivity rate in older people, the WGS-outlier method is still likely to underreport CH among the young.

Figure 2.

Presence of candidate driver (CD) mutations by age. (A) VAF vs age at blood draw for the 16 CD genes where mutations were detected. The 177 subjects who were classified as WGS-outliers are plotted as blue points, and the 69 subjects who were not outliers are plotted as red points. (B) Conditional probability of being identified as a WGS-outlier given that a CD mutation was detected, stratified by age bins. P = 1.9 × 10−12; β = 0.10, assessed by logistic regression. Error bars indicate 95% confidence intervals. (C) Co-mutation plot for WGS-outliers and nonoutliers in whom CD mutations were detected. Each column represents a subject, each row a candidate pre-leukemic driver gene. Cells are shaded if a mutation was detected, and the color of the shading indicates the number of mutations detected for the particular gene. The vertical black line separates non–WGS-outlier from WGS-outlier subjects.

Unlike previous studies, our WGS-outlier method defines CH cases irrespective of whether or not they have a mutation in a CD gene. Therefore, we could carry out unbiased tests of whether the presence of a mutant CD gene is associated with CH status. As shown in Table 1, 11 of the 18 genes nominated by Steensma et al18 demonstrated an association with CH at a significance of P < .05 in burden tests. The less stringent COSMIC5 criterion added some plausible candidate genes for which mutations were found in WGS-outliers: ATM, IDH2, KIT, MPL, MYD88, and NRAS. Of these, IDH2 and MYD88 reached P < .05 significance.

Table 1.

Association of mutant candidate pre-leukemic driver genes with clonal hematopoiesis defined by WGS-outlier status

To extend the search for mutant genes associated with CH to a genome-wide scale, we carried out burden tests of 17 933 RefSeq genes with high- or moderate-impact mosaic somatic mutations for association with WGS-outlier status. After Bonferroni correction, TET2, DNMT3A, ASXL1, and PPM1D demonstrated highly significant associations with CH in burden tests based on high-impact mutations only, or both high- and moderate-impact mutations (Table 2; supplemental Table 5). DNMT3A and TET2 were also significant when moderate mutations alone were considered. These associations directly implicate those genes as drivers of CH. Other genes reached suggestive levels of significance, notably moderate-impact mutations in ATM (P = 3.9 × 10−6), SRSF2 (P = 3.0 × 10−5), and MTA2 (P = 9.6 × 10−5). The latter gene, Metastasis-Associated Protein 2, encodes a component of the nucleosome remodeling and histone deacetylation (NuRD) complex. It is widely expressed, including in leukocytes and bone marrow, but has no known involvement in myeloid disease. Further investigation of MTA2 is warranted.

Table 2.

Genome-wide burden test for association of mutant genes with clonal hematopoiesis

CH is associated with higher death rates and increased risks for hematological malignancy

Subjects with CH defined by WGS-outlier status had significantly higher rates of all-cause mortality (hazard ratio [HR], 1.18; P = 2.7 × 10−4; Figure 3A). WGS-outliers with or without detected CD mutations had similar risks. Subjects who had CD mutations were also at increased risk, irrespective of WGS-outlier status (HR, 1.30; P = .0042). To set the increased mortality rates in perspective, we found that smoking (ever) had an HR of 1.19 (P = 2.6 × 10−5). Therefore, the effect of clonal hematopoiesis on mortality rate is similar to that of ever having smoked.

Figure 3.

Survival analysis using Cox proportional hazard model. Baseline was defined as subjects who were neither WGS-outliers nor carriers of a mosaic somatic CD mutation. Plots show HRs with 95% confidence intervals. (A) HRs for all-cause mortality adjusted for age at blood draw, year of birth, sex, previous diagnoses of cancer, and smoking. (B) HRs for subsequent hematological malignancy adjusted for age at blood draw and year of birth. Details of the subjects who developed hematological malignancies are shown in supplemental Table 6.

We also assessed the risk for a hematological malignancy arising 6 months or more after sampling (Figure 3B). CH defined by WGS-outlier status substantially increased the risk for a subsequent hematological malignancy (HR, 2.43; P = 9.0 × 10−5). Again, WGS-outliers with or without detected CD mutations had similar risks, and all subjects with CD mutations were at increased risk. We noted that the risk increased with count of mosaic somatic mutations to an HR of 42.2 (P = 1.3 × 10−9) in subjects with more than 250 mutations. This might be indicative of disease risk associated with the age (in divisions) of the HSC clone when it began to expand, its rate of expansion, its degree of predominance in supporting hematopoiesis (as the number of detected mutations is correlated with VAF), or the probability that a leukemic driver gene had been hit by a mutation. Alternatively, it might indicate that individuals with very high mutation counts have an undiagnosed, non-HSC hematological malignancy.

Association of CH with other phenotypes

We tested for association between the WGS-outlier status and 1482 case–control phenotypes from the deCODE database (supplemental Table 7). After consideration of a Bonferroni correction threshold (P < 3.4 × 10−5), significant associations were found with smoking (P = 6.0 × 10−13), treatment of addiction (P = 4.2 × 10−8), psychiatric disease (P = 9.5 × 10−6), smoking-related diseases (P = 1.1 × 10−5), and chronic pulmonary disease (P = 1.4 × 10−5). In analyses limited to those for whom smoking information was available, addiction and psychiatric disease remained significant (P < .05) after adjustment for smoking. However, these traits do have known correlations with smoking behavior, including metrics of smoking quantity, which were not taken into account. Smoking was also associated with the number of mosaic somatic mutations detected, irrespective of WGS-outlier status (P = 1.3 × 10−11). No cancer phenotypes reached Bonferroni corrected significance, the lowest P value being for lung adenocarcinoma (P = .001). Testing association with 4078 quantitative traits revealed significant increases in counts of white blood cells (P = 5.0 × 10−11), monocytes (P = 2.2 × 10−10), platelets (P = 1.2 × 10−9), lymphocytes (P = 6.4 × 10−8), granulocytes (P = 6.8 × 10−6), and an increase in total platelet volume (P = 7.8 × 10−10; supplemental Table 8).
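The Bonferroni threshold quoted above is consistent with dividing a family-wise significance level by the number of phenotypes tested. The sketch below assumes a level of 0.05, which the text does not state explicitly; the P values are those reported in the paragraph, and the dictionary and function names are illustrative.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# P values reported in the text for case-control phenotypes
reported = {
    "smoking": 6.0e-13,
    "treatment of addiction": 4.2e-8,
    "psychiatric disease": 9.5e-6,
    "smoking-related diseases": 1.1e-5,
    "chronic pulmonary disease": 1.4e-5,
    "lung adenocarcinoma": 1.0e-3,
}

threshold = bonferroni_threshold(1482)  # about 3.4e-5
significant = [name for name, p in reported.items() if p < threshold]
print(f"{threshold:.2e}", significant)
```

With 1482 phenotypes the threshold is about 3.4 × 10−5, so the five non-cancer associations pass while lung adenocarcinoma does not.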

Mosaic loss of the Y chromosome is reportedly similar to CH in its age and phenotypic associations.39-41 Mosaic loss-of-Y showed a highly significant association with CH in our WGS data (P = 5.02 × 10−110) and had a similar age distribution (supplemental Data; supplemental Figure 4). This suggests that mosaic loss-of-Y and CH are related phenomena.

Modeling clonal hematopoiesis caused by neutral drift

The lack of identified CD mutations in most of the WGS-outliers led us to question whether such mutations are essential for CH to arise. Modeling studies show that in a stem cell pool of constrained size, and in the absence of any clonal selective advantage, a large fraction of the stem cells will eventually derive from a single clone as a result of neutral drift.42 The question is, therefore, not whether CH will manifest itself, but over what timescale in relation to the human life span. To examine whether neutral drift could possibly account for some of our CH cases, we considered a simple model of stem cell dynamics34 adapted to our data (supplemental Data). Simulations with a plausible set of parameters indicated that clonal expansion could occur so rapidly by neutral drift that a substantial proportion of the WGS-outliers without known CD mutations would be explained (Figure 4). Neutral drift should therefore be considered as one possible mechanism underlying CH.

Figure 4.

Computer simulation of clonal hematopoiesis arising under neutral drift. The graph shows the proportion of simulations producing more than 20 observable mosaic somatic mutations with a VAF less than 0.2 as a function of subject age, for different choices of N, the size of the active HSC compartment. The value of p, the probability that an HSC division will produce 2 daughter stem cells, was set at 0.25.51 Other parameters were fixed at λ = 1 division per 40 weeks,43 mutation rate µ = 6.4 × 10−10 per base pair per division.52

We noted that the simulations were very sensitive to N, the assumed size of the active HSC pool (Figure 4). Estimates for the size of this pool vary widely.43-45 The model was also sensitive to the value of p, the probability that a given cell division will produce 2 daughter stem cells. Although a high p_i can give a particular HSC clone i a competitive advantage, a high p value could apply equally to all members of the HSC pool and still promote rapid development of CH (supplemental Figure 5). In other words, a person with an endogenously high p could be predisposed to develop CH early through neutral drift.
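The qualitative effect of neutral drift in a small, fixed-size stem cell pool can be illustrated with a generic Moran-type simulation. This is not the adapted Dingli et al model used in the study: it ignores mutation accumulation and the barcode detection threshold, and it does not map events onto the parameters λ and p. Each event here stands for one symmetric self-renewing division coupled to the loss of another cell, with no clone having any selective advantage. The function and parameter names are hypothetical.

```python
import random
from collections import Counter


def simulate_neutral_drift(n_cells, n_events, seed=0):
    """Return the fraction of the pool held by the largest clone.

    Every cell starts as its own clone. At each event one randomly chosen
    cell duplicates and one randomly chosen cell is replaced, so the pool
    size stays constant and all clones are equally fit (neutral drift).
    """
    rng = random.Random(seed)
    pool = list(range(n_cells))  # clone label carried by each cell
    for _ in range(n_events):
        parent = rng.randrange(n_cells)  # cell that self-renews symmetrically
        lost = rng.randrange(n_cells)    # cell that leaves the pool
        pool[lost] = pool[parent]
    largest = Counter(pool).most_common(1)[0][1]
    return largest / n_cells


# Smaller pools are taken over by a single clone after far fewer events.
for n in (10, 100, 1000):
    print(n, simulate_neutral_drift(n, n_events=5000))
```

The sketch is meant only to show why the outcome depends so strongly on the pool size: with few active stem cells, one lineage comes to dominate quickly even without a driver mutation.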

Association of germline TERT variants with CH

To examine further the concept of constitutive predisposition to CH, we carried out a WGS-based genome-wide association study for germline SNPs and small indels,26 using WGS-outlier status as the query phenotype. The strongest association came from an 8-bp deletion in intron 3 of the telomerase reverse transcriptase (TERT) gene (rs34002450, g.1280826_1280833delAGCCCACC; P = 7.4 × 10−12; odds ratio [OR], 1.37; allele frequency, 40.6% in Iceland; Figure 5). Previously, we reported an association between myeloproliferative neoplasms and a common variant in TERT (rs2736100, r2 = 0.28 vs rs34002450).46 Conditional analysis showed that the CH association mapped preferentially to rs34002450 (Padj = 3.6 × 10−8), whereas the myeloproliferative neoplasms association mapped preferentially to rs2736100 (Punadj = 2.2 × 10−8, Padj = 6.4 × 10−4). Rare coding mutations in TERT have been implicated in marrow failure and HSC dysfunction.47,48 In our study, no high- or moderate-impact germline variants (with CH-association P values of <.05 and <2.5 × 10−3, respectively) were seen in TERT or any of the 18 CD genes.

Figure 5.

Genome-wide association for germline variants associated with clonal hematopoiesis detected by WGS-outlier status. (A) Manhattan plot of association [expressed as –log10(P)] with WGS-outlier status, determined using logistic regression. (B) Locus zoom of the signal in the TERT gene on chromosome 5. The location of the 8-bp indel rs34002450 (chr5:1280825) giving the strongest signal is indicated by a purple diamond. Other variants are plotted in colors corresponding to their r2 values relative to rs34002450, as indicated in the legend. Recombination rates, in cM/Mb and based on Icelandic data, are plotted as a red line. The lower panel shows the locations of RefSeq genes and the chromosomal position (GRCh38/hg38).

The TERT association suggested telomerase activity might have a role in CH. To explore this further, we estimated telomere lengths for 2703 samples based on the occurrence of the telomeric TTAGGG motif in WGS reads (supplemental Data). Telomere length estimates were significantly lower in people with CH defined by WGS-outlier status (β, −0.19 standard deviations; P = 1.0 × 10−3). The presence of CD mutations had no effect in addition to WGS-outlier status. Moreover, we could detect no effect of the rs34002450 deletion on telomere length (P = .22, linear regression adjusted for sex, age at blood draw, chronic obstructive pulmonary disease, cancer diagnoses, and smoking). Both telomere length and rs34002450 genotype were significant independent predictors of WGS-outlier status in a multivariate regression (P = .008 and .003, respectively, adjusted for age at blood draw and smoking). The involvement of TERT and telomere length with CH merits further investigation. Moreover, studies that ascribe phenotypic effects to variations in telomere length should consider a possible role of CH.
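The idea of estimating telomere content from the occurrence of the TTAGGG motif in sequencing reads can be sketched as below. This is a simplified illustration rather than the TelSeq algorithm: the threshold of 7 repeats per read is an illustrative choice, the function names are hypothetical, and the sketch omits the normalization a real tool needs to turn read counts into a length estimate.

```python
MOTIF = "TTAGGG"
REVERSE_COMPLEMENT = "CCCTAA"  # the motif as read from the opposite strand


def motif_repeats(read):
    """Count non-overlapping telomeric motif copies on either strand."""
    read = read.upper()
    return max(read.count(MOTIF), read.count(REVERSE_COMPLEMENT))


def is_telomeric(read, min_repeats=7):
    """Assumption: a read counts as telomeric if it has >= min_repeats copies."""
    return motif_repeats(read) >= min_repeats


def telomeric_fraction(reads, min_repeats=7):
    """Fraction of reads classified as telomeric (0.0 for an empty input)."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(is_telomeric(r, min_repeats) for r in reads) / len(reads)
```

A sample with shorter telomeres would be expected to yield a smaller fraction of telomeric reads at the same sequencing depth, which is the signal compared between WGS-outliers and other subjects.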


Using WGS data from peripheral blood of 11 262 people, we have developed a method to identify individuals with CH based on the accumulation of somatic mutations in the dominant HSC clone. Unlike previous studies of CH, this method does not depend on prior knowledge of CD genes or mutations. We detected a high prevalence of CH among the elderly, trending toward inevitability.

Holstege et al25 studied WGS of a 115-year-old woman with extreme CH and found no CD mutations (nor indeed any mutations present in the COSMIC catalog). This suggested that CH might occur in the absence of detectable CD mutations. Similarly, Genovese et al,19 using WES, did not detect CD mutations in a substantial fraction of their patients with CH. In our study, we did not find obvious CD mutations in the majority of subjects with WGS-outlier status indicative of CH. Conceivably, the lack of identified CD mutations, first, could be the result of a simple inability to detect them with the methods applied. Our sensitivity to detecting a driver mutation might be rather limited for clones with low degrees of predominance and high mutation burdens. Second, because of genetic heterogeneity, each individual driver mutation might occur at too low a frequency in the CH population to be recognized as such, or be located in a nonexonic region. Third, CH could be driven by variation in clonally inherited epigenetic states that affect the self-renewal and proliferative capacities of HSCs.49 Fourth, some CH may be a simple (or indeed, inevitable) consequence of neutral drift operating on the small, aging population of active HSCs.42 If this latter scenario is true, it invites the question of why CH without defined CD mutations still carries high hazards of hematological malignancy and all-cause mortality, as we and others19 have seen. Perhaps some instances of CH are indications the HSC compartment is in a permissive state that allows clonal expansions to occur over short timescales. The situation may be akin to melanoma, in which a high number of benign nevi is a strong risk factor even though few nevi progress to melanoma.50

The ability to identify CH cases presents opportunities for monitoring and intervention. A number of authors have recommended the development of strategies to target and eliminate clonal reservoirs of HSCs containing pre-leukemic mutations.13,15,16,18 Such strategies must take into account that CH appears to be quite common in elderly asymptomatic individuals and that absolute risks for progression to hematological malignancy are low. On the basis of the observations described here, targeting known pre-leukemic driver mutations may address only a fraction of the at-risk individuals. Moreover, some people may be reliant on surviving mutant HSC clones to support their normal hematopoiesis, and this may be associated with a general frailty. Clearly a deeper understanding of the nature and associated risks for clonal hematopoiesis would be valuable.


This work was supported, in part, by National Institutes of Health, National Institute on Drug Abuse grants R01-DA017932 and R01-DA034076.


Contribution: The study was designed and the results interpreted by F.Z., S.N.S., G.L.N., A.K., and K.S.; subject ascertainment and recruitment was carried out by S.N.S., I.J., T.E.T., J.G., J.G.J., L.T., T.J., T.R., and U.T.; sequencing and genotyping was done by F.Z., S.N.S., O.T.M., A.S., and G.M.; statistical and bioinformatics analysis was done by F.Z., S.N.S., G.L.N., M.L.F., S.A.G., A.H., A.G., P.S., D.F.G., G.M., and A.K.; the manuscript was drafted by F.Z., S.N.S., G.L.N., A.K., and K.S.; and all authors contributed to the final version of the paper.

Conflict-of-interest disclosure: All deCODE authors are employees of the biotechnology company deCODE genetics/AMGEN.

Correspondence: Augustine Kong, deCODE genetics/AMGEN, Sturlugata 8, 101 Reykjavik, Iceland; e-mail: augustine.kong{at}; and Kari Stefansson, deCODE genetics/AMGEN, Sturlugata 8, 101 Reykjavik, Iceland; e-mail: kari.stefansson{at}


  • * F.Z. and S.N.S. contributed equally to this study.

  • The online version of this article contains a data supplement.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted February 20, 2017.
  • Accepted May 1, 2017.

