Advertisement

Whole-exome sequencing to identify genetic risk variants underlying inhibitor development in severe hemophilia A patients

Marcin M. Gorski, Kevin Blighe, Luca A. Lotta, Emanuela Pappalardo, Isabella Garagiola, Ilaria Mancini, Maria Elisa Mancuso, Maria Rosaria Fasulo, Elena Santagostino and Flora Peyvandi

Key Points

  • Exome sequencing of severe hemophilia A patients with/without inhibitors identified rare, damaging variants in immunoregulatory genes.

  • Replication confirmed the association of rs3754689 in a conserved haplotype region surrounding the LCT locus with inhibitor development.

Abstract

The development of neutralizing antibodies (inhibitors) against coagulation factor VIII (FVIII) is the most problematic and costly complication of FVIII replacement therapy that affects up to 30% of previously untreated patients with severe hemophilia A. The development of inhibitors is a multifactorial complication involving environmental and genetic factors. Among the latter, F8 gene mutations, ethnicity, family history of inhibitors, and polymorphisms affecting genes involved in the immune response have been previously investigated. To identify novel genetic elements underling the risk of inhibitor development in patients with severe hemophilia A, we applied whole-exome sequencing (WES) and data analysis in a selected group of 26 Italian patients with (n = 17) and without (n = 9) inhibitors. WES revealed several rare, damaging variants in immunoregulatory genes as novel candidate mutations. A case-control association analysis using Cochran-Armitage and Fisher’s exact statistical tests identified 1364 statistically significant variants. Hierarchical clustering of these genetic variants showed 2 distinct patterns of homozygous variants with a protective or harmful role in inhibitor development. When looking solely at coding variants, a total of 28 nonsynonymous variants were identified and replicated in 53 inhibitor-positive and 174 inhibitor-negative Italian severe hemophilia A patients using a TaqMan genotyping assay. The genotyping results revealed 10 variants showing estimated odds ratios in the same direction as in the discovery phase and confirmed the association of the rs3754689 missense variant (OR 0.58; 95% CI 0.36-0.94; P = .028) in a highly conserved haplotype region surrounding the LCT locus on chromosome 2q21 with inhibitor development.

Introduction

Hemophilia A (HA) is an inherited bleeding disorder characterized by factor VIII (FVIII) deficiency due to mutations in the F8 gene.1 HA treatment consists of replacement with FVIII concentrates and is highly effective in stopping and preventing hemorrhage. However, up to 30% of previously untreated patients with severe HA develop anti-FVIII neutralizing alloantibodies (referred to as inhibitors), which impair the efficacy of replacement therapy and render the management of bleeds difficult.2 For these reasons, inhibitor development is the most problematic and costly complication of FVIII replacement therapy. To purge the immune system of alloantibodies against FVIII, the immune tolerance induction strategy is frequently implemented. However, it is a demanding and costly treatment that involves long-term regular infusions of high doses of FVIII with a success rate of 60% to 80%.3-5 Therefore, prevention of inhibitor development is an urgent and important issue that requires a deeper insight into the etiology of this complication.

It has been postulated that inhibitor formation results from a complex interaction of environmental and genetic factors.6-8 Studies on genetic determinants of inhibitor development have indicated numerous genetic markers, such as the type of causative F8 gene mutation, single nucleotide polymorphisms (SNPs) in the HLA locus and in other immunoregulatory genes, ethnic background, and family history of inhibitors, to be involved in the etiology of this treatment complication.9 Recently, Gouw et al10 performed a systematic review and meta-analysis to associate the type of F8 gene mutation with inhibitor development in severe HA patients and found null mutations that lead to a complete lack of protein expression to be the strongest risk factors for inhibitor development. Studies on genetic determinants other than F8 mutations identified SNPs in genomic regions associated with immunoregulatory genes.9 These include the HLA class II molecules,11-14 which have an essential role in the presentation of FVIII peptides to CD4+ T-helper cells and pro- and anti-inflammatory cytokines such as interleukin (IL)-1, IL-2, IL-10, CTLA4, and tumor necrosis factor α.13,15-19 Moreover, polymorphism p.R131H in the Fc Fragment of immunoglobulin (Ig)G (FCGR2A) encoding Fc γ receptor 2A expressed on immune cells has recently been associated with inhibitor development.20 However, due to the poor reproducibility of the genotyping results, small replication cohorts, or different genetic background of the examined populations, the gene targets that cause this predisposition are still largely unknown.

To identify novel genetic risk variants underlying inhibitor development, we applied whole-exome sequencing (WES) analysis in a group of Italian patients with severe HA, followed by replication of our results in a larger cohort of severe HA patients with and without inhibitors. This manuscript reports our experience using exome sequencing to discover novel, coding variants in immunoregulatory genes responsible for the development of inhibitory anti-FVIII alloantibodies in severe HA patients.

Materials and methods

Patients

Patients were selected from a cohort of 256 severe HA patients regularly followed up between 1960 and 2010 at the Angelo Bianchi Bonomi Hemophilia and Thrombosis Center, Fondazione Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ca’ Granda, Ospedale Maggiore Policlinico (Milan, Italy). For the discovery phase involving exome sequencing, 26 Italian patients with severe HA were chosen, 17 with and 9 without a history of inhibitors. To reduce heterogeneity due to the F8 gene mutations, all patients chosen for the discovery phase carried intronic inversion mutations (n = 24 intron 22 inversions and n = 2 intron 1 inversions). Four brother pairs, including 2 discordant (E835-E1309 and E172-E1355) and 2 concordant E287-E1147 (inhibitor negative) and E1361-E502 (inhibitor positive) for inhibitor development, were included in the discovery phase. Clinical features of 26 Italian patients with severe HA are presented in supplemental Table 1, available on the Blood Web site. WES was performed at the Human Genome Sequencing Center (HGSC), Baylor College of Medicine (Houston, TX).

The replication study was conducted in 227 severe HA patients with (n = 53) and without (n = 174) inhibitor history. Patients with different types of the F8 gene mutations, such as intron 22 and 1 inversion mutations, missense, large deletions, frameshifts, and STOP gain, were chosen for replication. Nearly all patients (97.4%) were of Italian origin. Genotyping was performed in our center using a TaqMan SNP genotyping assay. A positive history of inhibitor development was defined as a positive Bethesda assay >0.5 Bethesda units (BU)/mL in 2 independent measurements. High titer inhibitor patients, henceforth called high responders (HR), were defined as those with inhibitor peak titers >5 BU/mL. All patients signed informed consent, and the study was approved by the Ethics Committee of the Fondazione IRCCS Ca’ Granda and was carried out in accordance with the Declaration of Helsinki.

Sequencing and data analysis

DNA isolation and preparation methods have been described in detail elsewhere.21 Individual exomes were enriched by the SeqCap EZ HGSC VCRome kit (Roche NimbleGen), a clinical research exome oligonucleotide panel designed to capture 189 028 nonoverlapping exons in 23 585 human genes, producing a total target size of 45.1 Mb. Sequencing of paired-end fragments was conducted on an Illumina HiSeq 2000 sequencing platform (Illumina, San Diego, CA) at the HGSC.

Data analysis on raw reads to produce individual variant call format (VCF)22 files was performed at HGSC. Raw reads were aligned to the reference human genome (hg19/GRCh37) and variants were called using the HGSC proprietary software 'Atlas2' (https://www.hgsc.bcm.edu/software/atlas-2). After variant calling, individual VCF files were merged using VCFtools,22 and variants not meeting quality control (QC) criteria, as specified by the HGSC, were removed: single nucleotide variant (SNV) posterior probability ≥0.95, number ≥3, variant-to-read ratio ≥0.1, variant reads mapping to different strand direction, and total coverage ≥6. Annotation of all QC-filtered variants was then conducted using the Variant Effect Predictor,23 including annotation with dbSNPvs138,24 ClinVar,25 Sorting Intolerant From Tolerant (SIFT),26 and Polyphen-2.27 We also obtained the minor allele frequencies (MAFs) from the Exome Variant Server (EVS; http://evs.gs.washington.edu/EVS/) and the 1000 Genomes Project phase 3 populations.28,29 This merged, QC-filtered, and annotated VCF containing all variants called in 26 Italian patients with severe HA became the investigation dataset used in 3 different types of analysis: case-control association analysis, candidate approach to identify rare, damaging variants in immunoregulatory genes, and discordant brother-pair analysis (Figure 1).

Figure 1

Data analysis and filtering steps of genetic variants identified in WES of 26 severe hemophilia A patients. The analysis dataset was obtained by merging all annotated variants called in 26 Italian severe HA patients into a single dataset that was subsequently used in 3 different types of analysis: case-control association analysis, candidate approach to identify rare, damaging variants in immunoregulatory genes, and discordant brother-pair analysis.

A case-control association analysis was conducted between inhibitor-positive and inhibitor-negative patients using SnpSift case-control,30 taking into account 3 different genetic testing models: trend model (Cochran-Armitage test), a dominant model, and a comparison of raw allelic counts (both Fisher’s exact tests). Quantile-quantile (Q-Q) plots and P value distribution histograms were generated using custom scripts in R (R release 3.1.1; R Foundation for Statistical Computing, Vienna, Austria)31 on each derived set of P values. To obtain variants with putative functional impact, we only focused on the following variant types, henceforth referred to as damaging: STOP gains/losses, coding region insertions/deletions (indels), missense (not restricted to SIFT and Polyphen-2 predictions), and variants located in splice acceptor/donor sites.

Relatedness between individuals based on all variants that passed QC was estimated according to the Ajk statistics,32 with subsequent values plotted using custom R31 scripts. Unsupervised hierarchical clustering with heatmaps was performed by encoding variants as 0 (reference sequence, black), 1 (heterozygous variants, green), and 2 (homozygous variants, red) and then using R.31

Because WES was performed on Italian HA patients, we decided to use the MAFs derived from the Italian Tuscany (TSI) population from the 1000 Genomes Project, phase 3,28,29 as normal healthy Italian controls. Where the TSI MAF was not available, those from the HapMap phase 3 project33 population of Utah residents with Northern and Western European ancestry or the CSAgilent population of European descent from the ClinSeq project34 was used. To assess the association of each variant with inhibitor development, odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using R.31 The post hoc replication population’s power to confirm the associations found in the discovery phase was calculated using G*Power software, version 3.0.10,35 based on frequency and effect size of the identified variants, the replication sample size, and assuming a 0.05 2-tailed α error.

The candidate approach to identify rare, damaging variants in immunoregulatory genes was based on filtering all variants on MAF (both EVS and 1000 Genomes) <1%, followed by the identification of damaging variants predicted by both SIFT and Polyphen-2 as deleterious. The discordant brother-pair analysis is described in detail in supplemental Figure 1.

Replication and data analysis

A cohort of 227 severe HA patients with (n = 53) or without (n = 174) inhibitor history was subjected to TaqMan SNP genotyping replication (Thermo Fisher Scientific, Carlsbad, CA) using a StepOnePlus Real-Time Polymerase Chain Reaction system (Applied Biosystems) as described in the manufacturer’s instructions. Variants that showed a direction of association consistent with the discovery phase were further analyzed using logistic regression, adjusting for the relatedness among first-degree patients. The P values were 2 sided, and P < .05 was considered statistically significant. Multiple testing was calculated by Benjamini and Hochberg’s false discovery rate.36 To assess the susceptibility of all replicated variants for inhibitor development, ORs and 95% CIs were calculated. Data analysis was performed using Plink (v1.0737) and R.31

Results

We performed WES and data analysis that revealed a total of 5 486 286 variants (mean, 211 011) called across all samples, including 4 538 739 SNVs and 947 547 insertions/deletions (indels). After filtering out variants that did not pass QC in ≥1 patient, 1 810 198 variants (mean, 69 623) were identified in 235 915 unique genomic locations. These included 223 431 SNVs and 12 484 indels that were sequenced with an average read depth over each site of 59 (supplemental Table 2A). Filtering of these variants for novelty, ie, those that were not listed in dbSNPvs138 or any other variant database, we identified 28 510 variants, of which 25 945 were SNVs and 2565 were indels. Moreover, we identified 91 684 variants that were considered rare (MAF ≤1%), 14 619 low-frequency variants (MAF, 1-5%), and 129 612 common variants (MAF >5%) in the European-American population in the EVS and the combined European population in the 1000 Genomes Project.

Candidate approach to identify rare, damaging variants in immunoregulatory genes

To identify novel candidate mutations in immunoregulatory genes that could play a role in inhibitor development, we filtered rare variants for those with potential damaging consequences and missense variants predicted by both SIFT and Polyphen-2 as deleterious. In this way, we obtained a list of 2099 variants from which we selected 28 variants in 26 different genes that belonged to immunoglobulin and immunoglobulin-like genes, HLA loci, receptor and ligand chemokine subunits, interleukins, and other immune system modulators, as well as genes involved in DNA repair mechanisms that play a role in V(D)J recombination (Table 1). A total of 17 variants were identified as risk (found only in inhibitor-positive patients) and 11 as protective (found only in inhibitor-negative patients). Among the 28 SNVs, we found 3 novel variants: 2 missense variants in the CD83 gene and in the CD109 gene and 1 frameshift in the chemokine (C-X3-C motif) receptor 1 (CX3CR1) gene (Table 1). Moreover, we identified frameshift mutation c.304_305insGG (p.Ala102GlyfsTer28, rs201240957) in the HLA-DRB5 gene that was present as homozygous (n = 1) and heterozygous (n = 2) in inhibitor-positive patients. Another variant that was identified as homozygous (n = 1) and heterozygous (n = 2) in inhibitor-positive patients was the rs200975376 (c.134G>A, p.Arg45His) missense variant in the chemokine (C-C motif) ligand 4-like 1 (CCL4L1) gene. We further checked the presence of these mutations in 2 discordant brother-pairs present in our discovery phase, and we confirmed the rs200975376 variant in the CCL4L1 gene as a potential risk factor in the E835-E1309 brother-pair (supplemental Figure 1).

Table 1

Rare (MAF ≤1%), damaging variants identified in immunoregulatory genes by WES of 26 Italian severe HA patients

Variants that increase or reduce the risk of inhibitor development

To identify other genetic factors associated with inhibitor development, a case-control association analysis was conducted between variants identified in inhibitor-positive patients (n = 17) and those identified in inhibitor-negative patients (n = 9) using different genetic models. To assess which genetic model best suited our data, we generated Q-Q plots and P value distribution histograms for each derived set of P values, and we found that the dominant and trend models produced the most normalized P value distributions (supplemental Figure 2). We then filtered out all the variants that did not reach statistical significance for both models (P > .05) and obtained a list of 1364 genetic variants potentially involved in inhibitor development.

To further analyze these 1364 genetic variants, we performed unsupervised hierarchical clustering that revealed 2 distinct patterns of homozygous variants (Figure 2): homozygous variants present in inhibitor-positive patients, indicating an elevated risk of inhibitor development (henceforth called risk), and 2 distinct clusters of homozygous variants present mainly in inhibitor-negative patients that can potentially have a protective role in inhibitor development (henceforth called protective). As shown in Figure 2, unsupervised hierarchical clustering correctly separated the majority of patients into 2 clusters: protective (black lines under the heatmap) and risk (red lines under the heatmap). Differences in modes of inheritance between the brothers are emphasized through pattern differences in the heatmap. Concordant brother-pairs E287-E1147 and E1361-E502 exhibited greater similarity across homo- and heterozygous variants and clustered with other inhibitor-negative and inhibitor-positive patients, respectively. On the other hand, the discordant brothers E835-E1309 and E172-E1355 exhibited greater differences across these variant types, which resulted in 2 brothers E835 and E1355 clustering away from the other risk and protective patients, respectively. This is most likely due to the close genetic background shared with their brothers, causing patient E835 to cluster together with his brother, E1309, in a protective cluster and E1355 together with E172 in a risk cluster (Figure 2). Biological relatedness between all these brother-pairs was confirmed based on all variants that passed QC using the Ajk statistic32 (supplemental Table 2B).

Figure 2

Clustering analysis (with heatmap) of variants identified in 26 Italian severe hemophilia A patients by WES. Unsupervised hierarchical clustering was performed using 1364 statistically significant variants (P ≤ .05) by Cochran-Armitage (trend model) and Fisher’s exact (dominant model) t tests. Two main, distinct patterns of homozygous variants were revealed that are characteristic of inhibitor-positive (risk) and inhibitor-negative (protective) patients. Differences in modes of inheritance between all brothers are emphasized through pattern differences in the heatmap: concordant brothers E287-E1147 (inhibitor negative; green rectangle) and E1361-E502 (inhibitor positive; orange rectangle) exhibit greater similarity across homo- and heterozygous variants, whereas the discordant brothers E835-E1309 and E172-E1355 (blue rectangles) exhibit greater differences across these variant types. Red bars under the heatmaps underline inhibitor-positive and black bars underline inhibitor-negative patients. Heatmap: red, homozygous variants; green, heterozygous variants; black, reference calls. Unsupervised hierarchical clustering was performed using R.31

Identification of potentially damaging variants

Further filtering of these 1364 gene variants for damaging effects revealed a list of 80 variants, which we further trimmed by removing those that did not occur in canonical transcript38 or found in unknown genes such as annotated but unknown open reading frames or hypothetical proteins. This left us with 43 coding variants that were observed in 35 different genes. As shown in Table 2, most of the variants were common polymorphisms showing MAF >5%. Only 1 low-frequency variant rs2407221 (MAF, 3%) in the PRSS48 gene that encodes the STOP loss (Glu/Ter) has been identified. In addition, we found 2 rare variants, the rs6582584 in the ALG10B gene (MAF, 1%) and the rs148759495 in FCGBP gene, for which no MAF has been determined thus far. The affected pathways involve genes associated with the immune responses and autoimmunity, such as Immunoglobulin Heavy Constant γ 1 (IGHG1), Fc Fragment of IgG Binding Protein (FCGBP), Killer Cell Lectin-Like Receptor Subfamily B, Member 1 (KLRB1), and CD207 genes. To obtain a final list of variants for replication analysis, we excluded SNVs that were in high linkage disequilibrium (LD). Finally, we obtained a list of 28 nonsynonymous variants in 27 genes that were subjected to TaqMan SNP replication (Table 2). Variant filtering steps are graphically represented in Figure 1.

Table 2

Nonsynonymous variants identified in a case-control association analysis using Cochran-Armitage and Fisher’s exact tests

Replication of the 28 nonsynonymous variants identified in WES

To validate the WES data, we replicated the 28 SNVs in 53 inhibitor-positive and 174 inhibitor-negative severe HA patients, and we performed logistic regression association analysis. As shown in Figure 3, we identified 10 variants that showed the same trend with the WES study. The remaining 18 variants were either equally distributed between inhibitor-positive and inhibitor-negative patients or were discordant with the sequencing results. The top variant replicated was rs3754689 in the LCT gene, which was significantly associated with inhibitor development (OR, 0.58; 95% CI, 0.36-0.94; P = .028). The rs3754689 SNV encodes C>T missense change that results in a valine to isoleucine substitution at position p.Val219Ile of the lactase protein. We found the rs3754689-T allele in 27 inhibitor-positive and 106 inhibitor-negative patients, corresponding to a MAF of 0.26 and 0.38 for inhibitor-positive and inhibitor-negative patients, respectively. This result confirmed the protective role for the rs3754689-T allele previously seen in WES with an OR of 0.6 (Table 3). However, adjusting for multiple testing using Benjamini and Hochberg’s method resulted in a false discovery rate of 22%. We also performed a separate analysis for all unrelated patients from the replication cohort (n = 185), of whom 47 were inhibitor positive and 138 were inhibitor negative. The results confirmed once again the protective effect of the rs3754689-T allele in inhibitor development (OR, 0.56; 95% CI, 0.33-0.94; P = .028). Similar results were obtained from the association analysis of 41 HR patients and 174 inhibitor-negative patients (OR, 0.56; 95% CI, 0.33-0.95; P = .032).

Figure 3

Forest plot association analysis of the 28 nonsynonymous variants identified in WES and replicated in 227 severe HA patients. TaqMan SNP genotyping-based replication was performed in 53 inhibitor-positive and 174 inhibitor-negative patients. Data analysis was performed using logistic regression analysis adjusting for first-degree relatives. Black diamonds denote OR estimates, and the black bars denote the range of 95% CIs of each variant replicated. Red dots denote OR of variants showing the same trend with WES. Red rectangle denotes statistically significant association, and blue rectangles denote variants with borderline P values. The graph was generated using R.31

Table 3

Nonsynonymous variants replicated in 227 severe hemophilia A patients and showing a direction of association consistent with discovery phase

In addition, logistic regression analysis revealed 3 missense variants with borderline P values: rs1045216 in the PLEKHA1 gene (OR, 0.67; 95% CI, 0.42-1.1; P = .08), rs2230681 in the PSMD9 gene (OR, 0.47; 95% CI, 0.21-1.02; P = .056), and rs3745009 in the SLC14A2 gene (OR, 0.68; 95% CI, 0.44-1.06; P = .087). All 3 variants showed effect direction concordant with those previously seen in WES (Table 3).

Discussion

In this article, we report the results of WES and subsequent downstream data analysis to identify novel genetic risk factors for inhibitor development in 26 Italian patients with severe HA. We identified several rare, damaging variants present in immunoregulatory genes, as well as 28 nonsynonymous variants with putative roles in inhibitor development obtained from a case-control association analysis. Replication of these 28 variants in a larger cohort of 227 severe HA patients confirmed the protective effect of the rs3754689 missense variant in the LCT gene (OR, 0.58; 95% CI, 0.36-0.94; P = .028) as putative susceptibility locus for inhibitor development.

The development of neutralizing antibodies toward FVIII involves several steps of immune activation that include endocytosis and endosomal processing of FVIII by antigen presenting cells like dendritic or B cells, presentation of these peptides via the HLA class II molecules to the CD4+ T cells, and T cell–dependent stimulation of B-cell differentiation into antibody-secreting plasma or memory cells.39 The genetic variants identified in WES could potentially disrupt immune activation at any of these steps, leading to either increased or decreased risk of developing FVIII alloantibodies. For instance, we found variants in the HLA class II genes, HLA DRB1 and HLA DRB5, which have an essential role in presenting FVIII peptides to CD4+ T-helper cells.11-14 The HLA DRB1 locus is particularly interesting as it has been shown to be, together with the HLA DQB1, the most consistent risk factor in inhibitor development.9,13 We also identified variants that may affect the production or function of the immunoglobulin molecules: 5 immunoglobulin heavy chain variable regions, 3 variants in the immunoglobulin κ variable cluster, and SNVs in immunoglobulin constant heavy γ chains (Table 1). Indeed, previous studies demonstrated that class switch recombination is necessary to produce IgG1 and IgG4, the main subclasses of FVIII neutralizing antibodies, by activated B cells.40,41 Moreover, we found variants in genes encoding DNA repair molecules like XRCC4 and PMS2. The Xrcc4 protein is involved in nonhomologous end joining, a pathway used to repair double-stranded DNA breaks generated during V(D)J recombination that occurs between the immunoglobulin and T-cell receptor variable regions in developing lymphocytes.42 Other variants identified by WES may affect cytokines, ligand/receptor chemokine subunits, interleukins, and other immune system modulators that are involved in various steps of immune system activation and inflammation.

The LCT gene encodes the lactase-phlorizin hydrolase, an enzyme with both lactase and phlorizin hydrolase activity,43,44 which is required for the digestion of lactose sugar in the small intestine of humans and other mammals. The rs3754689 missense variant leads to a valine to isoleucine amino acid substitution at position p.Val219Ile of the lactase protein, and it has been predicted by both SIFT and PolyPhen-2 as benign. Indeed, this amino acid change has been previously reported by Boll at al44 as a polymorphism.

If rs3754689 is not causal, it may be coinherited with other causal SNV(s) that affect genes involved in immune responses. To answer this question, we carried out LD analysis of all 28 nonsynonymous SNVs in our replication study, as well as WES data, and found no variants in immunoregulatory genes that were coinherited with this SNV (data not shown). However, when the LD analysis was expanded to include variants in the genomic region surrounding the main association on chromosome 2q21, we found 9 SNVs present in the same individuals as the rs3754689 variant, indicating that they are in high LD with each other. In addition to the 2 intron variants in the LCT gene, we found 5 variants located in 2 downstream genes, R3HDM1 and UBXN4, as well as 2 intron variants located in 2 upstream genes: MCM6 and DARS. Such high LD in this region is not surprising because the haplotype surrounding the LCT locus underwent strong selective pressure in certain populations as a result of establishment of lactase persistence, a heritable genetic trait in which intestinal lactase production persists into the adulthood.45 In the European populations, lactase persistence is largely attributed to the polymorphism in intron 13 of the MCM6 gene (C>T at position −13910 to the LCT gene; rs4988235) located in the cis-acting enhancer element that influences transcriptional activity of the neighboring LCT promoter.46,47 This implies that the association signal coming from the LCT locus may indicate adjacent loci within the same haplotype. Indeed, the R3HDM1 locus has been recently identified in pan-meta-genome-wide association studies as a putative susceptibility locus for 2 autoimmune disorders: systemic sclerosis and systemic lupus erythematosus.48 The Ubxn4 protein has been suggested to function in endoplasmic reticulum–associated protein degradation,49 a pathway that can potentially be involved in presenting the FVIII peptides to CD4+ T-helper cells by antigen-presenting cells.

Previous studies have identified several loci, such as IL10 (−1082 G>A), CTLA4 (−318 C>T), TNFA (−308 G>A), and others that are located in intergenic, upstream promoter regions or intronic sequences of immunoregulatory genes, as genetic susceptible loci for inhibitor development.9 Because our analysis was focused on the coding areas of the genome, most of these variants were not detected. We searched for previously identified coding variants in the immune genes and found rs231775 (CTLA4),50 rs2476601 (PTPN22),50 rs2879097 (PCGF2),51 rs1882019 (HSP90B1),51 and p.R131H SNV in the FCGR2 gene.20 None of these SNVs, however, were associated with inhibitor development in our discovery phase (data not shown), and the most plausible explanation is a lack of statistical power due to small discovery population.

Another limitation of our study is the wide range of CIs observed in the replication phase (Figure 2), which may imply small replication cohort and/or population stratification. Our replication population consisted of 227 severe HA patients, and according to our post hoc data analysis, this cohort size should have been sufficient to replicate the 28 variants identified in WES with nearly 100% power at 0.05 significance. However, one needs to bear in mind that, due to limited and unbalanced size, the MAF obtained from our discovery phase might have been imprecise, thereby affecting the replication population power analysis. To reduce potential confounding due to population stratification, we included in our replication cohort mainly Italian patients. Only 6 patients (2.6%) of the entire replication cohort were of geographic origin other than Italian. Such a low ethnic diversity could not have had profound effects on data analysis. In fact, logistic regression association analysis excluding patients with non-Italian origin confirmed once again the rs3754689 missense variant (OR, 0.55; 95% CI, 0.34-0.91; P = .019).

In conclusion, we report the identification of several rare, damaging variants in immunoregulatory molecules, as well as the rs3754689 missense variant in the LCT gene, as putative susceptibility locus for inhibitor development. Given the extensive linkage surrounding the LCT locus due to a conserved haplotype, the association signal captured by the rs3754689 variant may belong to the adjacent loci such as the R3HDM1 or UBXN4. Further research is needed to pinpoint the exact signal coming from this conserved haplotype on chromosome 2q21 and to unravel the immunological mechanisms underlying the association between this variant(s) and risk of inhibitor development. Our findings might help to define the genetic landscape of inhibitor risk and promote the development of genetic risk algorithms, which will take into account all known genetic risk factors. The rapid advent of innovation technologies, such as exome/genome sequencing, will enable the identification of yet unknown genetic risk factors and pathways involved in inhibitor development. Use of such algorithms may allow to stratify patients for risk of inhibitor development that will permit to advance personalized treatment and clinical management strategies.

Authorship

Contribution: M.M.G. and L.A.L. designed the research and, together with K.B., performed data analysis, interpreted the results, and drafted the manuscript; E.P. performed experiments and, together with I.G., collected data and helped with the interpretation of the results; I.M. helped with data analysis; M.R.F., M.E.M., and E.S. were involved in recruiting patients; and F.P. designed the project, supervised the research, interpreted the results, and critically reviewed the manuscript.

Conflict-of-interest disclosure: F.P. has received honoraria for participating as a speaker at satellite symposia and educational meetings organized by Bayer, Biotest, CSL Behring, Grifols, Novo Nordisk, and Sobi. She is the recipient of research grant funding from Alexion, Biotest, Kedrion Biopharma, and Novo Nordisk paid to Fondazione Luigi Villa, and she has received consulting fees from Kedrion Biopharma, LFB, and Octapharma. She is a member of the Ablynx scientific advisory board. E.S. received fees as a speaker in meetings organized by Kedrion; acted as a paid consultant for Bayer, Pfizer, CSL Behring, NovoNordisk, Grifols, Baxter, Baxalta, Biogen Idec, Sobi, Octapharma, and Roche; and received an unrestricted research grant from Pfizer. M.E.M. received fees as a speaker in meetings organized by Bayer, CSL Behring, Novo Nordisk, Pfizer, and Sobi/Biogen Idec and acted as a paid consultant for Bayer, Pfizer, CSL Behring, Novo Nordisk, Baxter/Baxalta, and Sobi/Biogen Idec. The remaining authors declare no competing financial interests.

Correspondence: Flora Peyvandi, Angelo Bianchi Bonomi Hemophilia and Thrombosis Center, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico and Department of Pathophysiology and Transplantation, Università degli Studi di Milano, Via Pace 9, 20122 Milan, Italy; e-mail: flora.peyvandi{at}unimi.it.

Acknowledgments

The authors thank the following colleagues from the Angelo Bianchi Bonomi Hemophilia and Thrombosis Center, Fondazione IRCCS Ca’ Granda Ospedale Maggiore: A. Cairo and S. Seregni for help with genetic analysis and M. Boscarino for help in the preparation of the figures; M. Fornili and E. M. Biganzoli (Dipartimento di Scienze Cliniche e di Comunità, University of Milan, Milan, Italy) for help with statistical analysis; and F. R. Rosendaal (Departments of Clinical Epidemiology, Thrombosis and Hemostasis, Leiden University Medical Center, Leiden, The Netherlands) for critical revision of the manuscript and its methodology.

This research project has been founded by a Bayer Hemophilia Special Project Award 2011 entitled “Discovery of genetic determinants of inhibitor development in hemophilia A by exome sequencing” awarded to Flora Peyvandi.

Footnotes

  • The online version of this article contains a data supplement.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted December 14, 2015.
  • Accepted March 29, 2016.

References

View Abstract