Genetic determinants of plasma von Willebrand factor antigen levels: a target gene SNP and haplotype analysis of ARIC cohort

Marco Campos, Wei Sun, Fuli Yu, Maja Barbalic, Weihong Tang, Lloyd E. Chambless, Kenneth K. Wu, Christie Ballantyne, Aaron R. Folsom, Eric Boerwinkle and Jing-fei Dong


von Willebrand factor (VWF) is an essential component of hemostasis and has been implicated in thrombosis. Multimer size and the amount of circulating VWF are known to impact hemostatic function. We associated 78 VWF single nucleotide polymorphisms (SNPs) and haplotypes constructed from those SNPs with VWF antigen level in 7856 subjects of European descent. Among the nongenomic factors, age and body mass index contributed 4.8% and 1.6% of VWF variation, respectively. The SNP rs514659 (tags O blood type) contributed 15.4% of the variance. Among the VWF SNPs, we identified 18 SNPs that are associated with levels of VWF. The correlative SNPs are either intronic (89%) or silent exonic (11%). Although SNPs examined are distributed throughout the entire VWF gene without apparent cluster, all the positive SNPs are located in a 50-kb region. Exons in this region encode for VWF D2, D′, and D3 domains that are known to regulate VWF multimerization and storage. Mutations in the D3 domain are also associated with von Willebrand disease. Fifteen of these 18 correlative SNPs are in 2 distinct haplotype blocks. In summary, we identified a cluster of intronic VWF SNPs that associate with plasma levels of VWF, individually or additively, in a large cohort of healthy subjects.


The von Willebrand factor (VWF) is an essential component of hemostasis at sites of vascular injury. This hemostatically active adhesion ligand also plays a critical role in thrombus formation at the site of a ruptured atherosclerotic plaque and in platelet aggregation induced by high fluid shear stress in areas of severe vascular stenosis. The former results in myocardial infarction when it occurs in coronary arteries, whereas the latter is a major cause of thromboembolism associated with ischemic stroke.1

VWF is the largest multimeric glycoprotein in the plasma. It is exclusively synthesized in endothelial cells and megakaryocytes initially as a monomeric glycoprotein. In the endoplasmic reticulum and Golgi apparatus of these cells, the monomers form C-terminal disulfide-linked dimers. A variable number of dimers subsequently form multimers through N-terminal disulfide linkages, a process assisted by the proteolytic release of a 714-amino acid propeptide.2,3 The newly synthesized VWF multimers are either constitutively released into the circulation or stored in the Weibel-Palade bodies of endothelial cells and α-granules of megakaryocytes/platelets.4,5 On stimulation, these granules secrete the stored VWF multimers that are rich in ultra-large and hemostatically hyperactive forms.4,6 Ultra-large VWF multimers are also secreted under endothelial stress caused by inflammation but are rapidly and partially cleaved by the zinc metalloprotease ADAMTS13 (a disintegrin and metalloprotease with thrombospondin type 1 repeats) into smaller, variable-sized multimers.7-9 The hemostatic activity of VWF multimers is determined not only by multimer size,4,10 but also by the quantity of VWF antigen in the circulation. The plasma level of VWF multimers is determined by a dynamic balance between the rate of production and that of clearance. Although acquired conditions are known to affect synthesis and secretion,11 genetic factors play a major role in regulating VWF synthesis and clearance. It has been reported that genetic factors are responsible for up to 66% of variation in plasma VWF antigen level, of which 30% is related to ABO blood type.12,13

The human VWF gene is located on chromosome 12.14,15 It spans approximately 180 kb of nucleotides and contains 52 exons that encode a multidomain prepolypeptide of 2791 amino acids.14-16 The VWF gene is highly polymorphic.17 Single nucleotide polymorphisms (SNPs) of the coding and promoter sequences in the VWF gene have been extensively studied and found to influence the levels of VWF in the circulation. Furthermore, as a heavily glycosylated polypeptide,16,18 the level of circulating VWF is also affected by the structure of its carbohydrate side chains, which are associated with ABO blood group, primarily because of the presence or absence of a functional glycosyltransferase that adds either an N-acetylgalactosamine or a D-galactose to a D-galactose side chain on the H antigen.19 This structural variation in glycosylation has been reported to directly affect the rate of VWF clearance.20

Although considerable information has been learned, previous studies are limited in determining genetic impacts on VWF antigen level in the circulation as they had relatively small sample sizes and often targeted specific SNPs that are predominantly located in either exons or the promoter region. Large population-based studies are therefore needed to further investigate the relationship between VWF polymorphisms and levels of VWF antigen, especially given its risk predisposition to coronary heart disease, ischemic stroke,21-24 and von Willebrand disease (VWD).25 We have correlated VWF antigen levels in the plasma with 78 SNPs and haplotypes constructed from these SNPs in 7856 subjects of European descent in the Atherosclerosis Risk in Communities study (ARIC).


Study population

ARIC is an ongoing prospective cohort study designed to assess subclinical atherosclerosis and clinical atherosclerotic events.26 Baseline samples were collected from 15 792 adults 45 to 64 years of age from 1987 to 1989 using probability sampling from Forsyth County, NC; Jackson, MS; the northwestern suburbs of Minneapolis, MN; and Washington County, MD. The cohort consists predominantly of subjects of European and African American descent, but this study was exclusively focused on subjects of European descent as information on ABO blood group was not available for African Americans when the data were analyzed. The use of human samples was approved by the National Institutes of Health Data Monitoring board and institutions.

Baseline measurements

Blood was drawn after an 8-hour or longer fasting period from an antecubital vein. VWF antigen was determined by a commercial enzyme-linked immunosorbent assay kit from American Bioproducts, and results were reported as a percentage of the Universal Coagulation Reference Plasma (Thromboscreen, Pacific Hemostasis, Curtin Matheson Scientific). The reliability coefficient (method variance plus intraindividual variance divided by total variance) obtained from repeat testing of plasma samples (from single blood draw) over several weeks was 0.68.

The following conditions were adjusted for as confounding factors (covariates) as they are known to affect plasma VWF antigen levels: age, body mass index (BMI), sex, hypertension, diabetes, and current smoking status.11

Genotyping and haplotype construction

SNPs generated from the Affymetrix, Version 6.0 platform in the region encompassing the VWF gene (5 928 997-6 098 180 bp on chromosome 12) were included in the data analyses. We also used the fastPHASE program (version 1.2)27 to resolve haplotypes from the unphased SNP genotype data for the downstream haplotype-based association analysis. To characterize the linkage disequilibrium (LD) pattern in the region of interest, we applied Haploview28 to determine the regions in strong LD, their cosegregation rates, and underlying haplotypes in each region.

Imputation of ABO blood group

The ABO blood groups were estimated using the SNP dosage in the ARIC genome-wide association study (GWAS) database. Four SNPs were used to serve as tag-SNPs for 4 major ABO genotypic blood groups based on LD (R2 > 0.7) in the HapMap CEU sample: rs514659 (tag O group), rs8176749 (one of the functional variants for B group), rs8176704 (tag A2 group), and rs512770 (the functional variant for O2).29,30

Data analysis

All SNPs were evaluated for Hardy-Weinberg equilibrium by χ2 statistics. Two P values were calculated for each SNP using the R function “chisq.test.” One was based on the asymptotic χ2 distribution, and the other was based on up to 20 000 permutations. Specifically, the P value of each SNP was first calculated by 2000 permutations; then, for all the SNPs with P values of < .01, 20 000 permutations were used to obtain more accurate P values.
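The two-stage testing scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the study used the R function chisq.test, whereas this sketch uses a hand-written Pearson χ2 statistic and a simple allele-shuffling permutation as a stand-in. The function names and the example genotype counts are hypothetical.

```python
import random

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic for Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

def hwe_permutation_p(n_aa, n_ab, n_bb, n_perm=2000, seed=0):
    """Permutation P value: shuffle the alleles, re-pair them into genotypes,
    and count how often the permuted statistic reaches the observed one."""
    rng = random.Random(seed)
    n = n_aa + n_ab + n_bb
    alleles = [0] * (2 * n_aa + n_ab) + [1] * (2 * n_bb + n_ab)
    observed = hwe_chi_square(n_aa, n_ab, n_bb)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(alleles)
        counts = [0, 0, 0]  # indexed by number of B alleles in the pair
        for i in range(0, 2 * n, 2):
            counts[alleles[i] + alleles[i + 1]] += 1
        if hwe_chi_square(counts[0], counts[1], counts[2]) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def two_stage_p(n_aa, n_ab, n_bb):
    """2000 permutations first; refine with 20 000 only when P < .01."""
    p = hwe_permutation_p(n_aa, n_ab, n_bb, n_perm=2000)
    if p < 0.01:
        p = hwe_permutation_p(n_aa, n_ab, n_bb, n_perm=20000)
    return p
```

The second stage is run only for SNPs that already look extreme, which keeps the total number of permutations manageable while giving finer resolution where it matters.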

Linear models were used to evaluate the association of actual or log VWF antigen levels with each SNP. Two sets of covariates were considered. Set A included 6 confounding factors: age, BMI, sex, hypertension, diabetes, and current smoker status. The age and BMI were continuous measurements, and the remaining 4 covariates were binary indicators. Set B included all variables in set A plus 4 surrogate genotypes for ABO blood types.
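A covariate-adjusted SNP association of this kind can be sketched as an ordinary least-squares fit, with the SNP contribution read off as the gain in R2 when the SNP is added to the covariate-only model. This is an illustrative sketch only: the text does not specify how variance contributions were computed, and the function names and toy values below are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    design = [[1.0] * n] + [[float(v) for v in c] for c in columns]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, design)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def snp_variance_explained(log_vwf, snp_dosage, covariates):
    """Gain in R^2 when the SNP dosage (0, 1, or 2) is added to the covariates."""
    return r_squared(covariates + [snp_dosage], log_vwf) - r_squared(covariates, log_vwf)

# Toy data (invented for illustration): one covariate and one SNP.
age = [50, 55, 60, 45, 62, 58, 47, 53]
snp = [0, 1, 2, 1, 0, 2, 1, 0]
log_vwf = [4.55, 4.70, 4.90, 4.60, 4.68, 4.88, 4.62, 4.58]
gain = snp_variance_explained(log_vwf, snp, [age])
```

Covariate set B would simply append the 4 ABO surrogate genotypes to the covariate list before the SNP is added.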

The number of copies of each haplotype carried by each person was calculated, which could take values of 0, 1, or 2. Linear models were applied with actual or log VWF antigen level as the response and haplotype copy number as a predictor while controlling for the confounding effects of covariate set A or B. They were also used to assess the significance of all haplotypes within each LD block. We used 2 common methods to map haplotypes directly with genotype data: the score tests approach,31 which is implemented in an R package R/haplo.stats, and a full likelihood method,32 which is implemented in a stand-alone HAPSTAT program.
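The per-person haplotype coding can be illustrated with a short sketch. It assumes phased haplotypes are already available as a pair of strings per person (as a phasing program would produce); the function names and the 3-SNP haplotypes are hypothetical.

```python
from collections import Counter

def haplotype_copies(phased_pairs, haplotype):
    """Number of copies (0, 1, or 2) of `haplotype` carried by each person."""
    return [(h1 == haplotype) + (h2 == haplotype) for h1, h2 in phased_pairs]

def haplotype_frequencies(phased_pairs):
    """Population frequency of each haplotype across all chromosomes."""
    counts = Counter(h for pair in phased_pairs for h in pair)
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items()}

# Four hypothetical people, each with two phased haplotypes.
phased = [("ACG", "ACG"), ("ACG", "GTA"), ("GTA", "GCA"), ("ACG", "GCA")]
copies = haplotype_copies(phased, "ACG")
freqs = haplotype_frequencies(phased)
```

The resulting copy-number vector plays the same role in the linear model as a SNP dosage.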

The threshold for statistical significance was set at P < .0004, calculated as 0.05 divided by 120 (the maximal number of potential tests in this study, including 78 SNPs, 34 haplotypes, and 8 LD blocks).
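The threshold arithmetic can be checked directly. The variable names below are illustrative only.

```python
# Bonferroni-style threshold from the test counts given in the text.
n_snps, n_haplotypes, n_ld_blocks = 78, 34, 8
n_tests = n_snps + n_haplotypes + n_ld_blocks
threshold = 0.05 / n_tests  # about 0.000417, reported as .0004
```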


Of the ARIC participants of European descent, 8125 were available for analysis, but 269 were excluded because of (1) failure of SNP data to meet GWAS quality control standards as previously discussed30 (n = 150), (2) missing data on VWF antigen level (n = 31), (3) missing one of the covariates used for data adjustment (n = 66), or (4) missing information on ABO blood type (n = 22). The remaining 7856 were included in the final analysis. The distribution of covariates is listed in supplemental Table 1 (available on the Blood Web site; see the Supplemental Materials link at the top of the online article).

VWF antigen level

The level of plasma VWF antigen varied from 24% to 456%. The mean ± SD VWF antigen level for males (112.9% ± 42.7%) was significantly higher than that for females (111.1% ± 42.4%; Wilcoxon rank-sum test, P = .05, supplemental Table 1). Supplemental Table 2 lists the percentile of subjects in groups based on VWF antigen levels, with 45% of subjects having VWF antigen levels at 100% or less. Up to 5% of subjects have VWF levels of < 56%, but their bleeding profiles are presently unknown. As shown in Figure 1, the distribution of the VWF antigen was skewed to the right, which could potentially invalidate the normal distribution assumption of a regression model. All the data were therefore analyzed for original and log VWF antigen levels, but only data generated with log VWF antigen are presented. The VWF antigen was analyzed before and after ABO blood type adjustment. The SNP rs514659 (type O tag) has the strongest influence, contributing to 15.4% of the variance of log VWF levels, and the second strongest is rs8176704 (type A tag) at 4.1% (supplemental Table 3). Among the nongenetic covariates, age and BMI had the strongest influence, accounting for approximately 4.8% and 1.6% of the variance of log VWF values, respectively. This finding is consistent with an earlier report from the ARIC study.11
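The rationale for the log transformation can be illustrated by comparing sample skewness before and after transforming. The values below are invented for illustration (only the 24% to 456% range comes from the text), and the function name is hypothetical.

```python
import math

def skewness(values):
    """Sample skewness: third central moment divided by the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Invented right-skewed VWF antigen levels (% of reference plasma).
vwf = [50, 60, 80, 100, 110, 130, 200, 456]
raw_skew = skewness(vwf)
log_skew = skewness([math.log(v) for v in vwf])
```

In this toy sample the log values are less skewed than the raw values, which is the property that motivates fitting the regression on the log scale.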

Figure 1

Plasma VWF levels. Levels were plotted based on actual and log values to correct a skewed distribution of actual VWF antigen, which could potentially invalidate the normal distribution assumption of a regression model.

SNP correlation with VWF antigen level

Of the 78 VWF SNPs analyzed for this study, 72 (92.3%) are intronic and 6 (7.7%) are exonic. The frequencies and locations of these SNPs are listed in supplemental Table 4. The frequency of homozygosity for rare alleles varied from 0% to 24.4% with 13 SNPs being < 1% and 4 SNPs not detected. The changes in nucleic acids and amino acids for the 6 exonic SNPs and their domain locations, constructed based on a report by Mancuso et al17 and the Sheffield VWF Database, are listed in supplemental Table 5. Four SNPs found to be in Hardy-Weinberg disequilibrium were removed from constructing the haplotype map and haplotype specific analysis but contributed to SNP-specific P value calculations.

The P values for association of the 78 SNPs with VWF antigen levels are listed in supplemental Table 6. Of those, 18 (23.1%) were significantly associated with log VWF antigen levels before they were adjusted for ABO blood group (Table 1). After adjustment for ABO, the association of positive SNPs with VWF antigen levels grew stronger. One striking observation is that the 78 SNPs are distributed throughout the VWF gene without apparent hot spots, but all positive SNPs are clustered in a 50-kb region of the VWF gene (Figure 2). The positive SNPs were predominantly in introns, except 2 coding SNPs, which are located in exon 18 and exon 22 (rs1800380 and rs1063857), but neither resulted in actual changes in amino acid sequence (supplemental Table 5). The effect size of the positive SNPs was relatively small, with the strongest association (rs1063857) explaining approximately 0.9% of VWF (or log VWF) variation. However, a dosing/additive effect was detected among the positive SNPs.

Table 1

P values of SNPs significantly associated with VWF antigen levels

Figure 2

P values of association between plasma VWF antigen and individual SNPs in the VWF gene. The data were analyzed based on log VWF antigen with or without ABO adjustment. The data show a clear cluster of positive SNPs in the 6000- to 6050-kb region, even though all SNPs analyzed in this study were distributed throughout the entire VWF gene.

Haplotype construction and correlation with VWF antigen levels

Studies in the past have identified many specific SNPs that affect plasma VWF antigen levels but had limited power to construct haplotypes because of insufficient sample sizes.33-35 Our large sample set allowed us to construct a VWF haplomap that included rare SNPs to identify specific SNPs that are transmitted together. We found 9 separate blocks that are in LD in the VWF gene (Figure 3). Figure 4 lists the common haplotypes that account for more than 98% of all haplotypes found in the cohort.

Figure 3

VWF haplotypes constructed using the Haploview program. The haplotypes were constructed using the default setting that (1) ignores pairwise comparisons of markers more than 500 kb apart, (2) excludes persons with more than 50% of missing genotypes, (3) examines haplotypes found in 1% or more of the population, (4) removes markers with Hardy-Weinberg disequilibrium, and (5) uses R2 to define haplotype blocks. The majority of them are found in 9 LD blocks that are distributed throughout the entire VWF gene.

Figure 4

Frequencies of VWF haplotypes constructed with the Haploview program. The interactions between haplotypes in adjacent LD blocks are indicated.

Each LD block contained a variable number of haplotypes with different frequencies. Blocks 5 and 6 were significantly correlated with VWF antigen levels regardless of whether VWF antigen was expressed as the original (data not shown) or log values (Figure 5; Table 2). Block 5 had the most significant association among all the genetic variations we tested with an effect size (calculated by R2) of 1.5% for log VWF levels, much stronger than the most significant single SNP (rs1063857). The result is consistent with an additive effect among individual SNPs. Consistent with individual SNPs, the association between specific haplotype blocks and VWF antigen levels was significantly improved after the ABO adjustment (Table 2). Further analysis found that the first to fourth haplotypes in block 5 and the third haplotype in block 6 were individually associated with VWF antigen levels. These results were further verified with haplotype mapping methods that directly used the genotype data (data not shown).

Figure 5

P values of the associations between VWF haplotypes and log VWF antigen with and without ABO adjustment. There is clearly a cluster of significant associations in blocks 5 and 6.

Table 2

P values for haplotype association with VWF levels


We have analyzed the relationship between 78 VWF SNPs and haplotypes constructed from those SNPs and plasma levels of VWF antigen in 7856 subjects of European descent in the ARIC cohort. We have found a significant association between multiple SNPs and VWF antigen levels after correction for factors known to affect those levels, including age, sex, diabetes mellitus, hypertension, BMI, tobacco use, and ABO blood group.12 Among the nongenetic confounding factors, age and BMI explain most of the VWF variation (4.8% and 1.6%, respectively), whereas the contributions from hypertension, diabetes, and smoking are very small. VWF levels are known to increase with aging, but the underlying mechanism for the BMI effect remains to be further investigated. It is unlikely that BMI affects circulating VWF antigen levels because of obesity and associated chronic systemic inflammation because other conditions known to associate with inflammation (diabetes and smoking) do not have this positive association.

Eighteen of the 78 SNPs studied (23%) were significantly associated with VWF antigen levels (Table 1). One of these positive SNPs (rs1063857) was also identified as highly correlative of VWF level in a GWAS conducted recently by the CHARGE consortium.30 (ARIC contributes to the CHARGE dataset.) After construction of a haplotype map, we identified 9 distinct haplotype blocks that are in LD (Figure 3), each containing a different number of haplotypes (Figure 4). Thirteen of the 18 SNPs (72%) with a positive association are aggregated in block 5 and 2 of 18 SNPs (11%) are in block 6. These are the only LD blocks that are significantly associated with levels of VWF antigen (Figure 5; Table 2). Together, this study yielded several interesting findings.

First, the positive SNPs (and positive haplotypes) are strikingly concentrated in a 50-kb region of the approximately 180-kb VWF gene on chromosome 12 (27.8% of the VWF gene; Figures 2 and 5), even though the individual SNPs included in the data analyses are scattered throughout the entire VWF gene. This is a very strong signal from a localized region of the gene. The strong localized signal in this large population provides a compelling argument that this region has a strong causal relationship with plasma VWF antigen levels. Although the effect size of each positive SNP is small, it could become significant because of additive effects as demonstrated by the haplotype analysis and nongenomic factors (supplemental Table 3). The region that contains all the positive SNPs spans from exons 13 to 24 and their intron intervals (Table 1) and encodes the D2, D′, and D3 domains of VWF (Figure 6).16,18 The D2 domain and a portion of D′ constitute the major part of the VWF propeptide, which is cleaved in the Golgi to facilitate multimerization and subsequent storage in granules.36-38 Mutations in the VWF propeptide, specifically in the D2 domain, have been shown to result in decreased VWF storage, multimerization, and secretion.39 They also impact the ratio of VWF propeptide to mature VWF, which in turn affects the clearance of VWF in the circulation.40 The D3 domain is important in VWF multimerization, which could impact VWF tertiary structure and ultimately circulating levels of VWF and its clearance.39 It is also interesting to note that the rate of VWF clearance is found to be accelerated in patients with VWD Vicenza, who often have mutations, including intronic mutations, in the D domains41-43 (we did not find positive SNPs in the exon encoding D4 and flanking intronic sequence). However, extensive biologic experiments are required to establish a mechanistic link between positive SNPs and VWF expression in general and the putative role of this 50-kb region in regulating VWF expression in particular.

Figure 6

A schematic illustration of the VWF gene and its exon correspondence to specific domains of a monomeric VWF polypeptide.

Second, the positive SNPs are predominantly found in introns. Intronic variants are increasingly recognized as a critical disease-causing genetic constituent. Several large GWAS have associated intronic SNPs with hearing impairment,44 bipolar disorder,45 asthma,46 and rheumatoid arthritis.47 A recent report also associates an intronic SNP with serum level of transferrin.48 Furthermore, chromosome 9 region p21 (9p21), which is a large stretch of noncoding sequence away from any known genes, has been shown to associate with coronary heart disease.49 Indeed, by surveying 21 429 disease-causing SNPs from 2113 publications, Chen et al50 concluded that synonymous and nonsynonymous SNPs shared similar likelihood and effect size for disease association. The exact mechanism(s) for these intronic SNPs to affect VWF antigen levels remains to be further investigated, but they could impact the form and efficiency of gene splicing and mRNA stability.51 Alternatively, one could speculate that the positively correlated introns encode for intronic miRNA52 that could potentially regulate VWF transcription. The speculation is supported by recent reports that miRNA regulates fibrinogen production53 and this regulation could determine a person's sensitivity to chronic thromboembolic pulmonary hypertension.54 Finally, these positive intronic SNPs, which are almost entirely in LD with each other, could be linked to an as yet unknown gene(s) or region(s) that impacts the synthesis, stability, and/or clearance of VWF. Consistent with this notion, several SNPs outside of the VWF gene have been found to associate with VWF antigen in a recent GWAS by the CHARGE consortium.30

Third, we found 2 haplotype blocks (blocks 5 and 6) that are significantly associated with VWF level and contain 15 (83.3%) of the 18 positive SNPs (Tables 1 and 3). The association between haplotype blocks and VWF antigen levels was stronger than that of individual SNPs, suggesting an additive effect among SNPs. This could mean that certain haplotype blocks more accurately localize a mutation that has a causal link to VWF level, or that they capture favorable combinations of mutations that determine VWF level when expressed together.

Table 3

P values of association of individual haplotypes in block 5 and 6 with VWF levels

Notably, exonic and intronic splicing mutation SNPs that have been previously identified in patients with VWD55-58 are not detected in the ARIC cohort, largely because ARIC recruited healthy subjects, who did not have bleeding manifestation or family history of bleeding based on an extensive questionnaire administered at the time of blood draws. Despite studying a healthy cohort, up to 5% of subjects have VWF antigen levels of 56% or less (supplemental Table 2), which is a level close to that found in patients with type 1 VWD. As an epidemiologic study to investigate the etiology and natural history of atherosclerosis and clinical atherosclerotic disease, ARIC collected limited information on bleeding phenotypes. It therefore remains unclear as to whether these subjects have a bleeding tendency.

Finally, it has been well established that ABO blood group has a significant impact on circulating VWF level. Subjects with O blood group have consistently lower levels of VWF, showing an average 25% reduction compared with non-O blood groups,59 potentially caused by a greater rate of clearance.12,13,20 Among the 4 SNPs that we used to type ABO blood groups, rs514659, which tags O blood type, has the strongest correlation, contributing to 15.4% of log VWF variation. The finding is consistent with previous reports. The second strongest association is found with rs8176704 (type A tag), which contributes to 4.1% of log VWF variation. The association of VWF SNPs and haplotypes with VWF levels grew stronger after ABO adjustment. The data suggest that either there is a dosing effect of multiple SNPs (as suggested by comparing the effect size between individual SNPs and haplotypes) or we have further localized the SNPs that have a causal link to VWF level.

In conclusion, we have analyzed the subjects of European descent within the ARIC cohort. We found multiple SNPs and derivative haplotypes that are significantly associated with levels of circulating VWF antigen. These SNPs are almost exclusively localized to a 50-kb region of the VWF gene that encodes the propeptide and domains that regulate VWF multimerization. Our data suggest that this region plays an important role in regulating VWF antigen in the circulation by controlling the rate of gene splicing, through miRNA regulation, or by serving as a marker for other regulatory elements in and beyond the VWF gene.


Contribution: M.C. and A.R.F. designed the study, analyzed the data, and wrote the manuscript; W.S. and C.B. analyzed the data and wrote the manuscript; F.Y. constructed and analyzed the haplotype and wrote the manuscript; M.B. performed VWF SNP screen and identification; W.T. performed imputation of ABO information; L.E.C. developed the hypothesis and analyzed the data; K.K.W. wrote the manuscript; E.B. developed the hypothesis, designed the study, and wrote the manuscript; and J.D. developed the hypothesis, designed the study, analyzed the data, and wrote the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Jing-fei Dong, Section of Cardiovascular Sciences, Department of Medicine, BCM286, N1319, One Baylor Plaza, Baylor College of Medicine, Houston, TX 77030; e-mail: jfdong{at}


The authors thank the staff and participants of the ARIC study for their important contributions.

The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute (contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, and N01-HC-55022). This work was supported by the ARIC contract and the National Heart, Lung, and Blood Institute (grant HL71895).



  • * M.C., W.S., and F.Y. contributed equally to this study.

  • The online version of this article contains a data supplement.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted August 3, 2010.
  • Accepted January 30, 2011.

