Blood Journal
Leading the way in experimental and clinical research in hematology

Population-specific genetic variants important in susceptibility to cytarabine arabinoside cytotoxicity

  1. Christine M. Hartford1,
  2. Shiwei Duan2,
  3. Shannon M. Delaney2,
  4. Shuangli Mi2,
  5. Emily O. Kistner3,
  6. Jatinder K. Lamba4,
  7. R. Stephanie Huang2, and
  8. M. Eileen Dolan2
  Departments of 1Pediatrics, 2Medicine, and 3Health Studies, University of Chicago, IL; and 4Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis


Cytarabine arabinoside (ara-C) is an antimetabolite used to treat hematologic malignancies. Resistance is a common reason for treatment failure with adverse side effects contributing to morbidity and mortality. Identification of genetic factors important in susceptibility to ara-C cytotoxicity may allow for individualization of treatment. We used an unbiased whole-genome approach using lymphoblastoid cell lines derived from persons of European (CEU) or African (YRI) ancestry to identify these genetic factors. We interrogated more than 2 million single nucleotide polymorphisms (SNPs) for association with susceptibility to ara-C and narrowed our focus by concentrating on SNPs that affected gene expression. We identified a unique pharmacogenetic signature consisting of 4 SNPs explaining 51% of the variability in sensitivity to ara-C among the CEU and 5 SNPs explaining 58% of the variation among the YRI. Population-specific signatures were secondary to either (1) polymorphic SNPs in one population but monomorphic in the other, or (2) significant associations of SNPs with cytotoxicity or gene expression in one population but not the other. We validated the gene expression-cytotoxicity relationship for a subset of genes in a separate group of lymphoblastoid cell lines. These unique genetic signatures comprise novel genes that can now be studied further in functional studies.


Cytarabine arabinoside (ara-C) is an antimetabolite used primarily for the treatment of hematologic malignancies and is the mainstay of treatment for acute myeloid leukemia (AML). The inclusion of ara-C in the treatment regimens for AML has resulted in an improvement in remission rates and overall survival in both adults and children.1-3 In a recently published study from the Children's Oncology Group of 901 persons younger than 21 years who were treated for AML, the 5-year overall survival was 52%,4 although among certain groups of patients the outlook is better. For example, among patients of the same study who underwent matched related donor bone marrow transplantation, a 68% overall survival was achieved. Among adults, a study from the Cancer and Leukemia Group B of 474 patients younger than 60 years demonstrated a 34% 5-year overall survival rate.5 However, further improvements are needed. Resistance to chemotherapy, including ara-C, is a major reason for treatment failure among patients with AML.6-10 Treatment with ara-C is also associated with several adverse side effects, including myelosuppression, infections, mucositis, neurotoxicity, and acute pulmonary syndrome.11-14 Greater sensitivity to the cytotoxic effect of ara-C may translate into an increased risk of adverse side effects in host normal tissue.

Candidate gene approaches have been used to identify genetic variables that are important in resistance and susceptibility to ara-C. These studies have mainly focused on genes in the pharmacokinetic pathway of ara-C, including deoxycytidine kinase (DCK),8,15-19 cytidine deaminase (CDA),8,20,21 5′-nucleotidase (NT5C2),8,19,22 and human equilibrative nucleoside transporter 1 (hENT1).8,21 Although genetic factors important in the pharmacodynamic effects of ara-C have not been studied as extensively, those that have been studied include DNA polymerase,6 topoisomerase I and II,6 and bcl-2.23 Although some of the more recent studies have considered DNA sequence variation and alternative splicing of these candidate genes,15,16,18,24 the major focus has been on variation in gene expression in leukemic blasts. Some studies have shown an association between expression and either sensitivity to ara-C or outcome; however, the true contribution of genetic variation of these candidate genes to susceptibility to ara-C remains inconclusive. In addition, the genetic contribution to ara-C-induced toxicity in germ line DNA has not been comprehensively evaluated.

Clinical trials demonstrate race-specific differences in outcomes and toxicities among patients with AML.25-27 These patients receive ara-C as a main component of their treatment regimen, raising the possibility that the pharmacogenetics of this agent may play a role in differences in outcomes. In addition, Lamba et al observed higher mRNA expression of DCK in lymphoblastoid cell lines (LCLs) of African ancestry compared with those of European ancestry and identified a single nucleotide polymorphism (SNP) in DCK that was associated with both DCK expression in the LCLs and lower blast ara-C-5′-triphosphate (ara-CTP) concentrations in patients administered ara-C.15

To identify population-specific genetic determinants that contribute to susceptibility to ara-C, we first examined SNPs in DCK and then applied a whole-genome pharmacogenomic approach to cellular susceptibility to ara-C in 2 distinct populations. A similar approach has been used in evaluating pharmacodynamic genes important in cisplatin and etoposide through the use of LCLs with publicly available genotypic data.28,29 These cell lines provide a well-controlled, reproducible system free from confounding factors, such as patient comorbidities and drug-drug interactions. Furthermore, cell lines derived from persons of different populations allow for the identification of population-specific genetic determinants. Identifying genetic factors that are responsible for variation in both pharmacokinetic and pharmacodynamic genes will be useful in designing pharmacogenomic endpoints for clinical trials aimed at identifying patients who may be at increased risk for toxicity (because of increased sensitivity to ara-C) or treatment failure (because of decreased sensitivity to ara-C) and therefore require either dose modifications or alternative therapies.


Cell lines

International HapMap EBV-transformed LCLs were purchased from Coriell Institute for Medical Research (Camden, NJ). Cell lines were derived from Utah residents with ancestry from northern and western Europe (HAPMAPT01, CEU) and from Yoruba persons in Ibadan, Nigeria (HAPMAPT03, YRI). Details on the origin of the cell lines are available online. Cell lines comprised trios of mother, father, and offspring. Cell lines were maintained in RPMI 1640 (Mediatech, Herndon, VA) supplemented with 15% fetal bovine serum (HyClone Laboratories, Logan, UT) and 1% L-glutamine (Invitrogen, Carlsbad, CA). Cell lines were passaged 3 times per week at a concentration of 350 000 cells/mL and kept at a temperature of 37°C, with 5% CO₂ and 95% humidity.

Drug and nucleotides

Cytosine β-D-arabinofuranoside (ara-C), guanosine diphosphate, cytidine triphosphate, adenosine triphosphate, uridine triphosphate, thymidine triphosphate, and guanosine triphosphate were purchased from Sigma-Aldrich (St Louis, MO). Ara-C was prepared in phosphate-buffered saline (PBS; pH 7.4; Invitrogen) immediately before each experiment.

Cytotoxicity assay

Cell growth inhibition as measured by alamarBlue (BioSource International, Camarillo, CA), a colorimetric-based assay,30,31 was used to determine the cytotoxicity of ara-C. Absorbance readings after alamarBlue treatment can be used to quantify cell proliferation and viability.30,31 By comparing proliferation in cells exposed to drug relative to that of unexposed cells, a measure of cytotoxicity is obtained. This assay has been shown to compare favorably to the 3-[4,5-dimethylthiazol-2-yl]-2,5-diphenyl tetrazolium bromide (MTT) assay30,32 as a measure of cytotoxicity.

The cytotoxicity assay was performed as previously described.33 ara-C was dissolved in PBS. The percent survival values at each concentration were determined after 72 hours of exposure to 1, 5, 10, 40, and 80 μM drug and plotted against ara-C concentrations to generate a survival curve. The area under the survival curve (AUC) was calculated for each cell line using the trapezoidal rule. All AUC values were log2-transformed before statistical modeling, creating a dependent variable from an approximately normal distribution.
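The AUC calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code. It assumes that the curve is integrated on a linear concentration axis from 1 to 80 μM, with no point added at 0 μM, and the survival values are hypothetical.

```python
import math

# Drug concentrations used in the assay (μM), as stated in the text.
CONCENTRATIONS_UM = [1, 5, 10, 40, 80]


def survival_auc(concentrations, percent_survival):
    """Area under the survival curve by the trapezoidal rule (units: %·μM)."""
    if len(concentrations) != len(percent_survival):
        raise ValueError("need one survival value per concentration")
    auc = 0.0
    for i in range(1, len(concentrations)):
        width = concentrations[i] - concentrations[i - 1]
        mean_height = (percent_survival[i] + percent_survival[i - 1]) / 2.0
        auc += width * mean_height
    return auc


# Hypothetical percent-survival values for one cell line (illustration only).
example_survival = [73.0, 53.0, 47.0, 40.0, 37.0]

auc = survival_auc(CONCENTRATIONS_UM, example_survival)
log2_auc = math.log2(auc)  # log2 transform applied before statistical modeling
```

With these hypothetical inputs the log2 AUC is about 11.7, the same order as the values reported for the HapMap cell lines.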

Cell proliferation, population, and sex differences

To examine the effect of the rate of cellular proliferation on susceptibility to ara-C, the proliferation rate was calculated for each untreated cell line at the time of the cytotoxicity experiment for that cell line. Correlation between log2-transformed proliferation rate and log2-transformed AUC was tested using a general linear regression approach such that trios were analyzed as independent units and the covariance was modeled as previously described.28 This analysis was performed in the combined CEU and YRI cell lines with a population indicator as a covariate. In addition, the effects of population and sex on cytotoxicity were explored using the linear model framework as described.33

Association analysis of SNPs within DCK to ara-C cytotoxicity in CEU and YRI samples

As previously reported, 64 SNPs within DCK were identified in the CEU and YRI HapMap samples.15 Genotypes of these SNPs were tested for association with ara-C cytotoxicity. General linear models were constructed with log2-transformed AUC as the dependent variable in each population separately. To begin, the additive effect of each SNP was tested as an independent predictor of AUC. Using SNPs significant in the univariate models at the α = .05 level, multivariable models were reduced using a backwards elimination approach. SNPs included in the final models were statistically significant at the α = .05 level. To quantify the amount of variation in percentage survival or AUC explained by the SNPs, an estimate of r² was computed using an approach described previously.28
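To make the additive coding concrete, the sketch below codes each genotype as the number of copies of one allele (0, 1, or 2) and fits an ordinary least-squares line to log2 AUC. It is a simplified illustration under stated assumptions: it treats cell lines as independent, whereas the analysis above models the covariance within trios, and the genotypes and AUC values are hypothetical.

```python
def additive_code(genotype, counted_allele):
    """Number of copies (0, 1, or 2) of the counted allele in a genotype such as 'CG'."""
    return genotype.count(counted_allele)


def simple_regression(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical genotypes and log2 AUC values for six cell lines.
genotypes = ["CC", "CG", "GG", "CG", "CC", "GG"]
log2_auc = [11.2, 11.5, 11.8, 11.6, 11.3, 11.9]

allele_counts = [additive_code(g, "G") for g in genotypes]
slope, intercept, r_squared = simple_regression(allele_counts, log2_auc)
```

Here `slope` is the estimated change in log2 AUC per additional copy of the counted allele, and `r_squared` is the fraction of variation explained by that single SNP.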

DCK Western blot

Antirabbit DCK-C-terminal antibody was obtained from Abgent (San Diego, CA). Cells were lysed in radioimmunoprecipitation assay buffer (Santa Cruz Biotechnology, Santa Cruz, CA). Proteins were separated by 12% sodium dodecyl sulfate–polyacrylamide gel electrophoresis and transferred to polyvinylidene difluoride membrane. The blots were blocked overnight with 5% nonfat dry milk in PBS, containing 0.05% Tween-20, and they were probed with rabbit anti–human DCK polyclonal C-terminal antibody (Abgent) for 2 hours at room temperature, followed by incubation with horseradish peroxidase-conjugated anti–rabbit IgG (Santa Cruz Biotechnology). Immunocomplexes were visualized by an enhanced chemiluminescence kit (GE Healthcare, Little Chalfont, United Kingdom) according to the manufacturer's protocols.

Association of levels of ara-CTP with DCK SNPs

LCLs were diluted to 0.5 × 10⁶ cells/mL media 24 hours before treatment with 1 mM ara-C for 6 hours. At 6 hours, 20.5 × 10⁶ cells were pelleted and snap frozen until nucleotide extraction. Levels of ara-CTP were measured by high performance liquid chromatography (HPLC). Figure S1 (available on the Blood website; see the Supplemental Materials link at the top of the online article) shows full details of the HPLC method.

Each cell line was treated at least twice in duplicate. With each set of cell treatments, GM18858 was treated to serve as a control. All ara-CTP levels were calculated relative to this cell line on that treatment date. The relative levels of ara-CTP were then analyzed by t test according to DCK SNP genotypes.
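The normalization to the control cell line can be illustrated as follows. This is a hedged sketch rather than the authors' procedure: the function name, data layout, and measurement values are hypothetical, and it assumes that replicate control measurements on a given date are averaged before the ratio is taken.

```python
from collections import defaultdict
from statistics import mean

CONTROL_LINE = "GM18858"  # control cell line named in the text


def relative_ara_ctp(measurements):
    """measurements: iterable of (treatment_date, cell_line, ara_ctp_level).

    Returns {cell_line: mean ara-CTP level relative to the control
    cell line treated on the same date}.
    """
    control_by_date = defaultdict(list)
    for date, line, level in measurements:
        if line == CONTROL_LINE:
            control_by_date[date].append(level)

    ratios = defaultdict(list)
    for date, line, level in measurements:
        if line == CONTROL_LINE:
            continue
        if date not in control_by_date:
            raise ValueError(f"no control measurement on {date}")
        ratios[line].append(level / mean(control_by_date[date]))

    return {line: mean(values) for line, values in ratios.items()}


# Hypothetical measurements (arbitrary units); LINE_A and LINE_B are placeholders.
example = [
    ("day1", "GM18858", 100.0),
    ("day1", "LINE_A", 150.0),
    ("day2", "GM18858", 80.0),
    ("day2", "LINE_A", 100.0),
    ("day2", "LINE_B", 40.0),
]
relative_levels = relative_ara_ctp(example)
```

The resulting relative levels would then be grouped by DCK SNP genotype and compared by t test.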

Whole-genome analysis of genotype and cytotoxicity association

SNP genotypes were downloaded from the International HapMap database, release 22. SNPs with evidence of Mendelian allele transmission errors and those with a minor allele frequency less than 5% were filtered out, giving a total of more than 2 million SNPs for the association analysis in each population.
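The minor allele frequency filter can be sketched as below. This is a minimal illustration assuming biallelic SNPs with genotypes stored as two-character strings; the SNP identifiers are placeholders, the Mendelian transmission check is not implemented, and whether frequencies were computed in all samples or in founders only is not stated in the text.

```python
from collections import Counter

MAF_THRESHOLD = 0.05  # SNPs with minor allele frequency below 5% are removed


def minor_allele_frequency(genotypes):
    """Minor allele frequency for a biallelic SNP.

    genotypes: list of 2-character strings such as 'AG'; None marks a missing call.
    A monomorphic SNP (only one allele observed) has frequency 0.0.
    """
    counts = Counter()
    for genotype in genotypes:
        if genotype is not None:
            counts.update(genotype)  # counts each of the two alleles
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    return min(counts.values()) / total


def filter_by_maf(snp_genotypes, threshold=MAF_THRESHOLD):
    """Return the SNP identifiers whose minor allele frequency is at least the threshold."""
    return [
        snp
        for snp, genotypes in snp_genotypes.items()
        if minor_allele_frequency(genotypes) >= threshold
    ]


# Hypothetical genotype data; SNP names are placeholders, not real rs numbers.
example_genotypes = {
    "snp_common": ["AA", "AG", "GG", "AG"],
    "snp_rare": ["CC"] * 19 + ["CT"],
    "snp_monomorphic": ["TT", "TT", None],
}
kept = filter_by_maf(example_genotypes)
```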

The quantitative transmission disequilibrium test (QTDT) was performed to identify genotype-cytotoxicity associations using QTDT software. This was performed separately within each population, with P less than or equal to 1 × 10⁻⁴ considered statistically significant. To account for the possibility of multiple testing errors, the false discovery rate was also calculated.

Analysis of genotype and gene expression association

Assessment of gene expression in the LCLs was performed using the Affymetrix GeneChip Human Exon 1.0 ST array (Affymetrix, Santa Clara, CA), as previously described.35 A second QTDT test that integrated the SNPs identified from the genotype-cytotoxicity QTDT analysis and mRNA level gene expression was then performed to identify genotype-gene expression associations as previously described,28 with the exception that transcript clusters with average intensity greater than 5.34 (the top 75%) were included in this association analysis, resulting in 13 314 transcripts analyzed. The P value cutoff for significant association was 3 × 10⁻⁶, which is corrected for the number of transcript clusters tested. The analysis was performed independently for each population. All raw exon-array data have been deposited into Gene Expression Omnibus (accession no. GSE7761).36

Analysis of gene expression and cytotoxicity

To examine the relationship between gene expression and sensitivity to ara-C, general linear models were constructed as previously described28 with the log2-transformed percent cell survival values after treatment with 1, 5, 10, 40, and 80 μM as well as the AUC as the dependent variables, and the log2-transformed gene expression level, together with an indicator for gender, as the independent variables. Genes identified in the genotype-expression QTDT association analysis were included in this analysis. This analysis was performed independently in each population. P values less than .05 were considered statistically significant.

Model to predict ara-C phenotype with multiple SNPs

To examine the overall genetic variant contribution to variation in sensitivity to ara-C, additional general linear models were constructed using the log2-transformed percent survival or AUC value as the dependent variable. The independent variables included SNPs that were selected from the 2 QTDT models and the linear regression of expression on each of the transformed measurements of ara-C-induced cytotoxicity in the CEU or YRI population independently. Additive genetic effects were assumed for each SNP. Models were reduced using a backwards elimination approach, and an r² was estimated between percent survival or AUC and the predicted percent survival or AUC as previously described.28

Independent validation of phenotype, genotypes, and gene expression

Cellular sensitivity to ara-C was evaluated in an additional independent set of 49 unrelated CEU cell lines using the same method described earlier. The cell lines included in this analysis are listed in the Supplemental data. The percent survival at each concentration of ara-C was determined, and the AUC was calculated for each cell line.

Candidate gene expressions identified from the whole-genome analysis were validated in these 49 LCLs. Quantitative real-time polymerase chain reaction (RT-PCR) was performed to measure the level of expression of GIT1, SLC25A37, and P2RX1. Exponentially growing cells were diluted to a density of 3.5 × 10⁵ cells/mL per flask. A total of 5 × 10⁶ cells were pelleted and washed in ice-cold PBS and centrifuged to remove PBS. All pellets were flash frozen and stored at −80°C until RNA isolation. Total RNA was extracted using the RNeasy Mini kit (QIAGEN, Valencia, CA) following the manufacturer's protocol. RNA quality assessment and quantification were conducted using the optical spectrometry 260/280 nm ratio. Subsequently, mRNA was reverse transcribed to cDNA using Applied Biosystems High Capacity Reverse Transcription kit (Applied Biosystems, Foster City, CA). The final concentration of cDNA was 50 ng/μL. Quantitative RT-PCR was performed for GIT1, SLC25A37, and P2RX1 and an endogenous control (huB2M, beta-2-microglobulin; NM_004048.2) using TaqMan Gene Expression Assays (Applied Biosystems) on the Applied Biosystems 7500 real-time PCR system. Total reaction was carried out in 25 μL volume, which consisted of 12.5 μL 2× TaqMan Gene Expression PCR master mix, 1.25 μL primers and probe mix (final of 900 nM forward and reverse primers and 250 nM of probe), along with 10 μL of 1.25 ng/μL cDNA. The GIT1 (Hs01063104_m1), SLC25A37 (Hs00249767_m1), and P2RX1 (Hs00175686_m1) TaqMan primers and probes were labeled with the FAM reporter dye and the MGB quencher dye. huB2M primer/probe mixture was labeled with the VIC reporter dye and the MGB quencher dye. The thermocycler parameters were: 50°C for 2 minutes, 95°C for 10 minutes, and 40 cycles of 95°C for 15 seconds and 60°C for 1 minute. Each cycle threshold value obtained for GIT1, SLC25A37, and P2RX1 was normalized using huB2M independently.
A relative standard curve method was used to obtain the relative GIT1, SLC25A37, and P2RX1 expression in our LCL samples (see the online guide to performing relative quantitation of gene expression using real-time quantitative PCR), with the lowest expression set as the calibrator for all other LCLs. Each experiment was conducted a minimum of 2 times, and samples were run in triplicate for each experiment. Linear regression was then performed between the log2-transformed AUC and the relative GIT1, SLC25A37, and P2RX1 expressions. P less than .05 was considered statistically significant.
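The relative standard curve calculation can be sketched as follows. This is an illustration of the general method, not the authors' exact computation. It assumes that each assay has a standard curve of the form Ct = slope × log10(quantity) + intercept; the slope, intercept, Ct values, and sample names are hypothetical.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def relative_expression(samples, target_curve, reference_curve):
    """Relative expression of a target gene by the relative standard curve method.

    samples: {sample_name: (ct_target, ct_reference)}
    target_curve, reference_curve: (slope, intercept) of each standard curve.
    The sample with the lowest normalized expression is the calibrator (value 1.0).
    """
    normalized = {}
    for name, (ct_target, ct_reference) in samples.items():
        target_qty = quantity_from_ct(ct_target, *target_curve)
        reference_qty = quantity_from_ct(ct_reference, *reference_curve)
        normalized[name] = target_qty / reference_qty  # normalize to huB2M
    calibrator = min(normalized.values())
    return {name: value / calibrator for name, value in normalized.items()}


# Hypothetical standard curves (slope near -3.32 corresponds to ~100% efficiency).
target_curve = (-3.32, 35.0)
reference_curve = (-3.32, 35.0)

# Hypothetical mean Ct values: (target gene, huB2M) for two placeholder samples.
samples = {"LCL_A": (25.0, 20.0), "LCL_B": (27.0, 20.0)}
relative = relative_expression(samples, target_curve, reference_curve)
```

In this example LCL_B has the lowest normalized expression and serves as the calibrator, so LCL_A is expressed as a fold difference relative to it.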



Eighty-five CEU and 89 YRI cell lines were phenotyped for susceptibility to ara-C. Using the alamarBlue assay, the percent survival of each cell line at 5 different concentrations of ara-C was determined. These data were used to calculate the AUC for each cell line. There was a significant difference in percent survival at each concentration of drug (except 1 μM) between CEU and YRI populations (Figure 1A). The mean (± SD) log2 AUC was 11.71 (± 0.32 [%·μM]) in the CEU cell lines compared with 11.47 (± 0.28 [%·μM]) in the YRI lines (P < 1 × 10⁻⁴; Figure 1B). There was no difference in percentage survival or AUC between cell lines derived from females and those from males within either population.

Figure 1

Cytotoxicity of ara-C in CEU and YRI populations. (A) The mean percentage survival in the CEU compared with the YRI cell lines was 73.3 versus 70.5 at 1 μM (P = .24), 53.3 versus 47.3 at 5 μM (P = .002), 46.8 versus 39.9 at 10 μM (P = 1 × 10⁻⁴), 40.3 versus 32.8 at 40 μM (P < 1 × 10⁻⁴), and 37.3 versus 30.4 at 80 μM ara-C (P < 1 × 10⁻⁴). (B) The distribution of log2 AUC in CEU and YRI cell lines (P < 1 × 10⁻⁴).

Cell proliferation

The proliferation rate of the CEU and YRI HapMap cell lines was calculated at the time of each cytotoxicity experiment to analyze the effect of the rate of proliferation on susceptibility to ara-C. There was a strong association between cellular proliferation and susceptibility to ara-C, as measured by AUC, within each population (P < 1 × 10⁻⁴; Figure S2). This is not unexpected given the mechanism of action of ara-C. The population difference in susceptibility to ara-C between the CEU and YRI cell lines was also analyzed using the proliferation rate as a covariate. The difference in log2 AUC between the 2 populations remained significant (P < 1 × 10⁻⁴). Thus, the population difference in susceptibility is not explained by differences in rates of cellular proliferation of the populations.

Analysis of DCK expression and SNPs

Previous data using quantitative RT-PCR demonstrated a significantly higher level of DCK mRNA in YRI compared with CEU samples.15 We surmised that this could, at least partly, explain population differences observed in sensitivity to ara-C because higher DCK expression could translate into increased intracellular ara-CTP (the active form of the drug) and therefore increased cellular sensitivity. Expression data using the Affymetrix GeneChip Human Exon 1.0 ST array data also showed higher DCK mRNA levels in YRI compared with CEU (Figure 2A) and higher expression in both populations significantly correlated to cytotoxicity (Figure S3). In addition, we performed Western blots to quantify DCK protein levels in LCLs with various levels of DCK expression (Figure S4). There was a strong correlation between the level of DCK mRNA and protein expression, as well as a significant correlation between the level of DCK protein expression and sensitivity to ara-C (Figure 2B).

Figure 2

Analysis of DCK expression. (A) Distribution of DCK mRNA expression measured on the Affymetrix GeneChip Human Exon 1.0 ST array in the CEU and YRI cell lines (P = .02). (B) Association between level of DCK protein expression and log2 AUC in a subset of YRI cell lines (r² = 0.69, P = .04). (C) Association between DCK SNP genotype and log2 AUC (P = .02), DCK expression levels (P = .003), and intracellular ara-CTP (P = .003).

To understand the contribution of DCK SNPs to sensitivity to ara-C, we evaluated 64 SNPs previously identified within DCK15 for their association with cytotoxicity in the CEU and YRI populations. These SNPs were identified by sequencing 1.5 kb of the DCK proximal promoter and all 7 coding exons in the CEU and YRI HapMap samples. Nine of these SNPs are present in the HapMap project, whereas the remaining 55 SNPs are novel. Five of 64 SNPs (−33, 70, 2162, 31942, 36113) were associated with log2 AUC in the YRI samples, but only 1 of 64 (1124) was associated with log2 AUC in the CEU samples. After constructing multivariable models, one significant SNP explains 9% of the variation in AUC in the CEU samples and 3 of the 5 significant SNPs explain 20% of the variation in AUC in the YRI samples.

To examine the potential functional consequence of SNPs in DCK, we treated a subset of 42 YRI LCLs with different genotypes for 3 DCK SNPs (70, 31942, 36116) with 1 mM ara-C for 6 hours and then quantified the levels of ara-CTP by HPLC. LCLs that are heterozygous for SNP 70 demonstrated an increased sensitivity to ara-C in our genotype-AUC analysis (Figure 2C). Not only did these same cell lines demonstrate higher expression of DCK as measured by the exon array, but they also demonstrated significantly higher levels of ara-CTP compared with LCLs that are homozygous (Figure 2C), therefore suggesting that this SNP in DCK affects the function of the protein.

Lamba et al also performed functional studies of 3 nonsynonymous DCK-coding SNPs, including an assessment of activity of recombinant DCK protein as well as measuring DCK activity in LCLs with these coding SNPs.15 The activity of all 3 recombinant proteins was less than that of the wild-type protein. In addition, 2 of the recombinants demonstrated lower Km and Vmax compared with the wild-type. LCLs heterozygous for the coding SNPs demonstrated lower DCK activity compared with wild-type cell lines. These data provide evidence that genetic variation within the DCK gene can affect function of the protein. To assess other genetic variants contributing to phenotypic variation in cytotoxicity, we used an unbiased, whole-genome approach taking into consideration HapMap SNPs in the CEU and YRI populations.

Whole-genome association of genotype and cytotoxicity

We performed whole-genome analysis composed of 3 sequential steps within each population as illustrated in Figure S5. The first step was a QTDT association analysis between more than 2 million SNPs from the HapMap project and log2 AUC, as well as percent survival at each drug concentration (Figure S5A). Because cell survival for each drug concentration is interrelated, we specifically focused on AUC as representing a comprehensive phenotype for overall response to ara-C. The total number of SNPs identified as significantly associated with AUC was 505 in the CEU cell lines and 397 in the YRI. The 505 significant SNPs in the CEU population were located in or near 74 unique genes, whereas those in the YRI population were in or near 70 genes. A complete list of these SNPs and genes is provided in Table S1 (CEU) and Table S2 (YRI).

Association of genotypes with gene expression

To specifically identify those associated SNPs that act through affecting gene expression, we performed a second QTDT association analysis between the SNP genotypes identified in step 1 and the level of gene expression from the Affymetrix GeneChip Human Exon 1.0 ST array (Figure S5B). This was done independently for each population for each drug concentration as well as AUC. The number of transcript clusters (genes) used in this analysis was 13 314. For AUC, 26 of the 505 significant SNPs from the genotype-cytotoxicity QTDT analysis in the CEU cell lines were significantly associated with the expression of a total of 12 target genes. Within the YRI population, 33 of the initial 397 significant SNPs were associated with the level of expression of 36 unique genes. These SNPs and target genes are indicated in Tables S1 and S2.

Linear regression between gene expression and cytotoxicity

The “target genes” identified in the second QTDT association analysis were analyzed for the relationship of expression with ara-C cytotoxicity using linear regression analysis. There were 6 and 24 “target genes” significantly correlated in CEU and YRI, respectively. The final analysis resulted in 11 and 24 SNPs in CEU and YRI, respectively (Figure S5C; Table S1). Interestingly, there were no SNPs or target genes that overlapped between the 2 populations.
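The three sequential steps can be summarized as a chain of filters. The sketch below is a schematic illustration only: it takes precomputed P values as input (the QTDT and regression tests themselves are not implemented), the SNP and gene names are placeholders, and treating the step 2 cutoff as "less than or equal to" is an assumption.

```python
# Significance cutoffs stated in the Methods.
SNP_AUC_CUTOFF = 1e-4    # step 1: genotype vs cytotoxicity (QTDT)
SNP_EXPR_CUTOFF = 3e-6   # step 2: genotype vs gene expression (QTDT)
EXPR_AUC_CUTOFF = 0.05   # step 3: gene expression vs cytotoxicity (regression)


def sequential_filter(snp_auc_p, snp_expr_p, expr_auc_p):
    """Return sorted (snp, gene) pairs that pass all 3 steps.

    snp_auc_p:  {snp: P value for association with AUC}
    snp_expr_p: {(snp, gene): P value for association with gene expression}
    expr_auc_p: {gene: P value for correlation of expression with AUC}
    """
    step1 = {snp for snp, p in snp_auc_p.items() if p <= SNP_AUC_CUTOFF}
    step2 = {
        (snp, gene)
        for (snp, gene), p in snp_expr_p.items()
        if snp in step1 and p <= SNP_EXPR_CUTOFF
    }
    return sorted(
        (snp, gene)
        for snp, gene in step2
        if expr_auc_p.get(gene, 1.0) < EXPR_AUC_CUTOFF
    )


# Hypothetical P values; identifiers are placeholders.
snp_auc_p = {"snp_a": 2e-5, "snp_b": 5e-5, "snp_c": 0.01}
snp_expr_p = {
    ("snp_a", "GENE_X"): 1e-6,
    ("snp_b", "GENE_Y"): 1e-3,
    ("snp_c", "GENE_X"): 1e-8,
}
expr_auc_p = {"GENE_X": 0.003, "GENE_Y": 0.2}

signature_candidates = sequential_filter(snp_auc_p, snp_expr_p, expr_auc_p)
```

Only the pair that passes every step remains; a SNP that fails step 1 is dropped even if its association with expression is strong.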

Predicting cytotoxicity with multiple SNPs

By identifying SNPs that both associated with susceptibility to ara-C cytotoxicity and the expression of a gene, and further focusing on the subset of these SNPs that associated with genes whose expression correlated with ara-C cytotoxicity, a genetic signature for susceptibility to the cytotoxic effects of ara-C was identified. To quantitatively evaluate the contribution of each SNP to this susceptibility, a linear model of cytotoxicity was constructed with multiple SNPs as predictors. In the CEU population, 4 SNPs of the final 11 tested were identified that explained 51% of the variation in ara-C AUC; whereas in the YRI cell lines, 5 SNPs of 24 tested explained 58% of this variation. The individual SNPs and target genes included in the final model for each population are indicated in Table 1.

Table 1

SNPs and target genes comprising the genetic signature for susceptibility to ara-C from the whole genome analysis for AUC in both CEU and YRI populations

Population-specific associations

There was no overlap among either the SNPs or target genes between the 2 populations for any step of the analysis. We identified 2 general patterns to explain this population specificity. The first was explained by genetic variation of a given SNP in one population and not the other. For example, in the CEU population, SNP rs17808412 was associated with AUC and the expression of the gene GIT1. The CC genotype was associated with a lower AUC and higher expression of GIT1 compared with the CG and GG genotypes (Figure 3A,B). However, in the YRI population, all cell lines have the GG genotype for rs17808412, and GIT1 expression levels are consistent with the levels in CEU cell lines harboring the GG genotype (Figure 3A,B). In CEU, this SNP was also associated with cellular sensitivity to 1, 5, and 10 μM ara-C, was associated with the expression of GIT1, and was included in the final multivariable SNP model for each of these phenotypes. This single SNP was shown to contribute 36%, 34%, 29%, and 21% to the variability in cytotoxicity for 1, 5, and 10 μM ara-C and AUC, respectively, indicating that this SNP may potentially be useful for predicting susceptibility to ara-C in whites.

Figure 3

SNP rs17808412 and GIT1. In the CEU population, rs17808412 demonstrated a significant association between SNP genotype and both (A) ara-C AUC (P = 1 × 10⁻⁵) and (B) the level of expression of GIT1 (P = 1 × 10⁻⁶). (A,B) In the YRI population, this SNP is not variable, with all cell lines having the (GG) genotype. (C) Expression of GIT1 and AUC was significantly correlated in the CEU population (r² = 0.200, P = 7 × 10⁻⁵). (D) This correlation was validated in an independent set of CEU cell lines.

We also identified YRI-specific genetic variation with the association of SNP rs10973320 with AUC and the expression of RAD51AP1 (Figure 4A,B). Cell lines with the AA genotype had higher AUC and lower RAD51AP1 expression compared with cell lines with the AT or TT genotypes. All CEU cell lines have the TT genotype (Figure 4A,B). rs10973320 not only associated with AUC in the YRI population but also with percent survival after exposure to 1, 5, 10, 40, and 80 μM ara-C.

Figure 4

rs10973320 and RAD51AP1. rs10973320 demonstrated genetic variability in the YRI population as well as an association between genotype and both (A) AUC (P = 2 × 10⁻⁵) and (B) expression of RAD51AP1 (P = 4 × 10⁻⁸). (A,B) This SNP is not variable in the CEU population. (C) Expression of RAD51AP1 and AUC were significantly correlated in the YRI population.

The lack of genetic variability in one population as an explanation of population-specific significance of particular SNPs was the case for a small fraction of SNPs (5 of 35). In most instances, all 3 genotypes were present; however, significant genotype-phenotype relationships were only found in one population. This is illustrated by SNP rs2775139, which was associated with AUC in the CEU population but not the YRI (Figure 5A). In the CEU population, this SNP was further associated with the expression of SLC25A37 (Figure 5B). CEU cell lines having the CC compared with CT and TT genotypes had increased SLC25A37 expression and greater sensitivity to ara-C. Although this SNP does demonstrate genetic variability in the YRI cell lines, there was no difference in either ara-C AUC or SLC25A37 expression for the various genotypes (Figure 5A,B). In the CEU cell lines, SNP rs2775139 was associated with AUC and with percent survival at 5, 10, 40, and 80 μM ara-C, was associated with the expression of SLC25A37, and was included in the final multivariable SNP models for 5 and 80 μM and for AUC.

Figure 5

rs2775139 and SLC25A37. rs2775139 is variable in both CEU and YRI cell lines. In the CEU cell lines, genotype is associated with both (A) AUC (P = 8 × 10⁻⁶) and (B) expression of SLC25A37 (P = 1 × 10⁻⁶). Among the YRI cell lines, all genotypes demonstrate similar (A) AUC and (B) SLC25A37 expression. (C) Expression of SLC25A37 and AUC was significantly correlated in the CEU population (r² = 0.169, P = 7 × 10⁻⁴). (D) This association was validated in an independent set of CEU cell lines in which AUC significantly correlated with expression of SLC25A37 (r² = 0.0817, P = .05).

For each “target gene” identified from this 3-step sequential approach, we examined the correlation between the level of expression and AUC. Among the CEU cell lines, increased expression of both GIT1 and SLC25A37 confers increased sensitivity to ara-C (Figures 3C,D and 5C,D, respectively). In the YRI cell lines, the increased level of expression of RAD51AP1 significantly associated with increased sensitivity to ara-C (P = .003; Figure 4C).


To validate the gene expression relationships from the whole-genome association, we treated an additional set of 49 unrelated CEPH cell lines with ara-C and evaluated the level of expression by quantitative RT-PCR of a subset of the candidate genes. The mean (± SD) AUC in these cell lines was 11.5 (± 0.4 [%·μM]). Linear regression analysis of the level of GIT1 and SLC25A37 expression and AUC in the 49 cell lines reproduced the relationship between GIT1 and SLC25A37 expression and sensitivity to ara-C seen in the HapMap cell lines. A higher level of expression of both GIT1 and SLC25A37 was associated with decreased AUC and therefore increased sensitivity to ara-C (P = .04 and P = .05, respectively; Figures 3D, 5D). P2RX1 expression did not associate with AUC (P > .05).


Clinical trials have indicated an ethnic difference in outcome among patients with AML.25–27 Understanding the contribution of pharmacogenetics to interindividual and interethnic differences in response to the drugs used to treat AML could help individualize chemotherapy and therefore potentially improve outcomes among persons with this disease. To characterize the genetic contribution to this variation for ara-C, one of the primary agents used to treat AML, we confirmed and extended a previous analysis of genetic variation within DCK, a candidate gene, and used an unbiased approach that coupled results of an in vitro cytotoxicity assay with whole-genome association analyses. We took into account cellular proliferation, evaluated more than 2 million genetic variants, and found population-specific genetic determinants. In an independent set of samples, we validated the correlation between sensitivity to ara-C and the expression of 2 novel candidates identified from the whole-genome analysis, GIT1 and SLC25A37.

Cell lines derived from the YRI population were significantly more sensitive to ara-C than those derived from the CEU population. Further, the candidate approach evaluating DCK and the whole-genome analyses identified a unique pharmacogenetic signature for susceptibility to ara-C in each population. The candidate gene approach identified 1 and 3 SNPs within DCK, which contribute 9% and 20% to the variation in AUC in the CEU and YRI populations, respectively. The SNPs evaluated in DCK included 9 HapMap SNPs, whereas the remainder were not in the HapMap, therefore providing a more in-depth interrogation of this candidate gene. These SNPs explain some but not all of the variability in ara-C cytotoxicity. We therefore interrogated the entire genome for additional variants. The whole-genome approach identified 11 and 24 SNPs acting through gene expression of 6 and 24 genes in CEU and YRI, respectively. There is no overlap in either these SNPs or target genes between the 2 populations. This can be explained either by a difference in allele frequency between the 2 populations (5 SNPs) or, for most SNPs (n = 30), by an association that is present in one population but not in the other. The latter could be attributed to SNPs in linkage disequilibrium with a causal SNP that associates with ara-C cytotoxicity and is not variable in one of the populations. Another possibility is that these SNPs have differential effects in the 2 populations.

Patients with AML, all of whom receive ara-C as a significant part of their therapeutic regimen, differ in outcome by race. In a study by the Children's Oncology Group of pediatric patients with AML treated on 2 consecutive multi-institutional trials, black patients had significantly worse survival compared with whites.25 A similar study of patients treated at St Jude Children's Research Hospital on 5 consecutive trials did not demonstrate a significant difference in outcome between black and white patients overall; however, there was a trend to worse outcome in black patients treated on the most recent trial in which most patients received ara-C–based consolidation rather than stem cell transplantation.26 Pharmacogenetic differences between patients are one potential explanation for these differences in outcome. Our study may begin to elucidate population-specific genetic variants, with the caveat that the cell lines we studied are derived from Yoruba persons residing in Ibadan, Nigeria; the findings may therefore only partially represent the black population because of admixture in this population.37 This difference may affect whether these variants can be validated in a clinical setting of African Americans.

The SNPs and genes identified as the genetic signature for ara-C cytotoxicity in both the CEU and YRI populations are novel, and our findings represent a unique set of genetic variables for further study. One interesting finding was with GIT1. GIT1 (G protein–coupled receptor kinase–interacting protein 1) acts as an intracellular scaffolding protein that interacts with numerous intracellular proteins and is involved in diverse processes, including agonist-coupled receptor endocytosis and focal adhesion assembly.38 It also acts as a scaffold for certain intracellular signaling cascade proteins, including those in the MAP kinase pathway, such as MEK1 and ERK1/2.38–40 Studies have shown that overexpression of GIT1 prolongs stimulation of ERK1/2 by epidermal growth factor and decreased GIT1 expression inhibits this stimulation.39,40 ERK1/2, in turn, has been shown to have proapoptotic effects in response to DNA-damaging stimuli, including chemotherapeutic agents.41 It may be possible that increased GIT1 expression leads to increased ERK1/2 activity, which then results in increased apoptosis in response to ara-C. A study of gene expression of AML blasts from patients demonstrates that GIT1 is expressed in myeloid tumor cells.42

Another candidate gene identified in our analysis is RAD51AP1 in the YRI population. RAD51AP1 (RAD51-associated protein 1), via interaction with RAD51, is involved in homologous recombination.43,44 Studies have demonstrated increased sensitivity to DNA damage by mitomycin C, camptothecin, cisplatin, and ionizing radiation in cells depleted of RAD51AP1.43–45 Our data demonstrated the opposite association, with increased sensitivity to ara-C associated with increased expression of this gene. This may be related to an alternative mechanism of action of RAD51AP1 in response to antimetabolites.

SLC25A37 was another gene found in the HapMap samples and validated in a separate set of non-HapMap samples. This gene is a member of the SLC25 solute carrier family. Studies in zebrafish demonstrated that this gene acts to import iron into mitochondria and is involved in heme biosynthesis.46 Interestingly, intracellular iron concentration has been shown to be related to ara-C cytotoxicity. In a study of leukemia cell lines, exposure to desferrioxamine and therefore depletion of intracellular iron resulted in increased sensitivity to ara-C.47

Well-defined prognostic factors of AML outcome exist, one of which is the cytogenetic abnormalities in AML blasts at initial diagnosis.48–50 For example, persons with t(8;21) and inv(16) karyotypes are recognized as having more favorable outcomes, whereas del(7) and complex karyotypes, among others, are associated with an adverse prognosis. However, within each cytogenetic subgroup there is heterogeneity, and other biologic and treatment factors are also associated with prognosis. The addition of pharmacogenetic variants may help to further define subgroups within each cytogenetic category and therefore further help to prognosticate risk. It may also be the case that the pharmacogenetic variants identified through this genome-wide association study and other genomic analyses will be of greater importance in certain cytogenetic subgroups. This and other refinements in the application of pharmacogenetic variants are worthy of further study.

Using an unbiased whole-genome approach in LCLs, we were able to identify unique genetic signatures for susceptibility to the cytotoxic effects of ara-C in CEU and YRI cell lines. The SNPs and target genes included in these signatures are novel. We plan to perform functional validation studies of a subset of these SNPs and genes to provide further support for the results of this analysis, and evaluate these novel candidates in a cohort of patients treated with ara-C to determine their role in patient response and toxicity. Our ultimate goal is to develop a genetic signature that can be applied clinically to identify patients at risk for either increased or decreased susceptibility to ara-C cytotoxicity. This study represents the first step in achieving this goal.

Supplementary PDF file available online.



Contribution: C.M.H. performed cytotoxicity experiments, participated in data interpretation, and wrote the manuscript; S.D. performed genome-wide association analyses and edited the manuscript; S.M.D. performed qRT-PCR, HPLC, and edited the manuscript; S.M. performed DCK Western blots; E.O.K. performed biostatistical analyses, participated in data interpretation, and edited the manuscript; J.K.L. performed DCK genotyping and edited the manuscript; R.S.H. participated in data interpretation and edited the manuscript; M.E.D. designed the study, participated in data interpretation, and wrote the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: M. Eileen Dolan, University of Chicago, 5841 S Maryland Avenue, Box MC2115, Chicago, IL 60637; e-mail: edolan{at}


The authors thank Steve Wisel for excellent technical support in maintaining the cell lines and T. A. Clark, T. X. Chen, A. C. Schweitzer, and J. E. Blume (Expression Research, Affymetrix Laboratory, Affymetrix, Santa Clara, CA) for their contribution in generating Exon Array data.

This Pharmacogenetics of Anticancer Agents Research Group study was supported by National Institutes of Health/National Institute of General Medical Sciences (grant GM61393 and data deposits supported by GM61374). C.M.H. was supported by the National Institutes of Health (T32 GM007019).


  • The online version of this article contains a data supplement.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted May 6, 2008.
  • Accepted December 16, 2008.

