Blood Journal
Leading the way in experimental and clinical research in hematology

Genetic variation in 1253 immune and inflammation genes and risk of non-Hodgkin lymphoma

  1. James R. Cerhan1,
  2. Stephen M. Ansell2,
  3. Zachary S. Fredericksen3,
  4. Neil E. Kay2,
  5. Mark Liebow4,
  6. Timothy G. Call2,
  7. Ahmet Dogan5,
  8. Julie M. Cunningham6,
  9. Alice H. Wang3,
  10. Wen Liu-Mares1,
  11. William R. Macon5,
  12. Diane Jelinek7,
  13. Thomas E. Witzig2,
  14. Thomas M. Habermann2, and
  15. Susan L. Slager3
  1. 1Division of Epidemiology, Department of Health Sciences Research;
  2. 2Division of Hematology, Department of Medicine;
  3. 3Division of Biostatistics, Department of Health Sciences Research;
  4. 4Division of General Internal Medicine, Department of Medicine;
  5. 5Division of Hematopathology, Department of Laboratory Medicine and Pathology;
  6. 6Division of Experimental Pathology, Department of Laboratory Medicine and Pathology; and
  7. 7Department of Immunology; all at the Mayo Clinic College of Medicine; Rochester, MN


Smaller-scale evaluations suggest that common genetic variation in candidate genes related to immune function may predispose to the development of non-Hodgkin lymphoma (NHL). We report an analysis of variants within genes associated with immunity and inflammation and risk of NHL using a panel of 9412 single-nucleotide polymorphisms (SNPs) from 1253 genes in a study of 458 patients with NHL and 484 frequency-matched controls. We modeled haplotypes and risk of NHL, as well as the main effects for all independent SNPs from a gene in multivariate logistic regression models; we separately report results for nonsynonymous (ns) SNPs. In gene-level analyses, the strongest findings (P ≤ .001) were for CREB1, FGG, MAP3K5, RIPK3, LSP1, TRAF1, DUSP2, and ITGB3. In nsSNP analyses, the strongest findings (P ≤ .01) were for ITGB3 L59P (odds ratio [OR] = 0.66; 95% confidence interval [CI] 0.52-0.85), TLR6 V427A (OR = 5.20; CI 1.77-15.3), SELPLG M264V (OR = 3.20; CI 1.48-6.91), UNC84B G671S (OR = 1.50; CI 1.12-2.00), B3GNT3 H328R (OR = 0.74; CI 0.59-0.93), and BAT2 V1883L (OR = 0.64; CI 0.45-0.90). Our results suggest that genetic variation in genes associated with immune response (TRAF1, RIPK3, BAT2, and TLR6), mitogen-activated protein kinase (MAPK) signaling (MAP3K5, DUSP2, and CREB1), lymphocyte trafficking and migration (B3GNT3, SELPLG, and LSP1), and coagulation pathways (FGG and ITGB3) may be important in the etiology of NHL, and should be prioritized in replication studies.


Non-Hodgkin lymphoma (NHL) is the most commonly diagnosed hematologic malignancy in the United States,1 and the lifetime odds of developing NHL are 1 in 47 for men and 1 in 55 for women.2 Given the remarkable rise in incidence of NHL in the last 50 years, it is clear that environmental factors must play a major role in the etiology of this cancer, although established risk factors to date account for only a relatively small fraction of the total number of cases.3 There is also accumulating evidence from migrant and analytic epidemiology studies that genetic susceptibility plays a role in NHL etiology,4,5 although to date no major gene has been identified. However, case-control studies have identified several promising candidate susceptibility genes supporting a polygenic model based on low-penetrance alleles in line with the common-variant, common-disease hypothesis.6

For NHL, the most compelling hypothesis for increased cancer risk is immune dysfunction, and this risk may be influenced in part by variation in polymorphic genes that control immune function and regulation. Recent studies have shown that single-nucleotide polymorphisms (SNPs) from candidate genes, including TNF and LTA (refs 7-9); several interleukin genes, including IL4, IL5, IL6, and IL10 (refs 8,10,11); and genes related to innate immunity, including FCGR2A,9 TLR4,12 and CARD15,12 may be risk factors for NHL overall or for certain NHL subtypes.

We undertook a discovery exercise to identify additional genes related to immune function and inflammation and risk of NHL by using the Affymetrix Immune and Inflammation SNP panel, which included 9412 SNPs from 1253 genes. SNP coverage included both nonsynonymous (ns) SNPs (N = 537) and tagSNPs derived from the HapMap. Our analyses focused on both a gene-level test using all of the SNPs, and for nsSNPs a SNP-level test, as the latter SNPs are expected to have a higher prior probability of functional significance. The study was conducted using a clinic-based case-control study of 458 patients with NHL and 484 frequency-matched controls, and was based at the Mayo Clinic (Rochester, MN).

Patients and methods

Study population

This study was reviewed and approved by the Human Subjects Institutional Review Board at the Mayo Clinic, and informed consent was obtained in accordance with the Declaration of Helsinki from all participants. All consecutive patients with histologically confirmed lymphoma (Hodgkin lymphoma [HL] and NHL, including chronic lymphocytic leukemia [CLL]) aged 20 years and older, who were residents of Minnesota, Iowa, or Wisconsin at the time of diagnosis, and who were within 9 months of their initial diagnosis at presentation to Mayo Clinic Rochester from September 1, 2002, forward were offered enrollment into the study. Patients were excluded if they had HIV infection, did not speak English, or were unable to provide written informed consent. A Mayo hematopathologist reviewed all materials for each case to verify the diagnosis and to classify each case according to the World Health Organization Classification of Neoplastic Diseases of the Hematopoietic and Lymphoid Tissues.13 This (phase 1) analysis included all patients enrolled into the study from September 1, 2002, through September 30, 2005. Of the 956 eligible patients identified during this time frame, 629 (66%) participated, 106 (11%) refused, 19 (2%) were unable to be contacted, and 202 (21%) had their eligibility expire (ie, after identification they did not consent within 9 months of diagnosis, or after consent they did not complete data collection within 12 months of diagnosis). Most patients who had their eligibility expire were only seen once at the Mayo Clinic for a second opinion, and these patients were generally recruited through the mail rather than in person.

Clinic-based controls were randomly selected from Mayo Clinic Rochester patients aged 20 years and older, who were residents of Minnesota, Iowa, or Wisconsin, and who were being seen for a prescheduled medical examination in the general medicine divisions of the Department of Medicine from September 1, 2002, to September 30, 2005. Patients were not eligible if they had a history of lymphoma or leukemia, had HIV infection, or did not speak English. Controls were frequency-matched to the case distribution on 5-year age group, sex, and county of residence (county groupings based on distance from Rochester and urban/rural status) using a computer program that randomly selects subjects from eligible patients. Of the 818 eligible subjects identified, 572 (70%) participated, 239 (29%) refused, and 7 (1%) had their eligibility expire (ie, did not complete data collection within 12 months of selection).

Data collection

Participants completed a risk-factor questionnaire that included data on demographics, ethnicity, family cancer history, medical history, and selected lifestyle factors; they also provided a peripheral blood sample for serologic and genetic studies. DNA was extracted from samples using a Gentra Systems automated salting-out methodology (Gentra, Minneapolis, MN). A total of 498 (79%) patients and 497 (87%) controls had an extracted DNA sample available for genotyping in November 2005. DNA samples were randomly assigned to 1 of 12 96-well plates. We also randomly selected (without replacement) 2 samples that were duplicated across plates and another 2 samples that were duplicated within each plate, for a total of 1043 study samples (995 unique samples and 48 duplicates). In addition, 8 wells on each plate were left blank for additional quality control samples (N = 96). The remaining wells were either blank (N = 10) or had water only (N = 3). All source DNA tubes were barcoded and wanded into the Biomek NX software system (Beckman Coulter, Fullerton, CA), creating a virtual plate map of the robotically plated samples.
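The well accounting in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only counts stated in the text and assumes that each of the 12 plates carried 2 across-plate and 2 within-plate duplicates; the variable names are illustrative.

```python
# Well accounting for the genotyping plates, using counts stated in the text.
PLATES = 12
WELLS_PER_PLATE = 96

unique_samples = 498 + 497            # patients + controls with extracted DNA
duplicates = PLATES * (2 + 2)         # assumed: 2 across-plate + 2 within-plate per plate
study_samples = unique_samples + duplicates

qc_wells = PLATES * 8                 # wells left for additional quality control samples
blank_wells = 10
water_wells = 3

total_wells = PLATES * WELLS_PER_PLATE
accounted = study_samples + qc_wells + blank_wells + water_wells

print(unique_samples, duplicates, study_samples)   # 995 48 1043
print(total_wells, accounted)                      # 1152 1152
```

Under that reading, every well on the 12 plates is accounted for.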

Genotyping and quality control

Genotyping was conducted at the Affymetrix facility in South San Francisco, CA, using the Molecular Inversion Probe (MIP) genotyping technology,14 which has a robust genotype-calling methodology.15 The 9K Immune-Inflammation Panel consists of 9412 MIP assays representing SNPs in 1253 genes selected for their involvement in inflammation and immunity. HapMap data (phase 1, version 16) from CEPH (Centre d'Etude du Polymorphisme Humain; white) and Yoruba (African) samples were used to select tagging SNPs, and these SNPs were chosen to give an r2 coverage of 0.8 or greater for all SNPs genotyped in the HapMap that had a minor allele frequency (MAF) of more than 5%. These SNPs covered the entire gene, from 5 kb upstream to 5 kb downstream of the gene, as well as all exons and introns. In addition, the panel included 748 validated nsSNPs. The average number of SNPs per gene was 6.8. A complete list of the SNPs on the assay panel is available from the Affymetrix website.

Affymetrix used several genotyping quality control measures. To aid in sample tracking, sex-linked markers (X or Y chromosome) were genotyped on all samples to ensure that the DNA matched the expected sex for each individual. Further, positive and negative controls were run in parallel to ensure there was no contamination of the DNA. Other quality control measures included the addition of 25 CEPH family trios to the genotyping plates to test for non-Mendelian inheritance, and checking for high concordance of genotypes across sample pairs for potential unknown relationships, contamination, or mistracking. Samples that failed genotyping were defined by Affymetrix as those with a call rate (ie, the proportion of markers that gave unambiguous genotypes) less than 90% and repeatability rate (ie, sample-genotyped twice) less than 99%. Of the 1043 samples, 13 (1%) failed Affymetrix quality control criteria. Individual SNPs that failed genotyping were defined by Affymetrix as those with a call rate less than 80%, repeatability less than 99%, or non-Mendelian inheritance. A final number of 9237 SNPs were successfully genotyped out of 9412 attempted. Overall, the assay call rate was 99.13%, and the repeatability was 99.93%.
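The two sample-level metrics defined above, call rate and repeatability, can be illustrated with a minimal sketch. It is not the Affymetrix pipeline: the function names, the use of `None` for a no-call, and the toy genotypes are all assumptions made for illustration.

```python
def call_rate(genotypes):
    """Proportion of markers with an unambiguous call (None marks a no-call)."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def repeatability(run_a, run_b):
    """Concordance between two runs of the same sample, over markers called in both."""
    pairs = [(a, b) for a, b in zip(run_a, run_b) if a is not None and b is not None]
    return sum(a == b for a, b in pairs) / len(pairs)

# Hypothetical genotype calls for one sample typed twice at 10 markers
run_a = ["AA", "AG", "GG", None, "AG", "AA", "GG", "AG", "AA", "GG"]
run_b = ["AA", "AG", "GG", "AA", "AG", "AA", "GG", "AA", "AA", "GG"]

print(call_rate(run_a))                        # 0.9
print(round(repeatability(run_a, run_b), 3))   # 0.889
```

The same `call_rate` function applies per SNP when it is given the calls for one marker across all samples instead of one sample across all markers.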

A second level of quality control was implemented at the Mayo Clinic. Of the 1030 samples that were successfully genotyped by Affymetrix, 48 were randomly selected duplicates (unknown to Affymetrix), leaving 982 unique subjects. We excluded from further analysis 1 man with heterozygous genotypes on the X chromosome, 11 nonwhite patients, 5 Hispanic patients, 22 patients with HL, and 1 patient later found to not have a lymphoma, leaving a final sample size of 942 genotyped subjects (458 patients and 484 controls) in the analysis.

Of the 9237 SNPs with genotype data provided by Affymetrix, we further excluded 524 with a call rate of less than 95%, 885 with a MAF of less than 1%, 3 with more than 2 genotype differences among the duplicates, and 5 with 1 or more male heterozygous genotypes for an X chromosome SNP. Furthermore, we excluded 85 SNPs that were not mapped in build 36 of the human genome (dbSNP). This left 7735 SNPs that passed all quality controls.
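As a rough illustration of the SNP-level filters, the sketch below computes a MAF from genotype counts and tallies the exclusions listed in this paragraph. The helper name and the example genotype counts are hypothetical; the exclusion counts are those given in the text.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF at a biallelic SNP from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_b = (2 * n_bb + n_ab) / total_alleles
    return min(freq_b, 1.0 - freq_b)

# Hypothetical genotype counts for one SNP in 942 subjects
maf = minor_allele_frequency(900, 40, 2)
print(round(maf, 4), maf < 0.01)   # 0.0234 False, so this SNP would be kept

# Exclusion counts as stated in the text
genotyped = 9237
exclusions = {
    "call rate < 95%": 524,
    "MAF < 1%": 885,
    "> 2 genotype differences among duplicates": 3,
    "male heterozygote at an X chromosome SNP": 5,
    "not mapped in build 36": 85,
}
passing_qc = genotyped - sum(exclusions.values())
print(passing_qc)   # 7735
```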

Hardy-Weinberg equilibrium (HWE) was evaluated among the control subjects for each SNP using a chi-square test (MAF ≥ .05) or an exact test (MAF < .05). Because of multiple testing, SNPs found to be significant at a HWE threshold of .0001 or less were removed from further analyses. This threshold is conservative (ie, removing more SNPs than may be necessary); however, because not all SNPs were independent, a Bonferroni correction would be too liberal (ie, removing fewer SNPs than necessary). This removed 65 SNPs, and of these SNPs, we observed no clustering of HWE failures. All but 2 of these SNPs were independent of each other. In summary, we had 7670 SNPs (375 nsSNPs) for analysis.
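The chi-square version of the HWE check can be sketched in a few lines of standard-library Python. This is an illustration only, not the software used in the study: the function name and the example genotype counts are assumptions, and the exact test applied to SNPs with a MAF below .05 is not shown.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts.

    Assumes a biallelic, polymorphic SNP (both alleles observed).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)    # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

HWE_THRESHOLD = 1e-4   # SNPs with a P value at or below this were removed, per the text

# Hypothetical control genotype counts (AA, AB, BB) for two SNPs
for counts in [(250, 200, 34), (300, 100, 84)]:
    chi2, p_value = hwe_chi_square(*counts)
    print(counts, round(chi2, 2), p_value <= HWE_THRESHOLD)

# For comparison, a Bonferroni-style cutoff for 7735 tests at an assumed
# family-wise level of .05 would be far smaller than .0001.
print(0.05 / 7735)
```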

Statistical analysis

The independent effects of the matching variables age (including its functional form), sex, and geographic region were examined in unconditional logistic regression models; geographic region was not significant (P > .2), and therefore was dropped from further consideration. We used unconditional logistic regression analysis to examine associations between each SNP and the risk of NHL, adjusting for the effects of age and sex. The most prevalent homozygous genotype was used as the reference group. Each polymorphism was modeled individually as having a log-additive effect in the regression model, and odds ratios (ORs) and 95% confidence intervals (CIs) were estimated. Associations between haplotypes from each gene and the risk of NHL were calculated using a score test implemented in HAPLO.SCORE16 from the Haplo.Stats S-plus library. All SNPs located within a gene and SNPs located within 5 kb upstream or downstream were used in the haplotype analyses. Finally, we modeled the main effects for all independent (r2 < 0.25) SNPs from a gene in a multivariate logistic regression model. This approach does not require phase information and has been shown to have greater power than haplotype analysis.17 Our primary analysis approach for selecting noteworthy results was based on the gene-level tests, and genes that had a global multiple logistic regression or global haplotype P value of .001 or less were reported as noteworthy. Because nsSNPs are more likely to have functional consequences, we separately selected noteworthy results for nsSNPs with a P value of .01 or less. In addition, we examined the overall significance of the P values for our gene-level and nsSNP-level tests using the tail strength methodology of Taylor and Tibshirani18; a QQ plot for gene-level P values of .10 or less is also provided as Figure S1 (available on the Blood website; see the Supplemental Materials link at the top of the online article).
All analyses were done using S-plus (Insightful, Seattle, WA) or SAS (SAS Institute, Cary, NC).
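To make the log-additive model concrete, the following is a minimal pure-Python sketch of an unconditional logistic regression fitted by Newton-Raphson, with the genotype coded as the number of minor alleles (0, 1, 2) and age and sex as adjustment covariates. It is not the S-plus or SAS code used in the study; the function names, the fixed iteration count, and the toy data are all invented for illustration.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(rows, y, iterations=25):
    """Newton-Raphson fit; rows hold covariates without the intercept column."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(design, y):
            eta = sum(coef * v for coef, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            weight = mu * (1.0 - mu)
            for s in range(k):
                grad[s] += (yi - mu) * xi[s]
                for t in range(k):
                    info[s][t] += weight * xi[s] * xi[t]
        step = solve(info, grad)
        beta = [coef + d for coef, d in zip(beta, step)]
    return beta, info   # info: information matrix from the final iteration

def odds_ratio_ci(beta, info, index, z=1.96):
    """Wald odds ratio and 95% CI for one coefficient."""
    unit = [1.0 if j == index else 0.0 for j in range(len(beta))]
    se = math.sqrt(solve(info, unit)[index])
    b = beta[index]
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Toy data (entirely made up): genotype -> (number of cases, number of controls)
counts = {0: (40, 30), 1: (35, 40), 2: (10, 20)}
rows, y = [], []
for genotype, (n_cases, n_controls) in counts.items():
    for status, n in ((1, n_cases), (0, n_controls)):
        for i in range(n):
            rows.append((genotype, 5 + i % 3, i % 2))   # minor alleles, age (decades), sex
            y.append(status)

beta, info = fit_logistic(rows, y)
or_, low, high = odds_ratio_ci(beta, info, index=1)
print(f"per-allele OR = {or_:.2f} (95% CI, {low:.2f}-{high:.2f})")
```

Coding the genotype as an allele count is what makes the effect log-additive: each additional minor allele multiplies the odds by the same factor, so a single "ordinal" OR summarizes the SNP.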

To reduce the potential that population stratification affected our results, all analyses were restricted to subjects whose self-reported race was white. In addition, we tested our white subjects for potential population stratification by randomly selecting 1000 independent (r2 < 0.25) SNPs from our study and running the program Structure.19 We found no evidence for population structure in our data.

Results

There were 458 patients with NHL and 484 controls in this analysis, and all were white. Patients and controls were well matched on age, sex, state of residence, and education level (Table 1). Patients were more likely to report a family history of NHL (4.8%) compared with controls (2.9%). The most common subtypes were CLL/small lymphocytic lymphoma (SLL) (N = 126), follicular lymphoma (N = 113), and diffuse large B-cell lymphoma (DLBCL) (N = 76).

Table 1

Characteristics of study participants

After all exclusions, there were 7670 SNPs available for analysis, and these SNPs were assigned to 1158 genes based on chromosomal position using build 36 of the human genome (dbSNP). The average number of SNPs per gene was 6.1, with a minimum of 1 SNP (in 318 genes) and a maximum of 136 SNPs for 1 gene. There were also a total of 375 nsSNPs available for analysis, and these were assigned to 262 genes. Table S1 reports all of the SNPs and their MAF among the controls.

Table 2 reports the gene-level results based on the logistic regression and haplotype analyses, ranked by P value for all P values of .001 or less from either analysis. We also report results for the individual SNPs within these genes (Table 3). Based on the logistic regression analysis, the smallest P values were seen for CREB1 (P < .001) and FGG (P < .001). In CREB1, there were 5 SNPs with a P value of .05 or less, and the SNP with the smallest P value (.002) was rs2551919, with an ordinal OR of 0.69 (95% CI, 0.55-0.88). For FGG, there was only a single SNP (rs1800792), and the ordinal OR was 1.44 (95% CI, 1.19-1.73). The global P values from the haplotype results were similar to the logistic results. To assess the impact of multiple testing at the gene level, we calculated the tail strength of the 1158 P values from the logistic regression gene-based analysis. The tail strength was 0.10 (95% CI, 0.04-0.15), suggesting that our results identified 10% more signal than expected from chance.
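For readers unfamiliar with tail strength, the sketch below implements the statistic as defined by Taylor and Tibshirani: the average over the ordered P values p(k) of 1 − p(k)(m + 1)/k. The function names are illustrative, and the standard error of roughly 1/√m holds only under the global null with independent tests, so it is an approximation here because SNPs and genes are correlated.

```python
import math

def tail_strength(p_values):
    """Tail strength: mean over k of 1 - p_(k) * (m + 1) / k for ordered P values."""
    ordered = sorted(p_values)
    m = len(ordered)
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(ordered, start=1)) / m

def null_standard_error(m):
    """Approximate standard error under the global null, assuming independent tests."""
    return 1.0 / math.sqrt(m)

# Half-width of an approximate 95% interval for the two analyses in the text
print(round(1.96 * null_standard_error(1158), 3))   # 0.058 (gene level)
print(round(1.96 * null_standard_error(375), 3))    # 0.101 (nsSNP level)

# P values sitting exactly at their expected null quantiles give a tail strength of 0
m = 1158
null_like = [k / (m + 1) for k in range(1, m + 1)]
print(abs(tail_strength(null_like)) < 1e-12)        # True
```

A value near 0 means the P values look like draws from the null, whereas positive values mean the smallest P values are smaller than chance alone would produce.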

Table 2

Logistic regression and haplotype results for genes with a P value of .001 or less

Table 3

SNPs from genes with a P value of .001 or less from the logistic regression or haplotype analyses

Table 4 reports the 6 nsSNPs with a P value of .01 or less from the ordinal (log-additive) model, ranked by P value. The smallest P value was for ITGB3 (P = .001), and the ordinal OR was 0.66 (95% CI, 0.52-0.85). The tail strength for the nsSNP analysis (based on 375 P values) was 0.04 (95% CI, −0.06 to 0.14), suggesting that none of these results were significant after accounting for multiple testing. Table 4 also reports the amino acid change for the nsSNPs; all of these changes were predicted to be benign using the software PolyPhen (Harvard University, Cambridge, MA).

Table 4

nsSNPs with a P value of .01 or less from the logistic regression analysis

As a secondary analysis, we evaluated the gene level (Table 2; Table S2) and nsSNP (Table 4) associations for the 3 most common NHL subtypes in our dataset. These analyses have much less power (due to smaller sample size) and have not been corrected for multiple testing (a nominal P ≤ .05 was used for this analysis), and therefore should be interpreted with caution. With these caveats in mind, several potentially interesting patterns emerged. For CLL/SLL (N = 126 patients), associations at the gene level (from logistic regression models) were statistically significant at P values of .05 or less for most genes found to be notable in the main analysis; the exceptions included DUSP2 (P = .08) and ITGB3 (P = .3). For the nsSNPs, all ORs for CLL/SLL were of a similar magnitude as those for all NHL, although only nsSNPs for TLR6, UNC84B, and B3GNT3 were statistically significant at P values of .05 or less. For follicular lymphoma (N = 113 patients), FGG (P = .001), MAP3K5 (P = .009), CREB1 (P = .02), ITGB3 (P = .02), and LSP1 (P = .03) were associated with risk at the gene level. In nsSNP analysis, with the exception of B3GNT3, all ORs were in the same direction and of similar magnitude as the ORs for all NHL, although only nsSNPs in ITGB3, TLR6, and SELPLG achieved a P value of .05 or less. For DLBCL (N = 76 patients), DUSP2 (P = .04) and ITGB3 (P = .05) were associated with risk at the gene level. In the nsSNP analysis, all ORs were in the same direction and of similar magnitude as for all NHL, although only the association for the nsSNP in ITGB3 was statistically significant at P values of .05 or less.

Discussion

In a discovery exercise, we used a panel of genes related to immune function and inflammation to identify genetic risk factors for NHL. After exclusion of SNPs that did not meet our strict quality control or that had a MAF of less than 0.01, we had data available for 7670 SNPs assigned to 1158 genes. We identified 8 genes and 6 nsSNPs (from 6 genes) of greatest interest based on our statistical criteria. These types of analyses are prone to false discovery due to the large number of statistical tests that are conducted, and while the signal from our gene-level analysis suggested that we observed a 10% greater signal over that of chance, our nsSNP findings were consistent with a chance finding after accounting for the number of tests conducted. Ultimately, replication of genetic associations in independent studies or within a consortium such as InterLymph8 is required, and the genes we identified are excellent candidates.

This is the largest evaluation of genes involved in immunity and inflammation reported to date for NHL. While our findings must be considered preliminary and hypothesis-generating, we observed several patterns among the highest-ranked genes that are supported by the limited existing literature on genetic susceptibility to NHL (Table 5). The strongest findings to date have been for a role for specific SNPs from genes involved in inflammation and immune response, particularly TNF and LTA (refs 7-9,59), although smaller studies have not observed these associations,60,61 perhaps due to lower power. While these specific SNPs from TNF and LTA were not on our SNP panel, we did identify several genes important in the inflammatory and innate immune response, including TRAF1, RIPK3, BAT2, and TLR6. TRAF1 expression has been shown to be elevated in NHL52 and particularly CLL,52,53 and in subtype analysis, CLL/SLL was most strongly associated with TRAF1. RIPK3 is a component of the TNFR1 signaling complex, and BAT2 is located in the HLA class III complex in the vicinity of TNF and LTA and has been associated with autoimmune disease.23,24 Even though the nsSNP for BAT2 (V1883L) is predicted to be benign using PolyPhen, further analysis in this gene-rich area is warranted. Toll-like receptors (TLRs) bridge the innate and adaptive immune systems,62 and the nsSNP in TLR6 (rs5743815) that was associated with risk in this study has not previously been evaluated in NHL. Little is known about the functional consequences, if any, of this SNP, although the substitution (V427A) was predicted to be benign using PolyPhen. Although the MAF was low in patients (0.02) and controls (0.004), the association was strong for NHL overall (OR = 5.20; 95% CI, 1.77-15.3) and for each of the subtypes.
A nsSNP from TLR2 (−16933T > A) has been reported to be associated with NHL overall and with follicular lymphoma in particular,63 and a nsSNP in TLR4 (rs4986790; 1063A > G) was associated with DLBCL in 1 study,12 although this was not observed in 2 other studies.9,63

Table 5

Summary of top-ranked genes

Our data also suggest a role for genetic variation in signal transduction pathways, particularly related to TNF and TLR signaling. MAP3K5 belongs to the mitogen-activated protein kinase (MAPK) pathway, and it can be activated by TNFα through interactions with TNF receptor 1 (TNFR1) and TNFR2, TNFR-associated factors (TRAFs), and TNFR-associated death domain (TRADD).41,42 DUSP2 is highly inducible and encodes a protein that is predominantly expressed in hematopoietic lineages, particularly immune cells infiltrating inflammatory lesions.64 DUSP2 primarily inactivates p38 MAPK and ERK1 and ERK2, and thus can regulate transcription factors like CREB and AP-1.31 CREB1 encodes a transcription factor that mediates response to a variety of growth and stress signals,26 and is a putative oncogene that has been implicated in several cancers, including myeloid neoplasia and follicular and transformed NHL.27 CREB is activated by TNF/TNFR1 signaling through a p38MAPK/MSK1 signaling pathway and through TLR signaling.65,66 Of note, the associations for MAP3K5, DUSP2, and CREB1 were not specific to any NHL subtype, suggesting these genes may play a more global role in lymphomagenesis.

Genes involved in lymphocyte trafficking and migration were also in our top hits. B3GNT3 plays an important role in L-selectin biosynthesis,20 and SELPLG encodes a ligand for P-selectin (PSGL-1), which is important for tethering and rolling of leukocytes to endothelial cells and platelets67 and also may activate β2 integrins in neutrophils.68 Selectins appear to play an important role in tumor cell survival and metastasis,69 and increased soluble P-selectin has been reported in breast and hematologic cancers.70

Our final and perhaps most unexpected observation was that genes involved in the acute-phase response and the coagulation-fibrinolytic system, specifically FGG and ITGB3, were associated with NHL risk. There is accumulating experimental evidence for a role of fibrin and fibrinogen degradation products in carcinogenesis by promoting invasive growth, angiogenesis, and metastasis.36,71,72 ITGB3 codes for the beta subunit of the glycoprotein IIb/IIIa complex, which mediates platelet aggregation by serving as a receptor for fibrinogen. The ITGB3 nsSNP rs5918 results in a substitution (Leu59Pro, also reported as Leu33Pro) that introduces a kink in the polypeptide chain, and this SNP leads to increased binding of fibrinogen to the IIb/IIIa complex,73 increased platelet aggregation,74 decreased bleeding time,75 and increased signaling through ERK2 of the MAPK pathway.76 In contrast to our results, this SNP has been associated with risk of all cancer in a Danish cohort study (relative risk [RR] for proline/proline genotype was 1.4; 95% CI 1.1-1.9).39 In subsite analysis of the Danish cohort, risks were specific to ovarian and breast cancer and melanoma; results specific to NHL were not reported (there were only 26 patients). Patients with NHL are at increased risk for coagulation disorders,36 and patients with venous thromboembolism have been found to have increased risk of NHL even more than 2 years after admission (standardized incidence ratio [SIR] = 1.4; 95% CI 1.2-1.6).77

Strengths of this study include the careful quality control in genotyping and the use of the HapMap to tag the genes of interest, as well as inclusion of nsSNPs from these genes. Although our panel had 9412 SNPs, only 7670 (81%) were available for analysis, and this decreased the average number of SNPs from 6.8 to 6.1 per gene. In terms of SNP coverage against the current version of the HapMap, defined as the number of SNPs in a gene with a MAF of 0.05 or more for which we had a tagSNP (based on r2 ≥ 0.8) divided by the total number of SNPs with a MAF of 0.05 or more, the median coverage was 61.0% (range, 0.3%-100%). Thus, our study may have missed important genes in the etiology of NHL due to low coverage of many of the genes. The most common reasons for exclusion of a SNP were a MAF of less than 1% (N = 885 of 1742 SNPs; 51%) and a call rate of less than 95% (N = 524 of 1742 SNPs; 30%). The large number of exclusions for low MAF was mainly due to the inclusion of SNPs from other ethnic populations in the SNP panel, which had little or no variation in our white population. Based on the restriction of our analysis to whites and the use of the Structure program, our results are not likely to be confounded by population stratification. While this restriction increases the internal validity of the results, it does decrease the generalizability of the findings to other racial/ethnic groups. Our study was not population-based, but we have carefully designed our clinic-based study to adhere to basic epidemiologic principles of the case-control study design, and in particular our controls are derived from the same underlying population source that generated our patients and were not selected on the basis of any particular medical history or other exposure (including genotype).
We also restricted case ascertainment to the 3-state region surrounding Mayo Clinic Rochester in order to reduce referral bias that can occur with patients coming from farther distances. Of note, hospital-based and population-based studies have reported similar allele and genotype frequencies among controls for a variety of metabolic genes.78
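The per-gene coverage measure defined earlier in this paragraph (common SNPs captured by a tagSNP at r2 of 0.8 or more, divided by all common SNPs) can be written as a small function. The sketch below is illustrative only: the function name, the dictionary inputs, and the SNP labels are hypothetical, and it assumes the best r2 between each HapMap SNP and any genotyped SNP has already been computed.

```python
def tag_coverage(maf_by_snp, best_r2_by_snp, maf_min=0.05, r2_min=0.8):
    """Fraction of common SNPs in a gene that are tagged at or above r2_min.

    maf_by_snp: MAF of every HapMap SNP in the gene.
    best_r2_by_snp: highest r2 between each SNP and any genotyped tagSNP
    (SNPs absent from this mapping are treated as untagged).
    """
    common = [snp for snp, maf in maf_by_snp.items() if maf >= maf_min]
    if not common:
        return None   # coverage is undefined for a gene with no common SNPs
    tagged = sum(1 for snp in common if best_r2_by_snp.get(snp, 0.0) >= r2_min)
    return tagged / len(common)

# Hypothetical gene with 6 HapMap SNPs, 5 of them common
maf = {"snp_a": 0.30, "snp_b": 0.12, "snp_c": 0.07,
       "snp_d": 0.02, "snp_e": 0.25, "snp_f": 0.06}
best_r2 = {"snp_a": 1.0, "snp_b": 0.85, "snp_c": 0.40, "snp_e": 0.95}

print(tag_coverage(maf, best_r2))   # 0.6
```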

In summary, our study provides additional evidence of the important role played by genes involved in the immune response in the etiology of NHL, particularly with respect to TNF and TLR pathways and MAPK signaling. Our study further suggests genes involved in lymphocyte trafficking and migration and coagulation pathways may also play important roles. Beyond the need to replicate these associations, additional work will be needed to identify the causative SNPs in the candidate genes (eg, by fine mapping), to define the role of genes upstream and downstream of the candidate gene(s), to evaluate associations within NHL subtypes with sufficient sample size, and ultimately to examine interactions with environmental exposures. Such approaches should help identify new etiologic pathways and high-risk populations for NHL, and this information should ultimately aid in identifying preventive strategies for this malignancy.

Supplementary PDF file available online.



Contribution: J.R.C. designed the study, obtained funding, and drafted the manuscript; S.M.A., T.G.C., A.D., T.M.H., D.J., N.E.K., M.L., W.R.M., and T.E.W. gathered clinical, laboratory, and pathology data; J.M.C. and W.L.-M. performed bioinformatics; Z.S.F., A.H.W., and S.L.S. (with input from J.R.C.) performed statistical analysis; and all authors revised the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: James R. Cerhan, Department of Health Sciences Research, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905; e-mail: cerhan.james{at}


We thank Sondra Buehler for her editorial assistance.

This work was supported by National Institutes of Health grants R01 CA92153 (J.R.C.) and K07 CA94919 (S.L.S.).


  • The online version of this article contains a data supplement.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted May 2, 2007.
  • Accepted September 2, 2007.

