Blood Journal

Discrepancies between genotype and phenotype in hematology: an important frontier

Ernest Beutler

From the Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA.


An African American male infant with sickle cell disease has a devastating stroke; an African American soldier is surprised when he is informed that he has sickle cell disease. They are both homozygous for the same mutation. An Ashkenazi Jewish woman with Gaucher disease has a huge spleen and severe thrombocytopenia; her older brother, homozygous for the same 1226G glucocerebrosidase mutation, is found on routine examination to have a barely palpable spleen tip. The fact that clinical manifestations of genetic diseases can vary widely among patients has been recognized for many decades. In the past, however, it could often be attributed to the pleomorphic nature of mutations of the same gene: the patient with severe disease, it was averred, must have a different mutation than the one with mild disease. Even before a precise definition of mutations could be achieved at the DNA level, such an explanation did not serve to clarify the differences that existed between siblings with the same autosomal recessive disease. Such siblings must surely be carrying the same 2 disease-producing alleles. With the advent of sequence analysis of genes, the great extent of phenotype variation in patients with the same genotype has come to be more fully appreciated, but understanding of why it occurs continues to be meager. It is the purpose of this review to explore some of the variations in phenotype seen by hematologists in patients with identical mutations, to indicate where some progress has been made, and to suggest how understanding in this important area may be expanded.

General considerations

Differences in the clinical expression of a disease in the homozygous or, in the case of X-linked genes, the hemizygous state may be the result of environmental or genetic factors or of their combination. If the data were available, the contribution of each of these factors could be deduced from the study of appropriate populations, as summarized in Table 1. Discordance between identical twins would of necessity be due to environmental factors. Discordance between siblings could be due to environmental factors or to genetic factors unlinked to the primary disease-producing gene. Discordance among unrelated members of the patient population could be due to environmental factors or genetic factors, even those that may be linked with the primary mutation, such as second mutations or differences in the strength of the promoter. It should be remembered, however, that in many recently arising mutations common in an ethnic group, such as the 845 G → A (C282Y) mutation of hemochromatosis or the 1226 A → G (N370S) mutation of Gaucher disease, the gene of “unrelated” members of the population has a single common ancestor and is found in the context of the same haplotype in all members of the population. Thus, apparently unrelated persons may well be in the same position as siblings; barring additional mutations occurring at the same locus even more recently than the disease-producing mutation, they all share the same gene with the same promoter. Although documentation of phenotypic differences within these 3 groups (monozygotic twins, siblings, and unrelated persons) would be of great value, there are virtually no published data on this important topic. In particular, the study of monozygotic twins would yield invaluable information about the relative contributions of genetics and environment in phenotypic variability.
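The reasoning in this paragraph can be written down as a small lookup. The following Python sketch is only an illustration of that logic; the names `Relationship`, `possible_causes`, and `shared_founder_haplotype` are hypothetical and do not come from the article or from any library.

```python
from enum import Enum


class Relationship(Enum):
    MONOZYGOTIC_TWINS = "monozygotic twins"
    SIBLINGS = "siblings"
    UNRELATED = "unrelated persons"


def possible_causes(relationship: Relationship,
                    shared_founder_haplotype: bool = False) -> set[str]:
    """Candidate explanations for discordance between two patients who
    carry the same disease-producing genotype (illustrative only)."""
    # Environmental differences can never be excluded.
    causes = {"environment"}
    if relationship is Relationship.MONOZYGOTIC_TWINS:
        # Identical twins share all genes, so only environment remains.
        return causes
    # Siblings and unrelated persons can differ at unlinked modifier genes.
    causes.add("unlinked genetic factors")
    # Linked variation (second mutations, promoter strength) is a candidate
    # only for unrelated persons who do not share a founder haplotype.
    if relationship is Relationship.UNRELATED and not shared_founder_haplotype:
        causes.add("linked genetic factors")
    return causes


if __name__ == "__main__":
    for rel in Relationship:
        print(rel.value, "->", sorted(possible_causes(rel)))
```

Setting `shared_founder_haplotype=True` for unrelated persons reproduces the point made about founder mutations such as C282Y and N370S: such patients are in the same position as siblings, so linked variation drops out of the list.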

Table 1.

Possible causes of discordance among patients with identical genotypes of disease-producing genes

Sickle cell disease

Sickle cell disease can result from the homozygous state for the hemoglobin S mutation, from the compound heterozygous state for the hemoglobin S and hemoglobin C mutations, and from the compound heterozygous state for hemoglobin S and a β-thalassemia mutation. Because the latter mutations are themselves heterogeneous, it is only the S-S and the S-C forms of sickle cell disease that are homogeneous. Within each of these groups there is a great deal of clinical heterogeneity.1 Sibling pairs are more concordant than the general population,1 but we are unaware of any data regarding monozygotic twins.

The homozygous state for sickle cell disease can be virtually asymptomatic, particularly among some Arab populations, or can manifest a devastating phenotype with repeated strokes and early death. Various factors that may influence the severity of the disease have been investigated. The fact that fetal cells did not sickle was discovered in 1949, and the possibility that high levels of fetal hemoglobin in patients might influence the disease by not interacting with sickle hemoglobin was subsequently proposed.2 Numerous studies have been carried out to determine whether variability in the levels of hemoglobin F accounted for differences in the severity of the disease.3-7 These suggested that high levels of fetal hemoglobin do protect against sickling, though a minimum threshold of fetal hemoglobin may be required.3 Genetic control of the number of cells containing high levels of hemoglobin F may, therefore, be a factor in modulating the severity of sickling, and putative regulatory loci on chromosome 6 and the X chromosome have been mapped.8 High levels of 2,3-diphosphoglycerate (2,3-DPG) promote sickling,9-12 and the coinheritance of even sickle trait with pyruvate kinase deficiency, which increases 2,3-DPG levels, gave rise to clinically significant sickling.13

The inverse relationship between α-thalassemia and the severity of sickling has been known for nearly 20 years.14 Its mechanism is presumably to decrease the hemoglobin concentration in red cells, a critical factor in sickling severity. There is also an association between the haplotype in which the β-globin locus resides and the severity of the clinical disease,15 but none of these associations or their combination come close to explaining the striking variability of the clinical manifestations of the disease.

Glucose-6-phosphate dehydrogenase deficiency

Glucose-6-phosphate dehydrogenase (G6PD) deficiency is the prototype of the interaction of a genotype with the environment. Common severe forms of this enzyme deficiency, eg, G6PD Union, G6PD Mahidol, and G6PD Mediterranean, are characterized by 2 main clinical manifestations, hemolytic anemia in adults and jaundice in neonates. The former can be precipitated by drug ingestion, fava bean ingestion, or infection.16 Not long after this enzyme deficiency was discovered, it was observed that expression of G6PD deficiency varied markedly among women. Based on the pioneering work of Ohno17 on the chromatin state of the 2 X chromosomes, we suggested that there was random inactivation of one or the other X chromosome of females,18 a phenomenon independently proposed by Lyon19 to explain the patchy pattern of mice with X-linked coat color mutations, the molecular basis of which has been unraveled in recent years.20 However, even among males there is marked variability: the responses of different patients with the same mutation to a single drug dose may vary widely,21 and it has been suggested that the acetylator status of the G6PD-deficient patient may modulate the response.22 The response to the ingestion of fava beans is particularly variable. G6PD deficiency seems to be a necessary, but not sufficient, factor for favism to occur. Family studies suggested the role of a single autosomal gene that would be required for favism to occur,23 and a number of superimposed genetic deficiencies have been proposed as explanations for the fact that hemolysis develops in only a few G6PD-deficient patients when they ingest fava beans. These have included deficiencies of acid phosphatase,24 excretion of glucaric acid (as an index of enzymes involved in the metabolism of glucuronic acid),25 glucuronide formation,25 26 and increased superoxide dismutase and decreased glutathione peroxidase activities.27 However, none of these is clearly related to the difference in response to fava beans by G6PD-deficient patients.

The neonatal jaundice that occurs in G6PD-deficient infants is not associated with increased hemolysis, and it has appeared likely that its origin is insufficient conjugation of bilirubin with glucuronide in the liver.16 In 1996 a polymorphism in the promoter of the UDP glucuronosyltransferase-1 gene (UGT1) that causes Gilbert syndrome was identified.28 Examination of DNA from G6PD-deficient and healthy infants disclosed that only those infants inheriting both the G6PD-deficiency gene and the UDP glucuronosyltransferase polymorphism had an increased tendency toward the development of severe hyperbilirubinemia.29 Although some recent retrospective studies could not find such an effect,30 31 the original prospective data seem robust. A similar effect of the UGT1 promoter polymorphism has now been found to produce increased jaundice in newborns with hereditary spherocytosis32 and in adults with hereditary spherocytosis,33 heterozygous β-thalassemia, hemolytic crises in G6PD deficiency,34 and congenital dyserythropoietic anemia.31

Gaucher disease

Clinical manifestations of Gaucher disease span an exceptionally broad spectrum, ranging from hydrops fetalis35-37 to incidental diagnoses in patients older than 70.38 39 A major part of this variability is explained by different mutations of the glucocerebrosidase gene, but even within genotypes variability is marked. The most common Gaucher disease mutation is 1226 A → G (N370S), and though all patients carrying this mutation have the type 1, nonneuropathic form of the disease, the phenotype of patients varies widely. As shown in Figure 1, the severity score, a measurement of overall morbidity caused by the disease, varies widely among patients with the 2 most common genotypes found in the Jewish population, and there is little age dependence within either genotype.

Fig. 1.

Relation between disease severity and age in patients with two genotypes of Gaucher disease.

The 1226G (N370S) mutation is a mild mutation common in the Jewish population. The frameshift 84GG is a null mutation. Variation within each genotype is great, and there is marked overlap among the phenotypes of patients with these two genotypes.

We know of no formal studies that compare the severity of Gaucher disease among persons in the general population, siblings, and monozygotic twins. Because the extended haplotype (linked polymorphic markers) of the common mutation 1226A → G (N370S) is the same in all patients, it is clear that differences in the population, as among siblings, cannot be explained by other mutations in the gene or by differences in promoter or enhancer sequences. We have incomplete information about 2 sets of monozygotic twins with Gaucher disease. In each case one twin has considerably more severe disease than is manifested by the other. This anecdotal evidence suggests that environmental factors can at least play a role in the pathogenesis of Gaucher disease.

What could such environmental factors be? One factor that deserves attention is early experience with infections such as infectious mononucleosis. We know of 3 patients in whom relatively severe Gaucher disease was diagnosed after the development of infectious mononucleosis, and Kolodny et al40 earlier called attention to this association. Suppose, for example, that a child with Gaucher disease is infected with the Epstein-Barr virus. This results in splenomegaly and increased sequestration of leukocyte-derived glycolipid in the spleen. When the virus infection has been controlled, the spleen may remain in an enlarged state and, as such, sequester leukocytes and their lipids at an accelerated rate. Perhaps a vicious circle ensues, consisting of increased splenic sequestration, progressive splenomegaly, and a further increase in the rate of sequestration. I have, in fact, seen a patient whose diagnosis was established after a bout of infectious mononucleosis, and this patient's disease was much more aggressive than that of her brother with the same 1226G/1226G genotype. This phenomenon could be investigated by studying antibody levels in patients with Gaucher disease with different degrees of clinical manifestations. Another possible environmental factor is the diet. Indeed, on learning that they have a lipid storage disease, most patients with Gaucher disease ask, “Should I change my diet?” The source of the glycolipid is, of course, endogenous; hence, the standard answer is, “No, this fatty substance does not arise from foods you ingest.” Nonetheless, I know of no studies that have explored the effect of dietary intake on the rate of glycolipid accumulation.

No serious candidate genes that might explain phenotypic variability of Gaucher disease have emerged. It is notable that in addition to the acid β-glucosidase that is deficient in this disorder, there is a neutral β-glucosidase activity with a cytoplasmic rather than a lysosomal location.41 It is possible that genetic variation of the expression of this enzyme could play a role in substituting for the deficient glucocerebrosidase. The pH of the lysosome is important in determining the activity of at least one common glucocerebrosidase mutant protein; its activity is nearly normal at pH 5 but declines markedly when the pH is raised.42 Thus, some of the variability characteristic of this disease might be attributable to individual differences, hereditary or environmental, in the pH of lysosomes.

Hereditary hemochromatosis

Only recently has the enormous variability of the clinical manifestations of hereditary hemochromatosis been appreciated. As described in the classical 1935 monograph written by Sheldon,43 a patient with hemochromatosis had cirrhosis, diabetes, bronzing of the skin, and cardiac arrhythmias. With the advent of population screening through serum transferrin saturation and serum ferritin levels and the study of family members, particularly by examining linkage with HLA-A, it became apparent that there were many patients thought to have hemochromatosis who lacked many or most clinical manifestations of the disease. However, it was not until cloning of the HFE gene (the gene that is the cause of most cases of hereditary hemochromatosis) that the actual variability of the phenotype became apparent and could be documented in some detail.

Although most Europeans with hereditary hemochromatosis are homozygous for the C282Y mutation of the HFE gene, the reverse is clearly not true. There is general agreement that many patients with the homozygous genotype do not have clinical hemochromatosis. Some studies suggest that 50% or more of homozygotes have some degree of cirrhosis.44 45 It may be significant that some of these studies were performed on relatives of patients who had the clinical disorder44; clearly, they would be more likely to carry the very modifying genes required for hemochromatosis to become manifest. Our own experience, based on a population of patients attending a health appraisal clinic, is that penetrance of the hemochromatosis mutations is extremely low. Only one of the first 152 homozygotes we detected had clinical hemochromatosis, and the prevalence of symptoms such as arthritis, impotence, and diabetes, thought to be common in hemochromatosis, was no greater in homozygotes for the C282Y mutation than in controls.46 The studies of Willis et al47-49 support the view that the penetrance of the gene is very low. These data suggest to us that homozygosity for the C282Y mutation of HFE is a necessary, but not a sufficient, condition for the full-blown clinical disorder to develop.
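To put the figure of one affected patient among 152 homozygotes in perspective, the observed proportion and a rough uncertainty range can be computed directly. The Python sketch below uses the Wilson score interval, which is our choice for illustration and is not a calculation reported in the cited study; the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = events / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Counts taken from the text: 1 clinically affected among 152 homozygotes.
affected, homozygotes = 1, 152
observed = affected / homozygotes
low, high = wilson_interval(affected, homozygotes)

if __name__ == "__main__":
    print(f"observed penetrance: {observed:.2%}")
    print(f"approximate 95% interval: {low:.2%} to {high:.2%}")
```

The observed proportion is well under 1%, and even the upper end of the interval stays in the range of a few percent, which is far below the 50% or more suggested by studies of relatives of affected patients.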

What, then, could the other accessory factors be? An obvious consideration is the dietary intake of iron. There are two reasons why this seems unlikely. First, the normal range of iron intake is relatively narrow, yet the range of iron storage in patients with the hemochromatosis genotype is enormous. Second, studies of the effect of iron supplementation of the diet on the incidence of hemochromatosis showed no increase in the incidence of disease during the years the diet was supplemented with iron.50 It seems more likely that other genes might determine whether clinical hemochromatosis develops. Mutation of such genes might, on the one hand, be responsible for hemochromatosis in patients who do not have HFE mutations. On the other hand, they could influence the severity of the hemochromatosis phenotype in those patients who are homozygous for the C282Y mutation or who are homozygotes for the H63D mutation. Table 2 represents a list of candidate genes that either are being investigated or have been studied. If one or more of these prove to have polymorphisms that increase iron absorption, then the coinheritance of such mutations with the HFE mutations may be what is required to give rise to classical hemochromatosis, a situation analogous to the UDP glucuronosyltransferase mutation that produces Gilbert syndrome and interacts with G6PD deficiency to cause jaundice.

Table 2.

Candidate genes for modification of the hemochromatosis phenotype

Disorders of hemostasis

Most disorders of hemostasis are genetically heterogeneous. Therefore, a considerable amount of the notorious variability observed in these disorders can be accounted for by differences in the mutation the patient carries. Even within families, however, there is marked variability in the amount of bleeding in patients with the more common disorders such as hemophilia A, hemophilia B, or von Willebrand disease. The most common of the genetic defects predisposing to thrombosis is the factor V Leiden mutation (c.1691 G → A [Arg506Gln]), occurring in approximately 5% of the European population. In one recent study,79 the incidence of thrombosis was only 0.29 per 100 patient-years, with a risk ratio of 2.2. Clearly, most persons with this mutation do not acquire thrombotic disease. Because of the complexity of hemostatic mechanisms, many environmental and genetic differences between patients could account for the differences in clinical phenotype observed. Many candidate genes and environmental factors have been studied for possible interactions,80 and some meaningful associations have emerged.81 These include a polymorphism in the PAI-1 gene,81 protein C deficiency, protein S deficiency, AT III deficiency, and prothrombin 20210A.82 Environmental factors, too, play a role in the occurrence of thromboses in patients with factor V Leiden. The risk is increased by pregnancy83 and by oral contraceptive use.84 85
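The incidence figure quoted above can be turned into an approximate cumulative risk to show why most carriers never have a thrombosis. The Python sketch below rests on two simplifying assumptions of ours that are not stated in the cited study: the event rate is constant over time, and the risk ratio of 2.2 compares carriers with noncarriers. The function name `cumulative_risk` is hypothetical.

```python
import math


def cumulative_risk(rate_per_100_patient_years: float, years: float) -> float:
    """Probability of at least one event over `years`, assuming a constant
    event rate (a simplification; real thrombosis risk changes with age)."""
    hazard = rate_per_100_patient_years / 100.0
    return 1.0 - math.exp(-hazard * years)


# Figures taken from the text.
carrier_rate = 0.29   # events per 100 patient-years in carriers
risk_ratio = 2.2      # assumed here to be carriers versus noncarriers

# Implied rate in noncarriers under that assumption.
noncarrier_rate = carrier_rate / risk_ratio

if __name__ == "__main__":
    for years in (10, 30):
        carriers = cumulative_risk(carrier_rate, years)
        noncarriers = cumulative_risk(noncarrier_rate, years)
        print(f"{years} years: carriers {carriers:.1%}, noncarriers {noncarriers:.1%}")
```

Under these assumptions fewer than 1 carrier in 10 would be expected to have a thrombosis over 30 years, which is consistent with the statement that most persons with the mutation remain free of thrombotic disease.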

Approaches and conclusions

The five examples discussed here are typical of the wide range of phenotypic manifestations of so-called single-gene diseases. None of the problems they pose are fully resolved, and in most cases we have only a few clues as to the nature of the interactions required to produce the severe and mild extremes of phenotypes with which we are familiar. Understanding the cause of phenotypic variation in patients with the same genotype is critical for genetic counseling and for management. Only when we can predict more accurately the natural course for a patient can we make valid risk–benefit and cost–benefit assessments for treatment. If we knew that a given patient with sickle cell disease had a high probability for stroke, we would more readily subject that patient to the risk of stem cell transplantation. If we could predict that a patient with Gaucher disease would have severe bone involvement, the expenditure of $200 000 per year for enzyme replacement therapy could be better justified. Moreover, if we understood why some patients have mild disease, this knowledge might lead to treatments that create the same conditions in patients with severe disease.

In approaching this problem, it is important, first of all, to have some understanding of the relative contributions of heredity and environment in causing phenotypic variation. This information is best obtained from monozygotic twins, but it is generally unavailable. A coordinated effort to find and evaluate such twin sets should be made. If there is as much variability in monozygotic twins as in the general population, a search for a genetic cause of variability would surely be futile. When there is reason to believe that genetic factors are responsible, 2 general approaches may be taken: the study of candidate genes and positional cloning. The study of candidate genes has been in the forefront of attempts to understand phenotype variation and has occasionally been successful. The interaction of α-thalassemia with sickle cell disease14 and the interaction between the glucuronosyltransferase promoter polymorphism and G6PD deficiency in causing neonatal jaundice29 are examples, but they explain only part of the variability that exists. Extensive examination of candidate genes that may modify the expression of hereditary hemochromatosis and factor V Leiden has been unsuccessful. It may be that often the genes that modify expression of single-gene disorders are undiscovered or have physiologic effects that were not appreciated. Positional cloning is a second approach. The genome can be scanned for markers that differ between severely and mildly affected patients. If the putative mutation that causes phenotypic variability is sufficiently recent in origin and the markers are sufficiently close together, a marker that is overrepresented in severely affected patients may be found and a search for genes in the region of such a marker could provide the answer. This is a difficult approach, one that may fail to yield any fruit even after major resources have been expended, but the problem is of sufficient importance that greater efforts should be made to find solutions.
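The marker scan described in this paragraph amounts to comparing, marker by marker, how often an allele is carried by severely and by mildly affected patients. The Python sketch below shows one simple way this comparison might be scored, using the Pearson chi-square statistic for a 2 × 2 table. The marker names and counts are entirely hypothetical and serve only to make the example runnable; a real scan would also need correction for the large number of markers tested.

```python
def chi_square_2x2(severe_with: int, severe_without: int,
                   mild_with: int, mild_without: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for a
    2 x 2 table of marker carriage by disease severity."""
    a, b, c, d = severe_with, severe_without, mild_with, mild_without
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def rank_markers(counts: dict[str, tuple[int, int, int, int]]) -> list[str]:
    """Order markers from strongest to weakest association with severity."""
    return sorted(counts, key=lambda name: chi_square_2x2(*counts[name]),
                  reverse=True)


# Hypothetical counts: (severe with marker, severe without,
#                       mild with marker, mild without).
hypothetical_counts = {
    "marker_A": (30, 20, 28, 22),
    "marker_B": (40, 10, 15, 35),
    "marker_C": (25, 25, 26, 24),
}

if __name__ == "__main__":
    for name in rank_markers(hypothetical_counts):
        print(name, round(chi_square_2x2(*hypothetical_counts[name]), 2))
```

In this made-up data set only `marker_B` is clearly overrepresented among severely affected patients, so the region around it would be the place to begin a search for modifier genes.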

There is great enthusiasm for moving on to find the causes of multigenic diseases such as diabetes and rheumatoid arthritis. I would submit that it will be even more difficult to understand these disorders than the single-gene diseases, the causes of whose variability still elude us.


This is manuscript number 13400-MEM from The Scripps Research Institute.


  • Ernest Beutler, Department of Molecular and Experimental Medicine, MEM-215, The Scripps Research Institute, 10550 N Torrey Pines Rd, La Jolla, CA 92037; e-mail: beutler{at}

  • Supported by National Institutes of Health grants HL25552-10, DK53505-02, and RR00833 and by the Stein Endowment Fund.

  • © 2001 by The American Society of Hematology

  • Submitted May 15, 2001.
  • Accepted July 3, 2001.

