Site-specific gene correction of a point mutation in human iPS cells derived from an adult patient with sickle cell disease

Jizhong Zou, Prashant Mali, Xiaosong Huang, Sarah N. Dowey, Linzhao Cheng


Human induced pluripotent stem cells (iPSCs) bearing monogenic mutations have great potential for modeling disease phenotypes, screening candidate drugs, and cell replacement therapy provided the underlying disease-causing mutation can be corrected. Here, we report a homologous recombination-based approach to precisely correct the sickle cell disease (SCD) mutation in patient-derived iPSCs with 2 mutated β-globin alleles (βss). Using a gene-targeting plasmid containing a loxP-flanked drug-resistant gene cassette to assist selection of rare targeted clones and zinc finger nucleases engineered to specifically stimulate homologous recombination at the βs locus, we achieved precise conversion of 1 mutated βs to the wild-type βA in SCD iPSCs. However, the resulting co-integration of the selection gene cassette into the first intron suppressed the corrected allele transcription. After Cre recombinase-mediated excision of this loxP-flanked selection gene cassette, we obtained “secondary” gene-corrected βsA heterozygous iPSCs that express at 25% to 40% level of the wild-type transcript when differentiated into erythrocytes. These data demonstrate that single nucleotide substitution in the human genome is feasible using human iPSCs. This study also provides a new strategy for gene therapy of monogenic diseases using patient-specific iPSCs, even if the underlying disease-causing mutation is not expressed in iPSCs.


Human induced pluripotent stem cells (iPSCs) that are derived from adult somatic cells hold great promise as a renewable cell source for developing novel cell and gene therapies.1 Since the first report of generating iPSCs from patients' somatic cells in 2008, a wave of studies has demonstrated that either undifferentiated patient-specific iPSCs or their differentiated progenies are capable of recapitulating key aspects of disease-specific phenotypes in vitro. So far, most of the successful reports on disease modeling are based on iPSCs bearing monogenic disease mutations. Capable of unlimited expansion in culture, iPSCs derived from patients who carry a genetic mutation are also promising candidates for cell replacement therapy if the disease-causing mutations can be permanently corrected. Most published studies that corrected mutations in iPSCs relied on adding a copy of a functional transgene randomly into the genome by viruses to restore the wild-type phenotype. For example, disease-corrected Fanconi Anemia iPSCs were established using this approach.2 However, random integrations of a functional copy of the affected gene, especially mediated by viral vectors, could induce oncogenic mutations and thus jeopardize the goal of gene therapy. The precise correction or replacement of mutations at the endogenous loci by homologous recombination (HR) is highly desirable even though the efficiency is extremely low (10⁻⁶) in noncancerous human cells such as human iPSCs. Recently, several studies reported HR-mediated gene addition at a few selective loci in normal or disease-specific human iPSCs.3-7 However, this gene addition approach is only suitable for phenotypic correction of loss-of-function mutations, and not for correcting dominant or gain-of-function mutations. 
The ideal solution is targeted and precise gene correction of the underlying disease-causing mutation, which also ensures the corrected gene will be expressed in the appropriate temporal and tissue-specific manner under the regulation of endogenous cis-elements.

Toward this goal, we focused on sickle cell disease (SCD) as a model system to develop methods for precise gene correction of a disease-causing mutation. In the majority of SCD patients, the mutation of A > T (also known as βA to βs mutation) in both alleles of the β-globin (HBB) gene that changes codon 6 from Glu (GAG) to Val (GTG) results in a defective form of adult hemoglobin in oxygen-carrying red blood cells. Although SCD was one of the first described molecular diseases, the goal for treating this monogenic disorder using gene therapy approaches has remained elusive.8,9 Gene correction of βs in mouse embryonic stem cells (ESCs) by HR has been reported previously.10,11 A similar strategy was used to correct the βs mutation in mouse iPSCs derived from a humanized SCD mouse model, followed by successful transplantation of differentiated hematopoietic cells into isogenic mice to cure SCD phenotypes.12 However, gene correction of the SCD mutation in human patient-specific iPSCs has not been demonstrated yet. Here, we show that with appropriate gene-targeting vector design and zinc finger nuclease (ZFN) enhancement of HR efficiency, site-specific gene correction of the silent HBB gene in human iPSCs can be achieved. We also demonstrate the expression of the corrected wild-type βA allele in red blood cells differentiated from thus corrected iPSCs. Some unexpected discoveries and discussions on promise and pitfalls of this site-specific gene correction approach in human cells also are presented.
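The codon change described above can be made concrete with a minimal Python sketch. It substitutes the middle base of codon 6 and looks up the encoded amino acid for the mutant and corrected versions. The two-entry codon table and the function name `substitute_base` are illustrative assumptions introduced here, not part of the study's methods.

```python
# Minimal sketch: the sickle mutation as a single-base substitution in codon 6.
# The codon table is deliberately limited to the two codons named in the text.
CODON_TABLE = {"GAG": "Glu", "GTG": "Val"}


def substitute_base(codon: str, position: int, new_base: str) -> str:
    """Return a copy of `codon` with the base at 0-based `position` replaced."""
    if not 0 <= position < len(codon):
        raise ValueError("position outside codon")
    return codon[:position] + new_base + codon[position + 1:]


wild_type = "GAG"                             # beta-A allele, codon 6
sickle = substitute_base(wild_type, 1, "T")   # A > T at the middle base
corrected = substitute_base(sickle, 1, "A")   # the T-to-A correction

for label, codon in [("wild type", wild_type), ("sickle", sickle), ("corrected", corrected)]:
    print(f"{label}: {codon} -> {CODON_TABLE[codon]}")
```

The sketch only illustrates why a single nucleotide replacement is sufficient to restore the wild-type codon; it says nothing about how the replacement is achieved in cells.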


Cell culture

Human iPSCs were maintained on primary mouse embryonic fibroblasts (PMEFs) as feeder cells using the standard human ESC media. The SCD iPSCs used in this study were generated in a previous study.13 The MBP5s1 (S1) iPSC line was derived using piggyBac transposition of reprogramming factors into bone marrow stromal cells (MSCs) from an adult patient with SCD at Johns Hopkins Hospital. To adapt to single-cell passaging, the S1 iPSCs were first passaged from PMEF feeders to a feeder-free culture condition using Matrigel (BD Biosciences) and Stemedia Nutristem XF/FF medium (Stemgent) at a 1:1 to 1:2 ratio using 0.05% trypsin (Invitrogen) digestion, and then they were maintained under this trypsin-passaging and feeder-free culture condition. All patient samples were used per approval from Johns Hopkins University internal review board for conducting laboratory research using anonymous human cells. All the iPSC lines described in this report will be provided on request to L.C., under a standard (uniform) material transfer agreement with the Johns Hopkins University. Human 293T cells, which were used to validate ZFNs and gene-targeting vectors, were grown in DMEM high glucose supplemented with 10% FBS.

Characterization of iPSCs after gene targeting and excision

Characterization and chromosome karyotyping (G-banding) of iPSC clones were performed at multiple passages as described previously.3,13-15 Color and fluorescent images of iPSC markers (in PBS) and teratomas (embedded sections) were acquired using a Nikon Eclipse TE2000-U inverted microscope with 10×ELWD Plan Fluor/0.3 or 20×ELWD Plan Fluor/0.45 objective and a Qimaging Micropublisher 5.0 digital camera with QCapture software (Version 3.1.2). Multichannel fluorescent images were merged using ImageJ software.

Construction of a donor plasmid vector targeting the endogenous HBB locus

The left and right homology arms were amplified from genomic DNA of SCD fibroblasts (GM02340 in Coriell collection) used in a previous study.15 The primer sets HBBL-F, 5′-ATCGGTACCGTGTGTAAGAAGGTTCCTGAGGCT and HBBL-R, 5′-ATCGCTAGCGGTCTCCTTAAACCTGTCTTGTAACC amplified the 5.9-kb left arm, and HBBR-F, 5′-ACTGCTAGCAATAGAAACTGGGCATGTGGAGACAG and HBBR-R, 5′-GATCTCGAGAAGAAGGGCTCACAGGACAGTCAA amplified the 2-kb right arm. The GTG mutation in the genomic DNA was converted to the wild-type GAG codon using PCR-mediated mutagenesis. A loxP-flanked PGK-hygromycin (Hyg) cassette from a previous targeting donor PHD3 was cloned into NheI site between 2 homology arms in the pCR2.1-TOPO vector. The loxP-flanked PGK-Hyg cassette is to be inserted at nt 179 of the HBB gene, 37 bp downstream of the exon 1 junction (nt 142). The EF1α-TK.GFP gene cassette (for counter selection) obtained from a previously published vector16 was inserted downstream of the right arm. The BD2 vector plasmid and sequence information is available on request.
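As a quick check on the primer design above, the following Python sketch scans the four homology-arm primers for restriction enzyme recognition sequences. The text names only the NheI site between the 2 arms; the assumption that the KpnI and XhoI sites found at the outer primer ends were also used for cloning is an inference made here, not a statement from the methods. The function name `sites_in` is hypothetical.

```python
# Recognition sequences of three common restriction enzymes. All three are
# palindromic, so scanning one strand of the primer is sufficient.
SITES = {"KpnI": "GGTACC", "NheI": "GCTAGC", "XhoI": "CTCGAG"}

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "HBBL-F": "ATCGGTACCGTGTGTAAGAAGGTTCCTGAGGCT",
    "HBBL-R": "ATCGCTAGCGGTCTCCTTAAACCTGTCTTGTAACC",
    "HBBR-F": "ACTGCTAGCAATAGAAACTGGGCATGTGGAGACAG",
    "HBBR-R": "GATCTCGAGAAGAAGGGCTCACAGGACAGTCAA",
}


def sites_in(primer: str) -> list[str]:
    """Return the names of all enzymes whose site occurs in `primer`."""
    return [name for name, site in SITES.items() if site in primer]


for name, seq in PRIMERS.items():
    print(name, sites_in(seq))

# Position of the cassette relative to the exon 1 junction, from the text.
INSERTION_NT = 179
EXON1_JUNCTION_NT = 142
print("bp downstream of exon 1 junction:", INSERTION_NT - EXON1_JUNCTION_NT)
```

Both inner primers (HBBL-R and HBBR-F) carry the NheI site, which is consistent with the selection cassette being cloned into an NheI site between the arms.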

Gene targeting reagents and experimental procedure

The CompoZr custom-designed HBB-ZFNs were obtained from Sigma-Aldrich and are now available to the public (CKOZFN1264-1KT). The ZFN pair was designed to target a DNA sequence in exon 1 of the HBB gene (underlined in Figure 1A). The 2 different HBB-ZFN subunits, which recognize the right (12-bp) and left (19-bp) half sequences, can only form a dimer with each other, which is required for endonuclease activity. They are expressed from 2 plasmids under the control of the CMV promoter. For the green fluorescent protein (GFP)–based HR reporter assay in 293T cells, we cloned a 63-bp DNA sequence (from codon 1, nt 54 to nt 116) of the HBB gene containing the putative ZFN recognition sites into the middle of the EGIP* vector following a previously published strategy.3 The insertion, together with a stop codon (taa) upstream and a HindIII site (aagctt) downstream, disrupts the enhanced green fluorescent protein (EGFP) reading frame. Full-length GFP activity can be restored if the mutated GFP (GFP*) DNA is corrected by HR (see Figure 1B). A plasmid bearing a nonexpressing, truncated GFP (tGFP) that provides a repair template, together with a pair of plasmids expressing the 2 ZFNs, was introduced into the stably transfected 293T cells expressing the EGIP*-HBB sequence. Successful (GFP*) gene targeting was measured by the frequency of GFP+ cells 2 days after transient transfection.3 To test the specificity of HBB-ZFNs, the homologous region from other β-locus genes such as HBE, HBG, and HBD also was inserted into the EGIP vector and tested in parallel. A shorter form of the HBB sequence (43 bp) containing the core ZFN recognition site also was tested by the same assay. To test whether HBB-ZFNs and the BD2 vector achieve endogenous HBB gene targeting, 0.5 million 293T cells were transfected by lipofection with 1 μg of BD2 donor and 0.25 to 2 μg of each HBB-ZFN in one 6-well plate. 
Five days after transfection, 1 million transfected cells were plated on a 150-mm tissue culture dish with 200 μg/mL hygromycin B selection, and then 20μM ganciclovir (GCV) was added 8 days afterward. The total drug selection lasted 15 days, including 7 days for GCV and 15 days for hygromycin B selection. Pooled cell populations were used for genomic DNA extraction and PCR detection assays.

To correct the βs mutation in SCD iPSCs, 5 × 10⁶ iPSCs were trypsinized and resuspended in mESC nucleofection buffer (Amaxa) and subsequently electroporated using the A-023 setting as described previously.3 The treated iPSCs were plated onto irradiated hygromycin-resistant PMEF feeders (PMEF-HL; Millipore), with Y-27632 (a Rho-associated kinase inhibitor) added to the media overnight at 10μM final concentration to improve survival of the nucleofected cells. From day 3 to day 17, hygromycin B selection at 20 μg/mL was performed. From day 10 to day 17, GCV (2μM) also was added to the media. Surviving colonies were picked on day 17, expanded on PMEF feeders following standard culture protocols, and then screened for targeting events.
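The iPSC double-selection schedule above can be written down as a small data structure, which makes the overlap between positive and negative selection easy to inspect. This is a sketch only: the class `SelectionWindow`, the helper `drugs_on`, and the choice to measure duration as elapsed days (end day minus start day) are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionWindow:
    drug: str
    start_day: int       # day after nucleofection on which the drug is added
    end_day: int         # day on which selection ends (colonies are picked)
    concentration: str

    def duration(self) -> int:
        """Elapsed days of selection (assumed convention: end - start)."""
        return self.end_day - self.start_day


# Values taken from the iPSC protocol described in the text.
IPSC_SCHEDULE = [
    SelectionWindow("hygromycin B", 3, 17, "20 ug/mL"),
    SelectionWindow("GCV", 10, 17, "2 uM"),
]


def drugs_on(day: int, schedule: list[SelectionWindow]) -> list[str]:
    """List the drugs present in the medium on a given day."""
    return [w.drug for w in schedule if w.start_day <= day <= w.end_day]


for window in IPSC_SCHEDULE:
    print(window.drug, window.duration(), "days at", window.concentration)
print("day 12:", drugs_on(12, IPSC_SCHEDULE))
```

Under this convention, hygromycin B is applied alone for the first week of selection, and both drugs are present for the final week.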

PCR detection of targeted integration, HBB, and Hyg mRNA expression

Primer set 5′-CAAATGCGAGAGAACGGCCTTAC (in the Hyg cassette) and 5′-CTAGCACTGCAGATTCCGGGTCAC (on the HBB locus downstream of the 3′-homology arm) was used to amplify a 2.5-kb product of the 3′-junction of a targeted integration (TI; see Figure 2A). Primer set 5′-ACATTTGCTTCTGACACAAC (in HBB exon 1) and 5′-AGCAAGAAAGCGAGCTTAG (in HBB exon 3) was used for RT-PCR detection of mutant or corrected HBB mRNA expression (see Figures 2A and 6B). Primer set 5′-ATTCCGGAAGTGCTTGACATTG and 5′-CACGCCATGTAGTGTATTGACC (both in the Hyg coding sequence) was used for RT-PCR detection of Hyg mRNA expression (see Figure 7A).

Genomic PCR amplification and sequencing of both alleles in SCD iPSCs

Primer set 5′-CGATCACGTTGGGAAGCTATAGAG and 5′-AACATCCTGAGGAAGAATGGGAC was used to amplify the 3.5-kb genomic region surrounding HBB gene for both targeted and nontargeted alleles of S1, cre4, and cre16. For c36 (targeted) iPSCs, longer extension time was used to amplify both 3.5-kb nontargeted and 5.8-kb targeted alleles. Mixed alleles from these iPSC clones were cloned into a TOPO vector. Then, DNA in individual bacterial clones was sequenced to identify specific alleles (targeted or nontargeted).

Southern blot analysis of genomic structure near the HBB locus

Standard Southern blot protocol using digoxigenin-labeled probes was followed as described previously.3 The 3′-probe was amplified from S1 iPSC genomic DNA using primers 5′-GACTGAGAAGAATTTGAAAGGCG and 5′-TCATCAATTCTGCCATAAATGG. The Hyg probe used was the same as described previously.3

Genome-wide SNP concordance analysis

The Omni1_Quad BeadArray chip (Illumina) containing more than 10⁶ probes detecting informative single-nucleotide polymorphisms (SNPs) was used. The array analysis was performed by the Johns Hopkins SNP Center as part of the Center for Inherited Disease Research (CIDR). Based on 1 140 419 SNPs identified and the array results with 2 control genomic DNA samples (CIDR11993 and CIDR10860) that have been previously sequenced and are included in each run, the Johns Hopkins SNP Center reported a 0.27% genotyping error rate in the run, a rate that was within the normal range. Genomic DNA isolated by the DNeasy kit (QIAGEN) from c36 (after ZFN-mediated HR), S1 iPSCs, and their parental somatic cells (MSCs) was analyzed. Based on the ratio of allele intensities, the discordance rate between c36 and S1 iPSCs (2 samples of different passages before and after trypsin-adaptation, p17 and p26) or MSCs (passages 0 and 6) is below 0.04%, a rate that is much lower than the basal error rate level (0.27%). In comparison, the discordance rates between c36/S1 iPSCs and other unrelated iPSCs and between the 2 unrelated CIDR11993 and CIDR10860 controls are ∼ 40%. Therefore, our data suggest that the c36 iPSC genome is essentially identical to that of the S1 iPSCs or their somatic MSCs.
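The idea of a discordance rate can be illustrated with a short Python sketch that compares genotype calls at matched SNP positions. This is a simplification: the analysis described above was based on the ratio of allele intensities, whereas the sketch compares discrete genotype calls. The function name, the no-call symbol, and the toy genotype lists are all hypothetical and are not data from the study.

```python
def discordance_rate(calls_a, calls_b, no_call="--"):
    """Fraction of SNPs with differing calls, ignoring positions with a no-call."""
    if len(calls_a) != len(calls_b):
        raise ValueError("call lists must cover the same SNPs in the same order")
    compared = 0
    mismatched = 0
    for a, b in zip(calls_a, calls_b):
        if a == no_call or b == no_call:
            continue  # skip SNPs that failed in either sample
        compared += 1
        if a != b:
            mismatched += 1
    if compared == 0:
        raise ValueError("no SNPs with calls in both samples")
    return mismatched / compared


# Toy genotype calls (hypothetical), 8 SNPs per sample.
parent = ["AA", "AB", "BB", "AB", "--", "AA", "BB", "AB"]
clone = ["AA", "AB", "BB", "AB", "AA", "AA", "BB", "AA"]
rate = discordance_rate(parent, clone)
print(f"toy discordance: {rate:.3f}")

# Rates reported in the text, expressed as fractions.
REPORTED_CLONE_VS_PARENT = 0.0004   # below 0.04%
REPORTED_ERROR_RATE = 0.0027        # 0.27% genotyping error in the run
print("within assay noise:", REPORTED_CLONE_VS_PARENT < REPORTED_ERROR_RATE)
```

The final comparison mirrors the argument in the text: a clone-versus-parent discordance below the run's own genotyping error rate cannot be distinguished from measurement noise.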

Erythroblast differentiation from human iPSCs

Human iPSCs were directly differentiated into hematopoietic progenitor cells by a modified embryoid body (EB) formation method as described previously.14 Two weeks after the EB-mediated hematopoietic differentiation, the EB-derived cells were collected and dissociated with accutase treatment. The dissociated cells were cultured in a serum-free medium containing SCF (100 ng/mL), IL-3 (10 ng/mL), insulin-like growth factor II (40 ng/mL), erythropoietin (2 U/mL), and dexamethasone (1μM) for erythroid cell expansion and maturation as described previously.14 Cells were harvested on day 8 of erythroid expansion and differentiation, and the differentiated cells were analyzed by fluorescence-activated cell sorting analysis, cytospin and histology staining, and RT-PCR for globin expression. Giemsa stain of cytospin was captured using the same camera and software used for iPSCs except 40×ELWD Plan Fluor/0.6 objective. Quantitative (q)RT-PCR primers and probes are from Applied Biosystems (identification numbers are Hs00361131_g1 for HBG1/2 and Hs00747223_g1 for HBB).

Adult hemoglobin (HbA) antibody is from Santa Cruz Biotechnology (sc-21757PE). Fetal hemoglobin (HbF) antibody is from Invitrogen (catalog HFH-01).


Specific enhancement of HR efficiency by a novel pair of HBB-ZFNs

Based on our previous studies of gene targeting in human iPSCs and ESCs,3 we attempted to correct the mutated βs allele in SCD iPSCs. To achieve specific HR-mediated gene replacement at the HBB gene, but not at nearby HBD (δ), HBG1 (Aγ), HBG2 (Gγ), and HBE (ϵ) globin genes (Figure 1A), we decided to use ZFNs that make a specific double-strand break to stimulate HR near the βs mutation (codon 6) in the HBB gene. After initial screening, a pair of HBB-ZFNs was identified that targets a bipartite 31-bp sequence (with a 6-bp spacer) in exon 1 (21 bp downstream of the βs mutation) of the HBB gene (Figure 1A). We first tested the activity and specificity of this newly designed HBB-ZFN pair using a GFP reporter system3: DNA sequences from HBB or highly homologous β-locus genes HBE, HBD, and HBG were tested as candidate alternative targets of this ZFN pair (Figure 1A). Successful HR between the EGIP* reporter (containing a ZFN target sequence) and a tGFP donor will restore a full-length GFP gene (Figure 1B). Using flow cytometry to measure HR-mediated gene correction in stably transfected 293T cells expressing the EGIP*-HBB target, we monitored the number of GFP+ cells after tGFP donor transfection, in the absence or presence of HBB-ZFN stimulation. Without cotransfection of HBB-ZFN expression vectors, the basal gene correction rate was ∼ 4 to 5 per million cell events, as observed previously3 (Figure 1C). In the presence of the HBB-ZFNs, the frequency of GFP+ cells rose to 0.16% among cells expressing the EGIP* reporter that contains the HBB target sequence, a ∼ 350-fold stimulation (Figure 1C). In contrast, the HBB-ZFNs showed no detectable stimulatory activity on EGIP* containing the homologous target DNA from the HBD, HBE, and HBG genes, indicating the HBB-ZFNs are highly specific (Figure 1D).
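The fold stimulation quoted above follows directly from the two reported frequencies once they are put on the same scale. The Python sketch below converts the 0.16% figure to events per million cells and divides by the basal rate. Using the midpoint of the 4 to 5 per million range as the denominator is an assumption made here for illustration.

```python
# Reported values from the reporter assay.
BASAL_PER_MILLION = (4, 5)   # GFP+ events per million cells, donor alone
ZFN_PERCENT = 0.16           # GFP+ cells (%) with donor plus HBB-ZFNs


def percent_to_per_million(percent: float) -> float:
    """Convert a percentage to events per million (1% = 10,000 per million)."""
    return percent * 10_000


zfn_per_million = percent_to_per_million(ZFN_PERCENT)

# Fold stimulation at both ends of the basal range and at its midpoint.
fold_range = (
    zfn_per_million / BASAL_PER_MILLION[1],
    zfn_per_million / BASAL_PER_MILLION[0],
)
fold_midpoint = zfn_per_million / (sum(BASAL_PER_MILLION) / 2)

print(f"with ZFNs: {zfn_per_million:.0f} per million")
print(f"fold stimulation: {fold_range[0]:.0f} to {fold_range[1]:.0f}")
print(f"fold at basal midpoint: {fold_midpoint:.0f}")
```

Depending on which end of the basal range is used, the stimulation falls between roughly 320-fold and 400-fold, which brackets the approximately 350-fold value given in the text.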

Figure 1

Activities and specificities of HBB-ZFNs that stimulate gene targeting in a GFP reporter assay. (A) The putative recognition sequence of HBB-ZFNs in the HBB gene (5′ to 3′, starting from the first codon). The right (12-bp) and left (19-bp) ZFN sites are underlined. The homologous sequences from other β-locus genes (HBE, HBD, and HBG1/2) and their differences from the HBB gene in L-ZFN, R-ZFN, and spacer regions (no. of mismatches) also are shown. Each was inserted into the GFP reporter as a ZFN target sequence to test the specificity of HBB-ZFNs. All the inserts start with a STOP codon (red, taa) and end with a HindIII site (blue, aagctt). A short version of the HBB target sequence (called HBB-Short) also was tested. (B) Schematic of the GFP* reporter rescue assay. An EGIP* mutant was created by inserting a ZFN target sequence including a STOP codon and HindIII site into the GFP gene. Only after gene targeting with a tGFP donor (with or without ZFNs) will the EGIP* be corrected to restore wild-type GFP expression. (C) Flow cytometric analysis of GFP correction after HR in 293T cells stably transfected with the EGIP*-HBB reporter. Two days after transient transfection of tGFP donor alone or with HBB-ZFNs, numbers of GFP+ cells were measured by dot plot of 1 million collected cell events. (D) Gene correction efficiency of EGIP* mutants with HBB, HBE, HBD, HBG, or HBB-Short ZFN target sequence using the tGFP donor, with or without HBB-ZFNs. Numbers of GFP+ cells per 10⁶ 293T-EGIP* cells are plotted as mean ± SEM, n = 3.

Targeted gene correction of SCD mutation in the endogenous HBB locus

To achieve targeted correction of the βs mutation at the HBB locus, we designed a plasmid donor (BD2) to provide wild-type GAG for codon 6 (βA) and 2 constitutively expressed drug-selectable genes: a PGK promoter-driven hygromycin-resistance gene (PGK-Hyg) for positive selection and an EF1α-promoter–driven thymidine kinase-GFP fusion gene for counterselection. The PGK-Hyg selection gene cassette to be inserted into HBB intron 1 also was flanked by loxP sequences for future Cre-mediated excision (Figure 2A). We first evaluated the BD2 vector for gene targeting of the endogenous HBB gene in human 293T cells, in the presence or absence of the HBB-ZFNs delivered as 2 expression plasmids. After both hygromycin B and GCV selection (HygR-GCVR), pooled 293T cells were used in genomic PCR assays to detect whether any cell carried a TI of the PGK-Hyg in the intron concurrent with the SCD mutation replacement after HR by the same donor template. Two PCR primers, one within the inserted Hyg gene and the other downstream of the right homology arm, were used to detect the 3′-TI (2.5 kb). Our data showed that despite the high transfection efficiency in 293T cells, the BD2 donor vector alone did not generate TI mediated by HR in the silent HBB locus. However, adding HBB-ZFNs greatly stimulated gene targeting at the endogenous HBB locus (Figure 2B). Thus, we validated that the BD2 donor is capable of targeting the HBB locus, and that a pair of 3′-TI primers can be used to screen targeted iPSCs after cotransfection of the BD2 donor vector and 2 HBB-ZFNs followed by hygromycin and GCV double selection.

Figure 2

Site-specific gene correction of the βs mutation in the HBB gene. (A) A scheme of stepwise gene correction of 1 mutant βs allele (shown as 3 exons, 2 introns, and flanking sequences), first by HR-mediated gene targeting and followed by Cre-mediated excision. The gene-targeting donor BD2 vector with 2 homology arms (5.9-kb left arm and 2-kb right arm indicated by X) introduces an HR template for T-to-A replacement in the βs allele and a loxP-flanked drug-selection cassette PGK-Hyg to be inserted into the HBB intron 1. The flanking counter-selection HSV-TK gene (in the form of a TK.GFP fusion) driven by the EF1α promoter (outside the right HR homology arm) is used to reduce the frequency of Hyg-resistant clones that arise from random integration of the BD2 vector, which also allows HSV-TK expression. For validated HR clones, Cre-mediated excision removes only the PGK-Hyg selection gene cassette and leaves 1 copy of the loxP DNA in the middle of HBB intron 1 of the corrected allele, generating “cre” clones with 1 corrected allele βA (CorrectedΔloxP). We used 2 PCR primers (red arrows) for initial screening of TI indicative of correct HR. (B) The initial results of gene targeting in 293T cells that were transfected with the 1-μg BD2 donor alone (lane 1) or with HBB-ZFNs at increasing amounts (0.25 μg in lane 2, 1 μg in lane 3, and 2 μg in lane 4). The 3′-TI event (2.5-kb PCR product) was detected when ZFNs were present. Untransfected 293T cells in lane 5 are negative. (C) Similar results in selected clones after gene targeting in S1 iPSCs and Hyg and GCV selection. Four clones (c36, c64, c68, and c70) showed a positive PCR product. Positive control (p.c.) was DNA from targeted 293T cells (lane 3) in panel B. (D) Southern blot analysis surrounding the HBB locus in the parental S1 and selected targeted iPSC clones. 
A 3′-probe downstream of the 3′-homology arm (top line) is used with genomic DNA digestion by PmeI (P) and EcoRV (E) enzymes to confirm the presence of the targeted allele (with an EcoRV site inside the TI), shown as a green arrow, and the original HBB allele, shown as a red arrow. (E) A Hyg probe is used with genomic DNA digestion by XbaI (X) and SpeI (S) enzymes to confirm the targeted allele (with Hyg insertion, green arrow) or to identify random integration events such as the event in clone 65. (F) PCR screening for clones with successful Cre excision, using the 2 primers shown in red. The absence of the Hyg-containing DNA in clones such as cre16 and cre19 indicates the excision of the PGK-Hyg cassette. (G-H) Southern blot analyses of iPSC clones before and after Cre excision. (G) Results with the 3′-probe after P and E digestion as shown in panel D. Red arrows indicate 4.3-kb fragments from the βs allele in S1 and c36 iPSC clones and the βA allele in cre clones (4, 16, and 19). (H) Southern blots with the Hyg probe after X and S digestion as shown in panel E. The S1 and 3 cre clones are free of the Hyg gene, which was found in the c36 clone as expected (green arrow).

We next used an SCD patient-specific iPSC line, MBP5s1 (S1), that was established previously by a virus-free method.13 We adapted S1 iPSCs to single-cell passaging so that they survive better after nucleofection. After adaptation for 10 continuous passages, the resulting S1 iPSCs at p26 retained iPSC characteristics, pluripotency, and a normal karyotype (data not shown). We then transfected S1 iPSCs with the BD2 donor and 2 HBB-ZFN plasmids by nucleofection, and we selected for clones resistant to hygromycin and GCV. Proceeding thus, we screened 300 HygR-GCVR S1 iPSC clones from multiple HR experiments. First using PCR to detect the 3′-junction of TI, we identified 4 candidate clones (Figure 2C). One clone, c36, was further expanded and confirmed by Southern blot analysis to bear a TI of the PGK-Hyg cassette in intron 1 on one HBB allele and no additional random integration (Figure 2D-E). The other 3 clones either failed to proliferate or could not be confirmed positive by Southern blot (data not shown). DNA sequencing of c36 genomic DNA surrounding the HBB exon 1 also confirmed a GTG-to-GAG correction in the targeted allele (Figure 7B).

Although the inclusion of the PGK-Hyg selection gene cassette in intron 1 enables selection of rare HR clones, which is critical for targeting genes such as HBB that are silent in undifferentiated iPSCs and ESCs, it may potentially interfere with expression of the corrected HBB allele.10 To alleviate this potential problem, we next excised the loxP-flanked PGK-Hyg cassette using transient transfection of a Cre-expressing plasmid. Four of 24 PCR-screened clones were shown to be negative for 3′-TI (Figure 2F), and 3 of those clones (cre4, cre16, and cre19) were selected and confirmed by Southern blot to be free of the integrated PGK-Hyg gene cassette (Figure 2G-H). Sequencing of genomic DNA PCR products using primers in exons 1 and 2 further confirmed the presence of 1 unmodified βs allele and 1 corrected (βA) allele that also bears the leftover loxP site (46 bp including linker DNA) in intron 1, as expected (Figure 3).

Figure 3

Genomic DNA PCR and sequencing confirm precise monoallelic gene correction and Cre-LoxP excision. (A) Schematic of genomic DNA PCR using primers in 5′-untranslated region of exon 1 and exon 2 of HBB. With a short 72°C extension step (15 seconds), only the mutant allele (291 bp) and the corrected allele after excision (337 bp) can be amplified. (B) DNA gel shows 1 band for the c36 clone representing the mutant allele that can be amplified by the PCR protocol, and 2 bands from every cre clone representing both mutant allele and corrected allele after excision. (C) The mixed PCR products from cre4 iPSCs (after gene correction and Cre-LoxP excision) were cloned into a TOPO vector, and individual clones were sequenced. Among 8 sequenced clones, 4 clones (50%) were shown to contain a mutant allele (top panel), and 4 clones (50%) bore the corrected allele with a remnant loxP site after excision (bottom panel).
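The band pattern in Figure 3 can be reasoned about with simple arithmetic: the corrected allele after excision differs from the mutant amplicon only by the 46-bp loxP remnant. The Python sketch below encodes the two expected product sizes and maps an observed set of bands to an interpretation. The function name `interpret_bands` and the wording of the returned labels are illustrative assumptions.

```python
# Amplicon sizes reported for the short-extension genomic PCR.
MUTANT_AMPLICON_BP = 291
LOXP_REMNANT_BP = 46   # leftover loxP site including linker DNA
CORRECTED_EXCISED_AMPLICON_BP = MUTANT_AMPLICON_BP + LOXP_REMNANT_BP


def interpret_bands(band_sizes_bp):
    """Map a set of observed band sizes (bp) to the alleles they indicate."""
    bands = set(band_sizes_bp)
    has_mutant = MUTANT_AMPLICON_BP in bands
    has_corrected = CORRECTED_EXCISED_AMPLICON_BP in bands
    if has_mutant and has_corrected:
        return "mutant allele + corrected allele after excision"
    if has_mutant:
        return "mutant allele only"
    if has_corrected:
        return "corrected allele after excision only"
    return "no expected product"


print(CORRECTED_EXCISED_AMPLICON_BP)   # size of the corrected, excised product
print(interpret_bands([291]))          # pattern described for c36
print(interpret_bands([291, 337]))     # pattern described for the cre clones
```

A single 291-bp band in c36 is expected because the targeted allele still carries the selection cassette and is too long to amplify with the short extension step described in panel A.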

Through 2 steps of genetic manipulation, we obtained multiple human iPSC clones (c36, cre4, cre16, and cre19) with precise gene correction of the SCD mutation. These clones showed human ESC-like morphology and maintained normal karyotypes and pluripotency under ESC/iPSC culture conditions (Figure 4), similar to the parental S1 iPSCs. Our genome-wide SNP concordance analysis did not detect overt differences between c36 (after HR) and its S1 parental iPSC line (see “Methods”), although a detailed copy number variation or whole-genome sequencing analysis may provide more information about subtle genetic changes.

Figure 4

Characterization of gene-corrected SCD iPSC clones. Gene corrected SCD iPSC clones (c36 and cre4 shown here) display characteristic pluripotency markers such as alkaline phosphatase (AP), OCT4, NANOG, and TRA-1-60 (A); maintain normal karyotypes (B); and form teratomas bearing cells from all 3 germ layers, that is, ectoderm, mesoderm, and endoderm (C).

Differentiation of patient-derived and gene-corrected SCD iPSCs into erythroid cells

Because the HBB gene is not expressed in the undifferentiated iPSCs, we used differentiated iPSCs to evaluate the restoration of normal β-globin expression. Human iPSC lines S1, c36, cre4, and cre16 were differentiated into erythroid cells using a 2-step serum-free differentiation protocol (Figure 5A). During the first step of EB formation, all the iPSC lines formed cystic EBs efficiently. At the end of the second step (erythroid expansion and maturation for 8 days), the differentiated cells expressed CD71 (94%-98%) and CD235a (glycophorin A, 24%-29%) at similar levels (Figure 5B), comparable to normal iPSC lines we tested previously.14 Giemsa stain confirmed most of the cells in culture were nucleated erythroblasts (Figure 5C).

Figure 5

Erythroid differentiation of SCD iPSC clones. (A) In vitro hematopoietic differentiation of various iPSC clones to generate HBB-expressing erythroid cells. After 14 days of EB-mediated spontaneous differentiation, iPSC-derived cells were further differentiated and expanded into immature erythroid cells (erythroblasts or EryB) for another 8 days. Then, the erythroid cells were collected for flow cytometry, Giemsa stain of cytospin, and RT-PCR or qRT-PCR. (B) Flow cytometric analysis of S1, cre4, and cre16 using erythroid-specific surface markers CD71 and CD235a (glycophorin A). (C) Giemsa stain confirming that most of the differentiated cells are erythroblasts.

β-globin expression in SCD iPSCs after erythroid differentiation

We first used qRT-PCR to measure the total expression levels of the HBB gene (the βs and a corrected allele) as well as the (fetal-type) HBG genes in all SCD iPSCs (S1, cre4, and cre16) and differentiated iPSCs at the EB stage and at the erythroblast stage (Figure 6A). Although the level of HBG expression in the SCD iPSC-derived erythroblasts (S1-EryB, cre4-EryB, and cre16-EryB) is comparable to that in cord blood mononuclear cells (CB-MNCs), the level of HBB expression in the erythroblasts is > 100-fold lower than that in CB-MNCs, even though it is ∼ 100-fold higher than that in the EBs (Figure 6A). Using a pair of primers located in exons 1 and 3 (Figure 2A blue arrows), we amplified the HBB transcripts from the erythroblasts of gene-corrected SCD iPSCs (c36, cre4, and cre16) by conventional RT-PCR (Figure 6B). Then, we cloned each PCR product, which contains mixed DNA of corrected/wild-type and uncorrected/mutant alleles, into a TOPO vector and sequenced individual clones. This approach reveals the frequencies of both gene-corrected (GAG) and uncorrected βs (GTG) transcripts in erythrocytes from each iPSC line. We found that only the uncorrected βs (GTG) allele was expressed in c36-derived cells because no sequenced HBB cDNA contained corrected βA (GAG). In contrast, erythroid cells from both cre4 and cre16 iPSC clones (that had undergone Cre-mediated excision of the selection cassette) expressed the corrected GAG HBB allele, although at a level of only 25% to 40% of that of the uncorrected (and mutated) allele (Figure 6C). Our flow cytometry assay using antibodies for adult or fetal hemoglobin proteins confirmed that the vast majority of hemoglobin expression in the erythroblasts is fetal-type (HbF), consistent with our qRT-PCR results (Figure 6D). 
Our result is also consistent with 2 recent reports on erythropoiesis of human iPSCs,17,18 indicating that a better culture system is needed to increase erythrocyte maturation and survival, increase the HBB gene expression, and improve translation from existing HBB transcripts.

Figure 6

HBB and HBG transcription and translation analyses. (A) HBB and HBG1/2 gene expression (normalized to a house-keeping gene GAPDH) in undifferentiated iPSCs (S1) and differentiated progenies (EB and EryB) of S1, cre4, or cre16 was measured by quantitative RT-PCR (data represent mean ± SEM, n = 3). After the erythroid differentiation from iPSCs, the HBB transcript level increased 10- to 100-fold, although it is still 100- to 1000-fold lower compared with the level in CB-MNCs. (B) Conventional RT-PCR that readily amplifies HBB cDNA in erythroblasts derived from various SCD iPSC clones before (S1) or after gene targeting (c36) and Cre-mediated excision (cre4 and cre16), by 2 primers located at exons 1 and 3 (left illustration). (C) Although sizes of RT-PCR products of the unmodified or corrected alleles are the same, DNA sequencing of the RT-PCR product showed a uniform transcript in c36-EryB and mixed transcripts in cre4-EryB and cre16-EryB (bottom chromatographs). Cloning the transcripts into a TOPO vector and sequencing at the clonal level distinguishes expression from corrected versus uncorrected alleles. Sequencing of 40 to 60 individual cloned DNA molecules of RT-PCR products from each differentiated iPSC line revealed the absence of corrected HBB transcript (T, 100% or 40/40 cloned and sequenced) in c36-derived erythroblasts, whereas in the erythroblasts derived from cre4 and cre16 after Cre-mediated excision of the PGK-Hyg gene cassette, expression of the corrected (A) allele was detected. In cre4, both corrected (A, 28% or 17/60) and the unmodified (and mutated T) HBB alleles (72% or 43/60) were expressed. A similar result was obtained in cre16 iPSCs: 12/60 (20%) and 48/60 (80%) of the cloned and sequenced transcripts are from the corrected (A) and the uncorrected (T) alleles, respectively. 
(D) S1-EryB, cre4-EryB, and cre16-EryB expressed abundant fetal-type hemoglobin HbF, but no detectable adult-type hemoglobin HbA measured by flow cytometry using specific antibodies.

Investigation of repressed gene expression of the gene-targeted HBB allele

Our data showed that although HBB transcripts can be detected at a low level in the iPSC-derived erythroid cells, expression of the gene-targeted, corrected βA allele was totally or partially lost in differentiated erythroblasts at the same stage. We then took several approaches to investigate the potential epigenetic or genetic factors that may be responsible for the repressed βA expression in c36, cre4, and cre16 cells. For c36, there was a 2.2-kb loxP-flanked PGK-Hyg gene cassette inserted in the first intron, excision of which resulted in partial expression of the gene-targeted allele (Figure 6C). The PGK-Hyg insertion in the first intron (37 bp away from the exon–intron junction) may affect HBB gene expression in 2 ways. (1) The PGK-Hyg (intronless) gene cassette in the same orientation may introduce a cryptic/alternative splicing acceptor. We used 3 bioinformatics programs (NNSPLICE; ASSP, ∼mwang/assp.html; and NetGene2) to predict potential cryptic/alternative splicing acceptors in the insertion fragment. We identified several candidates and performed RT-PCR analysis, but we did not detect any alternative transcript around the region (data not shown). (2) The PGK promoter may make an antisense transcript that blocks HBB expression. However, no RT-PCR products could be detected using several primers covering both strands of HBB exon 1 and the Hyg gene sequences. The silencing of Hyg gene expression (driven by the PGK promoter) in expanded iPSCs was a surprise, but it is consistent with the observation that c36 iPSCs lost their resistance to hygromycin selection (which was used in the initial clone screening) after expansion in the absence of selection.
When we compared hygromycin transgene expression among several iPSC lines in which the same PGK-Hyg gene cassette is inserted either in the silent HBB gene (such as c36) or in the constitutively active PIG-A gene (such as the FPHR iPSCs we published previously3), we found that both expanded c36-iPSCs and c36-EryB expressed the hygromycin transgene at a much lower level than FPHR gene-targeted iPSCs (Figure 7A). This suggests that epigenetic factors may play a role in silencing the gene-targeted allele, including both the inserted PGK-Hyg transgene cassette and the adjacent HBB gene.
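The idea behind screening an inserted fragment for cryptic splice acceptors can be illustrated with a deliberately simplified sketch. It is not the algorithm used by NNSPLICE, ASSP, or NetGene2; it only flags AG dinucleotides that are preceded by a pyrimidine-rich stretch, a commonly described feature of 3′ splice sites. The function name, window size, and threshold are hypothetical choices made for illustration.

```python
def candidate_acceptors(seq, window=15, min_pyrimidine_fraction=0.7):
    """Toy scan for possible 3' splice acceptor sites.

    Flags every AG dinucleotide whose preceding `window` bases are at
    least `min_pyrimidine_fraction` pyrimidines (C or T). Both parameters
    are illustrative assumptions, not values taken from the study.
    Returns a list of (0-based index of the A in AG, pyrimidine fraction).
    """
    seq = seq.upper()
    hits = []
    for i in range(window, len(seq) - 1):
        if seq[i:i + 2] != "AG":
            continue
        upstream = seq[i - window:i]
        fraction = sum(base in "CT" for base in upstream) / window
        if fraction >= min_pyrimidine_fraction:
            hits.append((i, fraction))
    return hits


if __name__ == "__main__":
    # Synthetic example sequence (not from the HBB locus or the PGK-Hyg cassette).
    example = "GGGGG" + "TCTTCCTTTCTCCTT" + "AG" + "GTACG"
    for position, fraction in candidate_acceptors(example):
        print(f"candidate acceptor AG at index {position}, "
              f"upstream pyrimidine fraction {fraction:.2f}")
```

Candidates produced by any such predictor still need experimental confirmation, which is why the predicted sites were followed up by RT-PCR.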

Figure 7

Investigation of reduced expression of the gene-targeted allele. (A) RT-PCR showed the same PGK-Hyg transgene was expressed at a very low level in gene-targeted c36-iPSCs (lane 3) and c36-EryB (lane 4) compared with another iPSC line (FPHR, where the PGK-Hyg cassette was targeted into the actively expressed PIG-A gene,3 lane 1). Nontargeted S1-EryB cells were used as a negative control (lane 2). (B) Sequencing of a 3.5-kb genomic region of the HBB locus in SCD iPSC clones. The 3.5-kb genomic region includes an ∼ 1.1-kb promoter, all the HBB exons and introns, and an ∼ 0.5-kb downstream sequence that were part of the BD2 targeting donor (black lines/shapes/names) and also contains an ∼ 200-bp 3′-enhancer sequence (gray lines/shapes/names) downstream of the right homology arm. DNA sequencing of both alleles in early (p24) or late (p56) passage of S1 revealed a uniform βS mutation in exon 1 and a wild-type GATA site in the 3′-enhancer (underlined with complementary strand sequence underneath). However, in c36, cre4, or cre16, sequencing of mixed alleles showed heterozygous nucleotides (N), including βA in exon 1, a G-to-T polymorphism near the 3′-end of the right homology arm, and an A-to-G mutation in the GATA site, all linked on the gene-targeted allele.

Even though we had expected that insertion of the drug-selection cassette might interfere with transcription of the gene-targeted allele, we were surprised to see partially repressed βA expression in cre4 and cre16 after Cre-mediated excision (Figure 5C). One possibility is that the remnant loxP sequence, which is 37 bp away from the exon–intron junction, interferes with HBB gene expression. To examine other possibilities, we sequenced a 3.5-kb region covering the whole HBB gene and its flanking genomic regions in all 3 gene-targeted lines (c36, cre4, and cre16) that had the T > A replacement in exon 1 (Figure 7B left panels). Two single nucleotide variants that differ from the parental S1 sequence were found in the targeted allele of all 3 iPSC lines. The first variant, located downstream of exon 3 in the gene-targeted HBB allele (Figure 7B middle panels), is a polymorphic nucleotide change (G > T) in the 3′-homology arm and was probably introduced by the gene-targeting vector, which contains nonisogenic sequence. This SNP, however, has not been associated with any cis-regulatory elements or SNPs affecting HBB expression. The second variant (A > G) is found only in the targeted allele and is located downstream of the 3′-end of the right homology arm (Figure 7B right panels); therefore, it was not contained in the gene-targeting donor. This A > G mutation modified the core GATA binding site in the 3′-enhancer of the HBB gene.19 Deletion of the GATA-containing 3′-enhancer element has been shown to reduce HBB expression in transgenic mice.20,21 Because this mutation was found only on the targeted allele of gene-corrected SCD iPSCs (Figure 7B), it was probably introduced during the HR-mediated repair and passed from c36 to cre4 and cre16. At present, it is unclear whether this mutation in the GATA-containing 3′-enhancer, the remaining loxP site in the first intron, or both are responsible for the reduced expression from the targeted HBB allele.


Various methods have been explored to increase gene-targeting efficiencies in human iPSCs and ESCs, with or without the use of ZFNs. Recombinant adenovirus-associated viruses,5,6 helper-dependent adenoviruses,22 and bacterial artificial chromosome vectors23 have been used in gene targeting/correction without ZFNs in several cell types, including human ESCs and iPSCs. However, these previous studies focused on genes that are constitutively transcribed in human iPSCs. Regardless of whether the underlying target allele is expressed or silent in undifferentiated iPSCs, the method we described here should apply to correction or replacement in human iPSCs of any specific mutation or even an SNP locus. During the revision of this report, 2 other studies reported the correction of single mutations in human iPSCs.24,25 Together with our study reported here on gene correction of the SCD mutation, these studies show that such correction is feasible regardless of whether the underlying gene is expressed or silent in human iPSCs.

Numerous studies have reported that homology-directed repair is often much less efficient when a silent gene is targeted.26,27 This may also explain the low efficiency we observed in targeting the HBB gene, even in the presence of a pair of ZFNs, compared with that of a constitutively expressed gene (PIG-A) we published previously.3 The HBB-ZFNs used here are 50% less efficient than the PIG-A ZFNs, as measured by similar GFP* reporter assays (Figure 1B), which also could contribute to the lower gene-targeting efficiency at the HBB gene in human iPSCs. With improved ZFN technology28 or other emerging methods such as TALE nucleases,29 we will probably find a high-affinity ZFN or TALE nuclease that cuts at or near the SCD mutation in HBB without cutting the nearby HBD, HBG, and HBE genes.

For early-passage human iPSCs, which typically have less than 1% colony-forming efficiency, successful gene targeting of a silent gene such as HBB relies heavily on a co-integrated drug-selection gene driven by a constitutively active promoter. Even with a floxed drug-resistance gene, gene targeting of a silent gene may still face obstacles because of silencing of the targeted locus in undifferentiated iPSCs. Indeed, we noticed that the HR-targeted clone (c36) obtained after 2 weeks of Hyg selection became sensitive again to hygromycin on subsequent culture. Our RT-PCR results confirmed very low expression of the PGK-Hyg transgene cassette in both undifferentiated c36 iPSCs and differentiated erythroid cells (Figure 7A). We hypothesize that epigenetic mechanisms such as DNA methylation in pluripotent cells may silence a transgene inserted at an unfavorable locus, which in turn represses expression of the nearby targeted endogenous gene, such as HBB, even after differentiation. A similar reduction of gene-targeted allele expression also was observed in the previous report of correcting humanized SCD mouse ESCs harboring the human βS allele.10 However, a transgene expression cassette inserted into an intron can be easily excised via the Cre-loxP system, and the endogenous gene expression can be partially restored after excision.

Although most applications do not require that expression of the corrected allele be completely restored to the level of the uncorrected allele, it is important to understand why expression of the corrected allele was only partially restored (25%-40%) in cre4 and cre16 iPSCs after excision of the loxP-flanked cassette (Figure 6B). Currently, we cannot distinguish between the following 2 possibilities, and more studies are needed. (1) The 1 copy of loxP left in the first intron interferes with endogenous HBB gene expression. (2) A genetic mutation in the GATA site of the HBB 3′-enhancer (probably introduced during the HR process, the clonal selection process, or both) is responsible for the reduction of erythroid-specific HBB expression. For the former possibility, we may use piggyBac DNA transposition to achieve zero-footprint excision of the selection gene cassette.30 For the latter possibility, our study highlights the importance of conducting high-resolution whole-genome sequence analysis. Although HR-mediated gene targeting did not show increased mutation rates compared with other proliferating cell populations in studies published by us and others,3,4,24 previously used methods such as chromosomal karyotyping, SNP arrays, and even exome sequencing may not be sufficient to detect small changes across the whole genome. With the recent reduction in the cost of whole-genome sequencing, which is now affordable to an average laboratory, it is possible to sequence the whole genome of selected iPSC clones after ZFN-mediated gene targeting.
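As a worked check on the 25%-40% figure, the clone counts reported in the Figure 6 legend can be converted into two quantities: the fraction of all sequenced clones that carry the corrected (A) transcript, and the ratio of corrected to uncorrected (T) clones. The sketch below assumes that the 25%-40% range refers to the second quantity, which is one reading that reproduces the reported numbers; the variable and function names are illustrative only.

```python
# Clone counts taken from the Figure 6 legend:
# (corrected A clones, uncorrected T clones) per differentiated iPSC line.
clone_counts = {
    "c36": (0, 40),
    "cre4": (17, 43),
    "cre16": (12, 48),
}


def allele_summary(corrected, uncorrected):
    """Return (fraction of clones corrected, corrected/uncorrected ratio)."""
    total = corrected + uncorrected
    fraction = corrected / total
    ratio = corrected / uncorrected
    return fraction, ratio


def summarize(counts):
    """Apply allele_summary to every line in a {name: (A, T)} mapping."""
    return {line: allele_summary(a, t) for line, (a, t) in counts.items()}


if __name__ == "__main__":
    for line, (fraction, ratio) in summarize(clone_counts).items():
        print(f"{line}: {fraction:.0%} of clones corrected; "
              f"corrected/uncorrected ratio {ratio:.0%}")
```

Under this reading, cre16 gives a ratio of 12/48 (25%) and cre4 gives 17/43 (about 40%), whereas the corresponding fractions of all sequenced clones are 20% and 28%.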

Although in this study we focused on site-specific gene correction of a defined single mutation in the HBB gene, our functional analysis and its relevance to future gene therapy are hampered by the fact that existing culture methods only allow us to generate nucleated erythrocytes that express γ-globin instead of β-globin protein. Nevertheless, HBB transcripts can be detected, albeit at a low level, by RT-PCR in differentiated erythroblasts derived from parental S1 and corrected cre iPSC lines (Figure 6). This allowed us to show that the corrected HBB allele can be expressed after gene targeting. An ideal iPSC-based gene therapy for SCD will require both precise correction of the disease-causing mutation, while preserving the cis-regulatory elements that regulate HBB expression, and complete switching from fetal-type globin to adult-type globin. Therefore, further genetic and epigenetic studies are required to better understand globin gene switching in the iPSC system and to improve hematopoietic differentiation of gene-corrected iPSCs into more mature/enucleated erythroid cells, if they are going to be used for cell therapy. In addition, reproducible in vivo models to generate transplantable hematopoietic progenitor cells and red blood cells from human iPSCs and ESCs are needed31,32 before we can assess the preclinical feasibility of the combined cell and gene therapy approach using gene-corrected iPSCs derived from SCD patients. Despite these hurdles to be overcome in future years, our current study demonstrates that it is now possible to correct a specific point mutation or small mutation at its endogenous locus in patient-specific iPSCs and also to restore expression of the corrected allele on cell differentiation. The gene-targeting methodology we reported here should extend the use of various iPSC lines containing disease-related mutations in genes that are either expressed or silent in human iPSCs.


Contribution: J.Z. and P.M. designed and performed experiments, collected and analyzed data, and wrote the manuscript; X.H. and S.N.D. performed experiments and collected and analyzed data; and L.C. designed the experiments and wrote the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Linzhao Cheng, The Johns Hopkins University School of Medicine, Broadway Research Bldg, Rm 747, 733 N Broadway, Baltimore, MD 21205; e-mail: lcheng2{at}; or Jizhong Zou, The Johns Hopkins University School of Medicine, Broadway Research Bldg, Rm 780, 733 N Broadway, Baltimore, MD 21205; e-mail: jzou2{at}


The authors thank the laboratory of Prof Y. W. Kan for providing the PGK-Cre expression vector, and a Sigma-Aldrich team for designing and providing the plasmids expressing HBB-ZFNs.

This research was supported by a Siebel scholarship (P.M.), Fellowships and an Exploratory grant from the Maryland Stem Cell Research Fund (2008-MSCRFF-009 and 2010-MSCRFE-0044, J.Z. and 2010-MSCRFF-0095, X.H.), and the National Institutes of Health (grants R01 HL073781 and HL073781-05S1; L.C.).


  • * J.Z. and P.M. contributed equally to the work.

  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted February 7, 2011.
  • Accepted August 11, 2011.

