Computationally defined and in vitro validated putative genomic safe harbour loci for transgene expression in human cells

  1. Matias I Autio (corresponding author)
  2. Efthymios Motakis
  3. Arnaud Perrin
  4. Talal Bin Amin
  5. Zenia Tiang
  6. Dang Vinh Do
  7. Jiaxu Wang
  8. Joanna Tan
  9. Shirley Suet Lee Ding
  10. Wei Xuan Tan
  11. Chang Jie Mick Lee
  12. Adrian Kee Keong Teo
  13. Roger SY Foo (corresponding author)
  1. Laboratory of Molecular Epigenomics and Chromatin Organization, Genome Institute of Singapore, Singapore
  2. Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore
  3. Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, Singapore
  4. Laboratory of RNA Genomics and Structure, Genome Institute of Singapore, Singapore
  5. Center for Genome Diagnostics, Genome Institute of Singapore, Singapore
  6. Stem Cells and Diabetes Laboratory, Institute of Molecular and Cell Biology, Singapore
  7. Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  8. Precision Medicine Translational Research Programme, Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
4 figures, 2 tables and 9 additional files

Figures

Figure 1 with 7 supplements
Computational search for candidate GSH.

(A) Schematic representation of the computational workflow for defining candidate GSH. (B) CIRCOS plot summarising computational search results. Ring 1: chromosome ideograms; ring 2: orange bars indicating safe sites; ring 3: blue bars indicating active regions; ring 4: candidate sites within active regions, red bars site failed BLAT screening, black bars site passed BLAT screening. (C) Locations of candidate GSH targeted in vitro. Blue labels: targeted clone established; green labels: no clone established.

Figure 1—figure supplement 1
Hi-C profile for GSH candidate 1.

Screenshots of Hi-C interaction matrices from H1 hESC for the targeted GSH candidate locus on chromosome 1 (113339961-113340514). TADs are indicated by the ‘pyramids’ of high interaction observed in the Hi-C matrices. UCSC genome browser tracks annotating the candidate GSH (targeted GSH in pink), GENCODE v36 and the H3K27Ac mark from ENCODE are shown below.

Figure 1—figure supplement 2
Hi-C profile for GSH candidate 2a.

As in Figure 1—figure supplement 1 but for the targeted GSH candidate locus on chromosome 2 (128912721-128914814).

Figure 1—figure supplement 3
Hi-C profile for GSH candidate 2c.

As in Figure 1—figure supplement 1 but for the targeted GSH candidate locus on chromosome 2 (128932307-128935799).

Figure 1—figure supplement 4
Hi-C profile for GSH candidate 4.

As in Figure 1—figure supplement 1 but for the targeted GSH candidate locus on chromosome 4 (17373361-17374159).

Figure 1—figure supplement 5
Hi-C profile for GSH candidate 6.

As in Figure 1—figure supplement 1 but for the targeted GSH candidate locus on chromosome 6 (15727241-15727490).

Figure 1—figure supplement 6
Hi-C profile for GSH candidate 18.

As in Figure 1—figure supplement 1 but for the targeted GSH candidate locus on chromosome 18 (56534775-56536439).

Figure 1—figure supplement 7
Hi-C profile for GSH candidate 19.

As in Figure 1—figure supplement 1 but for the targeted GSH candidate locus on chromosome 19 (5400761-5402139).

Figure 2 with 9 supplements
GSH targeting and transcriptomic analysis of H1 hESC.

(A) Schematic representation of CRISPR/Cas9 plasmid (pMIA3) and homology directed repair donor (pMIA4.721) used for targeting with functional components annotated. (B) Schematic of integrated landing pad expression construct. Positions of primers for junction-PCR as well as of ddPCR assay are indicated. Representative junction-PCR Sanger sequencing reads from Pansio-1 targeted clones shown in expanded view. (C) Log2-FC of mRNA expression levels against un-targeted H1 hESC samples for the nearest genes of Pansio-1, Olônne-18, and Keppel-19 candidate GSH. Evaluated samples: H1=un-targeted hESC, Pansio-1=landing pad construct integrated to Pansio-1 GSH in H1 hESC, Olônne-18=landing pad construct integrated to Olônne-18 GSH in H1 hESC and Keppel-19=landing pad construct integrated to Keppel-19 GSH in H1 hESC. Box plots representing 95% confidence intervals of mean log2-FC. Nearest gene for each GSH indicated in orange. Individual data points shown in pink with p-value for each comparison shown above. (D) Volcano plots of RNA-seq analysis against un-targeted H1 hESC. Samples analysed as in (C). Differentially expressed (DE) genes with FDR ≤0.01 and |logFC|≥1 in pink, genes with |logFC|≥1 in green, genes with FDR ≤0.01 in blue, others in grey. (E) Venn-diagrams illustrating the overlap of DE genes between un-targeted H1 hESC and the three GSH targeted H1 hESC lines.
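The colour scheme of the volcano plots in (D) amounts to a two-threshold classification of each gene. The sketch below restates that rule in Python, using the cut-offs given in the legend (FDR ≤0.01 and |logFC|≥1). The function name `classify_gene` and its parameter names are hypothetical and chosen only for illustration; this is not the authors' analysis code.

```python
def classify_gene(log_fc: float, fdr: float,
                  fc_cutoff: float = 1.0, fdr_cutoff: float = 0.01) -> str:
    """Return the volcano-plot colour for one gene.

    Cut-offs default to the values stated in the figure legend.
    Names and structure are illustrative assumptions.
    """
    passes_fc = abs(log_fc) >= fc_cutoff   # |logFC| >= 1
    passes_fdr = fdr <= fdr_cutoff         # FDR <= 0.01

    if passes_fc and passes_fdr:
        return "pink"    # differentially expressed (DE) gene
    if passes_fc:
        return "green"   # large fold change only
    if passes_fdr:
        return "blue"    # significant, but small fold change
    return "grey"        # neither criterion met
```

Only genes in the "pink" class count as DE in the sense used for the Venn diagrams in (E).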

Figure 2—figure supplement 1
PCR gel images of junction PCR and wild type allele PCR reactions for screened clones.
Figure 2—figure supplement 2
Off-target screening for Pansio-1.

Sanger sequencing traces of the top five predicted gRNA off-target sites for un-targeted wild type H1 & H9 hESC and Pansio-1 GSH targeted clones. All sequencing traces are ordered as follows: H1 WT, H1 clone, H9 WT & H9 clone.

Figure 2—figure supplement 3
Off-target screening for Olônne-18.

As in Figure 2—figure supplement 2, but for Olônne-18.

Figure 2—figure supplement 4
Off-target screening for Keppel-19.

As in Figure 2—figure supplement 2, but for Keppel-19.

Figure 2—figure supplement 5
Transcriptomic analysis of GSH targeted H9 hESC.

(A) Log2-FC of mRNA expression levels against un-targeted H9 hESC samples for the nearest genes of Pansio-1, Olônne-18, and Keppel-19 candidate GSH. Evaluated samples: H9=un-targeted hESC, Pansio-1=landing pad construct integrated to Pansio-1 GSH in H9 hESC, Olônne-18=landing pad construct integrated to Olônne-18 GSH in H9 hESC and Keppel-19=landing pad construct integrated to Keppel-19 GSH in H9 hESC. Box plots representing 95% confidence intervals of mean log2-FC. Nearest gene for each GSH indicated in orange. Individual data points shown in pink with P-value for each comparison shown above. (B) Volcano plots of RNA-seq analysis against un-targeted H9 hESC. Samples analysed as in (A). Differentially expressed (DE) genes with FDR ≤0.01 and |logFC|≥1 in pink, genes with |logFC|≥1 in green, genes with FDR ≤0.01 in blue, others in grey. (C) Venn-diagrams illustrating the overlap of DE genes between un-targeted H9 hESC and the three GSH targeted H9 hESC lines.

Figure 2—figure supplement 6
Representative images of metaphase spreads used for karyotyping the GSH targeted H1 and H9 cell lines.
Figure 2—figure supplement 7
Representative images of haematoxylin and eosin stained teratoma from Pansio-1, Olônne-18, and Keppel-19 H1 cells.

Examples of mesodermal, endodermal, and ectodermal tissues are presented. Scale bar 100 µm.

Figure 2—figure supplement 8
Inducible expression from GSH integrated cassette.

(A) Schematic representation of ‘all-in-one’ inducible transposon donor construct (pMIA10.7). (B) Schematic of landing pad construct with integrated ‘all-in-one’ inducible cassette. (C) Representative images of H1 Pansio-1 cells targeted with pMIA10.7 untreated or treated with 1 mg/ml doxycycline for 24 hr. Brightfield and GFP channels are shown. Scale bars for all images equal to 150 µm. (D) Bar plots representing the mean percentage of FITC-A positive cells from flow cytometry analysis of three replicate treatments as described in (C). Error bars indicate 95% confidence intervals. Individual data points shown in pink. (E) Representative scatter plots from the flow cytometry analysis described in (D).

Figure 2—figure supplement 9
Schematic representation of the pMIA10.53-Clover donor plasmid.
Figure 3 with 2 supplements
Integration and validation of transgene expression in GSH targeted H1 hESC and their differentiated progeny.

(A) Schematic representation of integrase expression construct (pMIA22) and transposon donor construct (pMIA10.5). (B) Schematic of landing pad construct with integrated Clover transgene. (C) Representative immunofluorescence images of Clover-integrated GSH H1 cells. DAPI = nuclear staining with 4′,6-diamidino-2-phenylindole, Clover = fluorescence from Clover transgene, OCT3/4=antibody staining against OCT3/4, Overlay = overlay of the three imaged channels. (D) As in (C) apart from antibody staining against SOX2. (E) Histograms of flow cytometry analysis for FITC-A channel of un-targeted H1 hESC, and the three GSH targeted hESC lines over 15 passages. Percentages of FITC-A-positive cells according to the indicated gating. (F) Representative immunofluorescence images of Clover-integrated GSH H1 cells differentiated to neuronal-like cells. Channels imaged as in (C) apart from antibody staining against TUJ1. (G) As in (F) for cells differentiated to hepatocyte-like cells, antibody staining against AFP. (H) As in (F) for cells differentiated to cardiomyocyte-like cells, antibody staining against sarcomeric α-ACTININ. Scale bars for all immunofluorescence images equal to 150 µm.

Figure 3—figure supplement 1
Integration and validation of transgene expression in GSH targeted H9 hESC and their differentiated progeny.

(A) Representative immunofluorescence images of Clover-integrated GSH H9 cells. DAPI = nuclear staining with 4′,6-diamidino-2-phenylindole, Clover = fluorescence from Clover transgene, OCT3/4=antibody staining against OCT3/4, Overlay = overlay of the three imaged channels. (B) As in (A) apart from antibody staining against SOX2. (C) Histograms of flow cytometry analysis for FITC-A channel of un-targeted H9 hESC, and the three GSH targeted hESC lines over 15 passages. Percentages of FITC-A positive cells according to the indicated gating. (D) Representative immunofluorescence images of Clover-integrated GSH H9 cells differentiated to neuronal cells, stained for TUJ1 and MAP2; hepatic cells, stained for AFP and HNF4α; and cardiac cells, stained for sarcomeric α-ACTININ and cardiac TROPONIN-T. Channels imaged as in (A) apart from the respective antibody stain. Scale bars for all immunofluorescence images equal to 150 µm. (E) Flow cytometry analysis scatter plots of hESC-derived β-like cells at D35 of β cell differentiation of H9 wild-type and Pansio-1 Clover-targeted cells. Channels used in analysis were 488 for Clover and 647 for antibody staining. PDX1, Insulin and isotype control antibodies were used.

Figure 3—figure supplement 2
Representative images of H1 Pansio-1 line immunofluorescence staining with AF594 secondary antibody alone.

DAPI = nuclear staining with 4′,6-diamidino-2-phenylindole, Clover = fluorescence from Clover transgene, AF594=staining with secondary antibody alone, Overlay = overlay of the three imaged channels.

Figure 4 with 1 supplement
Representative images of Pansio-1 H9 cells differentiated to neuronal, hepatic, and cardiac cell types.

Staining for respective lineage markers TUJ1, HNF4α and cTnT and isotype controls is shown as well as nuclear staining with HOECHST and the channel for Clover-transgene. Images are composites of 61 individual images from HCI. Scale bars in all images equal to 500 µm.

Figure 4—figure supplement 1
Bar plots representing the mean ratio of positive cells from the high content imaging analysis.

Error bars indicate 95% confidence intervals. Individual data points shown in pink.

Tables

Table 1
Coordinates of candidate GSH, their associated active chromosome regions & housekeeping gene, and BLAT score against the most similar region.
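| Chromosome | Start | End | Width | Active region start | Active region end | Housekeeping gene | BLAT-score |
|---|---|---|---|---|---|---|---|
| 1 | 113289036 | 113289342 | 307 | 113000001 | 114000000 | HIPK1 | 0.21 |
| 1 | 113314841 | 113318369 | 3529 | 113000001 | 114000000 | HIPK1 | 0.18 |
| 1* | 113339961 | 113340514 | 554 | 113000001 | 114000000 | HIPK1 | 0.27 |
| 2* | 128912721 | 128914814 | 2094 | 128000001 | 129000000 | UGGT1 | 0.08 |
| 2 | 128918961 | 128919839 | 879 | 128000001 | 129000000 | UGGT1 | 0.16 |
| 2* | 128932307 | 128935799 | 3493 | 128000001 | 129000000 | UGGT1 | 0.15 |
| 2 | 128963272 | 128965759 | 2488 | 128000001 | 129000000 | UGGT1 | 0.44 |
| 2 | 208992998 | 208997459 | 4462 | 208000001 | 209000000 | PIKFYVE | 0.28 |
| 4* | 17373361 | 17374159 | 799 | 17000001 | 18000000 | MED28 | 0.04 |
| 5 | 131058585 | 131058947 | 363 | 131000001 | 132000000 | FNIP1 | 0.41 |
| 5 | 148753741 | 148757219 | 3479 | 148000001 | 149000000 | FBXO38 | 0.10 |
| 6* | 15727241 | 15727490 | 250 | 15000001 | 16000000 | DTNBP1 | 0.47 |
| 7 | 4314741 | 4315279 | 539 | 4000001 | 5000000 | FOXK1 | 0.15 |
| 7 | 4321017 | 4323839 | 2823 | 4000001 | 5000000 | FOXK1 | 0.09 |
| 7 | 4328040 | 4329659 | 1620 | 4000001 | 5000000 | FOXK1 | 0.21 |
| 7 | 4353504 | 4354219 | 716 | 4000001 | 5000000 | FOXK1 | 0.32 |
| 7 | 4454808 | 4456201 | 1394 | 4000001 | 5000000 | FOXK1 | 0.17 |
| 8 | 23945241 | 23945819 | 579 | 23000001 | 24000000 | R3HCC1 | 0.34 |
| 8 | 23986981 | 23988319 | 1339 | 23000001 | 24000000 | R3HCC1 | 0.06 |
| 8 | 23999628 | 24001194 | 1567 | 23000001 | 24000000 | R3HCC1 | 0.02 |
| 18 | 56339813 | 56340245 | 433 | 56000001 | 57000000 | TXNL1 | 0.06 |
| 18 | 56396821 | 56397319 | 499 | 56000001 | 57000000 | TXNL1 | 0.07 |
| 18 | 56410681 | 56411039 | 359 | 56000001 | 57000000 | TXNL1 | 0.14 |
| 18* | 56534775 | 56536439 | 1665 | 56000001 | 57000000 | TXNL1 | 0.16 |
| 19* | 5400761 | 5402139 | 1379 | 5000001 | 6000000 | SAFB | 0.18 |

* = GSH shortlisted for in vitro validation.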
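
The Width values in Table 1 appear to follow inclusive coordinates, i.e. Width = End − Start + 1; this is inferred from the listed numbers rather than stated in the source. The sketch below checks that relation, and that each site overlaps its associated active region, for three rows copied from the table. The names `Site`, `width_inclusive` and `overlaps_region` are hypothetical helpers introduced only for this illustration.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """One row of Table 1 (coordinate columns only)."""
    chrom: str
    start: int
    end: int
    width: int
    region_start: int
    region_end: int


def width_inclusive(start: int, end: int) -> int:
    # Assumes both Start and End are included in the interval.
    return end - start + 1


def overlaps_region(site: Site) -> bool:
    # True if the site shares at least one base with its active region.
    return site.start <= site.region_end and site.end >= site.region_start


# Three rows copied from Table 1 (chromosomes 1, 8 and 19).
SITES = [
    Site("1", 113339961, 113340514, 554, 113000001, 114000000),
    Site("8", 23999628, 24001194, 1567, 23000001, 24000000),
    Site("19", 5400761, 5402139, 1379, 5000001, 6000000),
]

for s in SITES:
    assert width_inclusive(s.start, s.end) == s.width
    assert overlaps_region(s)
```

The chromosome 8 row is included because its End lies just beyond the listed active region end, so the check uses overlap rather than strict containment.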

Author response table 1
Olônne-18 GSH targeted H1 hESC
Keppel-19 GSH targeted H1 hESC
Control targeted H1 hESC (CRISPR and HDR of an expression cassette at a non-GSH locus)
Independent untargeted H1 hESC

Additional files

Supplementary file 1

Output of safe and active searches and their overlap.

(1) Coordinates that pass safe site filters. (2) Regions that pass active filters. (3) Overlap of safe sites and active regions.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp1-v2.xlsx
Supplementary file 2

Oligos, primers and antibodies used, and details of beta cell differentiation.

(1) gRNAs used for targeting. (2) Primers used for plasmid construction. (3) Primers used for junction PCR and WT allele screening. (4) Primers used for predicted off-target screening. (5) Antibodies used for immunofluorescence staining and flow cytometry. (6) Details for pancreatic differentiation.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp2-v2.xlsx
Supplementary file 3

GSH targeted hESC screening.

(1) Summary of GSH targeted hESC clones screening. (2) ddPCR copy number analysis.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp3-v2.xlsx
Supplementary file 4

Source data for qPCR results.

(1) qPCR data source file for Figure 2C. (2) qPCR data source file for Figure 2–figure supplement 5A.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp4-v2.xlsx
Supplementary file 5

RNAseq analysis of GSH targeted H1 clones.

(1) Differentially expressed genes |logFC|≥1 and FDR ≤0.01. (2) Functional enrichment analysis results from g:GOSt.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp5-v2.xlsx
Supplementary file 6

RNAseq analysis of GSH targeted H9 clones.

(1) Differentially expressed genes |logFC|≥1 and FDR ≤0.01. (2) Functional enrichment analysis results from g:GOSt.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp6-v2.xlsx
Supplementary file 7

HCI data analysis source file for Figure 4—figure supplement 1.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp7-v2.xlsx
Supplementary file 8

Percentage of FITC-A positive cells from H1 Pansio inducible cells source file for Figure 2—figure supplement 8.

https://cdn.elifesciences.org/articles/79592/elife-79592-supp8-v2.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/79592/elife-79592-mdarchecklist1-v2.docx

Cite this article

  1. Matias I Autio
  2. Efthymios Motakis
  3. Arnaud Perrin
  4. Talal Bin Amin
  5. Zenia Tiang
  6. Dang Vinh Do
  7. Jiaxu Wang
  8. Joanna Tan
  9. Shirley Suet Lee Ding
  10. Wei Xuan Tan
  11. Chang Jie Mick Lee
  12. Adrian Kee Keong Teo
  13. Roger SY Foo
(2024)
Computationally defined and in vitro validated putative genomic safe harbour loci for transgene expression in human cells
eLife 13:e79592.
https://doi.org/10.7554/eLife.79592