Post-fertilization transcription initiation in an ancestral LTR retrotransposon drives lineage-specific genomic imprinting of ZDBF2

  1. Hisato Kobayashi (corresponding author)
  2. Tatsushi Igaki
  3. Soichiro Kumamoto
  4. Keisuke Tanaka
  5. Tomoya Takashima
  6. So I Nagaoka
  7. Shunsuke Suzuki
  8. Masaaki Hayashi
  9. Marilyn B Renfree
  10. Manabu Kawahara
  11. Shun Saito
  12. Toshihiro Kobayashi
  13. Hiroshi Nagashima
  14. Hitomi Matsunari
  15. Kazuaki Nakano
  16. Ayuko Uchikura
  17. Hiroshi Kiyonari
  18. Mari Kaneko
  19. Hiroo Imai
  20. Kazuhiko Nakabayashi
  21. Matthew Lorincz
  22. Kazuki Kurimoto
  1. Department of Embryology, Nara Medical University, Japan
  2. Department of Medical Genome Science, Dokkyo Medical University, Japan
  3. Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, Japan
  4. Division of Cancer and Senescence Biology, Cancer Research Institute, Kanazawa University, Japan
  5. Department of Informatics, Tokyo University of Information Sciences, Japan
  6. NODAI Genome Research Center, Tokyo University of Agriculture, Japan
  7. Department of Agriculture, Graduate School of Science and Technology, Shinshu University, Japan
  8. School of BioSciences, University of Melbourne, Australia
  9. Laboratory of Animal Genetics and Reproduction, Research Faculty of Agriculture, Hokkaido University, Japan
  10. Division of Mammalian Embryology, Center for Stem Cell Biology and Regenerative Medicine, The Institute of Medical Science, University of Tokyo, Japan
  11. Center for Genetic Analysis of Behavior, National Institute for Physiological Sciences, Japan
  12. Meiji University International Institute for Bio-Resource Research, Japan
  13. Laboratory for Animal Resources and Genetic Engineering, RIKEN Center for Biosystems Dynamics Research, Japan
  14. Molecular Biology Section, Center for the Evolutionary Origins of Human Behavior, Kyoto University, Japan
  15. Division of Developmental Genomics, Research Institute, National Center for Child Health and Development, Japan
  16. Life Sciences Institute, Department of Medical Genetics, University of British Columbia, Canada
7 figures and 4 additional files

Figures

Figure 1 with 1 supplement
Identification of GPR1-AS orthologs from public placental transcriptomes.

UCSC Genome Browser screenshots of the GPR1-ZDBF2 locus in humans (A), rhesus macaques (B), and mice (C). Predicted transcripts were generated from public directional placental RNA-seq datasets (accession numbers: SRR12363247 for humans, SRR1236168 for rhesus macaques, and SRR943345 for mice) using the Hisat2-StringTie2 programs. Genes annotated from GENCODE or RefSeq databases and long terminal repeat (LTR) retrotransposon positions from UCSC Genome Browser RepeatMasker tracks are also displayed. Among the gene lists, only the human reference genome includes an annotation for GPR1-AS (highlighted in green). GPR1-AS-like transcripts and MER21C retrotransposons are highlighted in red. Animal silhouettes were obtained from PhyloPic (mouse silhouette by Katy Lawler, available under a CC BY 4.0 license).

Figure 1—figure supplement 1
Identification of GPR1-AS orthologs using public and non-directional RNA-seq data.

(A) Heat map showing the expression levels of GPR1, GPR1-AS, and ZDBF2 in different human tissues, including the placenta. Genome browser screenshots of the GPR1-ZDBF2 locus in humans (B) and baboons (C). Predicted transcripts were generated using public non-directional placental RNA-seq datasets (accession numbers: SRR1850957 for humans, GSM4696517 for baboons). Transcript/gene information and long terminal repeat (LTR) retrotransposon positions are shown. GPR1-AS-like transcripts and MER21C retrotransposons are shown in red. Animal silhouettes were obtained from PhyloPic.

Figure 2 with 1 supplement
Identification of GPR1-AS orthologs from original placental and extra-embryonic transcriptomes.

Predicted transcripts were generated from placental and extra-embryonic directional RNA-seq datasets of chimpanzee (A), rabbit (B), pig (C), cow (D), and opossum (E) with the Hisat2-StringTie2 programs. Genes annotated from RefSeq or Ensembl databases and long terminal repeat (LTR) retrotransposon positions are also shown. MER21C retrotransposons, GPR1-AS-like transcripts, and their fragments per kilobase of transcript per million mapped reads (FPKM) and transcripts per million (TPM) values are highlighted in red. Animal silhouettes were obtained from PhyloPic (opossum silhouette by Sarah Werning, available under a CC BY 3.0 license).
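
The two expression units named in this legend can be illustrated with a minimal sketch. It uses the textbook count-based definitions of FPKM and TPM; StringTie2 derives its values from read coverage, so this is not the program's own calculation, and the counts and transcript lengths below are hypothetical.

```python
def fpkm(fragments, length_bp, total_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_fragments)


def tpm(fragment_counts, lengths_bp):
    """Transcripts per million: per-base rates rescaled so they sum to 1e6."""
    rates = [count / length for count, length in zip(fragment_counts, lengths_bp)]
    total_rate = sum(rates)
    return [rate / total_rate * 1e6 for rate in rates]


# Hypothetical fragment counts and transcript lengths (bp) for three transcripts.
counts = [100, 200, 300]
lengths = [1000, 2000, 1000]

fpkm_values = [fpkm(c, l, sum(counts)) for c, l in zip(counts, lengths)]
tpm_values = tpm(counts, lengths)
```

TPM values always sum to one million within a sample, which makes them easier to compare across samples than FPKM values.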

Figure 2—figure supplement 1
Search for GPR1-AS orthologs from embryonic transcriptomes.

Predicted transcripts were generated using directional RNA-seq datasets of embryo proper tissues from rabbit (A), pig (B), cow (C), and opossum (D). Transcript/gene information and long terminal repeat (LTR) retrotransposon positions are displayed, and the annotated MER21C retrotransposon (found only in rabbit) is highlighted in red. Animal silhouettes were obtained from PhyloPic (opossum silhouette by Sarah Werning, available under a CC BY 3.0 license).

Figure 3 with 1 supplement
Allele-specific RT-PCR sequencing of ZDBF2 in various mammals.

Heterozygous genotypes were used to distinguish between parental alleles in adult tissues from tammar wallabies (A), fetal/embryonic tissues from cattle (B), and blood samples from rhesus macaques (C) and rabbits (D). Primers were designed to amplify the 3'-UTR regions of ZDBF2 orthologs and detect SNPs. Each SNP position is highlighted in red. Reverse primers were also used for Sanger sequencing. Animal silhouettes were obtained from PhyloPic.

© 2018, Geoff Shaw. Wallaby silhouette by Geoff Shaw, available under a CC BY-NC 3.0 license.

Figure 3—figure supplement 1
Search for germline DMRs from oocyte and sperm DNA methylomes.

The DNA methylation (DNAme) levels of individual CpG sites in oocyte and sperm from rhesus macaque (A), pig (B), and bovine (C) whole genome bisulfite sequencing datasets are shown. Oocyte-methylated and sperm-methylated differentially methylated regions (DMRs) are highlighted in red and blue, respectively. Predicted transcripts from placental and extra-embryonic directional RNA-seq datasets (shown in Figures 1 and 2), genes annotated from RefSeq databases, and long terminal repeat (LTR) positions from UCSC/RepeatMasker are included, with a MER21C retrotransposon overlapping rhesus macaque GPR1-AS highlighted in red. Animal silhouettes were obtained from PhyloPic.

Figure 4 with 2 supplements
Multi-species comparison of long terminal repeat (LTR) retrotransposon locations at the GPR1 locus.

A total of 24 mammalian genomes were compared, including six primates (human, chimpanzee, rhesus macaque, marmoset, tarsier, and gray mouse lemur), one colugo (flying lemur), one treeshrew (Chinese treeshrew), two lagomorphs (rabbit and pika), eight rodents (squirrel, guinea pig, lesser jerboa, blind mole rat, giant pouched rat, mouse, rat, and golden hamster), and six other eutherians (pig, cow, horse, dog, elephant, and armadillo). Among the selected genomes, LTRs that can be considered homologous to MER21C, which corresponds to the first exon of GPR1-AS, are marked in red. In tarsier, treeshrew, lesser jerboa, and giant pouched rat, the orthologous LTRs were annotated as MER21B, whose consensus sequence shows 88% similarity to that of MER21C in pairwise alignment. MER21B is marked in purple. According to Dfam, the MER21C and MER21B subfamilies are specific to the genomes of Boreoeutherians and Euarchontoglires, respectively. The copy numbers of MER21C and MER21B in selected species are shown in red and purple, respectively (LTRs likely matching the GPR1-AS exon are underlined). There are 5418 and 2529 copies of MER21C and 2894 and 1535 copies of MER21B in the human and mouse genomes, respectively.

Figure 4—figure supplement 1
Reanalysis of repeat positions using RepeatMasker.

Repetitive elements were re-identified in five mammalian species: mouse, rat, and hamster, in which MER21C (which overlaps the first exon of human GPR1-AS) was not found in the homologous region, and rabbit and human, in which it was detected. The Percent Identity Plot (PIP, showing a conservation scale between sequences from 50 to 100% on the y-axis) illustrates the order and alignment of the 20 kb region surrounding the GPR1-AS (Liz) transcription start site in each mammalian chromosome. Detected repeat elements are displayed above each plot. RepeatMasker was run under less stringent settings, including switching the search engine from RMBlast to HMMER and changing the speed/sensitivity setting from default to slow. Despite these adjustments, no MER21C insertion was detected in the three rodent species.

Figure 4—figure supplement 2
Multiple genome alignments at the first exon of the GPR1-AS locus.

Cactus generates reference-free, whole-genome multiple alignments (Armstrong et al., 2020). The Cactus track from the UCSC Genome Browser displays multiple alignments across vertebrate species and evolutionary conservation metrics from the Zoonomia Project (Zoonomia Consortium, 2020). Green square brackets indicate shorter alignments where DNA from one genomic context in the aligned species is nested within a larger alignment chain from a different genomic context. The alignment within these brackets may represent a short misalignment or a lineage-specific insertion of a retrotransposon in the human genome that aligns to a paralogous copy in another species. SINE and long terminal repeat (LTR) retrotransposon positions from the UCSC Genome Browser are also displayed. Silhouette obtained from PhyloPic.

Figure 5 with 3 supplements
Comparison of MER21C-derived sequences overlapping the first exon of GPR1-AS orthologs.

(A) Phylogenetic tree of MER21C-derived sequences estimated by multiple sequence alignment (MSA) using the multiple sequence comparison by log-expectation (MUSCLE) program. (B) Positions of common and unique cis-acting elements in each sequence. (C) Motif structures of the common region that contains E74-like factor 1 and 2 (ELF1 and ELF2) binding motifs. (D) Motif structures of transcription factor AP-2 gamma (TFAP2C) and zinc finger and SCAN domain containing 4 (ZSCAN4).

Figure 5—figure supplement 1
Pairwise alignment between consensus sequences of retrotransposons and GPR1-AS-exonic MER21 sequences.

(A) MER21C (or MER21B) sequences located in the GPR1 intron of eutherian genomes and the first exon of mouse Liz were compared with the consensus MER21C sequence. (B) Human and rabbit MER21C sequences overlapping the first exon of GPR1-AS and the first exon of mouse Liz were compared with the consensus sequences of ERV3/ERVL solo-LTRs present in human and mouse (n=182). Each graph displays the identity percentages and alignment scores for the top five long terminal repeats (LTRs) with the highest scores. In humans and rabbits, MER21C showed the highest identity with the exonic sequences. (C) The first exon of mouse Liz was compared with the consensus sequences of all retrotransposons present in mice (n=1361). The graph represents the top 10 retrotransposons with the highest scores. In mice, MER21C does not show sufficient sequence identity to the first exon of Liz to distinguish it from other retrotransposons. Pairwise alignment scores and percent identity values for each sequence pair were calculated using Genetyx software.
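
As a rough illustration of the percent identity values reported here, the sketch below computes identity for a pair of sequences that have already been aligned. It is an assumption-laden example, not the Genetyx procedure: the function name, the choice of denominator (all alignment columns that are not gaps in both sequences), and the toy sequences are hypothetical, and other tools may define the denominator differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    Assumes the inputs are already aligned (equal length, gaps as `gap`).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned pair (hypothetical, not from the study): 7 matches in 9 columns.
identity = percent_identity("ACGT-ACGT", "ACGTTAC-T")
```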

Figure 5—figure supplement 2
Promoter activities of first exons of mouse Liz and human GPR1-AS.

(A) Constructs (inserted sequences) used for dual luciferase reporter assays in HEK293T cells. A promoter-less vector served as the negative control. (B) Results of dual luciferase reporter assays. Relative fold changes in Firefly luciferase activity (Firefly/Renilla) were normalized to the Firefly/Renilla ratio of the negative control. Error bars indicate mean ± s.e.m. Statistical significance was determined using unpaired t-tests: *p<0.05, **p<0.01. Data represent four biological replicates.
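
The normalization described in (B) can be sketched as follows. This is a minimal example assuming that the negative-control Firefly/Renilla ratio is averaged across replicates before normalization, which the legend does not specify; all readings and names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def ratios(firefly, renilla):
    """Firefly/Renilla ratio for each replicate."""
    return [f / r for f, r in zip(firefly, renilla)]


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical luminescence readings from four biological replicates.
control_ratio = mean(ratios([100, 110, 90, 100], [1000, 1000, 1000, 1000]))
construct_ratios = ratios([500, 520, 480, 500], [1000, 1000, 1000, 1000])

# Relative fold change: each construct ratio divided by the control ratio.
fold_changes = [r / control_ratio for r in construct_ratios]
fold_mean, fold_sem = mean_sem(fold_changes)
```

With these made-up readings the construct shows roughly a five-fold increase over the promoter-less control.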

Figure 5—figure supplement 3
Expression patterns of transcription factors and imprinted genes during human preimplantation development.

A heat map displaying the average expression of four transcription factors associated with human GPR1-AS or mouse Liz transcription, a primary KRAB-ZFP that binds to MER21C, and three imprinted genes surrounding the ZDBF2 locus.

Figure 6 with 1 supplement
Initiation of GPR1-AS transcription before implantation.

Genome browser screenshots of the GPR1-ZDBF2 locus in humans at preimplantation stages, including the MII oocyte, zygote, 2-cell, 4-cell, and 8-cell stages, and the inner cell mass (ICM) and trophectoderm (TE) from the blastocyst. Predicted transcripts were generated from publicly available full-length RNA-seq datasets, with detected GPR1-AS-like transcripts and their fragments per kilobase of transcript per million mapped reads (FPKM) and transcripts per million (TPM) values highlighted in red. Silhouette was obtained from PhyloPic.

Figure 6—figure supplement 1
Human long terminal repeat (LTR) reactivation during preimplantation development.

Heat map displaying the average expression of selected LTR retrotransposon families in human oocytes and early embryos. MLT2A1/MLT2A2 and HERVK are reactivated between the 4-cell and 8-cell stages and after the 8-cell stage, respectively (Grow et al., 2015; Hashimoto et al., 2021).

Figure 7 with 1 supplement
Establishment of the ZDBF2 imprinted domain in evolution and genome biology.

(A) Scheme of epigenetic and transcriptional changes at the first exon of mouse Liz and human GPR1-AS. (B) Timescale of the evolution of ZDBF2 imprinting and LTR (MER21C) insertion. Animal silhouettes were obtained from PhyloPic (mouse silhouette by Katy Lawler, available under a CC BY 4.0 license; opossum silhouette by Sarah Werning, available under a CC BY 3.0 license).

© 2018, Geoff Shaw. Wallaby silhouette by Geoff Shaw, available under a CC BY-NC 3.0 license.

Figure 7—figure supplement 1
Interspecies epigenomic comparisons between human GPR1-AS and mouse Liz.

IGV screenshots of the first exon of GPR1-AS/Liz in human (A) and mouse (B) showing DNA methylation, enrichment of post-translational histone modifications (H3K4me3, H3K9me3, and H3K27me3), and transcription factor binding sites (TFAP2C and ZSCAN4C) from ChIP-Atlas in various tissues. DNA methylomes from oocyte and sperm from mouse and human were published previously (Brind’Amour et al., 2018). Animal silhouettes were obtained from PhyloPic (mouse silhouette by Katy Lawler, available under a CC BY 4.0 license).

Cite this article

  1. Hisato Kobayashi
  2. Tatsushi Igaki
  3. Soichiro Kumamoto
  4. Keisuke Tanaka
  5. Tomoya Takashima
  6. So I Nagaoka
  7. Shunsuke Suzuki
  8. Masaaki Hayashi
  9. Marilyn B Renfree
  10. Manabu Kawahara
  11. Shun Saito
  12. Toshihiro Kobayashi
  13. Hiroshi Nagashima
  14. Hitomi Matsunari
  15. Kazuaki Nakano
  16. Ayuko Uchikura
  17. Hiroshi Kiyonari
  18. Mari Kaneko
  19. Hiroo Imai
  20. Kazuhiko Nakabayashi
  21. Matthew Lorincz
  22. Kazuki Kurimoto
(2025)
Post-fertilization transcription initiation in an ancestral LTR retrotransposon drives lineage-specific genomic imprinting of ZDBF2
eLife 13:RP94502.
https://doi.org/10.7554/eLife.94502.3