5 figures, 2 tables and 1 additional file

Figures

Transcriptional abundance and structural features of 119 candidate de novo genes in the mouse lineage.

(A) Transcriptional abundance in each mouse tissue, represented as the sum of log-transformed FPKM values of each transcript: sum(log2(FPKM + 1)). Details on tissue designations and RNA samples are provided in Figure 1—source data 1. The five tissues with the highest fractions are highlighted in red and the lowest ones in blue. (B) Comparison of overall expression levels (represented as the highest FPKM values in the 35 tissues) between de novo and all other protein-coding genes (‘De novo’ and ‘Other’ on the x-axis). (C) Comparison of averages of intrinsic structural disorder scores between de novo and all other protein-coding genes. (D) Comparison of fractions of sequence covered by hydrophobic clusters between de novo and all other protein-coding genes.
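As a minimal sketch of the abundance measure in panel (A), the per-tissue value sum(log2(FPKM + 1)) can be computed as follows. The function name and the FPKM values are hypothetical and serve only to illustrate the formula; they are not taken from the source data.

```python
import math


def tissue_abundance(fpkm_values):
    """Return sum(log2(FPKM + 1)) over the transcripts of one tissue."""
    # Adding 1 before the log keeps transcripts with FPKM = 0 at a contribution of 0.
    return sum(math.log2(v + 1.0) for v in fpkm_values)


# Hypothetical FPKM values for three transcripts in one tissue (illustrative only).
example_fpkm = [0.0, 1.0, 3.0]
abundance = tissue_abundance(example_fpkm)
print(abundance)  # log2(1) + log2(2) + log2(4) = 0 + 1 + 2 = 3.0
```

The pseudocount of 1 means that unexpressed transcripts add nothing to the sum, so the measure grows both with the number of expressed transcripts and with their expression levels.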

https://doi.org/10.7554/eLife.44392.003
Figure 1—source data 1

Excel file with five tabs, providing (i) the legends for the tissue sources, (ii) the table for the gene lists, (iii) the information for the reassembled transcripts, (iv) the proteomic evidence shown in detail, and (v) the accession numbers for the ENCODE data.

https://doi.org/10.7554/eLife.44392.004
General information of Gm13030, expression, and knockout strategy.

(A) General information on transcript ID, location and protein characteristics. (B) Reverse transcription PCR across intron junctions to study Gm13030 expression in gonadal fat pad, ovary, oviduct, and uterus. Fat: gonadal fat pad; M: marker (from top to bottom: 1500 bp, 850 bp, 400 bp, 200 bp, 50 bp); U: Uba1 (control gene, 255 bp); j1: Gm13030 junction 1 (161 bp); j2: Gm13030 junction 2 (209 bp). (C) Transcript structure, DNA target, guide RNA, and depiction of the deletion created by the CRISPR/Cas9 knockout of Gm13030. The 20-nt guide sequence is colored blue and the 3-nt PAM is colored red. The induced deletion was verified by sequencing.

https://doi.org/10.7554/eLife.44392.005
Figure 3 with 3 supplements
Evolutionary history of the Gm13030 ORF.

The tree is based on the alignments shown in Figure 3—figure supplement 1, with only M. caroli included as the outgroup. The relevant substitutions at the different nodes are shown in boxes. Numbers refer to coding:non-coding substitutions, ‘stop’ refers to a mutation that creates a stop codon in the reading frame, ‘DEL’ refers to a deletion, and ‘INT’ refers to an assumed introgression. Three-letter codes on the tips refer to the different populations of the respective sub-species. Expected substitutions on the top are inferred from whole genome distances and represent the approximately neutral number of substitutions for the respective comparisons (Figure 3—figure supplement 2).

https://doi.org/10.7554/eLife.44392.006
Figure 3—figure supplement 1
Alignment of the ORF of Gm13030 among the mouse populations, subspecies and related species where the sequence could be identified in the respective genomic region.

All nucleotide sequences are aligned to the mm10 mouse reference, and amino acids are translated according to the reference frame (note that this implies that the frameshifts in the individual sequences are not shown). Alignment was done in Geneious Prime (2019.0.3 Biomatters Ltd.). The topology of the species tree is shown to the left. ‘Mus m.’ represents Mus musculus. Mus musculus populations are marked with 3-letter origin designations, and sequences from inbred strains are marked with strain designations. Nucleotides and amino acids are shown in IUPAC codes. The two rectangles mark the positions of the enabler mutations.

https://doi.org/10.7554/eLife.44392.007
Figure 3—figure supplement 2
Distance matrices for whole genome comparisons and expected numbers of substitutions for Gm13030.
https://doi.org/10.7554/eLife.44392.008
Figure 3—figure supplement 3
Table for all pairwise comparisons of the aligned reading frame of Gm13030 with the calculation of coding and non-coding positions, plus the observed numbers of substitutions (generated with DnaSP; Librado and Rozas, 2009).

P-values were calculated in a 2 × 2 contingency table using Fisher’s exact test.
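The kind of test named above can be sketched as a two-sided Fisher's exact test on a 2 × 2 table. The implementation below is a plain hypergeometric version written for illustration, not the authors' code; the table layout (for example, coding versus non-coding positions in the rows, substituted versus unchanged positions in the columns) is an assumption, and the counts are arbitrary illustrative numbers rather than values from the paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Arbitrary illustrative counts (not from the paper).
p_value = fisher_exact_two_sided(1, 9, 11, 3)
print(round(p_value, 6))
```

The small tolerance in the comparison guards against floating-point ties between equally probable tables on opposite tails.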

https://doi.org/10.7554/eLife.44392.009
Figure 4 with 3 supplements
Clusters and expression levels in the 24 RNA-Seq samples of oviducts.

(A) PC1 values from the PCA, (B) hierarchical clustering result. Sample codes and genotypes are listed along the x-axis. The 24 samples are assigned to three clusters accordingly. (C–E) Expression levels of three sex hormone receptor genes (Pgr, Esr1, Gper1), shown as box plots.

https://doi.org/10.7554/eLife.44392.010
Figure 4—figure supplement 1
Oviduct sample details and read statistics.

Read statistics for the RNA-Seq samples of oviducts.

https://doi.org/10.7554/eLife.44392.011
Figure 4—figure supplement 2
Dcpp expression confirmation.

The relative expression levels of the three Dcpp genes in cluster one samples, comparing RNA-Seq counts with two replicates of ddPCR experiments (see experimental details in the Materials and methods section). Note that the three Dcpp genes (Dcpp1, Dcpp2, and Dcpp3) are too similar to each other to allow the design of primers that distinguish them. Hence, both the RNA-Seq counts and the ddPCR results reflect the cumulative expression levels.

https://doi.org/10.7554/eLife.44392.012
Figure 4—figure supplement 3
Deletion patterns in the Dcpp gene region of the different Mus musculus populations.
https://doi.org/10.7554/eLife.44392.013
Distributions of the time from the first litter to the second litter.

Time points of the second litter are plotted for the different genotypes, with box plots marked. A bimodal distribution becomes evident, as discussed in the text.

https://doi.org/10.7554/eLife.44392.015
Figure 5—source data 1

Details of the fertility scores for the different groups of mice.

The mice listed under the UC codes are part of the standard stock breeding; the mice listed under the WT and KO codes were specifically set up to confirm the observations from the stock breeding.

https://doi.org/10.7554/eLife.44392.016

Tables

Table 1
Differentially expressed genes in oviduct cluster 1.
https://doi.org/10.7554/eLife.44392.014
| Gene ID | Gene name | Base mean (a) | Fold change | Adjusted P-value |
|---|---|---|---|---|
| ENSMUSG00000057417 | Dcpp3 | 3700 | 1.59 | 0.0000 |
| ENSMUSG00000096278 | Dcpp2 | 427 | 1.47 | 0.0000 |
| ENSMUSG00000096445 | Dcpp1 | 415 | 1.45 | 0.0000 |
| ENSMUSG00000034009 | Rxfp1 | 4410 | 1.35 | 0.0003 |
| ENSMUSG00000022206 | Npr3 | 349 | 1.36 | 0.0011 |
| ENSMUSG00000035864 | Syt1 | 666 | 1.34 | 0.0011 |
| ENSMUSG00000070348 | Ccnd1 | 7382 | 0.80 | 0.0012 |
| ENSMUSG00000058897 | Col25a1 | 1605 | 1.34 | 0.0015 |
| ENSMUSG00000059908 | Mug1 | 268 | 1.35 | 0.0015 |
| ENSMUSG00000063130 | Calml3 | 698 | 1.31 | 0.0018 |
| ENSMUSG00000015966 | Il17rb | 637 | 0.75 | 0.0025 |
| ENSMUSG00000022358 | Fbxo32 | 3614 | 1.31 | 0.0038 |
| ENSMUSG00000040724 | Kcna2 | 895 | 0.75 | 0.0038 |
| ENSMUSG00000061477 | Rps7 | 6247 | 1.20 | 0.0052 |
| ENSMUSG00000067786 | Nnat | 658 | 1.32 | 0.0052 |
| ENSMUSG00000019987 | Arg1 | 1208 | 1.32 | 0.0068 |
| ENSMUSG00000079017 | Ifi27l2a | 1065 | 1.32 | 0.0073 |
| ENSMUSG00000028031 | Dkk2 | 678 | 1.31 | 0.0077 |
| ENSMUSG00000022037 | Clu | 17139 | 1.22 | 0.0086 |
| ENSMUSG00000033715 | Akr1c14 | 23879 | 1.21 | 0.0086 |
| ENSMUSG00000034039 | Prss29 | 176 | 1.29 | 0.0086 |

(a) The mean of the normalized read counts for all cluster one samples.

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
|---|---|---|---|---|
| Gene (Mus musculus) | Gm13030; Shj | NA | Ensembl:ENSMUSG00000078518 | |
| Genetic reagent (M. musculus) | Gm13030 line | this paper | | Generated from C57BL/6N line by introducing a 7 bp deletion using CRISPR/Cas9 at Mouse Biology Program (MBP). See detail in Materials and methods. |
| Sequence-based reagent | Reverse transcription PCR primers | this paper | | See Materials and methods. |
| Sequence-based reagent | PCR and Sanger sequencing primers | this paper | | See Materials and methods. |
| Sequence-based reagent | Genotyping primers | this paper | | See Materials and methods. |
| Sequence-based reagent | Droplet digital PCR primers and probes | this paper | | See Materials and methods. |

Additional files

Cite this article

  1. Chen Xie
  2. Cemalettin Bekpen
  3. Sven Künzel
  4. Maryam Keshavarz
  5. Rebecca Krebs-Wheaton
  6. Neva Skrabar
  7. Kristian Karsten Ullrich
  8. Diethard Tautz
(2019)
A de novo evolved gene in the house mouse regulates female pregnancy cycles
eLife 8:e44392.
https://doi.org/10.7554/eLife.44392