Reassessment of weak parent-of-origin expression bias shows it rarely exists outside of known imprinted regions

7 figures, 1 table and 2 additional files

Figures

Limited overlap between novel genes identified in the four studies.

(A–B) Euler diagrams showing the overlap of (A) known imprinted genes and (B) novel biased genes between the four studies. (C) Overlaps between different classes of novel genes. Class-1 = novel singletons >1 Mb from another gene identified in any of the studies. Class-2 = novel clusters where two or more novel genes are within 1 Mb of each other but over 1 Mb from a known imprinted gene. Class-3 = novel genes within 1 Mb of a known imprinted gene. (D) Proportion of known and novel genes by the number of studies in which they were identified. (E–F) Venn diagrams showing tissue-specific overlaps in (E) known imprinted genes and (F) novel biased genes.
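The class definitions above amount to a simple distance rule, which can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Gene` type, the use of a single representative coordinate per gene, the inclusive 1 Mb cutoff, and the example coordinates are all assumptions made for the sketch.

```python
from dataclasses import dataclass

MB = 1_000_000  # the 1 Mb window used in the class definitions


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    pos: int  # assumption: one representative coordinate per gene


def within_window(a, b, window=MB):
    """True if two genes share a chromosome and lie within `window` bp."""
    return a.chrom == b.chrom and abs(a.pos - b.pos) <= window


def classify_novel(novel, known, window=MB):
    """Assign each novel gene to Class-1, Class-2, or Class-3."""
    classes = {}
    for g in novel:
        if any(within_window(g, k, window) for k in known):
            classes[g.name] = 3  # close to a known imprinted gene
        elif any(within_window(g, o, window) for o in novel if o is not g):
            classes[g.name] = 2  # clustered with another novel gene
        else:
            classes[g.name] = 1  # isolated singleton
    return classes


# Hypothetical genes and coordinates, for illustration only.
known = [Gene("K1", "chr1", 10_000_000)]
novel = [
    Gene("n1", "chr1", 10_500_000),
    Gene("n2", "chr2", 5_000_000),
    Gene("n3", "chr2", 5_800_000),
    Gene("n4", "chr3", 1_000_000),
]
example = classify_novel(novel, known)
```

In this sketch a novel gene near a known imprinted gene is always assigned Class-3, even if other novel genes are also nearby; that precedence is an interpretation of the definitions rather than something the legend states explicitly.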

Figure 2 with 2 supplements
Limited overlap between novel genes called by different analysis pipelines.

(A–C) Venn diagrams showing the overlap between allelic biased genes called by our pipeline with the ISoLDE package (gray circles) and those called in the original studies: (A) Dataset A, (B) Dataset B, (C) Dataset C. Overlapping novel genes are listed to the right. For Dataset A, only genes called in the hypothalamus, cerebellum, liver, muscle, and whole adult brain were analyzed. (D–G) Venn diagrams showing the overlap between allelic biased genes called by our pipeline from sequence data generated from the same tissue by different studies: liver (D), muscle (E), and hypothalamus (F) from Dataset A and Dataset B, and cerebellum (G) from Dataset A and Dataset C. (H–I) Venn diagrams of the three-way overlap between allelic biased genes called by our pipeline from liver (H) and muscle (I) sequence data from Dataset A, Dataset B, and Dataset E.

Figure 2—figure supplement 1
Expression levels of genes called as biased in at least one of the original studies.

Data generated from re-analysis of the original RNA-seq datasets. To allow comparison across the different datasets, expression levels are reported as mean TPM (transcripts per million) from all biological replicates. Source data are in Supplementary file 1f. Samples: Dataset A - whole adult brain, cerebellum (cb), hypothalamus (hypo), liver, and muscle. Dataset B - arcuate nucleus (ARN), dorsal raphe nucleus (DRN), liver, and muscle. Dataset C - P8 and P60 cerebellum.
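For readers unfamiliar with the unit, the following sketch shows how a mean TPM across biological replicates can be computed from raw read counts. The legend does not state which tool produced the TPM values, so the standard length-normalized definition used here, along with the function names and toy numbers, is an assumption for illustration.

```python
def tpm(counts, lengths_kb):
    """Convert per-gene read counts to transcripts per million.

    counts     -- reads assigned to each gene in one sample
    lengths_kb -- effective gene lengths in kilobases (same order)
    """
    rates = [c / l for c, l in zip(counts, lengths_kb)]  # reads per kb
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def mean_tpm(replicates):
    """Average per-gene TPM across replicates (one TPM list per replicate)."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


# Toy example: two genes, two biological replicates (hypothetical counts).
lengths_kb = [1.0, 2.0]
rep1 = tpm([10, 20], lengths_kb)
rep2 = tpm([30, 10], lengths_kb)
averaged = mean_tpm([rep1, rep2])
```

Because each replicate is scaled to one million before averaging, the averaged values remain comparable between datasets with different sequencing depths.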

Figure 2—figure supplement 2
Overlapping undetermined genes called between Dataset B (A) and Dataset C (B) and the ISoLDE pipeline.

The numbers of ISoLDE-called genes are shown in pale gray, and overlaps with the original study are shown in dark gray. Overlapping genes are listed to the right. Genes in red were flagged by ISoLDE and genes in bold were validated by pyrosequencing.

Weakly biased Class-3 genes are preferentially expressed from the chromosome carrying the germline methylation mark.

(A) Distribution of biased genes by maximum reported bias. Class-1 = novel singletons >1 Mb from another gene identified in any of the studies. Class-2 = novel clusters where two or more novel genes are within 1 Mb of each other but over 1 Mb from a known imprinted gene. Class-3 = novel genes within 1 Mb of a known imprinted gene. (B) Distribution of biased genes by preferential parental chromosome and maximum reported bias. Genes preferentially expressed from the maternal chromosome are shown in red, those from the paternal chromosome in blue, and those preferentially expressed from both chromosomes in a tissue-specific manner in gray. (C) Distribution of known and Class-3 genes by the methylation status of the imprinting control region (ICR) on the preferentially expressed allele. Genes preferentially expressed from the Meth-ICR chromosome are shown in black, those from the Un-ICR chromosome in pale gray, and those preferentially expressed from both chromosomes in a tissue-specific manner in dark gray. (D) Heat maps of DNA methylation (fetal and six-week male frontal cortex; Shen et al., 2012; Sloan et al., 2016) and histone H3 lysine 27 trimethylation (E16 and P0 forebrain; Gorkin et al., 2017; Shen et al., 2012) over the promoters of known and Class-3 novel genes. Promoters are defined as 500 bp upstream of the transcription start site. Genes are sorted by maximum reported bias and methylation status of the ICR on the preferentially expressed allele. Source data for the figure can be found in Supplementary file 1h.

Weakly biased Class-3 genes are located at the periphery of known imprinted domains.

(A) Distribution of Class-3 biased genes in relation to known imprinted genes. Genes flanked by known imprinted genes are shown in turquoise and those peripheral to known imprinted genes are shown in green. (B) Sunburst graph of the relationship between position in cluster, extent of bias, and ICR methylation in Class-3 genes. Low bias = <70% expression from the preferential chromosome, high bias = >70% expression from the preferential chromosome. Preferential expression from the methylated imprinting control region (ICR) chromosome is shown in black, preferential expression from the unmethylated ICR chromosome is shown in gray, and genes reported as being biased on both chromosomes depending on tissue or study are shown in red. (C and D) Schematics of the Snrpn (C) and Dlk1 (D) regions. Highly biased novel genes (80–100%) are located between known imprinted transcripts whereas low-biased genes (50–70%) are located at the periphery. Red boxes = known maternally expressed genes. Blue boxes = known paternally expressed genes. Pink boxes = novel maternally biased genes called by original studies. Turquoise boxes = novel paternally biased genes called by original studies. Blue arrow = cluster of imprinted MBII snoRNAs. Turquoise arrow = Mir344 cluster. (E) Allele-specific expression analysis of peripheral genes in the Dlk1 region in IG-DMR knockout mice. Female mice heterozygous for the IG-DMR knockout were crossed with male CastEiJ mice, and expression was assessed by pyrosequencing. Wildtype (n=5) and maternal heterozygote (n=6) expression biases were compared using an unpaired t-test. ** p < 0.01, * p < 0.05.

Figure 5 with 3 supplements
Experimentally validated Class-1 and Class-2 genes.

(A) Nhlrc1 (Class-1) is paternally biased in all postnatal neuronal tissues tested. (B–C) Bisulfite sequencing analysis in P7 tissues: (B) cerebellum, (C) liver. Each line represents a different bisulfite sequencing clone derived from two BxC animals and two CxB animals. Numbers of identical clones sequenced are indicated to the right. Black = methylated CpG, gray = unmethylated CpG, white = CpG absent from clone. The percentage of methylated CpGs from all clones is indicated underneath. (D) Pcdhb12 (Class-2) is maternally biased in all postnatal tissues tested. (E) Three Class-2 genes show a maternal bias in e16.5 placenta: Vat1, Pla2g16, and Rtn3. These biases are weaker than that seen in Ampd3, which is imprinted in the placenta (Schulz et al., 2006). Allele-specific expression graphs (A, D, and E) show mean expression (%) from the paternal allele (deep blue) and maternal allele (red) in C57BL/6 × CastEiJ (BC) and four CastEiJ × C57BL/6 (CB) crosses. Castaneus allele is denoted by a spotted pattern. Standard error of the mean is shown, n=3 or 4. Data are normalized to gDNA.

Figure 5—figure supplement 1
Coro1c pseudogene is almost exclusively expressed from the castaneus allele.

(A) Nhlrc1 expression bias in P6 cerebellum and liver, which were also used for bisulfite sequencing analysis. (B) Allelic balance of expression of the Coro1c pseudogene that overlaps the Nhlrc1 DMR. Graphs show mean expression (%) from the paternal allele (blue) and maternal allele (red) in C57BL/6 × CastEiJ (BC) and CastEiJ × C57BL/6 (CB) crosses. Castaneus allele is denoted by a spotted pattern. Standard error of the mean is shown, n=2 for Nhlrc1 and 3 for the Coro1c pseudogene.

Figure 5—figure supplement 2
Allelic bias in Wnk4 novel cluster.

(A) Weak bias in the Wnk4 gene. (B) Location of genes in the novel cluster. (C) Location of gDMRs and antisense transcript. Graph shows mean expression (%) from the paternal allele (blue) and maternal allele (red) in C57BL/6 × CastEiJ (BC) and four CastEiJ × C57BL/6 (CB) crosses. Castaneus allele is denoted by a spotted pattern. Standard error of the mean is shown, n=3 or 4. Data are normalized to gDNA.

Figure 5—figure supplement 3
Expression levels of genes tested by allele-specific pyrosequencing.

Data are extracted from Supplementary file 1f. Genes that were only validated in the placenta were counted as unvalidated, and the two known placental-specific imprinted genes are not included in the figure. Expression levels are reported as the mean transcripts per million (TPM) across all biological replicates. Samples: Dataset A - whole adult brain, cerebellum (cb), hypothalamus (hypo), liver, and muscle. Dataset B - arcuate nucleus (ARN), dorsal raphe nucleus (DRN), liver, and muscle. Dataset C - P8 and P60 cerebellum. The original study in which each gene was reported as biased is noted on the y-axis: A = Dataset A, B = Dataset B, C = Dataset C, and D = Dataset D.

Figure 6 with 3 supplements
Allele-specific expression analysis of the Dlk1 domain.

(A–F) Allelic bias in Evl (A), Slc25a29 (B), Wars (C), Wdr25 (D), Dlk1 (E), and Dync1h1 (F). Graphs show mean expression (%) from the paternal allele (blue) and maternal allele (red) in C57BL/6 × CastEiJ (BC) and four CastEiJ × C57BL/6 (CB) crosses. Castaneus allele is denoted by a spotted pattern. Standard error of the mean is shown, n=3 or 4. Tissues with a mean bias greater than 45:55 are indicated by arrowheads. Amplification bias was assessed in genomic DNA and the data are corrected. (G) Imprinting of Slc25a29 in e15.5 placenta is not under the control of the IG-DMR. WT (BC) = maternal allele is wildtype for the IG-DMR and the paternal allele is CastEiJ (n=5). Mat_Het = maternal allele has the IG-DMR deletion and the paternal allele is CastEiJ (n=6). WT (CB) = paternal allele is wildtype for the IG-DMR and the maternal allele is CastEiJ (n=5). Pat_Het = paternal allele has the IG-DMR deletion and the maternal allele is CastEiJ (n=7). (H) Schematic of the validated expression data in the Dlk1 region. Red boxes = known maternally expressed genes. Blue boxes = known paternally expressed genes. Pink boxes = novel validated maternally biased genes. Turquoise boxes = novel validated paternally biased genes. Gray boxes = biallelically expressed genes.

Figure 6—figure supplement 1
Allelic bias in the Mcts2 region.

(A) Mcts2 imprinted region. (B–E) Allelic bias (%) in Mcts2 (B), Cox4i2 (C), Bcl2l1 (D), and Tpx2 (E). Graphs show mean expression (%) from the paternal allele (blue) and maternal allele (red) in C57BL/6 × CastEiJ (BC) and four CastEiJ × C57BL/6 (CB) crosses. Castaneus allele is denoted by a spotted pattern. Standard error of the mean is shown, n=3 or 4. Data are normalized to gDNA.

Figure 6—figure supplement 2
Allelic bias (%) in Adam23 (A), Ifitm10 (B), and Ago2 (C).

Graphs show mean expression (%) from the paternal allele (blue) and maternal allele (red) in C57BL/6 × CastEiJ (BC) and four CastEiJ × C57BL/6 (CB) crosses. Castaneus allele is denoted by a spotted pattern. Standard error of the mean is shown, n=3 or 4. Adam23 and Ifitm10 data are normalized for an amplification bias in gDNA. Data are normalized to gDNA.

Figure 6—figure supplement 3
Allelic bias in Peg3 region.

(A) Peg3 imprinted region. (B–D) Allelic bias (%) in Smim17 (B), Peg3 (C), and Clcn4 (D). Graphs show mean expression (%) from the paternal allele (blue) and maternal allele (red) in C57BL/6 × CastEiJ (BC) and four CastEiJ × C57BL/6 (CB) crosses. Castaneus allele is denoted by a spotted pattern. Standard error of the mean is shown, n=3 or 4. Clcn4 data are normalized for an amplification bias in gDNA. Data are normalized to gDNA.

Possible mechanisms behind parent-of-origin expression biases in tissues.

(A–D) Scenarios causing biased expression in heterogeneous cell populations: bias occurs in every cell in the tissue (A), random imprinting in a subset of cells (B), cell-type-specific imprinting (C), or random monoallelic expression that is skewed towards one allele (D). (E–H) Possible mechanisms behind parent-of-origin biases at the periphery of imprinted domains.

Tables

Table 1
Table showing the summary of all the allele-specific pyrosequencing performed to validate putative-biased genes.

Values show the mean expression (%) from the paternal allele of both reciprocal crosses to eliminate strain bias. Values above 55% are called as paternally biased (blue) and values below 45% are called as maternally biased (red). Assays with a strain bias of greater than 45:55 in more than one tissue are indicated in the 19th column. Genes that only validate in the placenta are called as Placental in the 20th column (Red = Maternal, Blue = Paternal).
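The calling rule in this legend can be sketched as below. This is an illustrative reading of the stated thresholds for a single gene in a single tissue, not the authors' code; the function name, argument names, and example values are hypothetical.

```python
def call_bias(paternal_pct_bc, paternal_pct_cb, upper=55.0, lower=45.0):
    """Call parental bias from the two reciprocal crosses.

    paternal_pct_bc -- % expression from the paternal allele in the BC cross
    paternal_pct_cb -- % expression from the paternal allele in the CB cross
    Averaging the reciprocal crosses cancels a pure strain effect, because
    the paternal allele comes from a different strain in each cross.
    """
    mean_paternal = (paternal_pct_bc + paternal_pct_cb) / 2
    if mean_paternal > upper:
        call = "paternal"
    elif mean_paternal < lower:
        call = "maternal"
    else:
        call = "biallelic"
    return mean_paternal, call


# Hypothetical values: a strain effect alone averages out to 50%.
strain_only = call_bias(60.0, 40.0)
# Hypothetical values: a consistent parental effect survives averaging.
parental = call_bias(58.0, 56.0)
```

The per-gene validation status in the final column of the table summarizes calls across tissues, so it is not expected to follow from any single cell processed this way.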

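| Genes | Chr. | Dataset | Direction previously reported | Class | e16.5 Plac. | e16.5 Liver | e16.5 Brain | P7 Cortex | P7 Hyp. | P7 Cb. | P7 Hipp. | P7 B.S | P60 Cortex | P60 Hyp. | P60 Cb. | P60 Hipp. | P60 B.S | Strain Bias | Validation status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Class-1 (Novel Singletons) | | | | | | | | | | | | | | | | | | | |
| L3mbtl1 | 2 | B | pat | 1 | - | - | - | 48.1 | 53.4 | 53.4 | 46.4 | 50.1 | 49.4 | 49.7 | 46.7 | 51.0 | 49.2 | | Biallelic |
| Ahi1 | 10 | B,D | pat | 1 | 52.8 | 46.5 | n/a | 47.0 | 51.6 | 50.9 | 52.8 | 46.9 | 48.6 | 49.4 | 50.0 | 52.4 | 51.3 | | Biallelic |
| Platr20 | 11 | A | pat | 1 | 51.7 | 49.9 | 50.0 | 49.9 | 50.4 | 50.1 | 50.4 | 50.2 | 49.8 | 49.8 | 50.1 | 50.2 | 49.7 | | Biallelic |
| Calm1 | 12 | B,D | pat | 1 | - | - | - | 54.7 | 49.5 | 51.4 | 45.9 | 43.4 | n/a | n/a | n/a | n/a | n/a | | Biallelic |
| Nhlrc1 | 13 | B,C,D | pat | 1 | - | - | - | 56.6 | 55.6 | 57.8 | 54.8 | 58.8 | 58.1 | 56.1 | 55.4 | 55.0 | 55.2 | | Paternal |
| Tnk1 | 11 | A | mat | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Mlana | 18 | B | mat | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Gm16299 | 19 | C | pat | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Class-2 (Novel Clusters) | | | | | | | | | | | | | | | | | | | |
| Stx6 | 1 | C | mat | 2 | - | - | - | 50.7 | 49.2 | 50.3 | 49.8 | 50.1 | n/a | n/a | n/a | n/a | n/a | | Biallelic |
| Gabra5 | 7 | B,D | pat | 2 | - | - | - | 51.5 | 53.2 | 52.5 | 54.8 | 52.6 | 52.3 | 50.7 | 48.6 | 51.5 | 51.5 | | Biallelic |
| Wnk4 | 11 | C | mat | 2 | - | - | - | 50.1 | 52.0 | 45.3 | 50.6 | 44.0 | 48.1 | 44.6 | 50.9 | 45.4 | 48.1 | | Maternal |
| Vat1 | 11 | B | mat | 2 | 43.9 | 50.9 | 52.6 | 49.6 | 48.8 | 50.4 | 51.0 | 49.9 | 46.4 | 51.3 | 50.8 | 49.8 | 50.4 | | Placental |
| Rdm1 | 11 | A | mat | 2 | 49.4 | - | - | 48.8 | 48.8 | 50.1 | 50.2 | 50.7 | 46.9 | 49.5 | 50.5 | 50.1 | 48.7 | | Biallelic |
| Gaa | 11 | B,D | pat | 2 | 48.9 | 52.3 | 53.6 | 51.3 | 47.1 | 51.7 | 53.1 | 51.8 | 51.8 | 51.6 | 47.9 | 51.9 | 50.8 | | Biallelic |
| Pcdhb10 | 18 | D | mat | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Pcdhb12 | 18 | B,C | mat | 2 | - | - | - | 40.5 | 41.2 | 41.0 | 39.4 | 41.8 | 42.6 | 42.7 | 42.1 | 43.2 | 44.9 | | Maternal |
| Pcdhb20 | 18 | B,C | pat | 2 | - | - | 52.7 | 51.5 | 51.6 | 52.0 | 52.6 | 53.1 | 51.5 | 52.3 | 51.6 | 51.6 | 50.1 | | Biallelic |
| Prdx5 | 19 | B | pat | 2 | 46.9 | 51.1 | 49.4 | 49.0 | 50.4 | 48.8 | 49.4 | 49.9 | 49.2 | 50.2 | 50.0 | 48.3 | 49.2 | | Biallelic |
| Rtn3 | 19 | D | pat | 2 | 44.6 | 49.9 | 50.7 | 49.4 | 47.3 | 49.5 | 49.5 | 49.2 | 49.1 | 50.9 | 49.7 | 50.4 | 50.4 | | Placental |
| Pla2g16 | 19 | B | mat | 2 | 40.3 | 51.6 | 50.6 | 51.7 | 49.7 | 48.9 | 48.5 | 50.7 | 51.5 | 51.2 | 49.2 | 46.8 | 48.3 | | Placental |
| Mr1 | 1 | C | mat | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| BC034090 | 1 | C | mat | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Tmem106a | 11 | A | mat | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Class-3 (Close to known imprinted genes) | | | | | | | | | | | | | | | | | | | |
| Adam23 | 1 | A,B,C,D | pat | 3 | 48.0 | 52.0 | 59.1 | 56.5 | 57.8 | 56.5 | 58.0 | 53.7 | 55.6 | 58.7 | 53.5 | 56.5 | 54.1 | | Paternal |
| Mcts2 | 2 | A,B,C | pat | K | 77.2 | 82.9 | 64.7 | 71.8 | 80.2 | 73.1 | 77.3 | 80.4 | 85.7 | 78.3 | 70.6 | 87.8 | 84.2 | | Paternal |
| Cox4i2 | 2 | C | pat | 3 | 49.3 | - | - | 69.8 | 51.0 | 56.6 | 60.0 | 56.5 | 55.3 | 55.0 | 52.8 | 56.8 | 54.9 | | Paternal |
| Bcl2l1 | 2 | A,B,C,D | pat | 3 | 49.5 | 50.2 | 61.6 | 60.9 | 61.8 | 58.3 | 59.5 | 58.8 | 59.9 | 61.4 | 57.4 | 59.2 | 59.0 | | Paternal |
| Tpx2 | 2 | C | pat | 3 | - | 49.7 | 49.1 | 53.4 | 52.6 | 50.7 | 53.6 | 51.4 | 56.4 | 55.4 | 64.6 | 62.5 | 61.1 | | Paternal |
| Herc3 | 6 | A,B,C,D | mat | K | 47.0 | 45.8 | 43.1 | 45.7 | 40.7 | 40.5 | 49.9 | 32.4 | 44.0 | 29.5 | 39.0 | 42.3 | 24.4 | | Maternal |
| Fam13a | 6 | B,D | mat | 3 | - | 46.1 | 53.2 | 47.9 | 47.1 | 51.3 | 49.4 | 48.4 | n/a | n/a | n/a | n/a | n/a | | Biallelic |
| Zfp78 | 7 | B | both | 3 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Smim17 | 7 | B,D | mat | 3 | - | - | 38.3 | 48.8 | 35.5 | 50.9 | 44.0 | 44.5 | 45.2 | 43.2 | 52.1 | 50.8 | 49.3 | | Maternal |
| Peg3 | 7 | A,B,C,D | pat | K | 96.3 | 99.2 | 99.6 | 92.4 | 92.8 | 94.1 | 94.3 | 95.1 | 97.5 | 98.0 | 98.0 | 97.3 | 97.7 | | Paternal |
| Zfp954 | 7 | B | mat | 3 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Zfp773 | 7 | B | mat | 3 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Zfp772 | 7 | B | mat | 3 | - | - | - | - | - | - | - | - | - | - | - | - | - | | Low expression |
| Clcn4-2 | 7 | B | mat | 3 | 44.6 | 54.5 | 48.5 | 48.9 | 48.8 | 49.9 | 49.1 | 48.0 | 49.5 | 50.7 | 51.0 | 49.6 | 50.7 | | Placental |
| Ifitm10 | 7 | C,D | mat | 3 | 46.9 | 52.3 | 48.7 | 47.5 | 41.5 | 53.0 | 49.4 | 45.0 | 49.0 | 43.2 | 43.6 | 50.7 | 47.0 | | Maternal |
| Ctsd | 7 | B,D | mat | 3 | 46.8 | 47.8 | 50.1 | 50.0 | 49.0 | 48.4 | 48.9 | 49.8 | 49.3 | 50.2 | 49.0 | 50.5 | 49.5 | | Biallelic |
| Evl | 12 | B | pat | 3 | 46.2 | 48.3 | 50.2 | 48.7 | 51.3 | 51.1 | 50.2 | 49.5 | 50.9 | 50.9 | 52.6 | 50.8 | 50.6 | | Biallelic |
| Slc25a29 | 12 | B | pat | 3 | 18.9 | 45.5 | 52.3 | 53.9 | 56.2 | 54.5 | 53.3 | 54.2 | 52.8 | 53.8 | 50.6 | 53.6 | 53.2 | | Placental |
| Wars | 12 | C | pat | 3 | 45.2 | 52.5 | 55.1 | 53.9 | 56.2 | 54.5 | 53.3 | 54.2 | 52.8 | 53.9 | 50.6 | 53.6 | 53.2 | | Paternal |
| Wdr25 | 12 | B,D | pat | 3 | - | - | - | 50.7 | 52.9 | 52.0 | 50.6 | 48.7 | 51.8 | 59.8 | 49.9 | 51.7 | - | | Paternal |
| Dlk1 | 12 | A,B,C,D | pat | K | 95.7 | 87.9 | 93.7 | 92.5 | 92.7 | 94.6 | 90.2 | 95.0 | 89.5 | 95.6 | 93.2 | 86.9 | 95.2 | | Paternal |
| Ppp2r5c | 12 | B,C | pat | 3 | 52.4 | 51.3 | 48.5 | 48.6 | 49.3 | 54.0 | 50.6 | 48.2 | 50.6 | 48.7 | 45.2 | 51.0 | 49.3 | | Biallelic |
| Dync1h1 | 12 | B,C | pat | 3 | 46.9 | 49.8 | 50.4 | 50.7 | 50.7 | 51.1 | 49.9 | 50.4 | 51.0 | 50.4 | 50.1 | 50.1 | 49.8 | | Biallelic |
| Ago2 | 15 | A,B,C,D | mat | 3 | 47.7 | 50.7 | 30.1 | 24.1 | 25.8 | 28.8 | 28.7 | 19.3 | 25.9 | 28.7 | 38.0 | 33.3 | 24.6 | | Maternal |
| Ampd3 | 7 | B | mat | K | 23.4 | 47.5 | 51.4 | 48.6 | 48.5 | 48.3 | 51.8 | 50.9 | 50.1 | 46.7 | 50.9 | 49.4 | 51.5 | | Placental |
| Gab1 | 8 | A | pat | K | 74.9 | 50.4 | 49.8 | 52.7 | 49.6 | 48.6 | 49.4 | 51.0 | 47.1 | 49.6 | 48.8 | 50.1 | 50.2 | | Placental |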

Additional files

Supplementary file 1

Data generated in this study.

(a) Study information for Dataset A (Babak et al., 2008), Dataset B (Bonthuis et al., 2015), Dataset C (Perez et al., 2015), Dataset D (Crowley et al., 2015) and Dataset E (Andergassen et al., 2017). (b) All genes called in original studies. (c) Overlapping Novel genes called in original studies. (d) All genes called in this study using ISoLDE. (e) Number of genes called in individual tissues in the original study and this study. (f) Expression levels of genes in RNA-seq data used for ISoLDE and called as biased in original study - Figure 2—figure supplement 1. (g) Strain biased genes called in Dataset B and Dataset C in this study. (h) CpG Methylation and H3K27me3 over promoter regions of imprinted and Class-3 genes - Figure 3D. (i) Overlapping genes called in this study and the original one. This list includes genes generated in the undetermined list by the ISoLDE pipeline. (j) List of primers used in the study.

https://cdn.elifesciences.org/articles/83364/elife-83364-supp1-v2.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/83364/elife-83364-mdarchecklist1-v2.docx

Cite this article

  1. Carol A Edwards
  2. William MD Watkinson
  3. Stephanie B Telerman
  4. Lisa C Hulsmann
  5. Russell S Hamilton
  6. Anne C Ferguson-Smith
(2023)
Reassessment of weak parent-of-origin expression bias shows it rarely exists outside of known imprinted regions
eLife 12:e83364.
https://doi.org/10.7554/eLife.83364